Data Wrangling: The Elephant in the Room of Big Data. Norman Paton University of Manchester


1 Data Wrangling: The Elephant in the Room of Big Data Norman Paton University of Manchester

2 Data Wrangling Definitions: "a process of iterative data exploration and transformation that enables analysis" [1]; "the process of manually converting or mapping data from one 'raw' form into another format that allows for more convenient consumption of the data with the help of semi-automated tools" [2]. [1] S. Kandel, et al., Research Directions in Data Wrangling: Visualizations and Transformations for Usable and Credible Data, Information Visualization, 10(4), 2011. [2] 12th May 2015.

3 Extract, Transform and Load - 1 Of course, this is not completely new, and Extract, Transform and Load (ETL) tools have been around for a significant time. ETL tools support source wrapping, warehouse population, workflow languages, etc. ETL vendors also have big data offerings.

4 Extract, Transform and Load - 2 ETL tools are clearly useful, with products from database vendors and data integration companies. ETL tools emerged to support data warehousing, and thus typically have roots in enterprise settings. ETL tools typically involve significant manual effort. ETL costs no doubt vary widely from project to project, but are quoted as representing up to 80% of the development time in warehousing projects [1]. [1] S. Kandel, et al., Research Directions in Data Wrangling: Visualizations and Transformations for Usable and Credible Data, Information Visualization, 10(4), 2011.

5 Big Data: does it make a difference? Big data is sometimes characterised by the 4 Vs: Volume (the scale of the data), Velocity (the speed of change), Variety (the different forms of data), and Veracity (the uncertainty of the data). So size matters, but it isn't everything. Data wrangling for big data must address all four Vs at the same time. Classical, substantially manual ETL may struggle with numerous and rapidly changing sources.

6 The Business Case - 1 There is strong support for big data being commercially important: The International Institute of Analytics estimates the Big Data market at $16.1B in 2014, growing 6 times faster than the overall IT market; the projection for 2017 is ~$50B. Gartner (2014) estimates the Data Integration tool market at over $2.2B at end 2013, predicted to rise to ~$3.6B by [year missing]. Gartner (2014) estimates the Data Quality market at $960M in software revenue at end 2012, predicted to rise to $2B by 2017.

7 The Business Case - 2 But many of the potential beneficiaries of big data cannot simply throw resource at data wrangling. The government's Information Economy Strategy states: "the overwhelming majority of information economy businesses, 95% of the 120,000 enterprises in the sector, employ fewer than 10 people."

8 Case Study: e-commerce If you run an e-commerce site, then you need to be able to understand pricing trends among your competitors. This may involve getting to grips with: Volume: thousands of sites; Velocity: sites, site descriptions and contents changing; Variety: in format, content, user community, ...; and Veracity: unavailability, inconsistent descriptions, ... Manual attempts at data wrangling are likely to be expensive, partial, unreliable, poorly targeted, ...

9 Data Wrangling Research So data wrangling is a research challenge, currently without a community or established priorities. The VADA (Value Added DAta Systems) project seeks to define principles and solutions for adding value to data, supporting users in discovering, extracting, integrating, accessing and interpreting the data of relevance to their questions. VADA takes account of the user context (requirements such as the trade-off between completeness and correctness) and the data context (availability, cost, provenance, quality).

10 User Context: e-commerce The same application may involve different user contexts. For example: Price comparison may normally be able to work with a subset of high quality sources, but Issue investigation, where sales of a popular item have been falling, may require a more complete picture, at the risk of obtaining more incorrect data. As a result, hard-wiring data wrangling tasks risks producing data sets that are not always fit for purpose, with the reason for this left implicit.

11 VADA Components VADA seeks to support wrangling by integrating: Data extraction, Data integration, Quality analysis and Querying in a best-effort, pay-as-you-go approach to data wrangling.

12 Example: Data Integration How do we avoid lots of slow, expensive expert input into data integration? In pay-as-you-go data integration, alternative ways of combining data from sources can be generated algorithmically. Automatically generated candidate integrations can be refined in the light of feedback, for example from users or crowds. Decision support techniques can be used to capture the user's requirements (e.g. in terms of quality or cost), in ways that inform which integrations are generated.
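
One way to picture how user requirements on quality and cost could inform the choice among candidate integrations is the minimal Python sketch below. It is an illustration under stated assumptions, not VADA's method: the `Candidate` class, the `rank` function, the two example candidates and their numbers are all hypothetical, and a plain weighted sum stands in for whatever decision support technique is actually used.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A hypothetical automatically generated candidate integration."""
    name: str
    quality: float  # estimated quality in [0, 1], higher is better
    cost: float     # estimated cost in [0, 1], lower is better


def rank(candidates, quality_weight, cost_weight):
    """Order candidates by a weighted utility that rewards quality and penalises cost."""
    def utility(c):
        return quality_weight * c.quality - cost_weight * c.cost
    return sorted(candidates, key=utility, reverse=True)


# Hypothetical candidates: A is better but expensive, B is cheaper but weaker.
a = Candidate("A", quality=0.9, cost=0.8)
b = Candidate("B", quality=0.7, cost=0.2)

quality_first = [c.name for c in rank([a, b], quality_weight=1.0, cost_weight=0.0)]
balanced = [c.name for c in rank([a, b], quality_weight=0.5, cost_weight=0.5)]
```

A user who only cares about quality gets A ranked first, while a user who weighs cost equally gets B first, so the same candidates serve different user contexts without being hard-wired.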

13 Example: Mapping Selection - 1 Problem statement: Given a set of candidate mappings, and feedback on their results, identify the subset that best meets the user's requirements in terms of precision and recall. Associated definitions: Precision: the fraction of the retrieved results that are correct. Recall: the fraction of the correct results that are retrieved. The following were among the mappings generated by a commercial schema mapping tool for populating a table with schema <name, country, province>: M1 = SELECT name, country, province FROM Mondial.city; M2 = SELECT city, country, province FROM Mondial.located
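
The two definitions and the selection problem can be sketched in a few lines of Python. This is a minimal illustration, not the selection algorithm of the cited work: the result tuples and the set of correct tuples are invented for the example, the names `precision_recall` and `best_subset` are hypothetical, and the exhaustive search over subsets is a stand-in that is only practical for a handful of candidate mappings.

```python
from itertools import combinations


def precision_recall(retrieved, correct):
    """Return (precision, recall) of the retrieved tuples against the correct tuples."""
    retrieved, correct = set(retrieved), set(correct)
    true_positives = len(retrieved & correct)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(correct) if correct else 0.0
    return precision, recall


def best_subset(mapping_results, correct, min_precision):
    """Try every non-empty subset of mappings, union their results, and return
    the subset with the highest recall among those meeting the precision threshold."""
    best, best_recall = None, -1.0
    names = sorted(mapping_results)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            union = set().union(*(mapping_results[m] for m in subset))
            p, r = precision_recall(union, correct)
            if p >= min_precision and r > best_recall:
                best, best_recall = subset, r
    return best, best_recall


# Hypothetical <name, country, province> results for two mappings, and
# hypothetical ground truth; none of this data comes from the slides.
results = {
    "M1": {("Manchester", "GB", "Greater Manchester"), ("Oxford", "GB", "Oxfordshire")},
    "M2": {("Oxford", "GB", "Oxfordshire"), ("Oxford", "GB", "Kent")},
}
correct = {("Manchester", "GB", "Greater Manchester"), ("Oxford", "GB", "Oxfordshire")}

m2_scores = precision_recall(results["M2"], correct)
chosen = best_subset(results, correct, min_precision=0.9)
```

In this toy data, adding M2 to M1 contributes no new correct tuples but one incorrect one, so the union falls below the precision threshold and M1 alone is selected.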

14 Example: Mapping Selection - 2 We can estimate the quality of the generated mappings using feedback. How much feedback do we need? The results presented report (i) the precision obtained for a given precision threshold and (ii) the recall obtained for a given precision threshold, each for different amounts of feedback. Khalid Belhajjame, et al., Incrementally improving dataspaces based on user feedback. Inf. Syst. 38(5) (2013).
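
To make the question of how much feedback is needed concrete, the Python sketch below re-estimates a mapping's precision as annotations accumulate. It assumes, purely for illustration, that feedback arrives as a sequence of (tuple, is_correct) pairs and that precision is estimated as the share of annotated tuples marked correct; the function names and the annotations are hypothetical and are not taken from the cited paper.

```python
def estimated_precision(feedback):
    """Estimate precision as the share of annotated result tuples marked correct.

    `feedback` maps a result tuple to True (correct) or False (incorrect).
    Returns None when there is no feedback to base an estimate on.
    """
    if not feedback:
        return None
    return sum(feedback.values()) / len(feedback)


def estimates_by_feedback_amount(annotations, steps):
    """Re-estimate precision using only the first n annotations, for each n in steps."""
    return {n: estimated_precision(dict(annotations[:n])) for n in steps}


# Hypothetical annotations on one mapping's results, in arrival order.
annotations = [(("a",), True), (("b",), False), (("c",), True), (("d",), True)]

estimates = estimates_by_feedback_amount(annotations, [1, 2, 4])
```

With very little feedback the estimate swings widely (1.0 after one annotation, 0.5 after two) before settling, which is why the amount of feedback matters when a mapping is accepted or rejected against a precision threshold.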

15 [Figure; surviving labels: Evaluate, Result Precision, Threshold]

16 [Figure; surviving labels: Evaluate, Result Precision, Threshold]

17 VADA: All Change The component technologies for extraction, integration and cleaning must themselves: provide automated analyses that are informed by and take account of the user context; share information with each other, so that, e.g., integration can identify issues with extraction; and make well-informed decisions that use all available evidence about the data context, such as reference data sets and ontologies. Thus making data wrangling more cost-effective and systematic involves a fundamental rethink across a wide front.

18 Conclusions Data wrangling is a problem and an opportunity: A problem because the 4 Vs of big data may all be present together much of the time, undermining manual approaches. An opportunity because if we can make data wrangling much more cost-effective, all sorts of hitherto impractical tasks come into reach. Call to arms: we will have a serious go at this in VADA, but there is much to do, and there must be different viable approaches to be taken.

19 Acknowledgements VADA is funded by the Engineering and Physical Sciences Research Council, through a grant to Georg Gottlob, Thomas Lukasiewicz, Dan Olteanu, Giorgio Orsi and Tim Furche, in cooperation with Norman Paton, Alvaro Fernandes, John Keane, Leonid Libkin, Wenfei Fan, Peter Buneman and Sebastian Maneth.


More information

Costs of Data Warehousing & Business Intelligence for the Small to Midsize Business

Costs of Data Warehousing & Business Intelligence for the Small to Midsize Business i White Paper Costs of Data Warehousing & Business Intelligence for the Small to Midsize Business By Ted Mountzuris March 6, 2004 ii Introduction Everyone seems to agree that a Business Intelligence (BI)

More information

Enterprise Data Quality

Enterprise Data Quality Enterprise Data Quality An Approach to Improve the Trust Factor of Operational Data Sivaprakasam S.R. Given the poor quality of data, Communication Service Providers (CSPs) face challenges of order fallout,

More information

QAD Business Intelligence Data Warehouse Demonstration Guide. May 2015 BI 3.11

QAD Business Intelligence Data Warehouse Demonstration Guide. May 2015 BI 3.11 QAD Business Intelligence Data Warehouse Demonstration Guide May 2015 BI 3.11 Overview This demonstration focuses on the foundation of QAD Business Intelligence the Data Warehouse and shows how this functionality

More information

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria

More information

What do Big Data & HAVEn mean? Robert Lejnert HP Autonomy

What do Big Data & HAVEn mean? Robert Lejnert HP Autonomy What do Big Data & HAVEn mean? Robert Lejnert HP Autonomy Much higher Volumes. Processed with more Velocity. With much more Variety. Is Big Data so big? Big Data Smart Data Project HAVEn: Adaptive Intelligence

More information

Mario Guarracino. Data warehousing

Mario Guarracino. Data warehousing Data warehousing Introduction Since the mid-nineties, it became clear that the databases for analysis and business intelligence need to be separate from operational. In this lecture we will review the

More information

WHITEPAPER IMPROVE YOUR MARKETING AND SALES PERFORMANCES THROUGH OPTIMISED LEAD MANAGEMENT

WHITEPAPER IMPROVE YOUR MARKETING AND SALES PERFORMANCES THROUGH OPTIMISED LEAD MANAGEMENT IMPROVE YOUR MARKETING AND SALES PERFORMANCES THROUGH OPTIMISED LEAD MANAGEMENT 2 ABOUT Why has lead management become so important today for business? Two trends currently reinforce the need to implement

More information

Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff

Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff The Challenge IT Executives are challenged with issues around data, compliancy, regulation and making confident decisions on their business

More information

The Internet of Things and Big Data: Intro

The Internet of Things and Big Data: Intro The Internet of Things and Big Data: Intro John Berns, Solutions Architect, APAC - MapR Technologies April 22 nd, 2014 1 What This Is; What This Is Not It s not specific to IoT It s not about any specific

More information

Cognos e-applications Fast Time to Success. Immediate Business Results.

Cognos e-applications Fast Time to Success. Immediate Business Results. Cognos e-applications Fast Time to Success. Immediate Business Results. www.cognos.com Cognos e-applications transform business-critical data into a readily available global view of our customers and our

More information

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management Datalogix Using IBM Netezza data warehouse appliances to drive online sales with offline data Overview The need Infrastructure could not support the growing online data volumes and analysis required The

More information

DATAOPT SOLUTIONS. What Is Big Data?

DATAOPT SOLUTIONS. What Is Big Data? DATAOPT SOLUTIONS What Is Big Data? WHAT IS BIG DATA? It s more than just large amounts of data, though that s definitely one component. The more interesting dimension is about the types of data. So Big

More information

Warehouse Builder 11g. Best Practices for a Data Quality Process with OWB. May 2008

Warehouse Builder 11g. Best Practices for a Data Quality Process with OWB. May 2008 Warehouse Builder 11g Best Practices for a Data Quality Process with OWB May 2008 Best Practices for a Data Quality Process with OWB INTRODUCTION Maybe the introduction to this paper should be the following

More information

Continuous Delivery of Software

Continuous Delivery of Software Continuous Delivery of Software Reducing risks with systems, feedback and flow SEPG North America 2013 Joanne Molesky October 3, 2013 2011 All rights reserved. Purpose Challenge traditional concepts for

More information

January 2010. Fast-Tracking Data Warehousing & Business Intelligence Projects via Intelligent Data Modeling. Sponsored by:

January 2010. Fast-Tracking Data Warehousing & Business Intelligence Projects via Intelligent Data Modeling. Sponsored by: Fast-Tracking Data Warehousing & Business Intelligence Projects via Intelligent Data Modeling January 2010 Claudia Imhoff, Ph.D Sponsored by: Table of Contents Introduction... 3 What is a Data Model?...

More information

Data Mining in the Swamp

Data Mining in the Swamp WHITE PAPER Page 1 of 8 Data Mining in the Swamp Taming Unruly Data with Cloud Computing By John Brothers Business Intelligence is all about making better decisions from the data you have. However, all

More information

Business Intelligence: How better analytics can lead your business to higher profits.

Business Intelligence: How better analytics can lead your business to higher profits. Business Intelligence: How better analytics can lead your business to higher profits. Introduction The economic downturn is forcing business leaders to rethink strategic plans. To remain competitive, businesses

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days

Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days Course

More information

CASE STUDY. How Salesforce.com Grows Quickly and Efficiently with DocuSign

CASE STUDY. How Salesforce.com Grows Quickly and Efficiently with DocuSign CASE STUDY How Salesforce.com Grows Quickly and Efficiently with DocuSign The Value Agile development cycles churn out new releases of disruptive technologies as often as every week, making the high-tech

More information

Does a Business Intelligence implementation scare you? Here are 5 things to avoid.

Does a Business Intelligence implementation scare you? Here are 5 things to avoid. Does a Business Intelligence implementation scare you? Here are 5 things to avoid. While small and midsize businesses (SMB s) recognize the value of Business Intelligence (BI) many feel that BI solutions

More information

Taming Big Data. 1010data ACCELERATES INSIGHT

Taming Big Data. 1010data ACCELERATES INSIGHT Taming Big Data 1010data ACCELERATES INSIGHT Lightning-fast and transparent, 1010data analytics gives you instant access to all your data, without technical expertise or expensive infrastructure. TAMING

More information

Whitepaper: PeopleAdmin and Oracle PeopleSoft

Whitepaper: PeopleAdmin and Oracle PeopleSoft Whitepaper: PeopleAdmin and Oracle PeopleSoft Executive Summary Organizations of higher education strive to acquire top talent, while facing a wide array of budgetary and technological challenges. The

More information

Deductive Data Warehouses and Aggregate (Derived) Tables

Deductive Data Warehouses and Aggregate (Derived) Tables Deductive Data Warehouses and Aggregate (Derived) Tables Kornelije Rabuzin, Mirko Malekovic, Mirko Cubrilo Faculty of Organization and Informatics University of Zagreb Varazdin, Croatia {kornelije.rabuzin,

More information

INTELLIGENT PROFILE ANALYSIS GRADUATE ENTREPRENEUR (ipage) SYSTEM USING BUSINESS INTELLIGENCE TECHNOLOGY

INTELLIGENT PROFILE ANALYSIS GRADUATE ENTREPRENEUR (ipage) SYSTEM USING BUSINESS INTELLIGENCE TECHNOLOGY INTELLIGENT PROFILE ANALYSIS GRADUATE ENTREPRENEUR (ipage) SYSTEM USING BUSINESS INTELLIGENCE TECHNOLOGY Muhamad Shahbani, Azman Ta a, Mohd Azlan, and Norshuhada Shiratuddin INTRODUCTION Universiti Utara

More information

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I

Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy

More information