Data Wrangling: The Elephant in the Room of Big Data. Norman Paton, University of Manchester

1 Data Wrangling: The Elephant in the Room of Big Data. Norman Paton, University of Manchester

2 Data Wrangling Definitions: (1) a process of iterative data exploration and transformation that enables analysis [1]; (2) the process of manually converting or mapping data from one "raw" form into another format that allows for more convenient consumption of the data with the help of semi-automated tools [2]. [1] S. Kandel, et al., Research Directions in Data Wrangling: Visualizations and Transformations for Usable and Credible Data, Information Visualization, 10(4), 2011. [2] 12th May 2015.

3 Extract, Transform and Load - 1 Of course, this is not completely new, and Extract, Transform and Load (ETL) tools have been around for a significant time. ETL tools support source wrapping, warehouse population, workflow languages, etc. ETL vendors also have big data offerings.

4 Extract, Transform and Load - 2 ETL tools are clearly useful, with products from database vendors and data integration companies. ETL tools emerged to support data warehousing, and thus typically have roots in enterprise settings. ETL tools typically involve significant manual effort. ETL costs no doubt vary widely from project to project, but are quoted as representing up to 80% of the development time in warehousing projects [1]. [1] S. Kandel, et al., Research Directions in Data Wrangling: Visualizations and Transformations for Usable and Credible Data, Information Visualization, 10(4), 2011.

5 Big Data: does it make a difference? Big data is sometimes characterised by the 4 Vs: Volume (the scale of the data), Velocity (speed of change), Variety (different forms of data), and Veracity (uncertainty of data). So size matters, but it isn't everything. Data wrangling for big data must address all four Vs at the same time. Classical, substantially manual ETL may struggle with numerous and rapidly changing sources.

6 The Business Case - 1 There is strong support for big data being commercially important: The International Institute of Analytics estimates the Big Data market at $16.1B in 2014, growing 6 times faster than the overall IT market; the projection for 2017 is ~$50B. Gartner (2014) estimates the Data Integration tool market at over $2.2B at end 2013, predicted to rise to ~$3.6B by [...]. Gartner (2014) estimates the Data Quality market at $960M in software revenue at end 2012, predicted to rise to $2B by 2017.

7 The Business Case - 2 ... but many of the potential beneficiaries of big data cannot simply throw resource at data wrangling. The government's Information Economy Strategy states: "the overwhelming majority of information economy businesses, 95% of the 120,000 enterprises in the sector, employ fewer than 10 people."

8 Case Study: e-commerce If you run an e-commerce site, then you need to be able to understand pricing trends among your competitors. This may involve getting to grips with: Volume: thousands of sites; Velocity: sites, site descriptions and contents changing; Variety: in format, content, user community, ...; and Veracity: unavailability, inconsistent descriptions, ... Manual attempts at data wrangling are likely to be expensive, partial, unreliable, poorly targeted, ...

9 Data Wrangling Research So data wrangling is a research challenge, currently without a community or established priorities. The VADA (Value Added DAta Systems) project seeks to define principles and solutions for adding value to data, supporting users in discovering, extracting, integrating, accessing and interpreting the data of relevance to their questions. VADA takes account of the: user context: requirements such as the trade-off between completeness and correctness; and the data context: availability, cost, provenance, quality.

10 User Context: e-commerce The same application may involve different user contexts. For example: Price comparison may normally be able to work with a subset of high-quality sources, but Issue investigation, for instance where sales of a popular item have been falling, may require a more complete picture, at the risk of obtaining more incorrect data. As a result, hard-wiring data wrangling tasks risks producing data sets that are not always fit for purpose, and where the reason for this is implicit.

11 VADA Components VADA seeks to support wrangling by integrating: Data extraction, Data integration, Quality analysis and Querying in a best-effort, pay-as-you-go approach to data wrangling.

12 Example: Data Integration How do we avoid lots of slow, expensive expert input into data integration? In pay-as-you-go data integration, alternative ways of combining data from sources can be generated algorithmically. Automatically generated candidate integrations can be refined in the light of feedback, for example from users or crowds. Decision support techniques can be used to capture the user's requirements (e.g. in terms of quality or cost), in ways that inform which integrations are generated.

13 Example: Mapping Selection - 1 Problem statement: Given a set of candidate mappings, and feedback on their results, identify the subset that best meets the user's requirements in terms of precision and recall. Associated definitions: Precision: the fraction of the retrieved results that are correct. Recall: the fraction of the correct results that are retrieved. The following were among the mappings generated by a commercial schema mapping tool for populating a table with schema <name, country, province>: M1 = SELECT name, country, province FROM Mondial.city; M2 = SELECT city, country, province FROM Mondial.located
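
The definitions and problem statement above can be made concrete with a small sketch. The Python below is illustrative only: the tuples, the result sets attached to the keys `M1`, `M2` and `M3`, and the helper names `precision`, `recall` and `best_subset` are assumptions invented for the example, not output of the schema mapping tool or part of VADA. It treats a set of mappings as the union of their results and, as one simple reading of "best meets the user's requirements", exhaustively searches for the subset with the highest recall among those that reach a precision threshold.

```python
from itertools import combinations


def precision(retrieved, correct):
    """Fraction of the retrieved results that are correct."""
    return len(retrieved & correct) / len(retrieved) if retrieved else 0.0


def recall(retrieved, correct):
    """Fraction of the correct results that are retrieved."""
    return len(retrieved & correct) / len(correct) if correct else 0.0


def best_subset(mapping_results, correct, min_precision):
    """Exhaustively pick the mapping subset with the highest recall
    whose combined (unioned) result meets the precision threshold."""
    best, best_recall = (), -1.0
    names = sorted(mapping_results)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            retrieved = set().union(*(mapping_results[m] for m in subset))
            p, r = precision(retrieved, correct), recall(retrieved, correct)
            if p >= min_precision and r > best_recall:
                best, best_recall = subset, r
    return best


# Hypothetical <name, country, province> tuples, invented for illustration.
correct = {
    ("Manchester", "GB", "Greater Manchester"),
    ("Oxford", "GB", "Oxfordshire"),
    ("Edinburgh", "GB", "Lothian"),
    ("Lyon", "F", "Rhone Alpes"),
}
mapping_results = {
    "M1": {("Manchester", "GB", "Greater Manchester"),
           ("Oxford", "GB", "Oxfordshire"),
           ("Edinburgh", "GB", "Lothian")},
    "M2": {("Lyon", "F", "Rhone Alpes"),
           ("Oxford", "GB", "Oxfordshire"),
           ("Thames", "GB", "Oxfordshire")},  # assumed incorrect tuple
    "M3": {("Nowhere", "XX", "Unknown")},      # assumed entirely incorrect
}

# A strict user context keeps only the cleanest mapping; a relaxed one
# accepts some incorrect tuples in exchange for a more complete result.
strict = best_subset(mapping_results, correct, min_precision=0.9)
relaxed = best_subset(mapping_results, correct, min_precision=0.7)
```

In this toy data the strict threshold selects `M1` alone, while the relaxed threshold selects `M1` together with `M2`, mirroring the price comparison versus issue investigation contrast. Exhaustive search is exponential in the number of mappings, so it is only suitable for a handful of candidates.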

14 Example: Mapping Selection - 2 We can estimate the quality of the generated mappings using feedback. How much feedback do we need? The results presented report: the precision obtained for a given precision threshold, for different amounts of feedback; and the recall obtained for a given precision threshold, for different amounts of feedback. Khalid Belhajjame, et al., Incrementally improving dataspaces based on user feedback. Inf. Syst. 38(5) (2013).
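
Why the amount of feedback matters can be seen in a simplified sketch. This is not the method of Belhajjame et al.; it assumes feedback arrives as one true/false annotation per result tuple and estimates a mapping's precision as the fraction of annotated tuples marked correct. The function names, the feedback sequence and the threshold value are all hypothetical.

```python
def estimate_precision(feedback):
    """Estimate a mapping's precision from feedback.

    `feedback` is a list of booleans, one per annotated result tuple:
    True if the user marked the tuple correct, False otherwise.
    Returns None when no feedback has been collected yet.
    """
    if not feedback:
        return None
    return sum(feedback) / len(feedback)


def estimates_over_time(feedback):
    """Precision estimate after each additional feedback item."""
    return [estimate_precision(feedback[:n])
            for n in range(1, len(feedback) + 1)]


# Hypothetical feedback on one mapping's results, in the order received.
feedback = [True, True, False, True, False, True, True, True]

history = estimates_over_time(feedback)
threshold = 0.7  # assumed precision threshold taken from the user context
meets_threshold = [est >= threshold for est in history]
```

With only a few annotations the estimate swings above and below the threshold, so the decision to keep or drop the mapping flips several times before it settles. That instability is what the question "How much feedback do we need?" is probing.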

15 [Figure: evaluation result against precision threshold]

16 [Figure: evaluation result against precision threshold]

17 VADA: All Change The component technologies for extraction, integration and cleaning must themselves: provide automated analyses that are informed by and take account of the user context; share information with each other, so that, e.g., integration can identify issues with extraction; and make well-informed decisions that use all available evidence about the data context, such as reference data sets and ontologies. Thus making data wrangling more cost-effective and systematic involves a fundamental rethink across a wide front.

18 Conclusions Data wrangling is a problem and an opportunity: A problem because the 4 Vs of big data may all be present together much of the time, undermining manual approaches. An opportunity because if we can make data wrangling much more cost-effective, all sorts of hitherto impractical tasks come into reach. Call to arms: we will have a serious go at this in VADA, but there is much to do, and there must be several viable approaches to take.

19 Acknowledgements VADA is funded by: The Engineering and Physical Sciences Research Council. Through a grant to: Georg Gottlob, Thomas Lukasiewicz, Dan Olteanu, Giorgio Orsi, Tim Furche. In cooperation with: Norman Paton, Alvaro Fernandes, John Keane; Leonid Libkin, Wenfei Fan, Peter Buneman, Sebastian Maneth.


More information

De la Business Intelligence aux Big Data. Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris. 22/01/14 Séminaire Big Data

De la Business Intelligence aux Big Data. Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris. 22/01/14 Séminaire Big Data De la Business Intelligence aux Big Data Marie- Aude AUFAURE Head of the Business Intelligence team Ecole Centrale Paris 22/01/14 Séminaire Big Data 1 Agenda EvoluHon of Business Intelligence SemanHc Technologies

More information

DATA ANALYSIS USING BUSINESS INTELLIGENCE TOOL. A Thesis. Presented to the. Faculty of. San Diego State University. In Partial Fulfillment

DATA ANALYSIS USING BUSINESS INTELLIGENCE TOOL. A Thesis. Presented to the. Faculty of. San Diego State University. In Partial Fulfillment DATA ANALYSIS USING BUSINESS INTELLIGENCE TOOL A Thesis Presented to the Faculty of San Diego State University In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science

More information

BUSINESS INTELLIGENCE. Keywords: business intelligence, architecture, concepts, dashboards, ETL, data mining

BUSINESS INTELLIGENCE. Keywords: business intelligence, architecture, concepts, dashboards, ETL, data mining BUSINESS INTELLIGENCE Bogdan Mohor Dumitrita 1 Abstract A Business Intelligence (BI)-driven approach can be very effective in implementing business transformation programs within an enterprise framework.

More information

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management Datalogix Using IBM Netezza data warehouse appliances to drive online sales with offline data Overview The need Infrastructure could not support the growing online data volumes and analysis required The

More information

I D C E X E C U T I V E B R I E F

I D C E X E C U T I V E B R I E F I D C E X E C U T I V E B R I E F E n a b l i n g B e t t e r D e c i s i o n s T h r o u g h U n i f i e d Ac c e s s t o I n f o r m a t i o n November 2008 Global Headquarters: 5 Speen Street Framingham,

More information

Search Engine Optimization

Search Engine Optimization Search Engine Optimization Harness the Power of Content and be the Authority Site in Your Space By: Beth Lee-Browning Overview When was the last time you Purchased a product online without researching

More information

Implementing a Data Warehouse with Microsoft SQL Server 2014

Implementing a Data Warehouse with Microsoft SQL Server 2014 Implementing a Data Warehouse with Microsoft SQL Server 2014 MOC 20463 Duración: 25 horas Introducción This course describes how to implement a data warehouse platform to support a BI solution. Students

More information

Applying Big Data approaches to Competitive Intelligence challenges

Applying Big Data approaches to Competitive Intelligence challenges Applying Big Data approaches to Competitive Intelligence challenges THOMSON REUTERS IP & SCIENCE PHARMA CI EUROPE CONFERENCE & EXHIBITION TIM MILLER 19 FEBRUARY 2014 BIG DATA, NOT JUST ABOUT VOLUMES Patient

More information

Resolving Common Analytical Tasks in Text Databases

Resolving Common Analytical Tasks in Text Databases Resolving Common Analytical Tasks in Text Databases The work is funded by the Federal Ministry of Economic Affairs and Energy (BMWi) under grant agreement 01MD15010B. Database Systems and Text-based Information

More information

IRMAC SAS INFORMATION MANAGEMENT, TRANSFORMING AN ANALYTICS CULTURE. Copyright 2012, SAS Institute Inc. All rights reserved.

IRMAC SAS INFORMATION MANAGEMENT, TRANSFORMING AN ANALYTICS CULTURE. Copyright 2012, SAS Institute Inc. All rights reserved. IRMAC SAS INFORMATION MANAGEMENT, TRANSFORMING AN ANALYTICS CULTURE ABOUT THE PRESENTER Marc has been with SAS for 10 years and leads the information management practice for canada. Marc s area of specialty

More information

Pay-as-you-go Data Integration for Linked Data: opportunities, challenges and architectures

Pay-as-you-go Data Integration for Linked Data: opportunities, challenges and architectures Pay-as-you-go Data Integration for Linked Data: opportunities, challenges and architectures Norman W. Paton norm@cs.man.ac.uk Bijan Parsia bparsia@cs.man.ac.uk Klitos Christodoulou christk6@cs.man.ac.uk

More information

Using Business Intelligence to Achieve Sustainable Performance

Using Business Intelligence to Achieve Sustainable Performance Cutting Edge Analytics for Sustainable Performance Using Business Intelligence to Achieve Sustainable Performance Adam Getz Principal, About is a software and professional services firm specializing in

More information

Your Path to. Big Data A Visual Guide

Your Path to. Big Data A Visual Guide Your Path to Big Data A Visual Guide Big Data Has Big Value Start Here to Learn How to Unlock It By now it s become fairly clear that big data represents a major shift in the technology landscape. To tackle

More information

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time

How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first

More information

Making Business Intelligence Easy. Whitepaper Measuring data quality for successful Master Data Management

Making Business Intelligence Easy. Whitepaper Measuring data quality for successful Master Data Management Making Business Intelligence Easy Whitepaper Measuring data quality for successful Master Data Management Contents Overview... 3 What is Master Data Management?... 3 Master Data Modeling Approaches...

More information

Deductive Data Warehouses and Aggregate (Derived) Tables

Deductive Data Warehouses and Aggregate (Derived) Tables Deductive Data Warehouses and Aggregate (Derived) Tables Kornelije Rabuzin, Mirko Malekovic, Mirko Cubrilo Faculty of Organization and Informatics University of Zagreb Varazdin, Croatia {kornelije.rabuzin,

More information

CHAPTER-6 DATA WAREHOUSE

CHAPTER-6 DATA WAREHOUSE CHAPTER-6 DATA WAREHOUSE 1 CHAPTER-6 DATA WAREHOUSE 6.1 INTRODUCTION Data warehousing is gaining in popularity as organizations realize the benefits of being able to perform sophisticated analyses of their

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Whitepaper. Data Warehouse/BI Testing Offering. Published on: January 2010 Author: Sena Periasamy

Whitepaper. Data Warehouse/BI Testing Offering. Published on: January 2010 Author: Sena Periasamy Published on: January 2010 Author: Sena Periasamy Hexaware Technologies. All rights reserved. Table of Contents 1. 2. Data Warehouse - Typical pain points 3. Hexaware Solution 4. DWH Testing Why is it

More information

Requirements are elicited from users and represented either informally by means of proper glossaries or formally (e.g., by means of goal-oriented

Requirements are elicited from users and represented either informally by means of proper glossaries or formally (e.g., by means of goal-oriented A Comphrehensive Approach to Data Warehouse Testing Matteo Golfarelli & Stefano Rizzi DEIS University of Bologna Agenda: 1. DW testing specificities 2. The methodological framework 3. What & How should

More information

Big Data and its use for Communications Service Providers. IoT, Washington, DC April 9, 2014 Sanjay Mishra

Big Data and its use for Communications Service Providers. IoT, Washington, DC April 9, 2014 Sanjay Mishra Big Data and its use for Communications Service Providers IoT, Washington, DC April 9, 2014 Sanjay Mishra 4/9/2014 IoT, Washington, DC, Big Data & its use for CSP (c) 2014 1 Information and Communications

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

1. Layout and Navigation

1. Layout and Navigation Success online whether measured in visits, ad revenue or ecommerce transactions requires compelling content and intuitive design. It all starts with the fundamentals: the key building blocks to create

More information

For Sales Kathy Hall 402-963-4466 khall@it4e.com

For Sales Kathy Hall 402-963-4466 khall@it4e.com IT4E Schedule 13939 Gold Circle Omaha NE 68144 402-431-5432 Course Number 10777 For Sales Chris Reynolds 402-963-4465 creynolds@it4e.com www.it4e.com For Sales Kathy Hall 402-963-4466 khall@it4e.com Course

More information

TeraCloud Storage Analytics: The Power of Knowledge

TeraCloud Storage Analytics: The Power of Knowledge TeraCloud Storage Analytics: The Power of Knowledge By Dallas Stewart VP Product Management Contents TeraCloud Storage Analytics for Storage Management...3 TeraCloud Storage Analytics for Strategic Outsourcing...5

More information

DYNAMIC QUERY FORMS WITH NoSQL

DYNAMIC QUERY FORMS WITH NoSQL IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 2, Issue 7, Jul 2014, 157-162 Impact Journals DYNAMIC QUERY FORMS WITH

More information