DATA WAREHOUSE/BIG DATA AN ARCHITECTURAL APPROACH


By W H Inmon and Deborah Arline

First there was the data warehouse. Then came Big Data. Some proponents of Big Data have proclaimed, "When you have Big Data, you won't need a data warehouse," such was the enthusiasm for Big Data. There is indeed much confusion and misunderstanding about Big Data and the data warehouse. This paper shows that the data warehouse and Big Data are separate environments and that they are complementary to each other. The paper takes an architectural view.

AN ARCHITECTURAL PERSPECTIVE

To understand the complex and symbiotic relationship between the data warehouse and Big Data, some foundational groundwork must be laid. Without the groundwork the final solution will not make much sense. The starting point is that the data warehouse is an architecture and Big Data is a technology. As is the case with all technologies and all architectures, there may be some overlap, but a technology and an architecture are essentially different things.

WHAT IS BIG DATA?

A good starting point is the question: what is Big Data? Fig 1 shows a representation of Big Data.

In Fig 1 we see Big Data. So what is Big Data? Big Data is technology that is designed to
- Accommodate very large, almost unlimited amounts of storage
- Use inexpensive storage for the housing of data
- Manage the storage using the Roman census method
- Store data in an unstructured manner

There are other definitions of Big Data, but for the purpose of this paper this will be our working definition. Big Data centers around a technological component known as Hadoop. Hadoop is technology that satisfies all the conditions of this definition of Big Data. Many vendors have built suites of tools surrounding Hadoop. Big Data, then, is a technology that is useful for the storage and management of large volumes of data.

WHAT IS A DATA WAREHOUSE?

So what is a data warehouse? A data warehouse is a structure of data where there is a single version of the truth. The essence of a data warehouse is the integrity of the data contained inside it. When an executive wants information that can be believed and trusted, the executive turns to a data warehouse. Data warehouses contain detailed historical data. Data in a data warehouse is typically integrated, because the data comes from different sources.

The definition of a data warehouse has been established from the very beginning. A data warehouse is a
- Subject oriented
- Integrated
- Non volatile
- Time variant
collection of data in support of management's decisions.

Fig 2 depicts a representation of a data warehouse.

To achieve the integrity of data that is central to a data warehouse, a data warehouse typically has a carefully constructed infrastructure, where data is edited, calculated, tested, and transformed before it enters the data warehouse. Because data going into the data warehouse comes from multiple sources, data typically passes through a process known as ETL (extract/transform/load).

OVERLAP

From a foundational standpoint, how much overlap is there between a data warehouse and Big Data? The answer is that there is actually very little overlap between the two. Fig 3 shows the overlap.

In Fig 3 it is seen that sometimes a data warehouse contains a reasonably large amount of data. And of course, Big Data can certainly accommodate a reasonably large amount of data. So there is some overlap between a data warehouse and Big Data. What is remarkable is how little overlap there really is. Another way to look at the overlap between a data warehouse and Big Data is seen in Fig 4.

[Fig 4: the four combinations, each marked "yes": data warehouse and no Big Data; data warehouse and Big Data; Big Data and no data warehouse; Big Data and data warehouse]

Fig 4 shows that there is no necessary overlap between a data warehouse and Big Data. A data warehouse and Big Data are COMPLETELY independent of each other: an organization can have either one without the other, or both.

A NON TRADITIONAL VIEW

To understand how Big Data and a data warehouse interface, it is necessary to look at Big Data in a non traditional way. There are many different ways that Big Data can be analyzed. The way suggested here is only one of them. One way that Big Data can be sub divided is in terms of repetitive data and non repetitive data. Fig 5 notionally shows this sub division.

[Fig 5: Big Data sub divided into repetitive and non repetitive data]

Repetitive unstructured data is data that occurs very frequently and has the same structure and oftentimes the same content. There are many examples of repetitive unstructured data. One example is the record of phone calls, where the length of the call, the date of the call, the caller and the callee are noted. Another example is metering data. Metering data is data that is gathered each week or month where there is a register of the activity or usage of energy at a particular location. In metering data there is a metered amount, an account number, and a date. And there are many, many occurrences of metering records. Another type of repetitive data is oil and gas exploration data. There are in fact many examples of repetitive Big Data.

The other type of Big Data is non repetitive unstructured data. With non repetitive unstructured data there often are many records of data. But each record of data is unique in terms of structure and content. If any two non repetitive unstructured records are similar in terms of structure and content, it is an accident. There are many forms of non repetitive unstructured data. There are emails, where one email is very different from the next in the queue. Or there are call center records, where a customer interacts with an operator representing a company. There are telephone conversations, sales calls, litigation records, and many, many other types of non repetitive unstructured data.

So Big Data can be divided into this classification of data: repetitive unstructured data and non repetitive unstructured data. Admittedly there are many different ways to sub divide Big Data. But for the purpose of defining the relationship between a data warehouse and Big Data, this division is the one we will use.

CONTEXT

When dealing with data, any data, it is useful to consider the context of the data. Indeed, using data where the context is unknown is a dangerous thing to do. An important point to be made is that for repetitive unstructured data, identifying the context of data comes very easily and naturally. Consider the diagram seen in Fig 6.

[Fig 6: repetitive records in a Big Data environment; surviving label: content]

Fig 6 shows that there are many records in a Big Data environment. But the records are essentially repetitive records, and the context and meaning of each record is clear. That is because when it comes to repetitive unstructured data the records are essentially very repetitive records. Determining context in a repetitive environment is a very easy and natural thing to do.

Now consider context in the non repetitive environment. There is plenty of context to be found in the non repetitive unstructured environment. The problem is that context is embedded in the document itself. The context is found in a million different places and in a million different ways in the non repetitive unstructured environment. Sometimes context is buried in the text of the document. Sometimes context is inferred from the external characterization of the document. Sometimes context is found in the words of the document itself. There are literally a million ways that context is found in the non repetitive unstructured environment.
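The contrast between the two kinds of record can be made concrete with a small sketch. It is only an illustration under stated assumptions: the comma separated layout, the field names, the sample values, and the call center note are all invented here and do not come from the paper. The point it shows is that in a repetitive record the context of each value follows from its position, while a free text record gives a positional parser nothing to work with.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MeteringRecord:
    """One repetitive record: every occurrence carries the same three fields."""
    account_number: str
    reading_date: date
    metered_amount: float


def parse_metering_line(line):
    """Parse a hypothetical 'account,YYYY-MM-DD,amount' line.

    The context of each value comes from its position in the record,
    so the content never has to be interpreted.
    """
    account, day, amount = line.strip().split(",")
    return MeteringRecord(account, date.fromisoformat(day), float(amount))


# Invented sample data in the assumed layout.
repetitive_lines = [
    "A-1001,2014-01-31,412.5",
    "A-1002,2014-01-31,388.0",
    "A-1001,2014-02-28,401.2",
]
records = [parse_metering_line(line) for line in repetitive_lines]

# Analysis over repetitive data is a simple aggregation over known fields.
usage_by_account = {}
for record in records:
    usage_by_account[record.account_number] = (
        usage_by_account.get(record.account_number, 0.0) + record.metered_amount
    )

# A non repetitive record (an invented call center note) has no fixed layout,
# so the positional parser cannot assign context to its contents.
non_repetitive_text = "Customer called about a high bill, says the meter was replaced in March."
try:
    parse_metering_line(non_repetitive_text)
    parsed_free_text = True
except ValueError:
    parsed_free_text = False
```

Here `usage_by_account` totals usage per account straight from the fixed fields, while `parsed_free_text` ends up `False` because the note does not fit the layout.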
To derive the context inherent in a non repetitive unstructured document, it is necessary to use a technology known as textual disambiguation (or textual ETL). Fig 7 shows that textual disambiguation is used to derive context from non repetitive unstructured data.
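As a rough picture of what deriving context from text means, the sketch below tags words in free text with a broader context and writes the result as rows in a relational table. It is a toy under loudly stated assumptions: the taxonomy, the sample documents, the table name `text_context`, and the function `disambiguate` are all hypothetical, and a real textual disambiguation product applies far richer rules than a word lookup. This is not the method used by any particular vendor.

```python
import re
import sqlite3

# Hypothetical taxonomy: maps a word found in text to a broader context.
TAXONOMY = {
    "bill": "billing",
    "invoice": "billing",
    "refund": "billing",
    "meter": "equipment",
    "outage": "service",
}


def disambiguate(doc_id, text):
    """Return (doc_id, char_offset, word, context) rows for recognised words."""
    rows = []
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group().lower()
        context = TAXONOMY.get(word)
        if context is not None:
            rows.append((doc_id, match.start(), word, context))
    return rows


# Invented non repetitive documents.
documents = {
    "call-001": "Customer called about a high bill, says the meter was replaced.",
    "email-002": "Please send the refund for last month's invoice.",
}

rows = []
for doc_id, text in documents.items():
    rows.extend(disambiguate(doc_id, text))

# Place the derived context in a standard relational table (in memory here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE text_context "
    "(doc_id TEXT, char_offset INTEGER, word TEXT, context TEXT)"
)
conn.executemany("INSERT INTO text_context VALUES (?, ?, ?, ?)", rows)

# Once context is explicit, an ordinary query can find documents by context.
billing_docs = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT doc_id FROM text_context "
        "WHERE context = 'billing' ORDER BY doc_id"
    )
]
```

After the rows are loaded, the text can be queried by context with plain SQL, which is what makes it usable alongside other structured data.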

[Fig 7: textual disambiguation applied to non repetitive data]

ANALYTIC PROCESSING

So how is analytic processing done from Big Data? There are several ways that analytic processing can be done. One way is through simple search technology. This approach is seen in Fig 8.

[Fig 8: search technology applied to repetitive and non repetitive data]

In Fig 8 it is seen that search technology works well on repetitive unstructured Big Data. Simple search technology works well where context is obvious and easily derived. The problem is that search technology does not work well in the face of non repetitive unstructured data. For search technology to work well, there must be an obvious context for the data the processing is operating on.

But it is possible to use textual disambiguation to derive context from non repetitive unstructured data and then to place the data back into the Big Data environment. In this case it is said that the Big Data environment has been context enriched. Fig 9 shows this enrichment.

[Fig 9: non repetitive data passed through textual disambiguation and returned to Big Data]

In Fig 9 it is seen that non repetitive unstructured data is read and passed through textual disambiguation. Then the output is placed back into the Big Data environment, but it is placed into Big Data in a context enriched state. After the data is placed back in Big Data in a context enriched state, a search tool can be used to analyze the data.

THE DATA WAREHOUSE/BIG DATA INTERFACE

The actual interface between data warehouse and Big Data is seen in Fig 10.

[Fig 10: raw Big Data divided into repetitive and non repetitive data; paths for direct analysis, distillation, and textual disambiguation lead to a standard data base, the classical data warehouse, contextualized data, and context enriched Big Data]

In Fig 10 it is seen that raw Big Data can be divided into repetitive data and non repetitive data, as has been discussed. Repetitive data can be directly analyzed or can be searched by a search tool. Non repetitive data is accessed by textual disambiguation. When non repetitive data passes through textual disambiguation, the context of the data is derived. Once the context has been derived, the output can be placed either in a standard data base format or into a context enriched Big Data environment. If data is placed in a data base format, the data can be easily accessed and analyzed in conjunction with existing data warehouse data. In addition, repetitive data can be distilled and placed into a standard data base if desired.

One interesting feature of this diagram is that the kinds of analysis done throughout the environment are quite different from one another. The type of analysis that is done is profoundly shaped by the data that is available for analysis.

Forest Rim Technology is located in Castle Rock, CO. Forest Rim Technology produces textual ETL, a technology that allows unstructured text to be disambiguated and placed into a standard data base where it can be analyzed. Forest Rim Technology was founded by Bill Inmon. Deborah Arline is.


More information

dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING

dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING ABSTRACT In most CRM (Customer Relationship Management) systems, information on

More information

BIG DATA COURSE 1 DATA QUALITY STRATEGIES - CUSTOMIZED TRAINING OUTLINE. Prepared by:

BIG DATA COURSE 1 DATA QUALITY STRATEGIES - CUSTOMIZED TRAINING OUTLINE. Prepared by: BIG DATA COURSE 1 DATA QUALITY STRATEGIES - CUSTOMIZED TRAINING OUTLINE Cerulium Corporation has provided quality education and consulting expertise for over six years. We offer customized solutions to

More information

January 2010. Fast-Tracking Data Warehousing & Business Intelligence Projects via Intelligent Data Modeling. Sponsored by:

January 2010. Fast-Tracking Data Warehousing & Business Intelligence Projects via Intelligent Data Modeling. Sponsored by: Fast-Tracking Data Warehousing & Business Intelligence Projects via Intelligent Data Modeling January 2010 Claudia Imhoff, Ph.D Sponsored by: Table of Contents Introduction... 3 What is a Data Model?...

More information

Data Warehouse Automation A Decision Guide

Data Warehouse Automation A Decision Guide Data Warehouse Automation A Decision Guide A White Paper by Dave Wells Infocentric LLC Table of Contents Seven Myths of Data Warehouse Automation 1 Why Automate Data Warehousing? 2 The Basis of Data Warehouse

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

A Survey on Data Warehouse Architecture

A Survey on Data Warehouse Architecture A Survey on Data Warehouse Architecture Rajiv Senapati 1, D.Anil Kumar 2 1 Assistant Professor, Department of IT, G.I.E.T, Gunupur, India 2 Associate Professor, Department of CSE, G.I.E.T, Gunupur, India

More information

Enterprise Information Integration (EII) A Technical Ally of EAI and ETL Author Bipin Chandra Joshi Integration Architect Infosys Technologies Ltd

Enterprise Information Integration (EII) A Technical Ally of EAI and ETL Author Bipin Chandra Joshi Integration Architect Infosys Technologies Ltd Enterprise Information Integration (EII) A Technical Ally of EAI and ETL Author Bipin Chandra Joshi Integration Architect Infosys Technologies Ltd Page 1 of 8 TU1UT TUENTERPRISE TU2UT TUREFERENCESUT TABLE

More information

Rich Traceability. by Jeremy Dick Quality Systems and Software Ltd. Abstract

Rich Traceability. by Jeremy Dick Quality Systems and Software Ltd. Abstract www.telelogic.com Rich Traceability by Jeremy Dick Quality Systems and Software Ltd. Abstract Creating traceability through the use of links between paragraphs of documents, or between objects in a requirements

More information

14. Data Warehousing & Data Mining

14. Data Warehousing & Data Mining 14. Data Warehousing & Data Mining Data Warehousing Concepts Decision support is key for companies wanting to turn their organizational data into an information asset Data Warehouse "A subject-oriented,

More information

OnX Big Data Reference Architecture

OnX Big Data Reference Architecture OnX Big Data Reference Architecture Knowledge is Power when it comes to Business Strategy The business landscape of decision-making is converging during a period in which: > Data is considered by most

More information

Designing Agile Data Pipelines. Ashish Singh Software Engineer, Cloudera

Designing Agile Data Pipelines. Ashish Singh Software Engineer, Cloudera Designing Agile Data Pipelines Ashish Singh Software Engineer, Cloudera About Me Software Engineer @ Cloudera Contributed to Kafka, Hive, Parquet and Sentry Used to work in HPC @singhasdev 204 Cloudera,

More information

B.Sc (Computer Science) Database Management Systems UNIT-V

B.Sc (Computer Science) Database Management Systems UNIT-V 1 B.Sc (Computer Science) Database Management Systems UNIT-V Business Intelligence? Business intelligence is a term used to describe a comprehensive cohesive and integrated set of tools and process used

More information

Primary Key Associates Limited

Primary Key Associates Limited is at the core of Primary Key Associates work Our approach to analytics In this paper Andrew Lea, our Technical Director in charge of, describes some of the paradigms, models, and techniques we have developed

More information

Jagir Singh, Greeshma, P Singh University of Northern Virginia. Abstract

Jagir Singh, Greeshma, P Singh University of Northern Virginia. Abstract 224 Business Intelligence Journal July DATA WAREHOUSING Ofori Boateng, PhD Professor, University of Northern Virginia BMGT531 1900- SU 2011 Business Intelligence Project Jagir Singh, Greeshma, P Singh

More information

Research into competency models in arts education

Research into competency models in arts education Research into competency models in arts education Paper presented at the BMBF Workshop International Perspectives of Research in Arts Education, Nov. 4 th and 5 th, 2013. Folkert Haanstra, Amsterdam School

More information

Big Data and Analytics

Big Data and Analytics INSIDE TRACK Analyst commentary with a real-world edge Big Data and Analytics Dazzling new solutions or irritating new hype? By Tony Lock, November 2012 Originally published on http://www.theregister.co.uk/

More information

BIG DATA GOVERNANCE: BALANCING BIG DATA VELOCITY & INFORMATION GOVERNANCE

BIG DATA GOVERNANCE: BALANCING BIG DATA VELOCITY & INFORMATION GOVERNANCE BIG DATA GOVERNANCE: BALANCING BIG DATA VELOCITY & INFORMATION GOVERNANCE Size Matters. The success of big data projects requires access to huge sets of high quality information. Compliant data represents

More information

Formal Methods for Preserving Privacy for Big Data Extraction Software

Formal Methods for Preserving Privacy for Big Data Extraction Software Formal Methods for Preserving Privacy for Big Data Extraction Software M. Brian Blake and Iman Saleh Abstract University of Miami, Coral Gables, FL Given the inexpensive nature and increasing availability

More information

An Introduction to Master Data Management (MDM)

An Introduction to Master Data Management (MDM) An Introduction to Master Data Management (MDM) Presented by: Robert Quinn, Sr. Solutions Architect FYI Business Solutions Agenda Introduction MDM Definition MDM Terms Best Practices Data Challenges MDM

More information

Outline. What is Big data and where they come from? How we deal with Big data?

Outline. What is Big data and where they come from? How we deal with Big data? What is Big Data Outline What is Big data and where they come from? How we deal with Big data? Big Data Everywhere! As a human, we generate a lot of data during our everyday activity. When you buy something,

More information

90% of your Big Data problem isn t Big Data.

90% of your Big Data problem isn t Big Data. White Paper 90% of your Big Data problem isn t Big Data. It s the ability to handle Big Data for better insight. By Arjuna Chala Risk Solutions HPCC Systems Introduction LexisNexis is a leader in providing

More information

Information Systems and Technologies in Organizations

Information Systems and Technologies in Organizations Information Systems and Technologies in Organizations Information System One that collects, processes, stores, analyzes, and disseminates information for a specific purpose Is school register an information

More information

Choosing the right enterprise resource

Choosing the right enterprise resource whitepaper Choosing the right enterprise resource Planning (ERP) System How to Avoid the 7 Fatal Flaws WHITEPAPER Choosing the Right ERP System 2 about For process manufacturers like chemicals, pharmaceuticals

More information

Data Warehouse: Introduction

Data Warehouse: Introduction Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of base and data mining group,

More information

Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal

Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal Business Analytics In a Big Data World Ted Malone Solutions Architect Data Platform and Cloud Microsoft Federal Information has gone from scarce to super-abundant. That brings huge new benefits. The Economist

More information

Top 10 Business Intelligence (BI) Requirements Analysis Questions

Top 10 Business Intelligence (BI) Requirements Analysis Questions Top 10 Business Intelligence (BI) Requirements Analysis Questions Business data is growing exponentially in volume, velocity and variety! Customer requirements, competition and innovation are driving rapid

More information

A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data

A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data White Paper A Visualization is Worth a Thousand Tables: How IBM Business Analytics Lets Users See Big Data Contents Executive Summary....2 Introduction....3 Too much data, not enough information....3 Only

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Software Development Life Cycle & Process Models

Software Development Life Cycle & Process Models Volume 1, Issue 1 ISSN: 2320-5288 International Journal of Engineering Technology & Management Research Journal homepage: www.ijetmr.org Software Development Life Cycle & Process Models Paritosh Deore

More information

Proper study of Data Warehousing and Data Mining Intelligence Application in Education Domain

Proper study of Data Warehousing and Data Mining Intelligence Application in Education Domain Journal of The International Association of Advanced Technology and Science Proper study of Data Warehousing and Data Mining Intelligence Application in Education Domain AMAN KADYAAN JITIN Abstract Data-driven

More information

Data Governance for Regulated Industries

Data Governance for Regulated Industries Data Governance for Regulated Industries Amir Halfon CTO, Worldwide Financial Service Agenda Components of Data Governance Challenges Solutions and Case Studies Q&A SLIDE: 2 Data Governance Considerations

More information

Requirements in Functional IT Management

Requirements in Functional IT Management Requirements in Functional IT Floris Blaauboer University of Twente, Department of Computer Science, Chair of Information Systems, f.a.blaauboer@student.utwente.nl Abstract. Requirements engineering and

More information