Speeding ETL Processing in Data Warehouses White Paper
|
|
|
- Dwain Farmer
- 10 years ago
- Views:
Transcription
1 Speeding ETL Processing in Data Warehouses White Paper dmxwpADM
2 High-Performance Aggregations and Joins for Faster Data Warehouse Processing Data Processing Challenges... 1 Joins and Aggregates are Critical to Data Warehouse Processing... 1 Aggregations are a Key Component of Data Warehouse Processing... 2 Using High-Performance Aggregations for Preprocessing... 2 Pre-Calculated Aggregations Speed Queries... 2 Using Multi-Level Hierarchal Aggregations... 3 Joins Combine Data for Reports and Downstream Processing... 4 Using High-Performance Joins for Changed Data Capture (CDC) and Other Preprocessing... 4 Case Studies Demonstrate Elapsed Time Savings Using DMExpress ADM High-Performance Aggregations and Joins... 5 Scenario #1: Slow Java Processing... 5 The ADM Solution... 6 Scenario #2: Slow ETL Application... 6 The ADM Solution... 6 DMExpress: An Enterprise Solution... 7
3 Data Processing Challenges Staying competitive in today s real-time business world demands the capability to process ever-increasing volumes of data at lightning speed. In today's fast-changing, competitive environment, a complaint frequently heard by data warehouse users is that access to time-critical data is too slow. Shrinking batch windows and data volume that increases exponentially are placing increasing demands on data warehouses to deliver instantly-available information. Additionally, data warehouses must be able to consistently generate accurate results. But achieving accuracy and speed with large, diverse sets of data can be challenging. Data management and governance are thus of paramount importance, a trend that has forced industries to invest billions of dollars in data infrastructure updates. The majority of these expenditures has been devoted to improving access to, and the quality of, real-time business intelligence (BI) and other mission-critical operations, including data marts, data mining, online data stores, and online transaction processing (OLTP) systems. One of the primary factors affecting data warehouse performance is the underlying physical distribution of the data in the databases, which must be consolidated in various ways. For example, a query may rely on tables built from multiple heterogeneous sources, involving a variety of transformations. An inability to perform dataintensive operations efficiently will inevitably impede business analyses and subsequent decision-making. Joins and Aggregates are Critical to Data Warehouse Processing Various operations can be used to optimize data manipulation and thus accelerate data warehouse processes. Two operations in particular, joins and aggregations, play an integral role during preprocessing as well in manipulating and consolidating data in a data warehouse. For processing data for BI analytics and other mission-critical applications, the usual go-to software products include various extract, transform, load (ETL) and data warehousing solutions. However, these products do not always provide the speed required to deliver data on time. Often an additional product can contribute to speeding the processing of high volumes of data. One such product, Syncsort s DMExpress, including its Advanced Data Management (ADM) component, significantly reduces processing time of data-intensive operations through the use of high-performance joins and high-performance aggregations. Using ADM, operations such as queries, multi-level hierarchal aggregations, and changed data capture (CDC), also referred to as delta processing, can be accelerated. The result: critical business reports and analyses finish many hours and sometimes even days faster, allowing more frequent access to data for timely reports and analysis. Deriving useful information from raw data within a data warehouse involves joining factual and dimensional information before aggregating it to produce a report or output for downstream analysis. However, joins and aggregations are data-intensive and time-intensive, which prevent data warehouses with insufficiently fast ETL software from performing them frequently; but frequency is key to providing data for timely business Page 1
4 analysis. ADM provides a solution. It speeds data-intensive joins and aggregations, making it possible to perform them much more frequently. ADM high-speed aggregations and joins exploit patented algorithms, parallel processing technology, and dynamic optimization to accelerate applications and reduce computing resources. And these savings increase dramatically as data volumes grow. The following shows a job built from join and aggregate tasks in the DMExpress management interface. Join Aggregate Figure 1. Using DMExpress ADM High-Speed Join and Aggregation to Generate a Product Sales Report. Aggregations are a Key Component of Data Warehouse Processing Aggregations play a key role at various stages of data warehouse processing. From preprocessing of data prior to its entering the data warehouse, to dimensional data analysis used for conducting queries and generating reports, aggregations are critical to efficiently preparing, configuring, and analyzing large volumes of data. Using High-Performance Aggregations for Preprocessing By aggregating data prior to loading it into the data warehouse, queries, database loads, and other downstream processing can be performed much faster; DMExpress ADM can quickly summarize factual data to the minimum level of granularity required by the data warehouse. A common application removes redundant transaction data from multiple sources for faster queries and database loads. Pre-Calculated Aggregations Speed Queries Data warehouse experts agree that aggregates are the best way to speed data warehouse queries. According to data warehousing expert Ralph Kimball, Every data warehouse should contain precalculated and prestored aggregation tables (The Data Warehouse Toolkit, 2nd edition, p. 356). Page 2
5 Aggregate operations yield the greatest performance benefits when they are used for input to data analysis; for example, to aggregate daily sales data for each quarter and each sales region. If these queries run at consistent times and if the aggregation application is fast enough, then it is often possible to generate the aggregations and store them in tables within the database before the query is run. The query can then use these aggregate tables to compute the results, thus avoiding the processing necessary to create the aggregations. In the case of a six million row query, reducing the number of rows read by creating aggregations across dimensions can vastly accelerate processing time. A query answered from base-level data can take hours and involve millions of data records and millions of calculations. With pre-calculated aggregates, the same query can be answered in seconds with just a few records and calculations. Using Multi-Level Hierarchal Aggregations Sophisticated aggregation schemes recognize dimensional hierarchies and build higher-level aggregations from more granular aggregations. For instance, a daily aggregation of sales for a region could be used to build a monthly aggregation of sales by region, and thereby avoid an aggregation of the more granular daily sales totals. Such aggregate-awareness can significantly increase speed and enhance reporting capacity. Figure 2. Summarizing Sales Data Using Multi-Level Hierarchal Aggregation. Multi-level hierarchal aggregations (Figure 2) can be efficiently performed with ADM. Using the aggregate functionality and implementing a processing scheme that involves working with smaller and smaller datasets, ADM can perform hierarchical aggregation and then use DMExpress merge functionality to bring the hierarchically-aggregated data together to create a final sales report. Page 3
6 Aggregations can also be used to replace fact data with rolled up versions of themselves. For example, you can use this approach to create a monthly product-sales-by-store summary of an original fact table. All of the daily sales records are aggregated into monthly records, which reduces the size of the fact table. Yet dimensional analysis by time at the month, quarter, and year levels can still be performed. ADM can also perform multi-level hierarchal aggregation to help build cubes for more advanced dimensional data analyses. Joins Combine Data for Reports and Downstream Processing Joins are used to combine information from two or more data sources, such as database tables, and place it into a new data source suitable for downstream processing or reports. Joins are particularly powerful because they enable rapidly changing data to be organized into categories for subsequent report preparation through the use of matching keys. Joins are used to pre-process data, to improve the efficiency of queries, and to accelerate changed data capture (CDC), or delta processing. Using High-Performance Joins for Changed Data Capture (CDC) and Other Preprocessing To reduce the time needed to retrieve information, data must be preprocessed into the proper form for the dimensional data warehouse. High-speed joins are critical to this process, which can include lookups of legacy values for appropriate replacement, cleansing and validating data, identifying and eliminating nonmatching or unmatching values, and pre-aggregation. Using ADM high-performance join for data-intensive operations, descriptive information can be combined with factual data late in a processing sequence so that storage and throughput requirements are minimized. CDC is an increasingly important pre-processing function. By loading only new, updated, and deleted records into the data warehouse, CDC significantly conserves time and resources when carrying out data-intensive processes (Figure 3). Figure 3. Joins Used for Changed Data Capture. Page 4
7 Rather than replacing the information in the data warehouse with the data in the entire online transactional database, a join will match the primary key of the previously loaded record with its corresponding new record and then compare the data portions of the two records to determine if they ve changed. In this way, only added, deleted, and altered records are updated, which significantly reduces elapsed time of database loads. By using a high-performance join for CDC, data warehouse updates can be performed with far greater efficiency. Case Studies Demonstrate Elapsed Time Savings Using DMExpress ADM High-Performance Aggregations and Joins Because of the significant benefits they provide, aggregations and joins are widely used in nearly every industrial sector where large volumes of data must be analyzed including the banking, pharmaceutical, retail, and telecommunications industries. Below are two examples from the financial industry where ADM highperformance aggregations and joins were used to significantly accelerate data processing. Scenario #1: Slow Java Processing A large investment bank wanted to improve the performance of a lengthy product/customer data management process that originally involved five major procedures, all written in Java. Due to recent increases in data volumes, legacy Java-based applications were consuming excessive system resources and time. It thus became necessary to upgrade the bank s data processing applications in order to reduce elapsed time for routine customer management processes. The processes included data extractions, joins, merges, reorganizations, and loads in a networked UNIX environment. The bank s existing solution, using programs written in Java, required hours to complete. Figure 4. Investment Bank Uses DMExpress with ADM to Reduce Data Processing Time by 90%. Page 5
8 The ADM Solution The Java procedures were converted to ADM high-performance join and aggregation features, which eliminated the need for a separate database load step. As a result of switching to DMExpress ADM, elapsed time for customer management data processing was reduced by over 90%, from hours to just 2-3 hours (Figure 4). The bank now completes its routine customer management processes faster and therefore can perform them more frequently. Scenario #2: Slow ETL Application At another leading financial institution, a client profitability application was designed to examine customer account and product data and subsequently determine the rate and amount that the company's products earned for its investors. The company needed to create aggregates and order the data, which consisted of thirty-six, 1GB regular time period files and five, 1.8 GB summary time period files. DataStage was used to process the customer financial data and then generate financial files by time periods as well as by product, customer, and organization. DataStage's aggregation and ordering features did not provide the performance and functionality that the company needed. With DataStage alone the process was taking 72 hours. Figure 5. Bank Cuts Processing Time by 33% Using DMExpress with ADM The ADM Solution To reduce total processing time, the company selected DMExpress with ADM to process the aggregates and ordering. Once the aggregates were completed, the database was brought down and the database administrator ran several statistics with the aggregated files and other DataStage files. The data was then loaded into the data warehouse and different divisions from around the world queried the information using Oracle Reports and Oracle Discoverer. When the ETL process was run using ADM, the previous data was still up so that queries could be performed on the same machine while running their ETL process. With ADM, the elapsed time was reduced by 33% to 48 hours, cutting processing time by one entire day (Figure 5). Page 6
9 DMExpress: An Enterprise Solution DMExpress is the high-speed data management solution for ETL, data warehousing, BI, and other missioncritical applications. DMExpress replaces long-running processes so applications run faster. This can save many hours, even days, especially when processing large amounts of data. DMExpress can be used to speed processing at many points in the enterprise management structure (Figure 6). Figure 6. DMExpress Speed is Utilized at Many Points in the Data Management Structure. Page 7
10 DMExpress leverages nearly 40 years of research and development in high-performance software. It is needed wherever you have data-intensive applications. It speeds processes like ETL, staging data for a data warehouse, and database loading by up to 90%. The ultimate business benefit of DMExpress is twofold: It provides rapid availability of data for critical operations such as BI (including CRM, ERP, SCM, and CPM), data marts, data mining, web log processing, online data stores, and OLTP systems. Its speed and efficiency minimize resource consumption, making it possible to consolidate hardware for significant cost savings. For more information about DMExpress ADM and high-performance aggregations and joins, contact DMExpress sales at or visit Page 8
11 50 Tice Boulevard Woodcliff Lake, NJ Syncsort Incorporated, 2007 All rights reserved. DMExpress is a trademark of Syncsort Incorporated. All other company and product names used herein may be the trademarks of their respective companies.
Data Warehouse Overview. Srini Rengarajan
Data Warehouse Overview Srini Rengarajan Please mute Your cell! Agenda Data Warehouse Architecture Approaches to build a Data Warehouse Top Down Approach Bottom Up Approach Best Practices Case Example
Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach
2006 ISMA Conference 1 Sizing Logical Data in a Data Warehouse A Consistent and Auditable Approach Priya Lobo CFPS Satyam Computer Services Ltd. 69, Railway Parallel Road, Kumarapark West, Bangalore 560020,
Integrating SAP and non-sap data for comprehensive Business Intelligence
WHITE PAPER Integrating SAP and non-sap data for comprehensive Business Intelligence www.barc.de/en Business Application Research Center 2 Integrating SAP and non-sap data Authors Timm Grosser Senior Analyst
Moving Large Data at a Blinding Speed for Critical Business Intelligence. A competitive advantage
Moving Large Data at a Blinding Speed for Critical Business Intelligence A competitive advantage Intelligent Data In Real Time How do you detect and stop a Money Laundering transaction just about to take
East Asia Network Sdn Bhd
Course: Analyzing, Designing, and Implementing a Data Warehouse with Microsoft SQL Server 2014 Elements of this syllabus may be change to cater to the participants background & knowledge. This course describes
Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence
Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence Appliances and DW Architectures John O Brien President and Executive Architect Zukeran Technologies 1 TDWI 1 Agenda What
When to consider OLAP?
When to consider OLAP? Author: Prakash Kewalramani Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 03/10/08 Email: [email protected] Abstract: Do you need an OLAP
IST722 Data Warehousing
IST722 Data Warehousing Components of the Data Warehouse Michael A. Fudge, Jr. Recall: Inmon s CIF The CIF is a reference architecture Understanding the Diagram The CIF is a reference architecture CIF
Information management software solutions White paper. Powerful data warehousing performance with IBM Red Brick Warehouse
Information management software solutions White paper Powerful data warehousing performance with IBM Red Brick Warehouse April 2004 Page 1 Contents 1 Data warehousing for the masses 2 Single step load
SAS BI Course Content; Introduction to DWH / BI Concepts
SAS BI Course Content; Introduction to DWH / BI Concepts SAS Web Report Studio 4.2 SAS EG 4.2 SAS Information Delivery Portal 4.2 SAS Data Integration Studio 4.2 SAS BI Dashboard 4.2 SAS Management Console
Designing a Dimensional Model
Designing a Dimensional Model Erik Veerman Atlanta MDF member SQL Server MVP, Microsoft MCT Mentor, Solid Quality Learning Definitions Data Warehousing A subject-oriented, integrated, time-variant, and
Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
CHAPTER SIX DATA. Business Intelligence. 2011 The McGraw-Hill Companies, All Rights Reserved
CHAPTER SIX DATA Business Intelligence 2011 The McGraw-Hill Companies, All Rights Reserved 2 CHAPTER OVERVIEW SECTION 6.1 Data, Information, Databases The Business Benefits of High-Quality Information
Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data
INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are
Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0
SQL Server Technical Article Columnstore Indexes for Fast Data Warehouse Query Processing in SQL Server 11.0 Writer: Eric N. Hanson Technical Reviewer: Susan Price Published: November 2010 Applies to:
DMX-h ETL Use Case Accelerator. Web Log Aggregation
DMX-h ETL Use Case Accelerator Web Log Aggregation Syncsort Incorporated, 2015 All rights reserved. This document contains proprietary and confidential material, and is only for use by licensees of DMExpress.
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Content Problems of managing data resources in a traditional file environment Capabilities and value of a database management
BUSINESSOBJECTS DATA INTEGRATOR
PRODUCTS BUSINESSOBJECTS DATA INTEGRATOR IT Benefits Correlate and integrate data from any source Efficiently design a bulletproof data integration process Accelerate time to market Move data in real time
Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole
Paper BB-01 Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole ABSTRACT Stephen Overton, Overton Technologies, LLC, Raleigh, NC Business information can be consumed many
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT
HYPERION MASTER DATA MANAGEMENT SOLUTIONS FOR IT POINT-AND-SYNC MASTER DATA MANAGEMENT 04.2005 Hyperion s new master data management solution provides a centralized, transparent process for managing critical
Implementing a Data Warehouse with Microsoft SQL Server
This course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse 2014, implement ETL with SQL Server Integration Services, and
Adobe Insight, powered by Omniture
Adobe Insight, powered by Omniture Accelerating government intelligence to the speed of thought 1 Challenges that analysts face 2 Analysis tools and functionality 3 Adobe Insight 4 Summary Never before
IAF Business Intelligence Solutions Make the Most of Your Business Intelligence. White Paper November 2002
IAF Business Intelligence Solutions Make the Most of Your Business Intelligence White Paper INTRODUCTION In recent years, the amount of data in companies has increased dramatically as enterprise resource
Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives
Chapter 6 FOUNDATIONS OF BUSINESS INTELLIGENCE: DATABASES AND INFORMATION MANAGEMENT Learning Objectives Describe how the problems of managing data resources in a traditional file environment are solved
Customer Analysis - Customer analysis is done by analyzing the customer's buying preferences, buying time, budget cycles, etc.
Data Warehouses Data warehousing is the process of constructing and using a data warehouse. A data warehouse is constructed by integrating data from multiple heterogeneous sources that support analytical
COURSE 20463C: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER
Page 1 of 8 ABOUT THIS COURSE This 5 day course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL Server
Implementing a Data Warehouse with Microsoft SQL Server
Page 1 of 7 Overview This course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL 2014, implement ETL
Data Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc.
Warehousing: A Technology Review and Update Vernon Hoffner, Ph.D., CCP EntreSoft Resouces, Inc. Introduction Abstract warehousing has been around for over a decade. Therefore, when you read the articles
Breadboard BI. Unlocking ERP Data Using Open Source Tools By Christopher Lavigne
Breadboard BI Unlocking ERP Data Using Open Source Tools By Christopher Lavigne Introduction Organizations have made enormous investments in ERP applications like JD Edwards, PeopleSoft and SAP. These
Building a Data Warehouse
Building a Data Warehouse With Examples in SQL Server EiD Vincent Rainardi BROCHSCHULE LIECHTENSTEIN Bibliothek Apress Contents About the Author. ; xiij Preface xv ^CHAPTER 1 Introduction to Data Warehousing
Data Virtualization A Potential Antidote for Big Data Growing Pains
perspective Data Virtualization A Potential Antidote for Big Data Growing Pains Atul Shrivastava Abstract Enterprises are already facing challenges around data consolidation, heterogeneity, quality, and
BUSINESSOBJECTS DATA INTEGRATOR
PRODUCTS BUSINESSOBJECTS DATA INTEGRATOR IT Benefits Correlate and integrate data from any source Efficiently design a bulletproof data integration process Improve data quality Move data in real time and
INTRODUCTION TO BUSINESS INTELLIGENCE What to consider implementing a Data Warehouse and Business Intelligence
INTRODUCTION TO BUSINESS INTELLIGENCE What to consider implementing a Data Warehouse and Business Intelligence Summary: This note gives some overall high-level introduction to Business Intelligence and
Course 103402 MIS. Foundations of Business Intelligence
Oman College of Management and Technology Course 103402 MIS Topic 5 Foundations of Business Intelligence CS/MIS Department Organizing Data in a Traditional File Environment File organization concepts Database:
EAI vs. ETL: Drawing Boundaries for Data Integration
A P P L I C A T I O N S A W h i t e P a p e r S e r i e s EAI and ETL technology have strengths and weaknesses alike. There are clear boundaries around the types of application integration projects most
University of Gaziantep, Department of Business Administration
University of Gaziantep, Department of Business Administration The extensive use of information technology enables organizations to collect huge amounts of data about almost every aspect of their businesses.
Implementing a Data Warehouse with Microsoft SQL Server MOC 20463
Implementing a Data Warehouse with Microsoft SQL Server MOC 20463 Course Outline Module 1: Introduction to Data Warehousing This module provides an introduction to the key components of a data warehousing
COURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER
COURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER MODULE 1: INTRODUCTION TO DATA WAREHOUSING This module provides an introduction to the key components of a data warehousing
CHAPTER 4: BUSINESS ANALYTICS
Chapter 4: Business Analytics CHAPTER 4: BUSINESS ANALYTICS Objectives Introduction The objectives are: Describe Business Analytics Explain the terminology associated with Business Analytics Describe the
Understanding the Value of In-Memory in the IT Landscape
February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to
Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:
Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
Master Data Management and Data Warehousing. Zahra Mansoori
Master Data Management and Data Warehousing Zahra Mansoori 1 1. Preference 2 IT landscape growth IT landscapes have grown into complex arrays of different systems, applications, and technologies over the
[callout: no organization can afford to deny itself the power of business intelligence ]
Publication: Telephony Author: Douglas Hackney Headline: Applied Business Intelligence [callout: no organization can afford to deny itself the power of business intelligence ] [begin copy] 1 Business Intelligence
Lection 3-4 WAREHOUSING
Lection 3-4 DATA WAREHOUSING Learning Objectives Understand d the basic definitions iti and concepts of data warehouses Understand data warehousing architectures Describe the processes used in developing
An Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of
An Introduction to Data Warehousing An organization manages information in two dominant forms: operational systems of record and data warehouses. Operational systems are designed to support online transaction
Chapter 6. Foundations of Business Intelligence: Databases and Information Management
Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:
OLAP. Business Intelligence OLAP definition & application Multidimensional data representation
OLAP Business Intelligence OLAP definition & application Multidimensional data representation 1 Business Intelligence Accompanying the growth in data warehousing is an ever-increasing demand by users for
Oracle Data Integrator 12c (ODI12c) - Powering Big Data and Real-Time Business Analytics. An Oracle White Paper October 2013
An Oracle White Paper October 2013 Oracle Data Integrator 12c (ODI12c) - Powering Big Data and Real-Time Business Analytics Introduction: The value of analytics is so widely recognized today that all mid
Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days
Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Implement a Data Warehouse with Microsoft SQL Server 20463C; 5 days Course
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
Data Warehousing and Data Mining
Data Warehousing and Data Mining Part I: Data Warehousing Gao Cong [email protected] Slides adapted from Man Lung Yiu and Torben Bach Pedersen Course Structure Business intelligence: Extract knowledge
High-Volume Data Warehousing in Centerprise. Product Datasheet
High-Volume Data Warehousing in Centerprise Product Datasheet Table of Contents Overview 3 Data Complexity 3 Data Quality 3 Speed and Scalability 3 Centerprise Data Warehouse Features 4 ETL in a Unified
Applied Business Intelligence. Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA
Applied Business Intelligence Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA Agenda Business Drivers and Perspectives Technology & Analytical Applications Trends Challenges
Management Accountants and IT Professionals providing Better Information = BI = Business Intelligence. Peter Simons peter.simons@cimaglobal.
Management Accountants and IT Professionals providing Better Information = BI = Business Intelligence Peter Simons [email protected] Agenda Management Accountants? The need for Better Information
BUSINESS INTELLIGENCE. Keywords: business intelligence, architecture, concepts, dashboards, ETL, data mining
BUSINESS INTELLIGENCE Bogdan Mohor Dumitrita 1 Abstract A Business Intelligence (BI)-driven approach can be very effective in implementing business transformation programs within an enterprise framework.
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Wienand Omta Fabiano Dalpiaz 1 drs. ing. Wienand Omta Learning Objectives Describe how the problems of managing data resources
CHAPTER 5: BUSINESS ANALYTICS
Chapter 5: Business Analytics CHAPTER 5: BUSINESS ANALYTICS Objectives The objectives are: Describe Business Analytics. Explain the terminology associated with Business Analytics. Describe the data warehouse
Business Intelligence Solution for Small and Midsize Enterprises (BI4SME)
Business Intelligence Solution for Small and Midsize Enterprises (BI4SME) Preface Not only large Enterprises can benefit from the advantages of Business Intelligence (BI) Solutions. BI4SME is a cost efficient,
Jet Enterprise Frequently Asked Questions Pg. 1 03/18/2011 JEFAQ - 02/13/2013 - Copyright 2013 - Jet Reports International, Inc.
Pg. 1 03/18/2011 JEFAQ - 02/13/2013 - Copyright 2013 - Jet Reports International, Inc. Regarding Jet Enterprise What are the software requirements for Jet Enterprise? The following components must be installed
Harnessing the power of advanced analytics with IBM Netezza
IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced
How To Use Noetix
Using Oracle BI with Oracle E-Business Suite How to Meet Enterprise-wide Reporting Needs with OBI EE Using Oracle BI with Oracle E-Business Suite 2008-2010 Noetix Corporation Copying of this document is
Whitepaper. Innovations in Business Intelligence Database Technology. www.sisense.com
Whitepaper Innovations in Business Intelligence Database Technology The State of Database Technology in 2015 Database technology has seen rapid developments in the past two decades. Online Analytical Processing
POLAR IT SERVICES. Business Intelligence Project Methodology
POLAR IT SERVICES Business Intelligence Project Methodology Table of Contents 1. Overview... 2 2. Visualize... 3 3. Planning and Architecture... 4 3.1 Define Requirements... 4 3.1.1 Define Attributes...
IBM Cognos 10: Enhancing query processing performance for IBM Netezza appliances
IBM Software Business Analytics Cognos Business Intelligence IBM Cognos 10: Enhancing query processing performance for IBM Netezza appliances 2 IBM Cognos 10: Enhancing query processing performance for
Business Intelligence, Analytics & Reporting: Glossary of Terms
Business Intelligence, Analytics & Reporting: Glossary of Terms A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Ad-hoc analytics Ad-hoc analytics is the process by which a user can create a new report
Five Technology Trends for Improved Business Intelligence Performance
TechTarget Enterprise Applications Media E-Book Five Technology Trends for Improved Business Intelligence Performance The demand for business intelligence data only continues to increase, putting BI vendors
Service Oriented Data Management
Service Oriented Management Nabin Bilas Integration Architect Integration & SOA: Agenda Integration Overview 5 Reasons Why Is Critical to SOA Oracle Integration Solution Integration
ENABLING OPERATIONAL BI
ENABLING OPERATIONAL BI WITH SAP DATA Satisfy the need for speed with real-time data replication Author: Eric Kavanagh, The Bloor Group Co-Founder WHITE PAPER Table of Contents The Data Challenge to Make
Integrating Ingres in the Information System: An Open Source Approach
Integrating Ingres in the Information System: WHITE PAPER Table of Contents Ingres, a Business Open Source Database that needs Integration... 3 Scenario 1: Data Migration... 4 Scenario 2: e-business Application
Bringing Big Data into the Enterprise
Bringing Big Data into the Enterprise Overview When evaluating Big Data applications in enterprise computing, one often-asked question is how does Big Data compare to the Enterprise Data Warehouse (EDW)?
BENEFITS OF AUTOMATING DATA WAREHOUSING
BENEFITS OF AUTOMATING DATA WAREHOUSING Introduction...2 The Process...2 The Problem...2 The Solution...2 Benefits...2 Background...3 Automating the Data Warehouse with UC4 Workload Automation Suite...3
Deductive Data Warehouses and Aggregate (Derived) Tables
Deductive Data Warehouses and Aggregate (Derived) Tables Kornelije Rabuzin, Mirko Malekovic, Mirko Cubrilo Faculty of Organization and Informatics University of Zagreb Varazdin, Croatia {kornelije.rabuzin,
Foundations of Business Intelligence: Databases and Information Management
Chapter 5 Foundations of Business Intelligence: Databases and Information Management 5.1 Copyright 2011 Pearson Education, Inc. Student Learning Objectives How does a relational database organize data,
THE DATA WAREHOUSE ETL TOOLKIT CDT803 Three Days
Three Days Prerequisites Students should have at least some experience with any relational database management system. Who Should Attend This course is targeted at technical staff, team leaders and project
Data Warehousing and Data Mining in Business Applications
133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business
Data Warehousing and OLAP Technology for Knowledge Discovery
542 Data Warehousing and OLAP Technology for Knowledge Discovery Aparajita Suman Abstract Since time immemorial, libraries have been generating services using the knowledge stored in various repositories
Foundations of Business Intelligence: Databases and Information Management
Chapter 6 Foundations of Business Intelligence: Databases and Information Management 6.1 2010 by Prentice Hall LEARNING OBJECTIVES Describe how the problems of managing data resources in a traditional
Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies White Paper
Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies White Paper bex30102507wpor Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies Introduction Are you a database administrator
Chapter 5. Warehousing, Data Acquisition, Data. Visualization
Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives
Data Integration and ETL Process
Data Integration and ETL Process Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, second
Data Warehouse: Introduction
Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of Base and Mining Group of base and data mining group,
EMC: Managing Data Growth with SAP HANA and the Near-Line Storage Capabilities of SAP IQ
2015 SAP SE or an SAP affiliate company. All rights reserved. EMC: Managing Data Growth with SAP HANA and the Near-Line Storage Capabilities of SAP IQ Based on years of successfully helping businesses
Data Warehouse Snowflake Design and Performance Considerations in Business Analytics
Journal of Advances in Information Technology Vol. 6, No. 4, November 2015 Data Warehouse Snowflake Design and Performance Considerations in Business Analytics Jiangping Wang and Janet L. Kourik Walker
Data Warehousing Systems: Foundations and Architectures
Data Warehousing Systems: Foundations and Architectures Il-Yeol Song Drexel University, http://www.ischool.drexel.edu/faculty/song/ SYNONYMS None DEFINITION A data warehouse (DW) is an integrated repository
DATA WAREHOUSE AND DATA MINING NECCESSITY OR USELESS INVESTMENT
Scientific Bulletin Economic Sciences, Vol. 9 (15) - Information technology - DATA WAREHOUSE AND DATA MINING NECCESSITY OR USELESS INVESTMENT Associate Professor, Ph.D. Emil BURTESCU University of Pitesti,
SQL Server Business Intelligence on HP ProLiant DL785 Server
SQL Server Business Intelligence on HP ProLiant DL785 Server By Ajay Goyal www.scalabilityexperts.com Mike Fitzner Hewlett Packard www.hp.com Recommendations presented in this document should be thoroughly
DATA VALIDATION AND CLEANSING
AP12 Data Warehouse Implementation: Where We Are 1 Year Later Evangeline Collado, University of Central Florida, Orlando, FL Linda S. Sullivan, University of Central Florida, Orlando, FL ABSTRACT There
Cúram Business Intelligence Reporting Developer Guide
IBM Cúram Social Program Management Cúram Business Intelligence Reporting Developer Guide Version 6.0.5 IBM Cúram Social Program Management Cúram Business Intelligence Reporting Developer Guide Version
Implementing a Data Warehouse with Microsoft SQL Server
Course Code: M20463 Vendor: Microsoft Course Overview Duration: 5 RRP: 2,025 Implementing a Data Warehouse with Microsoft SQL Server Overview This course describes how to implement a data warehouse platform
WHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics
WHITE PAPER Harnessing the Power of Advanced How an appliance approach simplifies the use of advanced analytics Introduction The Netezza TwinFin i-class advanced analytics appliance pushes the limits of
The IBM Cognos Platform
The IBM Cognos Platform Deliver complete, consistent, timely information to all your users, with cost-effective scale Highlights Reach all your information reliably and quickly Deliver a complete, consistent
Microsoft. Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server
Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server Length : 5 Days Audience(s) : IT Professionals Level : 300 Technology : Microsoft SQL Server 2014 Delivery Method : Instructor-led
Course 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
The Data Warehouse ETL Toolkit
2008 AGI-Information Management Consultants May be used for personal purporses only or by libraries associated to dandelon.com network. The Data Warehouse ETL Toolkit Practical Techniques for Extracting,
Cúram Business Intelligence and Analytics Guide
IBM Cúram Social Program Management Cúram Business Intelligence and Analytics Guide Version 6.0.4 Note Before using this information and the product it supports, read the information in Notices at the
