DATA WAREHOUSE DESIGN
ICDE 2001 Tutorial
Stefano Rizzi, Matteo Golfarelli
DEIS - University of Bologna, Italy

Motivation
Building a data warehouse for an enterprise is a huge and complex task, which requires accurate planning aimed at devising satisfactory answers to organizational and architectural questions. Despite the pressing demand for working solutions coming from enterprises and the wide range of advanced technologies offered by vendors, few attempts have been made to devise a specific methodology for data warehouse design. On the other hand, statistical reports on DW project failures state that a major cause lies in the absence of a global view of the design process: in other words, in the absence of a design methodology.

Summary
- Introduction to Data Warehousing
- Conceptual design of Data Warehouses
- Workload-based logical design for ROLAP
- Indexes for physical design

Introduction to Data Warehousing
Stefano Rizzi

Information Systems: profile and role
Information systems are rooted in the relationship between information, decision and control. An IS should collect and classify information, by means of integrated and suitable procedures, in order to produce, in time and at the right levels, the synthesis needed to support the decision-making process, as well as to administer and globally control the enterprise activity.

Information as a resource
Information is a resource of increasing value, required by managers to schedule and monitor the enterprise activities effectively. Information is the raw material transformed by information systems, just as unfinished products are transformed by manufacturing systems.
[Figure: Manufacturing system -> Finished product; Information system -> Information]

Value of information
Information is an enterprise resource like capital, raw materials, plants and people; thus, it has a cost. Hence, understanding the value of information is important.
[Figure: information pyramid with axes Value and Amount; levels: Strategic directions, Reports, Selected information, Primary information sources]

Different kinds of information systems
[Figure: types of information systems (ESS, MIS, DSS, KWS, OAS, TPS) arranged by level (Strategic, Management, Knowledge, Operational), by user group (senior managers, middle managers, knowledge and data workers, operational managers) and by functional area (sales and marketing, manufacturing, finance, accounting, human resources)]

The Data Warehouse phenomenon
Usual complaints:
- We have tons of data but we cannot access them!
- How can people playing the same role produce substantially different results?
- We want to slice and dice data in any possible way!
- Show me only what is important!
- Everyone knows some data are incorrect...
(R. Kimball, The Data Warehouse Toolkit)

Data Warehousing
A collection of technologies and tools supporting the knowledge worker (executive, manager, analyst) in analysing data, aimed at decision making and at improving the knowledge assets of the enterprise.

Data Warehouse
At the core of the architecture of modern information systems, it is a data repository:
- Oriented to subjects
- Integrated and consistent
- Representing temporal evolution
- Non-volatile
The data warehouse is regularly refreshed, permanently growing, logically centralised, easily accessed by users and essentially read-only.

[Figure: data warehouse architecture. Operational data (relational, legacy) and external data pass through ETL tools into the warehouse, which also holds summary data; access is through analysis tools (OLAP), data mining, what-if analysis and reporting tools]

Data Marts
[Figure: the data warehouse feeds data marts through replication and broadcasting; example data marts: marketing, finance, geographical regions, client management, supplier management]

Subject vs Process
[Figure: emphasis on applications versus emphasis on subjects; labels shown: region, patient, charge, consumption, reservations, medical reports, admissions]

Integration and consistency
[Figure: databases, text files and external data are brought into the DW by wrappers, mediators and loaders. Steps shown: schema integration, extraction, transformation, cleaning, validation, filtering, loading]

Temporal evolution
OLTP:
- Current values
- Restricted historical content
- Often time is not included in keys
- Data are updated
DW:
- Snapshots
- Rich historical content
- Time is included in keys
- Snapshots cannot be updated
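
The contrast between current values and snapshots can be illustrated with a minimal Python sketch. The account key, the dates and the `load_snapshot` helper are hypothetical names chosen for illustration; they do not come from the tutorial.

```python
from datetime import date

# OLTP style: one current value per key; an update overwrites the old value.
oltp_balance = {}
oltp_balance["acct-1"] = 100
oltp_balance["acct-1"] = 80  # the previous value is lost

# DW style: time is part of the key; snapshots are appended, never updated.
dw_balance = {}


def load_snapshot(store, key, snapshot_date, value):
    """Append a snapshot and refuse to overwrite an existing one."""
    if (key, snapshot_date) in store:
        raise ValueError("snapshots cannot be updated")
    store[(key, snapshot_date)] = value


load_snapshot(dw_balance, "acct-1", date(1999, 10, 9), 100)
load_snapshot(dw_balance, "acct-1", date(1999, 10, 10), 80)

# The full history of the key is still available in the DW-style store.
history = sorted((d, v) for (k, d), v in dw_balance.items() if k == "acct-1")
```

Here `history` keeps both values in date order, while `oltp_balance` only remembers the latest one.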

Non-volatility
[Figure: OLTP supports update, insert and delete; the DW supports load and access]
- Huge data volumes: from 20 GB to some TBs in a few years
- In a DW, no advanced techniques for transaction management are required (unlike OLTP systems)
- Key issues are query throughput and resilience

DW vs. OLTP
DW:
- 90% ad hoc queries
- Mostly read access
- Hundreds of users
- Denormalised
- Supports historical versions
- Optimised for accesses involving most of the database
- Based on summary data
OLTP:
- 90% predefined transactions
- Read/write access
- Thousands of users
- Normalised
- Does not support historical versions
- Optimised for accesses involving a small fraction of the database
- Based on elemental data

ROLAP (Relational OLAP)
- Intermediate-level server between a relational back-end server and the front-end client
- Specialised middleware
- Generation of SQL multi-statements for the back-end server
- Query scheduling

MOLAP (Multidimensional OLAP)
- Direct support of multidimensional views
- Special data structures (e.g., multidimensional arrays)
- Compression techniques
- Intelligent disk/memory caching
- Pre-computation
- Complex analysis

The technological progress
[Figure: refinement from data to knowledge; stages shown: Statistics & reporting, Data Warehousing, OLAP, Data Mining, Pattern Warehousing. Source: Information Discovery]

The Data Warehouse Market
[Figure: market segments shown: RDBMS, OLAP, Data Marts, ETL, Data Quality, Metadata. Source: Shilakes, Tylman - Enterprise Information Portals]

The DW life-cycle
- Objective definition and planning: clearly determine the scope, define the borders, estimate dimensions, choose the approach to design, evaluate the benefits
- Infrastructure design: choose the technologies and the tools, analyse the architectural solutions, solve the management problems
- Design and implementation of applications: iteratively add new data marts and applications to the warehouse

Bibliography
R. Barquin, S. Edelstein. Planning and Designing the Data Warehouse. Prentice Hall (1996).
S. Chaudhuri, U. Dayal. An overview of data warehousing and OLAP technology. SIGMOD Record 26, 1 (1997).
G. Colliat. OLAP, relational and multidimensional database systems. SIGMOD Record 25, 3 (1996).
M. Demarest. The politics of data warehousing.
U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth. Data mining and knowledge discovery in databases: an overview. Comm. of the ACM 39, 11 (1996).
W.H. Inmon. Building the data warehouse. John Wiley & Sons (1996).
S. Kelly. Data Warehousing in Action. John Wiley & Sons (1997).
R. Kimball. The data warehouse toolkit. John Wiley & Sons (1996).
R. Kimball, L. Reeves, M. Ross, W. Thornthwaite. The Data Warehouse Lifecycle Toolkit. John Wiley & Sons (1998).
C. Shilakes, J. Tylman. Enterprise Information Portals.
P. Vassiliadis. Gulliver in the land of data warehousing: practical experiences and observations of a researcher. Proc. DMDW 2000 (2000).
J. Widom. Research Problems in Data Warehousing. Proc. CIKM (1995).

Conceptual modelling for Data Warehousing
Stefano Rizzi

Why a new conceptual model?
While it is universally recognised that a DW leans on a multidimensional model, there is no agreement on the approach to conceptual modelling. On the other hand, an accurate conceptual design is the necessary foundation for building a good information system. The Entity/Relationship model is widespread in enterprises, but:
"Entity relation data models [...] cannot be understood by users and they cannot be navigated usefully by DBMS software. Entity relation models cannot be used as the basis for enterprise data warehouses." (Kimball, 96)

The multidimensional data model
[Figure: a Sales cube with axes Store, Product and Time; the highlighted regions correspond to the following examples]
- Number of Coke cans sold at BIGSTORES in London on 10/10/99
- Number of Pepsi cans sold at all BIGSTORES on 10/10/99
- Number of Fanta cans globally sold
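
The three examples above can be read as sums over a cube in which some coordinates are fixed and the others are left free. The following Python sketch shows one possible reading; the `Sale` record, the store names and all quantities are invented for illustration.

```python
from collections import namedtuple

# One cube cell: coordinates (product, store, date) plus the measure qty.
Sale = namedtuple("Sale", "product store date qty")

sales = [
    Sale("Coke", "BIGSTORE London", "1999-10-10", 120),
    Sale("Pepsi", "BIGSTORE London", "1999-10-10", 40),
    Sale("Pepsi", "BIGSTORE Rome", "1999-10-10", 60),
    Sale("Fanta", "BIGSTORE Rome", "1999-10-11", 30),
    Sale("Fanta", "BIGSTORE London", "1999-10-10", 25),
]


def total_qty(cells, **fixed):
    """Sum qty over every cell whose coordinates match the fixed values."""
    return sum(
        cell.qty
        for cell in cells
        if all(getattr(cell, dim) == value for dim, value in fixed.items())
    )


# A single cell: all three coordinates are fixed.
coke_london = total_qty(
    sales, product="Coke", store="BIGSTORE London", date="1999-10-10"
)
# A line of the cube: the store coordinate is left free.
pepsi_all_stores = total_qty(sales, product="Pepsi", date="1999-10-10")
# A plane of the cube: only the product is fixed.
fanta_global = total_qty(sales, product="Fanta")
```

The fewer coordinates are fixed, the larger the region of the cube that is aggregated.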

Basic terminology
- Fact (cube, target). It is a focus of interest for the decision-making process; typically, it models an event occurring in the enterprise world (sales, shipments, purchases). It is essential for a fact to have some dynamic aspects, i.e., to evolve somehow across time.
- Measures (attributes, variables, metrics, properties). They are continuously valued (typically numerical) attributes which describe a fact from different points of view. For instance, each sale is measured by its revenue.
- Dimensions. They are discrete attributes which determine the minimum granularity adopted to represent facts. Typical dimensions for the sale fact are product, store and date.
- Hierarchies (dimensions). They contain dimension attributes (levels, parameters) connected in a tree-like structure by many-to-one relationships (functional dependencies).

DW modelling in the literature
Golfarelli et al. 98; Gyssens, Lakshmanan 97; Hüsemann et al. 00; Vassiliadis 98; Agrawal et al. 95; Sapia et al. 98; Datta, Thomas 97; Cabibbo, Torlone 98; Tryfona et al. 99; Franconi, Sattler 99; Li, Wang 96
[Figure: the proposals listed above, classified as CONCEPTUAL or LOGICAL]
[Figure: the proposals listed above, classified as FORMAL or GRAPHICAL]
[Figure: the proposals listed above, highlighting those that define an ALGEBRA]
[Figure: the proposals listed above, highlighting those that address DESIGN]

Conceptual models
Sapia, Blaschka, Höfling, Dinter (1998)
[Figure: example scheme; constructs shown: dimension level, attribute, roll-up relationship, fact relationship]

Conceptual models (2)
Franconi, Sattler (1999)
[Figure: example scheme; constructs shown: dimension, target, property, level, aggregated entity]

Conceptual models (3)
Hüsemann, Lechtenbörger, Vossen (2000)
[Figure: example scheme; constructs shown: fact, measure, dimension, optional dimension, dimension level, property attribute, optional property attribute, aggregation path]

The Dimensional Fact Model
The Dimensional Fact Model (DFM) is a graphical conceptual model for DWs, which aims to:
- Effectively support conceptual design;
- Provide an environment where user queries can be formulated intuitively;
- Enable communication between the designer and the final user in order to refine the requirement specification;
- Supply a stable platform for logical design;
- Provide expressive and unambiguous documentation.
The DFM is independent of the target logical model (multidimensional or relational).

The Dimensional Fact Model (2)
Three levels of conceptual documentation are provided:
- Fact scheme: represents a fact of interest and the associated measures, dimensions and hierarchies.
- Data Mart scheme: summarizes the fact schemes which constitute each data mart and emphasizes the feasible connections between them.
- Data Warehouse scheme: shows the different data marts, emphasizing their overlaps, the different profiles of the users accessing them, and the operational sources which feed them.
Each documentation level is complemented by glossaries which explain the names adopted within the schemes, define a connection between the DW data and the operational sources, and express data volumes. Data mart schemes are associated with the workload specification.

Fact schemes
[Figure: the SALE fact scheme, with callouts for fact, measure, dimension, dimension attribute and hierarchy. Measures: qty sold, revenue, unit price, no. of customers. Attributes shown: date, month, quarter, year, week, day of week, holiday; product, type, category, department, marketing group, brand, brand city; store, store city, county, state, sales manager, sale district]
A fact expresses a many-to-many relationship between its dimensions.

Fact schemes (2)
A non-dimension attribute contains additional information about a dimension attribute, and is typically connected to it by a one-to-one relationship. It cannot be used for aggregation. Some links between attributes can be optional.
[Figure: the SALE fact scheme with callouts for optionality and non-dimension attribute; additional labels shown: diet, address, phone, promotion, begin, end, cost, price reduction, ad type]

Fact schemes (3)
- Convergence
- Cross-dimension attributes
- Additivity, non-additivity, non-aggregability
- Overlap
[Figure: the SALE fact scheme with fiscal calendar attributes (fiscal year, fiscal quarter, fiscal month, fiscal week) and callouts for convergence, cross-dimension attribute (V.A.T.) and non-aggregability]

The SHIPMENTS fact scheme
[Figure: FACT SCHEME: SHIPMENT TO STORES. Measures: qty shipped, shipping cost. Attributes shown: product, type, category, department, marketing group, brand, brand city; week, month, quarter, year, fiscal week, fiscal month, fiscal quarter, fiscal year, day of week; warehouse, warehouse city, warehouse state; store, store city, store state; mode, type, carrier]

The INVENTORY fact scheme
[Figure: FACT SCHEME: INVENTORY. Measure: level, annotated with AVG, MIN. Attributes shown: product, type, category, department, marketing group, brand, brand city, units per pallet, package type, package size, weight; week, month, quarter, year, fiscal week, fiscal month, fiscal quarter, fiscal year, day of week; warehouse, warehouse city, warehouse nation]
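
The AVG, MIN annotation on the inventory level is read here as meaning that the measure is not additive along time: levels taken on different days should be averaged (or minimised), not summed, while levels in different warehouses on the same day can be summed. This reading, the sample figures and both function names are assumptions made for this Python sketch.

```python
from collections import defaultdict

# (product, warehouse, day) -> inventory level; made-up figures.
inventory = {
    ("pasta", "W1", "d1"): 10,
    ("pasta", "W1", "d2"): 20,
    ("pasta", "W2", "d1"): 30,
    ("pasta", "W2", "d2"): 40,
}


def level_by_day(cells):
    """Aggregate away the warehouse: levels on the same day can be summed."""
    totals = defaultdict(int)
    for (product, _warehouse, day), level in cells.items():
        totals[(product, day)] += level
    return dict(totals)


def avg_level_by_warehouse(cells):
    """Aggregate away the day: levels over time are averaged, not summed."""
    groups = defaultdict(list)
    for (product, warehouse, _day), level in cells.items():
        groups[(product, warehouse)].append(level)
    return {key: sum(levels) / len(levels) for key, levels in groups.items()}


per_day = level_by_day(inventory)
per_warehouse = avg_level_by_warehouse(inventory)
```

Summing along time would report 30 units of pasta in W1, a quantity that was never in stock on any day; the average of 15 is the meaningful figure.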

The supply chain
[Figure: chain of fact schemes linked by shared dimensions (component, factory, product, package type, warehouse, store, mode, promotion): PRODUCTION OF COMPONENTS, COMPONENT DELIVERY, COMPONENT INVENTORY, MANUFACTURING, PACKAGING, SHIPMENT TO WAREHOUSE, WAREHOUSE INVENTORY, SHIPMENT TO STORES, SALES]

Glossaries
ATTRIBUTE GLOSSARY: SHIPMENT TO STORES
name              description                      domain           card.
product                                            products         5000
brand                                              brands           800
brand city        Where brands are manufactured    cities           50
type              (pasta, soft drink, ...)         pr. types        200
category          (food, clothing, music, ...)     pr. categories   10
department        Deps. managing categories        deps.            5
marketing group   Responsible for product types    groups           20
store                                              stores           100
store city                                         cities           80
store state                                        states           5
Query for the product attributes: select prodname, brandname, cityname, ... from PRODUCTS P, BRANDS B, CITIES C, ... where P.brandId = B.brandId and B.cityId = C.cityId and ...
Query for the store attributes: select storename, cityname, statename from STORES S, CITIES C where S.cityId = C.cityId

MEASURE GLOSSARY: SHIPMENT TO STORES (sparsity = 0.01)
name            description                              type      query
qty shipped     Quantity of each product being shipped   INTEGER   select SUM(PS.qty) from PRODUCTS P, SHIP S, PRODSHIP PS, ... where P.prodId = PS.prodId and PS.shipId = S.shipId and ... group by P.prodId, S.date, ...
shipping cost   Cost of the shipment                     MONEY
refresh frequency: 1 per week; refresh technique: periodic complete
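
The cardinalities and the sparsity recorded in the glossaries can feed a rough data-volume estimate. The Python sketch below assumes that sparsity is the fraction of coordinate combinations that actually hold a fact, so that the expected number of fact rows is the product of the dimension cardinalities times the sparsity. That formula is an assumption of this sketch, not a statement from the slides. The product and store cardinalities come from the glossary above; the 365-day date cardinality is invented, and the warehouse and mode dimensions are left out for brevity.

```python
from math import prod


def estimate_fact_rows(dimension_cardinalities, sparsity):
    """Estimate fact rows as (product of cardinalities) * sparsity."""
    max_cells = prod(dimension_cardinalities.values())
    return round(max_cells * sparsity)


cardinalities = {
    "product": 5000,  # from the attribute glossary
    "store": 100,     # from the attribute glossary
    "date": 365,      # assumed: one year of daily data
}

rows = estimate_fact_rows(cardinalities, sparsity=0.01)
```

With these inputs the cube has 182,500,000 possible cells, of which roughly one in a hundred is expected to be populated.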

Data mart schemes
The data mart scheme is used to summarize the fact schemes which constitute the data mart and to show drill-across connections between them. It is a graph whose nodes are elemental and overlapped fact schemes; the arcs are directed to each overlapped scheme from its component schemes, which in turn may be overlapped.
[Figure: DATA MART SCHEME: SUPPLY CHAIN. Nodes shown: PRODUCTION OF COMPONENTS, COMPONENT DELIVERY, COMPONENT INVENTORY, MANUFACTURING, PACKAGING, SHIPMENT TO WAREHOUSE, WAREHOUSE INVENTORY, SHIPMENT TO STORES, SALE, PRODUCTION AND DELIVERY, DELIVERY AND INVENTORY, MANUFACTURING AND PACKAGING, DISTRIBUTION CYCLE, PRODUCT CYCLE, SHIPMENT AND SALE]

The workload
In principle, the workload for a data mart is dynamic and unpredictable. In some commercial tools, the actual workload is monitored while the DW is operating, and the logical and physical schemes are dynamically tuned. We claim that a core workload can, and should, be determined a priori:
- The user typically knows in advance which kind of data analysis (s)he will carry out most often for decisional or statistical purposes;
- A substantial number of queries are aimed at extracting summary data to fill standard reports.

The workload (2)
[Figure: the SHIPMENT TO STORES fact scheme]

Data warehouse schemes
At the highest abstraction level, the data warehouse scheme shows the different data marts, emphasizing the fact schemes duplicated on two or more of them, the different profiles of the users accessing them, and the operational sources which feed them.
[Figure: data warehouse scheme. Legend: data mart, user, fact scheme, operational db. Upper-case labels shown: SALES, SUPPLY CHAIN, PERSONNEL, RENOVATION, DEMAND CHAIN. Other labels shown: personnel, personnel database, buyer, incentives, administrative, purchases, file transfer, product database, orders, claims, sale, executive, restoration works, manual input]

Conceptual design of Data Warehouses
Stefano Rizzi

Designing the DW
Within a successful approach to DW design, top-down and bottom-up strategies should be mixed.
- When planning a DW, a bottom-up approach should be followed: one data mart at a time is identified and prototyped.
- Each data mart is designed in a top-down fashion by building a conceptual scheme for each fact of interest.

Data Mart prototyping
Prototype first the data mart which:
- plays the most strategic role for the enterprise;
- can convince the final users of the potential benefits;
- leans on available and consistent data sources.
[Figure: data marts DM1 to DM5 fed by Source 1, Source 2 and Source 3]

Reference architecture
[Figure: heterogeneous operational dbs, reconciled data, DW]
Problem of designing the reconciled data (integration of heterogeneous sources).

[Figure: sample tables. Store table columns: store key, store, city, region, address, sales manager. Fact table columns: time key, store key, product key, qty sold, revenue, no. of customers]

Methodological framework
[Figure: design phases and actors. Phases: analysis of the operational db, requirement specification, conceptual design, workload refinement, logical design, physical design. Actors: designer, final user, db administrator]
DWs are based on a pre-existing information system.

Methodological framework (2)
[Figure: inputs and outputs of the design phases. CONCEPTUAL DESIGN: E/R scheme, relational scheme, facts, preliminary workload, conceptual scheme. LOGICAL DESIGN: workload, target logical model, logical scheme. PHYSICAL DESIGN: workload, target DBMS, physical scheme]

Conceptual design of the data mart
Design is based on the documentation of the underlying operational information system:
- E/R schemes
- Relational schemes
(Golfarelli, Maio, Rizzi 98; Cabibbo, Torlone 98; Moody, Kortink 00; Hüsemann, Lechtenbörger, Vossen 00)
Steps:
- Find facts
- For each fact: navigate functional dependencies, drop useless attributes, define dimensions and measures

Finding facts
- Within an E/R scheme, a fact is represented by either an entity F or an n-ary relationship between entities E1, ..., En.
- Within a relational scheme, a fact is represented by a relation F.
The entities and relationships representing frequently updated archives are good candidates to define facts; those representing nearly-static archives are not.

Navigating functional dependencies
Build a tree in which:
- each vertex corresponds to an attribute of the scheme;
- the root corresponds to the identifier (key) of F;
- for each vertex v, the corresponding attribute functionally determines all the attributes corresponding to the descendants of v.

Example (from the E/R scheme):
[Figure: E/R scheme of the sales domain. Entities shown: PRODUCT, TYPE, CATEGORY, DEPARTMENT, MARKETING GROUP, BRAND, WAREHOUSE, PURCHASE TICKET, STORE, SALE DISTRICT, CITY, COUNTY, STATE. Attributes shown include: diet, size, weight, unit price, qty, ticket number, address, phone, district no.]
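
The tree-building rule above can be sketched in Python as a traversal of the functional dependencies starting from the key of the fact. The `fds` dictionary is a simplified, partly assumed subset of the sales example, and both function names are hypothetical.

```python
def build_attribute_tree(root, determines):
    """Return {attribute: [children]} for everything reachable from root.

    `determines` maps an attribute to the attributes it directly determines.
    Each attribute is placed in the tree once, under the first vertex that
    reaches it.
    """
    tree = {}

    def visit(vertex):
        tree[vertex] = []
        for child in determines.get(vertex, []):
            if child not in tree:
                tree[vertex].append(child)
                visit(child)

    visit(root)
    return tree


def descendants(tree, vertex):
    """All attributes functionally determined by `vertex` in the tree."""
    found = set()
    stack = list(tree[vertex])
    while stack:
        node = stack.pop()
        found.add(node)
        stack.extend(tree[node])
    return found


# Simplified functional dependencies (an assumption for this sketch).
fds = {
    "sale": ["product", "ticket number", "qty", "unit price"],
    "ticket number": ["date", "store"],
    "product": ["type", "brand"],
    "type": ["category"],
    "category": ["department"],
    "store": ["city"],
    "city": ["county"],
    "county": ["state"],
}

tree = build_attribute_tree("sale", fds)
```

In the resulting tree, `descendants(tree, "store")` yields city, county and state, which are exactly the attributes that a store determines.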

Example (from the E/R scheme):
[Figure: attribute tree rooted in sale. Attributes shown: product, qty, unit price, ticket number, store, type, category, dept., mark. grp., brand, diet, weight, size, city, county, state, address, phone, sales manager, district no+state, district no]

Dropping useless attributes
Some attributes in the tree may be uninteresting for the DW. In order to drop useless levels of detail, it is possible to apply the following operators:
- Pruning: delete a vertex and its subtree.
- Grafting: delete a vertex and move its subtree. It is useful when an attribute is not interesting but the attributes it determines must be preserved.
[Figure: pruning and grafting applied to the ticket number and store branch of the tree (sales manager, address, ticket number, store, city, state)]
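
A minimal Python sketch of the two operators follows, with the attribute tree stored as a dictionary from each vertex to its list of children. The representation, the function names and the small sample tree are assumptions made for illustration.

```python
def find_parent(tree, vertex):
    """Return the parent of `vertex`, or None if it is the root."""
    for parent, children in tree.items():
        if vertex in children:
            return parent
    return None


def prune(tree, vertex):
    """Pruning: delete `vertex` together with its whole subtree."""
    parent = find_parent(tree, vertex)
    if parent is not None:
        tree[parent].remove(vertex)
    stack = [vertex]
    while stack:
        node = stack.pop()
        stack.extend(tree.pop(node, []))


def graft(tree, vertex):
    """Grafting: delete `vertex` and move its children up to its parent."""
    parent = find_parent(tree, vertex)
    if parent is None:
        raise ValueError("the root cannot be grafted")
    position = tree[parent].index(vertex)
    tree[parent][position:position + 1] = tree.pop(vertex)


tree = {
    "sale": ["ticket number", "qty"],
    "ticket number": ["store"],
    "store": ["address", "city"],
    "address": [],
    "city": ["state"],
    "state": [],
    "qty": [],
}

graft(tree, "ticket number")  # store becomes a child of the root
prune(tree, "city")           # city and state disappear
```

Grafting keeps store and everything below it, which is why it suits an uninteresting attribute such as ticket number; pruning discards the whole branch.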

Defining dimensions
The choice of dimensions determines the fact granularity.
- Dimensions must be chosen among the root children in the attribute tree.
- Time should always be a dimension.
[Figure: the attribute tree for sale after pruning and grafting]

Defining measures
- Measures must be chosen among the children of the root.
- Typically, measures are computed either by counting the number of instances of F, or by summing (averaging, ...) expressions which involve numerical attributes.
- An attribute cannot be both a measure and a dimension.
- A fact may have no measures.
[Figure: the attribute tree for sale after pruning and grafting]
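
Once dimensions are chosen, each measure is obtained by aggregating the operational rows that share the same dimension values. The Python sketch below assumes product, store and date as dimensions and derives three of the SALE measures; the sample rows, the `build_fact` name and the choice of counting distinct tickets as the number of customers are assumptions of this sketch.

```python
from collections import defaultdict

# Operational rows: (ticket number, product, store, date, qty, unit price).
rows = [
    (1, "pasta", "S1", "1999-10-10", 2, 1.5),
    (1, "cola", "S1", "1999-10-10", 1, 2.0),
    (2, "pasta", "S1", "1999-10-10", 3, 1.5),
    (3, "pasta", "S2", "1999-10-10", 1, 1.5),
]


def build_fact(operational_rows):
    """Group rows by (product, store, date) and compute the measures."""
    cells = defaultdict(
        lambda: {"qty_sold": 0, "revenue": 0.0, "tickets": set()}
    )
    for ticket, product, store, day, qty, price in operational_rows:
        cell = cells[(product, store, day)]
        cell["qty_sold"] += qty          # summing a numerical attribute
        cell["revenue"] += qty * price   # summing an expression
        cell["tickets"].add(ticket)      # counting distinct instances
    return {
        key: {
            "qty_sold": cell["qty_sold"],
            "revenue": cell["revenue"],
            "no_of_customers": len(cell["tickets"]),
        }
        for key, cell in cells.items()
    }


fact = build_fact(rows)
```

Ticket number is deliberately not a dimension here, so the two pasta purchases at S1 collapse into one fact instance with two customers.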

Granularity
Defining the granularity of data is a primary issue in determining performance.
- Granularity depends on the queries users are interested in, and represents a trade-off between query response time and the level of detail to be stored.
- It may be worth adopting a finer granularity than that required by users, provided that this does not slow down the system too much.
- Granularity is constrained by the maximum time frame for loading.
Choosing granularity includes defining the refresh interval. Issues to be considered:
- Availability of operational data
- Workload characteristics
- The total time period to be analysed

WAND: a CASE tool for data warehouse design
A design methodology is almost useless if no CASE tool is provided to support it. Supported steps:
- Acquire the relational db scheme via ODBC
- Carry out conceptual design
- Define the workload
- Calculate data volume
- Carry out logical design
- Create the documentation (including loading/feeding queries)

Bibliography
K. Aberer, K. Hemm. A methodology for building a data warehouse in a scientific environment. Proc. 1st Int. Conf. on Cooperative Inf. Systems, Brussels (1996).
R. Agrawal, A. Gupta, S. Sarawagi. Modeling multidimensional databases. IBM Research Report, IBM Almaden Research Center (1995).
M. Blaschka et al. Finding your way through multidimensional data models. Proc. DEXA 98 (1998).
L. Cabibbo, R. Torlone. A logical approach to multidimensional databases. EDBT 98 (1998).
A. Datta, H. Thomas. A conceptual model and algebra for on-line analytical processing in data warehouses. Proc. WITS 97 (1997).
E. Franconi, U. Sattler. A data warehouse conceptual model for multidimensional aggregation. Proc. DMDW 99 (1999).
M. Golfarelli, D. Maio, S. Rizzi. The Dimensional Fact Model: a conceptual model for data warehouses. Int. Jour. of Cooperative Inf. Systems 7, 2&3 (1998).
M. Golfarelli, S. Rizzi. Designing the data warehouse: key steps and crucial issues. Jour. of Computer Science and Information Management 2, 3 (1999).
M. Gyssens, L.V.S. Lakshmanan. A foundation for multi-dimensional databases. Proc. 23rd VLDB, Athens, Greece (1997).
B. Hüsemann, J. Lechtenbörger, G. Vossen. Conceptual data warehouse design. Proc. DMDW 00 (2000).
R. Kimball. The data warehouse toolkit. John Wiley & Sons (1996).
D. Moody, M. Kortink. From enterprise models to dimensional models: a methodology for data warehouse and data mart design. Proc. DMDW 00 (2000).
T. Bach Pedersen, C. Jensen. Multidimensional data modelling for complex data. Proc. 15th ICDE, Sydney (1999).
C. Sapia et al. Extending the E/R model for the multidimensional paradigm. Proc. ER 98 (1998).
N. Tryfona, F. Busborg, J. Christiansen. starER: A Conceptual Model for Data Warehouse Design. Proc. DOLAP 99 (1999).
P. Vassiliadis. Modeling multidimensional databases, cubes and cube operations. Proc. 10th SSDBM Conf., Capri, Italy (1998).
Goal-Oriented Requirement Analysis for Data Warehouse Design
Goal-Oriented Requirement Analysis for Data Warehouse Design Paolo Giorgini University of Trento, Italy Stefano Rizzi University of Bologna, Italy Maddalena Garzetti University of Trento, Italy Abstract
Overview. DW Source Integration, Tools, and Architecture. End User Applications (EUA) EUA Concepts. DW Front End Tools. Source Integration
DW Source Integration, Tools, and Architecture Overview DW Front End Tools Source Integration DW architecture Original slides were written by Torben Bach Pedersen Aalborg University 2007 - DWML course
When to consider OLAP?
When to consider OLAP? Author: Prakash Kewalramani Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 03/10/08 Email: [email protected] Abstract: Do you need an OLAP
Metadata Management for Data Warehouse Projects
Metadata Management for Data Warehouse Projects Stefano Cazzella Datamat S.p.A. [email protected] Abstract Metadata management has been identified as one of the major critical success factor
An Introduction to Data Warehousing. An organization manages information in two dominant forms: operational systems of
An Introduction to Data Warehousing An organization manages information in two dominant forms: operational systems of record and data warehouses. Operational systems are designed to support online transaction
Presented by: Jose Chinchilla, MCITP
Presented by: Jose Chinchilla, MCITP Jose Chinchilla MCITP: Database Administrator, SQL Server 2008 MCITP: Business Intelligence SQL Server 2008 Customers & Partners Current Positions: President, Agile
Data Warehousing. Jens Teubner, TU Dortmund [email protected]. Winter 2015/16. Jens Teubner Data Warehousing Winter 2015/16 1
Jens Teubner Data Warehousing Winter 2015/16 1 Data Warehousing Jens Teubner, TU Dortmund [email protected] Winter 2015/16 Jens Teubner Data Warehousing Winter 2015/16 13 Part II Overview
Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina
Data Warehousing Read chapter 13 of Riguzzi et al Sistemi Informativi Slides derived from those by Hector Garcia-Molina What is a Warehouse? Collection of diverse data subject oriented aimed at executive,
A Critical Review of Data Warehouse
Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 95-103 Research India Publications http://www.ripublication.com A Critical Review of Data Warehouse Sachin
Jagir Singh, Greeshma, P Singh University of Northern Virginia. Abstract
224 Business Intelligence Journal July DATA WAREHOUSING Ofori Boateng, PhD Professor, University of Northern Virginia BMGT531 1900- SU 2011 Business Intelligence Project Jagir Singh, Greeshma, P Singh
IST722 Data Warehousing
IST722 Data Warehousing Components of the Data Warehouse Michael A. Fudge, Jr. Recall: Inmon s CIF The CIF is a reference architecture Understanding the Diagram The CIF is a reference architecture CIF
2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000
2074 : Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 Introduction This course provides students with the knowledge and skills necessary to design, implement, and deploy OLAP
Fluency With Information Technology CSE100/IMT100
Fluency With Information Technology CSE100/IMT100 ),7 Larry Snyder & Mel Oyler, Instructors Ariel Kemp, Isaac Kunen, Gerome Miklau & Sean Squires, Teaching Assistants University of Washington, Autumn 1999
14. Data Warehousing & Data Mining
14. Data Warehousing & Data Mining Data Warehousing Concepts Decision support is key for companies wanting to turn their organizational data into an information asset Data Warehouse "A subject-oriented,
MDM and Data Warehousing Complement Each Other
Master Management MDM and Warehousing Complement Each Other Greater business value from both 2011 IBM Corporation Executive Summary Master Management (MDM) and Warehousing (DW) complement each other There
A Comparative Study on Operational Database, Data Warehouse and Hadoop File System T.Jalaja 1, M.Shailaja 2
RESEARCH ARTICLE A Comparative Study on Operational base, Warehouse Hadoop File System T.Jalaja 1, M.Shailaja 2 1,2 (Department of Computer Science, Osmania University/Vasavi College of Engineering, Hyderabad,
Why Business Intelligence
Why Business Intelligence Ferruccio Ferrando z IT Specialist Techline Italy March 2011 page 1 di 11 1.1 The origins In the '50s economic boom, when demand and production were very high, the only concern
IMPROVING THE QUALITY OF THE DECISION MAKING BY USING BUSINESS INTELLIGENCE SOLUTIONS
IMPROVING THE QUALITY OF THE DECISION MAKING BY USING BUSINESS INTELLIGENCE SOLUTIONS Maria Dan Ştefan Academy of Economic Studies, Faculty of Accounting and Management Information Systems, Uverturii Street,
Chapter 3 - Data Replication and Materialized Integration
Prof. Dr.-Ing. Stefan Deßloch AG Heterogene Informationssysteme Geb. 36, Raum 329 Tel. 0631/205 3275 [email protected] Chapter 3 - Data Replication and Materialized Integration Motivation Replication:
Data Warehousing Concepts
Data Warehousing Concepts JB Software and Consulting Inc 1333 McDermott Drive, Suite 200 Allen, TX 75013. [[[[[ DATA WAREHOUSING What is a Data Warehouse? Decision Support Systems (DSS), provides an analysis
LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES
LITERATURE SURVEY ON DATA WAREHOUSE AND ITS TECHNIQUES MUHAMMAD KHALEEL (0912125) SZABIST KARACHI CAMPUS Abstract. Data warehouse and online analytical processing (OLAP) both are core component for decision
Information assets are immensely valuable to any enterprise, and because of this,
1 CHAPTER Introduction to Data Warehousing Information assets are immensely valuable to any enterprise, and because of this, these assets must be properly stored and readily accessible when they are needed.
Outline. Data Warehousing. What is a Warehouse? What is a Warehouse?
Outline Data Warehousing What is a data warehouse? Why a warehouse? Models & operations Implementing a warehouse 2 What is a Warehouse? Collection of diverse data subject oriented aimed at executive, decision
Data warehouse Architectures and processes
Database and data mining group, Data warehouse Architectures and processes DATA WAREHOUSE: ARCHITECTURES AND PROCESSES - 1 Database and data mining group, Data warehouse architectures Separation between
A Methodology for the Conceptual Modeling of ETL Processes
A Methodology for the Conceptual Modeling of ETL Processes Alkis Simitsis 1, Panos Vassiliadis 2 1 National Technical University of Athens, Dept. of Electrical and Computer Eng., Computer Science Division,
OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP
Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key
OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH
OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 Online Analytic Processing OLAP 2 OLAP OLAP: Online Analytic Processing OLAP queries are complex queries that Touch large amounts of data Discover
This tutorial will help computer science graduates to understand the basic-toadvanced concepts related to data warehousing.
About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This
Data warehousing. Han, J. and M. Kamber. Data Mining: Concepts and Techniques. 2001. Morgan Kaufmann.
Data warehousing Han, J. and M. Kamber. Data Mining: Concepts and Techniques. 2001. Morgan Kaufmann. KDD process Application Pattern Evaluation Data Mining Task-relevant Data Data Warehouse Selection Data
Data Warehouse Logical Modeling and Design (6)
Data Warehouse Logical Modeling and Design (6) Bernard ESPINASSE Professeur à Aix-Marseille Université (AMU) Ecole Polytechnique Universitaire de Marseille October 2013 Methodological framework Logical
Data Warehousing and OLAP
1 Data Warehousing and OLAP Hector Garcia-Molina Stanford University Warehousing Growing industry: $8 billion in 1998 Range from desktop to huge: Walmart: 900-CPU, 2,700 disk, 23TB Teradata system Lots
Moving Large Data at a Blinding Speed for Critical Business Intelligence. A competitive advantage
Moving Large Data at a Blinding Speed for Critical Business Intelligence A competitive advantage Intelligent Data In Real Time How do you detect and stop a Money Laundering transaction just about to take
The Study on Data Warehouse Design and Usage
International Journal of Scientific and Research Publications, Volume 3, Issue 3, March 2013 1 The Study on Data Warehouse Design and Usage Mr. Dishek Mankad 1, Mr. Preyash Dholakia 2 1 M.C.A., B.R.Patel
Data Warehousing, OLAP, and Data Mining
Data Warehousing, OLAP, and Marek Rychly [email protected] Strathmore University, @ilabafrica & Brno University of Technology, Faculty of Information Technology Advanced Databases and Enterprise Systems
A Review of Data Warehousing and Business Intelligence in different perspective
A Review of Data Warehousing and Business Intelligence in different perspective Vijay Gupta Sr. Assistant Professor International School of Informatics and Management, Jaipur Dr. Jayant Singh Associate
DATA WAREHOUSE E KNOWLEDGE DISCOVERY
DATA WAREHOUSE E KNOWLEDGE DISCOVERY Prof. Fabio A. Schreiber Dipartimento di Elettronica e Informazione Politecnico di Milano DATA WAREHOUSE (DW) A TECHNIQUE FOR CORRECTLY ASSEMBLING AND MANAGING DATA
Data Mart/Warehouse: Progress and Vision
Data Mart/Warehouse: Progress and Vision Institutional Research and Planning University Information Systems What is data warehousing? A data warehouse: is a single place that contains complete, accurate
Business Intelligence, Analytics & Reporting: Glossary of Terms
Business Intelligence, Analytics & Reporting: Glossary of Terms A B C D E F G H I J K L M N O P Q R S T U V W X Y Z Ad-hoc analytics Ad-hoc analytics is the process by which a user can create a new report
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH
IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria
INTEGRATION OF HETEROGENEOUS DATABASES IN ACADEMIC ENVIRONMENT USING OPEN SOURCE ETL TOOLS
INTEGRATION OF HETEROGENEOUS DATABASES IN ACADEMIC ENVIRONMENT USING OPEN SOURCE ETL TOOLS Azwa A. Aziz, Abdul Hafiz Abdul Wahid, Nazirah Abd. Hamid, Azilawati Rozaimee Fakulti Informatik, Universiti Sultan
DATA WAREHOUSING - OLAP
http://www.tutorialspoint.com/dwh/dwh_olap.htm DATA WAREHOUSING - OLAP Copyright tutorialspoint.com Online Analytical Processing Server OLAP is based on the multidimensional data model. It allows managers,
A DATA WAREHOUSE SOLUTION FOR E-GOVERNMENT
A DATA WAREHOUSE SOLUTION FOR E-GOVERNMENT Xiufeng Liu 1 & Xiaofeng Luo 2 1 Department of Computer Science Aalborg University, Selma Lagerlofs Vej 300, DK-9220 Aalborg, Denmark 2 Telecommunication Engineering
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 [email protected]
