DIMENSION HIERARCHIES UPDATES IN DATA WAREHOUSES A User-driven Approach
Cécile Favre, Fadila Bentayeb, Omar Boussaid
ERIC Laboratory, University of Lyon, 5 av. Pierre Mendès-France, Bron Cedex, France
[email protected], [email protected], [email protected]

Keywords: Data warehouse model evolution, dimension hierarchies updates, users' knowledge, aggregation rules.

Abstract: We designed a data warehouse for the French bank LCL to meet users' needs regarding decisions on marketing operations. However, the nature of the users' work implies that their requirements change often. In this paper, we propose an original and global approach to achieve a user-driven model evolution that answers personalized analysis needs. We developed a prototype called WEDriK (data Warehouse Evolution Driven by Knowledge) within the Oracle 10g DBMS and applied our approach to banking data of LCL.

1 INTRODUCTION

In the context of a collaboration with the French bank LCL-Le Crédit Lyonnais (LCL), the objective was to develop a decision support system for marketing operations. To achieve this objective, data of interest coming from heterogeneous sources have been extracted, filtered, merged, and stored in an integrating database, called a data warehouse. Due to the role of the data warehouse in the daily business work of an enterprise like LCL, the requirements for its design and implementation are dynamic and subjective. Therefore, data warehouse model design is a continuous process which has to reflect the changing environment, i.e. the data warehouse model must evolve over time in reaction to the enterprise's changes. Indeed, data sources are often autonomous and generally have an existence and purpose beyond that of supporting the data warehouse itself. An important consequence of this autonomy is that the sources may change without being controlled by the data warehouse administrator. Therefore, the data warehouse must be adapted to any change which occurs in the underlying data sources, e.g. changes of the schemas.

Besides these changes at the source level, the users often change their requirements. Indeed, the nature of their work implies that their requirements are personal and usually evolve; they never reach a final state. Moreover, these requirements can depend on the users' own knowledge. Our key idea consists in considering a specific kind of users' knowledge, which can provide new aggregated data. We represent this knowledge in the form of aggregation rules. In a previous work (Favre et al., 2006), we proposed a data warehouse formal model based on these aggregation rules to allow a schema evolution. In this paper, we extend our previous work and propose an original and complete data warehouse model evolution approach driven by emergent users' analysis needs, to deal not only with the schema evolution but also with the required evolution of the data. To perform this model evolution, we define algorithms that update dimension hierarchies by creating new granularity levels and load the required data by exploiting users' knowledge. Hence, our approach provides a real-time evolution of the analysis possibilities of the data warehouse to cope with personalized analysis needs. To validate our approach, we developed a prototype named WEDriK (data Warehouse Evolution Driven by Knowledge) within the Oracle 10g Database Management System (DBMS), and applied it to data of the French bank LCL.

The remainder of this paper is organized as follows.
First, we detail our global evolution approach in Section 2. We present implementation elements in Section 3, detailing the necessary algorithms. In Section 4, we show, through a running example extracted from the LCL bank case study, how to use our approach for analysis purposes. Then, we discuss the state of the art regarding model evolution in data warehouses in Section 5. We finally conclude and provide future research directions in Section 6.

2 A USER-DRIVEN APPROACH

In this section, we present our user-driven global approach for updating dimension hierarchies in order to integrate new analysis possibilities into an existing data warehouse (Figure 1). The first phase of our approach is the acquisition of the users' knowledge in the form of rules. In the integration phase, these rules are transformed into a mapping table within the DBMS. The evolution phase then creates a new granularity level. Finally, the analysis phase can be carried out on the updated data warehouse model. This is an incremental evolution process, since new analysis needs may appear and find an answer at any time. We detail the first three phases in this section.

Figure 1: Data warehouse user-driven evolution process.

2.1 Acquisition phase

In our approach, we consider a specific kind of users' knowledge, which determines the new aggregated data to integrate into the underlying data warehouse. More precisely, this knowledge defines how to aggregate data from a given level to another one to be created. It is represented in the form of if-then rules. To achieve the knowledge acquisition phase, our key idea is to define an aggregation meta-rule that represents the structure of the aggregation link. The meta-rule allows the user to define: the granularity level which will be created, the attributes characterizing this new level, the granularity level on which the created level is based and, finally, the attributes of the existing level used to build the aggregation link. Let EL be the existing level, EAi (i=1..m) be the set of m existing attributes of EL on which conditions are expressed, GL be the generated level and GAj (j=1..n) be the set of n generated attributes of GL. The meta-rule is as follows:

if ConditionOn(EL, EAi) then GenerateValue(GL, GAj)

Thus, the if-clause contains conditions on the attributes of the existing level. The then-clause contains the definition of the values of the generated attributes, which characterize the new level to be created. The user instantiates the above meta-rule to define the instances of the new aggregated level and the aggregation link with the instances of the existing level.

In the literature, different types of aggregation links can compose the dimension hierarchies to model real situations (Pedersen et al., 2001). In this paper we deal with classical links: an instance in a lower level corresponds to exactly one instance in the higher level, and each instance in the lower level is represented in the higher level. Thus, aggregation rules define a partition of the instances in the lower level, and each class of this partition is associated with one instance of the created level. To check that the aggregation rules satisfy this definition, we work with the relational representation of the aggregation rules, which is a mapping table. The transformation is achieved during the integration phase.
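As a small illustration (the level and attribute names here are hypothetical, not part of the LCL case study), a user could derive an AGE_CLASS level from a CUSTOMER level carrying an Age attribute:

if ConditionOn(CUSTOMER, Age < 26) then GenerateValue(AGE_CLASS, AgeClassLabel = 'young')
if ConditionOn(CUSTOMER, Age >= 26) then GenerateValue(AGE_CLASS, AgeClassLabel = 'other')

The two if-clauses partition the CUSTOMER instances, so each customer is mapped to exactly one AGE_CLASS instance, which is precisely the classical aggregation link described above.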
2.2 Integration phase

The integration phase consists in transforming the if-then rules into mapping tables and storing them in the DBMS by means of relational structures. For a given set of rules which defines a new granularity level, we associate one mapping table. To achieve this mapping process, we first exploit the aggregation meta-rule to build the structure of the mapping table, and second we use the aggregation rules to insert the required values into the created mapping table. Since we are in a relational context, each granularity level corresponds to one relational table. Let ET (resp. GT) be the existing (resp. generated) table, corresponding to EL (resp. GL) in a logical context. Let EAi (i=1..m) be the set of m attributes of ET on which conditions are expressed and GAj (j=1..n) be the set of n generated attributes of GT. The mapping table MT_GT built with the meta-rule corresponds to the following relation:

MT_GT(EA1, ..., EAi, ..., EAm, GA1, ..., GAj, ..., GAn)

The aggregation rules allow the insertion, into this mapping table, of the conditions on the attributes EAi and of the values of the generated attributes GAj.

In parallel, we define a mapping meta-table to gather information on the different mapping tables. This meta-table includes the following attributes: Mapping Table ID and Mapping Table Name correspond respectively to the identifier and the name of the mapping table; Attribute Name and Table Name denote the attribute involved in the evolution process and its table; Attribute Type provides the role of this attribute. More precisely, this last attribute has two modalities: conditioned if the attribute appears in the if-clause and generated if it appears in the then-clause. Note that even if an attribute is a generated attribute, it can be used as a conditioned one for the construction of another level.
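In a relational implementation, this meta-table can be declared as an ordinary table. The following Oracle-style DDL is a minimal sketch; the column names and types are ours, since the paper only lists the attributes:

CREATE TABLE MAPPING_META_TABLE (
  MappingTableID   NUMBER,        -- identifier of the mapping table
  MappingTableName VARCHAR2(30),  -- e.g. 'MT_AGENCY_TYPE'
  AttributeName    VARCHAR2(30),  -- attribute involved in the evolution process
  TableName        VARCHAR2(30),  -- table owning that attribute
  AttributeType    VARCHAR2(11)   -- role of the attribute
    CHECK (AttributeType IN ('conditioned', 'generated'))
);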
2.3 Evolution phase

The data warehouse model evolution consists in updating dimension hierarchies of the current data warehouse model by creating new granularity levels. The created level can be added at the end of a hierarchy or inserted between two existing ones; we thus speak of adding and inserting a level, respectively. In both cases, the rules have to determine the aggregation link between the lower level and the created level. In the second case, it is also necessary to determine, in an automatic way, the aggregation link between the inserted level and the existing higher level. The user can insert a level between two existing ones only if it is possible to semantically aggregate data from the inserted level to the existing higher level. For each type of dimension hierarchy update, we propose, in the following section, an algorithm which creates a new relational table and its necessary link(s) with the existing tables.

3 IMPLEMENTATION

To achieve the data warehouse model updating, we developed a prototype named WEDriK (data Warehouse Evolution Driven by Knowledge) within the Oracle 10g DBMS. It exploits different algorithms whose sequence is represented in Figure 2.

Figure 2: Algorithms sequence.

The starting point of the algorithms sequence is a set of rules expressed by a user to create a granularity level. These rules are transformed into a mapping table (Algorithm 1). Then we check the validity of the rules by testing the content of the mapping table (Algorithm 2). If the rules are not valid, the user has to modify them. This process is repeated until the rules become valid. If they are valid, the model evolution is carried out (Algorithm 3).

Transformation algorithm. For one meta-rule MR and its associated set of aggregation rules R, we build one mapping table MT_GT. The structure of the mapping table is designed with the help of the meta-rule, and the content of the mapping table is defined with the set of aggregation rules. Moreover, the necessary information is inserted in the meta-table MAPPING_META_TABLE (Algorithm 1).

Algorithm 1 Pseudo-code for transforming aggregation rules into a mapping table
Require: (1) the meta-rule MR: if ConditionOn(EL, EAi) then GenerateValue(GL, GAj), where EL is the existing level, EAi (i=1..m) is the set of m attributes of EL, GL is the level to be created, and GAj (j=1..n) is the set of n generated attributes of GL; (2) the set of aggregation rules R; (3) the mapping meta-table MAPPING_META_TABLE
Ensure: the mapping table MT_GT
Creation of MT_GT's structure
1: CREATE TABLE MT_GT (EAi, GAj)
Values insertion into MT_GT
2: for all rule of R do
3:   INSERT INTO MT_GT VALUES (ConditionOn(EL, EAi), GenerateValue(GL, GAj))
4: end for
Values insertion into MAPPING_META_TABLE
5: for all EAi, i=1..m do
6:   INSERT INTO MAPPING_META_TABLE VALUES (MT_GT, EL, EAi, 'conditioned')
7: end for
8: for all GAj, j=1..n do
9:   INSERT INTO MAPPING_META_TABLE VALUES (MT_GT, GL, GAj, 'generated')
10: end for
11: return MT_GT
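The paper does not give WEDriK's code; the following PL/SQL fragment is only a sketch of how Algorithm 1's dynamic table creation could be realized in Oracle 10g, assuming the meta-rule has already been parsed into column definition strings (all names here are illustrative):

DECLARE
  -- hypothetical inputs derived from the meta-rule
  v_mt_name VARCHAR2(30)  := 'MT_GT';
  v_ea_cols VARCHAR2(200) := 'EA1 VARCHAR2(100)';  -- conditioned attributes
  v_ga_cols VARCHAR2(200) := 'GA1 VARCHAR2(30)';   -- generated attributes
BEGIN
  -- build the mapping table structure from the meta-rule (Algorithm 1, line 1)
  EXECUTE IMMEDIATE 'CREATE TABLE ' || v_mt_name
                    || ' (' || v_ea_cols || ', ' || v_ga_cols || ')';
  -- register the attributes in the meta-table (Algorithm 1, lines 5-10)
  INSERT INTO MAPPING_META_TABLE VALUES (1, v_mt_name, 'EA1', 'ET', 'conditioned');
  INSERT INTO MAPPING_META_TABLE VALUES (1, v_mt_name, 'GA1', 'GT', 'generated');
  COMMIT;
END;
/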
Constraints checking algorithm. For each tuple of MT_GT, we write the corresponding query to build a view which contains a set of instances of the table ET. First, we check that the intersection of the views, considered pairwise, is empty. Second, we check that the union of the instances of all the views corresponds to the full set of instances of the table ET (Algorithm 2).

Algorithm 2 Pseudo-code for checking constraints
Require: existing table ET, mapping table MT_GT
Ensure: two booleans, Intersection_constraint_checked and Union_constraint_checked
Creating the views
1: for all tuple t of MT_GT do
2:   CREATE VIEW v_t AS SELECT * FROM ET WHERE Conjunction(t.EA)
3: end for
4: Intersection_constraint_checked = true; Union_constraint_checked = true
Checking that the intersection of the views, considered pairwise, is empty
5: for x = 1 to t_max - 1 do
6:   for y = x + 1 to t_max do
7:     I = SELECT * FROM v_x INTERSECT SELECT * FROM v_y
8:     if I ≠ Ø then
9:       Intersection_constraint_checked = false
10:     end if
11:   end for
12: end for
Checking that the union of the views corresponds to the table ET
13: U = SELECT * FROM v_1
14: for x = 2 to t_max do
15:   U = U UNION SELECT * FROM v_x
16: end for
17: if U ≠ SELECT * FROM ET then
18:   Union_constraint_checked = false
19: end if
20: return Intersection_constraint_checked, Union_constraint_checked
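Both checks can be expressed with plain Oracle set operators. The sketch below assumes three views v1, v2 and v3 built from the tuples of a mapping table over an AGENCY table (the names are illustrative):

-- the pairwise intersections must be empty (shown for one pair;
-- repeat for (v1, v3) and (v2, v3))
SELECT COUNT(*) AS overlap FROM (
  SELECT AgencyID FROM v1
  INTERSECT
  SELECT AgencyID FROM v2
);  -- must return 0

-- the union of all views must cover ET exactly: nothing in AGENCY
-- may remain outside the views
SELECT COUNT(*) AS uncovered FROM (
  SELECT AgencyID FROM AGENCY
  MINUS SELECT AgencyID FROM v1
  MINUS SELECT AgencyID FROM v2
  MINUS SELECT AgencyID FROM v3
);  -- must return 0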
Model evolution algorithm. The model evolution consists in the creation of the table GT and the definition of the required links, according to the evolution type (Algorithm 3). For an addition as for an insertion, it consists in creating the table GT, inserting values by exploiting the mapping table MT_GT, and then linking the new table GT with the existing table ET. For the insertion, the mapping table MT_GT only contains the link between the lower table ET and the generated table GT. We then have to automatically establish the aggregation link between GT and ET2, by inferring it from the one that exists between ET and ET2.

Algorithm 3 Pseudo-code for dimension hierarchy updating
Require: evol_type the type of evolution (inserting or adding a level), existing lower table ET, existing higher table ET2, mapping table MT_GT, EAi (i=1..m) the set of m attributes of MT_GT which contain the conditions on attributes, GAj (j=1..n) the set of n generated attributes of MT_GT which contain the values of the generated attributes, the key attribute GA_key of GT, the attribute B linking ET with ET2
Ensure: generated table GT
Creating the new table GT and making the aggregation link between ET and GT
1: CREATE TABLE GT (GA_key, GAj)
2: ALTER TABLE ET ADD (GA_key)
3: for all tuple t of MT_GT do
4:   INSERT INTO GT VALUES (GA_key, GAj)
5:   UPDATE ET SET GA_key WHERE ConditionOn(EAi)
6: end for
If it is a level insertion: linking GT with ET2
7: if evol_type = inserting then
8:   ALTER TABLE GT ADD (B)
9:   UPDATE GT SET B = (SELECT DISTINCT B FROM ET WHERE ET.GA_key = GT.GA_key)
10: end if
11: return GT
4 A RUNNING EXAMPLE

To illustrate our approach, we use a case study defined by the French bank LCL. LCL is a large company, where the data warehouse users have different points of view. Thus, they need specific analyses, which depend on their own knowledge and their own objectives. We applied our approach to data about the annual Net Banking Income (NBI). The NBI is the profit obtained from the management of customers' accounts. It is a measure observed according to several dimensions: CUSTOMER, AGENCY and YEAR (Figure 3).

Figure 3: Data warehouse model for the NBI analysis.

To illustrate the usage of WEDriK, let us focus on a simplified running example. Consider the person in charge of student products. He knows that there are three types of agencies: student for agencies which gather only student accounts, foreigner for agencies whose customers do not live in France, and classical for agencies without any particularity. He needs to obtain an NBI analysis according to the agency type. The existing data warehouse cannot provide such an analysis. We show step by step how to achieve this analysis with our approach. Let us consider the samples of the tables AGENCY and TF-NBI (Figure 4).

Figure 4: Samples of tables.

1-Acquisition phase. The acquisition phase allows the user to define the following aggregation meta-rule in order to specify the structure of the aggregation link for the agency type:

if ConditionOn(AGENCY, AgencyID) then GenerateValue(AGENCY_TYPE, AgencyTypeLabel)

He instantiates the meta-rule to define the different agency types:

(R1) if AgencyID ∈ {01903, 01905, ...} then AgencyTypeLabel = 'student'
(R2) if AgencyID = ... then AgencyTypeLabel = 'foreigner'
(R3) if AgencyID ∉ {01903, 01905, 02256, ...} then AgencyTypeLabel = 'classical'

2-Integration phase. The integration phase exploits the meta-rule and the different aggregation rules to generate the mapping table MT_AGENCY_TYPE (Figure 5). The information concerning the mapping table is inserted in MAPPING_META_TABLE (Figure 6).

Figure 5: Mapping table for the AGENCY_TYPE level.

Figure 6: Mapping meta-table.
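As a relational sketch of this step, the mapping table can be rendered in SQL as follows. The stored conditions are assumed here to be textual predicates; the identifier 02256 in the foreigner rule is an inference from R3's exclusion list rather than a value given in the text, and the elided parts of the identifier lists are abridged:

-- structure derived from the meta-rule: one conditioned, one generated attribute
CREATE TABLE MT_AGENCY_TYPE (
  AgencyID_cond   VARCHAR2(100),  -- condition on AGENCY.AgencyID
  AgencyTypeLabel VARCHAR2(30)    -- generated attribute of AGENCY_TYPE
);

-- one tuple per aggregation rule (R1-R3)
INSERT INTO MT_AGENCY_TYPE
  VALUES ('AgencyID IN (''01903'',''01905'')', 'student');
INSERT INTO MT_AGENCY_TYPE
  VALUES ('AgencyID = ''02256''', 'foreigner');  -- '02256' inferred from R3
INSERT INTO MT_AGENCY_TYPE
  VALUES ('AgencyID NOT IN (''01903'',''01905'',''02256'')', 'classical');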
3-Evolution phase. The evolution phase then creates the AGENCY_TYPE table and updates the AGENCY table (Figure 7). More precisely, the MAPPING_META_TABLE allows the creation of the structure of the AGENCY_TYPE table, and a primary key is automatically added. The MT_AGENCY_TYPE mapping table allows the insertion of the required values. Furthermore, the AGENCY table is updated to be linked with the AGENCY_TYPE table, with the addition of the AgencyTypeID attribute.

Figure 7: Created AGENCY_TYPE table and updated AGENCY table.
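A minimal SQL sketch of this evolution step, following Algorithm 3 (the AgencyTypeID values are illustrative, since Figure 7 is not reproduced):

-- create the new level, with an automatically added primary key
CREATE TABLE AGENCY_TYPE (
  AgencyTypeID    NUMBER PRIMARY KEY,
  AgencyTypeLabel VARCHAR2(30)
);
-- link the existing level to the new one
ALTER TABLE AGENCY ADD (AgencyTypeID NUMBER);
-- one INSERT/UPDATE pair per tuple of MT_AGENCY_TYPE (shown for R1)
INSERT INTO AGENCY_TYPE VALUES (1, 'student');
UPDATE AGENCY SET AgencyTypeID = 1 WHERE AgencyID IN ('01903', '01905');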
4-Analysis phase. Finally, the analysis phase exploits the model along the new analysis axes. Usually, in an OLAP environment, queries require the computation of aggregates over various dimension levels. Indeed, given a data warehouse model, the analysis process summarizes data by using (1) aggregation operators such as SUM and (2) GROUP BY clauses. In our case, the user wants to know the sum of the NBI according to the agency types he has defined. The corresponding query and the results are provided in Figure 8.

Figure 8: Query and results for NBI analysis.
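The query of Figure 8 is not reproduced here; it is presumably equivalent to the following sketch, which assumes that the fact table TF-NBI is named TF_NBI, carries the measure in an NBI column, and references AGENCY through an AgencyID foreign key:

SELECT t.AgencyTypeLabel, SUM(f.NBI) AS sum_nbi
FROM TF_NBI f
JOIN AGENCY a ON a.AgencyID = f.AgencyID
JOIN AGENCY_TYPE t ON t.AgencyTypeID = a.AgencyTypeID
GROUP BY t.AgencyTypeLabel;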
5 RELATED WORK

When designing a data warehouse model, involving users is crucial (Kimball, 1996). What about the data warehouse model evolution? According to the literature, model evolution in data warehouses can take the form of model updating or temporal modeling.

Concerning model updating, only one model is supported and the trace of the evolutions is not preserved. In this case, the data historization can be corrupted and analyses could be erroneous. We distinguish three types of approaches. The first approach consists in providing evolution operators that allow an evolution of the model (Hurtado et al., 1999; Blaschka et al., 1999). The second approach consists in updating the model, focusing on enriching dimension hierarchies. For instance, some authors propose to enrich dimension hierarchies with new granularity levels by exploiting semantic relations provided by WordNet (Mazón and Trujillo, 2006). The third approach is based on the hypothesis that a data warehouse is a set of materialized views. Then, when a change occurs in a data source, it is necessary to maintain the views by propagating this change (Bellahsene, 2002).

Contrary to model updating, temporal modeling makes it possible to keep track of the evolutions by using temporal validity labels. These labels are associated with either dimension instances (Bliujute et al., 1998), or aggregation links (Mendelzon and Vaisman, 2000), or versions (Eder and Koncilia, 2001; Body et al., 2002). Versioning is the subject of much current work. Versions are also used to answer what-if analyses by creating versions that simulate a situation (Bébel et al., 2004). Various works then address the analysis of data across versions, in order to achieve the first objective of data warehousing: analyzing data over time (Morzy and Wrembel, 2004; Golfarelli et al., 2006). In these works, an extension of the traditional SQL language is required to take into account the particularities of the approaches for analysis or data loading.

Neither model updating nor temporal modeling directly involves users in the data warehouse evolution process; they thus constitute solutions for data sources evolution rather than for the evolution of users' analysis needs. These solutions make the data warehouse model evolve without taking into account new analysis needs driven by users' knowledge, and thus without providing an answer to personalized analysis needs. However, the personalization of the use of a data warehouse becomes crucial. Works in this domain are particularly focused on the selection of the data to be visualized, based on users' preferences (Bellatreche et al., 2005). In contrast to this approach, we add supplementary data to extend and personalize the analysis possibilities of the data warehouse. Moreover, our approach updates the data warehouse model without inducing erroneous analyses, and it does not require language extensions for analysis.

6 CONCLUSION AND FUTURE WORK

In this paper, we aimed at providing an original user-driven approach for data warehouse model evolution. Our key idea is to involve users in the updating of dimension hierarchies to provide an answer to their personalized analysis needs based on their own knowledge. To achieve this process, we propose a method to acquire users' knowledge, together with the required algorithms. We developed a prototype named WEDriK within the Oracle 10g DBMS and applied our approach to the LCL case study. To the best of our knowledge, the idea of user-defined evolution of dimensions (based on analysis needs) is novel; there are still many aspects to be explored. First of all, we intend to study the performance of our approach in terms of storage space, response time and algorithmic complexity. Moreover, we have to study how individual users' needs evolve in time, and thus we have to consider different strategies for updating rules and mapping tables. Finally, we are also interested in investigating the joint evolution of the data sources and the analysis needs.

REFERENCES

Bébel, B., Eder, J., Koncilia, C., Morzy, T., and Wrembel, R. (2004). Creation and Management of Versions in Multiversion Data Warehouse. In XIXth ACM Symposium on Applied Computing (SAC 04), Nicosia, Cyprus.

Bellahsene, Z. (2002). Schema Evolution in Data Warehouses. Knowledge and Information Systems, 4(3).

Bellatreche, L., Giacometti, A., Marcel, P., Mouloudi, H., and Laurent, D. (2005). A Personalization Framework for OLAP Queries. In VIIIth ACM International Workshop on Data Warehousing and OLAP (DOLAP 05), Bremen, Germany.

Blaschka, M., Sapia, C., and Höfling, G. (1999). On Schema Evolution in Multidimensional Databases. In Ist International Conference on Data Warehousing and Knowledge Discovery (DaWaK 99), Florence, Italy.

Bliujute, R., Saltenis, S., Slivinskas, G., and Jensen, C. (1998). Systematic Change Management in Dimensional Data Warehousing. In IIIrd International Baltic Workshop on Databases and Information Systems, Riga, Latvia.

Body, M., Miquel, M., Bédard, Y., and Tchounikine, A. (2002). A Multidimensional and Multiversion Structure for OLAP Applications. In Vth ACM International Workshop on Data Warehousing and OLAP (DOLAP 02), McLean, Virginia, USA, pages 1-6.

Eder, J. and Koncilia, C. (2001). Changes of Dimension Data in Temporal Data Warehouses. In IIIrd International Conference on Data Warehousing and Knowledge Discovery (DaWaK 01).

Favre, C., Bentayeb, F., and Boussaid, O. (2006). A Knowledge-driven Data Warehouse Model for Analysis Evolution. In XIIIth International Conference on Concurrent Engineering: Research and Applications (CE 06), Antibes, France.
Golfarelli, M., Lechtenbörger, J., Rizzi, S., and Vossen, G. (2006). Schema Versioning in Data Warehouses: Enabling Cross-Version Querying via Schema Augmentation. Data and Knowledge Engineering, 59(2).

Hurtado, C. A., Mendelzon, A. O., and Vaisman, A. A. (1999). Maintaining Data Cubes under Dimension Updates. In XVth International Conference on Data Engineering (ICDE 99), Sydney, Australia.

Kimball, R. (1996). The Data Warehouse Toolkit. John Wiley & Sons.

Mazón, J.-N. and Trujillo, J. (2006). Enriching Data Warehouse Dimension Hierarchies by Using Semantic Relations. In XXIIIrd British National Conference on Databases (BNCOD 2006), Belfast, Northern Ireland, volume 4042 of LNCS.

Mendelzon, A. O. and Vaisman, A. A. (2000). Temporal Queries in OLAP. In XXVIth International Conference on Very Large Data Bases (VLDB 00), Cairo, Egypt.

Morzy, T. and Wrembel, R. (2004). On Querying Versions of Multiversion Data Warehouse. In VIIth ACM International Workshop on Data Warehousing and OLAP (DOLAP 04), Washington, DC, USA.

Pedersen, T. B., Jensen, C. S., and Dyreson, C. E. (2001). A Foundation for Capturing and Querying Complex Multidimensional Data. Information Systems, 26(5).
