TOWARDS IMPLEMENTING TOTAL DATA QUALITY MANAGEMENT IN A DATA WAREHOUSE
Journal of Information Technology Management, ISSN #, A Publication of the Association of Management

G. SHANKARANARAYANAN
INFORMATION SYSTEMS DEPARTMENT
BOSTON UNIVERSITY SCHOOL OF MANAGEMENT
[email protected]

ABSTRACT

Poor quality of data in a warehouse adversely impacts the usability of the warehouse, so managing data quality in a warehouse is very important. In this article we describe a framework for managing data quality in a data warehouse. This framework is of interest to both academics and practitioners as it offers an intuitive approach not just for managing data quality in a data warehouse but also for implementing total data quality management. The framework is based on the information product approach. Using this approach it integrates existing metadata in a warehouse with quality-related metadata and proposes a visual representation for communicating data quality to decision-makers. It allows decision-makers to gauge data quality in a context-dependent manner. The representation also helps implement capabilities that are integral components of total data quality management.

Keywords: Data Quality Management, Data Warehouse, Information Product, Metadata

INTRODUCTION

Data quality is an important issue in decision environments due to the large data volumes and the complex, data-intensive decision-tasks that they support. In organizations, capital losses and heightened risk exposure are increasingly being attributed to data quality issues such as accuracy, consistency, completeness, and timeliness. Decision support in such environments demands efficient data quality management (in this paper data quality and information quality are used interchangeably). Here we describe a framework for managing and evaluating data quality in one such environment, the data warehouse (DW).

Data quality issues are common in data warehouses, and administrators are concerned about the usability of this decision environment [5]. Although research has examined several approaches for managing data quality, very few have addressed data quality management in data warehouses. Jarke et al. propose a quality meta-model for a warehouse [5]. The meta-model allows users to define abstract quality goals for the content of a warehouse and offers a method to translate these goals into analysis queries that can be executed against the quality measurements. Although it does not explicitly address the evaluation of data quality, this research recognizes the subjective nature of the evaluation, i.e., the quality of data in a data warehouse is dependent on users and the decision-tasks it is used for. The same data may be evaluated differently by different users and even by the same user for two different decision-tasks. It is hence important that decision-makers be permitted to gauge data quality in the context of the decision-task. Existing methods for evaluating data quality adopt an impartial (or objective) view that perceives data quality to be independent of all other factors, including users and decision-tasks. Such impartial measurements of data quality associated with some data are not directly useful to the decision-maker, who must be able to gauge data quality in the context of the decision-task that the same
data is used for. We use the terms contextual and impartial to describe these two different approaches to evaluating data quality. The objective of the framework presented in this paper is to communicate quality-related information (including impartial measurements) and enable decision-makers to gauge (or evaluate) data quality in the context of the decision-task.

Neely presents a framework for analyzing data quality in source databases [9]. This research proposes a methodology for understanding data quality. Helfert and von Maur describe a data quality management system for a DW that incorporates the concepts of total data quality management (TDQM) [2]. Research such as the above, dealing with data quality in a data warehouse, provides different perspectives for examining data quality but does not address the evaluation of data quality in a warehouse. The framework described here assists in evaluating data quality and takes a step towards implementing TDQM in a data warehouse.

Data quality is typically evaluated along one or more of several quality dimensions [11]. Jarke et al. describe a set of quality dimensions for a warehouse [5]. Their multi-tiered model specifies that data quality is a higher-order dimension that is composed of, and can be evaluated using, a set of secondary dimensions: interpretability, usefulness, accessibility, and believability. Each of these secondary dimensions (e.g., believability) can in turn be evaluated using a set of primitive dimensions (e.g., completeness, consistency, credibility, and accuracy). One such primitive dimension, accuracy, is used as an example in this paper to illustrate two properties of the proposed framework: (1) how the framework can incorporate data quality dimensions and (2) how the framework can be used to evaluate data quality along such a dimension. The framework can hence supplement and improve existing methods for managing data quality in a data warehouse.

The foundation for this framework is the information product map (IPMAP), based on managing information as a product (IP-approach) and defined using the constructs proposed in [12]. The data in a data warehouse is created from multiple different data sources using a variety of processes (referred to as extraction, transformation, and loading, or ETL, processes) that extract the data, cleanse it, transform it, and aggregate it at different levels of granularity and across multiple dimensions. Further, the warehouse data may undergo several more transformations using additional processes (also referred to as ETL) prior to its delivery to a decision-maker. To comprehensively manage data quality in a warehouse it is essential to understand the complete set of processes and how the quality of data at each processing stage is impacted by its quality at previous stages. Further, the framework must help the decision-maker gauge data quality not only at the final stage but also at any of the preceding intermediate stages. To do so, it must help communicate the metadata associated with each stage to the decision-maker. All of these requirements mandate a systematic approach for understanding the back-end and front-end processes as well as managing the metadata at each stage in a data warehouse.
Although useful, conventional approaches to data quality management such as data cleansing [3], data tracking and statistical process control [11], data source calculus and algebra [7], and dimensional gap analysis [6] do not offer a systematic approach for managing data quality. They suffer from two key drawbacks: (1) these methods attempt to evaluate data quality in an impartial manner regardless of the decision-context that the data is used for, and (2) these methods do not permit the evaluation of data quality at one specific stage using data quality measurements associated with one or more preceding stages. In a data warehouse, where data is processed in stages and where the quality of data at one stage is dependent on the data quality measurements associated with preceding stages, it is difficult to use these techniques for managing data quality. Further, these techniques do not support the requirements for TDQM such as managing quality at source and the ability to trace back or predict the impacts of data quality problems observed/identified at some data processing stage.

The information product (IP) approach takes the view that an output (e.g., a management report) of an information system is a product that is manufactured by the system [13]. The IP approach has gained acceptance for several reasons. First, manufacturing an IP is akin to manufacturing a physical product. Raw materials, storage, assembly, processing, inspection, rework, and packaging (formatting) are all stages in its manufacture. Typical IPs are standard products and can be assembled in a production line. Components and/or processes of an IP may be outsourced to an external agency (ASP), organization, or a different business unit that uses a different set of computing resources. Second, IPs, like physical products, can be grouped based on common manufacturing stages, permitting the group to be managed as a whole, i.e., multiple IPs may share a subset of processes and data inputs and be created using a single production line with minor variations that distinguish each. Finally, proven methods for total quality management successfully applied in manufacturing can be adapted for TDQM. All these characteristics of data manufacture are applicable in a data warehouse. Data from multiple sources is cleansed, integrated, transformed, staged, and
loaded into the warehouse. At each stage, an IP is created that forms the input to the next stage. Decision-makers use this loaded data (also an IP) and create their own outputs (also IPs) by combining, analyzing, and aggregating this data and defining convenient visual schemes to view it. To understand the implications of poor-quality data, it is necessary to trace a quality problem in an IP to the manufacturing stage(s) that may have caused it, and to predict the IP(s) impacted by quality issues identified at some manufacturing step(s). The IP-approach and the graph-like IPMAP representation make it amenable for incorporating such capabilities. It helps communicate the metadata on processing performed in a data warehouse, thereby informing the user and allowing him/her to gauge data quality in the context of each decision-task that he/she is using the data for. It allows administrators to visualize the manufacture of the warehouse by treating a warehouse as an information product. It makes it easier to identify problems with data (quality) in a warehouse and trace each problem to one or more causes. To describe the framework we first define the metadata requirements for managing data quality in a warehouse. We then describe the IPMAP representation for capturing the metadata and communicating it to decision-makers. We also define the associations between the IPMAP and the corresponding warehouse metadata. We then show how data quality may be evaluated in the IPMAP using accuracy as a sample. Finally we describe the capabilities for implementing TDQM using the IPMAP.

FRAMEWORK FOR TDQM IN A DATA WAREHOUSE

The IPMAP modeling scheme offers six constructs to represent the manufacturing stages of an IP: (1) a data source (DS) block that represents each data source/provider of raw data used in the creation of the IP, (2) a processing (P) block that represents any manipulations and/or combinations of data items or the creation of new data items required to produce the IP, (3) a storage (S) block that represents a stage where data may wait prior to being processed, (4) an information system boundary (SB) block that represents the transition of data from one information system to another (e.g., transaction data in legacy file systems transferred to a relational database after some processing), (5) a data consumer (DC) block that represents the consumer, and (6) an inspection (I) block that represents predetermined inspections (validity checks, checks for missing values, authorizations, approvals, etc.). Though it may be viewed as a process, this block is used to differentiate a transport/transformation process from an inspection/validation process. The arrows between the constructs represent the raw/component data units that flow between the corresponding stages. An input obtained from a source is referred to as a raw data unit. Once a raw data unit is processed or inspected, it is referred to as a component data unit.
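As an illustration, the six constructs and the flows between them might be encoded as simple structures such as the following. This is a minimal sketch of our own; the class and field names are hypothetical and are not prescribed by the IPMAP scheme.

```python
from dataclasses import dataclass, field
from enum import Enum

class BlockType(Enum):
    """The six IPMAP constructs."""
    DATA_SOURCE = "DS"        # source/provider of raw data
    PROCESSING = "P"          # manipulation/combination of data items
    STORAGE = "S"             # data waits here prior to further processing
    SYSTEM_BOUNDARY = "SB"    # transition from one information system to another
    DATA_CONSUMER = "DC"      # consumer of the information product
    INSPECTION = "I"          # validity checks, approvals, and similar inspections

@dataclass
class Stage:
    stage_id: str             # unique identifier, e.g. "EP1"
    block_type: BlockType
    description: str = ""

@dataclass
class Flow:
    """A raw or component data unit flowing from one stage to another."""
    source: str               # stage_id of the upstream block
    target: str               # stage_id of the downstream block
    data_unit: str = ""       # name of the raw/component data unit

@dataclass
class IPMap:
    stages: dict = field(default_factory=dict)   # stage_id -> Stage
    flows: list = field(default_factory=list)    # list of Flow

    def add_stage(self, stage: Stage) -> None:
        self.stages[stage.stage_id] = stage

    def add_flow(self, flow: Flow) -> None:
        self.flows.append(flow)
```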
Consider a warehouse having (say) three dimensions and a set of facts (or measures). A generic, high-level sequence of steps that results in the warehouse is represented by the IPMAP in figure 1, parts a, b, and c. The data from data sources (DS1 and DS2) are extracted by extraction processes (EP1 and EP2) and cleansed (CP1 and CP2). The cleansed data from DS1 is inspected (manual or automated process I1) and stored (S1). This is combined with the cleansed data from DS2 by an integration process (INP1), inspected for errors (I2), and staged in storage S2. The staging of the other data (dimensions and facts) can similarly be represented, as shown in figure 1b. The staged fact data (in S5) may then be combined with the staged dimension data (in S2, S3, and S4) by a transformation process (TP) and loaded into the DW by the loading process (LP1). Though the transformation may be a single process, it is shown as multiple stages in figure 1c.

Iverson states that in order to improve data quality the metadata attributes must include process and procedure documentation such as data capture, storage, transformation rules, quality usage and metrics, and datatips on usage and feedback [4]. Access to this metadata must be layered to support better communication. So each construct in the IPMAP is supplemented with metadata about the manufacturing stage that it represents. This metadata includes (1) a unique identifier for each stage, (2) the composition of the data unit when it exits the stage, (3) the role and business unit responsible for that stage, (4) the individual(s) that may assume this role, (5) the processing steps needed to complete that manufacturing step, (6) the business rules/constraints associated with it, (7) a description of the technology used at this stage, and (8) the physical location where the step is performed. These help the decision-maker understand what the output from this step is, how it was created (including the business rules and constraints applicable), where it was created (both the physical location and the system used), and who is responsible for this stage in the manufacture, in addition to when (at what stage) an operation was performed.
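The eight metadata items above could be carried on each IPMAP stage roughly as follows. This is only an illustrative sketch; the field names and the example values (for the integration process INP1 of figure 1a) are hypothetical and not prescribed by the framework.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class StageMetadata:
    stage_id: str                  # (1) unique identifier for the stage
    output_composition: List[str]  # (2) data elements in the unit leaving the stage
    responsible_role: str          # (3) role and business unit responsible
    role_holders: List[str]        # (4) individual(s) that may assume this role
    processing_steps: str          # (5) steps to complete this manufacturing step
    business_rules: List[str]      # (6) business rules/constraints applied here
    technology: str                # (7) technology used at this stage
    location: str                  # (8) physical location where the step is performed

# Hypothetical metadata record for the integration process INP1 in figure 1a.
inp1_meta = StageMetadata(
    stage_id="INP1",
    output_composition=["customer_id", "region", "sales_amount"],
    responsible_role="ETL developer, Data Warehouse Group",
    role_holders=["J. Doe"],
    processing_steps="Merge the cleansed feeds from S1 and CP2 on customer_id",
    business_rules=["reject rows with a null customer_id"],
    technology="SQL-based integration job",
    location="Corporate data center",
)
```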
Figure 1a: IPMAP for staging data in a DW

Figure 1b: IPMAP for staging of dimensions and facts in a DW

Figure 1c: IPMAP representing the transformation and loading of a DW
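As a concrete illustration, the staging pipeline of figure 1a might be written down as plain data, mirroring the Stage and Flow structures sketched earlier; the encoding below is hypothetical and self-contained.

```python
# Block types per stage and directed flows between stages for figure 1a.
BLOCKS_1A = {
    "DS1": "DS", "DS2": "DS",          # data sources
    "EP1": "P", "EP2": "P",            # extraction processes
    "CP1": "P", "CP2": "P",            # cleansing processes
    "I1": "I", "I2": "I",              # inspection blocks
    "S1": "S", "S2": "S",              # storage/staging blocks
    "INP1": "P",                       # integration process
}
FLOWS_1A = [
    ("DS1", "EP1"), ("EP1", "CP1"), ("CP1", "I1"), ("I1", "S1"),
    ("DS2", "EP2"), ("EP2", "CP2"),
    ("S1", "INP1"), ("CP2", "INP1"),   # combine the two cleansed feeds
    ("INP1", "I2"), ("I2", "S2"),      # inspect for errors, then stage
]

# Sanity check: every flow endpoint must be a declared stage.
assert all(a in BLOCKS_1A and b in BLOCKS_1A for a, b in FLOWS_1A)
```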
Further, the metadata in a warehouse repository includes metadata on data sources, the objects within each source (schema), and the data elements in each. To assist data integration, the dependencies, constraints, and mappings between the source data elements are also maintained. The different pieces of metadata, along with the associations and inter-relationships between these pieces (also metadata), are shown in figure 2. In addition to these metadata elements, for the purpose of managing data quality in the warehouse, we need additional metadata elements (items #1, 2, 3, 4, and 7 listed in the previous paragraph).

Figure 2: Layered Conceptual Representation of DW and IPMAP metadata

When implementing the TDQM framework in a warehouse, the DQ-metadata repository will link to and reuse the metadata elements that already exist in the warehouse repository, besides capturing and managing the additional metadata elements. A conceptual data (ER) model of the DQ-metadata repository is shown in figure 3. For clarity, only the identifiers of entity classes and certain key attributes are shown. The complete list of attributes is in table 1.
Figure 3: Conceptual Data Model of Metadata for TDQM in a DW
Table 1: Metadata elements for TDQM in a DW (metadata entity class: attributes, or metadata items, tracked by the entity class)

Warehouse Data: Date loaded, Date updated, Currency (old/current) in the warehouse, associated data sources, associated extraction, cleansing, and transformation processes, whether (still) available in the data source, associated staged data elements, staged data sources

Data Sources: ID or unique name, Format type, Frequency of update, Active status

Source Data Objects (e.g., tables if the source is relational): Object name, Aliases, Business entity name, Business rules associated, Owner

Source Data: Element name, Units, Business rules, Computation method(s), Business name/alias, Data type, Data length, Range-Max, Range-Min, Date/time when it was included, [Constraint and participating source elements]

Staged Intermediate/Target Objects (typically relational tables or object classes): Object name, Aliases, Business entity name, Business rules associated, Owner, Creation date, Object status, Administrator

Intermediate/Target Data: Element name, Units, Business rules, Computation method(s), Business name/alias, Data type, Data length, Range-Max, Range-Min, Date/time when it was included or became effective, [Constraint and participating source elements]

Source Element to Target Element Mappings & Constraints: Derivation and business rules, assumptions on default and missing values, associations between source and target data elements

ETL Process Modules: ID and/or unique name, Creation date, Effective date, Owner, Role/business unit responsible, Modification date, Modified by, Reason for modification, Associated system/platform, Location in file system, Execution commands, Run date, Error codes/messages

Extraction Process: Applicable source data element(s), extraction rules, business restrictions/rules, last run date, error codes/messages, output data elements

Cleansing Process: Applicable source data element(s), sanitizing rules, business restrictions/rules, output data elements

Transformation Process: Input data element(s), transformation rules, business rules, output data elements

Load Process: Input data element(s), format/transformation rules, business rules, output warehouse data elements

EVALUATING DATA QUALITY

To illustrate the evaluation of data quality in an IPMAP using accuracy as an example, we treat accuracy as a perceived measure that may be evaluated subjectively. In certain situations, it is possible to evaluate accuracy in an objective manner. For instance, an objective measure of accuracy in databases might be computed as [Accuracy = 1 - (# of data items in error / total # of data items)]. For individual data elements it could be computed as [Accuracy = 1 - {(Correct Value - Actual Value Used) / Correct Value}]. In these situations, the correct value of the data element is known and is used in the assessment of error and the computation of accuracy. However, in most decision-tasks, the actual value of a data element is unknown at the time when it is used in the decision-task. Further, how accurate a data element needs to be is also dependent on the decision-task at hand. For example, a ballpark figure of the enrollment in a course may be sufficient to determine how many textbooks should be ordered. A more accurate enrollment figure is necessary when deciding which classroom (seating capacity) is appropriate for the course. In both cases, the correct value is unknown when the decision is made.
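For concreteness, the two objective measures above translate directly into code; this is a small illustrative sketch of ours, not part of the framework itself, and the example numbers are made up.

```python
def dataset_accuracy(items_in_error: int, total_items: int) -> float:
    """Accuracy = 1 - (# of data items in error / total # of data items)."""
    return 1.0 - (items_in_error / total_items)

def element_accuracy(correct_value: float, actual_value_used: float) -> float:
    """Accuracy = 1 - {(correct value - actual value used) / correct value}."""
    return 1.0 - (correct_value - actual_value_used) / correct_value

# Example: 25 erroneous rows out of 1,000 gives accuracy 0.975;
# using 190 where the correct enrollment is 200 gives accuracy 0.95.
print(dataset_accuracy(25, 1000), element_accuracy(200, 190))
```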
The raw data units that come in from data source blocks are assigned an accuracy value by the provider or by the decision-maker. The value assigned is between 0 and 1, with 1 indicating a very accurate value. Inspection blocks do not affect the accuracy of the data unit(s) that flow through them but may improve completeness. A processing block may combine raw and/or component data units to create a different component data unit. The accuracy of the output data element of a processing block depends on the processing performed, and determining a functional formula to express the accuracy of the output data element is difficult. The formula proposed here is based on a generic process that combines (collates) multiple data elements to create an output. It does not take into account the processing errors (affecting accuracy) that might be introduced by the process itself. For more complex processes that perform mathematical operations and can introduce process errors that affect accuracy, statistical techniques such as those proposed by Morey [8] can be used to calculate the accuracy of the output data element instead of the formula proposed here.

To compute the accuracy of the output data element from a processing block, the decision-maker may assign weights (continuous, between 0 and 1) to each input of the processing block; the output accuracy is then a weighted average of the accuracies of the inputs. For example, let there be n data elements flowing into one processing stage (say, x). Let A_i denote the specified accuracy of raw data element i (a computed value if it is a component data element). Let the decision-maker's perceived importance of data element i in the context of the decision-task be a_i. The accuracy of the output data element of stage x is:

A_x = [Σ_{i=1..n} (a_i * A_i)] / [Σ_{i=1..n} a_i]

For instance, in figure 1a, let A_S1 and A_CP2 be the accuracy measurements associated with the two inputs into process INP1; the accuracy of the output of INP1 is the weighted average of these two values. In the case of inspection and storage blocks, the accuracy of the output elements is the same as the accuracy of the corresponding input elements. For additional data elements introduced during an inspection, the inspector can assign new values for accuracy. The absence of an objective measure can result in the custodian of some data element (say, k) inflating the specified accuracy (A_k) of that data element due to vested interests. The weight a_k assigned by the decision-maker (the perceived importance of that data element) allows the decision-maker to adjust for such biased values. Organizations need to have some incentive schemes to reward unbiased evaluations.
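A minimal sketch of this propagation rule follows; it is our own illustration, and the weights and accuracies shown for INP1 are hypothetical.

```python
def output_accuracy(accuracies, weights):
    """Weighted-average accuracy of a processing block's output:
    A_x = sum(a_i * A_i) / sum(a_i)."""
    if len(accuracies) != len(weights):
        raise ValueError("one weight per input accuracy is required")
    return sum(a * A for a, A in zip(weights, accuracies)) / sum(weights)

# Example for INP1 in figure 1a: inputs from S1 and CP2 with specified
# accuracies 0.9 and 0.6, and perceived importances 0.8 and 0.4.
A_INP1 = output_accuracy(accuracies=[0.9, 0.6], weights=[0.8, 0.4])
print(round(A_INP1, 3))  # 0.8
```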
Capabilities for Managing Data Quality

The IPMAP offers three distinct capabilities for managing data quality and for implementing TDQM: (1) estimating time-to-deliver, (2) determining reachability, and (3) determining traceability.

The time-to-deliver of an IP (or any component data) is defined as the time to completely generate the IP (or component data) from any other processing stage in the IPMAP. It is often necessary to estimate the time it takes for some work-in-process (component data) to move from some intermediate stage in the IPMAP to a different stage in the same IPMAP. Consider a case where a required component data is unavailable, resulting in unacceptable product quality. Decision-makers may consider substitutes to raise product quality to acceptable levels. They can then evaluate the alternatives to identify the most suitable one based on time constraints. To facilitate this, the IPMAP supports metadata that defines the processing time associated with each stage. The processing time can be a deterministic value or stochastic, based on some distribution. Using these time-tags, time-to-deliver may be estimated using proven operations management techniques such as the Critical Path Method (CPM) or the Program Evaluation and Review Technique (PERT) (details in the appendix).

Reachability in the IPMAP is the ability to identify all production stages of an IP that can be reached from any given stage in the IPMAP. Stage y is reachable from stage x if there is a defined sequence of stages that constitutes a path from x to y in the IPMAP corresponding to that IP. Reachability plays an important role in identifying the impacts of quality errors. For example, if a data unit at some stage in the IPMAP is of poor quality, it would affect all the stages in the manufacture of one or more IPs that are reachable from this stage. To implement reachability we first map the IPMAP onto its corresponding graph, the IP-graph. The IP-graph is a directed graph. Each stage in the IPMAP is represented as a node in its corresponding graph. At this time we do not distinguish between the different types of blocks in the IPMAP. Each flow in the IPMAP from one stage (start) to another (end) is represented as a link between the two corresponding nodes in the graph with the associated direction. Given any IPMAP I, it can now be represented as an IP-graph G(N, L). Each node n ∈ N represents a block in I, and each link l ∈ L is defined by the ordered pair (x, y) where x, y ∈ N. This mapping process generates a mapping set P. Each member of P is an ordered pair <b, n> where b ∈ I and n ∈ G. The associated graph-theoretic proofs for reachability as well as the algorithm for implementing it are provided in the appendix.
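To illustrate how the IP-graph supports this capability, the sketch below builds a directed graph from the flows of an IPMAP and computes the set of stages reachable from a given stage; running the same traversal over reversed links would identify preceding stages. This is a simple depth-first sketch of our own, not the appendix algorithm, and the flows are those of figure 1a.

```python
from collections import defaultdict

def build_ip_graph(flows):
    """Map IPMAP flows (stage x -> stage y) onto a directed IP-graph,
    represented here as an adjacency list."""
    graph = defaultdict(set)
    for src, dst in flows:
        graph[src].add(dst)
    return graph

def reachable_from(graph, start):
    """All stages reachable from `start`, via a depth-first traversal."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Example using the figure 1a flows: a quality problem detected at CP2
# would propagate to every stage reachable from it.
flows_1a = [("DS1", "EP1"), ("EP1", "CP1"), ("CP1", "I1"), ("I1", "S1"),
            ("DS2", "EP2"), ("EP2", "CP2"), ("S1", "INP1"), ("CP2", "INP1"),
            ("INP1", "I2"), ("I2", "S2")]
graph = build_ip_graph(flows_1a)
print(reachable_from(graph, "CP2"))  # {'INP1', 'I2', 'S2'}
```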
Traceability in the IPMAP is defined as the ability to identify (trace) the sequence of one or more stages that precede any stage. The administrator/decision-maker can trace the source of a quality error in an IP to one or more preceding steps in its manufacture. The individual/role/department responsible can be identified using the metadata associated with each stage. This permits implementing quality-at-source, an integral part of TDQM. All of the above capabilities for TDQM can be implemented on the IPMAP using graph-based algorithms. Proofs of correctness and implementations are shown in the appendix.

Benefits of the DQ Framework

The proposed framework is for proactively managing data quality in a data warehouse. It defines the metadata requirements for managing data quality and then proposes a representation scheme, the IPMAP, for capturing these requirements and communicating them to decision-makers. We also show how this quality-related metadata supplements the existing metadata in a DW. The framework assists an organization in implementing TDQM in a warehouse by offering several capabilities. First, it helps warehouse administrators represent and visualize the manufacture of not just the warehouse but also the products (analytical reports) generated from it. The representation offers a convenient tool for communicating the metadata (on quality, sources, and manufacturing processes) to decision-makers. It permits decision-makers to evaluate data quality along accuracy, completeness, and timeliness and also to gauge source credibility. This, in turn, will improve the usability of the warehouse or, at the least, help administrators pinpoint the causes of data quality problems and effectively target remedial measures. It also facilitates the implementation of several capabilities such as traceability, reachability, and time-to-deliver, all of which are necessary for TDQM.

The specific questions that need to be answered to evaluate the usefulness of this framework are: (1) does the ability to view information about the manufacture of the decision-product impact the (perceived) quality of decisions made using that product? (2) Does the ability to evaluate product quality using quality dimensions impact the (perceived) quality of the decisions made? (3) Can the data quality evaluated using this framework be used as a measure for understanding the performance of a DW?

The framework is incorporated into a prototype decision support system (IPView) for managing data quality. IPView supports a GUI that permits decision-makers to create and visualize the IPMAP(s) and to access the metadata associated with each stage. IPView is being used to evaluate the usefulness of this framework and to answer the questions listed earlier. The preliminary evaluation indicates that it is more useful to certain types of decision-makers (experts more than novices) and more useful for certain types of decision-tasks (more involved/data-intensive compared to simple). In such tasks, the provision of the IPMAP or process metadata (data about the manufacture of the product) does have an impact on the outcome of the decision-task. This effect does not appear to be direct but seems to be mediated by the efficiency of the decision process.
The initial findings also indicate that the provision of quality metadata (impartial measurements of data quality along data quality dimensions) has a positive impact on the decision outcome, though the link does not appear as strong as its counterpart, process metadata. These findings appear to support the need for decision-makers to gauge quality using contextual factors, and process metadata does play a role in this evaluation process. It is also evident from our preliminary findings that the two pieces of metadata as a set have a stronger influence on the outcome of the decision-task than either of them individually. These findings are yet to be confirmed.

REFERENCES

[1] Ballou, D. P., Wang, R. Y., Pazer, H., & Tayi, G. K. (1998). Modeling Information Manufacturing Systems to Determine Information Product Quality. Management Science, 44(4).

[2] Helfert, M., & von Maur, E. (2001). A Strategy for Managing Data Quality in Data Warehouse Systems. In Proceedings of the International Conference on Information Quality, Boston, MA.

[3] Hernandez, M. A., & Stolfo, S. J. (1998). Real-world Data is Dirty: Data Cleansing and the Merge/Purge Problem. Journal of Data Mining and Knowledge Discovery, 1(2).

[4] Iverson, D. S. (2001). Meta-Information Quality. Keynote address by the Senior VP, Enterprise Information Solutions, Ingenix, at the International Conference on Information Quality, Boston, MA.

[5] Jarke, M., Jeusfeld, M., Quix, C., & Vassiliadis, P. (1999). Architecture and Quality in Data Warehouses: An Extended Repository Approach. Information Systems, 24(3).

[6] Kahn, B. K., Strong, D. M., & Wang, R. Y. (2002). Information Quality Benchmarks: Product and Service Performance. Communications of the ACM, 45(4).
[7] Lee, T., Bressan, S., & Madnick, S. (1998). Source Attribution for Querying Against Semi-structured Documents. Paper presented at the Workshop on Web Information and Data Management, ACM Conference on Information and Knowledge Management.

[8] Morey, R. C. (1982). Estimating and Improving the Quality of Information in the MIS. Communications of the ACM, 25(5).

[9] Neely, M. P. (2001). A Proposed Framework for the Analysis of Source Data in a Data Warehouse. In Proceedings of the International Conference on Information Quality, Boston, MA.

[10] Parssian, A., Sarkar, S., & Jacob, V. S. (1999). Assessing Data Quality for Information Products. In Proceedings of the International Conference on Information Systems, Charlotte, North Carolina.

[11] Redman, T. C. (Ed.). (1996). Data Quality for the Information Age. Boston, MA: Artech House.

[12] Shankaranarayanan, G., Wang, R. Y., & Ziad, M. (2003). Managing Data Quality in Dynamic Decision Environments: An Information Product Approach. Journal of Database Management, 14(4), October-December 2003.

[13] Wang, R. Y., Lee, Y. W., Pipino, L. L., & Strong, D. M. (1998). Manage Your Information as a Product. Sloan Management Review, 39(4), Summer.

AUTHOR BIOGRAPHY

G. Shankaranarayanan obtained his Ph.D. in Management Information Systems from The University of Arizona and is an assistant professor in the Information Systems Department at Boston University School of Management. His research areas include schema evolution in databases, heterogeneous and distributed databases, data modeling requirements and methods, and structures for and the management of metadata. Specific topics in metadata include metadata implications for data warehouses, metadata management for knowledge management systems/architectures, metadata management for data quality, and metadata models for mobile data services. His work has appeared in Decision Support Systems, the Journal of Database Management, and the Communications of the ACM.

Acknowledgements

This research was funded in part by the Boston University Institute for Leading in the Dynamic Economy (BUILDE).