Valuation Factors for the Necessity of Data Persistence in Enterprise Data Warehouses on In-Memory Databases
Author: Thorsten Winsemann, Otto-von-Guericke-Universität Magdeburg, Germany (Kanalstraße 18, Hamburg)

Supervisor: Prof. Dr. rer. nat. habil. Gunter Saake, Faculty of Computer Science, Department for Technical & Operational Information Systems, Otto-von-Guericke-Universität Magdeburg, Germany (Universitätsplatz 2, Magdeburg)

Abstract. ETL (extraction, transformation, and loading) and data staging processes in Enterprise Data Warehouses have always been critical because of the time and resources they consume. In most cases, the staging processes are accompanied by persistent storage of transformed data to achieve reasonable performance when the data are accessed for analysis and other purposes. Persisting this often redundant data requires considerable maintenance effort, for instance for updating it and for guaranteeing its consistency. At the same time, flexibility of data usage and speed of data availability are limited. Column-based in-memory databases (IMDB) in particular make it possible to query large data volumes with good response times. Progress in in-memory technology therefore raises the question of how much persistence is necessary in such warehouses. The answer cannot be based on cost models alone; other aspects have to be taken into account as well. The presented thesis project deals with the problem of defining valuation factors to support the decision whether to store data in Enterprise Data Warehouses based on in-memory databases.

Keywords. Enterprise Data Warehouse, In-Memory Database, Persistence
1 Research Topic

1.1 Prologue: Enterprise Data Warehouse Systems

In this work, the author focuses on Enterprise Data Warehouses (EDW), that is, Business Data Warehouses [1] that support management's decisions and cover all business areas. Beyond that, an EDW is an important basis for several applications, such as Business Intelligence (BI), planning, and Customer Relationship Management (CRM). As it is embedded in an enterprise-wide system landscape, an EDW has to provide a "single version of truth" for all data of the company. That means there is a common view on centralized, accurate, harmonized, consistent, and integrated data, so that information can be treated as a corporate asset. An EDW covers all domains of a company or even a corporate group, including the collection and distribution of data from heterogeneous source systems and a huge amount of data. The range of use is often worldwide, so that data from different time zones have to be integrated. Frequently, 24x7 data availability has to be guaranteed, which raises the problem of loading and querying at the same time. In addition, there are further complex requirements on the data: ad-hoc access, near real-time actuality, and applications, such as CRM, that require very detailed and granular data over a long time horizon. Moreover, new or changing information needs have to be satisfied flexibly and promptly, and, last but not least, access to the data has to be secured by a powerful authorization concept. Therefore, efficient usage of an EDW demands the utmost in data consistency, data granularity, historical data, flexibility of usage, actuality of data, speed of data provision, availability of data, and system stability.

1.2 The Problem

Persistence often means redundant data, because both source and transformed target data sets are stored. The resulting effort does not only concern hardware; maintenance for keeping the data consistent, for instance, has to be taken into account when calculating total costs. Operating productive EDW systems necessarily creates potential for conflict between the requirements of data usage, such as reporting and analysis, and the time and effort needed to fulfil these requirements. In more detail, the requirements and their consequences are as follows.

As mentioned above, an EDW is a data source for several applications; the huge volume of data is a result of the manifold needs that have to be satisfied. The need for detailed information requires a lot of data because of its fine granularity. The need for historical information requires a lot of data from the past. Finally, the broad range of data being collected, for example for data mining scenarios, causes huge data volumes. This amount of data has to be prepared for its intended use; for instance, good reporting performance has to be ensured.

Data extracted into the EDW are usually transformed over several layers. Firstly, data are integrated by means of data cleaning, identification of duplicates, data fusion, and measures to ensure information quality [2]. Subsequently, data are transformed according to technical and business needs, such as combining sales order data with invoice data. Finally, data are transformed to enhance access performance. Each transformation reduces flexibility of usage and delays availability. Managing redundant data not only requires disk space, but also additional effort for keeping the aggregated data up to date and consistent with the raw data (cf. [3]). A minimal sketch of such a layered transformation is given below.
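To make the layering tangible, the following is a minimal Python sketch of the three transformation steps just described: syntactic integration with duplicate elimination, a business-level join of orders with invoices, and a performance-oriented aggregation. It is illustrative only; all table and field names are invented, and whether each intermediate result is persisted is exactly the design decision examined in this work.

```python
# Illustrative three-layer transformation over in-memory sample data;
# all table and field names are invented for this example.

raw_orders = [  # extracted, uncleansed source records (inbound layer)
    {"order_id": 1, "customer": " acme ", "amount": 100.0},
    {"order_id": 1, "customer": " acme ", "amount": 100.0},  # duplicate
    {"order_id": 2, "customer": "Globex", "amount": 250.0},
]
raw_invoices = [
    {"order_id": 1, "invoice_id": "A-17", "paid": True},
    {"order_id": 2, "invoice_id": "A-18", "paid": False},
]

def integrate(records):
    """Layer 1: cleansing and duplicate elimination (syntactic integration)."""
    seen, result = set(), []
    for r in records:
        if r["order_id"] not in seen:
            seen.add(r["order_id"])
            result.append({**r, "customer": r["customer"].strip().title()})
    return result

def combine(orders, invoices):
    """Layer 2: business transformation, e.g. joining orders with invoices."""
    by_order = {i["order_id"]: i for i in invoices}
    return [{**o, **by_order.get(o["order_id"], {})} for o in orders]

def aggregate(rows):
    """Layer 3: performance-oriented summarization for reporting."""
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals

# Each intermediate result could be persisted, kept in volatile memory,
# or recomputed on the fly - the trade-off examined in this thesis.
integrated = integrate(raw_orders)
combined = combine(integrated, raw_invoices)
print(aggregate(combined))  # {'Acme': 100.0, 'Globex': 250.0}
```

In an RDBMS-based warehouse, the result of each layer would typically be materialized; in an IMDB-based warehouse, the later layers might instead be computed on demand.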
Data Warehouse architectures are usually based on relational database systems (RDBMS), with star, snowflake, or galaxy schemas as their main logical data models; see for instance [3,4]. Basically, these models offer adequate performance when accessing data for on-line analytical processing (OLAP). However, the amount of data forces the build-up of materialized views and summarization levels to enable reporting and analysis within acceptable response times.

Column-oriented database systems (cf. [5,6]) have come into special focus in the Data Warehouse community (e.g., [7,8]) because of their advantages regarding data compression and read access [9,10]. For some years now, column-oriented in-memory techniques have been used in commercial Data Warehouse products (e.g., [11,12]) to speed up the system's response time. Such techniques make it possible to load and query data in terabyte volumes with good performance. The latest announcements even promote installations that combine on-line transactional processing (OLTP) and OLAP with up to 50 TB of data in memory [13,14].

These technological changes lead to the question of what degree of persistence is required or necessary in IMDB-based EDW. This discussion includes the question of the format in which the data have to be stored (i.e., as raw data, syntactically integrated data, aggregated data, etc.). In this context, we discuss databases that fully support ACID, including durability; e.g., [14,15,16]. Hence, persistence is clearly to be distinguished from data storage in volatile memory. A decision to persist data cannot be made just by weighing disk space and updating costs against the gain in performance; the purpose for which the data are stored (see Section 4) must be taken into account as well. Such considerations are less important in RDBMS-based EDW, where the lower performance of the database alone motivates storage.

1.3 The Problem in Examples

Often, the problem is to decide whether to store transformed data or to transform raw data repeatedly. In the following, four brief examples illustrate this problem.

A query result that is created by a JOIN or UNION of two tables could easily be recomputed each time the query is executed. The costs of accessing and combining the data seem to be lower than the costs of storing and updating the result set. However, if the query is called regularly and the data do not change (e.g., in closed posting periods), storing the data might be preferable.

Figure 1 shows a more complex example: the determination of so-called RFM attributes (recency, frequency, and monetary). These attributes are a common way of categorizing customers in the CRM business; see [17] for more details. The determination is based on customer master data (from the CRM system), point-of-sale receipts (from POS systems in, e.g., boutiques), and order and invoice data (from the ERP system). In short, RFM attributes have to be determined and checked for new customers and whenever new receipts, orders, or invoices flow into the system. The RFM determination is done by an application program, including selections, computations, and currency conversions, with lookups to (control) data, et cetera. The computed attributes are not only used in the Data Warehouse, but also transferred back to the CRM system. Considering a fast-changing, but reproducible data base of several million customers with tens of transactions each, one can see that storing the results in volatile memory is preferable. A simplified sketch of such an RFM computation follows below.

Figure 1: Data-Flow Example: Determination of RFM Attributes
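As an illustration of the kind of computation involved, the following Python sketch derives RFM-style attributes from a small set of transactions. It is a deliberate simplification under assumed field names and sample data; the real determination described above additionally involves selections, currency conversions, control-data lookups, and the transfer of results back to the CRM system.

```python
# Simplified RFM determination; field names and sample data are
# illustrative assumptions, not taken from the paper.
from datetime import date

transactions = [  # receipts, orders, and invoices merged per customer
    {"customer": "C1", "date": date(2011, 11, 2), "amount": 60.00},
    {"customer": "C1", "date": date(2011, 12, 24), "amount": 120.00},
    {"customer": "C2", "date": date(2010, 3, 15), "amount": 15.50},
]

def rfm(transactions, today):
    """Return {customer: (recency in days, frequency, monetary value)}."""
    result = {}
    for t in transactions:
        days = (today - t["date"]).days
        r, f, m = result.get(t["customer"], (days, 0, 0.0))
        result[t["customer"]] = (min(r, days), f + 1, m + t["amount"])
    return result

print(rfm(transactions, today=date(2012, 1, 31)))
# {'C1': (38, 2, 180.0), 'C2': (687, 1, 15.5)}
```

Since such results are cheap to reproduce but change with every new receipt, order, or invoice, keeping them in volatile memory rather than persisting them is the option argued for above.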
Another example is data that cannot be reproduced at all, or only with huge effort, such as data from the internet or data that have already been archived in the source system. Here, storing is the only way to prevent data loss. Finally, there are also reasons for not persisting data; for instance, changing data sets over which average values are computed.

2 Current Status of Research

This thesis project covers several working areas, so we list products and researchers for each of these areas. In the field of column-based databases, there are products such as C-Store, MonetDB, Coldbase, and Brighthouse, as well as Sybase IQ as a commercial one. Researchers in this area include M. Stonebraker, D.J. Abadi, S.R. Madden, P. Boncz, D. Bößwetter, and A. Lübcke. Relational in-memory databases are offered by IBM (solidDB) [15] and Altibase (Hybrid DB); among others, J. Guisado-Gámez researches in this area.
Today, commercially offered Data Warehouse Systems (DWS) are mainly based on relational database systems. In addition, there are so-called appliances (e.g., the SAP NetWeaver Business Warehouse Accelerator, [18]), which receive data from the Data Warehouse and enable fast and flexible analyses of huge data volumes using column-based in-memory technology. The same applies to massively parallel processing (MPP) techniques (e.g., Teradata, HP Neoview, IBM). Researchers in this area include, for instance, F. Färber, W. Lehner, and H. Plattner.

People publishing in the field of (Enterprise) Data Warehouse architecture are, for example, B. Devlin, W. Inmon, R. Kimball, T. Ariyachandra, H.P. Watson, and T. Zeh. Besides the common Data Warehouse reference architecture, SAP's Layered, Scalable Architecture [19] is worth noting. The topics of ETL and transformation in DWS are examined by P. Vassiliadis, A. Simitsis, H.-J. Lenz, B. Thalheim, A. Shoshani, and P. Gray. Regarding information integration, U. Leser and F. Naumann are to be mentioned.

Today, database tools used for commercial data warehousing offer cost-based models for optimizing data access, supporting decisions on whether to create indexes and materialized views. As far as the author knows, the question of decision-support indicators for data persistence in Data Warehouses has not been discussed apart from those models. This question will become more important for EDW based on IMDB.

3 Preliminary Results

Defining the reason for which data are stored is an important criterion for deciding whether data persistence is still necessary in an IMDB-based EDW. The author presents a collection of persistence reasons and describes them in the following.

Source system decoupling: In order to reduce the load on the source system, data are stored in the warehouse's inbound layer immediately after successful extraction; usually, little or no transformation is done (e.g., data of origin are added).

Data availability: In many cases, data are no longer available or have been changed; in addition, network connections can be jammed, so that data are not available when needed. This applies, for instance, to internet data, data from files that are overwritten regularly, or data from disconnected legacy systems.

Complex data transformation: Because of its complexity, transformation is very expensive in terms of time and resources; data are stored to avoid repeated transformations.

Dependent transformation: Often, data are transformed using additional data from different sources. As a correct transformation depends on these data, they have to be stored until the transformation takes place, to guarantee their availability.

Changing transformation rules: In some cases, the transformation rules change; for instance, tax values change at some point in time. Unless a timestamp is part of the data, it will not be possible to transform them in the same way as before.

Extensive data recreation: Recreating data that are no longer in the warehouse (e.g., archived data) is laborious; such data are stored in transformed or aggregated form.

Data access performance: Still one of the most important reasons for storing data is faster access for reporting, analyses, et cetera. Data are stored redundantly and usually in a more aggregated format.
Constant data base: There are several applications, analysis and especially planning among them, where users have to rely on a constant data set; as it must not change for a selected period of time (hours or even days), it is stored separately.

En-bloc data supply: Usually, new data trickle into the Data Warehouse from several (worldwide) sources. After they have been integrated syntactically and semantically, all new data are kept separately and released to the warehouse's basis data at a defined point in time, to ensure a selected and well-defined data set for further processing.

Single version of truth: Data are stored after transformation according to the company's general rules, without any business logic or similar. These data build the basis for any further use, regardless of whether it is a report for finance or for logistics, for instance.

Corporate data memory: Data extracted into the warehouse are stored in their original state. In this way, the utmost independence from and flexibility towards the source systems can be achieved (e.g., when data are deleted or archived in the source systems).

Complex, different data: Some data are very complex and semantically and syntactically very different from the warehouse's data; hence, they are stored in order to be transformed stepwise for their diverse usages.

Data lineage: Data in reports often result from multi-level transformation processes; for later validation, backtracking of the data's origin is essential. Source data are stored to enable or ease this data lineage (cf. [20]).

Complex authorization: Instead of defining complex authorizations for data access, it is often easier to create dedicated data marts containing just the relevant data.

Information warranty: Many warehouses have to guarantee a defined degree of data availability and access speed; hence, data are stored additionally.

Corporate governance: Data are stored according to compliance rules (corporate governance); for example, to understand previous management decisions, the data set on which these decisions were taken must be stored separately.

Laws and provisions: Finally, data persistence can also be necessary because of laws and provisions; in Germany, for example, regulations exist in the areas of finance (German Commercial Code etc., [21]) and product liability [22].

Individual safety: The need for a feeling of security is also a reason for storing data redundantly or slightly changed, mostly in the data mart area.

The reasons are classified into three main groups to define the necessity of persistence. Mandatory persistence applies to data that have to be stored according to laws and regulations of corporate governance. It also holds for data that cannot be replaced because they are no longer available (at least not in this format) or cannot be reproduced due to changed transformation rules. Lastly, data that are required for the transformation of other data must be stored if simultaneous availability is not ensured.

Essentially persistent data can be classified into several groups. Firstly, there are data that are available or reproducible in principle, but for which the effort of reproduction in time and/or resources is too high (e.g., archived or expensively transformed data); of course, the definition of "too high" is very subjective and needs clarification. Another group is data that are stored to simplify the maintenance or operation of the warehouse or of particular applications, such as planning data and data marts for specially authorized users.
A third group of data is persistent because of the warehouse design: the single version of truth and the corporate data memory. Fourthly, the responsibility for guaranteeing information leads to data storage for safety reasons. Finally, there are data stored redundantly for performance purposes, which often accounts for the biggest volume. Supporting persistence mainly covers data stored for subjective safety needs.

Figure 2 shows a simple decision diagram for data persistence. The first three steps deal with mandatorily stored data, which are not under consideration when deciding about persistence, not even in in-memory-based EDW. For all other data (both essential and supporting), the indicators for coming to a decision are much more diverse. For instance, if the warehouse design includes a single version of truth, those data have to be stored. If the reproduction or transformation of data is complex (in time and/or resources), the frequency of access and the warranties for availability have to be taken into account.

Figure 2: Persistence Decision Diagram

Further indicators that have to be examined for this decision are listed below; a minimal decision sketch follows the list.

Data volume: Is the amount of data so large that its preparation for usage (reporting, analyses, and others) can no longer be done on-the-fly?

Frequency of data usage: Are data accessed so often (for reporting, analyses, and others) that the benefit of additional materialization outweighs its cost?

Frequency of data changes: Are data changed so often (by updating, inserting, or deleting) that the maintenance effort for keeping downstream views consistent outweighs the resulting performance gains? And: how complex are these changes?
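The decision flow of Figure 2 can be rendered roughly as follows; this is a sketch under assumed attribute names and invented thresholds, not a definitive implementation of the diagram.

```python
# Sketch of the persistence decision flow of Figure 2; the attribute names
# and threshold logic are placeholders, not part of the paper.
from dataclasses import dataclass

@dataclass
class DataSet:
    required_by_law: bool = False          # laws, provisions, corporate governance
    irreproducible: bool = False           # source gone or transformation rules changed
    needed_for_other_transformations: bool = False
    single_version_of_truth: bool = False  # warehouse-design reason
    reproduction_cost: float = 0.0         # effort to recreate (arbitrary units)
    access_frequency: float = 0.0          # accesses per day
    change_frequency: float = 0.0          # changes per day

def persistence_decision(d: DataSet) -> str:
    # Mandatory persistence is not subject to the cost trade-off at all.
    if d.required_by_law or d.irreproducible or d.needed_for_other_transformations:
        return "persist (mandatory)"
    # Essential persistence for warehouse-design reasons.
    if d.single_version_of_truth:
        return "persist (essential: warehouse design)"
    # Otherwise weigh reproduction cost and access frequency against the
    # maintenance effort caused by frequent changes.
    if d.reproduction_cost * d.access_frequency > d.change_frequency:
        return "persist (performance)"
    return "keep volatile or compute on-the-fly"

print(persistence_decision(DataSet(required_by_law=True)))
print(persistence_decision(DataSet(reproduction_cost=5.0, access_frequency=200.0,
                                   change_frequency=10.0)))
print(persistence_decision(DataSet(reproduction_cost=0.1, access_frequency=1.0,
                                   change_frequency=50.0)))
```

The mandatory checks short-circuit the cost trade-off entirely, whereas essential and supporting data fall through to the indicator-based comparison.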
4 Goals, Benefits, Methodology

One aspect of this work is its focus on Enterprise Data Warehouses that are based on IMDB systems, because of their specific requirements. The goal is to develop foundations and models to support decisions on the following questions:

In which transformation format shall data be stored?

Will data be transformed when requested ("on-the-fly"), will transformed data be held in volatile memory, or will data be stored persistently?

On the basis of which technical and non-technical reasons must or shall data be stored?

Moreover, indicators shall be defined to support the decision whether to store transformed data or not, such as:

Effort of transformation (logical complexity, effort in time/resources)

Data volume, number of records

Frequency of data usage/access (for analyses, data mining, etc.)

Frequency of data modification (by updates, inserts, deletes)

Requirements for data availability (query performance, real-time reporting)

This comprises both measurable and non-measurable indicators, for instance comparisons of the runtime and maintenance effort of data stored in certain states, in order to find out whether a decision to store data is computable or at least supportable. Possible scenarios for measuring runtime are as follows. A given raw data base (R) is queried for analysis (A) via multiple steps of transformation (Tn; n = {1,2,3}). Comparisons are made to find out whether it is more effective to store the data persistently (P) after each transformation, to keep them in volatile memory (V), or to calculate them on-the-fly:

(1) R → T1 + P → T2 + P → T3 + A

(2) R → T1 + P → T2 + V → T3 + A

(3) R → T1 + P → T2 → T3 + A

(4) R → T1 + V → T2 → T3 + A

(5) R → T1 → T2 → T3 + A

Here, factors such as the frequency of data usage and of data changes, as well as the data volume, have to be taken into account. As an example, the RFM determination (see Section 1.3) is a real-world scenario on which runtime comparisons can be made. A minimal timing sketch for two of these scenarios is given at the end of this section.

The expected outputs will provide support in the areas of Data Warehouse development, design, build-up, and operation. For instance, one result could be the development of stored procedures for certain types of transformation. Besides, a guideline for deciding which data are to be stored in which format could be used when operating a warehouse.

The methods used in this work will be:

Literature research on topics such as transformation, IMDB, and legislation

Measurement of runtime and performance tests (e.g., as described above)

Interviews/questionnaires to find out about practice in Data Warehouse development and operation (e.g., are there any other reasons for persistence; what is the percentage distribution of the reasons)
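As a starting point for such runtime measurements, the following Python sketch times scenario (1), persisting after every transformation, against scenario (5), computing everything on-the-fly, over a toy data set. The three transformations and the disk-based stand-in for persistence are assumptions for illustration; in the intended experiments they would be replaced by real warehouse transformations such as the RFM determination.

```python
# Toy timing harness for the persist-vs-on-the-fly scenarios; the three
# transformations and the disk-based "persistence" are stand-ins only.
import json, os, tempfile, time

R = [{"id": i, "value": i % 97} for i in range(200_000)]  # raw data base

def t1(rows):  # T1: filter
    return [r for r in rows if r["value"] > 10]

def t2(rows):  # T2: enrich/convert
    return [{**r, "value": r["value"] * 1.19} for r in rows]

def t3(rows):  # T3: aggregate (here also serving as the analysis A)
    return sum(r["value"] for r in rows)

def persist(rows):
    """Simulate persistence by a round trip through disk."""
    path = os.path.join(tempfile.gettempdir(), "staging_layer.json")
    with open(path, "w") as f:
        json.dump(rows, f)
    with open(path) as f:
        return json.load(f)

def scenario_1():  # R -> T1 + P -> T2 + P -> T3 + A
    return t3(persist(t2(persist(t1(R)))))

def scenario_5():  # R -> T1 -> T2 -> T3 + A
    return t3(t2(t1(R)))

for name, run in [("(1) persist after each step", scenario_1),
                  ("(5) fully on-the-fly", scenario_5)]:
    start = time.perf_counter()
    result = run()
    print(f"{name}: result={result:.2f}, {time.perf_counter() - start:.3f} s")
```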
References

1. B.A. Devlin, P.T. Murphy: An architecture for a business and information system; in: IBM Systems Journal 27(1)
2. U. Leser, F. Naumann: Informationsintegration; dpunkt-verlag, Heidelberg
3. W. Lehner: Datenbanktechnologie für Data-Warehouse-Systeme; dpunkt-verlag, Heidelberg
4. R. Kimball, M. Ross: The Data Warehouse Toolkit; Wiley Publishing Inc., Indianapolis, 2nd edition
5. G.P. Copeland, S.N. Khoshafian: A Decomposition Storage Model; in: SIGMOD '85
6. M.J. Turner et al.: A DBMS for large statistical databases; in: 5th VLDB '79
7. M. Stonebraker et al.: C-Store: A Column-oriented DBMS; in: 31st VLDB '05
8. D. Slezak et al.: Brighthouse: An Analytic Data Warehouse for Ad-hoc Queries; in: PVLDB 1(2)
9. D.J. Abadi et al.: Integrating Compression and Execution in Column-Oriented Database Systems; in: SIGMOD '06
10. D.J. Abadi: Query Execution in Column-Oriented Database Systems; Dissertation, MIT
11. T. Legler et al.: Data Mining with the SAP NetWeaver BI Accelerator; in: 32nd VLDB '06
12. J.A. Ross: SAP NetWeaver BI Accelerator; Galileo Press Inc., Boston
13. H. Plattner: A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database; in: SIGMOD '09, pp. 1-2
14. H. Plattner, A. Zeier: In-Memory Data Management; Springer-Verlag, Berlin
15. IBM: IBM solidDB; Fact Sheet, IBM Corporation
16. Oracle: Extreme Performance Using Oracle TimesTen In-Memory Database; White Paper, Oracle Corporation
17. J. Stafford: RFM: A Precursor of Data Mining; White Paper
18. J.A. Ross: SAP NetWeaver BI Accelerator; Galileo Press Inc., Boston
19. SAP: PDEBW1 - Layered Scalable Architecture (LSA) for BW; Training Material, SAP AG
20. Y. Cui, J. Widom: Lineage Tracing for General Data Warehouse Transformations; in: The VLDB Journal 12(1), pp. 41-58
21. § 257 Handelsgesetzbuch; § 25a Kreditwesengesetz; § 147 Abgabenordnung
22. Produkthaftungsgesetz