Capacity Planning
Any data warehouse solution will grow over time, sometimes quite dramatically. It is essential that the components of the solution (hardware, software, and database) are capable of supporting the extended sizes without unacceptable performance loss, or growth of the load window to a point where it affects the use of the system.

Process

The capacity plan for a data warehouse is defined within the technical blueprint stage of the process. The business requirements stage should have identified the approximate sizes for data, users, and any other issues that constrain system performance. One of the most difficult decisions you will have to make about a data warehouse is the capacity required by the hardware.

It is important to have a clear understanding of the usage profiles of all users of the data warehouse. For each user or group of users you need to know the following:

- the number of users in the group;
- whether they use ad hoc queries frequently;
- whether they use ad hoc queries occasionally at unknown intervals;
- whether they use ad hoc queries occasionally at regular and predictable times;
- the average size of query they tend to run;
- the maximum size of query they tend to run;
- the elapsed login time per day;
- the peak time of daily usage;
- the number of queries they run per peak hour;
- the number of queries they run per day.

These usage profiles will probably change over time, and need to be kept up to date. They are useful for growth predictions and capacity planning. The profiles in themselves are not enough; you also require an understanding of the business.

Estimating the load

When choosing the hardware for the data warehouse there are many things to consider, such as hardware architecture, resilience, and so on. The data warehouse will probably grow rapidly from its initial configuration, so it is not sufficient to consider the initial size of the data warehouse. There are a number of different elements that need to be considered, but the decisions all come down to how much CPU, how much memory, and how much disk you will need. If your sizing calls for more capacity than the budget can afford, do not allow the required capacity simply to be chopped back; instead, some of the functionality will need to be pared back, and the capacity can then be re-estimated. In the following pages we shall attempt to outline the rules and guidelines that we follow when sizing a system for a data warehouse.

Initial configuration

When sizing the initial configuration you will have no history information or statistics to work with, and the sizing will need to be done on the predicted load. Estimating this load is difficult, because there is an ad hoc element to it.
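Difficult as it is, a first rough cut can be made directly from the usage profiles described above. Here is a minimal sketch in Python; the record fields, the duration assumption, and the sample figures are illustrative assumptions, not prescribed values:

    import math
    from dataclasses import dataclass

    @dataclass
    class UsageProfile:
        """One record per user group, capturing the profile items listed above."""
        group_name: str
        users: int
        runs_ad_hoc: bool             # frequent or occasional ad hoc use
        avg_query_gb: float           # average size of query they tend to run
        max_query_gb: float           # maximum size of query they tend to run
        login_hours_per_day: float    # elapsed login time per day
        peak_hour: int                # peak time of daily usage (hour of day)
        queries_per_peak_hour: float  # per user
        queries_per_day: float        # per user

    def peak_concurrent_queries(profiles, avg_query_minutes=10.0):
        # Rough estimate: arrivals in the peak hour multiplied by the average
        # query duration (in hours) approximates queries running at once.
        load = sum(p.users * p.queries_per_peak_hour * (avg_query_minutes / 60.0)
                   for p in profiles)
        return math.ceil(load)

    # Example with illustrative figures only:
    analysts = UsageProfile("analysts", 25, True, 2.0, 50.0, 8.0, 10, 3.0, 12.0)
    print(peak_concurrent_queries([analysts]))  # -> 13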
All you can do is estimate the configuration based on the known requirements. This is why the business requirements phase is so important. When deciding on the initial configuration you will need to allow some contingency. This is particularly important in a data warehouse project, because the requirements are often difficult to pin down accurately, and the load can quickly vary from the expected. Otherwise, the sizing exercise is the same irrespective of the phase of the data warehouse that you are trying to size.

How much CPU Bandwidth?

To start with you need to consider the distinct loads that will be placed on the system. There are many aspects to this, such as query load, data load, backup and so on, but essentially the load can be divided into two distinct phases:

- daily processing
  - user query processing
- overnight processing
  - data transformation and load
  - aggregation and index creation
  - backup

Daily processing

The daily processing is centered on the user queries. To estimate the CPU requirements you need to estimate the time that each query will take. As much of the query load will be ad hoc, it is impossible to estimate the requirement of every query; therefore another approach has to be found.

The first thing to do is estimate the size of the largest likely common query. It is possible that some user will want to query across every piece of data in the data warehouse, but this will probably not be a common requirement. It is more likely that the users will want to query the most recent week's or month's worth of data. Having established the likely period that will be queried, you will know the volume of data that will be involved. As you cannot assume that a relevant index will be in place, you must assume the query will perform a full table scan of the fact data for that period.

So we now have a measure of the volume of data, let us say F megabytes, that will be accessed. To progress any further we need to know the I/O characteristics of the devices that the data will reside on. This allows us to calculate the scan rate S at which the fact data can be read. This will depend on the disk speeds and on the throughput ratings of the controllers. Clearly it also depends on the size of F itself: if F is many gigabytes then it will definitely be spread across multiple disks, and probably across multiple controllers. You can now calculate S assuming a reasonable spread of the data. Remember that if the database is designed correctly and the large queries are controlled properly you should not get much contention for the disks, so you should get a reasonable throughput.

Using S and F you can calculate T, the time in seconds to perform a full table scan of the period in question:

    T = F/S    (1.1)

In fact you should calculate a number of times, T1 to Tn, which depend on the degree of parallelism that you are using to perform the scan. Therefore we get

    T1 = F/S1, ..., Tn = F/Sn    (1.2)
where S1 is the scan speed of a single disk or striped disk set, and Sn is the scan speed of all n disks or disk sets that F is spread across. You may be able to get slightly higher throughput than Sn with higher degrees of parallelism, but you will bottleneck on I/O at degrees of parallelism much above n.

Now you can take the query response time requirements specified in the service level agreement and pick the appropriate T value, say Tp: this gives you Sp, the required scan rate, and hence the number of disks or disk sets you will need to spread the data across. It also gives you P, the required degree of parallelism to meet query response times for a single query. Now you need to estimate Pc, the number of parallel scanning threads that a single CPU will support. This will vary from processor to processor. The processors currently on the market will support from two to four scanners, but chip technology is moving on quickly, and this number will change over time. If possible, you should establish this by experiment. Now you can estimate Cs, the CPU requirement to support a single query:

    Cs = roundup(2P/Pc)    (1.3)

You need to use 2P to allow for other query overheads, and for queries that involve sorts. To calculate the minimum number of CPUs required overall, use the following formula:

    Ct = nCs + 1    (1.4)

where n is the number of concurrent queries that will be allowed to run. This should not be confused with the number of concurrent users, because unless a user is running a query they are not likely to be doing anything heavier than editing. Note that the additional 1 is added to the total to allow for the operating system overheads and for all other user processing. (A code sketch of this calculation appears at the end of this section.)

Overnight Processing

The first point to note about the nightly processing is that the operations listed at the beginning of this section are, for the most part, serialized. This is because each operation usually relies on the previous operation's completing before it can begin. The CPU bandwidth required for the data transformation will depend on the amount of data processing that is involved. Unless there is an enormous amount of data transformation, it is unlikely that this operation will require more CPU bandwidth than the aggregation and index creation operations. The same applies to the backup, although you should bear in mind that backing up large quantities of data in a short period of time will place a major load on the CPU. If the backup is spread over more hours, the amount of parallelism will come down, and its CPU bandwidth requirement will drop. The data load is another task that can use massive parallelism to speed up its operation. As with backup, if you use fewer parallel streams it will use less CPU bandwidth and will take longer to run.

Having established what you are going to use as your baseline, you then need to estimate how much CPU capacity that operation requires to complete in the allowed time. It is not safe to assume that you will have more than 10 hours overnight to achieve all the processing, even if the user day is only 8 or 9 hours long. Delays in data arrival can cause you significant problems, and you must make sure that you can complete the overnight processing without running over into the business day.
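Before leaving CPU sizing, here is a minimal sketch that puts the daily-processing arithmetic of eqs. 1.1 to 1.4 into code. The scan rate, threads-per-CPU figure, and response-time target are illustrative assumptions; establish the real values by experiment, as noted above. The sketch also assumes each disk set scans at the same rate, so Sn is n times the single-set rate:

    import math

    def cpus_for_queries(fact_mb, scan_mb_per_sec, target_seconds,
                         scanners_per_cpu, concurrent_queries):
        # Time for one scanning thread to read the period's fact data (eq. 1.1).
        t_single = fact_mb / scan_mb_per_sec
        # P: smallest degree of parallelism with F/Sn <= target (from eq. 1.2),
        # assuming uniform disk sets so Sn = n * S.
        p = math.ceil(t_single / target_seconds)
        # Cs: CPUs for one query; 2P allows for sorts and other overheads (eq. 1.3).
        cs = math.ceil(2 * p / scanners_per_cpu)
        # Ct: total CPUs, plus 1 for the OS and all other user processing (eq. 1.4).
        return concurrent_queries * cs + 1

    # Example: 30 GB of fact data, 20 MB/s per scanning thread, a 10-minute
    # response-time target, 3 scanners per CPU, 4 concurrent queries.
    print(cpus_for_queries(30_000, 20.0, 600, 3, 4))  # -> 9 CPUs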
As every data warehouse is different, it is impossible here to give explicit estimates for these operations. It is not even possible to give firm guidelines, because each aggregation is a different complex query and/or update, plus an intensive write operation.

How Much Memory?

There are a number of things that affect the amount of memory required. First, there are the database requirements. The database will need memory to cache data blocks as they are used; it will also need memory to cache parsed SQL statements and so on, and you will need memory for sort space. Secondly, each user connected to the system will use an amount of memory: how much will depend on how they are connected to the system and what software they are running. Finally, the operating system will require an amount of memory.

How much disk?

The disk requirement can be broken down into the following categories:

- database requirements
  - administration
  - fact and dimension data
  - aggregations
- non-database requirements
  - operating system requirements
  - other software requirements
  - user requirements

Database sizing

There are a number of aspects to the database sizing that need to be considered. First, there are the system administration requirements of the database. There are the data dictionary and the journal files, plus any rollback space that is required. These will all be small by comparison with the temporary or sort area. If you can gauge the size of the largest transaction that will realistically be run, you can use this to size the temporary requirements. If not, the best you can do is tie it to the size of a partition. If you do this, make allowance for multiple queries running at a given time, and set the temporary space to

    T = (2n + 1)P    (1.5)

where n is the number of concurrent queries allowed, and P is the size of a partition. If you use different-sized partitions for current data and for older data, then use the largest partition size in the calculation above.

Then there is the fact and dimension data. This is the one piece of data that you can actually size; everything else will be sized off this. To do this sizing exercise, you will need to know the database schema. Clearly, when performing the original size estimates for a business case, much of this information will be missing. In this situation, the sizing will need to be based purely on original estimates of the base data size.

When sizing the fact or dimension data you will have the record definitions, with each field type and size specified. Note, however, that the size specified for a field will be the maximum size. When calculating the actual size you will need to know:

- the average size of the data in each field;
- the percentage occupancy of each field;
- the position of each data field in the table;
- the RDBMS storage format of each data type;
- any table, row, and column overhead.

Another factor that may affect the calculations is the database block or page size. A database block will normally have some header information at the top of the block; this is space that cannot be used by data, and can amount to a significant number of bytes per block. The difference between using a 2 kB block size and using a 16 kB block size will mean something of the order of 100 bytes of extra data space in every 16 kB.

You will also need to estimate the size of the index space required for the fact and dimension data. Fact data should generally be only lightly indexed, with indexes occupying between 15% and 30% of the space occupied by the fact data; the cost in terms of index maintenance would be extremely heavy otherwise. The final determinant of the ultimate size of the fact data is the amount of data that you intend to keep online. When this is known, you can decide on your partitioning strategy. This needs to be taken into account in the sizing, because it is unlikely that you will get data to load exactly into every partition.

You also need to size the aggregations. For the initial system you will probably be able to size the actual aggregations that are planned. All the factors discussed above apply equally to the aggregations. As a rule of thumb, you should allow the same amount of space for aggregations as you will have fact data online. You will also need to allow space for indexes on the aggregations. These summarized tables are likely to be heavily indexed, and it is usual to assume 100% indexation: in other words, allow as much space again for indexing as for the aggregates themselves.

So, to summarize, the space required by the database will be

    Space required = F + Fi + D + Di + A + Ai + T + S    (1.6)

where F is the size of the fact data (all the fact data that will be kept online); Fi is the size of the fact data indexation; D is the size of the dimension data; Di is the size of the dimension data indexation; A is the size of the aggregations; Ai is the size of the aggregations indexation; T is the size of the database temporary or sort area; and S is the database system administration overhead.

If you want to get a quick upper bound on the database size, equation 1.6 can be reduced as follows:

    Space required = F + Fi + D + Di + A + Ai + T + S
                   = 3F + Fi + D + Di + T + S    (as A = Ai = F)
                   < 3.3F + D + Di + T + S       (as Fi <= 30% of F)
                   < 3.5F + T + S                (as D <= 10% of F and D = Di)
                   <= 3.5F + T                   (as S << T and S << F)    (1.7)

If F is sized accurately, this formula will give a reasonable estimate of the ultimate system size.
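Before turning to a worked example, the record-level arithmetic described above (average field sizes, percentage occupancy, and per-block header overhead) can be sketched as follows. The field definitions, row overhead, and block header figures are illustrative assumptions, not values from any particular RDBMS:

    def table_size_mb(fields, rows, block_kb=16, block_header_bytes=100,
                      row_overhead_bytes=5):
        # fields: list of (avg_bytes, occupancy) pairs, where occupancy is the
        # fraction of rows in which the field is actually populated.
        row_bytes = row_overhead_bytes + sum(avg * occ for avg, occ in fields)
        # Usable space per block after the block header.
        usable = block_kb * 1024 - block_header_bytes
        rows_per_block = max(1, int(usable // row_bytes))
        blocks = -(-rows // rows_per_block)  # ceiling division
        return blocks * block_kb / 1024.0    # size in MB

    # Example: a narrow fact record, 100 million rows.
    fact_fields = [(4, 1.0), (4, 1.0), (8, 1.0), (8, 0.9), (20, 0.6)]
    print(round(table_size_mb(fact_fields, 100_000_000)))  # -> about 3858 MB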
To show a worked example, suppose the fact data is calculated to be 36 GB of data per year, and 4 years' worth of data are to be kept online. This means that F would be 144 GB. Then, using eq. 1.7, we get

    Space required = 3.5F + T = (3.5 * 144) + T = 504 + T GB

Now suppose the data is to be partitioned by month; that would give a partition size P of 3 GB. If four concurrent queries are to be allowed, using eq. 1.5 we can now estimate the size of the temporary space T:

    T = (2n + 1)P = [(2 * 4) + 1] * 3 = 27 GB

This gives a total database size of 531 GB for the full-sized system. As 4 years' worth of data are to be kept online, the formula above will represent the size of the database after 4 years' worth of data has been loaded. To size the initial system for 6 months' worth of data you can say

    Initial space = (3.5F + T) / [((n * 12) / 6) + 1]    (1.9)

where n is the number of years' data you intend to keep online. Note the +1 under the line: this accounts for the dimension data's being a bigger percentage of the fact data initially. For the 4 years of data in this example, eq. 1.9 reduces to

    Initial space = (3.5F + T) / 9

which is your initial sizing. One final word: remember that every data warehouse is different. These figures are only guidelines.
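As a closing aid, here is a minimal sketch that makes the whole disk-sizing calculation reproducible, following eqs. 1.5, 1.7, and 1.9 and mirroring the worked example above. The function and parameter names are my own:

    def database_size_gb(fact_gb_per_year, years_online, partition_gb,
                         concurrent_queries):
        # F: all fact data kept online.
        f = fact_gb_per_year * years_online
        # T = (2n + 1)P (eq. 1.5): temporary/sort space.
        t = (2 * concurrent_queries + 1) * partition_gb
        # Upper bound on the full-sized system (eq. 1.7): 3.5F + T.
        full = 3.5 * f + t
        # Initial system holding 6 months of data (eq. 1.9).
        initial = full / ((years_online * 12) / 6 + 1)
        return full, initial

    full, initial = database_size_gb(36, 4, 3, 4)
    print(full)               # -> 531.0 GB, matching the worked example
    print(round(initial, 1))  # -> 59.0 GB for the first 6 months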