WHITE PAPER. Reducing Dormant Data: 7 Tips for Delivering Data Warehouse Performance and Cost Savings





Teleran White Paper: Reducing Dormant Data to Improve Performance and Reduce Cost

Reducing Dormant Data

Minimizing dormant data reduces system costs and improves performance, service levels, and IT staff productivity.

Defining Dormant Data

Studies show that much of the data loaded into data warehouses and analytical application databases is dormant; that is, it is infrequently or never used. Unlike OLTP databases, data warehouses continuously collect and store detailed and summary historical information for business analysis. Frequently, data warehouses include information intended to satisfy unknown requirements, so data is loaded that may or may not be used. These databases expand significantly over time as new information is added from internal and external data sources.

Dormant data can take various forms:

- One kind of dormant data evolves when historical data is maintained beyond its useful life in the database. This information accumulates much like geological layers buried deep in the earth, hidden and unused.
- A second form develops when data elements initially thought to be relevant are included in the data warehouse but in practice are not useful to business analysis.
- A third type is summary data that is created over time but no longer used. Summary tables can grow to be a huge percentage of the overall data warehouse.
- A fourth kind stems from the disuse of detailed data over time as users find summary-level information more useful.

[Figure: users, database, and dormant data]

Estimating Dormant Data

Bill Inmon, a noted data warehouse expert, states that as warehouses grow, the ratio of dormant data to total data increases dramatically. He asserts that dormant data may be as much as 65% to 70% in data warehouses that are a terabyte or greater in size.

Inmon recommends a simple formula for calculating the data dormancy ratio: the number of queries per year times the average amount of data per query, divided by total data warehouse space. While this ratio may be high, since it does not consider that some queries inevitably use the same data, it does provide a rule of thumb for making ballpark estimates.
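The rule of thumb above can be sketched in a few lines of Python. Read literally, the formula gives the share of the warehouse touched by queries, so this sketch assumes the dormant share is its complement; that interpretation, the function names, and the sample figures are illustrative assumptions, not taken from Inmon or Teleran.

```python
def accessed_ratio(queries_per_year: int,
                   avg_bytes_per_query: float,
                   warehouse_bytes: float) -> float:
    """Rule of thumb as worded in the text:
    (queries per year x average data per query) / total warehouse space.
    It overstates usage because queries that reread the same data are counted again."""
    if warehouse_bytes <= 0:
        raise ValueError("warehouse_bytes must be positive")
    return queries_per_year * avg_bytes_per_query / warehouse_bytes


def dormancy_estimate(queries_per_year: int,
                      avg_bytes_per_query: float,
                      warehouse_bytes: float) -> float:
    """Assumed interpretation: dormant share = 1 - accessed share, clamped to [0, 1]."""
    used = min(1.0, accessed_ratio(queries_per_year, avg_bytes_per_query, warehouse_bytes))
    return 1.0 - used


if __name__ == "__main__":
    # Hypothetical workload: 50,000 queries a year, 8 MB read per query, 1 TB warehouse.
    share = dormancy_estimate(50_000, 8e6, 1e12)
    print(f"Estimated dormant share: {share:.0%}")
```

Because repeated reads of the same data are counted more than once, the accessed share is an upper bound, which makes the dormant share computed this way a conservative (low) estimate.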

Identifying and Reducing Dormant Data with Query Monitoring

Armed with this estimate, you can get a sense of the magnitude of performance improvements and system savings that can be generated by reducing dormant data. But how do you actually identify dormant data? In his book, Data Warehouse Performance, Bill Inmon writes, "Understanding that there is dormant data in a data warehouse is one thing. Finding the dormant data is another matter altogether. The best way to find the dormant data is to monitor the end-users' query activity against the data warehouse... the monitor sits between the end-users' query activity and the data warehouse server."

An effective means of capturing end user queries and database usage is Teleran's isight usage monitor. isight identifies and reports on dormant data through its comprehensive and continuous profile of all SQL application queries against relational database objects including tables, columns, rows, views, stored procedures, and indexes. isight accomplishes this without requiring any database agents, traces, or monitors that consume a significant portion of database resources.

[Figure: users, isight Usage Monitor, database, and dormant data]

Benefits of Reducing Dormant Data

Lower Costs, Better Performance

The server resources and disk storage space consumed by loading and storing dormant data can be very large. Minimizing this dormant data enables organizations to recover significant server and storage resources and dramatically increase service levels by reducing database load-time windows. It also delivers two additional and important benefits: reduced DBA effort in maintaining your databases, and improved query performance. Studies show that the amount of data administration is directly related to database size. The smaller the database, the less effort and expense to maintain it. Also, with less data to process, most data-intensive end user queries will run faster.

The following case studies show how these benefits translate directly into improved business performance and reduced costs.

Dormant Data Case Study 1

Saving $800,000 in distribution costs by reducing data load time

One Teleran customer, a global office products company, reduced its terabyte-size data warehouse by more than 30% using isight query monitoring to identify and eliminate dormant data. The data reduction allowed the company to recover almost one-third of its disk storage and decrease its daily load-time window by 30%. The shorter load time enabled the company to increase availability by 1½ hours each day.

The business impact of this increased information availability is significant. Each morning, the company's customer service reps must handle millions of dollars of returned products. By having return goods and new order information 1½ hours earlier, the service reps can now arrange for these return goods to be shipped directly to another customer before those orders must be filled from a company warehouse. This avoids the extra expense of having to ship the return goods back to the warehouse. By providing this critical information 1½ hours earlier, the company was able to reduce shipping expenses by over $800,000 in the first 12 months. The company's first-year return on investment in isight was over 800%.

These returns were achieved by identifying the following information:

- More than one-third of the database tables being loaded daily had not been accessed during the past three months. By storing these tables off-line, the nightly load was reduced by 600 million rows.
- Of the remaining tables, most contained columns that had not been used in three months or more. Removal of these unused columns further reduced load time and storage requirements.
- 20% of all indexes had not been used in the past three months and could also be dropped. Because database indexes take time and resources to build and maintain, these additional resource savings contributed to the overall improvement in availability and service levels realized from dormant data reduction.

Dormant Data Case Study 2

$120,000 saved in server and storage costs in two months

Another Teleran customer, a food manufacturing and distribution company, saved over $120,000 in its first two months of using isight query monitoring. This company's data warehouse had grown beyond 1 terabyte with the addition of several new subject areas. In order to meet its nightly batch load service level agreement, the company planned to upgrade its server CPU at a cost of $60,000 to process the additional data.

However, after reviewing isight dormant data reports, the company learned that users looked at a large portion of the data at a weekly level, not daily. By loading that portion of the data once a week, summarized at the weekly level, the company was able to significantly reduce nightly load volumes. This enabled it to meet its service level without having to upgrade its server processor. Avoiding the server upgrade generated $60,000 in immediate savings.

After looking at additional isight usage reports, the company was able to eliminate 200 gigabytes of unused data from its database. Specifically, the company identified tables, columns, and views that had not been accessed for more than eight months. In addition, by looking at row-level usage reports, it determined that historical data prior to 2000 was rarely accessed and could be removed from the data warehouse and archived. Eliminating the 200 gigabytes reduced the disk storage capacity requirement by 20% and saved an incremental $60,000 in planned disk storage upgrade costs.

From its investment in isight this company achieved:

- An initial savings of $120,000
- Payback in less than 2 months
- 160% ROI in 2 months

Seven Steps to Reducing Dormant Data

Based on Teleran's real-world experience with a wide range of organizations, we have identified seven proven steps to help organizations identify and reduce the amount of dormant data contained in their data warehouses. Following these steps will enable you and your organization to enjoy the benefits of performance and productivity improvements, as well as operational cost savings.

Step 1 - Assess Dormant Data at a Table Level

The most logical place to begin identifying data that is infrequently or never used is to look at table usage over a particular time period. Deciding on the appropriate time period requires some judgement based on knowledge of how tables are or were intended to be used. If a company's business is heavily seasonal, as with soft drink companies, which generate a large portion of their annual sales in the summer, it may make sense to evaluate data usage over at least a one-year period. If your business is relatively consistent across business periods, you may be able to apply a shorter period of time. The isight Table Usage Summary report below identifies database table dormancy by reporting on when database tables were first and then last accessed over a specified time period.

Step 2 - Evaluate Dormant Data at the Column Level

Once you have established an understanding of your table usage, both active and dormant, the next step is to review column usage within your active tables over your relevant time period. This sample isight report details dormant columns by table, reporting on when columns were first and then last accessed. Note that the last three columns were never accessed in a twelve-month period.
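The first-accessed and last-accessed summaries described in Steps 1 and 2 can be approximated from any log of object accesses. The Python sketch below is not how isight works internally; it assumes a hypothetical log of (object name, access date) pairs and a catalog listing every table or column of interest, and all names are made up for illustration.

```python
from datetime import date


def usage_summary(catalog, accesses, start, end):
    """Return {object: (first_access, last_access, access_count)} for the window
    [start, end]. Objects in the catalog that never appear keep (None, None, 0)."""
    summary = {name: [None, None, 0] for name in catalog}
    for name, day in accesses:
        # Ignore objects outside the catalog and accesses outside the window.
        if name not in summary or not (start <= day <= end):
            continue
        rec = summary[name]
        rec[0] = day if rec[0] is None else min(rec[0], day)
        rec[1] = day if rec[1] is None else max(rec[1], day)
        rec[2] += 1
    return {name: tuple(rec) for name, rec in summary.items()}


def dormant_objects(summary):
    """Names with no access at all in the window, sorted for stable reporting."""
    return sorted(name for name, (_, _, count) in summary.items() if count == 0)


if __name__ == "__main__":
    # Hypothetical catalog and access log; column names could be used the same way,
    # for example "SALES.REGION".
    catalog = ["SALES", "RETURNS", "PROMO_2001"]
    log = [("SALES", date(2004, 1, 5)), ("SALES", date(2004, 11, 30)),
           ("RETURNS", date(2004, 6, 1))]
    report = usage_summary(catalog, log, date(2004, 1, 1), date(2004, 12, 31))
    for name, (first, last, count) in report.items():
        print(name, first, last, count)
    print("Dormant:", dormant_objects(report))
```

Passing the evaluation window explicitly reflects the point made in Step 1: a seasonal business may need a full year, while a steadier one can use a shorter period.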

Step 3 - Identify Dormant Columns

As you continue your dormant data evaluation, it is helpful to run usage reports that show only the dormant database objects. The sample isight report below identifies only those columns that have not been accessed by users over a twelve-month period. In addition, this report indicates whether or not the dormant column is indexed. As indexes also take up database space, eliminating unused indexed columns enables you to reduce your database size even more.

Step 4 - Assess Unused or Infrequently Used Views

Database views can add materially to database volume and should be taken seriously in the dormant data evaluation of your data warehouse. Because views often are created for individual users or specific analyses, they can easily fall into disuse as people change jobs or as analytical and reporting requirements change. The following isight report example reveals which views have not been utilized in the twelve-month period and which views should be considered active.
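A dormant-only listing of the kind described in Step 3 amounts to a set difference between the columns that exist and the columns that were accessed, with the index flag carried along. The sketch below assumes two hypothetical inputs, a mapping of (table, column) to an is-indexed flag and a set of accessed (table, column) pairs; neither reflects an actual isight report format.

```python
def dormant_only(columns, accessed):
    """columns: {(table, column): is_indexed}; accessed: set of (table, column).
    Returns sorted (table, column, is_indexed) tuples for columns never accessed."""
    return sorted(
        (table, column, is_indexed)
        for (table, column), is_indexed in columns.items()
        if (table, column) not in accessed
    )


if __name__ == "__main__":
    # Hypothetical column catalog and usage for the evaluation window.
    columns = {
        ("SALES", "REGION"): True,
        ("SALES", "FAX_NO"): False,
        ("SALES", "OLD_CODE"): True,
    }
    accessed = {("SALES", "REGION")}
    for table, column, is_indexed in dormant_only(columns, accessed):
        note = "indexed, index is also a removal candidate" if is_indexed else "not indexed"
        print(f"{table}.{column}: dormant ({note})")
```

The same function applies to views if the keys are view names rather than table and column pairs.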

Step 5 - Identify Dormant Columns within Views

In your assessment of column usage, it is important to remember to identify columns that can be accessed within views. This isight usage report shows which columns within particular views have not been accessed. In this case, none of the columns have been accessed over a twelve-month period and most, if not all, are good candidates for elimination.

Step 6 - Evaluate Dormant Stored Procedures

Much like views, stored procedures are often designed for very specific reports or applications. As conditions change, stored procedures become dormant, accumulating over time and increasing overall data warehouse size. The following isight report shows stored procedure usage over a twelve-month period and confirms that there are a number of unused stored procedures that probably should be deleted.

Step 7 - Assess Dormant Data at the Row Level

Eliminating or archiving unused row-level data can materially reduce database size; the challenge is finding it. Identifying row-level data usage requires a deeper analysis of the monitored SQL queries than the database object-level analyses described above. Row-level data is generally specified in a query by the predicate in the WHERE clause. Predicate values can, for example, be dates or date ranges, specific product codes, or geographic areas. The isight report below reveals row-level dormant data by identifying the number of times a predicate value exists in the database as well as how many times it is accessed. In this case the predicate value MA (as in Massachusetts) exists 1,808 times in the database but is never specified in a query. It is most likely a good candidate for archiving or deletion.

Summary

Reducing Dormant Data Improves Performance and Reduces Costs

The steps to evaluating and reducing dormant data with Teleran isight are relatively easy and offer a large payoff. Identifying which tables, columns, indexes, views, stored procedures, and rows are not being used can dramatically reduce the size of your database. Minimizing dormant data on an ongoing basis enables you to generate immediate, measurable returns and clear cost justification through server and storage hardware savings. By reducing load times, improvements in data availability yield quantifiable business benefits including lower operating costs and increased revenue generation. And finally, speeding query times by minimizing the overall size of the data warehouse enables you to improve business productivity while reducing IT overhead.

Sources: Data Warehouse Performance, Wiley, 1999, by Inmon, Rudin, Buss and Sousa
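The row-level analysis described in Step 7 compares how often a value exists in a column with how often queries name it in a WHERE clause. The Python sketch below illustrates the idea only: it uses a simplistic regular expression rather than a SQL parser, handles just `column = 'value'` and `column IN ('a', 'b')` with quoted literals, and every table, column, and function name is hypothetical.

```python
import re
from collections import Counter


def predicate_values(sql, column):
    """Extract quoted literals compared to `column` via = or IN (...).
    Simplistic by design: no ranges, no bind variables, no nested parentheses."""
    col = re.escape(column)
    values = re.findall(rf"\b{col}\s*=\s*'([^']*)'", sql, flags=re.IGNORECASE)
    for group in re.findall(rf"\b{col}\s+IN\s*\(([^)]*)\)", sql, flags=re.IGNORECASE):
        values += re.findall(r"'([^']*)'", group)
    return values


def row_level_usage(stored_values, queries, column):
    """Return {value: (times_it_exists, times_it_is_specified)} for one column."""
    exists = Counter(stored_values)
    used = Counter(v for q in queries for v in predicate_values(q, column))
    return {value: (count, used.get(value, 0)) for value, count in exists.items()}


if __name__ == "__main__":
    # Hypothetical column contents and monitored queries.
    stored = ["MA"] * 3 + ["NY"] * 2
    queries = [
        "SELECT * FROM sales WHERE state = 'NY'",
        "select sum(amt) from sales where STATE in ('NY', 'CA')",
    ]
    for value, (exists, specified) in row_level_usage(stored, queries, "state").items():
        print(value, exists, specified)
```

Values that exist many times but are specified zero times, like MA in the example from Step 7, are the archiving candidates. A production analysis would need a real SQL parser to cover date ranges, joins, and parameterized queries.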

Teleran Technologies is the leading provider of software for managing business intelligence (BI) activity in data warehouses, CRM, supply chain and analytical applications. Through end-to-end knowledge of the BI environment (users, queries, applications and databases), Teleran software aligns IT processes with business needs, reducing costs and improving performance and productivity.

isight continuously profiles application performance and the use of corporate data enterprise-wide, helping IT to better understand, manage, and secure BI activity.

iguard controls queries and users to ensure that all BI applications are performing optimally, improving resource efficiency and reducing system costs.

Automated Helpdesk guides users with real-time messages, maximizing user performance and productivity while reducing helpdesk calls and support costs.

Service Level Manager automatically maintains service levels over time by generating predictive iguard query performance policies as usage patterns and system resources change.

Teleran's Access Architecture enables these products to install quickly and operate continuously on the network without degrading database or application performance.

Founded in 1996, Teleran pioneered the concept of BI activity monitoring and management for data warehouses and analytic applications with its patented policy engine and management process. Today the company provides solutions for many of the world's leading companies, including Allstate, Aventis, Ernst & Young, Gordon Food Service, Horizon Blue Cross Blue Shield, JPMorgan Chase, Merrill Lynch, MetLife, State of Texas, Sun Microsystems, Unisys and Wells Fargo.

Teleran Technologies, Inc.
PO Box 667
Roseland, NJ 07068
973.439.1820 Phone
973.439.1821 Fax
info@teleran.com
www.teleran.com

© 2005 Teleran Technologies, Inc. All rights reserved. Teleran and the Teleran logo are registered trademarks and isight, iguard, Discovery, Automated Helpdesk, Access Architecture, and InfoUse Knowledge Base are trademarks of Teleran Technologies, Inc. All other names are the property of their respective owners. SO1204.3