Active Data Archiving



TDWI RESEARCH
TDWI CHECKLIST REPORT

Active Data Archiving
For Big Data, Compliance, and Analytics

By Philip Russom
Sponsored by RainStor

MAY 2014

TABLE OF CONTENTS

FOREWORD
NUMBER ONE: Embrace modern practices and platforms for active data archiving
NUMBER TWO: Assure and improve data governance by using a compliance data archive
NUMBER THREE: Consider an analytics archive for critical, high-value, and aging analytics data
NUMBER FOUR: Rethink how data is committed to an archive
NUMBER FIVE: Rethink how archived data is accessed and used actively
NUMBER SIX: Deploy archiving systems that have multiple storage and processing tiers
NUMBER SEVEN: Make security a high priority because it will make or break an archive
ABOUT OUR SPONSOR
ABOUT THE AUTHOR
ABOUT TDWI RESEARCH
ABOUT THE TDWI CHECKLIST REPORT SERIES

555 S Renton Village Place, Ste. 700, Renton, WA

© 2014 by TDWI (The Data Warehousing Institute™), a division of 1105 Media, Inc. All rights reserved. Reproductions in whole or in part are prohibited except by written permission. Product and company names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

FOREWORD

Data archiving presents various problems in the enterprise today. Many organizations don't archive at all. Others mistakenly think that mere data backups can serve as archives, whereas tape is actually the final burial place of data, from which it rarely returns. Equally off base, others believe a data warehouse is an archive.

Data archiving processes do exist today in some organizations, but they are rarely formalized or policy driven. Data is archived in an ad hoc fashion (typically per application or per department) without an enterprise standard or strategy. Even when an organization makes an honest attempt at an enterprise data archive, the result is usually not trustworthy (because data is easily altered), not auditable (due to poor metadata and documentation), not compliant (due to inadequate usage monitoring or the inability to purge data at specified milestones), and not properly secured (lacking encryption, masking, and security standards). Furthermore, with most existing data archives, it's hard to get data in with integrity and out with speed because the primary platform is not online, active, and highly available.

Why don't more organizations invest in formal archiving processes and technical solutions? Most likely it's their common belief that archives provide little or no return on investment (ROI) because users rarely (if ever) access the archive. Without prominent and frequent usage, a respectable ROI is unlikely.

A data archive can achieve ROI by serving multiple uses and users from an online, active platform. Yes, organizations do need to retain data; that's not in question. However, archived data is not just insurance for compliance, audit, and legal contingencies. Those are important goals, but a data archive should also be treated as an enterprise asset to be leveraged, typically via analytics. Hence, a data archive can be more than a cost center; it can achieve ROI when it serves multiple uses (archiving, compliance, and analytics of deep historical data sets) and it manages data online for active access at any time by a wide range of users.

Users must start planning today for active data archiving. To help them prepare, this TDWI Checklist report will drill into the desirable attributes, use cases, user best practices, and enabling technologies of active data archiving.

NUMBER ONE: EMBRACE MODERN PRACTICES AND PLATFORMS FOR ACTIVE DATA ARCHIVING

There are compelling reasons for improving data archives. Traditional reasons for data archives still apply: namely, supplying data for compliance, audit, and legal requirements. However, a modern online data archive brings greater speed, accuracy, and credibility to these tasks so they are a smaller drain on enterprise processes and resources. New reasons have come into play as well: namely, organizations' voracious hunger for actionable insights discovered through advanced analysis of raw source data, big data, and a broadening diversity of data types. One of the most influential changes, however, concerns the state of the art in data platforms, both hardware and software. Their speed, scale, and functionality continue to rise even as their costs fall, which in turn makes the improvement of users' data archive solutions feasible for both technical and financial reasons.

Active data archiving can address these problems and opportunities. Enterprises need to embrace the emerging practice of active data archiving along with its enabling technologies. A modern solution for active data archiving will:

• Be built primarily for compliance or data governance but also serve the archival needs of analytics and sometimes data backup and disaster recovery.

• Be open to active access by a wide range of users, including those who need simple lookups and easy data exploration.

• Manage data as an immutable record that cannot be altered so that data is trustworthy for compliance and legal requirements.

• Be secured like a bank vault, for data security, privacy, and trust, using role-based permission access, data masking, encryption, and multiple data security standards.

• Scale up to multi-terabyte and petabyte data volumes using fast bulk loads and data compression, both to embrace new big data sources and because archives inevitably grow over time.

• Operate online with high availability around the clock to enable active data loads and extracts that keep the archive current up to the minute. Furthermore, data is constantly appended to an active archive without downtime or performance degradation.

• Support high-performance access based on SQL and other standards because users expect quick responses as they run queries and searches against archived data.

NUMBER TWO: ASSURE AND IMPROVE DATA GOVERNANCE BY USING A COMPLIANCE DATA ARCHIVE

Two broad archive categories, defined by their content and the primary use of that information, can coexist and overlap in active data archiving solutions:

• Compliance archives: Data retained in the content, format, and timeframes prescribed by legislation and other obligations (e.g., to partners, lenders, and legal liabilities)

• Analytic archives: Detailed source data from operational and transactional applications, extracted for general business intelligence purposes but retained for advanced analytics (as defined in the next section of this report)

Compliance archives have a number of desirable process and technical attributes:

• Data that's properly archived is solid evidence of an organization's compliance. In legal terms, honest attempts at archiving constitute proper intent, whereas a lack of archiving may be construed as malfeasance.

• Data archived for compliance must support appropriate regulations. These vary by industry. For example, in the United States, the most stringent regulations target banking and the financial services industry, as seen in the Dodd-Frank legislation or SEC Rule 17a-4. Similarly, the telecommunications industry is subject to legal hold and lawful intercept requirements that demand timed data retention.

• Archived data must be tamper proof to be trusted. Most is captured and stored in original form so it's a credible representation of a transaction, report, business process, or other event at a specific time. If archived data becomes altered, it is no longer considered credible. For example, stock trades are stored for exact timeframes to protect both trader and institution. Transparency is of the utmost importance to compliance archives, and WORM (write once, read many times) storage has become key.

• Archived data demands a convincingly documented audit trail. Most audits commence with a request for information, followed by a request for an audit trail for the supplied information. With data stored properly in an active archive, audits go faster (and perhaps more accurately, too) than with traditional offline, ad hoc archives. The speedy, documented response builds confidence with auditing bodies and contributes to favorable outcomes.

• An active data archive should have tracking functions so an organization can monitor and study its own activities to assure compliance and make improvements. The same tracking functions can flag data that has aged beyond its compliance requirements and should be deleted.

NUMBER THREE: CONSIDER AN ANALYTICS ARCHIVE FOR CRITICAL, HIGH-VALUE, AND AGING ANALYTICS DATA

Archiving operational data for analytic purposes is on the rise. As more advanced forms of analytics have gained credence over the last 15 to 20 years, user organizations have been retaining more detailed source data. The traditional practice was to extract data from operational applications and other sources, process that data and load the results into a data warehouse (DW), then delete the extracted source data. The accepted practice today keeps most source data because it is also the preferred material for analytics based on data mining, statistical analyses, natural language processing, and SQL-based analytics.

An analytic archive and a data warehouse are similar but different. Because of the stepped-up data retention, the data staging areas within most data warehouse architectures today are bigger than their core warehouses. This is tantamount to data archiving, though few BI/DW professionals call it archiving. All they know is that they have to do something to improve the content and accessibility of their analytic data archives. Furthermore, they need to offload this burden from core warehouses, which have higher priorities than analytics (namely reporting, OLAP, and performance management). Hence, as BI/DW professionals ponder where to put certain classes of analytic data, they should consider a platform for active data archiving.
An analytic archive easily integrates with multi-platform DW architectures. DW system architectures have always been multi-platform, but this trend has accelerated in recent years as users have extended their DW environments by adding new platforms for columnar databases, appliances, NoSQL, and Hadoop. An additional platform, one that specializes in archiving data for advanced analytics, would wring more value from archived source data and easily integrate with multi-platform DW architectures.

A data archive can future-proof analytic applications. Most data warehouses are designed by their users (not vendors) for the data requirements of reporting, OLAP, and performance management. These practices need calculated, aggregated, standardized, and time-series numeric values modeled in multidimensional structures that don't exist in source systems. Advanced analytics has different data requirements. It needs a very large store of unaltered (or lightly transformed) detailed source data. Other than that, it's impossible to anticipate data requirements for future analytic applications (AAs). Accordingly, an analytic archive preserves source data in its original form, so the source is there for future AAs to explore and repurpose.
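Returning briefly to the compliance archive: the tracking function described under Number Two, which flags data that has aged beyond its retention requirement, could be sketched roughly as follows. This is an illustrative sketch only, not something the report prescribes. The record classes, the retention periods in `RETENTION_YEARS`, and the names `ArchivedRecord`, `retention_expiry`, and `flag_expired` are all hypothetical; real retention periods come from the regulations that apply to a given organization.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention periods, in years, per class of record.
# Real values must come from the applicable regulations and policies.
RETENTION_YEARS = {"trade": 6, "web_log": 1}


@dataclass(frozen=True)
class ArchivedRecord:
    record_id: str
    record_class: str
    archived_on: date


def retention_expiry(record: ArchivedRecord) -> date:
    """Return the date on which the record's retention period ends."""
    years = RETENTION_YEARS[record.record_class]
    start = record.archived_on
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # February 29 has no counterpart in a non-leap year; use February 28.
        return start.replace(year=start.year + years, day=28)


def flag_expired(records, today: date):
    """List the IDs of records whose retention period has ended by `today`.

    The function only flags candidates; the purge itself would remain a
    separate, audited step.
    """
    return [r.record_id for r in records if retention_expiry(r) <= today]
```

Keeping the flagging step separate from the deletion step means the archive can report what is eligible for purging without ever removing data on its own.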

NUMBER FOUR: RETHINK HOW DATA IS COMMITTED TO AN ARCHIVE

A data archive has to be more than a dumping ground. For one thing, there needs to be a strategy, based on new and evolving user requirements for aging, less frequently accessed data and on other metrics, for identifying which data should be archived at what level and on what schedule. Note that not all data should be archived: some data belongs elsewhere, say, in its original application database or in a data warehouse. Archive specialists need to interview a broad range of business users and managers to determine users' needs for archived data. If your organization has a legal department and compliance officers, give priority to their needs but without neglecting the rest of the enterprise. On a technology level, develop interfaces and integration logic for getting data into the archive quickly and in lightly transformed states that are conducive to query and search, without altering the essential content of archived data. Finally, assume that all the data in the archive needs an audit trail and documentation (via metadata, etc.) that is sufficient to satisfy even the most aggressive users and auditors.

• What if data comes from applications that have been upgraded or customized (which can alter data models)? Look for a data archiving platform that can manage changing data models. That way, the platform understands changes to source schema and adjusts metadata and pointers accordingly.

• What if archived data comes from an application that was decommissioned (also known as application retirement)? When the only application that can read a dataset with full integrity is gone, that application's data may need to be lightly transformed before entering an archive (or after it's in the archive) so it can be easily accessed by common query and search tools. This practice is inspired by data warehousing, but it does not require the full-blown time, skills, and expense of the average data warehouse.

• Some archived data needs encryption (for security) or compression (to reduce its storage footprint). Look for a platform that can apply these and other data operations as data enters the archive or after data is in the archive. Furthermore, as data growth rates continue to rise over time and business demands for retaining older data grow, data should be stored in a compressed state to optimize storage capacity and scale over time. Similarly, the security classification of data can change as organizational rules and policies evolve.

NUMBER FIVE: RETHINK HOW ARCHIVED DATA IS ACCESSED AND USED ACTIVELY

Let's be honest: We've all worked in organizations where archives were purely pro forma, without a credible effort to preserve data in a state that's quickly or easily accessed by anyone, much less the growing number of employees who can benefit from accessing the information. Luckily, this old worst practice is giving way to the realization that all enterprise datasets, including archived data, are valuable assets that can contribute to many business goals. The recent craze for analytics with big data has led many organizations to seek more business value from their datasets.

With that in mind, active data archiving is a bit of a cultural shock in some organizations. To get past the shock, these organizations need upper management to define a mandate for modern archiving based on the following goals:

• Archived data must be leveraged. Typical use cases include fast, documented auditing for compliance, a source for analytic applications, data exploration, and information lookups.

• Some data will come out of the archive to be used elsewhere. To enable a broad range of users, tools, and purposes, the archive should support both query and search mechanisms. Furthermore, the archive should serve as a source for other data platforms, especially those for business intelligence and analytics.

• A growing constituency of users will have access to archived data. This is a sticking point in organizations that define data governance and compliance as the process of limiting data access. The challenge is to balance access and control, typically through well-defined user types controlled via role-based user access and strong security features in the archival platform.

• Accessing archived data will be timely. First, to be truly active, the archive must be online like a database, not offline like magnetic tapes and optical disks or any media that demand a distracting and time-consuming restoration process. Second, data access mechanisms should perform at or near real time for the sake of user productivity.
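The balance between access and control described in the goals above can be illustrated with a small sketch that combines role-based access with data masking. Everything here is an assumption made for illustration: the role names, the column names, the `ROLE_CLEAR_COLUMNS` mapping, and the masking rule are hypothetical, and a real archive would normally enforce such rules inside the archival platform rather than in application code.

```python
# Hypothetical roles and the columns each role may read in clear text.
ROLE_CLEAR_COLUMNS = {
    "compliance_officer": {"customer", "card_number", "amount"},
    "analyst": {"amount"},
}


def mask(value) -> str:
    """Hide a value, keeping at most its last four characters visible."""
    text = str(value)
    if len(text) <= 4:
        return "*" * len(text)
    return "*" * (len(text) - 4) + text[-4:]


def read_record(role: str, record: dict) -> dict:
    """Return a copy of `record` as the given role is allowed to see it.

    Unknown roles are refused outright; known roles receive every column,
    but columns outside their clear-text set are masked.
    """
    if role not in ROLE_CLEAR_COLUMNS:
        raise PermissionError(f"role {role!r} has no access to the archive")
    clear = ROLE_CLEAR_COLUMNS[role]
    return {col: (val if col in clear else mask(val)) for col, val in record.items()}
```

With this arrangement, widening the user base does not mean widening exposure: an analyst can still aggregate amounts while sensitive identifiers stay masked.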

NUMBER SIX: DEPLOY ARCHIVING SYSTEMS THAT HAVE MULTIPLE STORAGE AND PROCESSING TIERS

For a data archive to be truly active, its primary tier should be based on a robust database management system (DBMS). The DBMS must include traditional relational functions (for query and data exploration) and functions for multiple security strategies, scalability, and high availability. The assumptions here are that most data being archived will be structured and that most users and applications will need to access data via queries. Even so, some functions of the DBMS should be controlled; for example, inserting and updating data can destroy data's original state, whereas appending data avoids such integrity problems. In addition to relational technology, free text search is critical to finding records of interest and to enabling non-technical users.

An active data archiving platform can host many archives, each with its own unique requirements, similar to how a DBMS can manage several databases (defined as collections of data). Thus, multi-tenancy is another key assumption for a modern data archive.

In most cases, an archive platform is not a data processing or analytics platform. Hence, archived data is best extracted, then moved to a DBMS or other data platform that is more conducive to in-database analytics, intense SQL-based analytics, and miscellaneous forms of advanced analytics. For these purposes, mature organizations already have in place relational data warehouses, columnar databases, and DW appliances, and possibly NoSQL databases and Hadoop. As an exception, when an active archive runs atop Hadoop, it may make sense to process and analyze data on the same platform where it's archived.

Note that the DBMS in the primary tier of a data archive does not replace other DBMSs, especially not those deployed for analytics. Instead, it complements them and (in addition to its archival purpose) serves as yet another source of data for analytics (largely historical data).
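The idea of controlling DBMS functions so that data can be appended but not changed can be sketched with SQLite, which ships with Python. This is a minimal illustration under stated assumptions, not the mechanism of any particular archiving product: the table name `archive`, its columns, and the trigger names are hypothetical, and the approach shown (triggers that reject UPDATE and DELETE) is one common database-level technique.

```python
import sqlite3


def open_append_only_archive(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database holding a table that accepts INSERT only.

    Triggers abort any UPDATE or DELETE on the table, so rows keep
    their original state once written.
    """
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS archive (
            id          INTEGER PRIMARY KEY,
            archived_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            payload     TEXT NOT NULL
        );

        CREATE TRIGGER IF NOT EXISTS archive_no_update
        BEFORE UPDATE ON archive
        BEGIN
            SELECT RAISE(ABORT, 'archive is append-only: UPDATE rejected');
        END;

        CREATE TRIGGER IF NOT EXISTS archive_no_delete
        BEFORE DELETE ON archive
        BEGIN
            SELECT RAISE(ABORT, 'archive is append-only: DELETE rejected');
        END;
        """
    )
    return conn


def append(conn: sqlite3.Connection, payload: str) -> int:
    """Append one record and return its row ID."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute("INSERT INTO archive (payload) VALUES (?)", (payload,))
    return cursor.lastrowid
```

Triggers like these guard against accidental or application-level changes, but anyone with sufficient database privileges can drop them, so they complement rather than replace WORM storage.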
The storage tier of an active archive should be diverse. This is to accommodate storage subsystems that users already have as well as newer commodity-priced types such as content-addressed storage (CAS) hardware or the Hadoop Distributed File System (HDFS). Even a modern active archive might include systems for magnetic tape and optical disk in the storage tier. After all, many organizations have pre-existing magnetic tape or optical disk libraries that they must maintain. Note that these archaic media are antithetical to an active data archive; if possible, their data should be migrated into the active archive so it's online and available when users need it.

In the case of a compliance archive (for, say, a financial services institution), the archive must reside in a WORM storage platform. This, in turn, requires a DBMS that supports WORM devices. WORM technologies are worth the investment because they keep compliance and risk officers happy and they avoid fines, penalties, and damaging publicity.

Users should consider Hadoop as both a highly scalable storage platform for archiving and a low-cost processing platform for analytics. Note that open-source Hadoop's poor support for two key standards, SQL (and other relational technologies) and security (especially LDAP and Linux PAM), keeps it unpalatable for mature IT organizations. Despite these two limitations, Hadoop has roles to play in multi-platform archive architectures. Hadoop excels with very large data volumes, as well as with file-based data, data documents (XML and JSON), textual content (including word processing files), unstructured and non-relational structured data, and schema-free data. Hadoop's low price is appropriate to many kinds of lower-value (but high-volume) historic data, such as Web logs. However, due to limitations in current releases, purely open-source Hadoop may not be the best choice for structured data that needs relational processing (such as intense SQL or multi-way joins) or sensitive data that demands high security. That's not a show stopper because a number of software vendors offer products that integrate with Hadoop to give it stronger and broader support for security and relational technologies like standard SQL.

Consider economics as you select platforms, tools, and features for a new active archiving architecture. For example, it's technically possible to include almost any brand of relational DBMS in an archiving solution. However, the older and more mature vendor brands are relatively expensive, especially once an archive scales into multi-terabytes, and they include far more features and functions than are required for archiving. A more cost-effective choice is a DBMS designed for archiving or one of the newer columnar, open-source, or appliance-based DBMSs. In this context, Hadoop is affordable in terms of dollars per terabyte of storage. Similarly, data compression is a feature that can reduce storage costs because it reduces the footprint of archived data in storage.
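To make the point about compression and storage footprint concrete, here is a minimal sketch using Python's standard `zlib` and `hashlib` modules. It is an illustration only: the function names are hypothetical, general-purpose DEFLATE compression stands in for whatever technique an archiving platform actually uses, and the achievable reduction depends entirely on the data. A SHA-256 digest of the original bytes is stored alongside the compressed form so that a restore can confirm the content was not altered by the round trip.

```python
import hashlib
import zlib


def compress_for_archive(raw: bytes):
    """Compress a payload and fingerprint its original content.

    Returns (compressed_bytes, sha256_hex_of_original).
    """
    compressed = zlib.compress(raw, 9)  # 9 = highest compression level
    digest = hashlib.sha256(raw).hexdigest()
    return compressed, digest


def restore_from_archive(compressed: bytes, digest: str) -> bytes:
    """Decompress a payload and verify it against the stored fingerprint."""
    raw = zlib.decompress(compressed)
    if hashlib.sha256(raw).hexdigest() != digest:
        raise ValueError("restored content does not match the stored digest")
    return raw


def footprint_reduction(raw: bytes, compressed: bytes) -> float:
    """Fraction of storage saved, e.g. 0.75 means the footprint shrank by 75%."""
    return 1 - len(compressed) / len(raw)
```

Repetitive data such as Web logs tends to compress well, whereas data that is already compressed or encrypted barely shrinks, which is one reason to compress before encrypting when both operations are applied.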

NUMBER SEVEN: MAKE SECURITY A HIGH PRIORITY BECAUSE IT WILL MAKE OR BREAK AN ARCHIVE

Put succinctly, if an archive isn't secure, it won't meet the compliance goals that are its primary purpose. Furthermore, if users don't trust the security of the archival platform, they won't use it or its data, and the archive will fail to demonstrate a positive ROI.

The primary line of defense is the security layer built into the relational DBMS at the heart of an active data archiving platform. Most mature IT departments and DBMS teams prefer role-based approaches to security, and many have LDAP and other directories they'd like to reuse and apply within the active archiving solution.

If Hadoop is to be part of an active archive's infrastructure, note that security in purely open-source Hadoop today is mostly about general access privileges controlled through Kerberos. However, a few third parties now offer add-on products that enable LDAP, Active Directory, and other approaches to security for the Hadoop family of products.

Almost all modern data archives are loaded with sensitive data about customers, partners, and employees, including Social Security numbers, credit card numbers, transactions, internal financials, and so on. Encryption or data masking can make this data unreadable in the event of a hack or other unauthorized access.

Additional layers of data protection may be used to keep data locked and immutable. This provides evidence that data records and files have not been altered, which is fundamental to a credible audit. Likewise, records and files cannot be deleted before their retention periods expire.
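One way to produce evidence that records have not been altered is a hash chain, in which every entry stores a hash computed over its own content and the hash of the previous entry. The sketch below is an illustration of that general technique, not a description of how any archiving product implements immutability; the entry layout and the names `append_entry` and `verify_chain` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append_entry(chain: list, payload: dict) -> None:
    """Append a payload, linking it to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": _entry_hash(prev_hash, payload),
        }
    )


def verify_chain(chain: list):
    """Return None if the chain is intact, else the index of the first bad entry."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        expected = _entry_hash(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None
```

Because each hash depends on everything before it, changing or removing an earlier record invalidates every later link, which gives an auditor a simple, repeatable check. The chain only detects tampering; preventing it still relies on controls such as WORM storage and access restrictions.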

ABOUT OUR SPONSOR

RainStor provides the world's most efficient database solutions that reduce the cost, complexity, and compliance risk of managing data. With RainStor's enterprise solutions, you can quickly deploy an Analytical Archive or Compliance Archive so you continue to create business value and stay compliant. RainStor runs anywhere: on-premises or in the cloud, and natively on Hadoop. Among RainStor's customers are 20 of the world's largest communications providers and 10 of the biggest banks and financial services organizations, which use RainStor to manage historical data while saving millions.

ABOUT THE AUTHOR

Philip Russom is the research director for data management at The Data Warehousing Institute (TDWI), where he oversees many of TDWI's research-oriented publications, services, and events. He's been an industry analyst at Forrester Research and Giga Information Group, where he researched, wrote, spoke, and consulted about BI issues. Before that, Russom worked in technical and marketing positions for various database vendors. Over the years, Russom has produced over 500 publications and speeches.

ABOUT TDWI RESEARCH

TDWI Research provides research and advice for business intelligence and data warehousing professionals worldwide. TDWI Research focuses exclusively on BI/DW issues and teams up with industry thought leaders and practitioners to deliver both broad and deep understanding of the business and technical challenges surrounding the deployment and use of business intelligence and data warehousing solutions. TDWI Research offers in-depth research reports, commentary, inquiry services, and topical conferences as well as strategic planning services to user and vendor organizations.

ABOUT THE TDWI CHECKLIST REPORT SERIES

TDWI Checklist Reports provide an overview of success factors for a specific project in business intelligence, data warehousing, or a related data management discipline. Companies may use this overview to get organized before beginning a project or to identify goals and areas of improvement for current projects.


More information

Enterprise Data Integration

Enterprise Data Integration Enterprise Data Integration Access, Integrate, and Deliver Data Efficiently Throughout the Enterprise brochure How Can Your IT Organization Deliver a Return on Data? The High Price of Data Fragmentation

More information

DATAMEER WHITE PAPER. Beyond BI. Big Data Analytic Use Cases

DATAMEER WHITE PAPER. Beyond BI. Big Data Analytic Use Cases DATAMEER WHITE PAPER Beyond BI Big Data Analytic Use Cases This white paper discusses the types and characteristics of big data analytics use cases, how they differ from traditional business intelligence

More information

How to Enhance Traditional BI Architecture to Leverage Big Data

How to Enhance Traditional BI Architecture to Leverage Big Data B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...

More information

BEYOND BI: Big Data Analytic Use Cases

BEYOND BI: Big Data Analytic Use Cases BEYOND BI: Big Data Analytic Use Cases Big Data Analytics Use Cases This white paper discusses the types and characteristics of big data analytics use cases, how they differ from traditional business intelligence

More information

How To Turn Big Data Into An Insight

How To Turn Big Data Into An Insight mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

and the world is built on information

and the world is built on information Let s Build a Smarter Planet Starting with a more dynamic and the world is built on information Guy England Storage sales manager CEEMEA Tel: +971 50 55 77 614 IBM Building a Smarter

More information

Riverbed Whitewater/Amazon Glacier ROI for Backup and Archiving

Riverbed Whitewater/Amazon Glacier ROI for Backup and Archiving Riverbed Whitewater/Amazon Glacier ROI for Backup and Archiving November, 2013 Saqib Jang Abstract This white paper demonstrates how to increase profitability by reducing the operating costs of backup

More information

TDWI research. TDWI Checklist report. Data Federation. By Wayne Eckerson. Sponsored by.

TDWI research. TDWI Checklist report. Data Federation. By Wayne Eckerson. Sponsored by. TDWI research TDWI Checklist report Data Federation By Wayne Eckerson Sponsored by NOVEMBER 2009 TDWI Checklist report Data Federation By Wayne Eckerson TABLE OF CONTENTS 2 FOREWORD 2 NUMBER

More information

Coping with the Data Explosion

Coping with the Data Explosion Paper 176-28 Future Trends and New Developments in Data Management Jim Lee, Princeton Softech, Princeton, NJ Success in today s customer-driven and highly competitive business environment depends on your

More information

Why Big Data in the Cloud?

Why Big Data in the Cloud? Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data

More information

Email Archiving Whitepaper. Why Email Archiving is Essential (and Not the Same as Backup)

Email Archiving Whitepaper. Why Email Archiving is Essential (and Not the Same as Backup) Why Email Archiving is Essential (and Not the Same as Backup) Why Email Archiving is Essential (and Not the Same as Backup) If your job depended on it, could you clearly explain right this moment the principal

More information

Enforce Governance, Risk, and Compliance Programs for Database Data

Enforce Governance, Risk, and Compliance Programs for Database Data Enforce Governance, Risk, and Compliance Programs for Database Data With an Information Lifecycle Management Strategy That Includes Database Archiving, Application Retirement, and Data Masking WHITE PAPER

More information

Hitachi Cloud Service for Content Archiving. Delivered by Hitachi Data Systems

Hitachi Cloud Service for Content Archiving. Delivered by Hitachi Data Systems SOLUTION PROFILE Hitachi Cloud Service for Content Archiving, Delivered by Hitachi Data Systems Improve Efficiencies in Archiving of File and Content in the Enterprise Bridging enterprise IT infrastructure

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

How To Manage An Electronic Discovery Project

How To Manage An Electronic Discovery Project Optim The Rise of E-Discovery Presenter: Betsy J. Walker, MBA WW Product Marketing Manager What is E-Discovery? E-Discovery (also called Discovery) refers to any process in which electronic data is sought,

More information



More information

Simplify IT and Reduce Costs with Automated Data and Document Archiving

Simplify IT and Reduce Costs with Automated Data and Document Archiving SAP Brief SAP Extensions SAP Archiving by OpenText Objectives Simplify IT and Reduce Costs with Automated Data and Document Archiving An easier way to store, manage, and access data and documents An easier

More information

What we do? Our services include:

What we do? Our services include: What we do? The next revolution in information technology is migration to what has been labeled as The Third Platform. Following the revolution that was brought about by the introduction of mainframe technology

More information

Actian SQL in Hadoop Buyer s Guide

Actian SQL in Hadoop Buyer s Guide Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop

More information

MAS 200 for SQL Server. Technology White Paper. Best Software, Inc.

MAS 200 for SQL Server. Technology White Paper. Best Software, Inc. MAS 200 for SQL Server Technology White Paper Best Software, Inc. Table of Contents MAS 200 for SQL Server............ 1 Why Microsoft SQL Server for MAS 200?... 3 Tuning Wizard...3 Query Optimizer...4

More information

Things You Need to Know About Cloud Backup

Things You Need to Know About Cloud Backup Things You Need to Know About Cloud Backup Over the last decade, cloud backup, recovery and restore (BURR) options have emerged as a secure, cost-effective and reliable method of safeguarding the increasing

More information

How To Use Hp Vertica Ondemand

How To Use Hp Vertica Ondemand Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater

More information

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems Proactively address regulatory compliance requirements and protect sensitive data in real time Highlights Monitor and audit data activity

More information



More information

A New Era in Data Protection. Enterprise-Class Data Backup for Smaller Businesses

A New Era in Data Protection. Enterprise-Class Data Backup for Smaller Businesses A New Era in Data Protection Enterprise-Class Data Backup for Smaller Businesses THE NEED: Better Backup Solutions For Small Business INTRADYN provides proven solutions designed to let you concentrate

More information

Archiving, Backup, and Recovery for Complete the Promise of Virtualization

Archiving, Backup, and Recovery for Complete the Promise of Virtualization Archiving, Backup, and Recovery for Complete the Promise of Virtualization Unified information management for enterprise Windows environments The explosion of unstructured information It is estimated that

More information


BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE BIG DATA: FIVE TACTICS TO MODERNIZE YOUR DATA WAREHOUSE Current technology for Big Data allows organizations to dramatically improve return on investment (ROI) from their existing data warehouse environment.

More information

Evolving Data Warehouse Architectures

Evolving Data Warehouse Architectures TDWI research Second Quarter 2014 BEST PRACTICES REPORT Evolving Data Warehouse Architectures In the Age of Big Data By Philip Russom Research Sponsors Research Sponsors Actian Cloudera Datawatch

More information

Innovative technology for big data analytics

Innovative technology for big data analytics Technical white paper Innovative technology for big data analytics The HP Vertica Analytics Platform database provides price/performance, scalability, availability, and ease of administration Table of

More information

Cloud, Appliance, or Software? How to Decide Which Backup Solution Is Best for Your Small or Midsize Organization.

Cloud, Appliance, or Software? How to Decide Which Backup Solution Is Best for Your Small or Midsize Organization. WHITE PAPER: CLOUD, APPLIANCE, OR SOFTWARE?........................................ Cloud, Appliance, or Software? How to Decide Which Backup Solution Is Best for Your Small or Midsize Who should read

More information

Achieve Economic Synergies by Managing Your Human Capital In The Cloud

Achieve Economic Synergies by Managing Your Human Capital In The Cloud Achieve Economic Synergies by Managing Your Human Capital In The Cloud By Orblogic, March 12, 2014 KEY POINTS TO CONSIDER C LOUD S OLUTIONS A RE P RACTICAL AND E ASY TO I MPLEMENT Time to market and rapid

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

A Practical Guide to Legacy Application Retirement

A Practical Guide to Legacy Application Retirement White Paper A Practical Guide to Legacy Application Retirement Archiving Data with the Informatica Solution for Application Retirement This document contains Confidential, Proprietary and Trade Secret

More information

Planning for and Surviving a Data Disaster

Planning for and Surviving a Data Disaster Planning for and Surviving a Data Disaster Solutions to successfully meet the requirements of business continuity. An Altegrity Company 2 2 5 7 Introduction Managing Host Storage for Virtual Environments

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Evolving Data Warehouse Architectures

Evolving Data Warehouse Architectures TDWI RESEARCh Second Quarter 2014 BEST PRACTICES REPORT Evolving Data Warehouse Architectures In the Age of Big Data By Philip Russom Co-sponsored by: TDWI research BEST PRACTICES REPORT Second

More information

The 3 questions to ask yourself about BIG DATA

The 3 questions to ask yourself about BIG DATA The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

More information



More information

Modernizing Data Protection With Backup Appliances

Modernizing Data Protection With Backup Appliances Executive Brief Modernizing Data Protection With Backup Appliances Sponsored by: Symantec Carla Arend March 2014 Andrew Buss IDC OPINION Transformation of the backup infrastructure is the next frontier

More information

Introduction. Chapter 1. Introducing the Database. Data vs. Information

Introduction. Chapter 1. Introducing the Database. Data vs. Information Chapter 1 Objectives: to learn The difference between data and information What a database is, the various types of databases, and why they are valuable assets for decision making The importance of database

More information

What You Need to Know About Cloud Backup: Your Guide to Cost, Security, and Flexibility

What You Need to Know About Cloud Backup: Your Guide to Cost, Security, and Flexibility Your Guide to Cost, Security, and Flexibility What You Need to Know About Cloud Backup: Your Guide to Cost, Security, and Flexibility 10 common questions answered Over the last decade, cloud backup, recovery

More information

SAVE OFTEN. Many new electronic records laws are forcing companies to rethink how they archive and protect data or risk stiff penalties

SAVE OFTEN. Many new electronic records laws are forcing companies to rethink how they archive and protect data or risk stiff penalties Many new electronic records laws are forcing companies to rethink how they archive and protect data or risk stiff penalties SAVE OFTEN By Courtney Macavinta 12 DELL INSIGHT JANUARY 2005 Save [ Cutting

More information

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper Offload Enterprise Data Warehouse (EDW) to Big Data Lake Oracle Exadata, Teradata, Netezza and SQL Server Ample White Paper EDW (Enterprise Data Warehouse) Offloads The EDW (Enterprise Data Warehouse)

More information

How To Use Noetix

How To Use Noetix Using Oracle BI with Oracle E-Business Suite How to Meet Enterprise-wide Reporting Needs with OBI EE Using Oracle BI with Oracle E-Business Suite 2008-2010 Noetix Corporation Copying of this document is

More information

The StorHouse Database Extension and Relational Repository for mysap BI BW

The StorHouse Database Extension and Relational Repository for mysap BI BW FileTek, Inc. 9400 Key West Avenue Rockville, Maryland 20850 USA Phone: 301.251.0600 Fax: 301.251.1990 E-Mail: Web: The StorHouse Database Extension and Relational Repository

More information

Big Data and Your Data Warehouse Philip Russom

Big Data and Your Data Warehouse Philip Russom Big Data and Your Data Warehouse Philip Russom TDWI Research Director for Data Management May 7, 2013 Sponsor Speakers Philip Russom TDWI Research Director, Data Management Chris Twogood VP, Product and

More information

Key Issues for Data Management and Integration, 2006

Key Issues for Data Management and Integration, 2006 Research Publication Date: 30 March 2006 ID Number: G00138812 Key Issues for Data Management and Integration, 2006 Ted Friedman The effective management and leverage of data represent the greatest opportunity

More information

Data Growth Presents Challenges And Opportunities

Data Growth Presents Challenges And Opportunities A Custom Technology Adoption Profile Commissioned By AT&T August 2012 Introduction Today s CIO faces many challenges. Businesses are craving data as they look to remain competitive, and scour external

More information



More information

Why enterprise data archiving is critical in a changing landscape

Why enterprise data archiving is critical in a changing landscape Why enterprise data archiving is critical in a changing landscape Ovum white paper for Informatica SUMMARY Catalyst Ovum view The most successful enterprises manage data as strategic asset. They have complete

More information

Effective, Affordable Data Management with CommVault Simpana 9 and Microsoft Windows Azure

Effective, Affordable Data Management with CommVault Simpana 9 and Microsoft Windows Azure Effective, Affordable Data Management with CommVault Simpana 9 and Microsoft Windows Azure Businesses benefit from streamlined data management both on premises and in the cloud. White Paper Published:

More information

Deploying an Operational Data Store Designed for Big Data

Deploying an Operational Data Store Designed for Big Data Deploying an Operational Data Store Designed for Big Data A fast, secure, and scalable data staging environment with no data volume or variety constraints Sponsored by: Version: 102 Table of Contents Introduction

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

Implementing Oracle BI Applications during an ERP Upgrade

Implementing Oracle BI Applications during an ERP Upgrade 1 Implementing Oracle BI Applications during an ERP Upgrade Jamal Syed Table of Contents TABLE OF CONTENTS... 2 Executive Summary... 3 Planning an ERP Upgrade?... 4 A Need for Speed... 6 Impact of data

More information

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand Build a Streamlined Data Refinery An enterprise solution for blended data that is governed, analytics-ready, and on-demand Introduction As the volume and variety of data has exploded in recent years, putting

More information

Extend your analytic capabilities with SAP Predictive Analysis

Extend your analytic capabilities with SAP Predictive Analysis September 9 11, 2013 Anaheim, California Extend your analytic capabilities with SAP Predictive Analysis Charles Gadalla Learning Points Advanced analytics strategy at SAP Simplifying predictive analytics

More information

C A S E S T UDY The Path Toward Pervasive Business Intelligence at an International Financial Institution

C A S E S T UDY The Path Toward Pervasive Business Intelligence at an International Financial Institution C A S E S T UDY The Path Toward Pervasive Business Intelligence at an International Financial Institution Sponsored by: Tata Consultancy Services October 2008 SUMMARY Global Headquarters: 5 Speen Street

More information

Real World Strategies for Migrating and Decommissioning Legacy Applications

Real World Strategies for Migrating and Decommissioning Legacy Applications Real World Strategies for Migrating and Decommissioning Legacy Applications Final Draft 2014 Sponsored by: Copyright 2014 Contoural, Inc. Introduction Historically, companies have invested millions of

More information

Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows

Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows Storage Switzerland White Paper Storage Infrastructures for Big Data Workflows Sponsored by: Prepared by: Eric Slack, Sr. Analyst May 2012 Storage Infrastructures for Big Data Workflows Introduction Big

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

Information Stewardship: Moving From Big Data to Big Value

Information Stewardship: Moving From Big Data to Big Value Information Stewardship: Moving From Big Data to Big Value By John Burke Principal Research Analyst, Nemertes Research Executive Summary Big data stresses tools, networks, and storage infrastructures.

More information

Four Things You Must Do Before Migrating Archive Data to the Cloud

Four Things You Must Do Before Migrating Archive Data to the Cloud Four Things You Must Do Before Migrating Archive Data to the Cloud The amount of archive data that organizations are retaining has expanded rapidly in the last ten years. Since the 2006 amended Federal

More information


WHITE PAPER WHY ORGANIZATIONS NEED LTO-6 TECHNOLOGY TODAY WHITE PAPER WHY ORGANIZATIONS NEED LTO-6 TECHNOLOGY TODAY CONTENTS Storage and Security Demands Continue to Multiply.......................................3 Tape Keeps Pace......................................................................4

More information

W H I T E P A P E R T h e R O I o f C o n s o l i d a t i n g B a c k u p a n d A r c h i v e D a t a

W H I T E P A P E R T h e R O I o f C o n s o l i d a t i n g B a c k u p a n d A r c h i v e D a t a Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 W H I T E P A P E R T h e R O I o f C o n s o l i d a t i n g B a c k u p a n d A r c h i v e D a

More information

IBM Optim. The ROI of an Archiving Project. Michael Mittman Optim Products IBM Software Group. 2008 IBM Corporation

IBM Optim. The ROI of an Archiving Project. Michael Mittman Optim Products IBM Software Group. 2008 IBM Corporation IBM Optim The ROI of an Archiving Project Michael Mittman Optim Products IBM Software Group Disclaimers IBM customers are responsible for ensuring their own compliance with legal requirements. It is the

More information



More information