Optimizing the Data Warehouse Infrastructure with Archiving
WHITE PAPER
By Bill Inmon
This document contains Confidential, Proprietary and Trade Secret Information ("Confidential Information") of Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of Informatica. While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice. The incorporation of the product attributes discussed in these materials into any release or upgrade of any Informatica software product, as well as the timing of any such release or upgrade, is at the sole discretion of Informatica.

Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280; 10/966,046; 10/727,700.

This edition published April 2010.
Table of Contents

Executive Summary
The Evolution of the Data Warehouse
The Data Lifecycle within the Data Warehouse
Dormant Data in the Data Warehouse
Data Warehouse 2.0
Partitioning Data in the Data Warehouse
Using Storage Tiers to Manage Warehouse Data
Data Archiving to Optimize Storage Tiers
Indexing Archival Data
The Changing Structure of Data over Time
Informatica Data Archive: The Complete Data Warehouse Archiving Solution
Robust Archiving Techniques Enable Optimal Storage Tiers
Multiple, Easy Access Methods to Archived Data
Automatic Indexing of Archival Data
Automatic Management of Changing Data Structures
Universal Connectivity
Integration with Other Archiving Platform, ECM, and Storage Solutions
Conclusion
About Bill Inmon
Executive Summary

The world of data and information has been in a constant state of evolution since the first usage of computers in the late 1950s. Over time, it became apparent that data, like so many entities, has a lifecycle, and that each point in the lifecycle has its own characteristics and its own storage and access requirements.

The concept of a data warehouse evolved from the business need for reliable, consolidated and integrated data reporting and analysis from varying points in its lifecycle, across disparate data sources. While in a gross sense a data warehouse is simply a repository of an organization's electronically stored data, it is important to recognize that any warehouse is only as good as the processes to find, access and move items into and out of the warehouse. For data, the essential components of a data warehousing system include the ability to selectively store data, to retrieve and analyze data no matter where it is located, and to manage the data dictionary.

Operating an efficient data warehouse requires the organization to understand the differences inherent in the information stored in the data warehouse according to its point within the data lifecycle. As data ages:

- The probability of that data being accessed drops. Simply put, the older data becomes, the less frequently it is used.
- The structure of data changes. As software grows increasingly complex to process and handle more data with greater efficiency, database architectures change by necessity. This is often seen in a steady stream of software releases taking advantage of increasingly powerful hardware and software technologies.
- The amount of data being stored grows exponentially. Governed by both industry and government regulations, data must be stored and kept accessible for years. While only the most recent year's worth of data is actively used, maintaining historical data can easily balloon data storage to as much as 20 times the size of the current production database.
This white paper addresses the issues created by a complex data lifecycle within the data warehouse and how data archiving can better manage growing data volumes. With an understanding of the dynamic forces governing the explosion of data volumes in the data warehouse, and of the technologies available today to effectively archive and retrieve data based on its point in the lifecycle, the operation and cost of the data warehouse can be made more manageable, productive and efficient. Implementing robust archiving techniques will provide an optimal and cost-effective archiving infrastructure for the data warehouse that:

- Maintains data integrity across multiple formats
- Enables easy, on-demand access to archival data
- Provides universal connectivity and integrates with multiple archiving platforms to ensure superior and cost-effective scalability and performance
- Efficiently stores archived data to save storage capacity, while facilitating fast data retrieval
The Evolution of the Data Warehouse

The most important achievement of the data warehouse was the ability to create a platform for integrating corporate data from multiple enterprise applications to facilitate analysis and reporting. This profound transformation allowed the organization to have, for the first time, a single, integrated corporate database. It is this complete set of integrated data that allows an organization to view information enterprise-wide, from a true organizational perspective.

Figure 1. The classic data warehouse is built by passing legacy and operational application data through ETL. (Diagram labels: legacy/operational applications, ETL, and a data warehouse holding corporate data that is historical, integrated, detailed, and non-volatile.)

By integrating more and more data from a growing variety of data sources, organizations grew more sophisticated in handling their data, exposing the need for an expanded set of information processing capabilities. From the basic data warehouse, simply a collection of aggregated historical data, the need for a second generation data warehouse architecture and design began to evolve.

The Data Lifecycle within the Data Warehouse

As organizations became experienced with a first generation data warehouse, database administrators noticed that most of the queries were going against the most recent six months' worth of data. This first manifestation of a data lifecycle within the data warehouse came with a growing awareness that as data aged, the probability that such data would be accessed dropped. The older data became, the less frequently it was accessed. Even more important was the awareness that as the data warehouse aged, the volume of data increased.

Data in a data warehouse grows at an explosive rate. In the first one or two years of a data warehouse, the volume of data often grows at a rate of 200% to 500% per year. This rate continues to accelerate until the fourth or fifth year, when the data warehouse's rate of growth drops to about 100% to 200% per year.
But by that time there is already a significant amount of data that has been collected in the data warehouse. For a variety of reasons, the data warehouse caused an explosion of data that was to be managed by the corporation.
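To make the growth rates quoted above concrete, the following is a minimal sketch of how yearly growth compounds. The starting size of 50 GB, the constant 200% rate in the first four years, and the reading of "200% growth" as "the volume triples" are all assumptions made for illustration; they are not figures from a measured warehouse.

```python
def project_volume(start_gb, yearly_growth_pct):
    """Return the volume at the end of each year for the given growth rates."""
    volumes = []
    size = start_gb
    for pct in yearly_growth_pct:
        # A growth rate of 200% is read here as "adds twice the current size".
        size = size * (1 + pct / 100)
        volumes.append(size)
    return volumes


# Hypothetical scenario: 50 GB at launch, 200% growth in years 1 to 4,
# then 100% growth in year 5.
growth = [200, 200, 200, 200, 100]
for year, gb in enumerate(project_volume(50, growth), start=1):
    print(f"year {year}: {gb:,.0f} GB")
```

Even at the low end of the quoted ranges, the hypothetical 50 GB warehouse passes 8,000 GB within five years, which is why the later drop in growth rate does little to relieve the storage burden.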
This explosion of data in the warehouse has two major impacts:

- Performance degrades across the enterprise as data grows, creating bottlenecks that impair each user's ability to access data in a timely manner
- The cost of adding disk storage grows, as does the cost of maintaining an IT infrastructure to support it

As long as IT organizations can maintain systems with just the relevant amount of current data that is regularly accessed for daily operations, performance is optimal. But as the system accumulates a large amount of historical data with only a small portion of that data being used, performance worsens. Performance degrades because the system must process and handle large amounts of data that are not used.

An analogy would be cholesterol in the body. In the circulatory system of the young marathon runner there is very little cholesterol, and the young athlete has a very efficient heart. But in a 65-year-old couch potato, there is an accumulation of cholesterol causing stress to the heart, which has to expend more effort to maintain proper circulation. The same is true of a large data warehouse where the system contains a large volume of unused data. The system has to manage huge amounts of unnecessary data, and in doing so uses machine cycles that would otherwise not be necessary.

As this explosion of data is maintained in a data warehouse, IT infrastructure and maintenance costs grow exponentially even though the percentage of the data that is actually used decreases. What complicates matters is that after a certain volume of data, costs rise dramatically as supporting this data begins to require more than just physical disk. The infrastructure begins to require additional processors, complex disk arrays, additional software, and of course, staff time to operate and maintain the growing systems, causing the associated IT cost to increase exponentially.
Dormant Data in the Data Warehouse

An analysis of usage patterns shows that most queries use only the most current data, with a larger and larger portion of the data warehouse not being used. Within just two years of collecting data, most organizations find that only the most recent six months is being analyzed, leaving approximately 18 months of data untouched, a trend that continues unabated as data is collected over longer periods. The result is that the vast majority of the data in the data warehouse is simply never touched by anyone.

Figure 2. As the volume of data grows in the data warehouse, the percentage of data that is actually used drops. (Values shown: 50 GB with 49 GB used; 500 GB with 400 GB used; 2 TB with 600 GB used; 10 TB with 700 GB used.)
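The values shown in Figure 2 can be turned into dormant-data percentages with a few lines of arithmetic. The sketch below assumes decimal units (1 TB taken as 1000 GB); the function name is illustrative only.

```python
# (total_gb, used_gb) pairs read from Figure 2; 1 TB is taken as 1000 GB,
# which is an assumption about the units in the figure.
FIGURE_2 = [(50, 49), (500, 400), (2000, 600), (10000, 700)]


def dormant_share(total_gb, used_gb):
    """Fraction of the warehouse that is not being used."""
    return (total_gb - used_gb) / total_gb


for total, used in FIGURE_2:
    print(f"{total:>6} GB total: {dormant_share(total, used):.0%} dormant")
```

Under these assumptions the dormant share climbs from 2% at 50 GB to 20%, then 70%, then 93% at 10 TB, even though the absolute amount of data in use keeps rising.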
The organization has just discovered what is termed dormant data. Dormant data in a data warehouse is like a 2000-pound anchor on a two-man rowboat. It causes problems far out of proportion to what one would ever imagine.

One way to understand the impact of dormant data in a data warehouse is based on the probability of access. In a first generation, mature data warehouse, there is typically some current data that is used very frequently and a lot of data that is rarely or never used at all.

Figure 3. Data can be grouped by probability of access (high probability of access versus low probability of access).

The next stage in the evolution of data warehouse architecture now becomes apparent: it makes both economic and technological sense to move dormant data out of the production system to some other storage media, in a different data tier. There are three main reasons for moving dormant data out of the first generation data warehouse environment:

- The cost of the data warehouse infrastructure is greatly reduced by moving data from the first generation data warehouse onto less expensive storage media.
- By moving dormant data out of the first generation data warehouse into the different storage tiers available in the next generation data warehouse, the organization can handle much larger data volumes than could ever be handled by a first generation data warehouse.
- Performance improves by alleviating the stress created by maintaining a huge database infrastructure.
Data Warehouse 2.0

Based on limitations in the first generation data warehouse, a second generation, DW 2.0, evolved to recognize and support the lifecycle within the data warehouse. There are several substantial differences between the first generation data warehouses and DW 2.0, most notably the recognition that as data ages, its characteristics and access requirements change. As a consequence, the infrastructure in DW 2.0 is divided into different storage types based on the age of the data. Data is first placed on a high performance storage type and is moved over time from this high-end storage type to the next lower-cost and lower-performance storage type based on the probability of data access. This second generation data warehouse recognizes the need for database partitions, indexes and storage tiers.

Partitioning Data in the Data Warehouse

One standard practice for managing the data warehouse environment is to break the archival data up into partitions. While there are many ways to partition warehouse data, the most common is to divide the data by date. One partition contains the data from 2003, the next partition contains data from 2004, the next partition contains data from 2005 and so forth. This mode of partitioning is natural because the data arrives by date. Other strategies can also be employed, such as partitioning by organizational unit, by geography, and so forth. And data can be partitioned by more than one set of parameters. For example, data can be partitioned by date and geography, or by date and organizational function, and so forth.

By dividing data into partitions, data can be structured according to user access patterns. Searches that can eliminate data in multiple partitions at once can be conducted more quickly and efficiently, lowering the cost of accessing data and reducing processing demands.

Figure 4. Different ways of partitioning data (examples shown: Consulting, Leasing, Hardware; Europe, Asia, Africa).

Using Storage Tiers to Manage Warehouse Data

To further optimize the data warehouse infrastructure, data partitions can be located on different storage tiers, having different performance, access, availability, and cost characteristics, based on the access requirements of the data. There are many reasons for separating a first generation data warehouse into physically separate subdivisions that reside on and are managed on different storage tiers.

Figure 5. Data with low probability of access can be moved to alternate storage.
The most obvious and compelling of those reasons is economics. By separating the first generation data warehouse into separate physical storage tiers, the small amount of data that is used frequently resides on expensive high performance disk storage, and the bulk of the data that is not being used resides on less expensive storage media.

Different storage tiering strategies can be employed. One possible strategy is to define storage tiers based on the performance requirements around data access and data updates:

- The interactive tier is where transaction processing takes place. The probability of access of data in the interactive tier is high.
- The integrated tier is the place where corporate data is created. The classical first generation data warehouse is found in the integrated tier. There is a reasonably high probability of access for data found in the integrated tier.
- The near line storage tier is optional. Some organizations need near line storage and some do not. Typically, near line storage is a cache for the integrated tier. Data found in near line storage has a low probability of access.
- The archival tier contains data with the lowest probability of access.

However it is done, data needs to be physically removed from the core production data warehouse, which in the context of the above storage tiering strategy corresponds to the integrated storage tier. This means that the data can be relocated to other storage types such as less expensive disk or file-based storage. Dormant data needs to be placed on a storage medium separate from that of the core production data warehouse. This leads to the evolution of data warehouse archiving.

Figure 6. One of the benefits of moving data to an alternate form of storage is to reduce cost (from expensive to inexpensive storage).
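Because tiers are populated partition by partition, it may help to see how date-based partitioning limits the data a query has to touch. The following is a minimal in-memory sketch, not a description of any particular database engine; the row layout and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def partition_by_year(rows):
    """Group rows into partitions keyed by the year of their 'day' field."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["day"].year].append(row)
    return partitions


def query(partitions, start, end):
    """Return rows dated within [start, end] and the partitions that were scanned."""
    # Partitions whose year falls outside the requested range are never read.
    scanned = [year for year in partitions if start.year <= year <= end.year]
    hits = [row for year in scanned for row in partitions[year]
            if start <= row["day"] <= end]
    return hits, scanned


rows = [
    {"day": date(2003, 5, 1), "amount": 10},
    {"day": date(2004, 2, 9), "amount": 20},
    {"day": date(2005, 7, 4), "amount": 30},
    {"day": date(2005, 11, 30), "amount": 40},
]
parts = partition_by_year(rows)
hits, scanned = query(parts, date(2005, 1, 1), date(2005, 12, 31))
print(scanned, [row["amount"] for row in hits])
```

A query for 2005 scans only the 2005 partition, so the 2003 and 2004 partitions could sit on slower, cheaper storage without affecting it.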
Data Archiving to Optimize Storage Tiers

Data archiving can be employed to automatically and physically relocate data with lower business value in data warehouses to more appropriate and cost-effective storage tiers. Data can have lower business value based on a number of criteria, such as data access and performance requirements, the age of the data, which region or department the data pertains to, or partition usage.

As low access data grows to consume the lion's share of the data warehouse, the most logical progression is for this data to be physically and logically separated from the core production data warehouse. Once the organization understands the issues of data management, the related economics, the issues of dormant data, and the evolutionary pressures created by data growth, the conclusion is inevitable that first generation data warehouses evolve to DW 2.0, and in doing so, the archival data storage tier is created.

The archival storage tier in the DW 2.0 data warehouse environment has many different characteristics that set it apart from the other parts of the data warehouse. The probability of access of data in the archival tier is low. Data is normally not updated in the archival environment. Database design may or may not be the same between the two environments.

The major drivers for data warehouse archiving are usually to reduce infrastructure cost by storage tiering, reduce maintenance cost, and maintain peak data warehouse performance. Simply relocating inactive data from the production data warehouses to lower-cost servers and storage achieves many of these goals, but your business requirements are likely to be more complex, such as how you access and retrieve archived data. You need to consider your organization's budget constraints and performance and access requirements when selecting a data warehouse archiving solution. Your IT organization will probably access archived data less frequently than active data.
But you may still have to periodically retrieve the combined archived and current data directly from the original application interface. In this case, the data should be archived to a format that facilitates relatively high query performance, such as another data warehouse instance located on a lower-cost infrastructure. On the other hand, if inactive data is quite old and is ready to be retired, you may have to access it only rarely. In this case, access from a reporting or e-discovery tool, rather than from an application interface, may be adequate. Slower query performance can be tolerated, and the data may be archived to a more optimal, compressed format, such as a compressed file.

Indexing Archival Data

Another significant component of building a data archiving environment is the practice of creating passive indexes. In the active parts of the data warehouse, the practice of creating indexes to enhance performance is very common. In the archival environment, however, projecting business requirements for future data access can be difficult. Generally, the archival environment is examined whenever a business need arises. But the business need may not be recognized for 20 years after the archival data is stored. Therefore, the processes used to build indexes in the data warehouse do not apply to the archival environment. To that end, there is the design practice of creating what are called passive indexes.
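As a rough illustration of the idea, a passive index can be thought of as a lookup structure built once, at archive time, on fields that might plausibly be searched later, without knowing which of them ever will be. The sketch below is an assumption about how such an index could look in memory; the field names and the function are hypothetical and do not describe any product's storage format.

```python
from collections import defaultdict


def build_passive_indexes(records, fields):
    """Build one value-to-positions index per field while the data is archived."""
    indexes = {field: defaultdict(list) for field in fields}
    for position, record in enumerate(records):
        for field in fields:
            if field in record:
                indexes[field][record[field]].append(position)
    return indexes


archive = [
    {"order_no": "A-100", "customer": "Acme", "part_no": "P-7"},
    {"order_no": "A-101", "customer": "Globex", "part_no": "P-7"},
    {"order_no": "A-102", "customer": "Acme", "part_no": "P-9"},
]
# Index every field that could be a future search criterion.
idx = build_passive_indexes(archive, ["customer", "part_no"])
print([archive[i]["order_no"] for i in idx["customer"]["Acme"]])
```

A later search by customer or part number then reads only the matching positions instead of scanning the whole archive.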
Typically, passive indexes are built using the likely or possible criteria for fast future retrieval of archival data. Part numbers, customer names, order numbers, phone calls made, and episodes of care are all likely pieces of data that could be indexed. An analysis of common usage patterns can help determine what data is likely to be referenced in the future. Archiving software should be able to analyze the data and automatically create indexes during the archival process, optimizing it for future access.

Figure 7. The more archival data can be indexed, the faster subsequent searches will be.

The Changing Structure of Data over Time

Because every organization undergoes change, and every change is ultimately reflected in the data structure, the database designer should expect that the data structure found in the archival environment will not remain constant. As data is added to the archival environment, the issue arises of managing data stored in different releases of software technology over a long period of time. Suppose that an organization starts to store data in the archival environment in 1990 under release 2.0 of a product. By 1996 the data is stored under release 3.1 of the product. More time passes and by 2005 data is stored under release 8.i. In 2010 data is stored under release 11.4.

Figure 8. Over time, newer releases of the software will change the data structure (releases shown: 2.0, 3.1, 8.i, 11.4).

Such a progression is absolutely normal. The question now becomes: can the current software release read and recognize data that was stored under an earlier release? Usually software vendors can handle the previous release of the software. But when it comes to going back to a software release that is a decade old (or even two decades old), a time comes when a vendor can no longer support a past data architecture while providing the new functionality that has become essential.
There are many approaches to handling changing data structures in the archival environment. One essential element of the archival environment is metadata: the descriptive information that defines the context and structure of the archival data. Maintaining the right metadata is essential to handling structural changes to archival data. One solution to managing structural changes is to maintain multiple metadata versions corresponding to the structural changes over time. Another solution is to update the metadata periodically to synchronize the archival metadata with the core production data warehouse structure. Regardless of the approach, a data archive solution needs to handle structural changes to archival data based on the evolution of the production data warehouse over time and shield the user from the maintenance nightmare.

Figure 9. Over time, the basic structure of data changes.

By evolving the data warehouse infrastructure to the DW 2.0 architecture, organizations become better able to balance data to meet access and system performance requirements. In doing so, the cost of the data warehouse is mitigated, enabling the data warehouse to more efficiently accommodate huge amounts of data. In addition, the data warehouse can store and manage data over a wide range of time. DW 2.0 manages data that is two seconds old and data that is 20 years old.

Figure 10. There is a natural evolution of data warehouses from the classic first generation to DW 2.0. (Diagram: the DW 2.0 architecture for the next generation of data warehousing, with four sectors, Interactive (very current), Integrated (current++), Near line (less than current), and Archival (older), each holding structured and unstructured data with local metadata, linked to an enterprise metadata repository and master data.)
Informatica Data Archive: The Complete Data Warehouse Archiving Solution

Informatica Data Archive helps your IT organization to cost-effectively manage the explosion of data volumes in data warehouses. It allows you to easily and safely archive inactive data, and then readily access it when needed. Informatica Data Archive delivers the full range of capabilities that your IT organization needs to effectively manage data growth in data warehouses, including:

- Robust archiving techniques that ensure data integrity after archiving and support multiple archive formats to enable optimal storage tiers
- Multiple, easy access methods to archived data
- Automatic indexing of archived data
- Automatic management of changing data structures
- Universal connectivity
- Integration with other archiving platforms, ECM, and storage solutions, such as Symantec, CommVault, and EMC

By leveraging the power of the Informatica Platform, the industry's leading data integration platform, Informatica Data Archive enables organizations to handle the huge data volumes typical of very large global enterprises. The software provides superior scalability and performance, delivering data to the most cost-effective storage option based on its value. It also offers unparalleled interoperability. The software is based on an open, easily extensible architecture, enabling simple integration with third-party solutions.

Robust Archiving Techniques Enable Optimal Storage Tiers

With Informatica Data Archive, you can archive to another data warehouse instance or to a highly compressed file format that can result in dramatic storage capacity savings. The compression ratio that can be achieved depends on the data: the larger the data size and the more redundancy in data values, the higher the ratio, and you may be able to achieve a compression ratio of 20:1 to 60:1 compared to the original data size. The choice of archiving to another data warehouse or a compressed file archive should be based on the age of the data and response time as well as frequency of access.
If you still need to access the data with relatively high frequency and with high performance, then archiving to another data warehouse instance is more appropriate. However, if data will be rarely accessed, for infrequent reporting or audit requirements, then archiving to a highly compressed file is the better solution. Archived data can be stored on a file system located in lower cost storage or even storage in the cloud, for economies of scale. As data ages and access requirements change over time, Informatica Data Archive automatically converts and relocates the data from one archiving format and location to another, enabling multiple cost-effective storage tiers.

Informatica Data Archive enables you to archive transactional and detailed data only, which are the fastest growing. This is done while maintaining data integrity and links to dimensional and aggregate tables that may still be stored in the production system. Eventually, some older dimension records may also be archived. Informatica Data Archive has deep knowledge about what types of tables should be archived to support an optimal archiving strategy. Informatica Data Archive can also handle partitions that were created in the production data warehouse and maintain those data partitions in the data archive, to maintain scalability and performance.

Figure 11 illustrates a data warehouse archiving strategy where detailed data are slowly relocated to another database and subsequently to a more optimal compressed file format, which results in extreme reduction in storage capacity. Informatica Data Archive provides an easy to use graphical interface to define archiving jobs without extensive configuration, scripting, or programming. Figure 12 shows the Informatica Data Archive wizard-based interface that allows users to easily define and monitor archiving jobs.

Figure 11. Informatica Data Archive offers multiple archiving formats (database or compressed file) that enable optimal storage tiering and the flexibility to archive different types of records while maintaining data integrity. (Tiers shown: Production Data Warehouse, less than 2 years old; Archive Data Warehouse, 2-7 years old; Optimized File Archive, 40:1 compression, over 7 years old. Tables shown: DIM1, DIM2, DIM3, AGG1, AGG2, DETAIL 1 through DETAIL 7, OLD_DIM2, OLD_DIM3.)

A data warehouse archiving solution that offers multiple archiving formats and accessibility options allows IT organizations to determine the appropriate tradeoffs among archive size, performance, application accessibility, and cost.

Figure 12. Archive complete business entities using Informatica Data Archive.

Your IT organization must also be able to restore archive data to its original location. Otherwise, there is no way to correct mistakes during archiving or to accommodate changes to access requirements. If archived data later needs to become active again and, for some reason, be modified and annotated, then it also needs to be restored. For example, a customer order that is closed and reopened may need to be restored because it has become active again.
Informatica Data Archive allows you to restore archived data at different levels of granularity, such as selected detail records, business entities, or an entire archive.
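The age bands and the 40:1 compression ratio shown in Figure 11 can be combined into a simple back-of-the-envelope placement rule. The sketch below is only an illustration of that arithmetic: the function names and the 2000 GB input are hypothetical, and the code is not how Informatica Data Archive decides where data goes.

```python
def storage_tier(age_years):
    """Map the age of detail data to a tier, using the age bands in Figure 11."""
    if age_years < 2:
        return "production data warehouse"
    if age_years <= 7:
        return "archive data warehouse"
    return "optimized file archive"


def compressed_size_gb(raw_gb, ratio=40):
    """Estimated size after compression at ratio:1 (40:1 is the Figure 11 example)."""
    return raw_gb / ratio


# Hypothetical case: 2000 GB of detail data that is nine years old.
print(storage_tier(9), compressed_size_gb(2000))
```

At the ratios of 20:1 to 60:1 quoted earlier, the same 2000 GB would occupy somewhere between roughly 33 GB and 100 GB once moved to the compressed file archive.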
Multiple, Easy Access Methods to Archived Data

Regardless of the archive format, archived data needs to be easily accessible either from the original application interface or through standard interfaces for reporting or compliance audits. Informatica Data Archive supports standard SQL/ODBC/JDBC interfaces for reporting using any reporting or business intelligence tool. The solution also offers the option to access the data from an application-aware data discovery portal to easily search, browse, and view archived or retired data based on business entities, with a look and feel similar to the original application interface.

Automatic Indexing of Archival Data

When archiving data to another data warehouse instance, Informatica Data Archive automatically builds and maintains indexes that exist in the production data warehouse instance. When archiving to a highly compressed file archive, data is automatically indexed and stored in an optimal format to facilitate efficient storage and scalable retrieval. No performance tuning or maintenance is required on the archival data, reducing IT staff time.

Automatic Management of Changing Data Structures

As the production data warehouse structure continues to evolve, Informatica Data Archive automatically updates the metadata and structure of the archival data warehouse. When archiving to a highly compressed file format, Informatica Data Archive maintains multiple versions of the metadata, corresponding to periodic snapshots of the production data warehouse structure. This enables point-in-time querying of the archival data based on the structure of the data warehouse at that point in time. By automatically managing the metadata and structure of archive data, based on the changing structure of the production data warehouse, Informatica Data Archive reduces the maintenance effort required on the archival infrastructure.
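The general idea of keeping multiple metadata versions for point-in-time reads can be sketched in a few lines. The class and function below are hypothetical helpers written for illustration; they assume snapshots are recorded in chronological order and are not a description of Informatica Data Archive's internal design.

```python
from bisect import bisect_right
from datetime import date


class SchemaHistory:
    """Keeps dated snapshots of a table's column list (hypothetical helper)."""

    def __init__(self):
        self._dates = []
        self._schemas = []

    def add_snapshot(self, effective, columns):
        # Snapshots are expected to be added in chronological order.
        self._dates.append(effective)
        self._schemas.append(list(columns))

    def schema_at(self, when):
        """Return the column list that was in force on the given date."""
        i = bisect_right(self._dates, when)
        if i == 0:
            raise LookupError("no snapshot on or before that date")
        return self._schemas[i - 1]


def decode(raw_values, history, archived_on):
    """Interpret a stored row using the structure that applied when it was archived."""
    return dict(zip(history.schema_at(archived_on), raw_values))


history = SchemaHistory()
history.add_snapshot(date(1990, 1, 1), ["id", "name"])
history.add_snapshot(date(2005, 1, 1), ["id", "name", "region"])
old_row = decode((7, "Acme"), history, date(1996, 6, 1))
new_row = decode((8, "Globex", "EMEA"), history, date(2010, 3, 1))
print(old_row, new_row)
```

A row archived in 1996 is read with the two-column structure of that period, while a row archived in 2010 is read with the later three-column structure, so neither has to be rewritten when the production structure changes.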
Universal Connectivity

If your organization is like many other enterprises, you have data warehouses and applications on multiple database systems on varying operating systems. To support your enterprise needs, Informatica Data Archive enables you to manage archive processes across data warehouses and applications on diverse databases, including relational (e.g., Oracle, DB2, Sybase, SQL Server, Teradata, Informix), mainframe (e.g., IDMS, VSAM, IMS), files, and packaged CRM and ERP applications on open systems (e.g., Windows, Linux, UNIX) or mainframes (e.g., z/OS, AS/400).

Integration with Other Archiving Platform, ECM, and Storage Solutions

Your company may already have an archiving solution for email and files. Your IT organization may also have standardized on an Enterprise Content Management (ECM) solution to manage your unstructured data. To support compliance with regulatory requirements and ensure immutability and single instance storage of retained data, you may be using archiving platforms, such as Content Addressable Storage (CAS), which require proprietary connectivity. To enable your organization to respond quickly and accurately to audit requests as well as to cost-effectively retain data for longer periods, Informatica Data Archive allows you to centrally manage and discover archived data of all types, both structured and unstructured. This is achieved through integration with existing archiving, content management, and storage solutions, including EMC Documentum, Symantec Enterprise Vault and Discovery Accelerator, and CommVault Simpana and eDiscovery, to facilitate centralized management and e-discovery of all types of archived data.
Conclusion

Based on an explosion of data in corporate environments, the data warehouse has evolved from a simple platform for reliable, consolidated and integrated data reporting to a sophisticated data infrastructure that recognizes a complex data lifecycle. By understanding the dynamic forces at work in data growth, storage and accessibility, and the technologies available today to effectively manage the operation and cost of the data warehouse, IT organizations should now be better positioned to implement solutions that make their data warehouse environments more manageable, productive and efficient.

DW 2.0, the second generation data warehouse, recognizes that as data ages, its characteristics and access requirements change. By dividing data into different storage tiers based on the age and frequency of access, from high performance storage for interactive data to lower cost, lower performance storage for low access or inactive data, DW 2.0 provides a platform for managing warehouse data more effectively.

The key to capping your IT organization's data warehouse management costs and risks is to relocate dormant data to a lower-cost infrastructure that the storage tiering in DW 2.0 architecture makes available. This is what data warehouse archiving solutions can do for you: archive data based on its point in the lifecycle, while maintaining data integrity and easy access to the data.

Informatica Data Archive enables organizations to handle the huge data volumes typical of very large global enterprises. By providing comprehensive and robust techniques to easily and safely archive inactive data, and then readily access it when needed, Informatica Data Archive delivers the complete archiving solution necessary to provide an optimal and cost-effective data warehouse infrastructure.
When your IT organization implements a complete, scalable, and flexible archiving solution, you'll reduce the total cost of ownership of your data warehouses and other applications by:

- Reducing storage, server, software, and maintenance costs
- Improving data warehouse performance
- Increasing data warehouse availability
- Supporting compliance with internal, industry, and governmental mandates and regulations

Together, Informatica and your IT organization can align the business value of data with the most appropriate and cost-effective IT infrastructure to manage it.
Learn More

Learn more about the Informatica Platform. Visit us at or call ( in the U.S.).

About Informatica

Informatica Corporation (NASDAQ: INFA) is the world's number one independent provider of data integration software. Organizations around the world gain a competitive advantage in today's global information economy with timely, relevant, and trustworthy data for their top business imperatives. More than 3,900 enterprises worldwide rely on Informatica to access, integrate, and trust their information assets held in the traditional enterprise, off premise, and in the Cloud.

About Bill Inmon

Bill Inmon, the father of data warehousing, has written 52 books translated into 9 languages. Bill founded and took public the world's first ETL software company. Bill has written over 1000 articles and has published in most major trade journals. Bill has conducted seminars on every continent except Antarctica.

References

W. H. Inmon, DW 2.0: Architecture for the Next Generation of Data Warehousing, Morgan Kaufmann, Boston, Mass., 2008.
Inmoncif.com: a website with many white papers and other information about data warehouses and DW 2.0.
Worldwide Headquarters, 100 Cardinal Way, Redwood City, CA 94063, USA
phone: fax: toll-free in the US:
Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, and The Data Integration Company are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.
First Published: April (04/01/2010)
