Modeling and Metadata Strategies for Next Generation Architectures




White Paper

Data Warehousing 2.0: Modeling and Metadata Strategies for Next Generation Architectures

By Bill H. Inmon, Forest Rim Technology, LLC
April 2010

Corporate Headquarters: 100 California Street, 12th Floor, San Francisco, California 94111
EMEA Headquarters: York House, 18 York Road, Maidenhead, Berkshire SL6 1SF, United Kingdom
Asia-Pacific Headquarters: L7, 313 La Trobe Street, Melbourne VIC 3000, Australia

INTRODUCTION

Data warehousing has been in a constant state of evolution since its beginning. Just when you think that everything has been discovered and developed, data warehousing evolves once again, mutating into a new form and structure.

EVOLUTION OF THE DATA WAREHOUSE

From the beginning, the evolution of data warehousing has been shaped by powerful forces. First there was the need for access to data. Then there was the need for integration. Then came the need for a single version of the truth. Then different departmental needs for looking at the same foundation of data arose. The general evolution of the first stages of the data warehouse environment is shown by Fig 1.

Figure 1. First there were applications, then there was a data warehouse, then there was an infrastructure surrounding the data warehouse.

In the beginning there were applications. Applications grew so quickly that there developed what was termed the spider web environment, where the same data was scattered all over the landscape. The frustration with the spider web environment led to the creation of the first data warehouse. The simple and singular data warehouse addressed many of the problems of the spider web environment. But soon it was discovered that other components of the data warehouse infrastructure were needed. There was a need for ETL, the process that allows data to be read in an unintegrated application format and written out in a corporate, integrated format. Then there were data marts, where different departments had their own version of the base data found in the data warehouse environment. Soon the ODS appeared, where there was a need for high performance transaction processing on integrated data. An entire infrastructure grew around the world of the simple data warehouse. An architecture called the CIF, or the corporate information factory, grew around the data warehouse.

But the evolution of the data warehouse did not stop there. Soon the evolution of the data warehouse grew to include a broader set of requirements. Soon the notion of a data warehouse expanded to include a full set of data fulfilling a newly discovered set of requirements that reached well beyond the original notion of what a data warehouse should be.

Embarcadero Technologies, Inc.

DW 2.0: ARCHITECTURE FOR THE NEXT GENERATION OF DATA WAREHOUSING

This new architecture was called DW 2.0. Fig 2 shows the basic description of DW 2.0.

Figure 2. Then there was DW 2.0, architecture for the next generation of data warehousing. (Sectors shown: interactive, current; integrated, current++; near line, less than current; archival, older.)

Like the earlier renditions of data warehouse architecture that preceded DW 2.0, DW 2.0 was shaped by powerful evolutionary forces.

THE LIFE CYCLE OF DATA

The first powerful evolutionary force that shaped DW 2.0 was the recognition that the data within the data warehouse contained its own life cycle. It was not enough to merely place data on disk storage and call it a data warehouse. As time passed, that data began to exhibit its own characteristics.

The first manifestation of the life cycle of the data within the data warehouse was that over time, the probability of access to data in the data warehouse dropped. The older data became, the less that data was accessed. A second manifestation of the life cycle of data within the data warehouse was that over time the volumes of data in the data warehouse grew rapidly. This led to a paradox: the larger a data warehouse became and the older the data in the data warehouse, the smaller the percentage of data that was being used.

UNSTRUCTURED DATA

The second manifestation of evolutionary forces in the data warehouse was the realization that unstructured, textual data belonged in the data warehouse. The original data that was placed in the data warehouse was transaction based, repetitive data. Many corporate decisions were based on this kind of data. But there is much important data that is not transaction based that belongs in the data warehouse.

METADATA

A third realization was that metadata belonged in the data warehouse as a formal and integral component. In first generation data warehouses, metadata was an afterthought. But there are powerful reasons why metadata needs to become an integral and formal part of the data warehouse environment. Fig 3 shows the evolutionary forces and their effect on the DW 2.0 environment.

Figure 3. The forces that shaped DW 2.0: a recognition of the life cycle of data within the data warehouse; recognition of the need for a formal metadata infrastructure; recognition of the need to integrate both structured and unstructured data in the data warehouse.

DW 2.0 has become the architectural paradigm for modern data warehouses. For a detailed description of DW 2.0 refer to the book DW 2.0: ARCHITECTURE FOR THE NEXT GENERATION OF DATA WAREHOUSING, Morgan Kaufmann, 2008.
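
The life cycle of data described above can be sketched as a simple routing rule that assigns data to a DW 2.0 sector by its age. This is only an illustrative sketch: the paper gives no numeric cutoffs, so the age limits, the name SECTOR_LIMITS and the function sector_for are all assumptions made for the example. The only points taken from the paper are that older data is accessed less often and that the near line sector is optional.

```python
from datetime import date

# Hypothetical age limits in days. The paper states no cutoffs;
# these numbers exist only to make the sketch runnable.
SECTOR_LIMITS = (
    ("interactive", 30),
    ("integrated", 3 * 365),
    ("near line", 7 * 365),
)


def sector_for(record_date: date, today: date, use_near_line: bool = True) -> str:
    """Return the sector that would hold data of this age.

    The near line sector is optional, so it can be switched off;
    data past the integrated limit then goes straight to archival.
    """
    age_days = (today - record_date).days
    if age_days < 0:
        raise ValueError("record_date lies in the future")
    for sector, limit in SECTOR_LIMITS:
        if sector == "near line" and not use_near_line:
            continue
        if age_days <= limit:
            return sector
    # Anything older than every limit is archival data.
    return "archival"
```

In practice the movement of data between sectors would more likely be driven by observed probability of access than by age alone; age is used here because it is the simplest stand-in for it.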

Like most large architectures, DW 2.0 is not built all at once. Instead an organization builds first one component of DW 2.0, then another component. Indeed some components of the data warehouse are optional, such as the near line sector.

COMPARTMENTS OF PROCESSING

In any case, the different components of DW 2.0 consist of sectors that perform one basic function. These sectors are in a sense their own small operating environments. Each sector has its own purpose and its own functionality. Fig 4 shows that each sector is neatly compartmentalized.

Figure 4. DW 2.0 is made of different components.

In light of the individual compartmentalization of the sectors of DW 2.0, the question then becomes: how is work distributed and coordinated across the different components? The answer is that work is distributed and coordinated by means of the passage of metadata from one component to the next.

METADATA AS THE GLUE

Fig 5 shows that metadata is the glue that binds the different operating components of DW 2.0 together.

Figure 5. There is a need for tight communications and tight coordination among the different components of DW 2.0.

In a sense, metadata forms a lattice work that holds the components of DW 2.0 together. It is the passage of metadata that allows the different components of DW 2.0 to work in cooperation and in coordination with each other. Stated differently, without metadata, the different components of DW 2.0 would not be able to achieve a coordinated and cohesive work flow.

Consider a symphony orchestra. What kind of music would be created if the violins were playing Beethoven's Fifth, the cellos were playing Hotel California, the drums were playing Aretha Franklin's Respect, the flutes were playing Happy Birthday, and the trumpets were playing Jingle Bells? The result would be a bunch of noise, nothing that anyone would want to listen to. But now suppose a conductor steps in and gets everyone to play Debussy's La Mer. Now, suddenly, the very same components that just a few seconds ago sounded awful are making very beautiful music.

What is needed is a conductor, and it is metadata that plays the role of the conductor. Metadata of course does not direct a symphony orchestra. Metadata instead directs the DW 2.0 environment, with all of its different components and different technologies. Fig 6 shows the role of metadata and the role of a conductor.

Figure 6. The orchestra with no conductor; the orchestra with a conductor.

So what kind of metadata is passed from one DW 2.0 component to another? In truth there are all sorts of metadata that are passed from one DW 2.0 component to another.

DIFFERENT TYPES OF METADATA

Typical types of metadata that are passed include:

- data descriptions
- data definitions
- formulas and algorithms
- the timing of processing
- descriptions of assorted volumes of data
- operating parameters
- encoded values
- processed/calculated/derived data
- ratios and statistics
- status codes, and the like

Fig 7 shows the different kinds of metadata that can be passed from one component to the next. (Note: this list is merely a representative sample of the different types of metadata that can be passed from one component to the next.)

Figure 7. The metadata that is passed from one component to the next includes data descriptions, parameters, processed data, ratios and statistics, status codes, and so forth.

The data descriptions that can be passed from one component of DW 2.0 to another often reflect on the data model that exists for each component of DW 2.0. Indeed, there is a data model for each of the different components of the DW 2.0 environment.
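
To make the idea of passing metadata between components more concrete, the sketch below bundles several of the metadata types named in this section into one structure that a sending sector could hand to a receiving sector. The class name MetadataPacket, its field names and the helper decode_value are hypothetical; the paper does not prescribe any particular format or interface.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataPacket:
    """Hypothetical bundle of metadata handed from one sector to the next."""

    source: str   # sending component, e.g. "integrated"
    target: str   # receiving component, e.g. "archival"
    data_descriptions: dict[str, str] = field(default_factory=dict)
    formulas: dict[str, str] = field(default_factory=dict)
    operating_parameters: dict[str, str] = field(default_factory=dict)
    # Per column: mapping from stored code to its meaning.
    encoded_values: dict[str, dict[str, str]] = field(default_factory=dict)
    status_code: str = "OK"


def decode_value(packet: MetadataPacket, column: str, code: str) -> str:
    """Translate an encoded value using the metadata that came with it.

    Codes with no entry are returned unchanged, so a gap in the
    metadata never hides the underlying data.
    """
    return packet.encoded_values.get(column, {}).get(code, code)


packet = MetadataPacket(
    source="integrated",
    target="archival",
    data_descriptions={"gender": "customer gender as captured at signup"},
    encoded_values={"gender": {"M": "male", "F": "female"}},
)
```

With the packet above, the receiving component can interpret the column without consulting the sending component: decode_value(packet, "gender", "F") yields "female", while a code missing from the metadata is passed through as-is.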

DIFFERENT DATA MODELS

Fig 8 shows the different data models that exist for the different components of DW 2.0.

Figure 8. There is a data model for each sector of the DW 2.0 environment: an operational, application data model; an integrated, corporate data model; for the near line sector, see the integrated, corporate data model; and an archival data model.

At the operational level (i.e., the interactive sector of DW 2.0) is an application oriented data model. This data model reflects the transactional nature of the work that is done here. While integration may occur here, it often doesn't. In addition, this level of data often includes the reception of data from applications that lie outside of it.

The second data model is the data model for the integrated sector. The integrated sector is the place where corporate data integration occurs. The integrated data model is a classical subject oriented, non volatile data model.

The third data model that (optionally) appears is the data model for the near line sector. If the near line sector is part of the data warehouse environment at all, then the near line data model is IDENTICAL, in every way, to the integrated data model.

The fourth data model found in the DW 2.0 environment is the data model for the archival environment. The archival data model reflects several important aspects of design:

- The need to look at and measure data over lengthy periods of time
- The need to reflect a changing metadata structure over time
- The need to store metadata directly in the same physical volume of data as the actual data itself
- The need to store calculation algorithms along with summarized or aggregated data
- The need to be able to free data from its software structuring
- The need to be able to further normalize data as it is placed in the archival environment, and so forth.

There are then some distinctive needs for a data model for each of the different layers of processing that occur in the DW 2.0 environment.

A COMMON THEME

It is of interest that the different data models for the different sectors of DW 2.0 are in fact very different. Having stated that, there nevertheless is a common theme running through each of the different data models. For example, the integrated data model is undeniably akin to its interactive data model. And the data model found at the archival environment is undeniably related to the integrated data model.

The similarities of the different data models to each other are somewhat akin to looking at a grandfather, his daughter, and the child of the daughter. There will be certain facial and body similarities throughout the family. But no one will mistake a grandfather for his daughter, and no one will mistake a daughter for her child. Fig 9 shows that there are definite familial similarities running between one data model and the next.

Figure 9. There is a great deal of commonality from one type of data model to the next.
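
The relationship between the sector data models can be illustrated with a small sketch. It assumes a hypothetical integrated model for a single subject area, reuses that same model unchanged for the near line sector, and derives an archival record that carries its own metadata and calculation algorithms in plain JSON text, so that the data is not tied to any particular software. All names (FieldSpec, INTEGRATED_CUSTOMER, to_archival_record) and the sample fields are assumptions for illustration, not part of DW 2.0 itself.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    dtype: str
    description: str


# Hypothetical integrated-sector data model for one subject area.
INTEGRATED_CUSTOMER = (
    FieldSpec("customer_id", "int", "corporate customer key"),
    FieldSpec("total_spend", "float", "sum of all order amounts"),
)

# The near line data model is identical to the integrated one,
# so the sketch reuses the very same definition.
NEAR_LINE_CUSTOMER = INTEGRATED_CUSTOMER


def to_archival_record(row, model, algorithms, model_version):
    """Bundle a row with the metadata needed to read it years later.

    The result is plain JSON text holding the data, the field
    descriptions, the model version in force when the row was archived,
    and the algorithms used to derive any summarized values.
    """
    missing = [f.name for f in model if f.name not in row]
    if missing:
        raise ValueError(f"row lacks fields required by the model: {missing}")
    record = {
        "model_version": model_version,
        "fields": [asdict(f) for f in model],
        "algorithms": algorithms,
        "data": {f.name: row[f.name] for f in model},
    }
    return json.dumps(record, sort_keys=True)


archived = to_archival_record(
    {"customer_id": 7, "total_spend": 120.5},
    INTEGRATED_CUSTOMER,
    {"total_spend": "SUM(order.amount) per customer"},
    model_version="2010-04",
)
```

Recording model_version with every record is one simple way to reflect a metadata structure that changes over time: records archived under different versions can sit side by side, each describing itself.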

THE TECHNICAL COMMUNITY

So who do the metadata and the data model aspects of the DW 2.0 environment help? The first set of people helped by the data models and the metadata is the development and maintenance organizations. The technicians of the organization find that data models and metadata become the Bible of development and maintenance. The data models and the metadata become the true intellectual roadmap for the ongoing development and maintenance of the data warehouse environment.

Stated differently, without metadata and a data model, the technician is like a mechanic working on a Yugo where the Yugo is no longer produced and where the manuals describing the Yugo are either all lost or all in Serbian, where the mechanic only reads and speaks English. Trying to do a repair job on a Yugo under those circumstances is a questionable proposition at best. Fig 10 shows the use of the metadata and data models by the technical community.

Figure 10. The data model for each component of DW 2.0 becomes the intellectual road map for the building and the maintenance of the structures found in the component.

THE END USER ANALYTICAL COMMUNITY

But there is another audience that is well served by metadata and data models, and that audience is the end user analyst. Consider a newly hired end user analyst who has just received an office right across from you. The end user analyst has just received an MBA and wants to show his/her stuff. The end user analyst starts to build an analysis and becomes perplexed. There are so many types of data, so much similar data, so much data that is of different ages that the analyst does not know where to start. Merely biting off a chunk of data may be very misleading if the analyst doesn't choose the proper data. And there is a lot of data to choose from. So the analyst comes to your office and asks: how do I find my way around this environment? There is a lot of data to choose from and I don't want to start with the wrong data.

It is at this point that the metadata and the data models become invaluable. Without the metadata and the data models the end user analyst may wander around the maze of data for a long time. But with the metadata and the data models, the end user analyst can determine what data is where and what data provides the best source for the analysis at hand. Fig 11 shows that metadata and the data models are extremely useful to the analytical community as well as the technical community.

Figure 11. Another group of people who are greatly helped by metadata and data models are the end user analysts who have to find what data there is to use as they do their analysis.

SUMMARY

In this paper we have discussed DW 2.0 and the evolution of the architecture surrounding the data warehouse. DW 2.0 has different sectors. There is a need for metadata to control the activities in each of these sectors. In addition there is a data model for each of the sectors. Each data model is different and yet there is a common theme running through each data model. The data models and the metadata serve two communities: the technical community and the end user analytical community. For the technical community the metadata and the data models act as an intellectual roadmap. For the end user analytical community, the data models and the metadata serve as a guidepost for where the end user analyst should go in order to find the proper data for analysis.

REFERENCES

Inmon, W. H. DW 2.0: ARCHITECTURE FOR THE NEXT GENERATION OF DATA WAREHOUSING, Morgan Kaufmann, 2008.

ABOUT THE AUTHOR

Bill Inmon, the father of data warehousing, has written 50 books translated into 9 languages. Bill founded and took public the world's first ETL software company. Bill has written over 1000 articles and published in most major trade journals. Bill has conducted seminars and spoken at conferences on every continent except Antarctica. Bill holds three software patents. Bill's latest company is Forest Rim Technology, a company dedicated to the access and integration of unstructured data into the structured world. Bill's website, inmoncif.com, has attracted over 1,000,000 visitors a month. Bill's weekly newsletter in b-eye-network.com is one of the most widely read in the industry and goes out to 75,000 subscribers each week.

Embarcadero Technologies, Inc. is a leading provider of award-winning tools for application developers and database professionals so they can design systems right, build them faster and run them better, regardless of their platform or programming language. Ninety of the Fortune 100 and an active community of more than three million users worldwide rely on Embarcadero products to increase productivity, reduce costs, simplify change management and compliance and accelerate innovation. The company's flagship tools include: Embarcadero RAD Studio, DBArtisan, Delphi, ER/Studio, JBuilder and Rapid SQL. Founded in 1993, Embarcadero is headquartered in San Francisco, with offices located around the world. Embarcadero is online at www.embarcadero.com.