Eleven Steps to Success in Data Warehousing




DATA WAREHOUSING
BUILDING A DATA WAREHOUSE IS NO EASY TASK... THE RIGHT PEOPLE, METHODOLOGY, AND EXPERIENCE ARE EXTREMELY CRITICAL

by Sanjay Raizada, General Manager, Northwest Operations, Beaverton, Oregon

1. RECOGNIZE THAT THE JOB IS PROBABLY HARDER THAN YOU EXPECT
2. UNDERSTAND THE DATA IN YOUR EXISTING SYSTEMS
3. BE SURE TO RECOGNIZE EQUIVALENT ENTITIES
4. USE METADATA TO SUPPORT DATA QUALITY
5. SELECT THE RIGHT DATA TRANSFORMATION TOOLS
6. TAKE ADVANTAGE OF EXTERNAL RESOURCES
7. UTILIZE NEW INFORMATION DISTRIBUTION METHODS
8. FOCUS ON HIGHER PAYBACK MARKETING APPLICATIONS
9. EMPHASIZE EARLY WINS TO BUILD SUPPORT THROUGHOUT THE ORGANIZATION
10. DON'T UNDERESTIMATE HARDWARE REQUIREMENTS
11. CONSIDER OUTSOURCING YOUR DATA WAREHOUSE DEVELOPMENT AND MAINTENANCE
12. CONCLUSION

More and more companies are using data warehousing as a strategic tool to help them win new customers, develop new products, and lower costs. Searching through mountains of data generated by corporate transaction systems can provide insights and highlight critical facts that can significantly improve business performance. Until recently, data warehousing was an option mostly for large companies, but the reduction in the cost of warehousing technology makes it practical, often even a competitive requirement, for smaller companies as well. Turnkey integrated analytical solutions are reducing the cost, time, and risk involved in implementation. While access to the warehouse was previously limited to highly trained analytical specialists, today corporate portals are making it possible to grant access to hundreds, even thousands, of employees.

1. RECOGNIZE THAT THE JOB IS PROBABLY HARDER THAN YOU EXPECT

Experts frequently report that 30 percent to 50 percent of the information in a typical database is missing or incorrect. This situation may not be noticeable, or may even be acceptable, in an operational system that focuses on swiftly and accurately processing current transactions. But it's totally unacceptable in a data warehousing system designed to sort through millions of historical records in order to identify trends or select potential customers for a new product. And even when the data is correct, it may not be usable in a data warehouse environment. For example, legacy system programmers often use shortcuts to save disk space or CPU cycles, such as using numbers instead of names of cities, which makes the data meaningless in a generic environment. Another challenge is that database schemas often change over the lifecycle of a project, yet few companies take the time to rebuild historical databases to account for these changes.

2. UNDERSTAND THE DATA IN YOUR EXISTING SYSTEMS

The first step in any data warehousing project should be to perform a detailed analysis of the status of all databases that will potentially contribute to the data warehouse. An important part of understanding the existing data is determining the interrelationships between the various systems, because those interrelationships must be maintained as the data is moved into the warehouse. In addition, implementing the data warehouse often involves making changes to database schemas. A clear understanding of data relationships among heterogeneous systems is required to determine in advance the impact of any such change. Otherwise, it's possible for changes to create inconsistencies in other areas that ripple across the entire enterprise, creating enormous headaches.

3. BE SURE TO RECOGNIZE EQUIVALENT ENTITIES

One of the most important aspects of preparing for a data warehousing project is identifying equivalent entities in heterogeneous systems. The problem arises from the fact that the same essential piece of information may appear under different field names in different parts of the organization. For example, two different divisions may be servicing the same customer yet have the name entered in a slightly different manner in their records (such as AIG and American International Group). A data transformation product capable of fuzzy matching can be used to identify and correct this and similar problems.

More complicated issues arise when corporate entities take a conceptually different approach in the way that they manage in-store data. This situation frequently occurs in cases of a merger. The database schemas of the two organizations were established under the influence of two entirely different corporate cultures. Establishing a common database structure can be just as important as merging the corporate cultures, and it is crucial to obtaining the full effects of synergy.

4. USE METADATA TO SUPPORT DATA QUALITY

Metadata is crucial to a successful data warehousing implementation. By definition, metadata is data about data; for example, tags that indicate the subject of a World Wide Web document. There are many types of metadata that can be associated with a database: to characterize and index data, to facilitate or restrict access to data, to determine the source and currency of data, etc. One major challenge is trying to synchronize the metadata between different vendor products, different functions, and different metadata stores. A major data center might easily have a 500-page schema layout or COBOL copybooks with 200,000 lines and up. But the chances are that many undocumented changes have still been made to minimize the work involved in dealing with historical data. This emphasizes the importance of starting as soon as possible to create and capture metadata for interfaces, business processes, and database requirements. Several vendors have developed products that have the potential to integrate metadata from disparate sources and begin to establish a central repository that can be used to provide the information needed by both administrators and users.

5. SELECT THE RIGHT DATA TRANSFORMATION TOOLS

Data transformation tools extract data from the operational sources, clean it, and load it into the data warehouse while capturing the history of that process. This transformation process may include creating and populating new fields from the operational data, summarizing data to an appropriate level for analysis, performing error checking operations to validate the integrity of the data, etc. Look for a tool that makes it possible to map data from source to target with a simple point-and-click interface. The ability to track and manage the relationships of interrelated data entities that are affected when changing database structures is also useful. Finally, try to find a tool that can capture and store metadata during the conversion process.
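To make the clean-and-load step more concrete, here is a minimal sketch in Python of what such a transformation might do: reconcile equivalent customer names, decode numeric city codes into names, run a basic error check, and record the history of each row. It is an illustration under stated assumptions, not a description of any particular product. The lookup tables, field names, and the `canonical_customer` and `transform` functions are all hypothetical, and the fuzzy matching uses the standard library's `difflib` as a stand-in for a commercial matching engine. Note that string similarity alone cannot link an acronym such as AIG to its full name, so the sketch assumes an explicit alias table for those cases.

```python
from difflib import get_close_matches

# Hypothetical reference data; a real project would draw these from metadata.
CITY_CODES = {17: "Portland", 42: "Detroit"}
CANONICAL_CUSTOMERS = ["American International Group", "Acme Manufacturing"]
ALIASES = {"AIG": "American International Group"}  # acronyms need an explicit table


def canonical_customer(name, cutoff=0.75):
    """Map a raw customer name to its canonical form, or None if nothing matches."""
    name = name.strip()
    if name in ALIASES:
        return ALIASES[name]
    # Fuzzy match against the known names; cutoff is an assumed similarity threshold.
    matches = get_close_matches(name, CANONICAL_CUSTOMERS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def transform(rows):
    """Clean source rows and return (loaded_rows, rejected_rows, history)."""
    loaded, rejected, history = [], [], []
    for index, row in enumerate(rows):
        customer = canonical_customer(row["customer"])
        city = CITY_CODES.get(row["city_code"])  # decode the legacy numeric code
        # Error check: unknown customer, unknown city code, or negative amount.
        if customer is None or city is None or row["amount"] < 0:
            rejected.append(row)
            history.append((index, "rejected"))
            continue
        loaded.append({"customer": customer, "city": city, "amount": row["amount"]})
        history.append((index, "loaded"))
    return loaded, rejected, history
```

Calling `transform` on a list of dictionaries with `customer`, `city_code`, and `amount` keys returns the cleaned rows, the rejected rows, and a per-row history. Keeping the rejected rows and the history, rather than silently dropping bad records, is one simple way to capture the history of the process described above.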

6. TAKE ADVANTAGE OF EXTERNAL RESOURCES

External sources of information, such as data from a customer's transaction processing system or market research data provided by a third party, can often greatly increase the value of internal information. Rather than simply comparing sales against the same month last year, use of external data might make it possible, for example, to compare sales growth against the increase in the overall market. Or suppose that you want to estimate the income of each of your customers. You may be able to obtain a database containing the average income for every ZIP Code in the country. While this will not provide accurate answers for individual customers, it will enable you to obtain a good estimate of an aggregate grouping, such as all 30 to 45 year olds in Kansas, that might be used in a targeted marketing campaign. The integration challenge is even greater when external data sources are involved. In some cases, external data will differ so drastically from existing schema that data transformation algorithms will be required to make use of the external resource.

7. UTILIZE NEW INFORMATION DISTRIBUTION METHODS

The biggest technological improvements in recent years in the data warehousing field have come in the area of information delivery. In the past, highly skilled analysts were needed to prepare information in the format requested by users. Today, information can be delivered in several different ways directly to the people who need it. Users can subscribe to regular reports and have them delivered via e-mail. If, when run, the report contains data that meets the subscription criteria, a customer-filtered view of the data will be delivered to the user by means of an e-mail attachment. The user only has to click on the e-mail attachment to bring up the report. The report data stays securely and economically on the server and only the pages requested by the authorized user are sent on demand. Another option is to let the user log in, search for data, and open the reports that provide the needed information. Only the pages requested by the user are sent over the network, to minimize unnecessary traffic.
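The subscription and page-on-demand delivery described in this step can be sketched in a few lines of Python. This is only an illustration: the function names, the row layout, and the idea of expressing subscription criteria as a predicate function are assumptions made for the example, not features of any specific reporting product.

```python
def filter_for_subscriber(report_rows, criteria):
    """Return only the rows that meet the subscriber's criteria (a predicate)."""
    return [row for row in report_rows if criteria(row)]


def should_deliver(report_rows, criteria):
    """A report is sent only if at least one row meets the subscription criteria."""
    return any(criteria(row) for row in report_rows)


def get_page(rows, page_number, page_size=2):
    """Return one page of rows (page numbers start at 1).

    The full result stays on the server; only the requested slice is returned.
    """
    if page_number < 1 or page_size < 1:
        raise ValueError("page_number and page_size must be positive")
    start = (page_number - 1) * page_size
    return rows[start:start + page_size]
```

For example, a regional manager might subscribe with a predicate such as `lambda row: row["region"] == "Northwest" and row["sales"] < row["target"]`. The server would run the report, call `should_deliver`, and then serve the filtered rows one page at a time through `get_page` as the user requests them.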

8. FOCUS ON HIGHER PAYBACK MARKETING APPLICATIONS

Most of the really hot applications in data warehousing involve marketing because of the potential for an immediate payback in terms of increased revenues. For example, catalog manufacturers are using data warehousing to match personal characteristics to purchases of specific items. This empirical data is then used to produce special demographic editions that generate higher sales since they are targeted more closely to their customers' needs. Companies selling more complicated and higher value products, such as brokerages, insurance companies, credit card firms, etc., are weaving data warehousing into their relationship-based businesses in order to dramatically increase the information flow between their firm and their customer. Innovative banks are even using similar methods to sort out customers whose trading characteristics match the profiles of known money launderers.

9. EMPHASIZE EARLY WINS TO BUILD SUPPORT THROUGHOUT THE ORGANIZATION

The availability of a wide range of off-the-shelf solutions has made it possible to drastically reduce cost and lead-time requirements for data warehousing applications. Off-the-shelf solutions won't usually complete project objectives, but they often can be used to provide point solutions in a short time that serve as a training and demonstration platform and, most importantly, build momentum for full-scale implementation. Even for the largest scale applications, technology surveys should be performed in order to maximize the use of pre-built technology.

10. DON'T UNDERESTIMATE HARDWARE REQUIREMENTS

The hardware requirements for data warehousing database servers are surprisingly high. The primary reason is the large number of CPU cycles required to slice and dice data over and over again to meet the varying needs of users throughout the organization. Database size also plays a part in the server performance requirements; storage requirements range up to terabytes and higher. For these reasons, be sure to select a scalable platform regardless of how much headroom you have provided in your server specification. The typical data warehouse implementation starts out at the departmental level and grows over time to an enterprise-wide solution. Purchasing servers that can be expanded with additional processors is one possible approach. A more ambitious idea is to combine loosely coupled systems that enable the database to be spread out over multiple servers, although it still appears as a single entity to users.
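The idea of spreading a database over multiple servers while presenting a single entity to users can be illustrated with a small Python sketch. Everything here is hypothetical: the `PartitionedStore` class, the use of in-memory dictionaries to stand in for separate servers, and the choice of a CRC32 hash to route records are assumptions for the example, and real products handle partitioning, failover, and rebalancing in far more elaborate ways.

```python
import zlib


class PartitionedStore:
    """One logical table whose records are spread across several 'servers'."""

    def __init__(self, server_count):
        if server_count < 1:
            raise ValueError("server_count must be at least 1")
        # Each dictionary stands in for the storage of one loosely coupled server.
        self.servers = [{} for _ in range(server_count)]

    def _server_for(self, key):
        # A stable hash keeps the same key on the same server across runs.
        index = zlib.crc32(key.encode("utf-8")) % len(self.servers)
        return self.servers[index]

    def put(self, key, record):
        self._server_for(key)[key] = record

    def get(self, key):
        return self._server_for(key).get(key)

    def total(self, field):
        """Fan the query out to every server and combine the partial sums."""
        return sum(
            sum(record[field] for record in server.values())
            for server in self.servers
        )

    def __len__(self):
        return sum(len(server) for server in self.servers)
```

Callers use `put`, `get`, and `total` without knowing which server holds a given record, which is the sense in which the store still appears as a single entity. One limitation of this simple modulo routing is that changing the number of servers would send most keys to a different server, so growing such a system requires redistributing the data.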

11. CONSIDER OUTSOURCING YOUR DATA WAREHOUSE DEVELOPMENT AND MAINTENANCE

A large percentage of medium and large companies use outsourcing to avoid the difficulty of locating, and the high cost of retaining, skilled IT staff members. Most data warehousing applications fit the main criteria for a good outsourcing project: a large project that has been defined to the point that it does not require day-to-day interaction between business and development teams. There have been many cases where an outsourcing team is able to make dramatic improvements in a new or existing data warehouse. Typically, these improvements do not necessarily stem from an increased level of skill on the part of the outsourcing team, but rather flow from the nature of outsourcing. The outsourcing team brings fresh ideas and a new perspective to its assignment and is often able to bring to bear methods and solutions that it has developed on previous assignments. The outsourcing team also does not have to deal with the manpower shortages and conflicting priorities faced by the previous internal team.

12. CONCLUSION

By now, you've realized that building a data warehouse is no easy task. With the average cost of a system valued at $1.8 million, the right people, methodology, and experience are critical. The reliance on technology is only a small part in realizing the true business value buried within the multitude of data collected within an organization's business systems. Data warehouses touch the organization at all levels, and the people who design and build the data warehouse must be capable of working across the organization as well. The industry and product experience of a diverse team, coupled with a business focus and proven methodology, may be the difference between a functional system and true success.

ABOUT THE AUTHOR

Sanjay Raizada

With Syntel's acquisition of IMG in September 1999, Sanjay was appointed General Manager of Northwest Operations based in Beaverton, Oregon. As President of IMG, Sanjay led its Consulting Services Division and helped build a Fortune 500 clientele. His responsibilities included business and methodology development and quality control for all projects in the region. Prior to joining IMG and Syntel, Sanjay was a Senior Manager at a Big 5 professional services firm, leading their Western Region Data Warehousing Practice for North America. From 1991 to 1994, Sanjay served as Principal Consultant at an industry-leading provider of global e-business solutions. Sanjay's 15 years of IT experience includes all aspects of systems architecture design, infrastructure planning, and systems development. He specializes in data warehousing architecture and decision support solutions using automated process methodologies involving CASE, ETT, and other emerging technologies.

ABOUT SYNTEL

Syntel (www.syntelinc.com) is a global e-business and applications outsourcing company that delivers real-world technology solutions to Global 2000 corporations. Syntel's portfolio of services includes e-business development and integration, wireless solutions, data warehousing, eCRM, as well as complex application development and enterprise integration services. The company was the first U.S.-based firm to launch a Global Delivery Service to drive speed-to-market and quality advantages for its customers. Syntel's 2,600 employees operate from centers in the U.S., Europe, and Asia.

SYNTEL
Sanjay Raizada, General Manager, Northwest Operations

Syntel Headquarters
2800 Livernois Road
Troy, MI 48083
Phone: 248/619-3503
Fax: 248/619-2889

Visit Syntel's website at www.syntelinc.com