Transitioning to a Data Driven Enterprise - What Is a Data Strategy and Why Do You Need One?
Mike Ferguson, Managing Director, Intelligent Business Strategies
Information Builders Data Strategy Workshop, London, April 2015
About Mike Ferguson
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent analyst and consultant, he specialises in business intelligence, analytics, data management and big data. With over 33 years of IT experience, Mike has consulted for dozens of companies, spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited (the inventors of the Relational Model), a Chief Architect at Teradata on the Teradata DBMS, and European Managing Director of DataBase Associates.
www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
Twitter: @mikeferguson1
Tel/Fax: (+44)1625 520700
Topics
- The increasingly complex data landscape
- Why have a data strategy?
  - The impact of data issues on your core business processes
  - The impact of fractured master data on business operations
  - The impact of inconsistent data on analysis, reporting and decision making
  - Competitive advantage: the impact of new data
- Creating a data strategy: what do you need to consider?
- What is needed for enterprise data governance and data management, and where are you on the roadmap?
  - People
  - Process
  - Technology
- Getting started
The Data Landscape Is Becoming Increasingly Complex, And Lack of Integration Is Working Against Business
- Line-of-business IT initiatives when there is a need for enterprise-wide common infrastructure
- Multiple copies of data
- Processes not integrated
- Different user interfaces
- Server platform complexity
- Duplicate application functionality
- Point-to-point "spaghetti" application integration
[Diagram labels: Sales system, Marketing system, Customer Service system, HR, Gen. Ledger (twice), Billing system, Fulfilment system, Procurement system]
Trends - More And More Appliances Appearing On The Market, Causing Islands of Data
- Oracle Exadata
- Pivotal Greenplum DCA
- IBM PureData System for Analytics
- Teradata
Big Data Is Also Now In The Enterprise, Introducing More Data Stores, e.g. Hadoop, NoSQL, Analytic RDBMS
[Diagram labels, in slide order: users, business analysts, developers; graph analytics tools, real-time BI tools, platform & data visualisation tools, SQL, Map Reduce, BI tools, search based BI tools; indexes, custom MR apps, Graph DBMS, MPP analytical RDBMS, DW, stream processing, event streams, actions; Enterprise Information Management tool suite; OLTP data, unstructured / semi-structured content, clickstream, social graph data, files, RDBMS, web logs, social data]
Complexity Is Increasing Further As Companies Adopt and Deploy A Mix of On-Premise, SaaS and Cloud Based Systems
Data is now potentially fractured even more than before
[Diagram labels: partners, employees, customers; mashups, enterprise portal, enterprise service bus, office applications; on-premise systems within the enterprise (operational & BI systems), private cloud, corporate firewall; off-premise hosted apps, SaaS BI, private or public cloud, WWW]
Hundreds of New Data Sources Are Emerging - The Internet of Things (IoT)
The Task Of Governing and Managing Data Is Becoming Increasingly Complex As Data Becomes Distributed
Where is all the Customer Data?
[Diagram labels: flat files, legacy applications, office documents, web content, ECMS, e-mail, cloud based applications, Big Data applications, BI systems, <XML>Text</XML>, RDBMSs, digital media, packaged applications]
Why Do We Need A Data Strategy and Enterprise Data Governance?
Uncontrolled and unmanaged data impacts:
- Business operations
  - Employees, customers, partners and suppliers struggle to find information
  - Incomplete and inaccurate data can cause process defects and delays
  - Businesses are slow to respond when they do not have the required data in time or when it is not fully trusted
  - Can cause errors that result in customer dissatisfaction
- Business decision making and performance management
  - Incorrect or poor quality decision making
  - Inability to make decisions
  - Performance management reconciliation problems
  - Excel mania!
- Compliance
  - Violation of regulations, e.g. inaccurate regulatory and legislative reporting
As Processes Execute, Subsets And Aggregates of Master and Transaction Data Are Stored In Many Different Systems
Process example - manufacturing order to cash: order, credit check, schedule, fulfil, package, ship, invoice, payment
[Diagram labels: systems: order entry system, finance credit control system, production planning & scheduling system, CAM system, inventory system, distribution system, billing, Gen Ledger; data: orders data, customer data, product data]
This makes data difficult to track, maintain, synchronise and manage
Business Operational Transaction Processing - The Ideal Situation
Order-to-Cash process: Orders -> order, credit check, fulfill, package, ship, invoice, payment
An ideal situation would be smooth operation, increased automation, no delays, no defects and no unplanned operational cost
Data Issues In Transaction Processing Impact Business - What Are We Looking For In Business Processes?
Order-to-Cash process: Orders -> order, credit check, fulfill, package, ship, invoice, payment
- Data quality problems, e.g. missing or wrong data on order entry
- Domino impact: errors, manual intervention and process delays
- All these defects add up to the unplanned operational cost of processing an order
- Unplanned operational cost = (sum of the per-order costs of these defects) * Number of Orders
- Whatever you do has to reduce unplanned operational cost
- What about other types of transactions that have data-related problems?
The Impact of Data Anomalies In Transaction Processing As The Business Scales Can Be Considerable
Order-to-Cash process: Orders -> order, credit check, fulfill, package, ship, invoice, payment
- Data quality problems, e.g. missing or wrong data on order entry
- Domino impact: errors, manual intervention and process delays
- Unplanned operational cost increases as the business scales if anomalies are not fixed and data is not governed
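The cost relationship in the two order-to-cash slides above can be made concrete with a small sketch. The three per-order cost components used here (error correction, manual intervention, process delay) are assumptions chosen for illustration, and every figure is made up; substitute your own estimates.

```python
from dataclasses import dataclass


@dataclass
class PerOrderDefectCosts:
    """Hypothetical average cost per order of each kind of data-related defect."""
    error_correction: float     # assumed: fixing wrong or missing order data
    manual_intervention: float  # assumed: staff time spent on workarounds
    process_delay: float        # assumed: cost attributed to delays


def unplanned_operational_cost(costs: PerOrderDefectCosts, number_of_orders: int) -> float:
    """Sum the per-order defect costs and scale by order volume."""
    per_order = costs.error_correction + costs.manual_intervention + costs.process_delay
    return per_order * number_of_orders


# Illustrative figures only: the per-order cost stays flat while volume grows.
costs = PerOrderDefectCosts(error_correction=1.50, manual_intervention=2.00, process_delay=0.50)
for orders in (10_000, 100_000):
    print(orders, unplanned_operational_cost(costs, orders))
```

Because the per-order term is multiplied by volume, leaving anomalies unfixed means the total grows in step with the business.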
Master Data Anomalies - Audience Question
How many of you have duplicate customers in your ERP system(s)?
- What happens if you have to invoice a customer?
- What happens when you receive a payment from a customer?
- If you change the details of a customer address, do you change all duplicates?
- Does your ERP system send customer data to other systems? If so, does it send all duplicates?
- What happens if duplicates are not in sync?
[Diagram labels: change customer details, ERP, duplicate customers?]
Master Data Is Often Fractured Across Multiple Data Entry Systems, E.G. Customer Data
[Diagram: Branch Banking system, ERP system, Credit Card system, Call Centre system, Mortgage system and Loans system, each holding a customer data subset]
- Different identifiers for the same entity in each data entry system
- Different data definitions for the same data in each data entry system
- Different subsets of master data in each system
- Inconsistent master data in each data entry system
- Varying degrees of duplication of master data in each data entry system
- Synchronisation issues
- Data conflicts
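As a rough illustration of how duplicates hide behind different identifiers in each system, the sketch below groups customer records on a crude match key. The record layout, system names, sample values and the matching rule (normalised name plus postcode) are all assumptions for illustration; real matching in an MDM or data quality tool is considerably more sophisticated.

```python
import re
from collections import defaultdict


def normalise(text: str) -> str:
    """Lower-case the text and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def find_duplicates(records: list[dict]) -> list[list[dict]]:
    """Group records from any system that share the same (name, postcode) key."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalise(rec["name"]), normalise(rec["postcode"]))
        groups[key].append(rec)
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample records: same customer, different identifiers and formatting.
records = [
    {"system": "ERP", "id": "C-001", "name": "Acme Ltd.", "postcode": "SK9 1AA"},
    {"system": "CallCentre", "id": "88231", "name": "ACME LTD", "postcode": "sk91aa"},
    {"system": "Loans", "id": "L-77", "name": "Bolt & Co", "postcode": "M1 2AB"},
]
for group in find_duplicates(records):
    print([(rec["system"], rec["id"]) for rec in group])
```

Note that the two matching records carry unrelated identifiers, which is why a change made in one system does not automatically reach the other.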
Changes To Master Data In A Stand-Alone Multi-ERP Environment Make Globalisation Very Difficult
[Diagram: XYZ Banking Group with many separate ERP instances, including XYZ Loans, XYZ Cards, XYZ Investments, XYZ Mortgages and XYZ Insurance. Changes arriving at different ERPs: new product, new partner, new supplier, update materials, update account, update customer, update chart of accounts]
Master data entities: Customers, Partners, Products/Services, Accounts, Employees, Suppliers, Assets, Materials
Master Data Maintenance - The Problem of Multiple Data Entry Systems and Master Data Synchronisation
[Diagram: Branch Banking system, ERP system, Credit Card system, Call Centre system, Mortgage system and Loans system, each holding a customer data subset and synchronising with the others]
- The synchronisation nightmare
- The problem gets worse as you add more applications
- This has to be done for changes to EVERY master data entity
Master Data Synchronisation - The Spaghetti Architecture: Complexity & Lack of Integration Is Working Against Business
[Diagram: spaghetti interfaces between systems]
- Where is the complete set of master information?
- How do I get the master data I need when I need it?
- With so many definitions for master data, what does it mean?
- Can I trust it? Is it complete and correct?
- How do I get it in the form I need?
- How do I know where it goes and if it is correct?
- How do I control it?
- How much does it cost to operate this way?
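One way to see why the synchronisation problem gets worse as applications are added: in the worst case, where every system exchanges master data directly with every other system, the number of interfaces grows quadratically, whereas a central master data hub needs one interface per system. The sketch below is simple counting under that worst-case assumption, not a measurement of any real landscape.

```python
def point_to_point_interfaces(n_systems: int) -> int:
    """Pairwise links when every system syncs directly with every other system."""
    return n_systems * (n_systems - 1) // 2


def hub_interfaces(n_systems: int) -> int:
    """Links when every system syncs only with a central hub."""
    return n_systems


# Compare both styles as the number of systems doubles.
for n in (6, 12, 24):
    print(f"{n} systems: {point_to_point_interfaces(n)} point-to-point, {hub_interfaces(n)} via hub")
```

These counts apply to each master data entity that has to be kept in sync, so the total multiplies again by the number of entities.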
Inconsistent Master Data Can Disrupt Business Operations and Drive Up Costs Due To Manual Intervention Being Needed
[Diagram: manufacturing order to cash: order, credit check, fulfill, package, ship, invoice, payment, with a break (X) in the process; master data: asset, prod, cust]
- How many people do you employ to fix and reconcile data because it is not synchronised?
- What master data entities are used in your core processes?
- In what systems in your core processes does it reside?
- Where in your core processes is master data created?
- Where in your core processes is it consumed?
Many Companies Have Business Units, Processes & Systems Organised Around Products and Services
[Diagram: XYZ Corp. enterprise, reached by customers/prospects through channels/outlets, split into product/service lines 1 to 3. Each product line runs its own order process: order, credit check, fulfill, package, ship, invoice, payment]
Business and Data Complexity Can Spiral Out Of Control if Processes And Systems Are Duplicated Across Geographies
[Diagram: product lines 1 to 3 repeated several times over, all touching the same master data: Customers, Partners, Products/Services, Accounts, Assets, Materials, Employees, Suppliers]
Business Implications Of Product Orientation and Fractured Customer Data In A World Where Customer Is Now King
- Different marketing campaigns from different divisions aimed at the same customer
- Different sales teams from different divisions selling to the same customer
- Customer service is hard, e.g. "What is my order status for all products ordered?"
- Cost of operating is much higher due to duplicate processes across product lines
- Can't see customer / product ownership
- Can't see customer risk and customer profitability
- Higher chance of poor data quality
- Difficult to maintain customer data fractured across multiple applications
Enterprise Data Governance and MDM Business Case - What is the Business Benefit?
Data Governance & MDM is a corporate weight loss program
- How much complexity would be removed from your business if master data was centralised?
- How much could you save in reducing the cost of operating if master data was centralised?
- How much more responsive would your business be if everyone could see changes to master data as soon as they happen?
- How many duplicate processes associated with master data could be removed from your business if master data was centralised?
- How many FTP transfers and emails with spreadsheets would be eliminated if data could be managed by a single suite of tools?
Data Issues - Many Companies Have Built Multiple DWs and Marts In Different Parts of Their Value Chain
- This makes management and regulatory reporting more challenging, as data needs to be integrated to see across the value chain
- The issue here is project-related DI
- Data may also be inconsistent across data warehouses, e.g. different PKs, data names, hierarchies and DI/DQ jobs for the same data in each DW
[Diagram labels: Financial / Reg Reporting & Planning; sources: ERP (twice), CAD, forecasting, planning, manufacturing execution system, SCADA systems, shipping system, CRM system; product, materials and supplier master data; Finance DW, Manufacturing volumes & inventory DW, Sales & mktng DW, each with marts]
Do You Have Data Consistency Across All Your BI Systems?
[Diagram: several BI tools on top of DWs and marts, each DW fed by its own data integration layer]
- Common data definitions across all tools for the same data?
- Common data definitions across all DWs for the same data?
- Common data transformations across all DWs for the same data?
- Same data integration tool for all DWs?
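A first pass at the consistency questions above can be as simple as comparing how each warehouse defines the same data item. The sketch below assumes the definitions have already been exported into a dictionary per system; the warehouse names, item names and type strings are hypothetical.

```python
from collections import defaultdict

# Hypothetical definitions of the same data items in three warehouses.
definitions = {
    "finance_dw": {"customer_id": "CHAR(10)", "revenue": "DECIMAL(18,2)"},
    "sales_dw": {"customer_id": "INTEGER", "revenue": "DECIMAL(18,2)"},
    "manufacturing_dw": {"customer_id": "CHAR(10)"},
}


def inconsistent_items(defs: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return each data item whose definition differs between systems."""
    by_item = defaultdict(dict)
    for system, items in defs.items():
        for item, definition in items.items():
            by_item[item][system] = definition
    return {
        item: per_system
        for item, per_system in by_item.items()
        if len(set(per_system.values())) > 1
    }


for item, per_system in inconsistent_items(definitions).items():
    print(item, per_system)
```

The same comparison could be extended to names, hierarchies or transformation rules, provided they can be extracted in a comparable form.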
Why Standardise on Data Definitions?
- Confusion as to what data means
- Lack of trust to use it
What Else Should A Data Strategy Bring? Competitive Advantage!
New Data Sources Have Emerged Inside And Outside The Enterprise That Business Now Wants To Analyse
[Diagram labels: Customers, Supply Chain, Suppliers; Front Office, BackOffice; Service, Sales, Marketing, Finance, Credit Verification, Procurement, HR, Planning, Operations; product/service lines 1 to n; sensor networks (e.g. RFID tag), weather data; data volume, data velocity, data variety, number of sources]
Popular Types of Data That Businesses Now Want to Analyse
- Web data: clickstream data, e-commerce logs
- Social network data, e.g. Twitter
- Semi-structured data, e.g. e-mail
- Unstructured content
- IT infrastructure logs
- Sensor data: temperature, light, vibration, location, liquid flow, pressure, RFIDs
- Vertical industries' structured transaction data, e.g. telecom call data records, retail
Why New Data? The Demand for Enhanced Customer Data
Source: IBM Redbook - Information Governance Principles and Practices for a Big Data Landscape
We Need To Combine Data To Get Deeper Insights
[Diagram: MDM system (CRUD on Prod, Cust, Asset) and DW]
- Who are our customers?
- What products do we sell?
- What are the most popular navigational paths through our web site that lead to high fee products?
- Who are our most loyal, low risk customers that generate low fees?
- What is the online behaviour of loyal, low risk, low fee customers, so we can offer them higher fee products?
Basing customer analysis on transaction activity AND behaviour patterns helps to determine whether to strengthen or weaken a relationship
Data Deluge - Data Is Arriving Faster Than We Can Consume It
How Good Is Your Filter?
[Diagram: DATA passing through a FILTER into the enterprise and its enterprise systems]
Organising New Data In A Data Reservoir - This Needs To Be Built Incrementally
[Diagram labels: data ingest zone; exploratory analysis zone (prepare & analyse data), sandbox; new insights zone; archive zone; trusted data, e.g. master data; NoSQL DB, Graph DBMS, analytical DBMS, DW appliance, DW, enterprise and local data marts, MDM (CRUD); Txns, insights]
Organising New Data In A Data Reservoir - You Have To Catalog Data, Its Status And Where It Is
[Diagram labels: Information Catalogue; raw data status, in-process data, refined data status; data refinery; untrusted, fit for use, trusted; industry standards; raw data: transactions / OLTP, social media, web logs, documents, email, machine, device, scientific; cloud, corporate firewall]
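The idea of cataloguing data, its status and where it is can be sketched as a small data structure. The status values below are taken from the slide (raw, in-process, refined; untrusted, fit for use, trusted), but the class names, fields and the in-memory catalogue itself are hypothetical and stand in for whatever catalogue product is actually used.

```python
from dataclasses import dataclass
from enum import Enum


class Refinement(Enum):
    RAW = "raw"
    IN_PROCESS = "in-process"
    REFINED = "refined"


class Trust(Enum):
    UNTRUSTED = "untrusted"
    FIT_FOR_USE = "fit for use"
    TRUSTED = "trusted"


@dataclass
class CatalogEntry:
    name: str
    location: str           # where the data is, e.g. a zone or system name
    refinement: Refinement
    trust: Trust


class InformationCatalog:
    """Hypothetical in-memory catalogue keyed by data set name."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def update_status(self, name: str, refinement: Refinement, trust: Trust) -> None:
        """Record that a data set has moved on through the refinery."""
        entry = self._entries[name]
        entry.refinement = refinement
        entry.trust = trust

    def find(self, trust: Trust) -> list[str]:
        """Names of all data sets at the given trust level, sorted."""
        return sorted(n for n, e in self._entries.items() if e.trust is trust)


catalog = InformationCatalog()
catalog.register(CatalogEntry("web_logs", "ingest zone", Refinement.RAW, Trust.UNTRUSTED))
catalog.register(CatalogEntry("customer_master", "MDM", Refinement.REFINED, Trust.TRUSTED))
catalog.update_status("web_logs", Refinement.REFINED, Trust.FIT_FOR_USE)
print(catalog.find(Trust.TRUSTED))
```

Keeping refinement state and trust level as separate fields lets consumers filter on what they are allowed to rely on, independently of how far the data has been processed.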
Data Strategy
Key Requirements for Enterprise Data Management And Data Governance
1. Create a vision and strategy for information management
2. Create the right organisational structure (people) to govern data
3. Nominate, standardise and define the data to be managed and governed
4. Create the right processes to manage and govern data
5. Define policies and policy scope to manage and govern specific data items
6. Follow an implementation methodology to get your data under control
7. Use technology in each step of the methodology to help implement the policies and processes to manage and govern the data
8. Produce and publish trusted data and services for others to easily find, order and consume
Why Is A Data Strategy Important? - What Do You Need To Consider?
- What are your data issues? E.g. incorrect or missing data, late data, duplicate data (customers)
- What is the business impact caused by data anomalies?
  - Processes, e.g.:
    - Major increases in manual activity to redo tasks
    - Manufacturing errors, late deliveries, customer dissatisfaction
    - Process delays, e.g. month end close delayed, reports delayed
    - Transactions rejected
  - Decisions: incorrect, delayed, inaccurate / incomplete reporting, lost opportunity
- Who is affected by data anomalies? E.g. departments, customers, suppliers
- What is the estimated unplanned annual cost to the business? Break it down by department (business and IT)
What Do You Need To Consider? (2)
- What is the risk to the business going forward?
  - What is the risk? E.g. headcount increase, anomalies out of control as the business scales
  - Where is the risk?
- What are the estimated opportunity cost savings if you could fix it? Break it down by department
- What new (big) data should you bring on board that offers the greatest competitive advantage?
- What is your big data strategy?
  - How will you capture, manage, clean and integrate new data and make trusted data and new insights available for consumption?
  - How will you manage IT and self-service data integration?
  - How will you co-ordinate activity to enrich what you already know?
- The recommendations you need to maximise the value of data
What Are The Issues With Structured Data Management and Data Governance?
- What data needs to be controlled?
- Where is that data?
- What data names is it known by? What should it be known by?
- What state is the data in?
- Does it need to be cleaned, transformed, integrated and shared?
- Where does it originate and where does it flow to?
- Should it be kept synchronised?
- Who is allowed to access it?
- Who is allowed to maintain it?
- How much power do those users have and how are they audited?
Key Requirements - We Need to Create A New World of Information Producers and Information Consumers
[Diagram: information producers (data scientist, IT professional) take raw data through clean & integrate services and publish trusted data in an information catalog, "like a corporate iTunes for data". Information consumers (business analysts) search, find, shop, order and consume it in a BI tool or application]
Need to make use of:
- A business glossary and information catalog
- Re-usable services to manage and process data
- Collaboration and social computing to manage, process and rate data
- Role-based data management tools aimed at IT AND business
What Are You Producing?
- Trusted, integrated, commonly understood master data
- Trusted, integrated, commonly understood reference data
- Trusted new insights from big data
- Trusted new master data attributes from big data
- Trusted, integrated, commonly understood data in data warehouses and data marts
- Trusted, commonly understood data in OLTP systems
- Trusted, commonly understood data available on-demand on an enterprise service bus
Data Management and Enterprise Data Governance Needs People, Process, Policies and Technology
Data Management and Enterprise Data Governance: the people, processes, policies and technology used to formally manage and protect structured and unstructured data assets to guarantee commonly understood, trusted and secure data throughout the enterprise.
This is about simplification, reducing complexity, lowering cost and increasing integration across the enterprise.