The Data Reservoir. 10th September 2014. Mandy Chessell FREng CEng FBCS, Distinguished Engineer, Master Inventor, Chief Architect, Information Solutions




Transcription:

Mandy Chessell FREng CEng FBCS, Distinguished Engineer, Master Inventor, Chief Architect, Information Solutions. The Data Reservoir. 10th September 2014

A growing demand
Business teams want:
- Open access to more information
- More powerful analysis and visualization tools
IT teams are:
- Concerned about cost
- Concerned about governance and regulatory requirements
"Both the CMO and CIO are on the hook for turning all that data into above-market growth. Call it a shotgun marriage, but it's one that CMOs and CIOs both need to make work." CMOs and CIOs Need to Get Along to Make Big Data Work (HBR 02.04.2014)

The reality of information management today

Business scenarios we see
Subject matter experts want access to their organization's data to explore the content, select, control, annotate and access information using their terminology, with an underpinning of protection and governance.
- Scientists seeking data for new analytics models.
- Marketeer seeking data for new campaigns.
- Fraud investigator seeking data to understand the details of suspicious activity.
- Day-to-day activity.
- Requiring ad hoc access to a wide variety of data sources.
- Supporting analysis and decision making.
- Using the subject matter experts' terminology.
Providing the flexibility of spreadsheets that can scale to large volumes and a wide variety of information types, whilst protecting sensitive information and optimizing data storage and provisioning.

blues & skills issues
- A disproportionate portion of the time spent in an analytics project is about data preparation: acquiring, preparing, formatting and normalizing the data.
- In addition to raw data, augmented data and analytical assets can significantly speed up the analytics process and partially bridge the talent gap.

Initial Big Data efforts are focused on gaining insights from existing and new sources
[Chart: data sources currently collected and analyzed, shown for global respondents and for Banking & Fin Mgmt respondents. Category labels and values were not recoverable from the transcription.]
Respondents with active big data efforts were asked which data sources are currently being collected and analyzed as part of active big data efforts within their organization.
Source: The real world use of Big Data, IBM & University of Oxford

The true value of Big Data is in context
[Diagram labels. Questions: Ownership? Quality? Security? Privacy? Layers: Raw data; Feature extraction metadata; Domain linkages; Full contextual analytics. Example data: Actuarial data; Government statistics; Epidemic data; Patient records; Weather history; Location risk; ...; Occupational risk; Family history; Travel history; Dietary risk; ...; Personal financial situation; Chemical exposure; Social relationships.]

The interesting dilemma
A man goes into a jewellers and buys an expensive watch.
- Threat: Is it fraud? In which case the bank must stop it.
- Obligation: Is it money-laundering? In which case the bank must report it.
- Opportunity: Does he have an expensive trophy wife, in which case perhaps he would be interested in a loan? Has he just won the lottery, in which case should the bank improve the services offered?
The same event is of interest to different departments. There is major overlap in the data required to answer the question. It may not be possible to determine the answer with just the information in the channel: previous or subsequent activity is required. It is all a matter of coordination and timing.

Big Data Lakes or Swamps?
As we collect data:
- Can we preserve clarity?
- Do we know what we are collecting?
- Can we find the data we need?
- Are we creating a data swamp?
- How do we build trust in big data?
- Do we know what data is being used for?

Organizations expect information governance will deliver
- Understanding of the information they have,
- Confidence to share and reuse information,
- Protection from unauthorised use of information,
- Monitoring of activity around the information,
- Implementation of key business processes that manage information,
- Tracking the provenance of information, and
- Management of the growth and distribution of their information.

The Data Reservoir
Reservoir = Efficient Management, Governance, Protection and Access.
As organizations experiment with analytics they discover:
- Creating new analytics requires access to historical data from many systems.
- This data includes valuable and sensitive data that is core to the organization's operation.
- Hadoop is a flexible platform for storing many types of data but is not necessarily fast enough for the production deployment of some analytics.
- Data needs to be reformatted and copied onto specialist analytics platforms such as Netezza.
A data reservoir provides:
- Single extraction of data from operational systems and distribution to multiple analytics platforms.
- Cataloguing and governance of the data in the analytics platforms.
- Simple interfaces for the line of business to access the information they need.
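The "single extraction, multiple distribution, catalogued" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Reservoir class, its ingest and find methods, the platform names and the sample rows are all hypothetical and are not taken from the presentation or from any IBM product.

```python
from dataclasses import dataclass, field


@dataclass
class Reservoir:
    """Hypothetical in-memory stand-in for a data reservoir."""
    platforms: dict = field(default_factory=dict)  # platform -> {dataset: rows}
    catalog: dict = field(default_factory=dict)    # dataset -> platform names

    def ingest(self, dataset, extract, targets):
        """Call extract() once, then copy the rows to every target platform."""
        rows = list(extract())  # single extraction from the operational system
        for target in targets:
            self.platforms.setdefault(target, {})[dataset] = list(rows)
        self.catalog[dataset] = sorted(targets)  # record where the data lives
        return len(rows)

    def find(self, dataset):
        """Catalog lookup: which platforms hold this dataset?"""
        return self.catalog.get(dataset, [])


calls = []


def extract_orders():
    # Hypothetical operational source; records each time it is read.
    calls.append(1)
    return [{"order_id": 1, "amount": 250}, {"order_id": 2, "amount": 90}]


reservoir = Reservoir()
count = reservoir.ingest("orders", extract_orders, ["hadoop", "warehouse"])
```

The operational source is read once however many analytics platforms receive a copy, and the catalog entry is what lets a line-of-business user find the data later.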

How does the data reservoir operate?
[Diagram: numbered steps around the Reservoir Repositories. 1 Advertise; 2 Discover; 3 Catalog; 4 Provision; 5 Explore; 6 Access.]

Data reservoir system context diagram
[Diagram labels. Surrounding systems: Decision Model Management; Enterprise IT; Line of Business Applications; Systems of Engagement; System of Record Applications; New Sources; Enterprise Service Bus; Third Party Feeds; Third Party APIs; Internal Sources; Simple, Ad Hoc Discovery and Analysis; Reporting; Other Systems of Insight. Reservoir: Management; Reservoir Operations. Flows: Deploy Real-time Decision Models; Deploy Decision Models; Service Calls; Access; Deposit; Events to Evaluate; Notifications; Search Requests; Curation Interaction; Report Requests; In; Out.]

Data reservoir system context diagram
[Diagram labels. Surrounding systems: Decision Model Management; Enterprise IT; Line of Business Applications; Systems of Engagement; System of Record Applications; New Sources; Enterprise Service Bus; Third Party Feeds; Third Party APIs; Internal Sources; Simple, Ad Hoc Discovery and Analysis; Reporting; Other Systems of Insight. Reservoir Repositories: Descriptive; Deposited; Shared Operational; Historical; Harvested; Content Hub; Catalog; Operational History; Information Warehouse; Deep; Information Views; Audit; Code Hub. Reservoir: Reservoir Management; Reservoir Operations. Flows: Deploy Real-time Decision Models; Deploy Decision Models; Service Calls; Access; Deposit; Events to Evaluate; Notifications; Search Requests; Curation Interaction; Report Requests; In; Out.]

Data reservoir system context diagram
[Diagram labels. Surrounding systems and roles: Decision Model Management; Information Curator; Governance, Risk and Compliance Team; Enterprise IT; Line of Business Applications; Systems of Engagement; System of Record Applications; New Sources; Enterprise Service Bus; Third Party Feeds; Third Party APIs; Internal Sources; Simple, Ad Hoc Discovery and Analysis; Reporting; Other Systems of Insight. Reservoir Repositories: Descriptive; Deposited; Shared Operational; Historical; Harvested; Content Hub; Catalog; Operational History; Information Warehouse; Deep; Information Views; Audit; Code Hub. Reservoir: Catalog Interfaces; Reservoir Management; Reservoir Operations. Flows: Deploy Real-time Decision Models; Deploy Decision Models; Service Calls; Access; Deposit; Events to Evaluate; Notifications; Understand Sources; Advertise Source; Understand Compliance; Report Compliance; Search Requests; Curation Interaction; Report Requests; In; Out.]

Differing user perspectives
[Diagram labels: Sand Box; Governance; Catalogue; Stores.]
- Provision sand boxes.
- Search for, locate and download data and related artifacts.
- Define governance policies, rules and classifications. Monitor compliance.
- View lineage (business and technical) and perform impact analysis.
- Add additional insight into data sources through automated analysis.
- Develop data management models and implementations.
- Curation of metadata about stores, models and definitions.

Data reservoir system context diagram
[Diagram labels. Surrounding systems and roles: Decision Model Management; Information Curator; Governance, Risk and Compliance Team; Enterprise IT; Line of Business Applications; Systems of Engagement; System of Record Applications; New Sources; Enterprise Service Bus; Third Party Feeds; Third Party APIs; Internal Sources; Simple, Ad Hoc Discovery and Analysis; Reporting; Other Systems of Insight. Enterprise IT Interaction: Real-time Analytics; Streaming Analytics; Real-time Interfaces; Publishing Feeds; Ingestion. Reservoir Repositories: Descriptive; Deposited; Shared Operational; Historical; Harvested; Content Hub; Catalog; Operational History; Information Warehouse; Deep; Information Views; Audit; Code Hub. Catalog Interfaces. Information Integration & Governance: Information Broker; Code Hub; Staging Areas; Operational Governance Hub; Reservoir Monitor; Workflow; Guards. Reservoir: Management; Reservoir Operations. Flows: Deploy Real-time Decision Models; Deploy Decision Models; Service Calls; Access; Deposit; Events to Evaluate; Notifications; Understand Sources; Advertise Source; Understand Compliance; Report Compliance; Search Requests; Curation Interaction; Report Requests; In; Out.]

Data reservoir system context diagram
[Diagram labels. Surrounding systems and roles: Decision Model Management; Information Curator; Governance, Risk and Compliance Team; Enterprise IT; Line of Business Applications; Systems of Engagement; System of Record Applications; New Sources; Enterprise Service Bus; Third Party Feeds; Third Party APIs; Internal Sources; Simple, Ad Hoc Discovery and Analysis; Reporting; Other Systems of Insight. Enterprise IT Interaction: Real-time Analytics; Streaming Analytics; Real-time Interfaces; Publishing Feeds; Ingestion. Raw Interaction: Sand boxes. Reservoir Repositories: Descriptive; Deposited; Shared Operational; Historical; Harvested; Content Hub; Catalog; Operational History; Information Warehouse; Deep; Information Views; Audit; Code Hub. Catalog Interfaces. View-based Interaction: Information Access and Feedback; Sand boxes; Reporting Marts. Information Integration & Governance: Information Broker; Code Hub; Staging Areas; Operational Governance Hub; Reservoir Monitor; Workflow; Guards. Reservoir: Management; Reservoir Operations. Flows: Deploy Real-time Decision Models; Deploy Decision Models; Service Calls; Access; Deposit; Events to Evaluate; Notifications; Understand Sources; Advertise Source; Understand Compliance; Report Compliance; Search Requests; Curation Interaction; Report Requests; In; Out.]

Classification Schemes
Classification is at the heart of information governance. It characterizes the type, value and cost of information, or the mechanisms that manage it. The design of the classification schemes is key to controlling the cost and effectiveness of the information governance program.
Business Classifications
Business classifications characterize information from a business perspective. This captures its value, how it is used, and the impact to the business if it is misused.
Resource Classifications
Resource classifications characterize the capability of the IT infrastructure that supports the management of information. A resource's capability is partly due to its innate functions and partly controlled by the way it has been configured.
Activity Classifications
Activity classifications help to characterize procedures, actions and automated processes.
Semantic Classification
Semantic classification identifies the meaning of an information element. The classification scheme is a glossary of concepts from relevant subject areas. These glossaries are industry specific and they are shipped with our industry models. The semantic classifications are defined at two levels:
- Subject area classification
- Business term classification

Governance Rules
Defined for each classification, for each situation.
[Figure annotations: "Sensitive information masked here"; "Personal information masked here".]
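A rule "defined for each classification for each situation" can be pictured as a lookup keyed on both. The sketch below is a minimal illustration, not the presenter's implementation: the classification labels, the situation names ("sandbox", "production"), the field names and the choice to allow anything without a rule are all assumptions made for the example. A real program might well prefer to mask by default.

```python
# Hypothetical rule table: (classification, situation) -> action.
RULES = {
    ("sensitive", "sandbox"): "mask",
    ("personal", "sandbox"): "mask",
    ("sensitive", "production"): "allow",
    ("personal", "production"): "allow",
}

# Hypothetical classification of each field in a record.
CLASSIFICATION = {"name": "personal", "salary": "sensitive", "country": "public"}


def apply_rules(record, situation):
    """Return a copy of record, masking values where the rule table says so."""
    result = {}
    for field_name, value in record.items():
        label = CLASSIFICATION.get(field_name, "unclassified")
        # Assumption: no matching rule means the value passes through.
        action = RULES.get((label, situation), "allow")
        result[field_name] = "****" if action == "mask" else value
    return result


employee = {"name": "Ann", "salary": 100, "country": "UK"}
sandbox_view = apply_rules(employee, "sandbox")
production_view = apply_rules(employee, "production")
```

The same record yields different views depending on where it is delivered, which is the point of tying rules to the situation as well as to the classification.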

New policy support inside the Information Governance Catalogue
[Diagram labels. Elements: Principle; Policy; Governance Rule; Classification; Metadata Description; Governance Rule Implementations; Modelled Metadata Implementations; Asset. Relationships: Implications; Actioned by; Governs; Classified by; Implemented by; Describes; Deployed to, Executed by, Monitored by.]
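One possible reading of the diagram labels is a chain: principles lead to policies, policies are actioned by governance rules, rules govern classifications, and metadata descriptions of assets are classified by those classifications. The sketch below models that reading only; the class names, fields, helper function and sample data are hypothetical and do not reflect the Information Governance Catalogue's actual data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GovernanceRule:
    name: str
    governs: str  # the classification this rule governs


@dataclass(frozen=True)
class Policy:
    name: str
    principle: str     # the principle this policy supports
    rules: tuple = ()  # governance rules that action the policy


def rules_for_asset(asset, classified_by, policies):
    """Return sorted (principle, policy, rule) triples that apply to an asset.

    classified_by maps an asset name to the set of classifications attached
    to its metadata description.
    """
    labels = classified_by.get(asset, set())
    return sorted(
        (policy.principle, policy.name, rule.name)
        for policy in policies
        for rule in policy.rules
        if rule.governs in labels
    )


# Hypothetical example data.
POLICIES = [
    Policy("Protect personal data", "Respect privacy",
           (GovernanceRule("Mask personal fields", "personal"),)),
    Policy("Retain financial records", "Meet regulatory obligations",
           (GovernanceRule("Keep for seven years", "financial"),)),
]
CLASSIFIED_BY = {
    "crm.customers": {"personal"},
    "ledger.entries": {"financial"},
    "web.clickstream": set(),
}
```

Calling rules_for_asset("crm.customers", CLASSIFIED_BY, POLICIES) traces an asset back through its classification to the rule, policy and principle behind it.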

Integrated Metadata
Lineage (Traceability)
- Where does this data come from?
- Why is this data incorrect?
- Why is this data incomplete?
- Can I trust this value?
Impact Analysis
- Where is this element used?
- What happens if I change this?
Optimization
- Where is the redundancy?
- How can I make this run more efficiently?
Understanding
- What does this mean?
- How is this used?
Control
- Why is this parameter set to this value?
- Who made this change?
- I can change this to meet new business requirements
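Lineage ("where does this data come from?") and impact analysis ("where is this element used?") are the same graph walked in opposite directions. The following is a minimal sketch, assuming metadata is available as a mapping from each asset to the assets derived directly from it; the asset names and the FEEDS table are invented for the example.

```python
from collections import deque

# Hypothetical lineage edges: source asset -> assets derived directly from it.
FEEDS = {
    "crm.customers": ["staging.customers"],
    "staging.customers": ["warehouse.customer_dim"],
    "warehouse.customer_dim": ["report.churn", "mart.marketing"],
    "erp.orders": ["mart.marketing"],
}


def downstream(asset, edges):
    """Impact analysis: every asset reachable from `asset` (breadth-first)."""
    seen, queue = set(), deque([asset])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def upstream(asset, edges):
    """Lineage: every asset that `asset` is derived from."""
    reverse = {}
    for src, targets in edges.items():
        for tgt in targets:
            reverse.setdefault(tgt, []).append(src)
    return downstream(asset, reverse)


impact = downstream("staging.customers", FEEDS)
lineage = upstream("mart.marketing", FEEDS)
```

Here impact holds everything that would be affected by a change to staging.customers, and lineage holds every source that contributes to mart.marketing.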

Information Governance: People, Process and Technology
Successful governance is implemented with a combination of:
- Skilled people, correct roles and organization.
- Processes that create a pragmatic, targeted and agile work environment.
- Standards, templates and assets that improve consistency between implementations.
- Technology that automates classification, enforcement, validation and correction of data.
Governance Activity Type | Information Exchange | Role that technology can play
Communication | Policies & Metrics | Delivering education, best practices, assessments, templates.
Compliance | Design Changes | Implementing control points and enforcement points. Support for design and code reviews. Test management.
Exception | Exception Requests | Exception process management, incident reporting.
Feedback | Measurements | Dashboards and reports on compliance.
Vitality | New Requirements | Change process management.

Example: Supporting the Use of Authoritative Sources
Themes: Make it Clear; Make it Unavoidable; Make it Visible; Make it Easy to do the right thing.
- Definitions of policies, rules and processes around the use of authoritative sources.
- Enforcement in build, automated test cases and configuration management.
- Definitions of authoritative sources and where they are used.
- Control points in sprint and design review processes.
- Easy access to compliant test data.
- Pre-built data structures implementing standards.
- Refinery services for common re-engineering tasks.

Three interlocking lifecycles of information governance
[Diagram labels: Audit; Define Policy; Classify; Monitor; Roll out; Metadata; Design; Execute; Operations; Policy; Policy Development; Deploy; Detect; Remediate; Develop.]

Summary
- Increasing use of analytics requires the business to have greater access to more information.
- Governance needs to adapt so that the business takes responsibility for governance.
- The governance program should educate, trust implicitly that people will do the right thing, check, and follow up.
- People are at the heart of governance: processes help them collaborate and build trust, and technology supports their work.

Questions?

New Information Architectures and Capabilities: Reference Material

Governing and Managing Big Data for Analytics and Decision Makers
An introduction to the Data Reservoir solution
http://www.redbooks.ibm.com/redpieces/abstracts/redp5120.html?open

Ethics for Big Data and Analytics
- Context: For what purpose was the data originally surrendered? For what purpose is the data now being used? How far removed from the original context is its new use?
- Consent & Choice: What are the choices given to an affected party? Do they know they are making a choice? Do they really understand what they are agreeing to? Do they really have an opportunity to decline? What alternatives are offered?
- Reasonable: Is the depth and breadth of the data used and the relationships derived reasonable for the application it is used for?
- Substantiated: Are the sources of data used appropriate, authoritative, complete and timely for the application?
- Owned: Who owns the resulting insight? What are their responsibilities towards it in terms of its protection and the obligation to act?
- Fair: How equitable are the results of the application to all parties? Is everyone properly compensated?
- Considered: What are the consequences of the data collection and analysis?
- Access: What access to data is given to the data subject?
- Accountable: How are mistakes and unintended consequences detected and repaired? Can the interested parties check the results that affect them?
http://www.ibmbigdatahub.com/whitepaper/ethics-bigdata-and-analytics

Patterns of Information Management
- Vocabulary for design discussions.
- Icons used in white-boarding and design documents.
- Design guidance for specific solutions.
- A catalogue of design choices.
- Foundation for setting architecture standards and reference architectures.
- Material for education and training in information architecture.
http://www.informit.com/store/patterns-of-information-management-9780133155501?WT.mc_id=Author_Chessell_PoIM

Staying Ahead in the Cyber Security Game
http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=wh&infotype=sa&appname=swge_ti_se_usEN&htmlfid=TIL14103USEN&attachment=til14103usen.pdf#loaded
