BIG DATA
John A. Eisenhauer, Chair, Data Governance Society
Rick Young, Managing Director, 3Sage Consulting
WHAT IS BIG DATA?
Volume: amount
Velocity: frequency of change
Variety: complexity
Value
WHERE DOES BIG DATA COME FROM?
Enterprise business systems
Government databases (e.g., data.gov)
White papers
Instrumentation (factories and laboratories)
Social media (e.g., Facebook and Twitter)
Traditional media (TV, newspapers, radio)
Videos (YouTube, etc.)
Appliances, phones and just about anywhere
INFORMATION GOVERNANCE
WHAT IS INFORMATION GOVERNANCE?
Information governance is the set of principles, policies and processes by which an organization ensures that information is protected and aligned with its needs and objectives.
Information is an organizational asset that includes both data and the context by which that data has meaning.
INFORMATION GOVERNANCE
Corporate Strategy: What does the organization want to achieve? What markets will we enter? What products will we create? What makes us unique? What are our operational and financial goals?
Corporate Information Strategy: To achieve our mission, what information do we need and what do we intend to do with it?
Information Governance: How will we protect and align the information in accordance with the needs of our business?
Technology Architecture (data capture and storage; enterprise platforms such as ERP, CRM, portal, etc.; database management, network systems, etc.): How will we create, manage and configure enterprise platforms to support the needs of our business?
VALUE OF INFORMATION GOVERNANCE Enable an organization to Leverage Information as a Competitive Asset
EMPOWER BIG D WITH DG
ANALYTICS According to MIT Sloan Management Review, top performing organizations are two times more likely to use Analytics as a competitive differentiator. In their study, there were 5 key recommendations:
DATA LEVERAGED ORGANIZATION (DLO)
BIG DATA GOVERNANCE TECHNOLOGY Not a single solution
NO SINGLE GOVERNANCE TECHNOLOGY
Big Data governance requires multiple data management disciplines, and their respective solutions, in order to deliver value. For example:
Data Quality: data must be profiled and standardized in order to develop consistent analysis models.
Master Data Management: the ability to integrate with key domains to associate big data occurrences (e.g., link a Customer to a Twitter account).
Metadata Management: to derive the appropriate contexts for the data, we can use tagging and metadata to associate taxonomies, semantic libraries and ontologies.
Information Lifecycle Management: enterprises must define the length of time for which the data is needed online, and then define the archiving and storage strategies for this data post-consumption.
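To make the first two disciplines concrete, here is a minimal Python sketch of standardizing records (Data Quality) and then linking a Customer to a Twitter account (Master Data Management). Everything in it is an illustrative assumption rather than part of any vendor product: the Customer fields, the profile dictionary keys, and above all the choice to match on a normalized email address. Real MDM tools typically use richer, often probabilistic, matching rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Customer:
    # Hypothetical master record; field names are assumptions for illustration.
    customer_id: str
    email: str
    twitter_handle: Optional[str] = None


def standardize_email(raw: str) -> str:
    """Data quality step: trim whitespace and lowercase so emails compare consistently."""
    return raw.strip().lower()


def standardize_handle(raw: str) -> str:
    """Data quality step: drop surrounding whitespace and a leading '@', then lowercase."""
    return raw.strip().lstrip("@").lower()


def link_handles(customers: List[Customer],
                 social_profiles: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """MDM step: attach a Twitter handle to the master customer whose email matches.

    Returns the profiles that could not be matched, so they can be reviewed.
    """
    by_email = {standardize_email(c.email): c for c in customers}
    unmatched = []
    for profile in social_profiles:
        customer = by_email.get(standardize_email(profile["email"]))
        if customer is None:
            unmatched.append(profile)
        else:
            customer.twitter_handle = standardize_handle(profile["handle"])
    return unmatched
```

Returning the unmatched profiles, instead of silently dropping them, reflects the governance point of the slide: records that cannot be tied to a master domain still need an owner and a review process.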
EXAMPLE REFERENCE ARCHITECTURE
Multiple technology platforms are required to manage and govern Big Data.
Source: Big Data Governance: An Emerging Imperative (Sunil Soares)
EXAMPLE SOLUTION ARCHITECTURES IBM has developed a comprehensive suite of applications within its InfoSphere family to develop Big Data solutions focused on the analytic requirements.
EXAMPLE SOLUTION ARCHITECTURES
Informatica Big Data Strategy
Informatica has developed a comprehensive Data Management platform to integrate Big Data, but does not offer solutions in the analytics space.
QUESTIONS
How do you prove the associated value from integrating big data?
What does a big data program look like?
What are the differences in such a program from any other data management program?
Why do you need to pay attention to this initiative?
APPENDICES
BIG DATA... BIG DEAL
We've always had more data than our systems could handle; that's why we make 'em bigger! Besides, what's big today won't be tomorrow, so what's the big deal?