Big Data for Good or Evil Lessons from the NSA PRISM Scandal




Jason Bloomberg

About Jason Bloomberg
  President of ZapThink, a Dovel Technologies Company
  One of the original Managing Partners of ZapThink LLC
  Acquired by Dovel Technologies in August 2011
  Global thought leader in the areas of Cloud Computing, EA, & SOA
  Created the Licensed ZapThink Architect (LZA) SOA course & associated credential
  Runs the LZA course & Enterprise Cloud Computing course around the world
  Analyst for GigaOM and blogger for DevX
  New book, The Agile Architecture Revolution, is now available!

What are Big Data?
  Datasets whose size is beyond the ability of typical database software tools to capture, store, manage & analyze

[Figure: 2012 Big Data Technology Landscape]

Today's Big Data are Tomorrow's Small Data?
  Definition is intentionally subjective, with a moving definition of how big a dataset must be
  No fixed threshold
  As technology advances, size of datasets that qualify will increase

Big Data May Include Historical Data
  What about yesterday's data?
  If the amount of data doubles every two years, then half your data are always over two years old
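The two-year claim can be checked with a little arithmetic. The sketch below assumes that total volume grows as a clean exponential and that nothing is ever deleted; both are assumptions made here for illustration, and the function name is hypothetical.

```python
def fraction_older_than(age_years: float, doubling_period_years: float = 2.0) -> float:
    """Share of today's data that already existed `age_years` ago.

    Assumes volume V(t) = V0 * 2 ** (t / doubling_period_years) and no deletion,
    so V(t - age) / V(t) = 0.5 ** (age / doubling_period_years).
    """
    return 0.5 ** (age_years / doubling_period_years)


for age in (2, 4, 6):
    print(f"older than {age} years: {fraction_older_than(age):.1%}")
```

Under those assumptions the share older than two years stays at one half no matter how large the total becomes, which is the point of the slide.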

Big Data Crisis Point
  [Figure: plot against time of "Quantity & complexity of information" and "Ability to deal with quantity & complexity of information", with "The Big Data crisis point" marked]

Parkinson's Law (Big Data Corollary)
  Quantity of data will always expand to exceed available capacity for storing & processing it

If Someone Can Collect Big Data, then Someone Will
  Corollary to Parkinson's Law in action
  If you're not collecting Big Data, then someone else is
  The easier it is to collect Big Data, the more important it is to govern them

Metadata may be Big Data as Well
  You must govern your metadata
  Metadata may even contain most of the business value
  Not just technical value
  Metadata governance at least as important as data governance

Govern the Data You Don't Want
  Big Data analytics focuses on finding the nuggets of gold in the dross
  The data you don't want must still be governed, secured, & managed
  As Big Data sets grow, governing the dross is increasingly challenging

Big Data Results May be Dangerous
  Nuggets of Uranium, not Gold
  Not just valuable but dangerous
  Personally identifiable information
  Risk of false positives

Big Data Used to Mislead
  "According to the figures published by a major tech provider, the Internet carries 1,826 Petabytes of information per day. In its foreign intelligence mission, NSA touches about 1.6% of that. However, of the 1.6% of the data, only 0.025% is actually selected for review. The net effect is that NSA analysts look at 0.00004% of the world's traffic in conducting their mission; that's less than one part in a million. Put another way, if a standard basketball court represented the global communications environment, NSA's total collection would be represented by an area smaller than a dime on that basketball court." (NSA)

In Other Words
  7.5 terabytes of analytical results to process manually every day
  Would be equivalent of Call Detail Records for 5 million calls every day per person on the planet!
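The daily figure can be reproduced from the numbers in the quotation. The sketch below uses only the three quoted values; the variable names and the choice of 1,024 terabytes per petabyte are assumptions made here. Note that multiplying the two quoted percentages gives 0.0004% of traffic, ten times the 0.00004% stated in the quotation, and it is the 0.0004% product that matches the roughly 7.5 terabytes per day.

```python
# Values taken from the NSA quotation above.
DAILY_TRAFFIC_PB = 1826       # petabytes carried by the Internet per day
TOUCHED_SHARE = 0.016         # 1.6% of traffic "touched"
SELECTED_SHARE = 0.00025      # 0.025% of the touched data selected for review

# Assumption: binary units, 1 PB = 1024 TB.
TB_PER_PB = 1024

touched_pb = DAILY_TRAFFIC_PB * TOUCHED_SHARE
reviewed_pb = touched_pb * SELECTED_SHARE
reviewed_tb = reviewed_pb * TB_PER_PB
reviewed_fraction = TOUCHED_SHARE * SELECTED_SHARE

print(f"touched per day:  {touched_pb:.1f} PB")
print(f"reviewed per day: {reviewed_tb:.2f} TB")
print(f"share of traffic: {reviewed_fraction * 100:.5f}%")
```

With decimal units (1,000 terabytes per petabyte) the result is about 7.3 terabytes per day instead, so the rounded figure on the slide holds either way.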

Wrong Conclusion?
  NSA spies on data in the US, so keep your data out of the US, right?
  Assumes:
    Your country isn't spying on you too!
    Your country isn't working with the NSA!
    The NSA can't spy on data outside the US!
  What are your Big Data policies?

Governance the Old Way
  Information Problem
  Tools
  Policies for using the tools
  Governance

Today's Data Governance (simplified)
  "Our data are unclean!"
  "Let's use this data quality tool."
  "Great! Here are policies & processes for how to manage data quality using our tool."

Governance the New Way
  Information Problem
  Tools
  Policies for using the tools
  Meta-policies for dealing with governance
  Next-generation governance tools
  Best practice approach to Big Data Crisis

Meta Thinking
  Meta-requirement: requirement that applies to other requirements
    E.g., Business Agility requirement
  Meta-methodology: methodology for creating or modifying methodologies
    Following the Agile principle "responding to change over following a plan," even if the plan is to follow Agile
  Meta-policy: policy for how to perform governance

Dealing with Change
  Meta thinking doesn't look at something
  Meta thinking means looking at how something changes
  Meta thinking is typically manual
  Always includes people
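The difference between a policy and a meta-policy can be made concrete with a small sketch. Everything here is hypothetical and chosen for illustration: the `Policy` shape, the field names, the sample policies, and the 180-day review window do not come from the slides. A policy checks data; the meta-policy checks the policies themselves and only hands a list to a person, in keeping with the point that meta thinking is typically manual.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, List


@dataclass
class Policy:
    """A governance policy: a named rule applied to data records (hypothetical shape)."""
    name: str
    owner: str
    last_reviewed: date
    check: Callable[[dict], bool]  # returns True when a record complies


def policies_needing_review(policies: List[Policy], today: date,
                            max_age_days: int = 180) -> List[str]:
    """A meta-policy: it inspects policies, not data.

    Flags policies with no owner or a stale review date so that a person
    can decide what to change; it does not rewrite anything itself.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [p.name for p in policies if not p.owner or p.last_reviewed < cutoff]


policies = [
    Policy("no-empty-customer-id", "data-steward", date(2013, 8, 1),
           lambda record: bool(record.get("customer_id"))),
    Policy("retain-logs-90-days", "", date(2013, 6, 1),
           lambda record: record.get("age_days", 0) <= 90),
]

# Policy level: which rules does this record break?
record = {"customer_id": "", "age_days": 10}
violations = [p.name for p in policies if not p.check(record)]

# Meta-policy level: which rules themselves need human attention?
stale = policies_needing_review(policies, today=date(2013, 9, 1))

print("record violations:", violations)
print("policies to review:", stale)
```

The design choice worth noting is that the meta-policy function returns names for review instead of enforcing anything automatically.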

Avoiding Hall of Mirrors Problem
  Meta-policy: how do we automate policy enforcement?
  Meta-meta-policy: how do we automate meta-policy enforcement?
  Answer: we don't (yet)!

Big Data Governance (even more simplified), part 1
  "We have too much information!"
  "Let's use this Big Data tool."
  "Great! Here are policies & processes for how to use the Big Data tool."
  "Uhh, our Big Data got too big for the tool."

Big Data Governance, part 2
  "Dang. Here's our policy for how to deal with ever-increasing quantities of data."
  "Huh?"
  We need a way to manage policies for dealing with ongoing Big Data challenges

Big Data Analytics Tools May be Governance Tools
  Especially if central challenge is data quality
  Big Data sets tend to be unclean
  Structured, semi-structured, & unstructured
  Good and bad mixed together
  The move from traditional analytics to Big Data analytics is a move to poorer levels of data quality

Not your Parents' Governance!
  Not just governance of technology
  Governance with technology
  Largely automated
  Proactive
  Inherently iterative
  Agile

Governance Leads to Empowerment
  The more powerful the tools, the more important it is that people know how to use them properly
  IT should empower the people in the organization

SOA Governance (Supposedly) Works this Way!
  [Diagram labels: SOA Policy; Security Policies, Routing Policies, etc.; Registry/Repository; Policies for handling governance in the reg/rep; ESB; Meta-policy]

How Cloud Changes the Equation
  Cloud shifts IT provisioning & management to the user
  Cloud automates previously manual tasks
  Greater risk of mucking things up
  Increased need for governance

Next Generation Governance

the Key to the Big Data Explosion

Our Tools are Only as Good as our Architecture
  Big Data will always be too big
  Big Data challenge will always be changing
  Next-generation data governance tools must drive business agility
  Tools will always fall short without architecture that supports and drives change

Book Giveaway!
  Jason Bloomberg
  President, ZapThink, a Dovel Technologies Company
  jbloomberg@zapthink.com
  @theebizwizard