Big Data for Government Symposium http://www.ttcus.com



Similar documents
Chris Greer. Big Data and the Internet of Things

NIST Big Data PWG & RDA Big Data Infrastructure WG: Implementation Strategy: Best Practice Guideline for Big Data Application Development

Understanding Big Data Analytics for Research

ISO JTC 1 SGBD Mtg and ACM Workshop

NIST Big Data Phase I Public Working Group

Big Data Systems and Interoperability

ISO/IEC JTC1 SC32. Next Generation Analytics Study Group

Big Data Use Cases and Requirements

DRAFT NIST Big Data Interoperability Framework: Volume 6, Reference Architecture

Cloud and Big Data Standardisation

Big Data Standardisation in Industry and Research

DRAFT NIST Big Data Interoperability Framework: Volume 6, Reference Architecture

Standard Big Data Architecture and Infrastructure

The InterNational Committee for Information Technology Standards INCITS Big Data

Overview NIST Big Data Working Group Activities

NIST Big Data Interoperability Framework: Volume 2, Big Data Taxonomies

NIST Cloud Computing Program Activities

Standards for Big Data in the Cloud

Exploring the roles and responsibilities of data centres and institutions in curating research data a preliminary briefing.

DRAFT NIST Big Data Interoperability Framework: Volume 5, Architectures White Paper Survey

COMP9321 Web Application Engineering

A Big Picture for Big Data

POLAR IT SERVICES. Business Intelligence Project Methodology

DRAFT NIST Big Data Interoperability Framework: Volume 1, Definitions

Highlights & Next Steps

ISO/IEC JTC 1/WG 10 Working Group on Internet of Things. Sangkeun YOO, Convenor

Big Data and Analytics: Challenges and Opportunities

This Symposium brought to you by

NIST Big Data Public Working Group

Accenture Cyber Security Transformation. October 2015

NIST Cloud Computing Program

Streamlining the Process of Business Intelligence with JReport

Building Out BPM/SOA Centers of Excellence Business Driven Process Improvement

How can the Future Internet enable Smart Energy?

S&I Framework: The Role of Standards in Supporting Healthcare Data Initiatives

Green Cloud Computing: Case Study Sri Lanka & Pakistan

Information Security ISO Standards. Feb 11, Glen Bruce Director, Enterprise Risk Security & Privacy

EL Program: Smart Manufacturing Systems Design and Analysis

How Big Data Transforms Data Protection and Storage

NIST Cloud Computing Reference Architecture & Taxonomy Working Group

Business Analysis Standardization & Maturity

BIG DATA WITHIN THE LARGE ENTERPRISE 9/19/2013. Navigating Implementation and Governance

Analytics Strategy Information Architecture Data Management Analytics Value and Governance Realization

Workprogramme

The NIST Cloud Computing Program

Survey of Big Data Benchmarking

CEN and CENELEC response to the EC Consultation on Standards in the Digital Single Market: setting priorities and ensuring delivery January 2016

for Oil & Gas Industry

Cisco Network Optimization Service

Survey of Big Data Architecture and Framework from the Industry

Certified Information Professional 2016 Update Outline

Cisco Security Optimization Service

Towards a Thriving Data Economy: Open Data, Big Data, and Data Ecosystems

Data Virtualization Overview

Cisco Unified Communications and Collaboration technology is changing the way we go about the business of the University.

ANALYTICS STRATEGY: creating a roadmap for success

Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme

Big Analytics: A Next Generation Roadmap

What happens when Big Data and Master Data come together?

Development of a Conceptual Reference Model for Micro Energy Grid

Enterprise Data Governance

NIST Big Data Interoperability Framework: Volume 5, Architectures White Paper Survey

Program Advisory Committee (PAC) Agenda. December 14, :00am 3:00pm PST. Agenda Items:

Synergies between the Big Data Value (BDV) Public Private Partnership and the Helix Nebula Initiative (HNI)

Terms of Reference. ITU-T Focus Group on Smart Cable Television (FG SmartCable)

Bridging the Gap between On-Premise BizTalk ESB and Windows Azure platform AppFabric

BIG DATA AND ANALYTICS

Big Data Support Services. Service Definition

Oracle Technical Cloud Consulting Services Descriptions. July 23, 2015

Data Intensive Research Initiative for South Africa (DIRISA)

How to make BIG DATA work for you. Faster results with Microsoft SQL Server PDW

NESSI Summit 2014 The European Data Market. Gabriella Cattaneo, IDC Europe May 27, 2014 Brussels

Italian Enterprises Adopt Big Data Solutions. Forrester Consulting

Big Data Challenges and Success Factors. Deloitte Analytics Your data, inside out

The Forefront of ICT International Standardization for Smart City and Smart Grid

Transcription:

@TECHTrain Big Data for Government Symposium http://www.ttcus.com Linkedin/Groups: Technology Training i Corporation

NIST Big Data NIST Bi D t Public Working Group p Wo Chang, NIST, wchang@nist.gov Ro ber t Marcus, E T S t r a t e g i e s Chaitanya B a r u, UC San u UC San Diego http://bigdatawg.nist.gov November, 20, 2013

AGENDA h h Why Big Data? Why NIST? NBD PWD Charter and Deliverables Overall Workplan O ll W k l Subgroup Charter and Deliverables Definitions & Taxonomies Subgroup Requirements & Use Case Subgroup Security & Privacy Subgroup Reference Architecture Subgroup Technology Roadmap Subgroup Next Steps/Future Activities g y p ISO/IEC JTC 1 Big Data Study Group 3

Why Big Data? Why NIST? Why Big Data? There is a broad agreement among commercial, academic, and government leaders about the remarkable potential of Big Data to spark innovation, fuel commerce, and drive progress. Why NIST? (a) Recommendation from January 15 h d f 17, 2013 Cloud/Big Data Forum l d and (b) A lack of consensus on some important, fundamental questions is confusing potential users and holding back progress. Questions such as: What are the attributes that define Big Data solutions? How is Big Data different from the traditional data environments and related applications that we have encountered thus far? What are the essential characteristics of Big Data environments? Wh h i l h i i f Bi D i? How do these environments integrate with currently deployed architectures? f, g, g What are the central scientific, technological, and standardization challenges that need to be addressed to accelerate the deployment of robust Big Data solutions? NBD PWG is being launched to address these questions and is charged to d l develop consensus definitions, taxonomies, secure reference architecture, and d fi iti t i f hit t d technology roadmap for Big Data that can be embraced by all sectors. 4

C H A R T E R ( M 0 0 0 1 ) T h e f o c u s o f t h e ( N B D P W G ) i s t o f o r m a c o m m u n i t y o f i n t e r e s t f r o m i n d u s t r y, a c a d e m i a, a n d g o v e r n m e n t, w i t h t h e g o a l o f d e v e l o p i n g a c o n s e n s u s d e f i n i t i o n s, t a xo n o m i e s, s e c u r e r e f e r e n c e a r c h i t e c t u r e s, a n d t e c h n o l o g y r o a d m a p. T h e a i m i s t o c r e a t e v e n d o r n e u t r a l, t e c h n o l o g y a n d i n f r a s t r u c t u r e a g n o s t i c d e l i v e r a b l e s t o e n a b l e b i g d a t a s t a ke h o l d e r s t o p i c k a n d c h o o s e b e s t a n a l y t i c s t o o l s f o r t h e i r p r o c e s s i n g a n d v i s u a l i z a t i o n r e q u i r e m e n t s o n t h e m o s t s u i t a b l e c o m p u t i n g p l a t f o r m s a n d c l u s t e r s w h i l e a l l o w i n g v a l u e a d d e d f r o m b i g d a t a s e r v i c e p r o v i d e r s a n d f l o w o f d a t a b e t w e e n t h e s t a ke h o l d e r s i n a c o h e s i v e a n d s e c u r e m a n n e r. DELIVERABLES: Working Draft for 1. B i g D a t a D e f i n i t i o n s 2. B i g D a t a Ta xo n o m i e s 3. B i g D a t a R e q u i r e m e n t s 4. B i g D a t a S e c u r i t y a n d P r i v a c y R e q u i r e m e n t s 5. B i g D a t a A r c h i t e c t u r e s S u r v e y 6. B i g D a t a R e f e r e n c e Architectures 7. B i g D a t a S e c u r i t y a n d P r i v a c y R e f e r e n c e Architectures 8. B i g D a t a Te c h n o l o g y Roadmap LAUNCHED DATE: June 26,, 20133 TARGET DATE: September 27, 2013 CHARTER AND DELIVERABLES NBD PWG 5

6

Requirement s and Use Cases Technology Roadmap NBD PWG Reference Architecture Definitions & Taxonomies Security and Privacy SUBGROUPS A ND THEIR SCOPES AN D DELIVERABLES 7

SUBGR S ROUP Requirements and Use Cases Geoffrey Fox, U. Indiana Joe Paiva, VA Tsegereda Beyene, Cisco Scope (M0020) The focus is to form a community of interest from industry, academia, and government, with the goal of developing a consensus list of Big Data requirements across all stakeholders. This includes gathering and understanding various use cases from diversified application domains. Tasks Gather input from all stakeholders regarding Big Data requirements. Technology Roadmap Definitions and Taxonomies NBD PWG Requiremen ts Analyze/prioritize a list of challenging general requirements that may delay or prevent adoption of Big Data deployment Develop a comprehensive list of Big Data requirements Reference Architecture Security and Privacy Gov. Big Data Symposium, NIST/ITL, Wo Chang, Nov. 20, 2013 8

Requirements and Use Case Subgroup Key documents: M0105 use cases, M0125, requirements, M0152 working draft Use Case Template 1. Goals, Description 2. 3 3. 4. 5. 6 6. 7. Data Characteristics, Data Types Data Anal tics Data Analytics Current Solutions Security & Privacy Lifecycle Management and Data Quality System Management and Other issues 9

Requirements and Use Case Subgroup 51 Use Cases Received (http://bigdatawg.nist.gov/usecases.php) 1. Government Operation (4): National Archives and Records Administration, Census Bureau 2. Commercial (8): Finance in Cloud, Cloud Backup, Mendeley (Citations), Netflix, Web Search, Digital Materials, Cargo shipping (as in UPS) 3. Defense (3): Sensors, Image surveillance, Situation Assessment 4. Healthcare and Life Sciences (10): Medical records, Graph and Probabilistic analysis, Pathology, Bioimaging, Genomics, Epidemiology, People Activity models, Biodiversity 5. Deep Learning and Social Media (6): Driving Car, Geolocate images, Twitter, Crowd Sourcing, Network Science, NIST benchmark datasets 6. The Ecosystem for Research (4): Metadata, Collaboration, Language Translation, Light source experiments 7. Astronomy and Physics (5): Sky Surveys compared to simulation, Large Hadron Collider at CERN, Belle Accelerator II in Japan 8. Earth, Environmental and Polar Science (10): Radar Scattering in Atmosphere, Earthquake, Ocean, Earth Observation, Ice sheet Radar scattering, Earth radar mapping, Climate simulation datasets, Atmospheric turbulence identification, Subsurface Biogeochemistry (microbes to watersheds), AmeriFlux and FLUXNET gas sensors 9. Energy (10): Smart grid 10

Requirements and Use Case Subgroup Two step process is used for requirement extraction: 1. Extract specific requirements and map to reference architecture based on each application s characteristics such as: a data sources (data size, file formats, rate of grow, at rest or in a. data sources (data size file formats rate of grow at rest or in motion, etc.) b. data lifecycle management (curation, conversion, quality check, pre analytic processing, etc.) c. data transformation (data fusion/mashup, analytics), d f i d f h l d. capability infrastructure (software tools, platform tools, hardware resources such as storage and networking), and e. data usage (processed results in text, table, visual, and other formats). f. all architecture components informed by Goals and use case description g. Security & Privacy has direct map S it & P i h di t 2. Aggregate all specific requirements into high level generalized requirements which are vendor neutral and technology agnostic. 437 requirements were extracted from 51 Use Cases 437 q 5 35 aggregated general requirements into 7 categories 11

Partt of Prop perty Su ummary y Table Requirements and Use Case Subgroup 12

SUBGR S ROUP Definitions and Taxonomies Nancy Grady, SAIC Natasha Balac, SDSC Eugene Luster, R2AD Scope (M0018) The focus is to gain a better understanding d of the principles of Big Data. It is important to develop a consensus based common language and vocabulary terms used in Big Data across stakeholders from industry, academia, and government. In addition, it is also critical to identify essential actors with roles and responsibility, and subdivide them into components and sub components on how they interact/ relate with each other according to their similarities and differences. Tasks For Definitions: Compile terms used from all stakeholders regarding the meaning of Big Data from various standard bodies, domain applications, and diversified operational environments. Technology Roadmap Definitions and Taxonomies NBD PWG Requiremen ts Reference Security and Architecture Privacy For Taxonomies: Identify key actors with their roles and responsibilities from all stakeholders, categorize them into components and subcomponents based on their similarities and differences Develop Big Data Definitions and taxonomies documents Gov. Big Data Symposium, NIST/ITL, Wo Chang, Nov. 20, 2013 13

Definitions and Taxonomies Subgroup (M0024, M Key documents: M0024 ongoing discussion, M0142 working draft Big Data Definitions, v1 Big Data Definitions v1 (Developed from Jan. 15 (Developed from Jan 15 17, 2013 NIST 17 2013 NIST Cloud/BigData Workshop) Big Data refers to digital data volume, velocity Big Data refers to digital data volume velocity and/or variety that: enable novel approaches to frontier questions previously inaccessible or impractical using l bl l current or conventional methods; and/or g p y y exceed the storage capacity or analysis capability of current or conventional methods and systems. differentiates by storing and analyzing population data and not sample sizes. 14

Definitions and Taxonomies Subgroup (M0024, M Data Science is the extraction of actionable knowledge directly from data through a process of discovery, hypothesis, and analytical hypothesis analysis. Data Scientists is a practitioner who has sufficient knowledge of the overlapping regimes of expertise in business needs, domain knowledge, analytical skills and programming expertise to manage the end to end scientific method process through each stage in the big data lifecycle (through action) to deliver value. 15

Definitions and Taxonomies Subgroup (M0024, M Big Data Taxonomies (M0202) Actors S t R l System Roles Sensors Data Provider makes available data internal and/or external to the system Applications Software agents Individuals Organizations Hardware resources Service abstractions Data Consumer uses the output of the system S System Orchestrator O h governance, requirements, monitoring Big Data Application Provider instantiates application Big Data Framework Provider provides resources 16

Definitions and Taxonomies Subgroup (M0024, M Big Data Taxonomies (M0202) 17

Definitions and Taxonomies Subgroup (M0024, M Big Data Taxonomies (M0202) 18

SUBGR S ROUP Reference Architecture Orit Levin, Microsoft James Ketner, AT&T Don Krapohl, Augmented Intelligence Scope (M0021) The focus is to form a community of interest t from industry, academia, and government, with the goal of developing a consensus based approach to orchestrate vendor neutral, technology and infrastructure agnostic for analytics tools and computing environments. The goal is to enable Big Data stakeholders to pick and choose technologyagnostic analytics tools for processing and visualization in any computing platform and cluster while allowing value added from Big Data service providers and the flow of the data between the stakeholders in a cohesive and secure manner. Tasks Gather and study available Big Data architectures representing various stakeholders, different data types, use cases, and document the architectures using the Big Data taxonomies model based upon the identified actors with their roles and responsibilities. Technology Roadmap Definitions and Taxonomies NBD PWG Requiremen ts Ensure that the developed Big Data reference architecture and the Security and Privacy Reference Architecture correspond and complement each other. Reference Architecture Security and Privacy Gov. Big Data Symposium, NIST/ITL, Wo Chang, Nov. 20, 2013 19

Reference Architecture Subgroup Key documents: M0100, M0151 white paper, M0123 working draft Data Processing Flow D t P i Fl M0039 Data Transformation Flow D t T f ti Fl M0017 IT Stackk IT St M0047 20

Reference Architecture Subgroup V d Bi D A hi Vendors Big Data Architectures 21

Reference Architecture Subgroup What the Baseline Big Data RA A superset of a traditional IS data system y A representation of a vendor neutral and technology agnostic system g y A functional architecture comprised of logical roles Applicable to a variety of business models o Tightly integrated enterprise systems o Loosely coupled vertical industries A business architecture IS NOT representing internal vs. external functional boundaries A deployment architecture A detailed IT RA of a specific system implementation All of the above will be f developed in the next stage in the context of specific use cases. p f 22

Reference Architecture Subgroup RA Diagram 23

SUBGR S ROUP Technology Roadmap Definitions and Taxonomies NBD PWG Security and Privacy Arnab Roy, CSA/Fujitsu Nancy Landreville, U. MD Akhil Manchanda, GE Requiremen ts Scope (M0019) The focus is to form a community of interest t from industry, academia, and government, with the goal of developing a consensus secure reference architecture to handle security and privacy issues across all stakeholders. This includes gaining an understanding of what standards are available or under development, as well as identifies which key organizations are working on these standards. Tasks Gather input from all stakeholders regarding security and privacy concerns in Big Data processing, storage, and services. Analyze/prioritize a list of challenging security and privacy requirements that may delay or prevent adoption of Big Data deployment Develop a Security and Privacy Reference Architecture that supplements the general Big Data Reference Architecture Reference Architecture Security and Privacy Gov. Big Data Symposium, NIST/ITL, Wo Chang, Nov. 20, 2013 24

Security and Privacy Subgroup Key documents: Google Doc ongoing discussion, M0110 requirements working draft, M0xxx architecture & taxonomies Requirements Scope 1. Infrastructure Security I f S i 2. Data Privacy 3. Data Management 4 4. Integrity and Reactive Security Requirements Use Cases Studied 1. Retail (consumer) 2 2. Healthcare 3. Media (social media and communications) 4. Government (military, justice systems, etc.) 55. Marketingg Architecture & Taxonomies 1. Privacy 2. Provenance 3. System Health 25

SUBGR S ROUP Technology Roadmap Carl Buffington, USDA/Vistronix Dan McClary, Oracle David Boyd, Data Tactic Scope (M0022) The focus is to form a community of interest from industry, academia, and government, with the goal of developing a consensus vision with recommendations on how Big Data should move forward by performing a good gap analysis through the materials gathered from all other NBD subgroups. This includes setting standardization and adoption priorities through an understanding of what standards are available or under development as part of the recommendations. Tasks Gather input from NBD subgroups and study the taxonomies for the actors roles and responsibility, use cases and requirements, and secure reference architecture. Gain understanding of what standards are available or under development for Big Data Perform a thorough gap analysis and document the findings Technology Roadmap Definitions and Taxonomies NBD PWG Requiremen ts Identify what possible barriers may delay or prevent adoption of Big Data Document vision and recommendations Reference Architecture Security and Privacy Gov. Big Data Symposium, NIST/ITL, Wo Chang, Nov. 20, 2013 26

Technology Roadmap Subgroup Key documents: M0087 working draft Inputs from other subgroups 11. 2. 3. 4 4. Definitions and Taxonomies Requirements and Use Cases Security and Privacy Reference Architecture Potential Standards Group with Big Data related activities (M0035) Capabilities and Technology Readiness Big Data Decision Framework Big Data Mapping and Gap Analysis Big Data Strategies 1. 2. 3. Adoption Implementation l Resourcing 27

Subgroups Working Draft Outline Contact: bigdatainfo@nist.gov Website: http://bigdatawg.nist.gov Join NBD PWG: http://bigdatawg nist gov/newuser php Join NBD PWG: http://bigdatawg.nist.gov/newuser.php Documents: http://bigdatawg.nist.gov/show_inputdoc.php Working Drafts Big Data Definitions and Taxonomies (M0142) f d Big Data Requirements (M0245) g y y q ( ) Big Data Security and Privacy Requirements (M0110) Big Data Architectures White Paper Survey M0151) Big Data Reference Architectures (M0226) Big Data Security and Privacy Reference Architectures (M0110) Big Data Technology Roadmap (M0087) NIST Big Data Workshop Slides: h http://bigdatawg.nist.gov/workshop.php b d k h h 28

http://bigdatawg.nist.gov 29

Next Steps/ Future Activities 30

ISO/IEC JTC 1 Big Data Study Group Terms and References 1. Survey the existing ICT landscape for key technologies and relevant standards/models/studies /use cases and scenarios for Big Data from JTC 1, ISO, IEC and other standards setting organizations, 2 Identify key terms and definitions commonly used in the area of Big Data, 2. Identify key terms and definitions commonly used in the area of Big Data 3. Assess the current status of Big Data standardization market requirements, identify standards gaps, and propose standardization priorities to serve as a basis for future JTC 1 work, and 4 Provide a report with recommendations and other potential deliverables to the 2014 JTC 4. 1 Plenary. Focus Areas: o Big Data Architecture/Infrastructure o Big Data Security & Privacy o Big Data Analytics o Big Data Applications & Tools o Big Data Management Meetings: three face to face meetings + telecon Meetings three face to face meetings + telecon o January US o April Europe o July Asia Format: 4 days (2 days workshop Format: 4 days (2 days workshop short papers, and 2 days meeting) short papers and 2 days meeting) Workshop papers will go into NIST Special Publication High quality papers may go into ACM Conference Proceeding (in process) Report due in September, 2014 31