NIST Big Data Phase I Public Working Group
Reference Architecture Subgroup
May 13th, 2014
Presented by: Orit Levin, Co-chair of the RA Subgroup
Agenda
- Introduction: Why and How
- NIST Big Data Reference Architecture
- Summary: What's new to Big Data
Introduction: Objectives
- Provide a technical reference for U.S. Government departments, agencies, and other consumers to understand, discuss, categorize, and compare Big Data solutions
- Facilitate the analysis of candidate standards for interoperability, portability, reusability, and extendibility
- Illustrate and improve understanding of the various Big Data components, processes, and systems
- Provide a common language for the various stakeholders
Introduction: Process
- Use Cases
- Survey of Architectures
- Definitions and Taxonomy
- Reference Architecture
- Security
- Roadmap
RAs Survey Outcome
[Figure: Sources, Transformation, Usage, Data Infrastructure, Security, Management, Cloud Computing, Network]
Transformation includes:
- Processing functions
- Analytic functions
- Visualization functions
Data Infrastructure includes:
- Data stores
- In-memory DBs
- Analytic DBs
NIST Big Data Reference Architecture
What the NIST Big Data RA Is / Is Not
Is:
- A vendor-neutral and technology-agnostic functional architecture
- A descriptive reference model
- Comprised of logical roles
- A superset of a traditional data system
- Applicable to a variety of business models
  - Tightly-integrated enterprise systems
  - Loosely-coupled vertical industries
Is Not:
- A prescriptive Reference Architecture
- A deployment model
- A business model showing internal vs. external functional boundaries
The above can be developed in the context of a specific use case.
Main Functional Blocks / Roles
[Figure callout labels: Application Provider; Data / System Lifecycle]
- System Orchestrator
- Data Provider
- Big Data Application Provider: Collection, Curation, Analytics, Visualization, Access
- Data Consumer
- Big Data Framework Provider
Big Data Framework Provider(s)
[Figure: System Orchestrator, Data Provider, Big Data Application Provider (Collection, Curation, Analytics, Visualization, Access), Data Consumer, and Big Data Framework Provider, with the Framework Provider expanded into the layers below]
- Processing Frameworks (analytic tools, etc.): horizontally scalable, vertically scalable
- Platforms (databases, etc.): horizontally scalable, vertically scalable
- Infrastructures: horizontally scalable (VM clusters), vertically scalable
- Physical and Virtual Resources (networking, computing, etc.)
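The layering on this slide can be written down as a small data structure. The following is only a sketch: the layer names and examples are taken from the slide, while the `Layer` class, the `layers_with` helper, and the pairing of "VM clusters" with the Infrastructures layer are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Layer:
    """One Framework Provider layer (hypothetical type, not part of the RA)."""
    name: str
    examples: str
    scaling: Tuple[str, ...]  # scaling modes shown next to the layer on the slide


# Top-to-bottom order as listed on the slide.
FRAMEWORK_PROVIDER: Tuple[Layer, ...] = (
    Layer("Processing Frameworks", "analytic tools, etc.", ("horizontal", "vertical")),
    Layer("Platforms", "databases, etc.", ("horizontal", "vertical")),
    # Assumption: the "(VM clusters)" label belongs to this layer.
    Layer("Infrastructures", "VM clusters", ("horizontal", "vertical")),
    Layer("Physical and Virtual Resources", "networking, computing, etc.", ()),
)


def layers_with(scaling: str) -> List[str]:
    """Return the names of the layers that list the given scaling mode."""
    return [layer.name for layer in FRAMEWORK_PROVIDER if scaling in layer.scaling]


print(layers_with("horizontal"))
```

Keeping the scaling modes as plain strings keeps the sketch short; a real catalogue of frameworks would likely need a richer description per layer.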
Data Flow / System Orchestrator
System Orchestrator:
- Vertical application specific
- Central entity vs. distributed function
[Figure: DATA and SW flows between the Data Provider(s), the Big Data Application Provider (Collection, Curation, Analytics, Visualization, Access), the Data Consumer, and the Big Data Framework Provider (Processing Frameworks, Platforms, Infrastructures, Physical and Virtual Resources)]
Data Provider side:
- Discovery of data
- Description of data
- Access to data
- Code execution on data
- Etc.
Data Consumer side:
- Discovery of services
- Description of data
- Visualization of data
- Rendering of data
- Reporting of data
- Code execution on data
- Etc.
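The data flow on this slide, from a Data Provider through the Application Provider activities to a Data Consumer, can be sketched roughly as follows. The five activity names come from the slide. The strictly linear ordering, the `Record` type, and all function names are assumptions for illustration; the RA itself is a descriptive model and does not define such an API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Activity names as listed on the slide; treating them as a fixed
# linear pipeline is a simplifying assumption.
STAGES = ["Collection", "Curation", "Analytics", "Visualization", "Access"]


@dataclass
class Record:
    """Hypothetical unit of data moving through the system."""
    payload: Dict[str, object]
    trail: List[str] = field(default_factory=list)  # activities visited so far


def data_provider() -> Record:
    """Stand-in for a Data Provider handing data to the Application Provider."""
    return Record(payload={"sensor": "s1", "value": 42})


def application_provider(record: Record) -> Record:
    """Pass the record through each Application Provider activity in order."""
    for stage in STAGES:
        record.trail.append(stage)  # a real system would transform the data here
    return record


def data_consumer(record: Record) -> str:
    """Stand-in for a Data Consumer receiving the result."""
    return f"{record.payload['sensor']} via {' > '.join(record.trail)}"


result = data_consumer(application_provider(data_provider()))
print(result)
```

In this sketch the System Orchestrator role is implicit in the single line that chains the three calls; a distributed orchestrator would spread that responsibility across components.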
Summary: What Has Changed; What Is New
- Technological challenges as a result of Volume, Velocity, and Variety
  - Unstructured repositories
  - Changes in data lifecycle
  - Data in-situ
- Security concerns as a result of distribution and numerous stakeholders
  - Cloud Computing
  - Globalization
- Privacy concerns as a result of
  - Unconsented data collection, Internet of Things
  - Re-identification through fusion
Next Steps (in Parallel)
- Implement selected Use Cases according to the RA
- Identify Use Case patterns, with the functionalities/interfaces between the RA components for each
- Describe the types of interfaces between the RA components based on existing technologies and products
THANK YOU!
Orit Levin
oritl@microsoft.com