Survey of Big Data Architecture and Framework from the Industry
NIST Big Data Public Working Group
Sanjay Mishra
May 13, 2014
3/19/2014 NIST Big Data Public Working Group

NIST BD PWG Survey of Big Data Architecture
This presentation briefly summarizes a survey of Big Data reference architectures submitted to the NIST BD PWG by leading companies supporting the Big Data framework. Contributions to the survey came from the following sources:
- Big Data Ecosystem, Microsoft Corporation
- Big Data Layered Architecture, Bob Marcus (submitted as an individual contribution to NIST)
- Big Data Architecture Framework, University of Amsterdam
- IBM Big Data Platform, IBM
- Big Data Reference Architecture, Oracle
- Big Data Architecture Model, Pivotal (spinoff from EMC)
- Big Data Reference Architecture, SAP
- Big Data Architecture Model, 9Sight
- The High Performance Computing Cluster Systems Platform, LexisNexis

Big Data Layered Architecture, Bob Marcus (individual contribution)
Big Data Ecosystem by Microsoft
Big Data Architecture Framework (BDAF) by University of Amsterdam
Big Data Platform, IBM
Big Data Reference Architecture, Oracle
[Figure: Big Data Reference Architecture. Labels recovered from the diagram, in extraction order: Information Provisioning; Information Analysis; Descriptive Analytics; Reporting; Statistical Analysis; In-DB Data Mining; MapReduce; Predictive Analytics (In-Database); Dashboards; Semantic Analysis; Text Mining; Spatial; Operational Database; Big Data Processing & Discovery; Massive Unstructured Data; Big Data Processing & Stream Processing; Information Discovery; Data Warehouse; Data Conversion; Bulk Data Movement; Distributed File Systems; NoSQL/Tag-Value; Faceted Unstructured; Data Streams; Relational; Spatial/Relational; Data Sources; Infrastructure Services; Hardware; Network; Storage; Operating System; Connectivity; Virtualization; Security; Management.]
Big Data Architecture, Pivotal (EMC Spinoff)
Big Data Platform, SAP
Big Data Platform, 9Sight
Big Data Platform, LexisNexis
NIST BD PWG Big Data Survey
Big Data Survey - Platform Components
Big Data Survey - Analytics Components
Big Data Survey - Data Component
Big Data Unified View
Big Data Survey - Conclusion
The consistent theme of the survey is how remarkably similar the surveyed Big Data architectures are to one another.
Key points
- Every reference architecture that we surveyed included the following three areas:
  - Big Data Analytics
    - Descriptive, Predictive and Spatial
    - Real-time, Interactive, Batch Analytics
    - Reporting, Dashboard
  - Big Data Management/Data Store
    - Structured, semi-structured and unstructured data
    - Velocity, Variety and Volume
    - SQL and NoSQL
    - Distributed File System
  - Big Data Infrastructure
    - In Memory Data Grids
    - Operational Database
    - Analytic Database
    - Relational Database
    - Flat files
    - Content Management System
    - Horizontally scalable architecture
- Almost all of the architectures rest on key pillars that include at least Data Users/Consumers and Orchestration, supported by resources such as Systems Management, Data Resource Management, and Security and Data Governance.
- However, the architectures also reveal a lack of standardized and adequate support for data security and privacy; additional standardization is needed in this area.
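
The three common areas in the conclusion can be read as a small taxonomy. The Python sketch below is only an illustration of that reading and is not part of the NIST survey: the names `COMMON_AREAS` and `areas_covered` are hypothetical, and the component strings are copied from the conclusion's lists as they appear there.

```python
# Illustrative sketch only: the three common areas from the survey conclusion,
# encoded as a dictionary. COMMON_AREAS and areas_covered are hypothetical
# names introduced for this example; they do not come from the survey.
COMMON_AREAS = {
    "Big Data Analytics": [
        "Descriptive", "Predictive", "Spatial",
        "Real-time", "Interactive", "Batch Analytics",
        "Reporting", "Dashboard",
    ],
    "Big Data Management/Data Store": [
        "Structured data", "Semi-structured data", "Unstructured data",
        "Velocity", "Variety", "Volume",
        "SQL", "NoSQL", "Distributed File System",
    ],
    "Big Data Infrastructure": [
        "In Memory Data Grids", "Operational Database", "Analytic Database",
        "Relational Database", "Flat files", "Content Management System",
        "Horizontally scalable architecture",
    ],
}


def areas_covered(components):
    """Return the set of common areas that contain at least one of the
    given component names (compared case-insensitively)."""
    wanted = {name.lower() for name in components}
    return {
        area
        for area, listed in COMMON_AREAS.items()
        if any(item.lower() in wanted for item in listed)
    }


if __name__ == "__main__":
    # A made-up component list, not one of the surveyed architectures.
    example = ["Reporting", "NoSQL", "Flat files"]
    print(sorted(areas_covered(example)))
```

A check like this only shows which of the three areas a list of component names touches; it says nothing about how well an architecture supports them.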