Big Data: Overview and Roadmap. 2015 eglobaltech. All rights reserved.




What is Big Data?
- Large volumes of complex and variable data that require advanced techniques and technologies to enable capture, storage, distribution, management, and analysis.
- Rapidly expanding volume of high-velocity, complex, and diverse types of data.

Characteristics of Big Data

Volume
  Description: Sheer amount of data generated, or data intensity, that must be ingested, analyzed, and managed to make decisions.
  Drivers: Increase in data sources and higher-resolution sensors.

Velocity
  Description: How fast data is being produced and changed, and the speed with which data must be received, understood, and processed.
  Drivers: Increase in data sources; improved throughput connectivity; enhanced computing power of data-generating devices.

Variety
  Description: Rise of information coming from new internal and external sources. Structured, unstructured, and semi-structured data.
  Drivers: Mobile, social media, videos, chat, genomics, sensors.

Big Data in USG
- Digital Government: Build a 21st Century Platform to Better Serve The American People (Digital Government Strategy). The strategy calls for USG to unlock the power of government data to spur innovation across our nation and improve the quality of services for the American people.
  https://www.whitehouse.gov/sites/default/files/omb/egov/digitalgovernment/digital-government.html
- Big Data: Seizing Opportunities, Preserving Values. White House review of the impact big data technologies will have on a range of economic, social, and government activities.
  https://www.whitehouse.gov/sites/default/files/docs/big_data_privacy_report_may_1_2014.pdf

How is Big Data Related to Cloud?
Cloud computing can provide better performance and scalability for most big data systems because it offers auto-scaling and auto-provisioning.

Cloud Characteristic: Big Data Relationship
- Rapid elasticity and scalability: Allows IT services to scale automatically to meet expanding demand for Big Data analytics services and volumes.
- Measured service: Big Data cloud resources are monitored and controlled per use.
- Broad network access: Big Data cloud resources can be accessed by diverse client platforms across the network.
- Resource pooling: Big Data cloud resources are aggregated in a location-independent manner, enabling them to be assigned and reassigned on demand.

How is Big Data Related to IoT?
- IoT consists of Internet-connected sensors attached to "things", generating and transferring data over a network without requiring human-to-human or human-to-computer interaction.
- The increase in connected devices will lead to an exponential increase in the data that needs to be managed.
- Big Data capacity is a prerequisite to tapping into the Internet of Things.

Overview of the Big Data Market

Implementation Challenges

Roadmap

Why Big Data in Government?
Government is data-driven:
- Each Ministry regularly captures data in non-standardized formats, stored in disparate systems.
- Scarce resources are spent to maintain this practice and to assess this data in an isolated manner, producing only partial results.
- Citizens capture important information via social media and mobile apps.
The issue is: There is no practical way to aggregate this data and leverage it to bring practical results benefitting Government and Citizens as a whole.

The Business Case
- Accelerating productivity is key to increasing standards of living.
- Economies as a whole need to enable organizations to take advantage of big data.

Big Data Concept
[Diagram labels: Standard Tools for Analysis and Visualization; Common Government Big Data Platform (Citizen, Environment, Health, Geospatial, Census); Data Sources (Federal, Local, Citizens)]

Big Data Enterprise Model

Data Transformation
What it is: A framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

Components
- Hadoop Distributed File System (HDFS): Storage clusters that hold the actual data.
- MapReduce: Java-based system used to process data. Does not involve queries. MapReduce runs as a series of jobs, with each job essentially a separate Java application that pulls out information as needed.
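The MapReduce programming model described on this slide can be illustrated with a small word-count sketch. This is only an illustration under stated assumptions: it runs on a single machine in plain Python, whereas the slide describes MapReduce jobs as Java applications running across a cluster, and the function names (map_phase, shuffle, reduce_phase, run_job) are hypothetical and not part of any Hadoop API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(documents):
    """Run map, shuffle, and reduce in sequence (hypothetical single-machine driver)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Two tiny input records standing in for files stored on a cluster.
counts = run_job(["big data big platform", "data platform"])
print(counts)
```

In a real cluster the map and reduce steps would run in parallel on the machines that hold the data, which is what lets the approach scale from one server to many.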

Notional Data Flow

Implementation Roadmap
Phased Implementation
- Define Strategy and Concept
- Define Governance Framework
- Define Use Case
- Identify Big Data Assets
- Identify Big Data Sources
- Assess Data Quality
- Consistent Communications, Training, Change Management
- IT & Security Requirements
- Select Technologies
- Deploy
- Refine Affected Business Processes
- Deploy New Business Processes
- Tune Applications
- Consistent Communications, Training, Change Management

Lessons Learned
- Big Data implementation is iterative and cyclical rather than revolutionary.
- Do not start with a technology focus. Instead, consider business/mission requirements that you are unable to address using traditional approaches.
- Address the initial set of use cases by augmenting current IT investments with an intent to scale them to support future deployments.
- Focus on one Big Data entry point: volume, variety, or velocity.
- After initial deployment, expand to adjacent use cases, building out a more robust and unified set of core technical capabilities. Examples include:
  - Ability to analyze streaming data in real time.
  - Use of Hadoop or Hadoop-like technologies to tap huge, distributed data sources.
  - Adoption of advanced data warehousing and data mining software.

Additional Recommendations
- Establish a partnership between industry, academia, and professional organizations to advance big data analytics and maintain professional competency standards.
- Explore which data assets can be made public to help spur innovation outside of Government.

Additional Resources

Source: Description
- The Cloud Security Alliance (https://cloudsecurityalliance.org/research/big-data): Helping to identify scalable techniques for data-centric security and privacy challenges in big data.
- The Open Data Foundation (www.opendatafoundation.org): Helping to promote adoption of global metadata standards and to develop open source frameworks for statistical data.
- Apache (http://hadoop.apache.org): Offers an open source library that is a standards-based framework for processing large data sets across clusters of computers.
- Organization for the Advancement of Structured Information Standards (www.oasis-open.org): Will focus on the creation of big data standards.

Case Studies

German Federal Labor Agency
Goal: To develop a tailored intervention approach for different segments of unemployed workers.
Action: Aggregated and analyzed historical data, such as unemployed worker history, the interventions for each worker, and the outcomes.
Results:
- Identified programs that are ineffective (for improvement or elimination).
- Refined ability to evaluate the characteristics of unemployed and partially employed workers.
- Developed a segmented approach to offer more targeted placement and counseling services.
- Over 3 years, reduced spending by €10 billion ($14.9 billion) annually.
- Decreased the amount of time required for unemployed workers to find employment.

US Department of Homeland Security
Goal: Increase border security around the Great Lakes region between the USA and Canada by enhancing information sharing among public safety agencies.
Issue: Each agency had its own procedures and technology. Data was fragmented among different software systems, databases, spreadsheets, and documents.
Solution: Visual Fusion software from IDV Solutions.
- Provides a way to connect diverse systems and data in a shared, web-based view.
- Multiple agencies use the solution (called REsILIeNT) to share information, including 911 incident reports, real-time webcam feeds, regionally available resources, and vulnerable infrastructure.
- Has a Digital Whiteboard, which allows emergency planners from multiple agencies to visualize possible scenarios and work together on potential responses.

Chicago's Smart Data Platform
- An open source predictive analytics platform.
- Connected to WindyGrid, a hub housing information from every department in real time and gathering about 7 million rows of data per day.
- Scalable design and user-friendly interface.
- The predictive power of the tool is its ability to analyze data relationships at a speed and on a scale not previously possible.
- Third-party apps are expanding rapidly. Purple Binder aggregates social services information so that social workers and healthcare professionals have an up-to-date, single source of data about services available to their clients.
- Free to any city willing to install it.
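To put the WindyGrid volume figure in perspective, the short calculation below converts the quoted daily row count into an average per-second rate. It is a rough sketch that assumes rows arrive evenly over 24 hours, which the slide does not state; real traffic is likely to be burstier. The function name is hypothetical.

```python
ROWS_PER_DAY = 7_000_000          # figure quoted on the slide ("about 7 million rows of data per day")
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds


def average_rows_per_second(rows_per_day):
    """Average ingest rate, assuming rows arrive evenly across the day."""
    return rows_per_day / SECONDS_PER_DAY


rate = average_rows_per_second(ROWS_PER_DAY)
print(f"{rate:.0f} rows per second on average")  # prints "81 rows per second on average"
```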