BITKOM & NIK - Big Data: Where Are the Opportunities for Mid-Sized Businesses?





The Big Data Buzz

"Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications." (Wikipedia)

Apache Hadoop, the leading big data analytics technology, won the Media Guardian Innovator of the Year Award.

Big Data Analytics

Is there a comprehensive strategy for Big Data in your company?
No: 63%
Planned: 23%
Yes: 14%

Please estimate the yearly growth of data for reporting and analysis.
Year | Negative growth / No growth | 1-25% growth | 25%-50% growth | Over 50% growth
2011 | 9% | 66% | 19% | 7%
2012 | 4% | 54% | 35% | 8%
2013 | 3% | 48% | 36% | 13%

Survey with 274 participants from DACH, France, Nordics, Netherlands,

What problems have you encountered when using Big Data?
Inadequate technical know-how: 46%
Inadequate analytical know-how: 44%
Lack of compelling business case: 36%
Technical problems: 34%
Cost: 33%
Data privacy issues: 25%
Cannot make Big Data usable for end users: 15%

In which areas does your company use Big Data analysis?
Controlling: 24%
Marketing: 19%
Sales: 18%
IT: 18%
Production: 17%
Research and development: 14%
Supply chain: 7%

The database journey continues: Big Data

Scale-up (SMP): up to 256 cores in Windows today
Scale-out (MPP): Parallel Data Warehouse

But is this already Big Data?

The Large Hadron Collider produces 15 PB/year.*
* http://public.web.cern.ch/public/en/lhc/computing-en.html

But what if my customer doesn't own a Large Hadron Collider?

Large-scale plants, vehicle fleets, smart grids, green energy, stock exchanges, host protocols, computer centers, web farms, Twitter, Facebook, Google Analytics

"Big data" is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
Source: The Importance of 'Big Data': A Definition, Mark Beyer, Douglas Laney, G00235055

200 million feeds, 100 PB, 20 PB

Social analytics: different sources, data structures, sophisticated algorithms

Up to 75 control units in one car
Up to 1,000 possible special equipment options
About 15 GB of data on board (incl. navigational data)
Up to 12,000 stores for onboard diagnosis
More than 50,000 car diagnoses each day

How to deal with the 3 Vs?

Hadoop

The Hadoop Ecosystem (simplified)
Source: Tom White's Hadoop: The Definitive Guide

Scalable machine learning library that leverages the Hadoop infrastructure
Key use cases:
Recommendation mining: examine user behavior, build recommendation model
Clustering: grouping data into related topics
Classification: learn from classified documents to assign categories to unlabeled data
Algorithms: K-means Clustering, Naïve Bayes, Decision Tree, Neural Network, Hierarchical Clustering, Positive Matrix Factorization and more
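K-means clustering, the first algorithm named in the list above, can be illustrated on a single machine. The following is a minimal plain-Python sketch of Lloyd's algorithm, not the API of any Hadoop-based library: the function name `kmeans`, the seeding with the first k points, and the fixed iteration count are all assumptions made for illustration. A distributed implementation would spread the assignment step across the cluster.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm on (x, y) tuples; the first k points seed the centroids."""
    centroids = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignment


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(data, k=2))
```

The two nearby pairs of points end up in separate clusters; with real data the result depends on the seeding, which is why production libraries offer smarter initialization.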

Ecosystem components: Zookeeper, Ambari, HCatalog, Oozie, HBase/Cassandra/Couch/MongoDB, Hive, Mahout, R, Cascading, Pig, Flume, Sqoop, HBase (Column DB), Avro
Core: Hadoop = MapReduce + HDFS
Vendors and tools: Hortonworks, HStreaming, Karmasphere, Splunk, Cloudera, Hadapt, MapR, Datameer
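The equation "Hadoop = MapReduce + HDFS" is easier to picture with a toy version of the MapReduce programming model. The following is a minimal single-process Python sketch, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration, and in a real cluster the input records would be HDFS blocks processed in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce sequentially over an in-memory list of records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data node"]))
```

Because each map call sees only one record and each reduce call sees only one key, both phases can be distributed without coordination; the shuffle is the only step that moves data between nodes.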

Use of Big Data

180 PB raw data on > 40,000 computers (polystructured)*
Biggest Hadoop cluster: 4,500 nodes (2x4 CPUs, 4x1 TB disks, 16 GB RAM)
Ad impressions: cube with 207 measures, 24 dimensions, 247 attributes
Desktop clients (Excel & Tableau): < 6 s ad hoc query time
* http://wiki.apache.org/hadoop/poweredby
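The per-node figures for the biggest cluster can be multiplied out to get a feel for its total size. This is a back-of-the-envelope sketch under stated assumptions: "2x4 CPUs" is read as two quad-core processors per node, decimal units are used (1 PB = 1,000 TB), and usable capacity assumes the HDFS default replication factor of 3. None of the totals appear in the slide itself.

```python
def cluster_totals(nodes=4_500, disks_per_node=4, tb_per_disk=1,
                   cores_per_node=2 * 4, ram_gb_per_node=16, replication=3):
    """Multiply per-node specs into cluster-wide totals (decimal units)."""
    raw_pb = nodes * disks_per_node * tb_per_disk / 1_000
    return {
        "raw_disk_pb": raw_pb,
        # Assumption: every block is stored `replication` times (HDFS default is 3).
        "usable_disk_pb": raw_pb / replication,
        # Assumption: "2x4 CPUs" means two quad-core processors per node.
        "cores": nodes * cores_per_node,
        "ram_tb": nodes * ram_gb_per_node / 1_000,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

Under these assumptions the single cluster holds about 18 PB of raw disk, so the 180 PB total would have to be spread over many clusters.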

Query engine for SQL & Hadoop
Cost-based optimizer decides between:
Rendering operators as Map/Reduce jobs, or
Moving HDFS data into RDBMS storage
HDFS Bridge for parallelized data transport
Diagram labels: Regular T-SQL, Results, PDW V2 & HDFS Data Nodes
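The optimizer's choice between running operators as Map/Reduce jobs and moving HDFS data into the RDBMS can be pictured as a comparison of two estimated costs. The sketch below is a toy model only and does not reflect the product's actual cost model: the function `choose_plan`, the plan names, and every rate and overhead value are hypothetical numbers chosen for illustration.

```python
def choose_plan(hdfs_bytes, selectivity,
                mr_startup_s=30.0, mr_scan_bytes_per_s=2e9,
                transfer_bytes_per_s=1e9, rdbms_scan_bytes_per_s=10e9):
    """Compare two hypothetical plans and return the cheaper one with both estimates.

    All rates and the startup overhead are made-up illustration values,
    not measurements of any real system.
    """
    # Plan A: run the filter as a MapReduce job, then ship only the matching rows.
    pushdown = (mr_startup_s
                + hdfs_bytes / mr_scan_bytes_per_s
                + hdfs_bytes * selectivity / transfer_bytes_per_s)
    # Plan B: import the full HDFS data into RDBMS storage and filter there.
    import_all = (hdfs_bytes / transfer_bytes_per_s
                  + hdfs_bytes / rdbms_scan_bytes_per_s)
    plan = "mapreduce_pushdown" if pushdown < import_all else "import_into_rdbms"
    return plan, pushdown, import_all


if __name__ == "__main__":
    print(choose_plan(hdfs_bytes=1e9, selectivity=0.5))    # small input
    print(choose_plan(hdfs_bytes=1e12, selectivity=0.01))  # large, selective query
```

With these made-up numbers, small inputs favor importing (the job startup overhead dominates), while large inputs with a selective filter favor pushing the work down to Hadoop.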

maturing Not every problem questions simple

2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.