How do we handle challenges in the area of Big Data? Jan Östling, Enterprise Technologies, Intel Corporation, NER



Transcription:

How do we handle challenges in the area of Big Data? Jan Östling, Enterprise Technologies, Intel Corporation, NER

Legal Disclaimers

All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number

Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization

No computer system can provide absolute security under all conditions. Intel Trusted Execution Technology (Intel TXT) requires a computer system with Intel Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security

Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of others.

Copyright 2013, Intel Corporation. All rights reserved.

Agenda
- Big Data trends and opportunities
- Evolution of Data Management & Analytics
- Intel provides foundation for Big Data
- Intel Compute Platforms Optimized for Big Data
- Intel Storage & Network Technology
- Intel Software Optimization
- Summary

Big Data Trends

Billions of connected users sharing:
- Cell phones: 5.3 bn
- Skype: 663m
- Facebook: 629m
- Hotmail: 364m
- Yahoo: 273m

Data growth:
- >1500 exabytes of cloud traffic
- 1400 exabytes of new integrated systems data
- 690% growth in storage capacity by 2015

[Figure: data volume over time. Labels: Corporate Data, Big Web Data, Big Corp Data, Big Sensed Data; Structured Data, Unstructured Data.]

What insights can we derive?

[Figure: business value against complexity, rising through Reporting, Monitoring, Analysis and Prediction.]

IT survey (source: Intel): Are you looking at Big Data? How are you approaching the opportunity?
- Yes: 75%
- No, but on radar: 20%
- No: 5%

What Enterprises are doing with Big Data?

From Experts
- "Only business model Tech has left." Forbes, 2011
- "Data are becoming the new raw material of business: an economic input almost on a par with capital and labor." The Economist, 2010
- "Information will be the oil of the 21st century." Gartner, 2010
- Retail: increase margins 60%. Manufacturing: 50% decrease in production costs. Cellular: $150B to providers. Public sector: $250B growth. McKinsey, 2010

From Customers
- Retail: real-time social trend analysis to identify the hottest products to offer
- Financial Services: real-time fraud detection, prevention & recovery
- Provider Billing: real-time access to subscriber billing records to offer new services and prevent customer churn
- Smart City: predictive traffic forecasting
- Telco: new customer segmentation for real-time campaigns
- Utility: load balance energy grids through real-time monitoring of customer energy usage

Evolution to Big Data Processing

1990s
- Paradigm: data reporting / mining. High cost, departmental use.
- Processing style: batch (e.g. sales reports). Sequential SQL queries (e.g. retrieve sales reports).
- Form factor: RDBMS scale.

2000s
- Paradigm: model-based discovery. High cost, departmental use.
- Processing style: batch (e.g. correlated buying patterns). No SQL, parallel analysis. Shared disk/memory.
- Form factor: proprietary MPP / DW appliance.

Today
- Paradigm: low cost, enterprise use. Arrival of vast amounts of unstructured data.
- Processing style: near real-time (e.g. recommendation engine). Process at the storage node. Built-in data replication/reliability. Shared nothing, in memory.
- Form factor: open source SW loosely coupled on standards-based HW. Unlimited linear scale through distributed node addition. In-memory analytics (EXALYTICS).

Future
- Real world modeling
- Real-time predictive analytics
- HPC simulation, machine learning

What is Different about Big Data?

Traditional Data Analysis: Transaction (relational database), Batch (data warehouse), Analyze. Tools: SQL.
Big Data Analysis: structured, unstructured and streaming data from devices, processed on a cluster of nodes. Organize, Analyze. Tools: MapReduce, R, Hive.

- Volume: Gigabytes to Terabytes (traditional); Petabytes and beyond (Big Data)
- Velocity: Batch (traditional); CEP, real-time data analytics (Big Data)
- Variety: Centralized, data moves to analytics (traditional); Distributed, analytics moves to data (Big Data)
- Value: Reactive, query, reporting, proprietary (traditional); Predictive analytics, machine learning, graph algorithms, statistical modeling (Big Data)

Big Data augments traditional Business Intelligence.
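As an illustration of "analytics moves to data", the sketch below computes a small summary on each partition and ships only those summaries to a coordinator. It is a minimal single-process sketch, not anything from the slides: the `partitions` data and the `local_count` and `merge` helpers are hypothetical stand-ins for rows stored on separate nodes.

```python
from collections import Counter

# Hypothetical partitions: each inner list stands in for rows stored on one node.
partitions = [
    ["retail", "telco", "retail"],
    ["utility", "retail"],
    ["telco", "telco", "utility"],
]


def local_count(rows):
    """Runs where the data lives and returns a small summary, not the rows."""
    return Counter(rows)


def merge(summaries):
    """Combines the per-node summaries on a coordinator."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return total


# Only the summaries cross the (imaginary) network, never the raw rows.
summaries = [local_count(rows) for rows in partitions]
totals = merge(summaries)
print(dict(totals))
```

In the centralized model the raw rows would be copied to one analytics server first; here the amount of data moved grows with the number of distinct keys rather than with the number of rows.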

Right Data Methods For Right Data Structure

[Figure labels: Unstructured Multi-format Data; Emerging Technologies; Analytical Paradigms; Structured Data; Relational Database; EXALYTICS.]

*Other brands and names are the property of their respective owners.

Technology driving Big Data innovation

Intel Role in Big Data Era
- Distribute analytics to the edge sensors/devices and drive a standards based connected, managed and secure architecture
- Accelerate big data analytics through faster and more effective CPU, storage, I/O and network architectures
- Drive innovation in big data applications by providing optimized software stacks and services
- Foster the growth of big data through partner collaboration, focused on usage model examples and reference deployment architectures
- Invest in solution research and academia collaboration

Choice of Compute Platforms Optimized for Big Data

In Memory Analytics are Game Changing

"20 node VoltDB system can do what a 1000 node Hadoop cluster can do." Michael Stonebraker

Architecting for In Memory Model: HANA, VoltDB, Objectivity GraphDB, TimesTen In-Memory Database, Business Intelligence Enterprise Edition, SolidDB.

[Figure: Low Cost Memory Technology. Memory cost in $/TB for Q4 2010 (DRAM), Q4 2016 (DRAM) and 2016 (CR); 20x reduction.]

[Figure: SAP HANA* Scalability. Running time (s) against socket count (1, 2, 4, 8; 8S glueless), customer workload compared with ideal. Near-perfect scaling on Intel Xeon processor E7 family.]

Near Real-time Insight Enabled by In-Memory Solutions

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Source: http://www.intel.com/content/www/us/en/highperformance-computing/high-performance-computing-xeon-e7-analyze-business-as-it-happens-with-sap-hana-software-brief.html

Big Data Transforming Storage

Storage Models evolving for Big Data

Traditional Storage Management (VMs on separate compute, storage and network tiers)
- Designed for structured data
- Longer time to deployment
- Restricted to single site
- Forklift add of new discrete storage for capacity

Distributed Storage Architecture (storage client; metadata servers providing metadata services; storage servers providing storage services)
- Designed for unstructured data growth
- Faster time to deployment
- Multiple, distributed locations managed as a single device
- Scale capacity & performance by adding nodes
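The split between metadata services and storage services can be sketched as follows. This is a toy in-memory model under stated assumptions, not a description of any Intel or vendor product: the `MetadataService` class, its methods and the hash-based placement are all hypothetical, and replication and rebalancing are deliberately left out.

```python
import hashlib


class MetadataService:
    """Hypothetical metadata server: records which storage node holds each object."""

    def __init__(self, node_names):
        # Each storage node is modeled as a plain dict of key -> bytes.
        self.nodes = {name: {} for name in node_names}
        self.locations = {}

    def add_node(self, name):
        """Scale out: new objects may now land on the added node."""
        self.nodes[name] = {}

    def _pick_node(self, key):
        # Assumed placement rule: hash the key and pick a node by modulo.
        names = sorted(self.nodes)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return names[int(digest, 16) % len(names)]

    def put(self, key, data):
        # Existing objects stay where they are; new objects get a fresh placement.
        node = self.locations.get(key) or self._pick_node(key)
        self.nodes[node][key] = data
        self.locations[key] = node
        return node

    def get(self, key):
        # The client asks the metadata service where the object is, then reads it.
        return self.nodes[self.locations[key]][key]


cluster = MetadataService(["node-1", "node-2"])
cluster.put("report.csv", b"rows")
cluster.add_node("node-3")  # capacity grows without touching stored objects
cluster.put("sensor.log", b"events")
print(cluster.get("report.csv"), sorted(cluster.nodes))
```

Because clients always go through the location map, the set of nodes looks like a single device to them, and adding a node does not invalidate anything already stored.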

Big Data Visibly Mobile

[Figure labels: Performance; Responsiveness; Insight & Productivity; Work Station Performance For Right Deep Model Generation for Analytics Processes; Collaboration; Secure Media, Data, & Assets; Visibly Mobile Data Productivity.]

Flexible End Point Solutions with client application support that allow fast and efficient data modeling, scoring and direct data access from any location.

Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain computer system software enabled for it.

Building On the EcoSystem

Database and compute infrastructure:
- Relational
- Analytics engines (VOLTDB, EXALYTICS)
- Nonrelational

No matter the choice, all optimized, some exclusively, on Xeon.

Intel's contribution to Open Source
- UPSTREAM (Code, Capital): enable open source operating environments to run best on Intel architecture
- DOWNSTREAM (Alliances, Foundations, OEM, Service Provider, Enterprise): foster open source ecosystems and develop new markets for Intel and its partners

Intel HiTune: The Hadoop performance analyzer
- Users develop their applications based on the MapReduce model
- The Hadoop framework dynamically maps it to the underlying cluster
- HiTune automatically instruments Hadoop tasks (at binary level) to collect runtime information
  - Low overheads (<2%)
  - No source code changes
- Various runtime information
  - JVM information
  - System statistics
  - Hadoop log information

See the Intel paper "HiTune: Dataflow-Based Performance Analysis for Big Data Cloud" in the 2011 USENIX Annual Technical Conference.
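For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It does not use the Hadoop API and is unrelated to how HiTune instruments tasks; the function names `map_phase`, `shuffle` and `reduce_phase` and the sample `lines` are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


lines = ["big data is here", "big data is growing"]

# On a real cluster the map calls run in parallel on the nodes holding the data.
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

The user writes only the map and reduce functions; scheduling, grouping and data movement belong to the framework, which is the layer a runtime analyzer would observe.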

Driving Big Data Usages & Requirements

Vertical Deployments & Lab Innovations
- Telco, Retail, Science, Mfg, Finance, Healthcare
- Science and Technology Centers for Big Data
- Drive field usage models and cutting edge enhancements

Open Standards
- Intel Cloud Builders Ref Architectures & Adoption
- Big Data Security Working Group
- Hadoop Enhancements
- Define and prioritize IT requirements & accelerate industry standards

Ecosystem Contributions & Distro Innovation
- Benchmarking
- ISV/OEM Designs
- Craft enterprise ready software contribution for OEM/ISV to build solutions

Work with Industry Partners to identify and deliver usage examples and reference architectures for a variety of Big Data solutions.

Summary
1. Big Data is here and growing rapidly
2. Intel is well positioned from software stack and platform basis
3. Intel is committed to investing in new technology to address more demanding big data requirements of the future

Want more information?
- hadoop.intel.com: Learn how to deploy Hadoop. Downloads, tutorials, deployment guides.
- www.intel.com/bigdata: Information for IT managers. Case studies, analyst reviews & complementary research.

Thank You!