1 How do we handle challenges in the field of Big Data? Jan Östling, Enterprise Technologies, Intel Corporation, NER
2 Legal Disclaimers
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to:
Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit
No computer system can provide absolute security under all conditions. Intel Trusted Execution Technology (Intel TXT) requires a computer system with Intel Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit
Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of others. Copyright 2013, Intel Corporation. All rights reserved.
3 Agenda
Big Data trends and opportunities
Evolution of Data Management & Analytics
Intel provides foundation for Big Data
Intel Compute Platforms Optimized for Big Data
Intel Storage & Network Technology
Intel Software Optimization
Summary
4 Big Data Trends
Billions of connected users sharing: Skype 663m, Facebook 629m, Hotmail 364m, Yahoo 273m; 5.3 bn cell phones.
>1500 exabytes of cloud traffic; 1400 exabytes of new integrated systems data; 690% growth in storage capacity by 2015.
[Figure: data volume over time for corporate data, Big Corp data, Big Web data and Big Sensed data, spanning structured and unstructured data.]
[Figure: business value versus complexity, with stages labelled reporting, monitoring, analysis and prediction. What insights can we derive?]
IT survey, "Are you looking at Big Data?": yes 75%; no, but on radar 20%; no 5%. How are you approaching the opportunity? Source: Intel
5 What Are Enterprises Doing with Big Data?
From experts:
Forbes, 2011: Only business model Tech has left.
The Economist, 2010: Data are becoming the new raw material of business: an economic input almost on a par with capital and labor.
Gartner, 2010: Information will be the oil of the 21st century.
McKinsey, 2010: Retail: increase margins 60%. Manufacturing: 50% decrease in production costs. Cellular: $150B to providers. Public sector: $250B growth.
From customers:
Retail: real-time social trend analysis to identify the hottest products to offer.
Financial services: real-time fraud detection, prevention and recovery.
Provider billing: real-time access to subscriber billing records to offer new services and prevent customer churn.
Smart city: predictive traffic forecasting.
Telco: new customer segmentation for real-time campaigns.
Utility: load-balance energy grids through real-time monitoring of customer energy usage.
6 Evolution to Big Data Processing
Columns: date; paradigm; processing style; form factor.
90s: data reporting and mining, high cost, departmental use; batch (e.g. sales reports), sequential SQL queries (e.g. retrieve sales reports); RDBMS scale.
2000s: model-based discovery, high cost, departmental use; batch (e.g. correlated buying patterns), No SQL, parallel analysis, shared disk/memory; proprietary MPP / DW appliance.
Today: arrival of vast amounts of unstructured data, low cost, enterprise use; near real-time (e.g. recommendation engine), built-in data replication and reliability, shared nothing, in memory; open source software loosely coupled on standards-based hardware, unlimited linear scale through distributed node addition, in-memory analytics.
Future: real-world modeling; real-time predictive analytics; HPC simulation; machine learning.
7 What Is Different about Big Data?
Traditional data analysis: transaction, relational database, batch, data warehouse, analyze; SQL.
Big data analysis: structured, unstructured and streaming data from devices; node cluster; organize, analyze; MapReduce, R, Hive.
Volume: gigabytes to terabytes (traditional); petabytes and beyond (big data).
Velocity: batch (traditional); CEP and real-time data analytics (big data).
Variety: centralized, data moves to analytics (traditional); distributed, analytics moves to data (big data).
Value: reactive, query, reporting, proprietary (traditional); predictive analytics, machine learning, graph algorithms, statistical modeling (big data).
Big Data augments traditional Business Intelligence.
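The contrast between "data moves to analytics" and "analytics moves to data" can be illustrated with a small sketch. It is not taken from the slides: the node names, the sample values and the function names are all hypothetical. It computes the same mean two ways, once by shipping every record to a central place and once by shipping only a small (sum, count) summary from each node.

```python
# Hypothetical sample data: values held locally on three nodes.
partitions = {
    "node-1": [12.0, 15.5, 9.5],
    "node-2": [20.0, 18.0],
    "node-3": [7.0, 11.0, 13.0, 14.0],
}


def centralized_mean(parts):
    """Data moves to analytics: copy every record to one place, then average."""
    moved = [value for values in parts.values() for value in values]
    return sum(moved) / len(moved), len(moved)


def local_summary(values):
    """Runs where the data lives and returns only a (sum, count) pair."""
    return sum(values), len(values)


def distributed_mean(parts):
    """Analytics moves to data: combine one small summary per node."""
    summaries = [local_summary(values) for values in parts.values()]
    total = sum(s for s, _ in summaries)
    count = sum(c for _, c in summaries)
    return total / count, len(summaries)


central_result, records_shipped = centralized_mean(partitions)
distributed_result, summaries_shipped = distributed_mean(partitions)
```

Both approaches return the same mean, but the second ships one summary per node rather than one item per record, which is the property that matters as volume grows.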
8 Right Data Methods for the Right Data Structure
[Figure labels: unstructured multi-format data, emerging technologies, analytical paradigms; structured data, relational database; Exalytics.]
*Other brands and names are the property of their respective owners.
9 Technology driving Big Data innovation
10 Intel's Role in the Big Data Era
Distribute analytics to the edge sensors/devices and drive a standards-based, connected, managed and secure architecture.
Accelerate big data analytics through faster and more effective CPU, storage, I/O and network architectures.
Drive innovation in big data applications by providing optimized software stacks and services.
Foster the growth of big data through partner collaboration, focused on usage model examples and reference deployment architectures.
Invest in solution research and academia collaboration.
11 Choice of Compute Platforms Optimized for Big Data
12 In-Memory Analytics Are Game Changing
"20 node VoltDB system can do what a 1000 node Hadoop cluster can do." Michael Stonebraker, Architecting for In Memory Model
In-memory products shown: SAP HANA*, VoltDB, Objectivity GraphDB, TimesTen In-Memory Database, Business Intelligence Enterprise Edition, SolidDB.
[Figure labels: running time (s); scalability of a customer workload versus socket count, compared with ideal, up to 8S glueless. Near-perfect scaling on Intel Xeon processor E7 family.]
[Figure: low cost memory technology, $/TB by quarter (DRAM).]
Near real-time insight enabled by in-memory solutions.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Source:
13 Big Data Transforming Storage
14 Storage Models Evolving for Big Data
Traditional storage management (VMs on separate compute, storage and network): designed for structured data; longer time to deployment; restricted to single site; forklift add of new discrete storage for capacity.
Distributed storage architecture (storage client, metadata servers providing metadata services, storage servers providing storage services): designed for unstructured data growth; faster time to deployment; multiple, distributed locations managed as a single device; scale capacity and performance by adding nodes.
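As a rough illustration of the distributed model on this slide, the sketch below separates a metadata service (which knows where each object lives) from storage nodes (which hold the data), and shows capacity growing when a node is added. Everything here is an assumption made for illustration: the class names, the "most free space" placement rule and the sizes do not come from the slides or from any particular storage product.

```python
class StorageNode:
    """A hypothetical storage server with a fixed capacity in GB."""

    def __init__(self, name, capacity_gb):
        self.name = name
        self.capacity_gb = capacity_gb
        self.objects = {}  # object id -> size in GB

    @property
    def free_gb(self):
        return self.capacity_gb - sum(self.objects.values())


class MetadataService:
    """Tracks which node holds each object; clients ask it where to read or write."""

    def __init__(self):
        self.nodes = {}
        self.locations = {}  # object id -> node name

    def add_node(self, node):
        # Scaling out: a new node simply joins the pool.
        self.nodes[node.name] = node

    def total_capacity_gb(self):
        return sum(node.capacity_gb for node in self.nodes.values())

    def put(self, object_id, size_gb):
        # Assumed placement rule: pick the node with the most free space.
        candidates = [n for n in self.nodes.values() if n.free_gb >= size_gb]
        if not candidates:
            raise RuntimeError("no storage node has enough free space")
        target = max(candidates, key=lambda n: n.free_gb)
        target.objects[object_id] = size_gb
        self.locations[object_id] = target.name
        return target.name

    def locate(self, object_id):
        return self.locations[object_id]


meta = MetadataService()
meta.add_node(StorageNode("node-a", 100))
meta.add_node(StorageNode("node-b", 100))
first = meta.put("logs-1", 60)
second = meta.put("logs-2", 60)
meta.add_node(StorageNode("node-c", 100))  # add capacity without touching existing nodes
third = meta.put("logs-3", 60)
```

Without the third node, the last write would fail because neither existing node has 60 GB free; adding a node is enough to make it succeed, while clients still see a single namespace through the metadata service.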
15 Big Data Visibly Mobile
[Figure labels: performance, responsiveness, insight and productivity; workstation performance for deep model generation for analytics processes; collaboration; secure media, data and assets; visibly mobile data productivity.]
Flexible end point solutions with client application support that allow fast and efficient data modeling, scoring and direct data access from any location.
Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain computer system software enabled for it.
16 Building on the Ecosystem
[Figure: database and compute infrastructure grouped as relational, analytics engines and nonrelational; names shown include VoltDB and Exalytics.]
No matter the choice, all optimized, some exclusively, on Xeon.
17 Intel's Contribution to Open Source
Upstream (code, capital): enable open source operating environments to run best on Intel architecture.
Downstream (alliances, foundations; OEM, service provider, enterprise): foster open source ecosystems and develop new markets for Intel and its partners.
18 Intel HiTune: The Hadoop Performance Analyzer
Users develop their applications based on the MapReduce model; the Hadoop framework dynamically maps them to the underlying cluster.
HiTune automatically instruments Hadoop tasks (at binary level) to collect runtime information: low overhead (<2%); no source code changes.
Various runtime information is collected: JVM information, system statistics, Hadoop log information.
See the Intel paper "HiTune: Dataflow-Based Performance Analysis for Big Data Cloud", 2011 USENIX Annual Technical Conference.
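The two ideas on this slide, a job written against the MapReduce model and runtime information collected without touching the job's own code, can be sketched in a few lines. This is only an analogy and not how HiTune works: HiTune instruments Hadoop tasks at binary level, whereas this sketch wraps plain Python functions with a decorator. The names `instrumented`, `task_log`, `run_job`, `map_words` and `reduce_counts` are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

task_log = []  # hypothetical in-memory store for runtime records


def instrumented(stage):
    """Wrap a task function so each call records its stage and duration."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            task_log.append({
                "stage": stage,
                "task": fn.__name__,
                "seconds": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(word, counts):
    """Reduce step: sum the counts emitted for one word."""
    return word, sum(counts)


def run_job(lines, mapper, reducer):
    """Toy single-process runner: map, group by key (shuffle), then reduce."""
    # The user's functions are wrapped here, so their source stays unchanged.
    mapper = instrumented("map")(mapper)
    reducer = instrumented("reduce")(reducer)
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


counts = run_job(["big data is big", "data moves fast"], map_words, reduce_counts)
```

After the run, `counts` holds the word totals and `task_log` holds one timing record per map or reduce call, collected by the runner rather than by the user's functions.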
19 Driving Big Data Usages & Requirements
Vertical deployments & lab innovations (telco, retail, science, manufacturing, finance, healthcare; Science and Technology Centers for Big Data): drive field usage models and cutting-edge enhancements.
Open standards (Intel Cloud Builders reference architectures & adoption; Big Data Security Working Group; Hadoop enhancements): define and prioritize IT requirements and accelerate industry standards.
Ecosystem contributions & distro innovation (benchmarking; ISV/OEM designs): craft enterprise-ready software contributions for OEMs/ISVs to build solutions.
Work with industry partners to identify and deliver usage examples and reference architectures for a variety of Big Data solutions.
20 Summary
1 Big Data is here and growing rapidly
2 Intel is well positioned from software stack and platform basis
3 Intel is committed to investing in new technology to address more demanding big data requirements of the future
21 Want more information? hadoop.intel.com
Learn how to deploy Hadoop: downloads, tutorials, deployment guides.
Information for IT managers: case studies, analyst reviews and complementary research.