Setting the Direction for Big Data Benchmark Standards
1 Setting the Direction for Big Data Benchmark Standards
- Chaitan Baru, Center for Large-scale Data Systems research (CLDS), San Diego Supercomputer Center, UC San Diego
- Milind Bhandarkar, Greenplum
- Raghunath Nambiar, Cisco
- Meikel Poess, Oracle
- Tilmann Rabl, U. Toronto
2 Characterizing the application environment
- Enormous data sizes (volume), high data rates (velocity), variety of data genres
- Datacenter issues: large-scale and evolving system configurations, shifting loads, heterogeneous technologies
- Data genres: structured, unstructured, graphs, streams, images, scientific data, etc.
- Software options: SQL, NoSQL, Hadoop ecosystem
- Hardware options: HDD vs SSD (NVM); different types of HDD, NVM, and main memory; large-memory systems; etc.
- Platform options: dedicated commodity clusters vs shared cloud platforms
3 Industry's first workshop on big data benchmarking
Acknowledgements
- National Science Foundation (Grant# IIS )
- SDSC industry sponsors: Seagate, Greenplum, NetApp, Mellanox, Brocade (event host)
WBDB2012 steering committee
- Chaitan Baru, SDSC/UC San Diego
- Milind Bhandarkar, Greenplum/EMC
- Raghunath Nambiar, Cisco
- Meikel Poess, Oracle
- Tilmann Rabl, U of Toronto
4 Invited Attendee Organizations
Actian, AMD, BMMsoft, Brocade, CA Labs, Cisco, Cloudera, Convey Computer, CWI/Monet, Dell, EPFL, Facebook, Google, Greenplum, Hewlett-Packard, Hortonworks, Indiana Univ / Hathitrust Research Foundation, InfoSizing, Intel, LinkedIn, MapR/Mahout, Mellanox, Microsoft, NSF, NetApp, NetApp/OpenSFS, Oracle, Red Hat, San Diego Supercomputer Center, SAS, Scripps Research Institute, Seagate, Shell, SNIA, Teradata Corporation, Twitter, UC Irvine, Univ. of Minnesota, Univ. of Toronto, Univ. of Washington, VMware, WhamCloud, Yahoo!
Meeting structure: 6 x 15-minute invited presentations; 35 x 5-minute lightning talks. Afternoons: structured discussion sessions
5 Topics of discussion
- Audience: Who is the audience for this benchmark?
- Application: What application should we model?
- Single benchmark spec: Is it possible to develop a single benchmark to capture characteristics of multiple applications?
- Component vs. end-to-end benchmark: Is it possible to factor out a set of benchmark components, which can be isolated and plugged into one or more end-to-end benchmarks?
- Paper and pencil vs. implementation-based: Should the implementation be specification-driven or implementation-driven?
- Reuse: Can we reuse existing benchmarks?
- Benchmark data: Where do we get the data from?
- Innovation or competition: Should the benchmark be for innovation or competition?
6 Audience: Who is the primary audience for a big data benchmark?
- Customers
  - Workload should preferably be expressed in English
  - Or in a declarative language (unsophisticated user)
  - But not in a procedural language (sophisticated user)
  - Want to compare among different vendors
- Vendors
  - Would like to sell machines/systems based on benchmarks
- Computer science/hardware research is also an audience
  - Niche players and technologies will emerge out of academia
  - Will be useful to train students on specific benchmarking
7 Applications: What application should we model?
- An application that somebody could donate?
- An application based on empirical data?
  - Examples from scientific applications
- A multi-channel retailer-based application, like the amended TPC-DS for Big Data?
  - A mature schema, large-scale data generator, execution rules, and audit process already exist
- An abstraction of an Internet-scale application, e.g. data management at the Facebook site, with synthetic data
8 Single Benchmark vs Multiple
- Is it possible to develop a single benchmark to represent multiple applications?
- Yes, but not desired if there is no synergy between the applications, e.g. at the data model level
- A synthetic Facebook application might provide context for a single benchmark
  - Click streams, data sorting/indexing, weblog processing, graph traversals, image/video data
9 Component benchmark vs. end-to-end benchmark
- Are there components that can be isolated and plugged into an end-to-end benchmark?
- The benchmark should consist of individual components that ultimately make up an end-to-end benchmark
- The benchmark should include a component that extracts large data
  - Many data science applications extract large data and then visualize the output
  - Opportunity for pushing down visualization into the data management system
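The idea of isolated components that plug into an end-to-end benchmark can be sketched as below. This is only an illustration under stated assumptions: the component names (`generate`, `sort_data`, `extract_large`) and the `run_end_to_end` helper are hypothetical and do not come from the workshop. Each component is timed on its own, and the end-to-end figure is the sum of the component times.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# A component is a (name, function) pair. All names here are hypothetical.
Component = Tuple[str, Callable[[List[int]], List[int]]]


def generate(_: List[int]) -> List[int]:
    """Stand-in data generation component: a fixed permutation of 0..999."""
    return [(i * 7919) % 1000 for i in range(1000)]


def sort_data(data: List[int]) -> List[int]:
    """Stand-in sorting component."""
    return sorted(data)


def extract_large(data: List[int]) -> List[int]:
    """Stand-in extraction component: keep the largest values."""
    return [x for x in data if x >= 900]


PIPELINE: List[Component] = [
    ("generate", generate),
    ("sort", sort_data),
    ("extract", extract_large),
]


def run_end_to_end(
    components: List[Component], data: Optional[List[int]] = None
) -> Tuple[List[int], Dict[str, float]]:
    """Run components in order, timing each one individually."""
    timings: Dict[str, float] = {}
    current = data if data is not None else []
    for name, step in components:
        start = time.perf_counter()
        current = step(current)
        timings[name] = time.perf_counter() - start
    # The end-to-end time is the sum of the individual component times.
    timings["end_to_end"] = sum(timings.values())
    return current, timings


if __name__ == "__main__":
    result, timings = run_end_to_end(PIPELINE)
    print(len(result), sorted(timings))
```

Because every component shares one signature, a single component can also be benchmarked in isolation by passing a one-element list, e.g. `run_end_to_end([("sort", sort_data)], [3, 1, 2])`.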
10 Paper and Pencil / Specification-driven versus Implementation-driven
- Start with an implementation and develop the specification at the same time
- Some post-workshop activity has begun in this area
  - Data generation; sorting; some processing
11 Where do we get the data from?
- Downloading data is not an option
- Data needs to be generated (quickly)
- Examples of actual datasets from scientific applications
  - Observational data (e.g. LSST), simulation outputs
- Using existing data generators (TPC-DS, TPC-H)
- Data that is generic enough with good characteristics is better than specific data
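One common way to generate data quickly is to make every row a pure function of a seed and its row number, so that partitions can be produced independently and in parallel yet always yield the same dataset. The sketch below illustrates that idea only; the schema (`order_id`, `category`, `quantity`), the `CATEGORIES` domain, and the function names are hypothetical and are not taken from TPC-DS, TPC-H, or any existing generator.

```python
import hashlib
import random
from typing import Dict, List

CATEGORIES = ["books", "music", "toys", "garden"]  # hypothetical value domain


def row_rng(seed: int, row_id: int) -> random.Random:
    """Derive an independent random generator for a single row."""
    digest = hashlib.sha256(f"{seed}:{row_id}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def generate_row(seed: int, row_id: int) -> Dict[str, object]:
    """Generate one synthetic row; the result depends only on (seed, row_id)."""
    rng = row_rng(seed, row_id)
    return {
        "order_id": row_id,
        "category": rng.choice(CATEGORIES),
        "quantity": rng.randint(1, 10),
    }


def generate_partition(seed: int, start: int, stop: int) -> List[Dict[str, object]]:
    """Generate rows [start, stop); any worker can produce any partition."""
    return [generate_row(seed, i) for i in range(start, stop)]


if __name__ == "__main__":
    whole = generate_partition(42, 0, 10)
    halves = generate_partition(42, 0, 5) + generate_partition(42, 5, 10)
    print(whole == halves)
```

Since no row depends on any other row, the scale factor is simply the number of row ids requested, and the same seed reproduces the same data at every scale.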
12 Should the benchmark be for innovation or competition?
- Innovation and competition are not mutually exclusive
- The benchmark should be used for both
- The benchmark should be designed for competition; such a benchmark will then also be used internally for innovation
- TPC-H is a prime example of a benchmark model that could drive competition and innovation (if combined correctly)
13 Can we reuse existing benchmarks?
- Yes, we could, but we need to discuss:
  - How much augmentation is necessary?
  - Can the benchmark data be scaled?
  - If the benchmark uses SQL, we should not require it
- Examples (none of the following could be used unmodified):
  - Statistical Workload Injector for MapReduce (SWIM)
  - GridMix3 (lots of shortcomings)
  - Open source TPC-DS
  - YCSB++ (lots of shortcomings)
  - Terasort: strong sentiment for using this, perhaps as part of an end-to-end scenario
TPC-DS (Ahmad Ghazal, Aster)
- Decision support benchmark from the Transaction Processing Performance Council
  - http://www.tpc.org/tpcds/spec/tpcds_1.1.0.pdf
- Why build on top of TPC-DS?
  - Volume: no theoretical limit; tested up to 100 TB
  - Velocity: rolling updates
  - Variety: easy to add the other two sources
14 Keep in mind principles for good benchmark design
- Self-scaling, e.g. TPC-C
- Comparability between scale factors
  - Results should be comparable at different scales
- Technology agnostic (if meaningful to the application)
- Simple to run
TPC (Reza Taheri, VMware)
+ Longevity: TPC-C has carried the load for 20 years
+ Comparability: audit requirements and strict, detailed run rules mean one can compare results published by two different entities
+ Scaling: results are just as meaningful at the high end of the market as at the low end; as relevant on clusters as on single servers
- Hard and expensive to run
- No kit
- DeWitt clauses
15 Extrapolating Results
- TPC benchmarks typically run on over-specified systems
  - i.e. customer installations may have less hardware than the benchmark installation (SUT)
- Big data benchmarking may be the opposite
  - May need to run the benchmark on systems that are smaller than customer installations
- Can we extrapolate?
  - Scaling may be piecewise linear. Need to find those inflexion points
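A minimal sketch of what finding inflexion points in piecewise-linear scaling could look like: compute the slope between consecutive measurements and flag the sizes at which the slope changes by more than a tolerance. The measurements, the `rel_tol` threshold, and the function names are all hypothetical; a real analysis would need noise handling and more data points.

```python
from typing import List, Sequence


def segment_slopes(sizes: Sequence[float], runtimes: Sequence[float]) -> List[float]:
    """Slope of the runtime curve between each pair of adjacent measurements."""
    return [
        (runtimes[i + 1] - runtimes[i]) / (sizes[i + 1] - sizes[i])
        for i in range(len(sizes) - 1)
    ]


def find_inflexion_points(
    sizes: Sequence[float], runtimes: Sequence[float], rel_tol: float = 0.25
) -> List[float]:
    """Return the sizes at which the slope changes by more than rel_tol."""
    slopes = segment_slopes(sizes, runtimes)
    points = []
    for i in range(1, len(slopes)):
        prev, cur = slopes[i - 1], slopes[i]
        if abs(cur - prev) > rel_tol * max(abs(prev), abs(cur)):
            points.append(sizes[i])
    return points


def extrapolate(
    sizes: Sequence[float], runtimes: Sequence[float], target: float
) -> float:
    """Extend the last linear segment to a larger target size."""
    last_slope = segment_slopes(sizes, runtimes)[-1]
    return runtimes[-1] + last_slope * (target - sizes[-1])


if __name__ == "__main__":
    # Hypothetical measurements: data size in TB, runtime in minutes.
    sizes = [1, 2, 4, 8, 16]
    runtimes = [10, 20, 40, 100, 220]
    print(find_inflexion_points(sizes, runtimes))  # [4]
    print(extrapolate(sizes, runtimes, 32))        # 460.0
```

Extrapolating from the last segment is only defensible if no further inflexion point lies between the largest measured size and the target, which is exactly the uncertainty the slide raises.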
16 Interest in results from the Workshop
- June 9, 2012: Summary of Workshop on Big Data Benchmarking, C. Baru, Workshop on Architectures and Systems for Big Data, Portland, OR.
- June 22-23, 2012: Towards Industry Standard Benchmarks for Big Data, M. Bhandarkar, The Extremely Large Databases Conference at Asia.
- August 27-31, 2012: Setting the Direction for Big Data Benchmark Standards, Baru, Bhandarkar, Nambiar, Poess, Rabl, 4th Int'l TPC Technology Conference (TPCTC 2012), with VLDB 2012, Istanbul, Turkey.
- September 10-13, 2012: Introducing the Big Data Benchmarking Community, Lightning Talk and Poster at the 6th Extremely Large Database Conference (XLDB), Stanford Univ, Palo Alto.
- September 20: SNIA Analytics and Big Data Summit, Santa Clara, CA.
- September 21: Workshop on Managing Big Data Systems, San Jose, CA.
17 Big Data Benchmarking Community (BDBC)
- Biweekly phone conferences
  - Group communication and coordination are facilitated via phone calls every two weeks
  - See for call-in information
- Contact or to join BDBC mailing list
18 Big Data Benchmarking Community Participants
Actian, AMD, Argonne National Labs, BMMsoft, Brocade, CA Labs, Cisco, Cloudera, CMU, Convey Computer, CWI/Monet, DataStax, Dell, EPFL, Facebook, GaTech, Google, Greenplum, Hortonworks, Hewlett-Packard, IBM, IndianaU/HTRF, InfoSizing, Intel, Johns Hopkins U., LinkedIn, MapR/Mahout, Mellanox, Microsoft, NetApp, Netflix, NIST, NSF, OpenSFS, Oracle, Ohio State U., Paypal, PNNL, Red Hat, San Diego Supercomputer Center, SAS, Seagate, Shell, SLAC, SNIA, Teradata, Twitter, Univ. of Minnesota, UC Berkeley, UC Irvine, UC San Diego, Univ of Passau, Univ of Toronto, Univ. of Washington, VMware, WhamCloud, Yahoo!
19 2nd Workshop on Big Data Benchmarking: India
- December 17-18, 2012, Pune, India
- Hosted by Persistent Systems, India
- Themes: Application Scenarios/Use Cases; Benchmark Process; Benchmark Metrics; Data Generation
- Format: Invited speakers; Lightning talks; Demos
- We welcome sponsorship
- Current Sponsors: NSF, Persistent Systems, Seagate, Greenplum, Brocade, Mellanox
20 3rd Workshop on Big Data Benchmarking: China
- June 2013, Xi'an, China
- Hosted by Shanxi Supercomputing Center
- Current Sponsors: IBM, Seagate, Greenplum, Mellanox, Brocade?
21 The Center for Large-scale Data Systems Research: CLDS
- At the San Diego Supercomputer Center, UC San Diego
- R&D center and forum for discussions on big data-related topics
  - Benchmarking; Project on Data Dynamics; Information Value; focus on industry segments, e.g. healthcare
- Center structure: Workshops; Program-level activities; Project-level activities
- For sponsorship opportunities, contact: Chaitan Baru, SDSC
Thank you!
