Big Data Leadership Team


2 Big Data Leadership Team
Chris Ward, Principal Consultant
- 20 years in management consulting and executive leadership
- Expertise in retail, marketing, hospitality and financial services
- Prior consulting experience with Opera Solutions and The Boston Consulting Group
- BA from Princeton University, MBA from the University of Virginia Darden School of Business
James Bigger, Principal Consultant
- 20 years of management consulting and entrepreneurial experience
- Expertise in financial services, insurance and telecom
- Prior consulting experience with Opera Solutions and A. T. Kearney
- Ph.D. in Physics from Oxford University
Brian Vaughan, Principal Consultant
- 15 years of management consulting, analytics and software experience
- Expertise in healthcare and insurance
- Prior experience with Opera Solutions, Mitchell Madison Group and Broadlane
- Ph.D. in Physics from Stanford University
Prem Jain, Principal Consultant
- 20 years of technology experience in enterprise datacenter technologies
- Has built innovative solutions in Big Data, storage, HPC, virtualization, data migration and enterprise applications
- Formerly at NetApp, where he was the lead architect for Big Data and FlexPod solutions
Matt DuBell, Principal Engineer
- 20 years of experience in a range of IT and security disciplines
- Responsible for deploying large, secure, Hadoop-based platforms for the U.S. Govt.
- 10 years of international experience implementing networking and virtual data center environments
- Undergraduate degree from AIU

3 Big Data Team
Jason Lu, Chief Scientist
- Eighteen years of analytics and software development experience
- Expertise in financial services, healthcare, insurance, retail and marketing science
- Prior analytics development experience at Opera Solutions, FICO and J.D. Power and Associates
- Ph.D. in Physics from Stanford University
Yoni Malchi, Consulting Manager
- Worked as an Engagement Manager for predictive analytics consulting engagements
- Experience in both the Financial Services and Telecommunications industries, bridging the gap between the business and data scientists
- PhD in Mech. Eng. in 2007 and worked in the Aerospace industry for 4 years
Jamie Milne, Consulting Manager
- Over 7 years of management consulting and entrepreneurial experience
- Expertise in financial services, travel, and retail sectors across US and Europe
- Led Big Data strategy and analytical engagements at Opera Solutions
- MSci in Astrophysics from the University of Cambridge
Chris Infanti, Consulting Manager
- 8+ years of experience in big data analytics consulting
- Experience in business development and delivery of analytics projects in the education, wealth management, public safety, corporate security, online subscription, transportation, and retail sectors
- B.S. in Mathematics, B.A. in English Literature from Georgetown University
Virtual Team: BDAs, Analytic Programmers, Storage Specialists, Network Architects, Hadoop Administrators and other professionals
- Many years of experience architecting, deploying and managing compute, storage, network, Hadoop ecosystem and database solutions for Fortune 500 companies, augmenting the expertise of the core Big Data Leadership Team

4 Volume, Variety and Velocity of Data are Exploding
The production of data is expanding at an astonishing rate. Drivers include the switch from analog to digital technologies and the creation of structured and unstructured data by individuals and companies via social media and the Web.
Chart labels: Volume, Variety, Velocity; Enterprise Managed Data, Enterprise Created Data, Unstructured data storage, Structured data storage (axis units ZB and EB)
Every 60 Seconds:
- 98,000+ tweets
- 695,000 status updates
- 11 million instant messages
- 698,445 Google searches
- million+ s sent
- 1,820TB of data created
- new mobile web users
The need to process more data faster to respond to dynamic business trends has brought new requirements for database architectures.
We believe the industry stands at the cusp of the most significant revolution in database and, therefore, application architectures in the past 20 years.

5 Vendor Landscape Is Crowded and Growing
Value chain segments:
- Data Sources & Capture
- IT Infrastructure
- Data Management & Integration
- Analytics Platforms & Solutions
- Analytics Services & Support
Vendor types:
- Data Vendors
- Infrastructure Vendors
- Open Data Platforms
- Vertical Analytics Solutions
- Proprietary Data Platform
- Analytics Service Provider
- Extended Infrastructure + Data Platforms
- System Integrators
- Specialized End-to-End Solutions

6 Key Big Data Technologies
Technologies, from foundational to emerging: Hadoop, NoSQL, Columnar, In-Memory
Hadoop: Distributed File System and Processing Language
- Characteristics: parallel storage/processing; flexible programming model; horizontal scaling; batch processing
- Enablement / Uses: pre-processing of data for analytics; ETL for transforming unstructured data to structured; data summarization
NoSQL: Non-relational Key-Value Database
- Characteristics: fast read/write; real-time query; horizontal scaling; simple programming model; dynamic schema
- Enablement / Uses: real-time ingest; rapid retrieval; input to MapReduce
Columnar: Column-Oriented Database Analytics
- Characteristics: relational; efficient compression; optimized for fast read of many/all records
- Enablement / Uses: On-Line Analytics Processing (OLAP); data storage and retrieval for advanced analytics
In-Memory: In-Memory Database and Processing
- Characteristics: relational; random access; extremely fast
- Enablement / Uses: Complex Event Processing; Real Time Analytics; potential to use a common database for transactions and analytics

7 The Big Data Software Stack
The big data ecosystem includes open source and proprietary distributions that span the stack from ingest through analytics.
User/machine workflow: Acquire, Organize, Analyze, Decide
Layers (top to bottom), with properties and options:
- Analytics: Real Time & Batch. Options: OLAP, Natural Language, Custom Analytics
- Access/Queries: Optimized for high-volume reads. Options: Custom APIs, SQL
- Analytics Database: Flexible, Compressed, Fast Read. Options: Columnar, In Memory, Parallel RDBMS
- Transform: Fast, Scalable. Options: MapReduce
- Management: Provisioning, Maintenance
- File System/Database: Parallel, Distributed. Options: HDFS; NoSQL (Document, Key-Value, Wide Column)
- Ingest: Interfaces to accept data. Options: Batch, Streaming
Examples of products:
- R, Python, SQL, Pig, Hive, Hadoop, ZooKeeper, Cassandra, HBase, MongoDB, Sqoop, Flume
- SAS, SPSS, Teradata, Netezza, Greenplum, Vertica, Cloudera, Hortonworks, MapR, PivotalHD, Splunk, Talend
- MicroStrategy, Business Objects, Cognos, Oracle OBIEE Plus
Integrated offerings:
- EMC/Pivotal HD/Greenplum
- HP/Vertica/Cloudera
- Oracle Big Data, Exadata/Exalytics
- IBM InfoSphere BigInsights
- SAP HANA
- Terracotta BigMemory
Legend: Open Source, Commercial, Open Source Solutions
Data: Enterprise Structured, Enterprise Unstructured, 3rd Party, Web/Unstructured (examples: ODS, Data Warehouse, Call Center, Server Logs, Financial, Demographic)

8 Dual Approach to Delivering Big Data Solutions
WWT offers customers both strategic and tactical approaches to derive value from the application of Big Data analytics and technology.
- Big Data Business Impact: extract value from data to drive multiple use cases
- Big Data Technology Optimization: accomplish data tasks faster, cheaper, better
Offerings:
- Strategic Roadmap
- Big Data Strategy
- Use Case Design
- Use Case PoC
- Analytics Development
- Workflow Integration
- Data Warehouse Optimization
- ETL/ELT Offload
- Data Lake Creation
- SAP HANA Implementation
- Big Data Stack Build / Optimization
- Production Support & Sustainment

9 Defining The Opportunity Is The Starting Point
The power of Big Data lies in bringing together data in a timely fashion from sources within and external to the enterprise, structured and unstructured, to create a complete view of critical issues, enabling advanced analytics to unlock key insights that drive significant value.
- Outcome: clearly defined use cases with the potential to deliver significant value by distilling vast data into new, previously unknowable intelligence
- Analytics: advanced machine learning techniques to analyze data and mine for insights to drive critical decisions
- Data: structured or unstructured, internal or external, requiring new methods of storage/integration
- Technology: emerging/new technology stacks using scalable, distributed architectures

10 Mining Company
Project scope:
- 252 trucks
- 200 sensors per truck
- 7 mine sites
- 10,000 readings per second
Disparate data sets:
- Integrating 15+ siloed data sources in multiple file formats
- 10 terabytes of data
- 3-year historical data ecosystem
- Equipment maintenance (SAP)
- Dispatch & operator (Teradata)
- Fuel, oil, analysis, etc. (SQL Server)
- Truck sensor data (OSI PI Server)
Stratifying alarms:
1. Urgent component problem
2. Critical sensor problem
3. Important/not urgent component/sensor problem
4. Not important component/sensor problem
5. Noise: ignore
Urgent component failure models: engine, transmission, differentials, torque converters, final drives
Hadoop infrastructure (fed by the truck data loggers):
- Established Big Data infrastructure
- Migrated and normalized data sets
- Developing visualizations, tools and predictive analytics
- Data/analytics-driven timing for preventative maintenance (e.g. oil changes) on individual trucks
View of machine: sensor data plotted over time
Business impact:
- Higher equipment up-time
- Reduced critical component failure
- Better preventative maintenance
- Increased productivity
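The five alarm levels on this slide can be read as a simple priority ordering. The following is a minimal sketch of that idea only, not the project's actual implementation: the `Alarm` record, the `triage` function and all field names are hypothetical, and the slide does not say how alarms are assigned to a level in the first place.

```python
from dataclasses import dataclass
from enum import IntEnum


class AlarmLevel(IntEnum):
    """The five strata named on the slide; lower value means more urgent."""
    URGENT_COMPONENT = 1
    CRITICAL_SENSOR = 2
    IMPORTANT_NOT_URGENT = 3
    NOT_IMPORTANT = 4
    NOISE = 5


@dataclass(frozen=True)
class Alarm:
    # Hypothetical record layout, for illustration only.
    truck_id: str
    source: str
    level: AlarmLevel


def triage(alarms):
    """Drop noise and return the remaining alarms, most urgent first.

    sorted() is stable, so alarms of equal level keep their arrival order.
    """
    actionable = [a for a in alarms if a.level != AlarmLevel.NOISE]
    return sorted(actionable, key=lambda a: a.level)
```

A caller would pass in already-classified alarms and work through the returned list from the top; classifying raw sensor readings into levels is the part the predictive models would have to supply.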

11 Data Warehouse Optimization: Value Proposition
Augmenting the Data Warehouse with a less expensive Hadoop system allows companies to free up valuable space on their DW systems to run faster queries and analysis, whilst storing large volumes of their data universe.
Full data universe: web logs, payments, scheduling, CRM, social media, billing
Current (traditional Data Warehouse holding cold, warm and hot data):
1. A significant amount of data is thrown out during the ETL process that may be valuable in the future
2. About 50% of data that is brought into a typical Data Warehouse system is rarely accessed
3. About 80% of the queries and reporting performed on highly-used data do not need to be at DW speeds
Proposed (WWT Hadoop Appliance alongside the traditional Data Warehouse):
1. Utilize additional Hadoop-based storage to store the full data universe
2. Move cold/warm data, ETL workflows, and ELT scripts to Hadoop, taking advantage of lower cost per TB
3. Continue to take advantage of DW agility and speed in real-time analysis and querying
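The hot/warm/cold split described on this slide can be illustrated with a short sketch. Everything specific here is an assumption for illustration: the slide gives no definition of "hot", "warm" or "cold", so the 30-day and 180-day thresholds, the `Dataset` record and the function names are hypothetical, and the rule that only hot data stays in the warehouse follows the slide's "move cold/warm data" statement rather than any documented WWT procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    # Hypothetical catalogue entry, for illustration only.
    name: str
    size_tb: float
    days_since_last_access: int


def tier_for(ds, hot_days=30, warm_days=180):
    """Classify a dataset by recency of access (thresholds are assumptions)."""
    if ds.days_since_last_access <= hot_days:
        return "hot"
    if ds.days_since_last_access <= warm_days:
        return "warm"
    return "cold"


def plan_offload(datasets):
    """Keep hot data in the warehouse; send warm and cold data to Hadoop."""
    plan = {"data warehouse": [], "hadoop": []}
    for ds in datasets:
        target = "data warehouse" if tier_for(ds) == "hot" else "hadoop"
        plan[target].append(ds.name)
    return plan


def freed_tb(datasets):
    """Warehouse capacity (in TB) released by moving warm and cold data out."""
    return sum(ds.size_tb for ds in datasets if tier_for(ds) != "hot")
```

In practice the classification would more likely come from query logs than from a single last-access figure; the sketch only shows how a tiering rule translates into a placement plan and a capacity estimate.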

12 Four Major Big Data Challenges
In our meetings with customers, four issues are consistently brought up as major challenges in creating a big data capability that can effectively support the business units.
Defining the outcome
- What problem/opportunity are we pursuing?
- What is the value that can be created?
Deploying new technologies and combining with existing architecture
- How do we create an effective integrated Big Data stack?
- What new technologies do we need and how do they fit together?
Navigating a crowded and evolving vendor landscape
- How do we separate marketing hype from reality?
- Who should we use? Who can we trust?
Organizing for success
- Where does Big Data fit?
- Who is responsible for data integrity?
- Where do we find the critical resources needed to deliver Big Data solutions?

13 Four Stages Of A Big Data Deployment
Stages: Plan, Design, Pilot, Scale
Tracks: Analytics-Ready Infrastructure, Solution Development
Activities across the stages:
- Develop a roadmap for implementing Big Data
- Use case exploration
- Data Governance, Infrastructure and Analytics ownership
- Define high impact use cases
- Design and test appropriate reference architectures
- Create detailed description of selected pilot use cases
- Analytics Workflow integration
- Test various reference architectures
- Stand up reference architecture
- Design the pilot: success criteria, timeline, scope
- Identify and prepare data
- Build analytical models
- Design workflow
- Implement, manage and monitor
- Implement design changes from pilot learnings
- Invest in software development as necessary to improve UI
- Prepare ETL process for scale
- Build out infrastructure as required to support rollout
WWT Services:
1. Strategic Roadmap: use case definition; organizational alignment; Big Data architecture high level design
2. Big Data Stack Build: detailed design of Big Data architecture and BOM; procure, configure and deploy Big Data stack
3. Proof of Concept: POC design; analytical models; customer data loaded, processed and analyzed
4. Production Support: operationalizing POC; infrastructure sustainment; training; ongoing support
Indicative infrastructure:
- Example starter kit (Big Data Solution Stack): 2 UCS 6296PP; 2 Nexus 2232PP; 16 Cisco UCS C240; EMC Isilon; Software: PivotalHD, Greenplum, etc.
- Example scale-out hardware (multiple expansion racks): 2 Nexus 2232PP Fabric Extenders; 16 Cisco UCS C240; EMC Isilon

14 Advanced Technology Center
A highly collaborative ecosystem to design, build, educate, demo and deploy advanced technology solutions for our customers and partners.
Enterprise Networks
- Next Generation Networking: Nexus (7K, 5K, 3K & 2K)
- Virtual Networking (Nexus 1000v)
- OTV, LISP, Fabric Path
- Layer 2 Extension
- DR/BC Networking
Security
- BYOD (Bring Your Own Device) & Secure Mobility
- Jukebox
- ISE & RSA
- ASA 1000v
- VSG (Virtual Security Gateway)
- Cyber Security Solutions
Collaboration
- Unified Communications
- Tandberg Video
- VXI (View & XenDesktop)
- WebEx, Call Center & Collaboration Solutions
- Phones, Backpacks & Soft Phone Clients
- Telepresence & Business Video
Data Center
- Vblock, FlexPod & CloudSystem Matrix
- EMC & NetApp Storage
- vSphere / XenServer
- vCloud Director
- VDI (View / XenDesktop)
- Cisco CIAC & BMC CLM
- EMC's UIM & Cloupia
- FAST
- MDC (Mobile Data Center) Solutions
Big Data
- Cisco UCS C220, C240
- HP DL380
- Nexus 2200, UCS 6296
- FlexPod Select, Isilon storage
- Cloudera, MapR, PivotalHD
- Cloud Foundry
- Velocidata Appliance
- Next Generation provisioning tools

15 Big Data Environment Set-up: ATC Reference Architectures
Four analytics-ready infrastructure stacks have been developed in the ATC to showcase Big Data technologies.
- Reference Architecture 1: HP, Internal Local Storage (Current)
- Reference Architecture 2: UCS, NetApp Direct Attached Storage (In Process)
- Reference Architecture 3: UCS, Isilon Network Storage (In Process)
- Reference Architecture 4: SAP HANA
Components by layer, across the four architectures:
- Analytics tools: R, Python, Java, MicroStrategy
- Analytics databases: Impala, HBase, HAWQ, SAP HANA
- File system/databases: Hortonworks, Cloudera, MapR, PivotalHD, GemFire
- Ingest: Velocidata, Splunk
- Network: Nexus 2200, Nexus 2232PP, UCS 6296UP
- Compute: HP DL380, UCS C220 M3, UCS C240, UCS B blades, Hitachi
- Storage: JBOD SATA, NetApp E5460, Isilon, Hitachi
- Data: Enterprise Structured, Enterprise Unstructured, 3rd Party, Web/Unstructured (examples: ODS, Data Warehouse, Call Center, Server Logs, Financial, Demographic)

16 How to Leverage ATC Architectures
We use the ATC for a variety of customer and partner use cases, ranging from technology testing to full solution deployment.
Proof of Concept
- Test customer solutions prior to full onsite implementation, e.g. run use case analytical models and architectures on Big Data machines
- Create Big Data hardware/software stack, potentially with client data
Vendor Comparison
- Compare Big Data solutions to provide insight into strengths and weaknesses of each
- Run bake-offs to gauge how well a full solution can be delivered using certain components
Field Demo
- Showcase Big Data capabilities by hosting demos of WWT PoCs and analysis
- Enable virtual access for field engineers to run customer demos
Performance Benchmarking
- Run benchmark tests to measure speed and performance of Big Data technologies, including competing Hadoop distributions and storage options
Technology Evaluation
- Evaluate new technologies in the ATC as they are released, allowing our engineers to get up to speed before working in customer environments
Training
- Hold training courses for customers and partners that allow them to work with Big Data software and hardware in a highly customizable environment that reaches across a variety of vendors

17 WWT Big Data Workshop
What is it?
A full-day interactive session with WWT consultants and Data Scientists designed to increase your understanding of Big Data and help you outline your strategy for using Big Data analytics solutions to add value.
- IDENTIFY clear use cases that can't be identified with the current setup
- DETERMINE which of the use cases can benefit from WWT capabilities
- ESTIMATE use cases' potential impact and ease of implementation
- CHOOSE high-value, actionable use cases
What to expect:
- Highly-Skilled Consultants and Engineers
- Emerging Technology
- Customized Technical and Strategic Whiteboard Session
- Best Practices
- Expert Insight
- Use Cases and Success Stories
Chart: use cases plotted by $ impact against ease of implementation, highlighting the high-value, actionable use case
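The ESTIMATE and CHOOSE steps amount to scoring each candidate use case on two axes, $ impact and ease of implementation, and keeping those that are high on both. A minimal sketch of that selection follows. The `UseCase` record, the 1-to-5 ease scale and both default thresholds are hypothetical; the workshop material does not state how impact or ease are measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    # Hypothetical scoring inputs, for illustration only.
    name: str
    impact_usd: float  # estimated $ impact
    ease: int          # 1 (hard to implement) to 5 (easy), assumed scale


def choose(use_cases, min_impact_usd=1_000_000, min_ease=3):
    """Return use cases that clear both thresholds, highest impact first."""
    shortlisted = [
        u for u in use_cases
        if u.impact_usd >= min_impact_usd and u.ease >= min_ease
    ]
    return sorted(shortlisted, key=lambda u: u.impact_usd, reverse=True)
```

Two separate thresholds mirror the two chart axes; a single weighted score would be an equally reasonable alternative if trade-offs between impact and ease need to be made explicit.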

Delivering Value with Big Data. Copyright 2014 World Wide Technology, Inc. All rights reserved.

Delivering Value with Big Data. Copyright 2014 World Wide Technology, Inc. All rights reserved. Delivering Value with Big Data Copyright 2014 World Wide Technology, Inc. All rights reserved. 0 WWT Big Data Leadership Team James Bigger Principal Consultant Brian Vaughan Principal Consultant Chris

More information

Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas

Big Data. The Big Picture. Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas Big Data The Big Picture Our flexible and efficient Big Data solu9ons open the door to new opportuni9es and new business areas What is Big Data? Big Data gets its name because that s what it is data that

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

Project Por)olio Management

Project Por)olio Management Project Por)olio Management Important markers for IT intensive businesses Rest assured with Infolob s project management methodologies What is Project Por)olio Management? Project Por)olio Management (PPM)

More information

Table of Contents. The Big Data Curveball...3. The Big Data Roadblocks...4. Defining the Business Outcome: Use Cases Drive Infrastructure...

Table of Contents. The Big Data Curveball...3. The Big Data Roadblocks...4. Defining the Business Outcome: Use Cases Drive Infrastructure... Turning Big Data into Business Value A Practical Guide to Big Data Table of Contents The Big Data Curveball...3 The Big Data Roadblocks...4 Defining the Business Outcome: Use Cases Drive Infrastructure...

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

MAXIMIZING THE SUCCESS OF YOUR E-PROCUREMENT TECHNOLOGY INVESTMENT. How to Drive Adop.on, Efficiency, and ROI for the Long Term

MAXIMIZING THE SUCCESS OF YOUR E-PROCUREMENT TECHNOLOGY INVESTMENT. How to Drive Adop.on, Efficiency, and ROI for the Long Term MAXIMIZING THE SUCCESS OF YOUR E-PROCUREMENT TECHNOLOGY INVESTMENT How to Drive Adop.on, Efficiency, and ROI for the Long Term What We Will Cover Today Presenta(on Agenda! Who We Are! Our History! Par7al

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

2015-16 ITS Strategic Plan Enabling an Unbounded University

2015-16 ITS Strategic Plan Enabling an Unbounded University 2015-16 ITS Strategic Plan Enabling an Unbounded University Update: July 31, 2015 IniAaAve: Agility Through Technology Vision Mission Enable Unbounded Learning Support student success through the innovaave

More information

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved

Hortonworks & SAS. Analytics everywhere. Page 1. Hortonworks Inc. 2011 2014. All Rights Reserved Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

More information

#TalendSandbox for Big Data

#TalendSandbox for Big Data Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

An Introduc@on to Big Data, Apache Hadoop, and Cloudera

An Introduc@on to Big Data, Apache Hadoop, and Cloudera An Introduc@on to Big Data, Apache Hadoop, and Cloudera Ian Wrigley, Curriculum Manager, Cloudera 1 The Mo@va@on for Hadoop 2 Tradi@onal Large- Scale Computa@on Tradi*onally, computa*on has been processor-

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Using RDBMS, NoSQL or Hadoop?

Using RDBMS, NoSQL or Hadoop? Using RDBMS, NoSQL or Hadoop? DOAG Conference 2015 Jean- Pierre Dijcks Big Data Product Management Server Technologies Copyright 2014 Oracle and/or its affiliates. All rights reserved. Data Ingest 2 Ingest

More information

Real World Big Data Architecture - Splunk, Hadoop, RDBMS

Real World Big Data Architecture - Splunk, Hadoop, RDBMS Copyright 2015 Splunk Inc. Real World Big Data Architecture - Splunk, Hadoop, RDBMS Raanan Dagan, Big Data Specialist, Splunk Disclaimer During the course of this presentagon, we may make forward looking

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. [email protected], twicer: @awadallah

Apache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. [email protected], twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated

More information

Proact whitepaper on Big Data

Proact whitepaper on Big Data Proact whitepaper on Big Data Summary Big Data is not a definite term. Even if it sounds like just another buzz word, it manifests some interesting opportunities for organisations with the skill, resources

More information

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.

EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved. EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Everything You Need to Know about Cloud BI. Freek Kamst

Everything You Need to Know about Cloud BI. Freek Kamst Everything You Need to Know about Cloud BI Freek Kamst Business Analy2cs Insight, Bussum June 10th, 2014 What s it all about? Has anything changed in the world of BI? Is Cloud Compu2ng a Hype or here to

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Financial, Telco, Retail, & Manufacturing: Hadoop Business Services for Industries

Financial, Telco, Retail, & Manufacturing: Hadoop Business Services for Industries Financial, Telco, Retail, & Manufacturing: Hadoop Business Services for Industries Ho Wing Leong, ASEAN 1 Cloudera company snapshot Founded Company Employees Today World Class Support Mission CriQcal 2008,

More information

The 3 questions to ask yourself about BIG DATA

The 3 questions to ask yourself about BIG DATA The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager [email protected]

More information

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan

Data Management in the Cloud: Limitations and Opportunities. Annies Ductan Data Management in the Cloud: Limitations and Opportunities Annies Ductan Discussion Outline: Introduc)on Overview Vision of Cloud Compu8ng Managing Data in The Cloud Cloud Characteris8cs Data Management

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

Effec%ve AX 2012 Upgrade Project Planning and Microso< Sure Step. Arbela Technologies

Effec%ve AX 2012 Upgrade Project Planning and Microso< Sure Step. Arbela Technologies Effec%ve AX 2012 Upgrade Project Planning and Microso< Sure Step Arbela Technologies Why Upgrade? What to do? How to do it? Tools and templates Agenda Sure Step 2012 Ax2012 Upgrade specific steps Checklist

More information

Presenters: Luke Dougherty & Steve Crabb

Presenters: Luke Dougherty & Steve Crabb Presenters: Luke Dougherty & Steve Crabb About Keylink Keylink Technology is Syncsort s partner for Australia & New Zealand. Our Customers: www.keylink.net.au 2 ETL is THE best use case for Hadoop. ShanH

More information

Understanding Cloud Compu2ng Services. Rain in business success with amazing solu2ons in Cloud technology

Understanding Cloud Compu2ng Services. Rain in business success with amazing solu2ons in Cloud technology Understanding Cloud Compu2ng Services Rain in business success with amazing solu2ons in Cloud technology What is Cloud Compu2ng? Cloud compu2ng encompasses various services and ac2vi2es carried out over

More information

Fixed Scope Offering (FSO) for Oracle SRM

Fixed Scope Offering (FSO) for Oracle SRM Fixed Scope Offering (FSO) for Oracle SRM Agenda iapps Introduc.on Execu.ve Summary Business Objec.ves Solu.on Proposal Scope - Business Process Scope Applica.on Implementa.on Methodology Time Frames Team,

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

DNS Big Data Analy@cs

DNS Big Data Analy@cs Klik om de s+jl te bewerken Klik om de models+jlen te bewerken! Tweede niveau! Derde niveau! Vierde niveau DNS Big Data Analy@cs Vijfde niveau DNS- OARC Fall 2015 Workshop October 4th 2015 Maarten Wullink,

More information

Cloud Compu)ng in Educa)on and Research

Cloud Compu)ng in Educa)on and Research Cloud Compu)ng in Educa)on and Research Dr. Wajdi Loua) Sfax University, Tunisia ESPRIT - December 2014 04/12/14 1 Outline Challenges in Educa)on and Research SaaS, PaaS and IaaS for Educa)on and Research

More information

Big Data and Hadoop for the Executive A Reference Guide

Big Data and Hadoop for the Executive A Reference Guide Big Data and Hadoop for the Executive A Reference Guide Overview The amount of information being collected by companies today is incredible. Wal- Mart has 460 terabytes of data, which, according to the

More information

Big Data + Big Analytics Transforming the way you do business

Big Data + Big Analytics Transforming the way you do business Big Data + Big Analytics Transforming the way you do business Bryan Harris Chief Technology Officer VSTI A SAS Company 1 AGENDA Lets get Real Beyond the Buzzwords Who is SAS? Our PerspecDve of Big Data

More information

Peninsula Strategy. Creating Strategy and Implementing Change

PS - Synopsis Professional Services firm Industries include Financial Services, High Technology, Healthcare & Security Headquartered in San

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Oracle Exadata, Teradata, Netezza and SQL Server EDW (Enterprise Data Warehouse) Offloads The EDW (Enterprise Data Warehouse)

Saving Millions through Data Warehouse Offloading to Hadoop. Jack Norris, CMO MapR Technologies. MapR Technologies. All rights reserved.

MapR Technologies Overview Open, enterprise-grade distribution for

Build Your Competitive Edge in Big Data with Cisco. Rick Speyer Senior Global Marketing Manager Big Data Cisco Systems 6/25/2015

Big Data Trends Increasingly Everything will be Connected to Everything Massive

Il mondo dei DB cambia: tecnologie e opportunità

Giorgio Raico Pre-Sales Consultant Hewlett-Packard Italiana 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

Native Connectivity to Big Data Sources in MSTR 10

Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data and the Data Warehouse Potential All

Cost-Effective Business Intelligence with Red Hat and Open Source

Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 Agenda Introductions Quick survey What is BI?: reporting,

Performance Management in Big Data Applications. Michael Kopp, Technology Strategist @mikopp

NoSQL: High Volume/Low Latency DBs Web Java Key Challenges 1) Even Distribution 2) Correct Schema and Access patterns 3)

Data Lake In Action: Real-time, Closed Looped Analytics On Hadoop

Pivotal's Full Approach It's More Than Just Hadoop Pivotal Data Labs Why Pivotal Exists First Movers Solve the Big Data Utility Gap

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

Hadoop, the Data Lake, and a New World of Analytics

Hortonworks. We do Hadoop. Spring 2014 Version 1.0 Hortonworks Inc. 2014 Traditional Data Architecture Pressured 2.8 ZB in 2012 85% from New Data

A Modern Data Architecture with Apache Hadoop

Talend Big Data Presented by Hortonworks and Talend Executive Summary Apache Hadoop didn't disrupt the datacenter, the data did. Shortly after Corporate IT functions

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd's 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83-84

Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21-22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104-105 Application architecture, big data analytics

Cisco IT Hadoop Journey

Alex Garbarini, IT Engineer, Cisco 2015 MapR Technologies Agenda Hadoop Platform Timeline Key Decisions / Lessons Learnt Data Lake Hadoop's place in IT Data Platforms Use Cases

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

VIEWPOINT. High Performance Analytics. Industry Context and Trends

In the digital age of social media and connected devices, enterprises have a plethora of data that they can mine, to discover hidden correlations

Introduction to Big Data with Apache Spark, UC Berkeley

So What is Data Science? Doing Data Science Data Preparation Roles This Lecture What is Data Science? Data Science aims to derive knowledge

Oracle Big Data SQL Technical Update

Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

Building your cloud portfolio APS Connect

5th November 2014 Duncan Robinson, Parallels Business Consulting Introduction to BCS Who are we? Created 3 years ago in response to partner demand Define the strategy

Program Model: Muskingum University offers a unique graduate program integrating BUSINESS and TECHNOLOGY to develop the 21st century professional.

163 Stormont Street New Concord, OH 43762 614-286-7895

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Agenda What is Big Data / Hadoop? Limitations of the existing Hadoop distributions Going enterprise with Hadoop How Big are Data?

Big Data solutions to support Intelligent Systems and Applications

Luciana Lima, Filipe Portela, Manuel Filipe Santos, António Abelha and José Machado. Abstract: In the last years the number of data available

Introducing Oracle Exalytics In-Memory Machine

Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD

Big Data Analytics Characteristics Large quantities of many data types Structured Unstructured Human Machine

Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, aaa@cloudera.com

Cloudera The Leader in Big Data Management Powered by Apache Hadoop The Leading Open Source Distribution of Apache

Solving today's integration challenges with Oracle SOA Suite, and Oracle Coherence

Asaf Lev Sales Consulting Agenda Industry Trends Oracle SOA Suite Oracle Coherence Oracle Service Bus

Big Data Analytics - Accelerated. stream-horizon.com

Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based

Data Integration Checklist

The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Kai Wähner @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

Big Data Management and Security

Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value

Architecting for the Internet of Things & Big Data

Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to

Cisco and Splunk: Under the Hood of Cisco IT

Robert Novak, Cisco Big Data Partner CSE George Lancaster, Engineer, Cisco IT September 2015 Agenda Cisco's History with Splunk How Cisco Uses Splunk IT Operations

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

AVOIDING SILOED DATA AND SILOED DATA MANAGEMENT

Dalton Cervo Author, Consultant, Management Expert September 2015 This presentation contains extracts from books that are: Copyright 2011 John Wiley & Sons,

Anzo Smart Data Integration

Cambridge Semantics Contact: Marty Loughlin Vice President, Financial Services Cambridge Seman

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools

THE STATE OF THE DATA WAREHOUSE

March 2015 Sponsored by Introduction As the volume and types of business data have increased at a phenomenal pace, and the cost to store that data has plummeted, businesses have looked to data analytics