HADOOP AND MAINFRAMES: CRAZY OR CRAZY LIKE A FOX? Mike Combs, VP of Marketing mcombs@veristorm.com
2 The Big Picture for Big Data
The Lack of Information Problem
The Surplus of Data Problem
The 3 V's of Big Data (Doug Laney, Gartner Research)
  Volume: More devices, higher resolution, more frequent collection, store everything
  Variety: Incompatible data formats
  Velocity: Analytics fast enough to be interactive and useful
3 Big Data: Volume
SDSS telescope: 80 TB in 7 years
LSST telescope: 80 TB in 2 days
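The gap between the two telescope figures is easier to appreciate as a rate. The following is a rough back-of-the-envelope sketch, not part of the original slides; it assumes 365-day years and uses hypothetical variable names.

```python
# Figures from the slide: both surveys produce 80 TB, over very different time spans.
TOTAL_TB = 80
OLDER_SURVEY_DAYS = 7 * 365  # assumption: 365-day years
LSST_DAYS = 2


def tb_per_day(total_tb: float, days: float) -> float:
    """Average collection rate in terabytes per day."""
    return total_tb / days


older_rate = tb_per_day(TOTAL_TB, OLDER_SURVEY_DAYS)
lsst_rate = tb_per_day(TOTAL_TB, LSST_DAYS)
speedup = lsst_rate / older_rate  # how many times faster the newer telescope collects

print(f"Older survey: {older_rate:.3f} TB/day")
print(f"LSST:         {lsst_rate:.1f} TB/day")
print(f"Ratio:        roughly {speedup:.0f}x")
```

Under these assumptions the newer instrument collects data more than a thousand times faster, which is the point of the Volume slide.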
4 Big Market, Big Growth
5 Big Data: Variety
20% is Structured
  Tabular databases like credit card transactions and Excel spreadsheets
  Web forms
80% is Unstructured
  Pictures: Photos, X-rays, ultrasound scans
  Sound: Music (genre etc.), speech
  Videos: Computer vision, cell growing cultures, storm movement
  Text: Emails, doctor's notes
  Microsoft Office: Word, PowerPoint, PDF
6 Big Data: Velocity
To be relevant, data analytics must be acted upon in a timely fashion
Results can lead to other questions, and so the solutions should be interactive
7 Big Data Industry Value
8 Increasing Needs for Detailed Analytics
Baselining & Experimenting: Parkland Hospital analyzed records to find and extend best practices
Data Sharing: US Gov Fraud Prevention shared data across departments
Segmentation: Dannon uses predictive analytics to adapt to changing tastes in yogurt
Decision-making: Lake George ecosystem project uses sensor data to protect $1B in tourism
New Business Models: Social media, location-based services, mobile apps
9 Hadoop Family & Ecosystem
Hadoop solves the problem of moving big data
  Eliminates interface traffic jams of getting data from a large disk
  Eliminates network traffic jams of getting data from many disks
Hadoop divides and moves the work instead
  Hadoop divides the job across many servers and sends the work to them
Apache Hadoop is an open-source platform
Hadoop includes
  HDFS: File-based, unstructured, massive
  MapReduce: Distributes processing and aggregates results (queries or data loading)
  Pig: Programming language
  Hive: Structure with SQL-like queries
  HBase: Big table, with limits
  Flume: Import streaming logs
  Sqoop: Import from RDBMS
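The "divide the job and send the work to the data" idea can be sketched in a few lines. The example below is a local, single-process imitation of the MapReduce pattern in plain Python; it does not use any Hadoop API, and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(block: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one block of input."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the emitted values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Aggregate all values seen for one key."""
    return key, sum(values)


def run_job(blocks: List[str]) -> Dict[str, int]:
    """Run map over every block, then shuffle and reduce the results."""
    mapped: List[Tuple[str, int]] = []
    for block in blocks:
        # On a real cluster, each block would be mapped on a node that stores it.
        mapped.extend(map_phase(block))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Each string stands in for a file block held on a different server.
    print(run_job(["debit credit debit", "credit refund", "debit"]))
```

In this sketch only the small (word, count) pairs move between phases; the input blocks themselves stay where they are, which is the traffic-jam avoidance the slide describes.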
10 Typical Hadoop Cluster
NameNode: Master, directs slave DataNodes, tracks file block storage, overall health
Secondary NameNode: Backup
JobTracker: Assigns nodes to jobs, handles task failures
Slave Nodes
  DataNode: IO and backup
  TaskTracker: Manages tasks on slave; talks with JobTracker
11 Today's Big Data Initiatives: Transactions, Logs, Machine Data
Unstructured data
12 Transaction Data = Mainframe Computers
Mainframes run the global core operations of
  92 of top 100 banks
  21 of top 25 insurance
  23 of top 25 retailers
Process 60% of all transactions
Mainframe computers are the place for essential enterprise data
  Highly reliable
  Highly secure
IBM's Academic Initiative
  1000 higher education institutions
  In 67 nations
  Impacting 59,000 students
However, mainframe data uses proprietary databases which must be translated to talk to current formats
13 The ROI Behind the Hype
The most relevant insights come from enriching your primary enterprise data
Sensitive, Structured: Customers, Transactions,...
Semi-public, Unstructured: Call center data, Claim Systems, Social,
14 Bring Analytics to the Data rather than the Data to the Analytics
Extract, Transform and Load (ETL)
  1 TB ETL per day, initial copy plus three derivatives, costs > $8 million over 4 years
  Diagram labels: Operational applications; Data transfer; Analytical applications
  Multiple copies of data
  Transaction and analytics isolation
  Significant compute power
Source: CPO internal study. Assume dist. send and load is same cost as receive and load. Also, assume 2 switches and 2 T3 WAN connections.
"Before we even start with a workload evaluation, we need the answer to one important question: Where's the data located?" (The ETL Problem, Clabby Analytics)
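To put the ETL figure in perspective, the sketch below estimates how much data is copied over the four years and what the quoted cost floor implies per terabyte. It is illustrative only: it assumes each of the four copies (initial plus three derivatives) is the full 1 TB per day and that a year has 365 days, neither of which is stated on the slide.

```python
# Figures from the slide.
DAILY_ETL_TB = 1              # 1 TB of ETL per day
COPIES = 1 + 3                # initial copy plus three derivatives
YEARS = 4
COST_FLOOR_USD = 8_000_000    # the slide says "> $8 million", so this is a lower bound

DAYS_PER_YEAR = 365           # assumption


def total_tb_moved(daily_tb: float, copies: int, years: int,
                   days_per_year: int = DAYS_PER_YEAR) -> float:
    """Total terabytes copied, assuming every copy is the full daily volume."""
    return daily_tb * copies * years * days_per_year


def cost_per_tb(total_cost: float, tb_moved: float) -> float:
    """Implied cost per terabyte copied."""
    return total_cost / tb_moved


moved = total_tb_moved(DAILY_ETL_TB, COPIES, YEARS)
print(f"Data copied over {YEARS} years: {moved} TB")
print(f"Implied cost floor: ${cost_per_tb(COST_FLOOR_USD, moved):,.2f} per TB copied")
```

Under these assumptions the cost scales with the number of copies, which is why the slide argues for bringing the analytics to the data instead.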
15 Obstacles to Include Mainframe Data
1/ Data governance as the data moves off z/OS operational systems
2/ Data ingestion from z/OS into Hadoop (on or off platform) is a bottleneck (MIPS & ETL cost, security around data access and transfer, process agility)
These lead to key requirements:
  Existing security policies must be applied to data access and transfer.
  There need to be high-speed, optimized connectors between traditional z/OS LPARs and the Hadoop clusters.
  Data must be served transparently into Hadoop clusters on the mainframe AND on distributed platforms.
16 The Dilemma: Ease of Access vs. Governance
Sources: DB2, VSAM, IMS, z/OS Logs
Extract: from proprietary formats
Transform: aggregate or summarize
Staging
Load
JCL, DB2, HFS, VSAM, IMS, OMVS, COBOL Copybooks, EBCDIC, Packed Decimal, Byte ordering, IPCS, z/VM, Linux on z
Hadoop, MongoDB, Cassandra, Cloud, Big Data Ecosystem, Java, Python, C++
Interface skills
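Three of the items on the mainframe side (EBCDIC, packed decimal and byte ordering) are data-format issues that any extract step has to handle. The sketch below shows what that conversion can look like in plain Python. It is not how any particular product does it: the record layout is hypothetical, and code page `cp037` is assumed for the text field.

```python
from decimal import Decimal


def ebcdic_to_str(raw: bytes, codec: str = "cp037") -> str:
    """Decode an EBCDIC text field (code page 037 assumed) and trim padding."""
    return raw.decode(codec).rstrip()


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal (COMP-3) field: two digits per byte, sign in the last nibble."""
    if not raw:
        raise ValueError("empty packed-decimal field")
    digits = []
    for byte in raw[:-1]:
        digits += [byte >> 4, byte & 0x0F]
    digits.append(raw[-1] >> 4)
    sign = raw[-1] & 0x0F
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not a valid packed-decimal field")
    value = int("".join(str(d) for d in digits))
    if sign in (0x0B, 0x0D):  # 0xB and 0xD mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


def parse_record(record: bytes) -> dict:
    """Parse one fixed-width record.

    Hypothetical layout: 5-byte EBCDIC name, 3-byte COMP-3 amount with
    two decimal places, 4-byte big-endian binary account number.
    """
    return {
        "name": ebcdic_to_str(record[0:5]),
        "amount": unpack_comp3(record[5:8], scale=2),
        "account": int.from_bytes(record[8:12], byteorder="big"),
    }


if __name__ == "__main__":
    sample = bytes.fromhex("C8C5D3D3D6" "12345C" "0000002A")
    print(parse_record(sample))
```

In practice the field offsets and types would come from a COBOL copybook rather than being hard-coded, which is part of the skills gap the slide points at.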
17 The Dilemma: Ease of Access vs. Governance
Sources: DB2, VSAM, IMS, z/OS Logs
Extract: from proprietary formats
Transform: aggregate or summarize
Staging
Load
Push from IT OR Pull from Business
18 vStorm Enterprise: Mainframe Data to Mainstream
Diagram labels: z/OS (DB2, VSAM, IMS, Logs); Linux on z/VM (vStorm Connect, zDoop); IFLs; System z
19 vStorm Enterprise (vStorm Connect + zDoop)
Diagram labels: z/OS (DB2, VSAM, IMS, Logs); Linux on z/VM (vStorm Connect; zDoop with MapReduce, HDFS, Hive); IFLs; System z
A secure pipe for data
  Data never leaves the box
  RACF integration: no need for special credentials
  Data streamed over secure channel using hardware crypto
Easy to use ingestion engine
  Native data collectors accessed via graphical interface
  Light-weight; no programming required
  Wide variety of data sources supported
  Conversions handled automatically
  Streaming technology does not load z/OS engines or require DASD for staging
Templates for agile deployment
  Spin up new nodes on demand
  An ideal cloud deployment platform
  Mainframe efficiencies
20 Select HDFS or Hive destination for copy
Select source content on z/OS
Browse or graph content
21 Velocity, the 3rd V
Diagram labels: z/OS DB2 (data, metadata) to Linux Hive (metadata, data)
2 billion records transferred, converted and analyzed in 2 hours over 2 IFLs
Chart axes: Time, IFLs
6-week IBM benchmark validated linear scalability across the standard Hadoop stress tests
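The headline number can be turned into a throughput figure, and the linear-scalability claim into a simple projection. This is a rough sketch rather than a benchmark result: it takes the slide's figures at face value, assumes scaling stays perfectly linear, and uses hypothetical function names.

```python
# Figures from the slide.
RECORDS = 2_000_000_000
HOURS = 2
IFLS = 2


def records_per_second(records: int, hours: float) -> float:
    """Aggregate throughput across all IFLs."""
    return records / (hours * 3600)


def records_per_ifl_hour(records: int, hours: float, ifls: int) -> float:
    """Throughput attributed to a single IFL over one hour."""
    return records / (hours * ifls)


def projected_hours(records: int, ifls: int, per_ifl_hour: float) -> float:
    """Projected run time, assuming throughput scales linearly with IFL count."""
    return records / (per_ifl_hour * ifls)


total_rate = records_per_second(RECORDS, HOURS)
per_ifl = records_per_ifl_hour(RECORDS, HOURS, IFLS)

print(f"Aggregate: about {total_rate:,.0f} records/second")
print(f"Per IFL:   about {per_ifl / 3600:,.0f} records/second")
print(f"Same load on 4 IFLs: {projected_hours(RECORDS, 4, per_ifl)} hours (if linear)")
```

The projection is only as good as the linearity assumption; real runs would also depend on record size and the conversions involved.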
22 Security: vStorm Enterprise is integrated with RACF
Diagram labels: System z; RACF; LDAP server; z/OS CP(s); LDAP client; z/VM / Linux IFL(s); vStorm Enterprise (Servlet, Applet); HDFS, Hive; vhub
Integration with RACF to extend z/OS security domain onto zDoop
Leverage HiperSocket for secure data transfer (vs. data traveling over the network)
Data-in-flight encryption via SSL
Data-at-rest encryption via hardware crypto
23 Address the Skill Gap for New Applications
Enable the developer community to take advantage of the enterprise primary data in a model they understand
Bring your system administrators and developers together
Provide easy & controlled access to mainframe data
Familiar environment: Linux
Hot technology: Hadoop
Procedural programming: Java, Python, JavaScript, Pig
24 Get Started
Install vStorm Enterprise in 2 hours
Rapid end-to-end processing of enriched customer and transaction data
  Detecting fraud
  Enhancing insurance policy enforcement
  Reducing healthcare costs
  Improving claims response time
25 Thank you! Questions?
26 Financial Services Use Case
Problem
  High cost of searchable archive on mainframe
  $350K+ storage costs (for 40 TB)
  MIPS charges for operation
  $1.6M+ development costs due to many file types, including VSAM
  man-days effort and project delay
Solution
  Move data to Hadoop for analytics and archive
  Shift from z/OS to Linux to reduce MIPS
  Use IBM SSD storage
  Use IBM private cloud SoftLayer
  Tap talent pool of Hadoop ecosystem
Benefits
  Reduction in storage costs
  Dev costs almost eliminated
  Quick benefits and ROI
  New analytics options for unstructured data
  Retains data on System z for security and reliability
27 Health Care Use Case
Problem
  Relapses in cardiac patients
  "One size fits all" treatment
  $1M+ Medicare readmission penalties
  Sensitive patient data on System z VSAM
  No efficient way to offload
Solution
  Identify risk factors by analyzing patient data*
  Factors used to predict likely outcomes
Benefits
  31% reduction in readmissions
  Estimated $1.2M savings in penalties
  No manual intervention
  No increase in staffing
  1100% ROI on $100K
* System z VSAM database requires special skills to access without vStorm
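The ROI figure can be checked against the other numbers on the slide. The sketch below assumes ROI is defined as net gain divided by cost, with the estimated $1.2M penalty savings as the gain and $100K as the investment; the slide does not spell out its formula, so this is one plausible reading.

```python
def roi_percent(gain: float, cost: float) -> float:
    """Return on investment as a percentage: (gain - cost) / cost * 100."""
    return (gain - cost) / cost * 100


SAVINGS_USD = 1_200_000   # estimated savings in penalties, from the slide
INVESTMENT_USD = 100_000  # the "$100K" from the slide, read as the investment

print(f"ROI: {roi_percent(SAVINGS_USD, INVESTMENT_USD):.0f}%")
```

With that definition the two figures on the slide are consistent with the quoted 1100% ROI.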
28 Public Sector Use Case
Problem
  Mismanaged assets led to shabby neighborhoods, publicity problems, often law & order issues (e.g. broken lights, abandoned bikes, tsunami sirens)
  Post-2008 austerity measures: low budget
  Asset data is System z IMS* based: no efficient offload
Solution
  Crowd-source problem reporting: cell phone photo, social media, GPS data
  Integrate social media problem reports with asset / governance data on System z, on Hadoop for System z (conforming to regulations)
Benefits
  Software costs of $0.4M compare to $2M consulting engagement proposal
  Better maintained neighborhoods projected to yield $5.6M higher taxes in first year
* System z IMS database requires special skills to access without vStorm
29 Retail Use Case
Problem
  Streams of user data not correlated (e.g. store purchases, website usage pattern, card usage, historical customer data)
  Historical customer data is System z VSAM & DB2 based: no efficient, secure offload
Solution
  HDFS securely populated with historical customer data, card usage, store purchases, website logs
  Splunk scores customers based on the various data streams*
  High-scoring customers offered coupons, special deals on website
Benefits
  19% increase in online sales in the middle of retail slowdown
  Over 50% conversion rate of website browsing customers (shopping cart to sale)
  Elimination of data silos, since analytics now cover all data: no more reliance on multiple reports / formats
More informationBig Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
More informationBig Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08
More informationInformation Architecture
The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to
More informationHow Cisco IT Built Big Data Platform to Transform Data Management
Cisco IT Case Study August 2013 Big Data Analytics How Cisco IT Built Big Data Platform to Transform Data Management EXECUTIVE SUMMARY CHALLENGE Unlock the business value of large data sets, including
More informationCapitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
More informationWhat is a Petabyte? Gain Big or Lose Big; Measuring the Operational Risks of Big Data. Agenda
April - April - Gain Big or Lose Big; Measuring the Operational Risks of Big Data YouTube video here http://www.youtube.com/watch?v=o7uzbcwstu April, 0 Steve Woolley, Sr. Manager Business Continuity Dennis
More informationEnd to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ
End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,
More informationWHITE PAPER USING CLOUDERA TO IMPROVE DATA PROCESSING
WHITE PAPER USING CLOUDERA TO IMPROVE DATA PROCESSING Using Cloudera to Improve Data Processing CLOUDERA WHITE PAPER 2 Table of Contents What is Data Processing? 3 Challenges 4 Flexibility and Data Quality
More informationSimplifying Mainframe Data Access with IBM InfoSphere System z Connector for Hadoop IBM Redbooks Solution Guide Did you know?
Simplifying Mainframe Data Access with IBM InfoSphere System z Connector for Hadoop IBM Redbooks Solution Guide Did you know? For many, the IBM z Systems mainframe forms the backbone of mission-critical
More informationBuild Your Competitive Edge in Big Data with Cisco. Rick Speyer Senior Global Marketing Manager Big Data Cisco Systems 6/25/2015
Build Your Competitive Edge in Big Data with Cisco Rick Speyer Senior Global Marketing Manager Big Data Cisco Systems 6/25/2015 Big Data Trends Increasingly Everything will be Connected to Everything Massive
More informationHow To Make Data Streaming A Real Time Intelligence
REAL-TIME OPERATIONAL INTELLIGENCE Competitive advantage from unstructured, high-velocity log and machine Big Data 2 SQLstream: Our s-streaming products unlock the value of high-velocity unstructured log
More informationScalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
More informationOpen Source Technologies on Microsoft Azure
Open Source Technologies on Microsoft Azure A Survey @DChappellAssoc Copyright 2014 Chappell & Associates The Main Idea i Open source technologies are a fundamental part of Microsoft Azure The Big Questions
More informationAgenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR
1 Agenda Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback 2 A World of Connected Devices Need a new data management architecture for Internet of Things 21% the % of
More informationHow To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
More informationCOMP9321 Web Application Engineering
COMP9321 Web Application Engineering Semester 2, 2015 Dr. Amin Beheshti Service Oriented Computing Group, CSE, UNSW Australia Week 11 (Part II) http://webapps.cse.unsw.edu.au/webcms2/course/index.php?cid=2411
More information