QUEST meeting Big Data Analytics
1 QUEST meeting Big Data Analytics Peter Hughes Business Solutions Consultant SAS Australia/New Zealand Copyright 2015, SAS Institute Inc. All rights reserved.
2 Big Data Analytics: WHERE WE ARE NOW
BIG DATA: lots of data
HADOOP: processing power
ANALYTICS: accurate decisions
3 The era of abundance
"Big data is what happened when the cost of storing information became less than the cost of making the decision to throw it away."
- George Dyson, science historian and TED speaker
4 Two Eras... Will you modernize your mindset?
- Technology empowered
- Discovery-centric
- Focus on value
- Everything is permitted unless it is forbidden
5 WHAT IS HADOOP?
- An Apache Software Foundation project; open source
- Origins in the early 2000s, with contributions from Google, Yahoo! and Facebook
- Framework of tools for processing Big Data
  1. Base: Common; Hadoop Distributed File System (HDFS); MapReduce & YARN
  2. Additional projects, including Pig, Hive, HBase, ZooKeeper et al.
- Designed for clusters of commodity server hardware, typically Intel/Linux: distributed storage, distributed processing, fault-tolerant topology
- Commercial Hadoop distributions are based on the Apache code and add extensions, additional tooling and support
- Vendors: Cloudera, Hortonworks, MapR, Pivotal, IBM, Intel & others
6 SAS and Hadoop: COMMERCIAL HADOOP VENDORS
- Intel recently invested $740 million to buy 18%, which puts their value at around the $4 billion mark.
- Pivotal HD: GE invested $105 million in Pivotal.
- MapR: Google Capital recently invested $80 million in MapR; they gathered $110 million of investment in their last round.
- Hortonworks: HP recently invested $50 million in Hortonworks to get a place on the board. Total investment is now about $300 million. Big Teradata and SAP partners.
- IBM InfoSphere BigInsights
7 SAS and Hadoop: INTEGRATION WITH OPEN SOURCE HADOOP
Hive, HCatalog, YARN, Pig, MapReduce, HDFS, Impala, Sqoop, Parquet, ORC, Spark, Oozie
8 SAS WITHIN THE HADOOP ECOSYSTEM
[Architecture diagram; layers as far as they can be recovered from the transcription]
- Users: SAS User; Next-Gen SAS User
- User Interface: SAS Data Loader for Hadoop; SAS Data Integration; SAS Enterprise Miner; SAS Visual Analytics; SAS In-Memory Statistics for Hadoop
- Metadata: SAS Metadata
- Data Access: Base SAS & SAS/ACCESS to Hadoop; In-Memory Data Access; SAS LASR Analytic Server; SAS Embedded Process Accelerators; SAS High-Performance Analytic Procedures
- Data Processing: Pig; Hive; MapReduce/YARN; MPI Based
- File System: HDFS
9 DATA TO DECISION LIFECYCLE on Hadoop
[Lifecycle diagram; grouping of products by stage is inferred from the transcription order]
- MANAGE DATA: SAS/ACCESS (Hadoop/Impala); SAS Data Management; SAS Federation Server; SAS Data Quality Accelerator for Hadoop; SAS Code Accelerator for Hadoop; SAS Data Loader for Hadoop
- EXPLORE DATA: SAS Visual Analytics; SAS In-Memory Statistics for Hadoop
- DEVELOP MODELS: SAS HPA Products; SAS Visual Statistics; SAS In-Memory Statistics for Hadoop
- DEPLOY & MONITOR: Model Manager; SAS Scoring Accelerator for Hadoop
10 MANAGE DATA: READ/WRITE TO HDFS
The Hadoop configuration file (C:\Sample_Data\hadoop_config.xml) is used for all PROC HADOOP Pig, MapReduce and HDFS calls.

    /* Create directory on HDFS */
    filename cfg "C:\Sample_Data\hadoop_config.xml";
    proc hadoop options=cfg username="hadoop" password="hadoop";
       hdfs mkdir="/user/hadoop/testfolder";
    run;

    /* Copy file from local SAS to HDFS */
    filename cfg "C:\Sample_Data\hadoop_config.xml";
    proc hadoop options=cfg username="hadoop" password="hadoop";
       hdfs copyfromlocal="c:\sample_data\dept.txt"
            out="/user/hadoop/testfolder/";
    run;

    /* Copy file from HDFS to local SAS */
    filename cfg "C:\Sample_Data\hadoop_config.xml";
    proc hadoop options=cfg username="hadoop" password="hadoop";
       hdfs copytolocal="/user/hadoop/testfolder"
            out="c:\sample_data\";
    run;
11 MANAGE DATA: SAS/ACCESS
Base SAS procedures executed in-database for Hadoop: FREQ, REPORT, SORT, SUMMARY/MEANS, TABULATE
Supported Hadoop distributions & combinations*:
- Cloudera CDH 5.0 running Hive/Hive2
- Hortonworks HDP 2.0 running HiveServer2
- IBM InfoSphere BigInsights 2.1 running Hive
- MapR M running Hive
- Pivotal/Greenplum HD running Hive
- Pivotal/Greenplum MR running Hive
* If a provider assures upward compatibility, SAS/ACCESS supports newer combinations. For example, Cloudera assures upward compatibility within major releases, so Cloudera CDH4.2 running Hive or HiveServer2 is supported.
12 MANAGE DATA: HIVE

    libname cdh_hdp hadoop port=10000 server=sascldserv02
            user=hadoop password=hadoop;

    /* Create new table */
    proc sql;
       connect to hadoop (port=10000 server=sascldserv02
                          user=hadoop password="hadoop");
       exec( create table cars_prc
                (make string, model string, msrp double)
           ) by hadoop;
    quit;

    /* Copy from another table */
    proc sql;
       insert into cdh_hdp.cars_prc
          select make, model, msrp from sashelp.cars;
    quit;

    /* List contents */
    proc sql;
       select * from cdh_hdp.cars_prc;
    quit;
13 MANAGE DATA: MAPREDUCE

    /* Invoke MapReduce Word Count program */
    filename cfg "C:\Sample_Data\hadoop_config.xml";
    proc hadoop options=cfg username="hadoop" password="hadoop" verbose;
       /* Remove output left over from a previous run */
       hdfs delete="/user/hadoop/output_mr1";
       /* Java class names are case-sensitive */
       mapreduce input="/user/hadoop/gutenberg"
                 output="/user/hadoop/output_mr1"
                 jar="c:\sample_data\hadoop-examples mr1-cdh4.1.2.jar"
                 outputkey="org.apache.hadoop.io.Text"
                 outputvalue="org.apache.hadoop.io.IntWritable"
                 reduce="org.apache.hadoop.examples.WordCount$IntSumReducer"
                 combine="org.apache.hadoop.examples.WordCount$IntSumReducer"
                 map="org.apache.hadoop.examples.WordCount$TokenizerMapper"
                 reducetasks=0;
    run;
14 MANAGE DATA: SAS DATA INTEGRATION STUDIO
- Seamless access to Hadoop data (HDFS/Hive/Impala) for analysts and traditional SAS users
- Reading from and writing to HDFS
- Transfer to/from Hadoop operators
- Support for Pig, Hive & MapReduce transforms
15 SAS IN-MEMORY ANALYTICS: SAS LASR ANALYTIC SERVER AND HADOOP
In-memory processing; use Hadoop for storage persistence and commodity computing.
[Architecture diagram; elements as far as they can be recovered from the transcription]
- Data sources: ERP; SCM; CRM; Images; Audio and Video; Machine Logs; Text; Web and Social
- Storage: Hadoop
- In-memory layer: SAS LASR Analytic Server
- Web clients and applications: SAS Visual Analytics; SAS Visual Statistics; SAS In-Memory Statistics for Hadoop
*Name not finalized.
16 DEPLOY & MONITOR: SAS SCORING ACCELERATOR FOR HADOOP
- Publish SAS Enterprise Miner models or SAS/STAT linear models inside Hadoop
- Fully integrated with SAS Model Manager to streamline registration, validation and performance monitoring
- Reduce data movement and improve data governance by streamlining model deployment processes within Hadoop
18 QUESTIONS?
19 Thank You! Peter Hughes
