Big Data projects and use cases. Claus Samuelsen IBM Analytics, Europe
Transcription
1 Big Data projects and use cases. Claus Samuelsen, IBM Analytics, Europe
2 IBM Software: Overview of BigInsights. IBM BigInsights Data Scientist. Free Quick Start (non production): IBM Open Platform; BigInsights Analyst, Data Scientist features; Community support. Text Analytics. IBM BigInsights Analyst. Industry standard SQL (Big SQL). Spreadsheet-style tool (BigSheets). Machine Learning on Big R. IBM BigInsights Enterprise Management. Big R (R support). Big SQL. POSIX Distributed Filesystem. BigSheets. Multi-workload, multi-tenant scheduling... IBM Open Platform with Apache Hadoop* (, YARN, MapReduce, Ambari, HBase, Hive, Oozie, Parquet, Parquet Format, Pig, Snappy, Solr, Spark, Sqoop, Zookeeper, Open JDK, Knox, Slider). *IBM Open Platform with Apache Hadoop is a 100% open source Apache Hadoop distribution. IBM will include the Open Platform common kernel once available. IBM Corporation
3 IBM Big SQL runs 100% of the queries. Other environments require significant effort at scale. Key points: With Impala and Hive, many queries needed to be rewritten, some significantly. Owing to various restrictions, some queries could not be rewritten or failed at run time. Rewriting queries in a benchmark scenario where results are known is one thing; doing this against real databases in production is another. Results for 10TB scale shown here.
4 Hadoop-DS benchmark: single user, 10TB. Big SQL is 3.6x faster than Impala and 5.4x faster than Hive 0.13 for a single query stream using 46 common queries. Based on IBM internal tests comparing BigInsights Big SQL, Cloudera Impala and Hortonworks Hive (current versions available as of 9/01/2014) running on identical hardware. The test workload was based on the latest revision of the TPC-DS benchmark specification at 10TB data size. Successful executions measure the ability to execute queries a) directly from the specification without modification, b) after simple modifications, c) after extensive query rewrites. All minor modifications are either permitted by the TPC-DS benchmark specification or are of a similar nature. All queries were reviewed and attested by a TPC certified auditor. Development effort measured time required by a skilled SQL developer familiar with each system to modify queries so they will execute correctly. Performance test measured scaled query throughput per hour of 4 concurrent users executing a common subset of 46 queries across all 3 systems at 10TB data size. Results may not be typical and will vary based on actual workload, configuration, applications, queries and other variables in a production environment. Cloudera, the Cloudera logo, Cloudera Impala are trademarks of Cloudera. Hortonworks, the Hortonworks logo and other Hortonworks trademarks are trademarks of Hortonworks Inc. in the United States and other countries.
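The two speedup factors on this slide share the same baseline (Big SQL on the same 46 queries), so they also imply a relative figure for Impala versus Hive. The sketch below only does that arithmetic; it assumes both factors are ratios of elapsed time for the same workload, and it is a derived estimate, not a number reported in the presentation.

```python
# Speedup factors quoted on the slide: Big SQL relative to each engine,
# single query stream, 46 common queries, 10TB.
BIG_SQL_VS_IMPALA = 3.6
BIG_SQL_VS_HIVE = 5.4


def implied_speedup(vs_slower: float, vs_faster: float) -> float:
    """Ratio of two speedups that were measured against the same baseline."""
    return vs_slower / vs_faster


# If T_hive / T_bigsql = 5.4 and T_impala / T_bigsql = 3.6,
# then T_hive / T_impala = 5.4 / 3.6.
impala_vs_hive = implied_speedup(BIG_SQL_VS_HIVE, BIG_SQL_VS_IMPALA)
print(f"Implied Impala vs. Hive speedup: {impala_vs_hive:.1f}x")
```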
5 Big Data Projects: Stock Trade Analysis; Positive side effects of drugs; Log File Root Cause Analysis; CRM analysis; 360 Degree Customer View; Ontologies; Gamers Behaviour; Document classification; Weather Analysis; Roaming Log Analysis; Sensitive Access; Connected Cars; Tax Fraud Investigation; Historical Archive Research; Warehouse Augmentation; DNA sequencing. 2009 IBM Corporation
6 Warehouse Augmentation: Banking Industry, Fraud Analysis. The customer wanted to implement two different kinds of fraud analysis: transaction fraud and social engineering fraud. Problem: The existing data warehouse does not allow for long-running jobs. Extending the data warehouse has a huge cost. 2009 IBM Corporation
7 Warehouse Augmentation: Banking Industry, Fraud Analysis. Solution: Moving data to IBM BigInsights reduces the cost significantly. There are no limitations on long-running jobs. Obtaining the data from the various sources is the most time-consuming process. Using Big SQL we can run the same queries in Hadoop as in the traditional warehouse. With Big SQL the customer can connect using their standard JDBC/ODBC based SQL tools.
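The point about running the same queries on both platforms can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for both the traditional warehouse and Big SQL, since no real connection details appear in the slides; in practice the connections would come from a JDBC/ODBC driver. The `transactions` table, its columns, and the threshold are hypothetical.

```python
import sqlite3

# Hypothetical fraud-screening query; table and threshold are illustrative only.
FRAUD_QUERY = """
    SELECT account_id, COUNT(*) AS n_tx, SUM(amount) AS total
    FROM transactions
    GROUP BY account_id
    HAVING SUM(amount) > 10000
    ORDER BY account_id
"""


def make_backend(rows):
    """Create an in-memory database that stands in for one SQL platform."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (account_id TEXT, amount REAL)")
    conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)
    return conn


def run_query(conn, sql):
    """Run unchanged SQL text through a standard DB-API connection."""
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()


rows = [("A", 9000.0), ("A", 2500.0), ("B", 400.0), ("C", 12000.0)]
warehouse = make_backend(rows)   # stand-in for the traditional warehouse
hadoop_sql = make_backend(rows)  # stand-in for Big SQL on Hadoop

warehouse_result = run_query(warehouse, FRAUD_QUERY)
hadoop_result = run_query(hadoop_sql, FRAUD_QUERY)
print(hadoop_result)
```

The query text is never modified between the two backends; only the connection object changes, which is the property the slide attributes to Big SQL.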
8 Document Classification: Insurance Industry, Automatic classification. Problem: Insurance documents are not standardized. They are typically free-form documents written as e-mails, MS Word files, etc. Incoming documents are not classified and are therefore often sent to the wrong department or the wrong person, resulting in unacceptably long processing times.
9 Document Classification. Solution: Using BigInsights Text Analytics, new documents can be classified automatically. The customer had described the characteristics of the different classes that the documents had to be put into. Using these descriptions, we could implement the rules in BigInsights in three weeks, to a degree that satisfied the customer.
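Rule-based classification of this kind can be sketched in a few lines. The example below is a generic keyword-rule classifier in plain Python, not BigInsights Text Analytics and not the customer's actual rules; the class names and patterns are hypothetical.

```python
import re

# Hypothetical classes and keyword rules, standing in for the customer's
# descriptions of each document class.
RULES = {
    "claims": [r"\bclaim\b", r"\bdamage\b", r"\baccident\b"],
    "billing": [r"\binvoice\b", r"\bpayment\b", r"\bpremium\b"],
    "cancellation": [r"\bcancel(?:lation)?\b", r"\bterminate\b"],
}


def classify(text, rules=RULES, default="unclassified"):
    """Return the class whose rules match most often, or `default` if none match."""
    scores = {
        label: sum(len(re.findall(p, text, flags=re.IGNORECASE)) for p in patterns)
        for label, patterns in rules.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


print(classify("I want to report damage after an accident and file a claim"))
```

Documents that match no rule fall into a default class, so they can be routed to manual review rather than to a wrong department.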
10 IBM big data: An IBM Proof of Technology. 2013 IBM Corporation
11 IBM Software: Distinguishing characteristics. Application Portability & Integration: Data shared with Hadoop ecosystem; Comprehensive file format support; Superior enablement of IBM and Third Party software. Performance: Modern MPP runtime; Powerful SQL query rewriter; Cost based optimizer; Optimized for concurrent user throughput; Results not constrained by memory. Rich SQL: Comprehensive SQL Support; IBM SQL PL compatibility; Extensive Analytic Functions. Federation: Distributed requests to multiple data sources within a single SQL statement; Main data sources supported: DB2 LUW, Teradata, Oracle, Netezza, Informix, SQL Server. Enterprise Features: Advanced security/auditing; Resource and workload management; Self tuning memory management; Comprehensive monitoring. 2014 IBM Corporation
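The federation idea, one SQL statement reaching several data sources, can be illustrated with a small stand-in. The sketch below uses SQLite's `ATTACH DATABASE` to join tables held in two separate in-memory databases; it is an analogy only and does not use Big SQL's federation feature. The `orders` and `customers` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                 # first "data source"
conn.execute("ATTACH DATABASE ':memory:' AS crm")  # second, separate "data source"

conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO main.orders VALUES (?, ?)",
                 [(1, 20.0), (1, 5.0), (2, 7.5)])
conn.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                 [(1, "Ada"), (2, "Bo")])

# A single statement that reads from both databases.
federated_rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM crm.customers AS c
    JOIN main.orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(federated_rows)
```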
12 IBM Software: Big SQL, behind the scenes. Big SQL is derived from an existing IBM shared-nothing RDBMS: a very mature MPP architecture that already understands distributed joins and optimization. Behavior is sufficiently different: certain SQL constructs are disabled; traditional data warehouse partitioning is unavailable; new SQL constructs introduced. On the surface, porting a shared-nothing RDBMS to a shared-nothing cluster (Hadoop) seems easy, but [Diagram: Traditional Distributed RDBMS Architecture, showing four database partitions.]
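To make "shared-nothing" and "distributed joins" concrete, here is a minimal sketch of a hash-partitioned join: both tables are split on the join key, so each partition can join its own rows without seeing any other partition's data. This is a generic textbook illustration, not Big SQL's implementation; the function names and sample tables are hypothetical.

```python
from collections import defaultdict


def hash_partition(rows, key_index, n_partitions):
    """Assign each row to a partition based on the hash of its join key."""
    partitions = [[] for _ in range(n_partitions)]
    for row in rows:
        partitions[hash(row[key_index]) % n_partitions].append(row)
    return partitions


def local_join(left, right):
    """Hash join of two row lists on their first column, within one partition."""
    index = defaultdict(list)
    for row in left:
        index[row[0]].append(row)
    return [l + r[1:] for r in right for l in index.get(r[0], [])]


def distributed_join(left, right, n_partitions=4):
    """Partition both inputs on the key, join each partition pair independently."""
    left_parts = hash_partition(left, 0, n_partitions)
    right_parts = hash_partition(right, 0, n_partitions)
    out = []
    for lp, rp in zip(left_parts, right_parts):
        out.extend(local_join(lp, rp))  # no cross-partition data is needed
    return sorted(out)


customers = [(1, "Ada"), (2, "Bo"), (3, "Cy")]
orders = [(1, 20.0), (3, 7.5), (1, 5.0), (4, 1.0)]
joined = distributed_join(customers, orders)
print(joined)
```

Because rows with equal keys always hash to the same partition, the result is the same for any number of partitions; only the amount of work each partition does changes.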
13 IBM Software: Architecture Overview. [Diagram labels: Database Service; Big SQL Scheduler; Big SQL Master; Hive Metastore; DDL; Big SQL Worker Node with Native I/O, Java I/O and Temp; MR Task Tracker; HBase; Other Service; Node.] 2014 IBM Corporation
