1 MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved.
2 Agenda Design rationale Implementation Installation Schema to Directory Mappings Integration with Hive Roadmap Q&A 2Copyright 2013, Oracle and/or its affiliates. All rights reserved.
3 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decision. The development, release, and timing of any features or functionality described for Oracle s products remains at the sole discretion of Oracle. 3Copyright 2013, Oracle and/or its affiliates. All rights reserved.
4 Big Data: Strategic Transformation From 4Copyright 2013, Oracle and/or its affiliates. All rights reserved.
5 CIO & Business Priority Web recommendations 90% with Pilot Projects at end of 2012 Sentiment Analysis Marketing Campaign Analysis Customer Churn Modeling Fraud Detection Poor Data Costs 35% in Annual Revenues Research and Development 10% Improvement in Data Usability Drives $2bn in Revenue Risk Modeling Machine Learning 5Copyright 2013, Oracle and/or its affiliates. All rights reserved.
6 CIO & Business Priority US HEALTH CARE Increase industry value per year by $300 B MANUFACTURING GLOBAL PERSONAL LOCATION DATA EUROPE PUBLIC SECTOR ADMIN US RETAIL Increase industry value per year by Increase net margin by $100 B 250 B 60+% Decrease dev., Increase service assembly costs by provider revenue by 50% In a big data world, a competitor that fails to sufficiently develop its capabilities will be left behind. McKinsey Global Institute 6Copyright 2013, Oracle and/or its affiliates. All rights reserved.
7 Analysts on Big Data The area of greatest interest to my clients is Big Data and its role in helping businesses understand customers better. Michael Maoz, Gartner Tony Baer, Ovum Big Data Will Help Shape Your Market s Next Big Winners. Brian Hopkins, Forrester Almost half of IT departments in enterprises in North America, Europe and Asia-Pacific plan to invest in Big Data analytics in the near future. CIOs will need to be realistic about their approach to 'Big Data' analytics and focus on specific use cases where it will have the biggest business impact. Philip Carter, IDC 7Copyright 2013, Oracle and/or its affiliates. All rights reserved.
8 What Makes it Big Data? PROBLEM: Exceeds limits of conventional systems SOCIAL BLOG SMART METER VOLUME VELOCITY 8Copyright 2013, Oracle and/or its affiliates. All rights reserved. VARIETY VARIABILITY
9 What s Changed? Enablers Digitization *nearly* everything has a digital heartbeat Now practical to store much larger data volumes (distributed file systems) Now practical to process much larger data volumes (parallel processing) Why is this different from BI/DW? Business formulated questions to ask upfront Drove what was data collected, data model, query design Big Data Complements Traditional Methods: Enables what-if analysis, real-time discovery 9Copyright 2013, Oracle and/or its affiliates. All rights reserved.
10 Unlocking Value of ALL Data Video and Images Big Data: Decisions based on all your data Documents Social Data Traditional Architecture: Decisions based on database data 10Copyright 2013, Oracle and/or its affiliates. All rights reserved. Machine-Generated Data Transactions
11 Why Hadoop? Scales to thousands of nodes, TB of structured and unstructured data Combines data from multiple sources, schemaless Run queries against all of the data Runs on commodity servers, handle storage and processing Data replicated, self-healing Initially just batch (Map/Reduce) processing Extending with interactive querying, via Apache Drill, Cloudera Impala, Stinger etc. 11Copyright 2013, Oracle and/or its affiliates. All rights reserved.
12 MySQL + Hadoop: Unlocking the Power of Big Data 50% of our users integrate with MySQL* Download the MySQL Guide to Big Data: *Leading Hadoop Vendor 12Copyright 2013, Oracle and/or its affiliates. All rights reserved.
13 Leading Use-Case, On-Line Retail Users Browsing Profile, Purchase History Recommendations Recommendations Social media updates Preferences Brands Liked Web Logs: Pages Viewed Comments Posted Telephony Stream 13Copyright 2013, Oracle and/or its affiliates. All rights reserved.
14 MySQL Applier for Hadoop: Big Data Lifecycle DECIDE ANALYZE 14Copyright 2013, Oracle and/or its affiliates. All rights reserved. ACQUIRE ORGANIZE
15 MySQL in the Big Data Lifecycle DECIDE ACQUIRE BI Solutions Applier ANALYZE 15Copyright 2013, Oracle and/or its affiliates. All rights reserved. ORGANIZE
16 Apache Sqoop Apache TLP, part of Hadoop project Developed by Cloudera Bulk data import and export Between Hadoop (HDFS) and external data stores JDBC Connector architecture Supports plug-ins for specific functionality Fast Path Connector developed for MySQL 16Copyright 2013, Oracle and/or its affiliates. All rights reserved.
17 Importing Data Submit Map Only Job Sqoop Import Sqoop Job HDFS Storage Map Gather Metadata Map Transactional Data Map Map Hadoop Cluster 17Copyright 2013, Oracle and/or its affiliates. All rights reserved.
18 Ensure Proper Design Performance impact: bulk transfers to and from operational systems Complexity: configuration, usage, error reporting 18Copyright 2013, Oracle and/or its affiliates. All rights reserved.
19 MySQL Applier for Hadoop Real-Time Event Streaming MySQL to HDFS 19Copyright 2013, Oracle and/or its affiliates. All rights reserved.
20 New: MySQL Applier for Hadoop Real-time streaming of events from MySQL to Hadoop Supports move towards Speed of Thought analytics Connects to the binary log, writes events to HDFS via libhdfs library Each database table mapped to a Hive data warehouse directory Enables eco-system of Hadoop tools to integrate with MySQL data See dev.mysql.com for articles Available for download now labs.mysql.com 20Copyright 2013, Oracle and/or its affiliates. All rights reserved.
21 MySQL Applier for Hadoop: Basics Replication Architecture 21Copyright 2013, Oracle and/or its affiliates. All rights reserved.
22 MySQL Applier for Hadoop: Basics What is MySQL Applier for Hadoop? An utility which will allow you to transfer data from MySQL to HDFS. Reads binary log from server on a real time basis Uri for connecting to HDFS: const *uri= ; Network Transport: const *uri= ; Decode binary log events Contain code to decode the events Uses the user defined content handler, and if nothing is specified then the default one in order to process events Cannot handle all events Event Driven API 22Copyright 2013, Oracle and/or its affiliates. All rights reserved.
23 MySQL Applier for Hadoop: Basics Event driven API: Content Handlers 23Copyright 2013, Oracle and/or its affiliates. All rights reserved.
24 MySQL Applier for Hadoop: Implementation Replicates rows inserted into a table in MySQL to Hadoop Distributed File System Uses an API provided by libhdfs, a C library to manipulate files in HDFS The library comes pre-compiled with Hadoop Distributions Connects to the MySQL master (or reads the binary log generated by MySQL) to: Fetch the row insert events occurring on the master Decode these events, extracting data inserted into each field of the row Separate the data by the desired field delimiters and row delimiters Use content handlers to get it in the format required Append it to a text file in HDFS 24Copyright 2013, Oracle and/or its affiliates. All rights reserved.
25 Installation: Pre-requisites Hadoop Applier package from Hadoop or later Java version 6 or later (since Hadoop is written in Java) libhdfs (it comes pre compiled with Hadoop distros) Cmake 2.6 or greater Libmysqlclient 5.6 Openssl Gcc MySQL Server 5.6 FindHDFS.cmake (cmake file to find libhdfs library while compiling) FindJNI.cmake (optional, check if you already have one): $locate FindJNI.cmake Hive (optional) 25Copyright 2013, Oracle and/or its affiliates. All rights reserved.
26 Installation: Steps Download a copy of Hadoop Applier from Download a Hadoop release, configure dfs to set the append property true (flag dfs.support.append) Set the environment variable $HADOOP_HOME Ensure that 'FindHDFS.cmake' is in the CMAKE_MODULE_PATH libhdfs being JNI based, JNI header files and libraries are also required Build and install the library 'libreplication', using cmake Set the CLASSPATH $export PATH= $HADOOP_HOME/bin:$PATH $export CLASSPATH= $(hadoop classpath) Compile by the command make happlier on the terminal. This will create an executable file in the mysql2hdfs directory in the repository 26Copyright 2013, Oracle and/or its affiliates. All rights reserved.
27 Integration with HIVE Hive runs on top of Hadoop. Install HIVE on the hadoop master node Set the default datawarehouse directory same as the base directory into which SQL Query Hadoop Applier writes Create similar schema's on both MySQL & CREATE TABLE t (i INT); Hive Timestamps are inserted as first field in HDFS files Data is stored in 'datafile1.txt' by default The working directory is base_dir/db_name.db/tb_name 27Copyright 2013, Oracle and/or its affiliates. All rights reserved. Hive QL CREATE TABLE t ( time_stamp INT, i INT) [ROW FORMAT DELIMITED] STORED AS TEXTFILE;
28 Mapping Between MySQL and HDFS Schema 28Copyright 2013, Oracle and/or its affiliates. All rights reserved.
29 Hadoop Applier in Action Run hadoop dfs (namenode and datanode) Start a MySQL server as master with row based replication For ex: using mtr: $MySQL-5.6/mysql-test$./mtr --start suite=rpl --mysqld=-binlog_format='row' mysqld=-binlog_checksum=none Start hive (optional) Run the executable./happlier $./happlier [mysql-uri] [hdfs-uri] Data being replicated can be controlled by command line options. $./happlier help Find the demo here: feature=player_detailpage&v=mzratcu3m1g 29Copyright 2013, Oracle and/or its affiliates. All rights reserved. Options Use -r, --field-delimiter=delim String separating the fields in a row -w, --row-delimiter=delim String separating the rows of a table -d, --databases=db_list Ex: d=db_name1-table1table2,dbname2table1,dbname3 Import entries for some databases, optionally include only specified tables. -f, --fields=list Import entries for some fields only in a table -h, --help Display help
30 Future Road Map Under Consideration Support all kinds of events occurring on the MySQL master Support binlog checksums and GTID's, the new features in MySQL 5.6 GA Considering support for DDL's Considering support for updates and deletes Leave comments on the blog! 30Copyright 2013, Oracle and/or its affiliates. All rights reserved.
31 MySQL in the Big Data Lifecycle DECIDE Analyze Export Data Decide ANALYZE 31Copyright 2013, Oracle and/or its affiliates. All rights reserved.
32 Analyze Big Data 32Copyright 2013, Oracle and/or its affiliates. All rights reserved.
33 Summary MySQL + Hadoop: proven big data pipeline Real-Time integration: MySQL Applier for Hadoop Perfect complement for Hadoop interactive query tools Integrate for Insight 33Copyright 2013, Oracle and/or its affiliates. All rights reserved.
34 Next Steps 34Copyright 2013, Oracle and/or its affiliates. All rights reserved.
35 Thank you! 35Copyright 2013, Oracle and/or its affiliates. All rights reserved.
Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam email@example.com Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A
Big Data: Are You Ready? Kevin Lancaster Director, Engineered Systems Oracle Europe, Middle East & Africa 1 A Data Explosion... Traditional Data Sources Billing engines Custom developed New, Non-Traditional
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
Oracle University Contact Us: Local: 1800 103 4775 Intl: +91 80 40291196 Oracle Big Data Essentials Duration: 3 Days What you will learn This Oracle Big Data Essentials training deep dives into using the
MySQL and Hadoop Big Data Integration Unlocking New Insight A MySQL White Paper December 2012 Table of Contents Introduction... 3 The Lifecycle of Big Data... 4 MySQL in the Big Data Lifecycle... 4 Acquire:
The NewSQL database you ll never outgrow Integrating with Hadoop Hadoop is an open source framework for managing and manipulating massive volumes of data. is an database for handling high velocity data.
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data
Prepared By : Manoj Kumar Joshi & Vikas Sawhney General Agenda Introduction to Hadoop Architecture Acknowledgement Thanks to all the authors who left their selfexplanatory images on the internet. Thanks
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics April 10, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
From Dolphins to Elephants: Real-Time MySQL to Hadoop Replication with Tungsten MC Brown, Director of Documentation Linas Virbalas, Senior Software Engineer. About Tungsten Replicator Open source drop-in
Are You Ready for Big Data? Jim Gallo National Director, Business Analytics February 11, 2013 Agenda What is Big Data? How do you leverage Big Data in your company? How do you prepare for a Big Data initiative?
Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,
Big Data Big Data/Data Analytics & Software Development Danairat T. firstname.lastname@example.org, 081-559-1446 1 Agenda Big Data Overview Business Cases and Benefits Hadoop Technology Architecture Big Data Development
International Journal of Scientific and Research Publications, Volume 5, Issue 7, July 2015 1 Internals of Hadoop Application Framework and Distributed File System Saminath.V, Sangeetha.M.S Abstract- Hadoop
Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools
THE HADOOP DISTRIBUTED FILE SYSTEM Konstantin Shvachko, Hairong Kuang, Sanjay Radia, Robert Chansler Presented by Alexander Pokluda October 7, 2013 Outline Motivation and Overview of Hadoop Architecture,
Hadoop for MySQL DBAs + 1 About me Sarah Sproehnle, Director of Educational Services @ Cloudera Spent 5 years at MySQL At Cloudera for the past 2 years email@example.com 2 What is Hadoop? An open-source
Apache Sentry Prasad Mujumdar firstname.lastname@example.org email@example.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture
IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
Spring,2015 Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE Contents: Briefly About Big Data Management What is hive? Hive Architecture Working
Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking
Welcome to Virtual Developer Day MySQL! Keynote: Developer and DBA Guide to What s New in MySQL Andrew Morgan - MySQL Product Management @andrewmorgan www.clusterdb.com 1 Program Agenda 1:00 PM Keynote:
Alexander Rubin Principle Architect, Percona April 18, 2015 Using Hadoop Together with MySQL for Data Analysis About Me Alexander Rubin, Principal Consultant, Percona Working with MySQL for over 10 years
Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give
An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct
Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/
Hadoop Distributed File System Dhruba Borthakur Apache Hadoop Project Management Committee firstname.lastname@example.org email@example.com Hadoop, Why? Need to process huge datasets on large clusters of computers
1 Hadoop Job Oriented Training Agenda Kapil CK firstname.lastname@example.org Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module
An Oracle White Paper June 2013 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure
A Seminar report On Hadoop Submitted in partial fulfillment of the requirement for the award of degree of Bachelor of Technology in Computer Science SUBMITTED TO: www.studymafia.org SUBMITTED BY: www.studymafia.org
Big Data Big Deal for Public Sector Organizations Hoàng Xuân Hiếu Director, FAB & Government Business Indochina & Myanmar 1 Copyright 2013, Oracle and/or its affiliates. All rights reserved. The following
Big Data Too Big To Ignore Geert! Big Data Consultant and Manager! Currently finishing a 3 rd Big Data project! IBM & Cloudera Certified! IBM & Microsoft Big Data Partner 2 Agenda! Defining Big Data! Introduction
CS 698: Special Topics in Big Data Chapter 4. Big Data Analytics Platforms Chase Wu New Jersey Ins0tute of Technology Some of the slides have been provided through the courtesy of Dr. Ching-Yung Lin at
Hadoop Big Data for Processing Data and Performing Workload Girish T B 1, Shadik Mohammed Ghouse 2, Dr. B. R. Prasad Babu 3 1 M Tech Student, 2 Assosiate professor, 3 Professor & Head (PG), of Computer
Beyond Web Application Log Analysis using Apache TM Hadoop A Whitepaper by Orzota, Inc. 1 Web Applications As more and more software moves to a Software as a Service (SaaS) model, the web application has
With Saurabh Singh email@example.com The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials
Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
Data Warehousing and Analytics Infrastructure at Facebook Ashish Thusoo & Dhruba Borthakur athusoo,firstname.lastname@example.org Overview Challenges in a Fast Growing & Dynamic Environment Data Flow Architecture,
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
Advanced Fraud Detection & Prevention Through Big Data Mark Johnson Director, Engineered Systems, Oracle Public Sector 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. The following
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08
ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?
An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,
Use Data from a Hadoop Cluster with Oracle Database Hands-On Lab Lab Structure Acronyms: OLH: Oracle Loader for Hadoop OSCH: Oracle SQL Connector for Hadoop Distributed File System (HDFS) All files are
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
Hadoop & its Usage at Facebook Dhruba Borthakur Project Lead, Hadoop Distributed File System email@example.com Presented at the The Israeli Association of Grid Technologies July 15, 2009 Outline Architecture
Oracle Big Data Discovery Unlock Potential in Big Data Reservoir Gokula Mishra Premjith Balakrishnan Business Analytics Product Group September 29, 2014 Copyright 2014, Oracle and/or its affiliates. All
SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce
Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case
Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop
201408 Cloudera Manager Training: Hands-On Exercises General Notes... 2 In- Class Preparation: Accessing Your Cluster... 3 Self- Study Preparation: Creating Your Cluster... 4 Hands- On Exercise: Working
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
Big Data and Hadoop Module 1: Introduction to Big Data and Hadoop Learn about Big Data and the shortcomings of the prevailing solutions for Big Data issues. You will also get to know, how Hadoop eradicates
COSC 6397 Big Data Analytics 2 nd homework assignment Pig and Hive Edgar Gabriel Spring 2015 2 nd Homework Rules Each student should deliver Source code (.java files) Documentation (.pdf,.doc,.tex or.txt
Hortonworks & SAS Analytics everywhere. Page 1 A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
How to Install and Configure EBF15328 for MapR 4.0.1 or 4.0.2 with MapReduce v1 1993-2015 Informatica Corporation. No part of this document may be reproduced or transmitted in any form, by any means (electronic,
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
Using distributed technologies to analyze Big Data Abhijit Sharma Innovation Lab BMC Software 1 Data Explosion in Data Center Performance / Time Series Data Incoming data rates ~Millions of data points/
Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner firstname.lastname@example.org @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily
Move Data from Oracle to Hadoop and Gain New Business Insights Written by Lenka Vanek, senior director of engineering, Dell Software Abstract Today, the majority of data for transaction processing resides
Hadoop Distributed File System Dhruba Borthakur June, 2007 Goals of HDFS Very Large Distributed File System 10K nodes, 100 million files, 10 PB Assumes Commodity Hardware Files are replicated to handle
Value 10/06/2015 2015 MapR Technologies 2015 MapR Technologies 1 Information Builders Mission & Value Proposition Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns