Big Data with MySQL. Percona Live Santa Clara Ivan Zoratti V Friday, 3 May 13
|
|
- Anthony Palmer
- 7 years ago
- Views:
Transcription
1 Big Data with MySQL Percona Live Santa Clara 2013 Ivan Zoratti V
2 Who is Ivan?
3 SkySQL Leading provider of open source databases, services and solutions Home for the founders and the original developers of the core of MySQL The creators of MariaDB, the drop-off, innovative replacement of MySQL
4 What is Big Data?
5 Big Data! Big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. 5
6 Big Data By Structure Unstructured Store everything you have/you find In any format and shape You do not know how to use it, but it may come handy Storing unstructured data is usually cheaper than storing it in a more structured datastore Does not fit well in a relational database Examples: Text: Plain text, documents, web content, messages Bitmap: Image, audio, video Typical approach: Mining, pattern recognition, tagging Usually batch analysis Structured Store only what you need In a good format, ready to be used You should already know how to use it, or at least what it means Storing structured data is quite expensive Raw data, indexing, denormalisation, aggregation A relational database is still the best choice Examples: Machine-Generated Data (MGD) Tags, counters, sales Typical approach: BI tools, reporting Real time analysis change data capture 6
7 Big Data By Structure Unstructured Store everything you have/you find In any format and shape You do not know how to use it, but it may come handy Storing unstructured data is usually cheaper than storing it in a more structured datastore Does not fit well in a relational database Examples: Text: Plain text, documents, web content, messages Bitmap: Image, audio, video Typical approach: Mining, pattern recognition, tagging Usually batch analysis Structured Store only what you need In a good format, ready to be used You should already know how to use it, or at least what it means Storing structured data is quite expensive Raw data, indexing, denormalisation, aggregation A relational database is still the best choice Examples: Machine-Generated Data (MGD) Tags, counters, sales Typical approach: BI tools, reporting Real time analysis change data capture 7
8 How Big is Big Data? Data Factors Size Speed to collect/ generate Variety Resources Administrators Developers Growth Collection Processing Availability To whom? For how long? In which format? Aggregated Detailed Infrastructure 8
9 How to manage Big Data Collection - Storage - Archive Load - Transform - Analyze Access - Explore - Utilize 9
10 Big Data with MySQL
11 Technologies to Use / Consider / Watch MyISAM and MyISAM compression Cassandra Storage Engine InnoDB compression Connect Storage Engine MySQL 5.6 Partitioning MariaDB Optimizer Columnar Databases InfiniDB Infobright MariaDB Virtual & Dynamic Columns TokuDB Storage Engine 11
12 Columnar Databases Automatic compression Automatic column storage Data distribution Map/Reduce approach MPP / Parallel loading No indexes On public clouds, HW or SW appliances 12
13 TokuDB Increased Performance Increased Compression Online administration No Index rebuild 13
14 MyISAM Static, dynamic and compressed format Multiple key cache, CACHE INDEX and LOAD INDEX Compressed tables Horizontal partitioning (manual) External locking 14
15 InnoDB/XtraDB Data Load Pre-order data Split data into chunks unique_checks = 0; foreign_key_checks = 0; sql_log_bin = 0; innodb_autoinc_lock_mode = 2; Compression and block size SET GLOBAL innodb_file_per_table = 1; SET GLOBAL innodb_file_format = Barracuda; CREATE TABLE t1 ( c1 INT PRIMARY KEY, c2 VARCHAR(255) ) ROW_FORMAT = COMPRESSED KEY_BLOCK_SIZE = 8; LOAD DATA LOCAL INFILE '/usr2/t1_01_simple' INTO TABLE t1; Query OK, rows affected (1 hour 34 min 7.49 sec) Records: Deleted: 0 Skipped: 0 Warnings: 0 LOAD DATA LOCAL INFILE '/usr2/t1_01_simple' INTO TABLE t2; Query OK, rows affected (25 min sec) Records: Deleted: 0 Skipped: 0 Warnings: 0 Persistent optimizer stats innodb_stats_persistent innodb_stats_auto_recalc 15
16 Partitioning (MySQL 5.6) Partitioning Types RANGE, LIST, RANGE COLUMN, HASH, LINEAR HASH, KEY LINEAR KEY, sub-partitions Partition and lock pruning Use of INDEX and DATA DIRECTORY PARTITION ADD, DROP, REORGANIZE, COALESCE, TRUNCATE, EXCHANGE, REBUILD, OPTIMIZE, CHECK, ANALYZE, REPAIR CREATE TABLE t1 ( c1 INT, c2 DATE ) PARTITION BY RANGE( YEAR( c2 ) ) SUBPARTITION BY HASH ( TO_DAYS( c2 ) ) ( PARTITION p0 VALUES LESS THAN (1990) ( SUBPARTITION s0 DATA DIRECTORY = '/disk0/data' INDEX DIRECTORY = '/disk0/idx', SUBPARTITION s1 DATA DIRECTORY = '/disk1/data' INDEX DIRECTORY = '/disk1/idx' ),... ALTER TABLE t1 EXCHANGE PARTITION p3 WITH TABLE t2; -- Range and List partitions ALTER TABLE t1 REORGANIZE PARTITION p0,p1,p2,p3 INTO ( PARTITION m0 VALUES LESS THAN (1980), PARTITION m1 VALUES LESS THAN (2000)); -- Hash and Key partitions ALTER TABLE t1 COALESCE PARTITION 10; ALTER TABLE t1 ADD PARTITION PARTITIONS 5; 16
17 MariaDB Optimizer Multi-Range Read (MRR) * ORDER BY... LIMIT <limit> * MariaDB 10 Index Merge / Sort intersection Batch Key Access * Block hash join Subqueries Semi-join * Materialization * subquery cache Cost-based choice of range vs. index_merge LIMIT... ROWS EXAMINED <limit> (*) - Available in MySQL
18 Virtual & Dynamic Columns VIRTUAL COLUMNS For InnoDB, MyISAM and Aria PERSISTENT (stored) or VIRTUAL (generated) CREATE TABLE t1 ( c1 INT NOT NULL, c2 VARCHAR(32), c3 INT AS ( c1 MOD 10 ) VIRTUAL, c4 VARCHAR(5) AS ( LEFT(B,5) ) PERSISTENT); DYNAMIC COLUMNS Implement a schemaless, document store COLUMN_ CREATE, ADD, GET, LIST, JSON, EXISTS, CHECK, DELETE Nested colums are allowed Main datatypes are allowed Max 1GB documents CREATE TABLE assets ( item_name VARCHAR(32) PRIMARY KEY, dynamic_cols BLOB ); INSERT INTO assets VALUES ( 'MariaDB T-shirt', COLUMN_CREATE( 'color', 'blue', 'size', 'XL' ) ); INSERT INTO assets VALUES ( 'Thinkpad Laptop', COLUMN_CREATE( 'color', 'black', 'price', 500 ) ); 18
19 Cassandra Storage Engine Column Family == Table Rowkey, static and dynamic columns allowed Batch key access support SET cassandra_default_thrift_host = ' ' CREATE TABLE cassandra_tbl ( rowkey INT PRIMARY KEY, col1 VARCHAR(25), col2 BIGINT, dyn_cols BLOB DYNAMIC_COLUMN_STORAGE = yes ) ENGINE = cassandra KEYSPACE = 'cassandra_key_space' COLUMN_FAMILY = 'column_family_name'; 19
20 Connect Storage Engine Any file format as MySQL TABLE: ODBC Text, XML, *ML Excel, Access etc. MariaDB CREATE TABLE options Multi-file table Table Autocreation CREATE TABLE handout ENGINE = CONNECT TABLE_TYPE = XML FILE_NAME = 'handout.htm' HEADER = yes OPTION_LIST = 'name = TABLE, coltype = HTML, attribute = (border=1;cellpadding=5)'; Condition push down Read/Write and Multi Storage Engine Join CREATE INDEX 20
21 Starting Your Big Data Project
22 Why would you use MySQL? Time Knowledge Infrastructure Costs Simplified Integration Not so big data 22
23 Apache Hadoop & Friends Mahout PIG HIVE Ambari, Ganglia, Nagios HCatalog HBASE ZooKeeper Sqoop Cascading Oozie Flume MapReduce Protobuf, Avro, Thrift Fuse-DFS Chukwa HDFS Cassandra 23
24 MySQL & Friends Scripts Stored Procedures DML DB Schema / DDL MySQL/MariaDB SkySQL DS Mahout SDS, Ganglia, Nagios mysqlimport Cascading Talend, Pentaho Connect SQL Optimizer MySQL/MariaDB/Storage Engines 24
25 Join us at the Solutions Day Cassandra and Connect Storage Engine Map/Reduce approach - Proxy optimisation Multiple protocols and more 25
26 Thank You! izoratti.blogspot.com
Partitioning under the hood in MySQL 5.5
Partitioning under the hood in MySQL 5.5 Mattias Jonsson, Partitioning developer Mikael Ronström, Partitioning author Who are we? Mikael is a founder of the technology behind NDB
More informationProgramming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview
Programming Hadoop 5-day, instructor-led BD-106 MapReduce Overview The Client Server Processing Pattern Distributed Computing Challenges MapReduce Defined Google's MapReduce The Map Phase of MapReduce
More informationMariaDB Cassandra interoperability
MariaDB Cassandra interoperability Cassandra Storage Engine in MariaDB Sergei Petrunia Colin Charles Who are we Sergei Petrunia Principal developer of CassandraSE, optimizer developer, formerly from MySQL
More informationBringing Big Data to People
Bringing Big Data to People Microsoft s modern data platform SQL Server 2014 Analytics Platform System Microsoft Azure HDInsight Data Platform Everyone should have access to the data they need. Process
More informationPeers Techno log ies Pv t. L td. HADOOP
Page 1 Peers Techno log ies Pv t. L td. Course Brochure Overview Hadoop is a Open Source from Apache, which provides reliable storage and faster process by using the Hadoop distibution file system and
More informationNew Features in MySQL 5.0, 5.1, and Beyond
New Features in MySQL 5.0, 5.1, and Beyond Jim Winstead jimw@mysql.com Southern California Linux Expo February 2006 MySQL AB 5.0: GA on 19 October 2005 Expanded SQL standard support: Stored procedures
More informationSQL Server 2012 PDW. Ryan Simpson Technical Solution Professional PDW Microsoft. Microsoft SQL Server 2012 Parallel Data Warehouse
SQL Server 2012 PDW Ryan Simpson Technical Solution Professional PDW Microsoft Microsoft SQL Server 2012 Parallel Data Warehouse Massively Parallel Processing Platform Delivers Big Data HDFS Delivers Scale
More informationIntroduction to Big Data Training
Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB
More informationBig Data Course Highlights
Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like
More informationHADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM
HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM 1. Introduction 1.1 Big Data Introduction What is Big Data Data Analytics Bigdata Challenges Technologies supported by big data 1.2 Hadoop Introduction
More informationSQL on NoSQL (and all of the data) With Apache Drill
SQL on NoSQL (and all of the data) With Apache Drill Richard Shaw Solutions Architect @aggress Who What Where NoSQL DB Very Nice People Open Source Distributed Storage & Compute Platform (up to 1000s of
More informationDelivering Intelligence to Publishers Through Big Data
Delivering Intelligence to Publishers Through Big Data 2015-05- 21 Jonathan Sharley Team Lead, Data Operations www.sovrn.com Who is Sovrn? Ø An advertising network with direct relationships to 20,000+
More informationFrom Relational to Hadoop Part 2: Sqoop, Hive and Oozie. Gwen Shapira, Cloudera and Danil Zburivsky, Pythian
From Relational to Hadoop Part 2: Sqoop, Hive and Oozie Gwen Shapira, Cloudera and Danil Zburivsky, Pythian Previously we 2 Loaded a file to HDFS Ran few MapReduce jobs Played around with Hue Now its time
More informationCOURSE CONTENT Big Data and Hadoop Training
COURSE CONTENT Big Data and Hadoop Training 1. Meet Hadoop Data! Data Storage and Analysis Comparison with Other Systems RDBMS Grid Computing Volunteer Computing A Brief History of Hadoop Apache Hadoop
More informationComparing SQL and NOSQL databases
COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2015 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations
More informationComplete Java Classes Hadoop Syllabus Contact No: 8888022204
1) Introduction to BigData & Hadoop What is Big Data? Why all industries are talking about Big Data? What are the issues in Big Data? Storage What are the challenges for storing big data? Processing What
More informationCloudera Certified Developer for Apache Hadoop
Cloudera CCD-333 Cloudera Certified Developer for Apache Hadoop Version: 5.6 QUESTION NO: 1 Cloudera CCD-333 Exam What is a SequenceFile? A. A SequenceFile contains a binary encoding of an arbitrary number
More informationbrief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 PART 2 PART 3 BIG DATA PATTERNS...253 PART 4 BEYOND MAPREDUCE...385
brief contents PART 1 BACKGROUND AND FUNDAMENTALS...1 1 Hadoop in a heartbeat 3 2 Introduction to YARN 22 PART 2 DATA LOGISTICS...59 3 Data serialization working with text and beyond 61 4 Organizing and
More informationBITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand?
BITKOM& NIK - Big Data Wo liegen die Chancen für den Mittelstand? The Big Data Buzz big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database
More informationGetting Started with Hadoop. Raanan Dagan Paul Tibaldi
Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop
More informationHadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
More informationApache Sqoop. A Data Transfer Tool for Hadoop
Apache Sqoop A Data Transfer Tool for Hadoop Arvind Prabhakar, Cloudera Inc. Sept 21, 2011 What is Sqoop? Allows easy import and export of data from structured data stores: o Relational Database o Enterprise
More informationConstructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
More informationNew Modeling Challenges: Big Data, Hadoop, Cloud
New Modeling Challenges: Big Data, Hadoop, Cloud Karen López @datachick www.datamodel.com Karen Lopez Love Your Data InfoAdvisors.com @datachick Senior Project Manager & Architect 1 Disclosure I m a Data
More informationBig Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016
Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible
More informationBig Data and Hadoop. Module 1: Introduction to Big Data and Hadoop. Module 2: Hadoop Distributed File System. Module 3: MapReduce
Big Data and Hadoop Module 1: Introduction to Big Data and Hadoop Learn about Big Data and the shortcomings of the prevailing solutions for Big Data issues. You will also get to know, how Hadoop eradicates
More informationWorkshop on Hadoop with Big Data
Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly
More informationITG Software Engineering
Introduction to Apache Hadoop Course ID: Page 1 Last Updated 12/15/2014 Introduction to Apache Hadoop Course Overview: This 5 day course introduces the student to the Hadoop architecture, file system,
More information[Type text] Week. National summer training program on. Big Data & Hadoop. Why big data & Hadoop is important?
1 Week National summer training program on Big Data & Hadoop Why big data & Hadoop is important? Highlights of Big Data & Hadoop Implement a Hadoop Project Learn to write Complex MapReduce programs Perform
More informationHDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
More informationQsoft Inc www.qsoft-inc.com
Big Data & Hadoop Qsoft Inc www.qsoft-inc.com Course Topics 1 2 3 4 5 6 Week 1: Introduction to Big Data, Hadoop Architecture and HDFS Week 2: Setting up Hadoop Cluster Week 3: MapReduce Part 1 Week 4:
More informationHadoop Usage At Yahoo! Milind Bhandarkar (milindb@yahoo-inc.com)
Hadoop Usage At Yahoo! Milind Bhandarkar (milindb@yahoo-inc.com) About Me Parallel Programming since 1989 High-Performance Scientific Computing 1989-2005, Data-Intensive Computing 2005 -... Hadoop Solutions
More informationApache Hadoop: The Pla/orm for Big Data. Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah
Apache Hadoop: The Pla/orm for Big Data Amr Awadallah CTO, Founder, Cloudera, Inc. aaa@cloudera.com, twicer: @awadallah 1 The Problems with Current Data Systems BI Reports + Interac7ve Apps RDBMS (aggregated
More informationHadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
More informationBIG DATA HADOOP TRAINING
BIG DATA HADOOP TRAINING DURATION 40hrs AVAILABLE BATCHES WEEKDAYS (7.00AM TO 8.30AM) & WEEKENDS (10AM TO 1PM) MODE OF TRAINING AVAILABLE ONLINE INSTRUCTOR LED CLASSROOM TRAINING (MARATHAHALLI, BANGALORE)
More informationBig Data Management. Big Data Management. (BDM) Autumn 2013. Povl Koch November 11, 2013 10-11-2013 1
Big Data Management Big Data Management (BDM) Autumn 2013 Povl Koch November 11, 2013 10-11-2013 1 Overview Today s program 1. Little more practical details about this course 2. Recap from last time (Google
More informationBIG DATA TECHNOLOGY. Hadoop Ecosystem
BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big
More informationDatabase Administration with MySQL
Database Administration with MySQL Suitable For: Database administrators and system administrators who need to manage MySQL based services. Prerequisites: Practical knowledge of SQL Some knowledge of relational
More informationBig Data and Analytics by Seema Acharya and Subhashini Chellappan Copyright 2015, WILEY INDIA PVT. LTD. Introduction to Pig
Introduction to Pig Agenda What is Pig? Key Features of Pig The Anatomy of Pig Pig on Hadoop Pig Philosophy Pig Latin Overview Pig Latin Statements Pig Latin: Identifiers Pig Latin: Comments Data Types
More informationA Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
More informationBig Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014
Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools
More informationDominik Wagenknecht Accenture
Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationLecture 10: HBase! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl
Big Data Processing, 2014/15 Lecture 10: HBase!! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind the
More informationUse Your MySQL Knowledge to Become an Instant Cassandra Guru
Use Your MySQL Knowledge to Become an Instant Cassandra Guru Percona Live Santa Clara 2014 Robert Hodges CEO Continuent Tim Callaghan VP/Engineering Tokutek Who are we? Robert Hodges CEO at Continuent
More informationInfomatics. Big-Data and Hadoop Developer Training with Oracle WDP
Big-Data and Hadoop Developer Training with Oracle WDP What is this course about? Big Data is a collection of large and complex data sets that cannot be processed using regular database management tools
More informationGAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION
GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.
More informationHive Development. (~15 minutes) Yongqiang He Software Engineer. Facebook Data Infrastructure Team
Hive Development (~15 minutes) Yongqiang He Software Engineer Facebook Data Infrastructure Team Agenda 1 Introduction 2 New Features 3 Future What is Hive? A system for managing and querying structured
More informationt] open source Hadoop Beginner's Guide ij$ data avalanche Garry Turkington Learn how to crunch big data to extract meaning from
Hadoop Beginner's Guide Learn how to crunch big data to extract meaning from data avalanche Garry Turkington [ PUBLISHING t] open source I I community experience distilled ftu\ ij$ BIRMINGHAMMUMBAI ')
More informationAPACHE HADOOP JERRIN JOSEPH CSU ID#2578741
APACHE HADOOP JERRIN JOSEPH CSU ID#2578741 CONTENTS Hadoop Hadoop Distributed File System (HDFS) Hadoop MapReduce Introduction Architecture Operations Conclusion References ABSTRACT Hadoop is an efficient
More informationHBase Schema Design. NoSQL Ma4ers, Cologne, April 2013. Lars George Director EMEA Services
HBase Schema Design NoSQL Ma4ers, Cologne, April 2013 Lars George Director EMEA Services About Me Director EMEA Services @ Cloudera ConsulFng on Hadoop projects (everywhere) Apache Commi4er HBase and Whirr
More informationHadoop Job Oriented Training Agenda
1 Hadoop Job Oriented Training Agenda Kapil CK hdpguru@gmail.com Module 1 M o d u l e 1 Understanding Hadoop This module covers an overview of big data, Hadoop, and the Hortonworks Data Platform. 1.1 Module
More informationOracle SQL. Course Summary. Duration. Objectives
Oracle SQL Course Summary Identify the major structural components of the Oracle Database 11g Create reports of aggregated data Write SELECT statements that include queries Retrieve row and column data
More informationAlexander Rubin Principle Architect, Percona April 18, 2015. Using Hadoop Together with MySQL for Data Analysis
Alexander Rubin Principle Architect, Percona April 18, 2015 Using Hadoop Together with MySQL for Data Analysis About Me Alexander Rubin, Principal Consultant, Percona Working with MySQL for over 10 years
More informationData Services Advisory
Data Services Advisory Modern Datastores An Introduction Created by: Strategy and Transformation Services Modified Date: 8/27/2014 Classification: DRAFT SAFE HARBOR STATEMENT This presentation contains
More informationBuilding Scalable Big Data Infrastructure Using Open Source Software. Sam William sampd@stumbleupon.
Building Scalable Big Data Infrastructure Using Open Source Software Sam William sampd@stumbleupon. What is StumbleUpon? Help users find content they did not expect to find The best way to discover new
More informationCollaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!
More informationHadoop: The Definitive Guide
FOURTH EDITION Hadoop: The Definitive Guide Tom White Beijing Cambridge Famham Koln Sebastopol Tokyo O'REILLY Table of Contents Foreword Preface xvii xix Part I. Hadoop Fundamentals 1. Meet Hadoop 3 Data!
More informationData-Intensive Programming. Timo Aaltonen Department of Pervasive Computing
Data-Intensive Programming Timo Aaltonen Department of Pervasive Computing Data-Intensive Programming Lecturer: Timo Aaltonen University Lecturer timo.aaltonen@tut.fi Assistants: Henri Terho and Antti
More informationData warehousing with PostgreSQL
Data warehousing with PostgreSQL Gabriele Bartolini http://www.2ndquadrant.it/ European PostgreSQL Day 2009 6 November, ParisTech Telecom, Paris, France Audience
More informationHareDB HBase Client Web Version USER MANUAL HAREDB TEAM
2013 HareDB HBase Client Web Version USER MANUAL HAREDB TEAM Connect to HBase... 2 Connection... 3 Connection Manager... 3 Add a new Connection... 4 Alter Connection... 6 Delete Connection... 6 Clone Connection...
More informationBig Data? Definition # 1: Big Data Definition Forrester Research
Big Data Big Data? Definition # 1: Big Data Definition Forrester Research Big Data? Definition # 2: Quote of Tim O Reilly brings it all home: Companies that have massive amounts of data without massive
More informationNew Modeling Challenges: Big Data, Hadoop, Cloud
New Modeling Challenges: Big Data, Hadoop, Cloud Karen López @datachick Karen Lopez Love Your Data InfoAdvisors.com @datachick Senior Project Manager & Architect 1 Abstract We data architects have done
More informationPractical Hadoop by Example
Practical Hadoop by Example for relational database professioanals Alex Gorbachev 12-Mar-2013 New York, NY Alex Gorbachev Chief Technology Officer at Pythian Blogger OakTable Network member Oracle ACE
More informationDatabase Migration from MySQL to RDM Server
MIGRATION GUIDE Database Migration from MySQL to RDM Server A Birdstep Technology, Inc. Raima Embedded Database Division Migration Guide Published: May, 2009 Author: Daigoro F. Toyama Senior Software Engineer
More informationUpcoming Announcements
Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within
More informationBig Data Technology CS 236620, Technion, Spring 2013
Big Data Technology CS 236620, Technion, Spring 2013 Structured Databases atop Map-Reduce Edward Bortnikov & Ronny Lempel Yahoo! Labs, Haifa Roadmap Previous class MR Implementation This class Query Languages
More informationWWW.WIPRO.COM HADOOP VENDOR DISTRIBUTIONS THE WHY, THE WHO AND THE HOW? Guruprasad K.N. Enterprise Architect Wipro BOTWORKS
WWW.WIPRO.COM HADOOP VENDOR DISTRIBUTIONS THE WHY, THE WHO AND THE HOW? Guruprasad K.N. Enterprise Architect Wipro BOTWORKS Table of contents 01 Abstract 01 02 03 04 The Why - Need for The Who - Prominent
More informationHadoop and Map-Reduce. Swati Gore
Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data
More informationOpen source Google-style large scale data analysis with Hadoop
Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical
More informationImplement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
More informationIntroduction to Polyglot Persistence. Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace
Introduction to Polyglot Persistence Antonios Giannopoulos Database Administrator at ObjectRocket by Rackspace FOSSCOMM 2016 Background - 14 years in databases and system engineering - NoSQL DBA @ ObjectRocket
More informationMaximizing Hadoop Performance with Hardware Compression
Maximizing Hadoop Performance with Hardware Compression Robert Reiner Director of Marketing Compression and Security Exar Corporation November 2012 1 What is Big? sets whose size is beyond the ability
More informationMoving From Hadoop to Spark
+ Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com sujee@elephantscale.com Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee
More informationESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
More informationPro Apache Hadoop. Second Edition. Sameer Wadkar. Madhu Siddalingaiah
Pro Apache Hadoop Second Edition Sameer Wadkar Madhu Siddalingaiah Contents J About the Authors About the Technical Reviewer Acknowledgments Introduction xix xxi xxiii xxv Chapter 1: Motivation for Big
More informationData Management in SAP Environments
Data Management in SAP Environments the Big Data Impact Berlin, June 2012 Dr. Wolfgang Martin Analyst, ibond Partner und Ventana Research Advisor Data Management in SAP Environments Big Data What it is
More informationBIG DATA AND MICROSOFT. Susie Adams CTO Microsoft Federal
BIG DATA AND MICROSOFT Susie Adams CTO Microsoft Federal THE WORLD OF DATA IS CHANGING Cloud What s making this possible? Electrical efficiency of computers doubles every year and ½. Laptops and mobile
More informationMySQL and Hadoop. Percona Live 2014 Chris Schneider
MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for
More informationHDP Enabling the Modern Data Architecture
HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,
More informationNoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015
NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,
More informationThe Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
More informationExtending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012
Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster Nov 7, 2012 Who I Am Robert Lancaster Solutions Architect, Hotel Supply Team rlancaster@orbitz.com @rob1lancaster Organizer of Chicago
More informationThe Hadoop Eco System Shanghai Data Science Meetup
The Hadoop Eco System Shanghai Data Science Meetup Karthik Rajasethupathy, Christian Kuka 03.11.2015 @Agora Space Overview What is this talk about? Giving an overview of the Hadoop Ecosystem and related
More informationSQL Server 2016 New Features!
SQL Server 2016 New Features! Improvements on Always On Availability Groups: Standard Edition will come with AGs support with one db per group synchronous or asynchronous, not readable (HA/DR only). Improved
More informationHadoop Introduction. Olivier Renault Solution Engineer - Hortonworks
Hadoop Introduction Olivier Renault Solution Engineer - Hortonworks Hortonworks A Brief History of Apache Hadoop Apache Project Established Yahoo! begins to Operate at scale Hortonworks Data Platform 2013
More informationHadoop and Relational Database The Best of Both Worlds for Analytics Greg Battas Hewlett Packard
Hadoop and Relational base The Best of Both Worlds for Analytics Greg Battas Hewlett Packard The Evolution of Analytics Mainframe EDW Proprietary MPP Unix SMP MPP Appliance Hadoop? Questions Is Hadoop
More informationBig Data: Using ArcGIS with Apache Hadoop. Erik Hoel and Mike Park
Big Data: Using ArcGIS with Apache Hadoop Erik Hoel and Mike Park Outline Overview of Hadoop Adding GIS capabilities to Hadoop Integrating Hadoop with ArcGIS Apache Hadoop What is Hadoop? Hadoop is a scalable
More informationBig Data Advanced Analytics for Game Monetization. Kimberly Chulis
Big Data Advanced Analytics for Game Monetization Kimberly Chulis CEO Core Analytics, LLC Core Analytics / Game Loyalty Bay area and Chicago based digital advanced analytics firm Big Data / NoSQL Advanced
More informationBig Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
More informationD61830GC30. MySQL for Developers. Summary. Introduction. Prerequisites. At Course completion After completing this course, students will be able to:
D61830GC30 for Developers Summary Duration Vendor Audience 5 Days Oracle Database Administrators, Developers, Web Administrators Level Technology Professional Oracle 5.6 Delivery Method Instructor-led
More informationHow to Hadoop Without the Worry: Protecting Big Data at Scale
How to Hadoop Without the Worry: Protecting Big Data at Scale SESSION ID: CDS-W06 Davi Ottenheimer Senior Director of Trust EMC Corporation @daviottenheimer Big Data Trust. Redefined Transparency Relevance
More informationBIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION
FACT SHEET BIG DATA & HADOOP DEVELOPER TRAINING & CERTIFICATION BIGDATA & HADOOP CLASS ROOM SESSION GreyCampus provides Classroom sessions for Big Data & Hadoop Developer Certification. This course will
More informationEMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.
EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics
More informationHadoop and MySQL for Big Data
Hadoop and MySQL for Big Data Alexander Rubin October 9, 2013 About Me Alexander Rubin, Principal Consultant, Percona Working with MySQL for over 10 years Started at MySQL AB, Sun Microsystems, Oracle
More informationCloud Scale Distributed Data Storage. Jürmo Mehine
Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented
More informationApache Hadoop: Past, Present, and Future
The 4 th China Cloud Computing Conference May 25 th, 2012. Apache Hadoop: Past, Present, and Future Dr. Amr Awadallah Founder, Chief Technical Officer aaa@cloudera.com, twitter: @awadallah Hadoop Past
More informationBig Data Hive! 2013-2014 Laurent d Orazio
Big Data Hive! 2013-2014 Laurent d Orazio Introduction! Context Parallel computation on large data sets on commodity hardware Hadoop [hadoop] Definition Open source implementation of MapReduce [DG08] Objective
More informationHow To Use Big Data For Telco (For A Telco)
ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call
More information