INTRODUCING DRUID: FAST AD-HOC QUERIES ON BIG DATA MICHAEL DRISCOLL - CEO ERIC TSCHETTER - LEAD METAMARKETS
|
|
|
- Angelica Fields
- 10 years ago
- Views:
Transcription
1 INTRODUCING DRUID: FAST AD-HOC QUERIES ON BIG DATA MICHAEL DRISCOLL - CEO ERIC TSCHETTER - LEAD METAMARKETS
2 MICHAEL E. DRISCOLL METAMARKETS
3 Metamarkets is the bridge from BI to Data Science, a new kind of real-time, analytics tool for the Big Data generation.
4 OUR SOLUTION... Visualization Analytics Data Store ETL
5 OUR SOLUTION... LEVERAGES OPEN SOURCE Visualization (node.js, D3) Analytics (R, Scala) Data Store (Druid*) ETL (Hadoop, Kafka)
6 1,000,000, , ms events daily queries daily avg response Data Store (Druid*)
7 Twitter 8 TB per day Global Businesses 1.8 ZB in 2011 Wal-Mart 2.5 PB stored Facebook 100 TB per day US Library of Congress 235 TB Boeing 640 TB per flight A TIDAL WAVE OF DATA
8 JUST IN CASE THE WIFI FAILS US...
9
10
11
12
13
14
15 ERIC TSCHETTER LEAD METAMARKETS
16
17 REAL-TIME ANALYTICS MEANS Fast aggregations over O(billions), high-dimensional events Flexible drill-downs with arbitrary depths Sub-second queries that reflect changes immediately
18 WHAT WE TRIED
19 WHAT WE TRIED I. RDMBS - Relational Database
20 I. RDBMS - THE SETUP Star Schema Aggregate Tables Query Caching
21 I. RDBMS - THE RESULTS Queries that were cached fast Queries against aggregate tables fast to acceptable Queries against base fact table generally unacceptable
22 I. RDBMS - PERFORMANCE Naive benchmark scan rate ~5.5M rows / second / core 1 day of summarized aggregates 60M+ rows 1 query over 1 week, 16 cores ~5 seconds Page load with 20 queries over a week of data long time
23 WHAT WE TRIED I. RDMBS - Relational Database
24 WHAT WE TRIED I. RDMBS - Relational Database
25 WHAT WE TRIED I. RDMBS - Relational Database II. NoSQL - Key/Value Store
26 II. NOSQL - THE SETUP Pre-aggregate all dimensional combinations Store results in a NoSQL store
27 II. NOSQL - THE RESULTS Queries were fast range scan on primary key Inflexible not aggregated, not available Not continuously updated aggregate first, then display Processing scales exponentially
28 II. NOSQL - PERFORMANCE Dimensional combinations => exponential increase Tried limiting dimensional depth still expands exponentially Example: ~500k records 11 dimensions, 5-depth 4.5 hours on a 15-node Hadoop cluster 14 dimensions, 5-depth 9 hours on a 25-node Hadoop cluster
29 WHAT WE TRIED I. RDMBS - Relational Database II. NoSQL - Key/Value Store
30 WHAT WE TRIED I. RDMBS - Relational Database II. NoSQL - Key/Value Store
31 WHAT WE TRIED I. RDMBS - Relational Database II. NoSQL - Key/Value Store III.???
32 WHAT WE LEARNED Problem with RDBMS: scans are slow Problem with NoSQL: computationally intractable
33 WHAT WE LEARNED Problem with RDBMS: scans are slow Problem with NoSQL: computationally intractable Fixing RDBMS issue seems easier
34 INTRODUCING DRUID
35 DRUID FIVE KEY FEATURES 1.Distributed Column-Store 2.Real-time Ingestion 3.Fast Data Scans 4.Fast Filtering 5.Operationally Simple
36 FEATURE 1 DISTRIBUTED COLUMN-STORE Data chunked into segments MVCC swapping protocol Converted to columnar format Coordinator oversees cluster Data replication and balance Per-segment replication Increase replication on hot spots
37 FEATURE 2 REAL-TIME INGESTION Historical Nodes Query API
38 FEATURE 2 REAL-TIME INGESTION Historical Nodes Query Rewrite Scatter/Gather Query API Broker Nodes Query API
39 FEATURE 2 REAL-TIME INGESTION Real-time Nodes Historical Nodes Query API Query Rewrite Scatter/Gather Query API Broker Nodes Query API
40 FEATURE 2 REAL-TIME INGESTION a Hand Off Data Historical Nodes Real-time Nodes Query API Query Rewrite Scatter/Gather Query API Broker Nodes Query API
41 FEATURE 3 FAST DATA SCANS Column orientation Only load/scan what you need Compression Decrease size of data in storage (RAM/disk)
42 FEATURE 4 FAST FILTERING Indexed Bitmap indices Compressed Bitmaps CONCISE compression Operate on compressed form Resolve filters without looking at data
43 FEATURE 5 OPERATIONALLY SIMPLE Fault-tolerant Replication Rolling deployments/restarts All node types have failover/replication Grow == start processes Shrink == kill processes
44 DRUID IN ACTION
45 DRUID S QUERY INTERFACE Resembles SQL... SELECT hour(timestamp), lang, hashtag, sum(tweets) AS tweets FROM twitter-spritzer GROUP BY hour(timestamp), lang, hashtag WHERE country = US AND timestamp BETWEEN AND ;
46 DRUID S QUERY INTERFACE...but in a JSON format SELECT hour(timestamp), lang, hashtag, sum(tweets) AS tweets FROM twitter-spritzer GROUP BY hour(timestamp), lang, hashtag WHERE country = US AND timestamp BETWEEN AND ; { "querytype": "groupby", "datasource": twitter-spritzer", "intervals":[" / "], "granularity": hour", "dimensions": ["lang", "hashtag"], "aggregations":[ { "type": longsum", "fieldname": "tweets", "name": "tweets"} ], "filter": {"type": "selector", "dimension": "country", "value":"us }
47 QUERY FEATURES Group By Time-series roll-ups Arbitrary boolean filters Aggregation functions Sum, Min, Max, Avg, etc. Dimensional Search Explore values in your dimensions
48 DRUID VERSUS OTHERS vs Google Dremel No indexing structure vs Google PowerDrill Close analog, all in-memory vs Hadoop+Avro+Hive(+Yarn) Closer to Tenzig Back-office use case
49 DRUID BENCHMARKS Scan speed ~33M rows / second / core Realtime ingestion rate ~10k records / second / node Cluster Examples 100 nodes, 6 TB Memory, 400 TB disk 1.4 seconds 6TB in-memory query
50 DRUID IS NOW OPEN SOURCE
51 DRUID IS NOW OPEN SOURCE Learn more: METAMARKETS.COM/DRUID GITHUB.COM/METAMX/DRUID
52 DRUID IS NOW OPEN SOURCE Learn more: METAMARKETS.COM/DRUID GITHUB.COM/METAMX/DRUID eric tschetter - zedruid
Apache Kylin Introduction Dec 8, 2014 @ApacheKylin
Apache Kylin Introduction Dec 8, 2014 @ApacheKylin Luke Han Sr. Product Manager [email protected] @lukehq Yang Li Architect & Tech Leader [email protected] Agenda What s Apache Kylin? Tech Highlights Performance
Unified Big Data Processing with Apache Spark. Matei Zaharia @matei_zaharia
Unified Big Data Processing with Apache Spark Matei Zaharia @matei_zaharia What is Apache Spark? Fast & general engine for big data processing Generalizes MapReduce model to support more types of processing
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
Using distributed technologies to analyze Big Data
Using distributed technologies to analyze Big Data Abhijit Sharma Innovation Lab BMC Software 1 Data Explosion in Data Center Performance / Time Series Data Incoming data rates ~Millions of data points/
Moving From Hadoop to Spark
+ Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com [email protected] Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee
Practical Cassandra. Vitalii Tymchyshyn [email protected] @tivv00
Practical Cassandra NoSQL key-value vs RDBMS why and when Cassandra architecture Cassandra data model Life without joins or HDD space is cheap today Hardware requirements & deployment hints Vitalii Tymchyshyn
Big Data Approaches. Making Sense of Big Data. Ian Crosland. Jan 2016
Big Data Approaches Making Sense of Big Data Ian Crosland Jan 2016 Accelerate Big Data ROI Even firms that are investing in Big Data are still struggling to get the most from it. Make Big Data Accessible
SAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here
PLATFORM Top Ten Questions for Choosing In-Memory Databases Start Here PLATFORM Top Ten Questions for Choosing In-Memory Databases. Are my applications accelerated without manual intervention and tuning?.
How to Choose Between Hadoop, NoSQL and RDBMS
How to Choose Between Hadoop, NoSQL and RDBMS Keywords: Jean-Pierre Dijcks Oracle Redwood City, CA, USA Big Data, Hadoop, NoSQL Database, Relational Database, SQL, Security, Performance Introduction A
Can the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.
Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology
Parallel Data Warehouse
MICROSOFT S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE Parallel Data Warehouse Stefan Cronjaeger Microsoft May 2013 AGENDA PDW overview Columnstore and Big Data Business Intellignece Project Ability
How To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 [email protected] www.scch.at Michael Zwick DI
TUT NoSQL Seminar (Oracle) Big Data
Timo Raitalaakso +358 40 848 0148 [email protected] TUT NoSQL Seminar (Oracle) Big Data 11.12.2012 Timo Raitalaakso MSc 2000 Work: Solita since 2001 Senior Database Specialist Oracle ACE 2012 Blog: http://rafudb.blogspot.com
Data-cubing made-simple! with Spark, Algebird and HBase goo.gl/dbgr0h
Data-cubing made-simple! with Spark, Algebird and HBase goo.gl/dbgr0h Vidmantas Zemleris Agenda Intro Analytics at Vinted What is Data-cubing? Why did we build it? Architecture Preliminaries Metric computation
HBase A Comprehensive Introduction. James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367
HBase A Comprehensive Introduction James Chin, Zikai Wang Monday, March 14, 2011 CS 227 (Topics in Database Management) CIT 367 Overview Overview: History Began as project by Powerset to process massive
Crack Open Your Operational Database. Jamie Martin [email protected] September 24th, 2013
Crack Open Your Operational Database Jamie Martin [email protected] September 24th, 2013 Analytics on Operational Data Most analytics are derived from operational data Two canonical approaches
Columnstore in SQL Server 2016
Columnstore in SQL Server 2016 Niko Neugebauer 3 Sponsor Sessions at 11:30 Don t miss them, they might be getting distributing some awesome prizes! HP SolidQ Pyramid Analytics Also Raffle prizes at the
Performance Management in Big Data Applica6ons. Michael Kopp, Technology Strategist @mikopp
Performance Management in Big Data Applica6ons Michael Kopp, Technology Strategist NoSQL: High Volume/Low Latency DBs Web Java Key Challenges 1) Even Distribu6on 2) Correct Schema and Access paperns 3)
CitusDB Architecture for Real-Time Big Data
CitusDB Architecture for Real-Time Big Data CitusDB Highlights Empowers real-time Big Data using PostgreSQL Scales out PostgreSQL to support up to hundreds of terabytes of data Fast parallel processing
Zynga Analytics Leveraging Big Data to Make Games More Fun and Social
Connecting the World Through Games Zynga Analytics Leveraging Big Data to Make Games More Fun and Social Daniel McCaffrey General Manager, Platform and Analytics Engineering World s leading social game
Using RDBMS, NoSQL or Hadoop?
Using RDBMS, NoSQL or Hadoop? DOAG Conference 2015 Jean- Pierre Dijcks Big Data Product Management Server Technologies Copyright 2014 Oracle and/or its affiliates. All rights reserved. Data Ingest 2 Ingest
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
Monitis Project Proposals for AUA. September 2014, Yerevan, Armenia
Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop
Building a real-time, self-service data analytics ecosystem Greg Arnold, Sr. Director Engineering
Building a real-time, self-service data analytics ecosystem Greg Arnold, Sr. Director Engineering Self Service at scale 6 5 4 3 2 1 ? Relational? MPP? Hadoop? Linkedin data 350M Members 25B 3.5M 4.8B 2M
Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich
Oracle Big Data, In-memory, and Exadata - One Database Engine to Rule Them All Dr.-Ing. Holger Friedrich Agenda Introduction Old Times Exadata Big Data Oracle In-Memory Headquarters Conclusions 2 sumit
Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: [email protected] Website: www.qburst.com
Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...
Oracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
Architectures for Big Data Analytics A database perspective
Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum
SEIZE THE DATA. 2015 SEIZE THE DATA. 2015
1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. BIG DATA CONFERENCE 2015 Boston August 10-13 Predicting and reducing deforestation
In-Memory Analytics: A comparison between Oracle TimesTen and Oracle Essbase
In-Memory Analytics: A comparison between Oracle TimesTen and Oracle Essbase Agenda Introduction Why In-Memory? Options for In-Memory in Oracle Products - Times Ten - Essbase Comparison - Essbase Vs Times
API Analytics with Redis and Google Bigquery. javier ramirez @supercoco9
API Analytics with Redis and Google Bigquery REST API + AngularJS web as an API client obvious solution: use a ready-made service as 3scale or apigee 1. non intrusive metrics 2. keep the history
Building Scalable Big Data Infrastructure Using Open Source Software. Sam William sampd@stumbleupon.
Building Scalable Big Data Infrastructure Using Open Source Software Sam William sampd@stumbleupon. What is StumbleUpon? Help users find content they did not expect to find The best way to discover new
Case Study: Real-time Analytics With Druid. Salil Kalia, Tech Lead, TO THE NEW Digital
Case Study: Real-time Analytics With Druid Salil Kalia, Tech Lead, TO THE NEW Digital Agenda Understanding the use-case Ad workflow Our use case Experiments with technologies Redis Cassandra Introduction
Big Data Pipeline and Analytics Platform
Big Data Pipeline and Analytics Platform Using NetflixOSS and Other Open Source Software Sudhir Tonse (@stonse) Danny Yuan (@g9yuayon) Netflix is a log generating company that also happens to stream movies
In-memory computing with SAP HANA
In-memory computing with SAP HANA June 2015 Amit Satoor, SAP @asatoor 2015 SAP SE or an SAP affiliate company. All rights reserved. 1 Hyperconnectivity across people, business, and devices give rise to
SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013
SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase
the missing log collector Treasure Data, Inc. Muga Nishizawa
the missing log collector Treasure Data, Inc. Muga Nishizawa Muga Nishizawa (@muga_nishizawa) Chief Software Architect, Treasure Data Treasure Data Overview Founded to deliver big data analytics in days
NoSQL for SQL Professionals William McKnight
NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to
HP Vertica and MicroStrategy 10: a functional overview including recommendations for performance optimization. Presented by: Ritika Rahate
HP Vertica and MicroStrategy 10: a functional overview including recommendations for performance optimization Presented by: Ritika Rahate MicroStrategy Data Access Workflows There are numerous ways for
Pulsar Realtime Analytics At Scale. Tony Ng April 14, 2015
Pulsar Realtime Analytics At Scale Tony Ng April 14, 2015 Big Data Trends Bigger data volumes More data sources DBs, logs, behavioral & business event streams, sensors Faster analysis Next day to hours
Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
PostgreSQL Business Intelligence & Performance Simon Riggs CTO, 2ndQuadrant PostgreSQL Major Contributor
PostgreSQL Business Intelligence & Performance Simon Riggs CTO, 2ndQuadrant PostgreSQL Major Contributor The research leading to these results has received funding from the European Union's Seventh Framework
Oracle BI EE Implementation on Netezza. Prepared by SureShot Strategies, Inc.
Oracle BI EE Implementation on Netezza Prepared by SureShot Strategies, Inc. The goal of this paper is to give an insight to Netezza architecture and implementation experience to strategize Oracle BI EE
BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014
BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014 Ralph Kimball Associates 2014 The Data Warehouse Mission Identify all possible enterprise data assets Select those assets
Parquet. Columnar storage for the people
Parquet Columnar storage for the people Julien Le Dem @J_ Processing tools lead, analytics infrastructure at Twitter Nong Li [email protected] Software engineer, Cloudera Impala Outline Context from various
A very short talk about Apache Kylin Business Intelligence meets Big Data. Fabian Wilckens EMEA Solutions Architect
A very short talk about Apache Kylin Business Intelligence meets Big Data Fabian Wilckens EMEA Solutions Architect 1 The challenge today 2 Very quickly: OLAP Online Analytical Processing How many beers
THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS
THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS WHITE PAPER Successfully writing Fast Data applications to manage data generated from mobile, smart devices and social interactions, and the
Big Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com StreamHorizon & Big Data Integrates into your Data Processing Pipeline Seamlessly integrates at any point of your your data processing pipeline Implements
Domain driven design, NoSQL and multi-model databases
Domain driven design, NoSQL and multi-model databases Java Meetup New York, 10 November 2014 Max Neunhöffer www.arangodb.com Max Neunhöffer I am a mathematician Earlier life : Research in Computer Algebra
SQL + NOSQL + NEWSQL + REALTIME FOR INVESTMENT BANKS
Enterprise Data Problems in Investment Banks BigData History and Trend Driven by Google CAP Theorem for Distributed Computer System Open Source Building Blocks: Hadoop, Solr, Storm.. 3548 Hypothetical
Big Data Management. Big Data Management. (BDM) Autumn 2013. Povl Koch September 2, 2013 01-09-2013 1
Big Data Management Big Data Management (BDM) Autumn 2013 Povl Koch September 2, 2013 01-09-2013 1 Overview Today s program 1. Little more practical details about this course 2. Chapter 2 & 3 in NoSQL
Vertica Live Aggregate Projections
Vertica Live Aggregate Projections Modern Materialized Views for Big Data Nga Tran - HPE Vertica - [email protected] November 2015 Outline What is Big Data? How Vertica provides Big Data Solutions? What
Rakam: Distributed Analytics API
Rakam: Distributed Analytics API Burak Emre Kabakcı May 30, 2014 Abstract Today, most of the big data applications needs to compute data in real-time since the Internet develops quite fast and the users
Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world
Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3
Integrating Apache Spark with an Enterprise Data Warehouse
Integrating Apache Spark with an Enterprise Warehouse Dr. Michael Wurst, IBM Corporation Architect Spark/R/Python base Integration, In-base Analytics Dr. Toni Bollinger, IBM Corporation Senior Software
Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.
Preview of Oracle Database 12c In-Memory Option 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any
<Insert Picture Here> Enhancing the Performance and Analytic Content of the Data Warehouse Using Oracle OLAP Option
Enhancing the Performance and Analytic Content of the Data Warehouse Using Oracle OLAP Option The following is intended to outline our general product direction. It is intended for
ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process
ORACLE OLAP KEY FEATURES AND BENEFITS FAST ANSWERS TO TOUGH QUESTIONS EASILY KEY FEATURES & BENEFITS World class analytic engine Superior query performance Simple SQL access to advanced analytics Enhanced
Optimizing Your Data Warehouse Design for Superior Performance
Optimizing Your Data Warehouse Design for Superior Performance Lester Knutsen, President and Principal Database Consultant Advanced DataTools Corporation Session 2100A The Problem The database is too complex
Power BI Performance. Tips and Techniques WWW.PRAGMATICWORKS.COM
Power BI Performance Tips and Techniques Rachael Martino Principal Consultant [email protected] @RMartinoBoston About Me SQL Server and Oracle developer and IT Manager since SQL Server 2000 Focused
Big Data Course Highlights
Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like
Big Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB
BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB Planet Size Data!? Gartner s 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next
The Kentik Data Engine
The Kentik Data Engine A CLUSTERED, MULTI-TENANT APPROACH FOR MANAGING THE WORLD S LARGEST NETWORKS When Kentik set out to build a network traffic monitoring solution that would be scalable, high-precision,
Fact Sheet In-Memory Analysis
Fact Sheet In-Memory Analysis 1 Copyright Yellowfin International 2010 Contents In Memory Overview...3 Benefits...3 Agile development & rapid delivery...3 Data types supported by the In-Memory Database...4
Ali Ghodsi Head of PM and Engineering Databricks
Making Big Data Simple Ali Ghodsi Head of PM and Engineering Databricks Big Data is Hard: A Big Data Project Tasks Tasks Build a Hadoop cluster Challenges Clusters hard to setup and manage Build a data
Big Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
Unified Batch & Stream Processing Platform
Unified Batch & Stream Processing Platform Himanshu Bari Director Product Management Most Big Data Use Cases Are About Improving/Re-write EXISTING solutions To KNOWN problems Current Solutions Were Built
European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project
European Archival Records and Knowledge Preservation Database Archiving in the E-ARK Project Janet Delve, University of Portsmouth Kuldar Aas, National Archives of Estonia Rainer Schmidt, Austrian Institute
Real-Time Data Analytics and Visualization
Real-Time Data Analytics and Visualization Making the leap to BI on Hadoop Predictive Analytics & Business Insights 2015 February 9, 2015 David P. Mariani CEO, AtScale, Inc. THE TRUTH ABOUT DATA We think
Oracle BI Suite Enterprise Edition
Oracle BI Suite Enterprise Edition Optimising BI EE using Oracle OLAP and Essbase Antony Heljula Technical Architect Peak Indicators Limited Agenda Overview When Do You Need a Cube Engine? Example Problem
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE
INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe
Oracle Database In-Memory A Practical Solution
Oracle Database In-Memory A Practical Solution Sreekanth Chintala Oracle Enterprise Architect Dan Huls Sr. Technical Director, AT&T WiFi CON3087 Moscone South 307 Safe Harbor Statement The following is
Amazon Redshift & Amazon DynamoDB Michael Hanisch, Amazon Web Services Erez Hadas-Sonnenschein, clipkit GmbH Witali Stohler, clipkit GmbH 2014-05-15
Amazon Redshift & Amazon DynamoDB Michael Hanisch, Amazon Web Services Erez Hadas-Sonnenschein, clipkit GmbH Witali Stohler, clipkit GmbH 2014-05-15 2014 Amazon.com, Inc. and its affiliates. All rights
SQL Server 2012 Performance White Paper
Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.
PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP
PLATFORA INTERACTIVE, IN-MEMORY BUSINESS INTELLIGENCE FOR HADOOP Your business is swimming in data, and your business analysts want to use it to answer the questions of today and tomorrow. YOU LOOK TO
An Approach to Implement Map Reduce with NoSQL Databases
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
Understanding the Value of In-Memory in the IT Landscape
February 2012 Understing the Value of In-Memory in Sponsored by QlikView Contents The Many Faces of In-Memory 1 The Meaning of In-Memory 2 The Data Analysis Value Chain Your Goals 3 Mapping Vendors to
Evaluation of NoSQL databases for large-scale decentralized microblogging
Evaluation of NoSQL databases for large-scale decentralized microblogging Cassandra & Couchbase Alexandre Fonseca, Anh Thu Vu, Peter Grman Decentralized Systems - 2nd semester 2012/2013 Universitat Politècnica
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
2009 Oracle Corporation 1
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,
NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre
NoSQL systems: introduction and data models Riccardo Torlone Università Roma Tre Why NoSQL? In the last thirty years relational databases have been the default choice for serious data storage. An architect
Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges
Session 1: IT Infrastructure Security Vertica / Hadoop Integration and Analytic Capabilities for Federal Big Data Challenges James Campbell Corporate Systems Engineer HP Vertica [email protected] Big
Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce
Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of
Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya
Oracle Database - Engineered for Innovation Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database 11g Release 2 Shipping since September 2009 11.2.0.3 Patch Set now
News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren
News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business
Performance Verbesserung von SAP BW mit SQL Server Columnstore
Performance Verbesserung von SAP BW mit SQL Server Columnstore Martin Merdes Senior Software Development Engineer Microsoft Deutschland GmbH SAP BW/SQL Server Porting AGENDA 1. Columnstore Overview 2.
Putting Apache Kafka to Use!
Putting Apache Kafka to Use! Building a Real-time Data Platform for Event Streams! JAY KREPS, CONFLUENT! A Couple of Themes! Theme 1: Rise of Events! Theme 2: Immutability Everywhere! Level! Example! Immutable
The Arts & Science of Tuning HANA models for Performance. Abani Pattanayak, SAP HANA CoE Nov 12, 2015
The Arts & Science of Tuning HANA models for Performance Abani Pattanayak, SAP HANA CoE Nov 12, 2015 Disclaimer This presentation outlines our general product direction and should not be relied on in making
5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014
5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for
Oracle Database In-Memory The Next Big Thing
Oracle Database In-Memory The Next Big Thing Maria Colgan Master Product Manager #DBIM12c Why is Oracle do this Oracle Database In-Memory Goals Real Time Analytics Accelerate Mixed Workload OLTP No Changes
