5 Database technology Trends. Guy Harrison, Executive Director, Information Management R&D
1 5 Database technology Trends Guy Harrison, Executive Director, Information Management R&D
2 Introductions Web: guyharrison.net
6 But Seriously
7 5 Database Technology Trends 1. The end of one size fits all 2. Big Data and Hadoop 3. NoSQL 4. Columnar architectures 5. In-memory databases
8 Trend #1: The end of one size fits all
9 History of databases Pre-computer technologies: Printing press Dewey decimal system Punched cards Magnetic tape flat (sequential) files Magnetic Disk IDMS ADABAS System R Oracle V2 Access Postgres MySQL HBase Dynamo MongoDB Redis VoltDB Neo4J Relational Model defined IMS Network Model Hierarchical model Indexed-Sequential Access Mechanism (ISAM) SQL Server Sybase Informix Ingres DB2 dbase Aerospike Hana Riak Cassandra Vertica Hadoop
10 Why? 3rd Platform drives new demands on the database: Global High Availability, Data volumes, Unstructured data, Transaction rates, Latency. A single architecture cannot meet all those demands
11 It takes all sorts: In-memory processing (Spark); Analytic/BI software (SAS, Tableau); Web Server; Data Warehouse RDBMS (Oracle, Teradata); In-memory Analytics (HANA, Exalytics); Hadoop; Web DBMS (MySQL, Mongo, Cassandra); Operational RDBMS (Oracle, SQL Server); ERP & in-house CRM
12 Oracle engineered systems
13 Trend #2: Big Data and Hadoop
14 The 3-4 V's. Value. Volume: Terabytes, Petabytes, Exabytes, Zettabytes. Variety: Structured, Unstructured, Human Generated, Machine Generated. Velocity: Transaction rates, User populations, Machines
15 The Industrial revolution of data
16 2005
17 2009
18 The instrumented human Compass Camera Mike/earphones Heads up display Emotion/Attention monitor Bluetooth Personal Area Network 3G/WiFi Wide Area Network GPS Storage Pulse, temp monitor Silent alarms Pedometer, sleep monitoring
20 The instrumented world
21 Big Data is the culmination of cloud, social and mobile
22 More Data Storing all data including machine generated and sol, Social, community, demographic data in original format for ever To More Effect Smarter use of data (data science) to achieve competitive or human benefit
24 Pioneers of big data
30 Google Software Architecture (circa 2005) Google Applications Map Reduce BigTable Google File System (GFS)
31 Map Reduce [Diagram: a job starts, fans out into many parallel Map tasks, and their outputs feed a final Reduce step]
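The map, shuffle and reduce steps in the diagram can be sketched in a few lines of single-process Python. This is only an illustration of the programming model, not Hadoop's or Google's implementation; the function names and the word-count task are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for one input split handled by one Map task.
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input splits.
splits = ["big data and hadoop", "hadoop and spark", "big big data"]
counts = word_count(splits)
```

In a real cluster the Map tasks run in parallel on different nodes and the shuffle moves data over the network; here everything runs in one process so the data flow is easy to follow.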
32 Hadoop: 1.0: Open Source Map-Reduce Stack
33 Hadoop at Yahoo. 2010 (biggest cluster): 4,000 nodes, 16 PB disk, 64 TB of RAM, 32,000 cores. 2014: 16 clusters, 32,500 nodes
35 Hadoop family: Oozie (workflow manager), Hive (query), Pig (scripting), Sqoop (RDBMS loader), Flume (log loader), MapReduce / YARN, HBase (database), ZooKeeper (locking), Hadoop Distributed File System (HDFS)
36 Economies: Exadata vs Hadoop, $$/TB (hardware only). Hadoop: $750. Exadata: $4,911
37 Hadoop is the most concrete Big Data technology Toad: your companion in the Big Data revolution
38 More Data Storing all data including machine generated and sol, Social, community, demographic data in original format for ever To More Effect Smarter use of data (data science) to achieve competitive or human benefit
40 Big Data Analytics AKA Data Science Machine Learning Programs that evolve with experience Collective Intelligence Programs that use inputs from crowds to simulate intelligence Predictive Analytics Programs that extrapolate from past to future
42 Collective Intelligence. "Siri, call me an ambulance." "From now on, I'll call you 'An Ambulance'. OK?"
43 Data science 250 Predictive Analytics Classification Clustering Model training and deployment y = x
44 Trend #3: NoSQL
45 Web Servers Memcached Servers Database Servers Read Only Slaves Shard (A-F) Shard (G-O) Shard (P-Z)
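The alphabetic sharding on this slide (A-F, G-O, P-Z) amounts to a small routing function in the application tier. A minimal sketch follows; the shard names and the choice to route on the first letter of the key are assumptions made for illustration.

```python
# Shard ranges taken from the slide; the shard names are hypothetical.
SHARDS = [
    ("A", "F", "shard_a_f"),
    ("G", "O", "shard_g_o"),
    ("P", "Z", "shard_p_z"),
]


def shard_for(key):
    """Return the shard that owns a key, based on its first letter."""
    first = key[:1].upper()
    for low, high, name in SHARDS:
        if low <= first <= high:
            return name
    raise ValueError(f"no shard covers key {key!r}")


owner = shard_for("Jane")
```

The weakness of this scheme is visible in the code: the ranges are fixed, so an uneven key distribution or a new shard means re-splitting ranges and moving data by hand.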
46 CAP Theorem says something has to give. CAP (Brewer's) Theorem says you can only have two out of three of Consistency, Partition Tolerance and Availability. Partition Tolerance: system stays up when the network between nodes fails. Consistency: everyone always sees the same data. Availability: system stays up when nodes fail. [Diagram labels: "Oracle RAC lives here", "Most NoSQL lives here", "NO GO"]
47 Major influences on non-relational. Amazon Dynamo: eventually consistent transaction model, consistent hashing. Google BigTable: column family model for sparse distributed columnar data. OODBMS and XML DBs: paved the way for the document database
48 Amazon Dynamo Model
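Consistent hashing, one of the Dynamo ideas listed above, can be sketched as a sorted ring of hash positions. This is a simplified illustration, not Dynamo's actual implementation: the class name, the use of SHA-256 and the number of virtual nodes per server are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, replicas=100):
        # Each physical node is placed on the ring at `replicas` positions.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first ring position after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user42")
```

The point of the design is that adding a node only takes keys away from existing nodes; keys never shuffle between the nodes that were already there, unlike a plain `hash(key) % node_count` scheme.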
49 BigTable Data Model
Relational (normalized) tables:
  Names (NameId, Name): 1 Dick; 2 Jane
  Sites (SiteId, SiteName): 1 Ebay; 2 Google; 3 Facebook; 4 ILoveLarry.com; 5 MadBillFans.com
  Counters (Name, Site, Counter), stored as (NameId, SiteId, Counter): Dick Ebay 507,018; Dick Google 690,414; Jane Google 716,426; Dick Facebook 723,649; Jane Facebook 643,261; Jane ILoveLarry.com 856,767; Dick MadBillFans.com 675,230
BigTable rows (one sparse row per name, one column per site):
  Id 1, Name Dick: Ebay 507,018; Google 690,414; Facebook 723,649; (other columns); MadBillFans.com 675,230
  Id 2, Name Jane: Google 716,426; Facebook 643,261; (other columns); ILoveLarry.com 856,767
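The move from the normalized counter table to sparse BigTable-style rows can be mimicked with nested dictionaries. The sketch below uses the counter values from the slide; the function name and the dict-of-dicts representation are assumptions for illustration and say nothing about how BigTable stores data physically.

```python
# (name, site, counter) records as they appear in the relational table.
hits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(records):
    """Pivot records into one sparse row per name, one column per site."""
    rows = {}
    for name, site, counter in records:
        rows.setdefault(name, {})[site] = counter
    return rows


rows = to_wide_rows(hits)
# A row only carries the columns it actually has a value for.
dick_sites = sorted(rows["Dick"])
```

Each row holds a different set of columns, which is what "sparse" means here: Jane's row simply has no Ebay column rather than a NULL.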
50 OODBMS - 1990s. The OODBMS Manifesto (Atkinson/Bancilhon/DeWitt/Dittrich/Maier/Zdonik, '90). "A relational database is like a garage that forces you to take your car apart and store the pieces in little drawers." Also, SQL is ugly. "An object database is like a closet which requires that you hang up your suit with tie, underwear, belt, socks and shoes all attached." (Dave Ensor)
51 Revenge of the Object Nerds Document databases Structured documents XML and JSON (JavaScript Object Notation) become more prevalent within applications Web programmers start storing these in BLOBS in MySQL Emergence of XML and JSON databases
52 NoSQL families. Key Value: MemcacheDB, Oracle NoSQL, Voldemort, Dynamo, DynamoDB, Riak. Document, JSON based: MongoDB, CouchDB, RethinkDB. Document, XML based: MarkLogic, BerkeleyDB XML. Table Based: BigTable, Cassandra, HBase, HyperTable, Accumulo. Graph Database: Neo4J, Infinite Graph, FlockDB
53 It's not a database, it's a key value store
54 No Means Yes!
55 Trend #4: Column-oriented DB
56 Row orientation vs column orientation. Example table: ID, Name, DOB, Salary, Sales, Expenses for Dick (21/12/60), Jane (12/12/55), Robert (17/02/80), Dan (15/03/75) and Steven (11/11/81). Row-oriented database: each block holds complete rows. Column-oriented database: each block holds one column (block 1 the names, block 2 the dates of birth, block 3 the salaries)
57 Analytical Queries. SELECT SUM(salary) FROM salesperson. Row-oriented database: every block is read, because each one holds whole rows. Column-oriented database: only the block holding the Salary column is read
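The difference for SELECT SUM(salary) can be shown by counting how many stored values each layout has to touch. In this sketch the names and dates of birth come from the slide, while the IDs and some of the salaries are illustrative assumptions because the slide's figures are only partly legible.

```python
COLUMNS = ("id", "name", "dob", "salary")

# Row store: one tuple per salesperson (IDs and some salaries are assumed).
ROW_STORE = [
    (1001, "Dick", "21/12/60", 67000),
    (1002, "Jane", "12/12/55", 55000),
    (1003, "Robert", "17/02/80", 22000),
    (1004, "Dan", "15/03/75", 65200),
    (1005, "Steven", "11/11/81", 76000),
]

# Column store: the same data, one list per column.
COLUMN_STORE = {
    name: [row[i] for row in ROW_STORE] for i, name in enumerate(COLUMNS)
}


def sum_salary_rows(rows):
    """Sum salaries from the row layout; every value in each row is read."""
    idx = COLUMNS.index("salary")
    total, touched = 0, 0
    for row in rows:
        touched += len(row)
        total += row[idx]
    return total, touched


def sum_salary_columns(columns):
    """Sum salaries from the column layout; only one column is read."""
    salaries = columns["salary"]
    return sum(salaries), len(salaries)


row_result = sum_salary_rows(ROW_STORE)
column_result = sum_salary_columns(COLUMN_STORE)
```

Both layouts give the same total, but the row layout touches four times as many values in this four-column table, and the gap widens with every extra column the query does not need.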
58 Compression. Row-oriented database: poor compression ratio (low repetition within a block). Column-oriented database: good compression ratio (high repetition within a block)
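Why repetition matters can be seen with run-length encoding, one simple scheme that benefits from columnar layout. The sketch below is illustrative only: the `region` column and its values are hypothetical and do not appear on the slide, and real column stores use a mix of encodings.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


# Column layout: all values of one (hypothetical) column sit side by side.
region_column = ["East", "East", "East", "West", "West"]

# Row layout: values of different columns are interleaved.
row_stream = [
    1001, "Dick", "East",
    1002, "Jane", "East",
    1003, "Robert", "East",
    1004, "Dan", "West",
    1005, "Steven", "West",
]

column_runs = run_length_encode(region_column)
row_runs = run_length_encode(row_stream)
```

The column collapses to two runs, while the interleaved row stream has no adjacent repeats at all, so run-length encoding saves nothing there.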
59 Inserts. INSERT INTO salesperson. Row-oriented database: the new row is written to a single block. Column-oriented database: the new row touches every column block
60 C-Store (Vertica) solution for inserts. Write Optimized Store: row oriented, uncompressed, single row inserts, continual parallel inserts. Asynchronous Tuple Mover moves data from the Write Optimized Store to the Read Optimized Store. Read Optimized Store: columnar, disk-based, highly compressed, bulk loadable (bulk sequential loads). Queries merge results from both stores
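The two-store arrangement can be sketched as a small class with a row-oriented write buffer and a columnar read store. This is a toy model, not Vertica's implementation: the class and method names are assumptions, there is no compression, and the tuple mover is called explicitly rather than running asynchronously.

```python
class HybridStore:
    """Toy write-optimized / read-optimized store pair (illustrative only)."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.wos = []                             # row tuples, cheap to append
        self.ros = {c: [] for c in self.columns}  # one list per column

    def insert(self, row):
        """Single-row insert lands in the write-optimized store."""
        self.wos.append(tuple(row))

    def move_tuples(self):
        """Tuple mover: migrate buffered rows into the columnar store."""
        for row in self.wos:
            for column, value in zip(self.columns, row):
                self.ros[column].append(value)
        self.wos.clear()

    def column_sum(self, column):
        """Merged query: combine columnar data with rows not yet moved."""
        idx = self.columns.index(column)
        return sum(self.ros[column]) + sum(row[idx] for row in self.wos)


store = HybridStore(["name", "salary"])
store.insert(("Dick", 67000))
store.insert(("Jane", 55000))
total_before_move = store.column_sum("salary")
store.move_tuples()
total_after_move = store.column_sum("salary")
```

Queries see the same answer before and after the move, which is the purpose of the merged query path: inserts stay cheap without making recent rows invisible to analytics.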
61 Exadata Hybrid Columnar Compression (EHCC) Compression Unit (~<1M) Block (8K) Block Block Block Column 1 Column 2 Column 3 Column 4 Row Row Row
62 Exadata Hybrid Columnar Compression SELECT SUM(Column4) FROM table Provides high compression ratio Manageable impact on row read/write operations Some optimization of analytic queries
63 Trend #5: The End of Disk?
64 5MB HDD circa 1956
65 The more that things change...
66 Faster or slower? [Chart: percentage change] CPU: 1,013. Disk capacity: 1,635. IO rate: 260. IO/CPU: -390. IO/Capacity: -630
67 Solid state disk to the rescue DDR RAM Drive SATA flash drive PCI flash drive SSD storage Server
68 Cheaper by the IO [Chart: seek time (us) for SSD DDR-RAM, SSD PCI flash, SSD SATA flash and magnetic disk]
69 But not by the GB [Chart: $$/GB for HDD, MLC SSD and SLC SSD]
70 Tiered storage management [Diagram: tiers plotted by $/GB and $/IOP] Main Memory; DDR SSD; Flash SSD; Fast Disk (SAS, RAID 0+1); Slow Disk (SATA, RAID 5); Tape, Flat Files, Hadoop
71 In-Memory databases. Cost of RAM falling 50% each 18 months. Some databases can fit entirely within the RAM of a single server or cluster of servers. [Chart: cost (US$/GB) and size (GB) by year]
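The "50% each 18 months" rule of thumb is simple exponential decay, and it can be turned into a quick projection. The sketch below takes the halving period from the slide; the starting price and database size are hypothetical inputs, and the function names are assumptions.

```python
def projected_cost_per_gb(cost_today, months_ahead, halving_period_months=18):
    """Project RAM cost per GB, assuming it halves every halving period."""
    return cost_today * 0.5 ** (months_ahead / halving_period_months)


def ram_cost_for_database(size_gb, cost_per_gb):
    """Cost of enough RAM to hold a database of the given size."""
    return size_gb * cost_per_gb


# Hypothetical starting point: 10 US$/GB today, a 2,000 GB database.
cost_in_3_years = projected_cost_per_gb(10.0, months_ahead=36)
budget_in_3_years = ram_cost_for_database(2000, cost_in_3_years)
```

Three years is two halving periods, so the price per GB drops to a quarter of today's figure under this assumption; whether the trend holds is a separate question.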
72 Oracle TimesTen. In-memory transactional database with disk-based checkpoints and disk-based logging. By default, COMMITs are not durable (writes to the transaction log are asynchronous). Can configure synchronous replication or synchronous log writes to avoid data loss. Columnar compression and analytic functions in the Exalytics version. [Diagram: Clients, Memory, Point-in-time snapshot, Commits, Checkpoints, Transaction Logs]
73 SAP Hana Memory Column store Persistence Layer Txn logs Row Store Savepoints Data files Delta store Note: Table must be either row or column not both
74 Exalytics Instantaneous!
75 You keep using that word. I do not think it means what you think it means
76 Exalytics. Hardware: 2 TB RAM; 4 x 10GbE and 2 InfiniBand ports; 6 x 1.2 TB SAS (7.2 TB); 3 x 800 GB SSD (2.4 TB). Software: Oracle BI, Essbase, Oracle R, TimesTen, 12c In-memory
77 VoltDB. Single-threaded access to memory: no latch/mutex waits. Transactions in self-contained stored procedures: minimal locking. K-Safety for COMMIT: no sync waits. [Diagram: clients connect to six CPUs, each owning one in-memory partition]
78 Spark: (sort of) in-memory Hadoop. Spark: in-memory distributed compute, HDFS compatible, with Python and Scala interfaces. Libraries for data processing, machine learning, streaming, SQL, etc.: Spark Streaming, MLlib (machine learning), SparkSQL. Part of the Berkeley Data Analytics Stack, alongside HDFS, Tachyon (in-memory file system) and Mesos (cluster manager)
79 Oracle 12c in-memory database Column store Memory (SGA) Row store Column Store (IMCU) OLTP Analytics (SMU) Redo Logs Data files
80 What does all this mean for me?
81 Trend #6: shameless product plugs will increase over the next 120 seconds
82 Toad: your companion in the Big Data revolution
83 Toad for Hadoop
84 SharePlex for Hadoop JMS Queue Hadoop Poster HBase Real Time replication Change Data Capture Redo-logs Batched HDFS File Copy Audit / Change Data
85 Toad BI Suite join and analyse data from any source
86 Dell Statistica
87 Dell In-Memory Appliances for Cloudera Enterprise Starter Configuration 8 Node Cluster R720-4 Infrastructure Nodes R720XD- 4 Data Nodes Force10- S55 ~176TB (disk raw space) ~1.5TB (raw memory) Mid-Size Configuration 16 Node Cluster R720-4 Infrastructure Nodes R720XD- 12 Data Nodes Force10- S4810P Force10- S55 ~528TB (disk raw space) ~4.5 TB (raw memory) Small Enterprise Configuration 24 Node Cluster R720-4 Infrastructure Nodes R720XD- 20 Data Nodes ~880TB (disk raw space) ~7.5 TB (raw memory) Expansion Unit- R720XD-4 Data, Cloudera Enterprise Data Hub, Scale in Blocks
88 Dell appliances for any database Dell provides appliances and reference architectures specifically designed for: Oracle SQL Server HANA SSD database acceleration Large memory footprints
89 Big Data for the rest of us. Success in Big Data requires capabilities at multiple technology levels: hardware, software infrastructure, business intelligence and analytics. Only Dell can deliver capabilities at every technology layer. Only Dell's solutions are designed and priced to suit mid-market initial deployments and to scale to the largest enterprise. Layers: Advanced Analytics, Business Intelligence, Data Integration, Systems Management, Hadoop and database software, Server and Storage. Products: Toad Data Point, Boomi, Statistica, Toad Intelligence Central, Dell Foglight and TOAD, Dell appliances for Hadoop, Oracle, etc., Dell servers and storage arrays
90 Thank you.
Cloud Scale Distributed Data Storage. Jürmo Mehine
Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented
Lecture Data Warehouse Systems
Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores
Hadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances
INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA
Structured Data Storage
Structured Data Storage Xgen Congress Short Course 2010 Adam Kraut BioTeam Inc. Independent Consulting Shop: Vendor/technology agnostic Staffed by: Scientists forced to learn High Performance IT to conduct
BIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
Large scale processing using Hadoop. Ján Vaňo
Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine
So What s the Big Deal?
So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data
How To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 [email protected] www.scch.at Michael Zwick DI
SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford
SQL VS. NO-SQL Adapted Slides from Dr. Jennifer Widom from Stanford 55 Traditional Databases SQL = Traditional relational DBMS Hugely popular among data analysts Widely adopted for transaction systems
Big Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
The NoSQL Ecosystem, Relaxed Consistency, and Snoop Dogg. Adam Marcus MIT CSAIL [email protected] / @marcua
The NoSQL Ecosystem, Relaxed Consistency, and Snoop Dogg Adam Marcus MIT CSAIL [email protected] / @marcua About Me Social Computing + Database Systems Easily Distracted: Wrote The NoSQL Ecosystem in
<Insert Picture Here> Big Data
Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big
News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren
News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business
Challenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Quick History of Data Management 4000 B C Manual recording From tablets to papyrus to paper A. Payberah 2014 2
Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012
Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation
NoSQL Data Base Basics
NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS
How To Create A Data Visualization With Apache Spark And Zeppelin 2.5.3.5
Big Data Visualization using Apache Spark and Zeppelin Prajod Vettiyattil, Software Architect, Wipro Agenda Big Data and Ecosystem tools Apache Spark Apache Zeppelin Data Visualization Combining Spark
Introduction to Big Data Training
Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB
Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our
Preview of Oracle Database 12c In-Memory Option. Copyright 2013, Oracle and/or its affiliates. All rights reserved.
Preview of Oracle Database 12c In-Memory Option 1 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any
Introduction to NOSQL
Introduction to NOSQL Université Paris-Est Marne la Vallée, LIGM UMR CNRS 8049, France January 31, 2014 Motivations NOSQL stands for Not Only SQL Motivations Exponential growth of data set size (161Eo
Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya
Oracle Database - Engineered for Innovation Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database 11g Release 2 Shipping since September 2009 11.2.0.3 Patch Set now
Moving From Hadoop to Spark
+ Moving From Hadoop to Spark Sujee Maniyam Founder / Principal @ www.elephantscale.com [email protected] Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee
Big Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
Big Data: Tools and Technologies in Big Data
Big Data: Tools and Technologies in Big Data Jaskaran Singh Student Lovely Professional University, Punjab Varun Singla Assistant Professor Lovely Professional University, Punjab ABSTRACT Big data can
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
Hurtownie Danych i Business Intelligence: Big Data
Hurtownie Danych i Business Intelligence: Big Data Robert Wrembel Politechnika Poznańska Instytut Informatyki [email protected] www.cs.put.poznan.pl/rwrembel Outline Introduction to Big Data
X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released
General announcements In-Memory is available next month http://www.oracle.com/us/corporate/events/dbim/index.html X4-2 Exadata announced (well actually around Jan 1) OEM/Grid control 12c R4 just released
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
Can the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
Constructing a Data Lake: Hadoop and Oracle Database United!
Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.
Dell In-Memory Appliance for Cloudera Enterprise
Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert [email protected]/
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications. Jürgen Primsch, SAP AG July 2011
SAP HANA - Main Memory Technology: A Challenge for Development of Business Applications Jürgen Primsch, SAP AG July 2011 Why In-Memory? Information at the Speed of Thought Imagine access to business data,
Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
Database Performance with In-Memory Solutions
Database Performance with In-Memory Solutions ABS Developer Days January 17th and 18 th, 2013 Unterföhring metafinanz / Carsten Herbe The goal of this presentation is to give you an understanding of in-memory
Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
Big Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based
TE's Analytics on Hadoop and SAP HANA Using SAP Vora
TE's Analytics on Hadoop and SAP HANA Using SAP Vora Naveen Narra Senior Manager TE Connectivity Santha Kumar Rajendran Enterprise Data Architect TE Balaji Krishna - Director, SAP HANA Product Mgmt. -
Overview of Databases On MacOS. Karl Kuehn Automation Engineer RethinkDB
Overview of Databases On MacOS Karl Kuehn Automation Engineer RethinkDB Session Goals Introduce Database concepts Show example players Not Goals: Cover non-macos systems (Oracle) Teach you SQL Answer what
extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010
System/ Scale to Primary Secondary Joins/ Integrity Language/ Data Year Paper 1000s Index Indexes Transactions Analytics Constraints Views Algebra model my label 1971 RDBMS O tables sql-like 2003 memcached
Hadoop implementation of MapReduce computational model. Ján Vaňo
Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed
Applications for Big Data Analytics
Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail:
Hadoop: Embracing future hardware
Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop
Big Data Course Highlights
Big Data Course Highlights The Big Data course will start with the basics of Linux which are required to get started with Big Data and then slowly progress from some of the basics of Hadoop/Big Data (like
Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
A Survey of Distributed Database Management Systems
Brady Kyle CSC-557 4-27-14 A Survey of Distributed Database Management Systems Big data has been described as having some or all of the following characteristics: high velocity, heterogeneous structure,
Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software
WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:
Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY MANNING ANN KELLY. Shelter Island
Making Sense ofnosql A GUIDE FOR MANAGERS AND THE REST OF US DAN MCCREARY ANN KELLY II MANNING Shelter Island contents foreword preface xvii xix acknowledgments xxi about this book xxii Part 1 Introduction
Oracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
Introduction to Hadoop. New York Oracle User Group Vikas Sawhney
Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop
BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014
BIG DATA CAN DRIVE THE BUSINESS AND IT TO EVOLVE AND ADAPT RALPH KIMBALL BUSSUM 2014 Ralph Kimball Associates 2014 The Data Warehouse Mission Identify all possible enterprise data assets Select those assets
Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software. 22 nd October 2013 10:00 Sesión B - DB2 LUW
Session: Big Data get familiar with Hadoop to use your unstructured data Udo Brede Dell Software 22 nd October 2013 10:00 Sesión B - DB2 LUW 1 Agenda Big Data The Technical Challenges Architecture of Hadoop
Hadoop IST 734 SS CHUNG
Hadoop IST 734 SS CHUNG Introduction What is Big Data?? Bulk Amount Unstructured Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per day) If a regular machine need to
Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC,
Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC, Bellevue, WA Legal disclaimer The information in this
Open source large scale distributed data management with Google s MapReduce and Bigtable
Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: [email protected] Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory
NoSQL for SQL Professionals William McKnight
NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to
Lecture 10: HBase! Claudia Hauff (Web Information Systems)! [email protected]
Big Data Processing, 2014/15 Lecture 10: HBase!! Claudia Hauff (Web Information Systems)! [email protected] 1 Course content Introduction Data streams 1 & 2 The MapReduce paradigm Looking behind the
Comparison of the Frontier Distributed Database Caching System with NoSQL Databases
Comparison of the Frontier Distributed Database Caching System with NoSQL Databases Dave Dykstra [email protected] Fermilab is operated by the Fermi Research Alliance, LLC under contract No. DE-AC02-07CH11359
Workshop on Hadoop with Big Data
Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly
BIG DATA: STORAGE, ANALYSIS AND IMPACT GEDIMINAS ŽYLIUS
BIG DATA: STORAGE, ANALYSIS AND IMPACT GEDIMINAS ŽYLIUS WHAT IS BIG DATA? describes any voluminous amount of structured, semi-structured and unstructured data that has the potential to be mined for information
3 Case Studies of NoSQL and Java Apps in the Real World
Eugene Ciurana [email protected] - pr3d4t0r ##java, irc.freenode.net 3 Case Studies of NoSQL and Java Apps in the Real World This presentation is available from: http://ciurana.eu/geecon-2011 About Eugene...
Introduction to Big Data! with Apache Spark" UC#BERKELEY#
Introduction to Big Data! with Apache Spark" UC#BERKELEY# This Lecture" The Big Data Problem" Hardware for Big Data" Distributing Work" Handling Failures and Slow Machines" Map Reduce and Complex Jobs"
Scalable Architecture on Amazon AWS Cloud
Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies [email protected] 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect
Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics
Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world
Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3
Introducing Oracle Exalytics In-Memory Machine
Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle
SAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013
SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies
Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing by leaps and bounds, 40+% year over year! 2009 = .8 Zettabytes = .08
MongoDB in the NoSQL and SQL world. Horst Rechner Berlin, 2012-05-15
MongoDB in the NoSQL and SQL world. Horst Rechner Berlin, 2012-05-15 1 MongoDB in the NoSQL and SQL world. NoSQL What? Why? - How? Say goodbye to ACID, hello BASE You
Oracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
Architectures for Big Data Analytics A database perspective
Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum
The Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
Database Scalability and Oracle 12c
Database Scalability and Oracle 12c Marcelle Kratochvil CTO Piction ACE Director All Data/Any Data Warning I will be covering topics and saying things that will cause a rethink in
Big Data Technologies. Prof. Dr. Uta Störl Hochschule Darmstadt Fachbereich Informatik Sommersemester 2015
Big Data Technologies Prof. Dr. Uta Störl Hochschule Darmstadt Fachbereich Informatik Sommersemester 2015 Situation: Bigger and Bigger Volumes of Data Big Data Use Cases Log Analytics (Web Logs, Sensor
Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...
Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data
A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA
A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA Ompal Singh Assistant Professor, Computer Science & Engineering, Sharda University, (India) ABSTRACT In the new era of distributed system where
Hadoop for MySQL DBAs. Copyright 2011 Cloudera. All rights reserved. Not to be reproduced without prior written consent.
Hadoop for MySQL DBAs + 1 About me Sarah Sproehnle, Director of Educational Services @ Cloudera Spent 5 years at MySQL At Cloudera for the past 2 years 2 What is Hadoop? An open-source
Cost-Effective Business Intelligence with Red Hat and Open Source
Cost-Effective Business Intelligence with Red Hat and Open Source Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 1 Agenda Introductions Quick survey What is BI?: reporting,
MySQL and Hadoop. Percona Live 2014 Chris Schneider
MySQL and Hadoop Percona Live 2014 Chris Schneider About Me Chris Schneider, Database Architect @ Groupon Spent the last 10 years building MySQL architecture for multiple companies Worked with Hadoop for
MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products
MaxDeploy Ready Hyper- Converged Virtualization Solution With SanDisk Fusion iomemory products MaxDeploy Ready products are configured and tested for support with Maxta software- defined storage and with
Large-Scale Data Processing
Large-Scale Data Processing Eiko Yoneki http://www.cl.cam.ac.uk/~ey204 Systems Research Group University of Cambridge Computer Laboratory 2010s: Big Data Why Big Data now? Increase
MapReduce with Apache Hadoop Analysing Big Data
MapReduce with Apache Hadoop Analysing Big Data April 2010 Gavin Heavyside About Journey Dynamics Founded in 2006 to develop software technology to address the issues
Comparing SQL and NOSQL databases
COSC 6397 Big Data Analytics Data Formats (II) HBase Edgar Gabriel Spring 2015 Comparing SQL and NOSQL databases Types Development History Data Storage Model SQL One type (SQL database) with minor variations
Database Revolution: Old SQL, NewSQL, NoSQL Huh? Michael Bowers April 9, 2013 v2.9
Database Revolution: Old SQL, NewSQL, NoSQL Huh? Michael Bowers April 9, 2013 v2.9 Database Revolution: Old SQL, NewSQL, NoSQL Huh? Prepared by Michael Bowers 2013 03 28 v. 2.9 Abstract We are in the middle
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing
Evaluating NoSQL for Enterprise Applications Dirk Bartels VP Strategy & Marketing Agenda The Real Time Enterprise The Data Gold Rush Managing The Data Tsunami Analytics and Data Case Studies Where to go
Peninsula Strategy. Creating Strategy and Implementing Change
Peninsula Strategy Creating Strategy and Implementing Change PS - Synopsis Professional Services firm Industries include Financial Services, High Technology, Healthcare & Security Headquartered in San
Main Memory Data Warehouses
Main Memory Data Warehouses Robert Wrembel Poznan University of Technology Institute of Computing Science www.cs.put.poznan.pl/rwrembel Lecture outline Teradata Data Warehouse
NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015
NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,
2009 Oracle Corporation 1
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,
ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
Benchmarking Cassandra on Violin
Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
