Big Data: Big IT Party?
Copyright R20/Consultancy B.V., The Hague, The Netherlands. All rights reserved. No part of this material may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photographic, or otherwise, without the explicit written permission of the copyright owners.

Big Data: Big IT Party? by Rick F. van der Lans, R20/Consultancy BV

Rick F. van der Lans

Rick F. van der Lans is an independent consultant, lecturer, and author. He specializes in data warehousing, business intelligence, database technology, and data virtualization. He is managing director of R20/Consultancy B.V. Rick has been involved in various projects in which data warehousing and integration technology were applied.

Rick van der Lans is an internationally acclaimed lecturer. He has lectured professionally for the last twenty-five years in many European and Middle Eastern countries, the USA, South America, and Australia. He has been invited by several major software vendors to present keynote speeches.

He is the author of several books on computing, including his new Data Virtualization for Business Intelligence Systems. Some of these books are available in different languages. The popular Introduction to SQL is available in English, Dutch, Italian, Chinese, and German and is sold worldwide. He also authored The SQL Guide to Ingres and SQL for MySQL Developers.

As author for BeyeNetwork.com, writer of whitepapers, chairman for the annual European Enterprise Data and Business Intelligence Conference, and columnist for a few IT magazines, he has close contacts with many vendors.

R20/Consultancy B.V. is located in The Hague, The Netherlands. You can get in touch with Rick via email ([email protected]) or LinkedIn.

Do We Agree On What Big Data Is?
[Two charts. Source: WikiBon, February 2014]

Gartner: Big Data Market Forecast
- Big Data will drive $232 billion in spending through
- It will directly or indirectly drive $96 billion of worldwide IT spending in 2012, and is forecast to drive $120 billion of IT spending in

Gartner's Hype Cycle for Emerging Technologies, July 2013
McKinsey Global Institute: Benefits of Big Data
Big Data has the potential
- to increase the value of the US health care industry by $300 billion
- to increase the industry value of Europe's public sector administration by EUR 250 billion
- to decrease manufacturing (development and assembly) costs by 50%
- to increase service provider revenue by $100 billion due to global personal location data
- to increase US retail's net margin by 60%

Big Data Exaggerations
- Big Data: A revolution that will transform how we live, work and think
- Companies are being destroyed and created around big data
- Management of big data: Key to survival in the health care sector
- Big data has arrived and is shaping IT today
- The disruptive power of big data

It's All About Analytics
Analytical Challenges of Tomorrow
- Improve product development
- Optimize business processes
- Improve customer care
- Improve customer delight
- Improve pro-active customer care
- Personalize products

External Data: UK-based Retail Company
- A 10-degree rise in temperature means 300% more barbecue meat, 45% more lettuce, and 50% more coleslaw
- A city-center store will see an uplift in sandwiches (to eat outside) on a warm weekday, and almost no effect at all on a warm weekend
- Result: 6 million UK pounds less food wastage in the summer, 50 million less stock in warehouses

Social Media Data

Sensor Data: Internet of Things
Privacy?

Quantity Quality
Databases are Boring!

[Figure. Source: The 451 Group]

SQL is Intergalactic DataSpeak! Or was?

Can We Exploit This?
Scale Up / Scale Out
- Scale up (vertical scaling) means adding more resources to one node in a system
- Scale out (horizontal scaling) means adding more nodes to a system
  - Continuous availability/redundancy
  - Cost/performance flexibility
  - Contiguous upgrades
  - Geographical distribution

Operations of a Query

    -- Anchor: direct flights leaving AMS on the given date
    WITH FLIGHTPLAN(FLIGHTNO, PLAN_AIRPORTS, PLAN_FLIGHTS, START_AIRPORT, END_AIRPORT,
                    START_TIME, END_TIME, DEPARTURE_AIRPORT, ARRIVAL_AIRPORT,
                    DEPARTURE_TIME, ARRIVAL_TIME, PRICE, STOPS) AS
      (SELECT FLIGHTNO,
              CAST(DEPARTURE_AIRPORT || '->' || ARRIVAL_AIRPORT AS VARCHAR(100)),
              CAST(RTRIM(CHAR(FLIGHTNO)) AS VARCHAR(100)),
              DEPARTURE_AIRPORT, ARRIVAL_AIRPORT, DEPARTURE_TIME, ARRIVAL_TIME,
              DEPARTURE_AIRPORT, ARRIVAL_AIRPORT, DEPARTURE_TIME, ARRIVAL_TIME,
              PRICE, 0
       FROM   FLIGHTS
       WHERE  DEPARTURE_AIRPORT = 'AMS'
       AND    CAST(DEPARTURE_TIME AS DATE) = ' '
       UNION ALL
       -- Recursive step: extend a plan with one connecting flight
       SELECT P.FLIGHTNO,
              P.PLAN_AIRPORTS || '->' || F.ARRIVAL_AIRPORT,
              P.PLAN_FLIGHTS || '->' || RTRIM(CHAR(F.FLIGHTNO)),
              P.START_AIRPORT, F.ARRIVAL_AIRPORT, P.START_TIME, F.ARRIVAL_TIME,
              P.DEPARTURE_AIRPORT, P.ARRIVAL_AIRPORT, P.DEPARTURE_TIME, P.ARRIVAL_TIME,
              P.PRICE + F.PRICE, STOPS + 1
       FROM   FLIGHTPLAN AS P, FLIGHTS AS F
       WHERE  P.ARRIVAL_AIRPORT = F.DEPARTURE_AIRPORT
       AND    P.ARRIVAL_TIME < F.DEPARTURE_TIME
       AND    F.DEPARTURE_AIRPORT <> 'PHX'
       AND    LOCATE(F.ARRIVAL_AIRPORT, P.PLAN_AIRPORTS) = 0   -- never revisit an airport
       AND    STOPS < 1                                        -- at most one stop
       AND    P.ARRIVAL_TIME + 4 HOURS > F.DEPARTURE_TIME)     -- connection within 4 hours
    -- Cheapest plan that ends in PHX
    SELECT PLAN_AIRPORTS, PLAN_FLIGHTS, START_AIRPORT, END_AIRPORT,
           START_TIME, END_TIME, PRICE
    FROM   FLIGHTPLAN
    WHERE  END_AIRPORT = 'PHX'
    ORDER BY PRICE ASC
    FETCH FIRST 1 ROW ONLY

Operations shown next to the query:
- Analytical functions
- Recursive operations
- Joins
- Having filters
- Group by
- Complex scalar functions
- Projections and simple transformations
- Filters - selections

Parallel Database Architecture
[Diagram: an application sends its query to a database server made up of a master and Worker 1, Worker 2, and Worker 3. The same list of operations is shown alongside, from analytical functions down to filters - selections.]

Effect of Partitions on Query Response
[Chart: total throughput against number of partitions/processors, with a bottleneck marked]
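The master/worker split in the Parallel Database Architecture slide can be sketched in a few lines of Python. This is only an illustration, not how any particular database server works: the `FLIGHTS` rows, the function names, and the choice to push the filter and a partial aggregate down to the workers are all assumptions made for the example. Each worker handles its own partition, and the master merges the partial results, which is the step that can become the bottleneck as partitions are added.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows: (departure_airport, price)
FLIGHTS = [("AMS", 120), ("AMS", 80), ("PHX", 200),
           ("LHR", 150), ("AMS", 100), ("LHR", 50)]


def partition(rows, n):
    """Spread the rows round-robin over n partitions."""
    return [rows[i::n] for i in range(n)]


def worker(rows, max_price):
    """Filter one partition and compute a partial sum per airport."""
    totals = Counter()
    for airport, price in rows:
        if price <= max_price:          # filter (selection) runs on the worker
            totals[airport] += price    # partial group-by runs on the worker
    return totals


def run_query(rows, n_workers, max_price):
    """Master: hand out partitions, then merge the partial results."""
    parts = partition(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda p: worker(p, max_price), parts))
    merged = Counter()
    for partial in partials:            # the merge is done by the master alone
        merged.update(partial)
    return dict(merged)


print(run_query(FLIGHTS, n_workers=3, max_price=150))
```

The result does not depend on the number of workers; only the amount of work each worker does changes. Operations such as joins or recursive queries are harder to split this way because a worker may need rows that live in another partition.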
Internal Database Server Administration

The Market of Hadoop/NoSQL Products

NewSQL
[Figure. Source: VoltDB / Michael Stonebraker]

Categories of Database Servers
all database servers
  SQL database servers
    Classic SQL database servers
    Analytical SQL database servers
    NewSQL database servers
  NoSQL database servers
    Key-value database servers
    Document database servers
    Column-family database servers
    Graph database servers

Aggregate Data Model
Strong Consistency vs. Eventual Consistency

SQL DBMS versus NoSQL Solution
[Diagram: an application on a SQL database server next to an application on a NoSQL solution. Labels: Strong, Eventual]

Hadoop Components

The 2nd Generation of Hadoop
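The slides name strong and eventual consistency without spelling out the difference. The following toy model is one way to picture it, under loud assumptions: the class and method names are hypothetical, there is a single primary and a single replica, and "strong" is simplified to "read from the copy that took the write". Real systems use mechanisms such as synchronous replication or quorums instead.

```python
class EventualStore:
    """Toy store: writes hit the primary and reach the replica later."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []          # changes not yet copied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))

    def read_strong(self, key):
        """Always sees the latest write."""
        return self.primary.get(key)

    def read_eventual(self, key):
        """May return stale data until replicate() has run."""
        return self.replica.get(key)

    def replicate(self):
        """Apply all pending changes to the replica, in write order."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()


store = EventualStore()
store.write("stock", 5)
print(store.read_strong("stock"))    # 5
print(store.read_eventual("stock"))  # None, the replica has not caught up
store.replicate()
print(store.read_eventual("stock"))  # 5
```

The trade-off the sketch hints at: an eventual read never waits for replication, but the application must tolerate stale answers.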
Hadoop 2.0

Examples of Complex Values (1)

Comma-separated value:
"Anchorage Daily News","PO Box ","Anchorage","AK"," ", " "," ","71","","82",

EDIFACT message:
UNB+UNOA: : : : 'XXXUNH INVOIC:D:97B:UN'XXXBGM 'XXXDTM+ 3: :102'XXXRFF+ON:521052'XXXNAD+BY ::16++ CUMMINSMIDRANGEENGINEPLANT'XXXNAD+SE ::16++ GENERALWIDGETCOMPANY'XXXCUX+1:USD'XXXLIN :IN'XXXIMD+ F++:::WIDGET'XXXQTY+47:1020:EA'XXXALI+US'XXXMOA+203: 'XXXPRI+ INV:1.179'XXXLIN :IN'XXXIMD+F++:::DIFFERENTWIDGET'XXXQTY+ 47:20:EA'XXXALI+JP'XXXMOA+203:410'XXXPRI+INV:20.5'XXXUNS+S'XXXMOA+ 39: 'XXXALC+C+ABG'XXXMOA+8:525'XXXUNT 'XXXUNZ '

Example of Complex Value (2)

Weblog record (columns: datestamp, ip, request):
6/1/ :10:19 AM GET /x.php?u= HTTP/1.1
6/1/2012 5:53:49 AM GET /tv/3/player/vendor/chef%20tips/player/fiveminute/content/steak/asset/gnrc_ HTTP/1.1
6/1/2012 8:55:54 AM GET /tv/3/search/content/the%20andy%20griffith%20show/s/the%20Andy%20Griffith%20Show HTTP/1.1
6/1/2012 3:12:43 PM GET /tv/3/search/content/kathie%20lee%20gifford's%20epic%20'today'%20gaffe/s/kathie%20Lee%20Gifford's%20epic%20'Today'%20gaffe HTTP/1.1
6/1/2012 4:48:35 PM GET /tv/3/search/content/deadliest%20catch/s/deadliest%20catch HTTP/1.1
6/1/ :25:12 AM GET /x.php?u= 5.financialcontent.com/synacor?Page=QUOTE&Ticker=DJ:DJI HTTP/1.1
6/1/2012 1:58:14 AM GET /tv/3/player/vendor/chef%20tips/player/fiveminute/content/steak/asset/gnrc_ HTTP/

Unraveling the Data Model
[Diagram: three ways to handle complex values]
- Unravel & store -> classic database -> query
- Store -> classic database -> query & unravel
- Store -> MapReduce database -> query & unravel
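To make "unraveling" concrete, here is a minimal sketch that takes one weblog record of the kind shown above and splits it into typed fields. Several things are assumptions rather than facts from the slides: the function name `unravel`, the month/day/year reading of the datestamp, and the IP address, which is missing from the transcription and is replaced here by `192.0.2.1`, an address from a range reserved for documentation.

```python
from datetime import datetime
from urllib.parse import unquote


def unravel(record):
    """Turn one raw weblog line into a dict with typed fields."""
    # Layout assumed: date, time, AM/PM, ip, method, path, protocol
    date_part, time_part, ampm, ip, method, path, protocol = record.split(" ", 6)
    stamp = datetime.strptime(
        f"{date_part} {time_part} {ampm}", "%m/%d/%Y %I:%M:%S %p"
    )
    return {
        "datestamp": stamp,
        "ip": ip,
        "method": method,
        "path": unquote(path),     # decode %20 and other escapes
        "protocol": protocol,
    }


# The IP address below is a placeholder, not a value from the source.
raw = ("6/1/2012 4:48:35 PM 192.0.2.1 GET "
       "/tv/3/search/content/deadliest%20catch/s/deadliest%20catch HTTP/1.1")
row = unravel(raw)
print(row["datestamp"], row["path"])
```

Whether this step runs before the record is stored or only when it is queried is exactly the choice the "Unraveling the Data Model" slide lays out.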
Schema-On-Write
- SoW = data written to a database has a schema
- A schema is not optional
- Fixed schema-on-write
  - All records in a table have the same schema
  - For example, SQL systems
- Variable schema-on-write
  - When data is stored in the database, a schema is written together with the data itself
  - Different records in a table can have different schemas

Schema-On-Read
- SoR = data is assigned a schema when it is read from the database
- Stored data has no schema
- Complex values or schema-less values
- Schema-on-application-read
  - The application assigns a schema to the schema-less data (unraveling)
- Schema-on-database-read
  - The database server assigns a schema to the schema-less data
  - The application receives data with a schema

Tyranny of Performance

The Balancing Act
- Performance
- Scalability
- Availability
- Productivity
- Maintainability
- Time-to-market
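The difference between fixed schema-on-write and schema-on-application-read, as defined in the slides above, comes down to when the schema is enforced. The sketch below contrasts the two. It is an illustration only: the class names, the `SCHEMA` dictionary, and the comma-separated raw format are hypothetical and do not describe any specific product.

```python
# Hypothetical fixed schema: column name -> required Python type
SCHEMA = {"id": int, "name": str}


class FixedSchemaTable:
    """Schema-on-write: every record is checked before it is stored."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def insert(self, record):
        if set(record) != set(self.schema):
            raise ValueError("record does not match the table schema")
        for column, column_type in self.schema.items():
            if not isinstance(record[column], column_type):
                raise ValueError(f"wrong type for column {column!r}")
        self.rows.append(record)


class SchemaOnReadStore:
    """Schema-on-read: raw values are stored as they arrive."""

    def __init__(self):
        self.values = []

    def write(self, raw):
        self.values.append(raw)            # no schema check at write time

    def read(self, assign_schema):
        # The reading application supplies the schema (unraveling)
        return [assign_schema(raw) for raw in self.values]


table = FixedSchemaTable(SCHEMA)
table.insert({"id": 1, "name": "Ann"})

store = SchemaOnReadStore()
store.write("2,Bob")
print(store.read(lambda raw: dict(zip(("id", "name"), raw.split(",")))))
```

With schema-on-write a bad record fails at insert time; with schema-on-read it is accepted silently and any problem surfaces only when some application tries to interpret it.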
The Classic Reporting Environment

The Upcoming Analytical Labyrinth

[Two diagrams. Labels: unstructured, operational, external, private, applications, databases, personal store, Executive, staging area, marts, Interactive, data warehouse, Predictive analytics, sandboxes, big data analytics, big data]

Do We Want Analytical Silos?

Heading for an Integration Labyrinth

[Two diagrams with the same components]
- Consumers: applications, self-service BI, iterative, predictive analytics, mobile, predefined
- Sources: databases, big data, unstructured, sandboxes, private, staging area, data warehouse & marts, social media, streaming, databases, external
Different Database Workloads
[Chart along a time axis]
- OLXP: XML database -> SQL database
- OLAP: OLAP database -> SQL database
- OLCP: OO database -> SQL database
- OLTP: pre-relational database -> SQL database

Hadoop APIs Too Technical?

Is Google Going SQL?
- 2012: Spanner supports general-purpose transactions, and provides a SQL-based query language.
- Google's motivation: "We believe it is better to have application programmers deal with performance problems due to overuse of transactions as bottlenecks arise, rather than always coding around the lack of transactions."

Market of SQL-fication Products
- SQL-on-Hadoop engines
  Examples: Apache Hive, Cassandra CQL, CitusDB, Cloudera Impala, Concurrent Lingual, Hadapt, InfiniDB, JethroData, MammothDB, MapR Drill, MemSQL, Pivotal HawQ, Progress DataDirect, ScleraDB, Simba, SpliceMachine
- Data virtualization and federation servers
  Examples: Cirro, Cisco/Composite, Denodo, Informatica IDS, Red Hat JBoss Data Virtualization, Stone Bond
- SQL databases (polyglot persistence)
  Examples: EMC Greenplum UAP, Hadapt, Microsoft Polybase, ParAccel, Teradata Aster Database (SQL-H)
CitusData CitusDB
[Diagram labels: CitusDB, HDFS, MongoDB]
- Designed for analytical queries
- Characteristics
  - No use of MapReduce or Hive
  - Knows the location of data, which speeds up access
  - Based on PostgreSQL
  - Queries are pushed to the nodes
  - Statistics are collected on the data
  - UDFs are supported

JethroData Jethro
[Diagram labels: Jethro, HDFS]
- Designed for interactive queries
- Characteristics
  - Every column is indexed!!
  - Append-only inverted lists: index entries are appended
  - Inserts: no impact on reads
  - 30-40% extra storage
  - Columnar store
  - ANSI-92 SQL: DDL + query
  - Supports joins

PivotalHD Hawq
- PivotalHD Hawq = Greenplum on HDFS
- Dual database strategy
- Uses the same file format as GemFire/SQLFire for transactions
- Greenplum = mature cost-based query optimizer
- Hawq compatible with Greenplum
- ACID compliant

PivotalHD Hawq Architecture
[Diagram labels: HBase, HawQ, HDFS]
Data Virtualization Overview (1)
[Diagram]
- Data consumers: application, analytics &, internal portal, mobile App, website, dashboard
- Consumer interfaces: ODBC/SQL, JDBC/SQL, XML/SOAP, REST/JSON, XQuery, MDX/DAX
- Middle: Data Virtualization Server
- Source interfaces: SQL, CICS, JMS, SQL, SQL+, XSLT, SOAP, Hive, Prop., Excel, JSON
- Sources: databases, data warehouse & marts, streaming, applications, databases, unstructured, ESB, big data, social media, private, external

Data Virtualization Overview (2)
[Same diagram, with sample traffic: SQL statement, statement, SOAP message, JMS message]

Definition of Data Virtualization
Data virtualization is the technology that offers data consumers a unified, abstracted, and encapsulated view for querying and manipulating data stored in a heterogeneous set of data stores.

The Market of Data Virtualization Servers
- Cirro Data Hub
- Cisco/Composite Information Server
- Denodo Platform
- IBM InfoSphere Federation Server
- Informatica Data Services
- Information Builders EII
- Oracle Data Services Integrator
- Progress Easyl
- Red Hat Teiid and JBoss Data Virtualization
- Stone Bond Enterprise Enabler
- Virtuoso
- And many more
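The definition above, a unified and encapsulated view over heterogeneous data stores, can be illustrated with a deliberately small sketch. It does not represent any of the products listed: `VirtualizationServer`, `sql_source`, and `csv_source` are hypothetical names, and the two sources (an in-memory SQLite table and a CSV string) stand in for the much wider range of stores in the overview diagrams.

```python
import csv
import io
import sqlite3


class VirtualizationServer:
    """Toy layer that hides where and how each data set is stored."""

    def __init__(self):
        self.sources = {}

    def register(self, name, fetch_rows):
        """fetch_rows is a zero-argument callable returning a list of dicts."""
        self.sources[name] = fetch_rows

    def query(self, name, predicate=lambda row: True):
        # Rows are fetched at query time; nothing is copied in advance
        return [row for row in self.sources[name]() if predicate(row)]


def sql_source(connection, table):
    """Adapter for a table in a SQL database (table name is trusted here)."""
    def fetch():
        cursor = connection.execute(f"SELECT * FROM {table}")
        columns = [d[0] for d in cursor.description]
        return [dict(zip(columns, row)) for row in cursor]
    return fetch


def csv_source(text):
    """Adapter for comma-separated text with a header line."""
    def fetch():
        return [dict(row) for row in csv.DictReader(io.StringIO(text))]
    return fetch


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ann')")

server = VirtualizationServer()
server.register("customers", sql_source(conn, "customers"))
server.register("cities", csv_source("id,city\n1,The Hague\n"))

# The consumer uses the same call for both sources
print(server.query("customers"))
print(server.query("cities"))
```

The consumer never sees SQL or CSV parsing, only named data sets and rows. A real server would also translate and push down queries instead of filtering rows after fetching them.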
Data Stays Where It's Collected
- Data generated per day is more than can be moved across the network.
- Network will look like this

Data Virtualization to the Rescue?
[Diagram label: Data Virtualization Server]

Big Data is Too Big To Move
C-Level and Big Data
- 85% expect to gain substantial business and IT benefits from Big Data initiatives
- 85% have Big Data initiatives planned or in progress
- 70% report that these initiatives are enterprise-driven
- 85% of the initiatives are sponsored by a C-level executive or the head of a line of business
- 75% expect an impact across multiple lines of business

C-Level and Big Data
- % ranked their access to data as adequate or world-class
- 21% ranked their analytic capabilities as adequate or world-class
- 17% ranked their ability to use data and analytics to transform their business as more than adequate or world-class

Battle of Chancellorsville,
- USA Army Strength: 133,000
- CFA Army Strength: 60,000
You Can't Hide From Big Data Anymore
- IT specialists?
- IT departments?
- Benelux / Europe?

Big IT Party??
all database servers
  SQL database servers
    Classic SQL database servers
    Analytical SQL database servers
    NewSQL database servers
  NoSQL database servers
    Key-value database servers
    Document database servers
    Column-family database servers
    Graph database servers

Recommended Books
Integrating Big Data into the Computing Curricula
Integrating Big Data into the Computing Curricula Yasin Silva, Suzanne Dietrich, Jason Reed, Lisa Tsosie Arizona State University http://www.public.asu.edu/~ynsilva/ibigdata/ 1 Overview Motivation Big
Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world
Analytics March 2015 White paper Why NoSQL? Your database options in the new non-relational world 2 Why NoSQL? Contents 2 New types of apps are generating new types of data 2 A brief history of NoSQL 3
NoSQL systems: introduction and data models. Riccardo Torlone Università Roma Tre
NoSQL systems: introduction and data models Riccardo Torlone Università Roma Tre Why NoSQL? In the last thirty years relational databases have been the default choice for serious data storage. An architect
Comparison of the Frontier Distributed Database Caching System with NoSQL Databases
Comparison of the Frontier Distributed Database Caching System with NoSQL Databases Dave Dykstra [email protected] Fermilab is operated by the Fermi Research Alliance, LLC under contract No. DE-AC02-07CH11359
Trafodion Operational SQL-on-Hadoop
Trafodion Operational SQL-on-Hadoop SophiaConf 2015 Pierre Baudelle, HP EMEA TSC July 6 th, 2015 Hadoop workload profiles Operational Interactive Non-interactive Batch Real-time analytics Operational SQL
Next-Generation Cloud Analytics with Amazon Redshift
Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional
Big Data Technologies. Prof. Dr. Uta Störl Hochschule Darmstadt Fachbereich Informatik Sommersemester 2015
Big Data Technologies Prof. Dr. Uta Störl Hochschule Darmstadt Fachbereich Informatik Sommersemester 2015 Situation: Bigger and Bigger Volumes of Data Big Data Use Cases Log Analytics (Web Logs, Sensor
New Modeling Challenges: Big Data, Hadoop, Cloud
New Modeling Challenges: Big Data, Hadoop, Cloud Karen López @datachick www.datamodel.com Karen Lopez Love Your Data InfoAdvisors.com @datachick Senior Project Manager & Architect 1 Disclosure I m a Data
Actian SQL in Hadoop Buyer s Guide
Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop
MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering
MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation
Big Data: Are You Ready? Kevin Lancaster
Big Data: Are You Ready? Kevin Lancaster Director, Engineered Systems Oracle Europe, Middle East & Africa 1 A Data Explosion... Traditional Data Sources Billing engines Custom developed New, Non-Traditional
Applications for Big Data Analytics
Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail:
Performance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and
CitusDB Architecture for Real-Time Big Data
CitusDB Architecture for Real-Time Big Data CitusDB Highlights Empowers real-time Big Data using PostgreSQL Scales out PostgreSQL to support up to hundreds of terabytes of data Fast parallel processing
Data Virtualization for Agile Business Intelligence Systems and Virtual MDM. To View This Presentation as a Video Click Here
Data Virtualization for Agile Business Intelligence Systems and Virtual MDM To View This Presentation as a Video Click Here Agenda Data Virtualization New Capabilities New Challenges in Data Integration
Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related
Chapter 11 Map-Reduce, Hadoop, HDFS, Hbase, MongoDB, Apache HIVE, and Related Summary Xiangzhe Li Nowadays, there are more and more data everyday about everything. For instance, here are some of the astonishing
Big Data Analytics Platform @ Nokia
Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform
NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015
NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,
Oracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014
Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner [email protected] @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily
Bringing Intergalactic Data Speak (a.k.a.: SQL) to Hadoop Martin Willcox [@willcoxmnk], Director Big Data Centre of Excellence (Teradata
Bringing Intergalactic Data Speak (a.k.a.: SQL) to Hadoop Martin Willcox [@willcoxmnk], Director Big Data Centre of Excellence (Teradata International) 4 th June 2015 Agenda A (very!) short history of
The Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
An Approach to Implement Map Reduce with NoSQL Databases
www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 4 Issue 8 Aug 2015, Page No. 13635-13639 An Approach to Implement Map Reduce with NoSQL Databases Ashutosh
Performance and Scalability Overview
Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics platform. PENTAHO PERFORMANCE ENGINEERING
Big Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based
Data Warehouse design
Data Warehouse design Design of Enterprise Systems University of Pavia 10/12/2013 2h for the first; 2h for hadoop - 1- Table of Contents Big Data Overview Big Data DW & BI Big Data Market Hadoop & Mahout
BIRT in the World of Big Data
BIRT in the World of Big Data David Rosenbacher VP Sales Engineering Actuate Corporation 2013 Actuate Customer Days Today s Agenda and Goals Introduction to Big Data Compare with Regular Data Common Approaches
Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013
Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the
Actian Vector in Hadoop
Actian Vector in Hadoop Industrialized, High-Performance SQL in Hadoop A Technical Overview Contents Introduction...3 Actian Vector in Hadoop - Uniquely Fast...5 Exploiting the CPU...5 Exploiting Single
A Survey on Big Data Analytical Tools
A Survey on Big Data Analytical Tools Mr. Mahesh G Huddar Sr. Lecturer Dept. of Computer Science and Engineering Hirasugar Institute of Technology, Nidasoshi, Karnataka, India Manjula M Ramannavar Asst.
Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM
Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that
