So What's the Big Deal?


2 So What's the Big Deal?

3 Presentation Agenda
  Introduction
  What is Big Data?
  So What is the Big Deal?
  Big Data Technologies
  Identifying Big Data Opportunities
  Conducting a Big Data Proof of Concept
  Big Data Case Study (if we have time)
  Q&A
  Links to More Information

4 Introduction
  RhinoSource, Inc.
  Oracle App/Tech Consulting and Managed Services
    Oracle E-Business Suite
    Oracle Business Intelligence
    Oracle Database (performance, partitioning and replication)
    Application Development and Advanced PL/SQL Development
  Advanced Technology Consulting and Managed Services
    Big Data
    Mobile Applications
    Cloud Computing
  CIO Level Advisory Services
    IT Strategy, Planning and Project Management
    ERP/CRM Evaluation and Implementation

5 WHAT IS BIG DATA?

6 What Makes Up Big Data?
  Blog posts, user comments
  Emails and Messaging
  Web server logs
  Instrumentation of online stores
  Image and video uploads
  Process data, such as RFID
  Sensor device data
  External data sets
    Census data
    Weather data
    Geographical data
  Shadow Data (replicated copies and change journals)

7 The 3 V's of Big Data
  Velocity
    Data generated at a faster rate than ever before
    Server logs, smart phones, sensor devices, RFID
    Challenge: Existing systems cannot process new data fast enough
  Variety
    Data more varied and complex
    Structured and unstructured
    Many formats: text, document, image, video
    Challenge: Existing databases do not handle varying data formats well
  Volume
    Orders of magnitude larger
    2.5 Zettabytes of new data created in
    Zettabytes of new data projected to be created in
    Billion Internet users, 15 Billion connected devices
    Challenge: Existing databases do not cost-effectively scale to Big Data sizes

8 Big Data Growth Trend Zettabytes 40% CAGR
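The 40% CAGR on this slide compounds quickly. The following is a minimal Python sketch of that arithmetic; the function names and the starting volume of 1.0 are illustrative assumptions, not figures from the deck.

```python
import math


def project_volume(start_volume: float, cagr: float, years: int) -> float:
    """Volume after `years` of compound growth at rate `cagr` (0.40 means 40%)."""
    return start_volume * (1.0 + cagr) ** years


def doubling_time_years(cagr: float) -> float:
    """Years needed for the volume to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + cagr)


if __name__ == "__main__":
    # The starting volume of 1.0 is a placeholder unit, not a number from the slide.
    for year in range(6):
        print(year, round(project_volume(1.0, 0.40, year), 2))
    print("doubling time (years):", round(doubling_time_years(0.40), 2))
```

At a constant 40% per year, the volume roughly doubles every two years.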

9 How Much is 1 Zettabyte?
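One way to get a feel for the unit is to convert it. The sketch below assumes decimal (SI) units, where each step is a factor of 1000; the helper name `convert` is hypothetical.

```python
# Decimal (SI) byte units: each step up the list is a factor of 1000.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between decimal byte units, for example ZB to TB."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000.0 ** steps


if __name__ == "__main__":
    print("1 ZB in PB:", convert(1, "ZB", "PB"))
    print("1 ZB in TB:", convert(1, "ZB", "TB"))
```

Under that assumption, 1 Zettabyte is one million Petabytes, or one billion Terabytes.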

10 New Data by End of ZB of New Data By end of 2015!

11 SO, WHAT'S THE BIG DEAL?

12 Can Big Data Show Us The Way? Scientific American, Dec 11: "...the rise of 'big data' [is] a trend that is striking many scientists as being on a par with the invention of the telescope and microscope." "...many experts believe we are on the cusp of opening up new worlds of inquiry."

13 Big Data Advantages
  Better, more accurate predictions
  Deeper, richer insights into customers, business partners and the operations
  Real-time Big Data analytics enables faster decision making
  Creates competitive advantage
  Improves bottom line

14 Big Data Spending
  Companies have spent $4.3 billion on Big Data as of the end of
  Gartner predicts those initial investments will in turn trigger a domino effect of upgrades and new initiatives
    Valued at $34 billion for 2013, per Gartner.
    Over a 5-year period, spend is estimated at $232 billion.

15 BIG DATA TECHNOLOGIES

16 A Brief History of Big Data
  [Chart: Scale versus Time, with labels RDBMS, RAC, Data Warehouse, Distributed Big Data Cluster]

17 How Do We Store Big Data?
  NoSQL databases store data records as key-value pairs
    Or as triplets with a timestamp.
  Schema-less or schema-optional
    Values may be structured or unstructured (developer's choice).
  Not relational
    No relationships between records
    No join support in a NoSQL database.
    Does not use SQL to store and retrieve records.
  Highly optimized for retrieval and appending operations.
    High-performance writes.
    High-performance retrieval by primary key.
    Little functionality beyond record storage and retrieval.
  Highly scalable to huge amounts of data
    Millions or Billions of records
    Partition data across many distributed, inexpensive servers for cost-effective scalability and availability
  Must trade off between Availability versus Consistency (CAP Theorem).
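The storage model on this slide can be sketched in a few lines of Python. This is a toy illustration, not how any particular NoSQL product is implemented: the class name, the in-memory "nodes", the modulo-based partitioning and the last-write-wins rule are all assumptions made for the example.

```python
import hashlib
import time
from typing import Any, Dict, List, Optional, Tuple


class PartitionedKeyValueStore:
    """Toy key-value store that spreads records across several 'nodes'.

    Each node is an in-memory dict here; a real NoSQL database would place
    each partition on a separate, inexpensive server.
    """

    def __init__(self, node_count: int) -> None:
        self.nodes: List[Dict[str, Tuple[Any, float]]] = [
            {} for _ in range(node_count)
        ]

    def _node_for(self, key: str) -> int:
        # Hash the primary key so records spread evenly across nodes.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def put(self, key: str, value: Any, timestamp: Optional[float] = None) -> None:
        # Each record is a (key, value, timestamp) triplet; the value is opaque.
        ts = time.time() if timestamp is None else timestamp
        node = self.nodes[self._node_for(key)]
        current = node.get(key)
        # Assumed conflict rule: the write with the newest timestamp wins.
        if current is None or ts >= current[1]:
            node[key] = (value, ts)

    def get(self, key: str) -> Optional[Any]:
        # Retrieval by primary key touches exactly one node; there are no joins.
        record = self.nodes[self._node_for(key)].get(key)
        return None if record is None else record[0]


if __name__ == "__main__":
    store = PartitionedKeyValueStore(node_count=3)
    store.put("user:42", {"name": "Fred"}, timestamp=1.0)
    store.put("user:42", {"name": "Fred Jones"}, timestamp=2.0)
    print(store.get("user:42"))
```

Modulo partitioning keeps the sketch short, but it reshuffles most keys whenever a node is added; production systems typically use schemes such as consistent hashing to avoid that.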

18 Popular NoSQL Databases
  Key-Value Stores
  Column-Oriented Databases
  Graph Databases
  Document Databases

19 Why Not Relational for Big Data?
  Transforming and loading data into RDBMS requires extensive preprocessing of data into a predefined schema
    Doesn't work well for semi-structured and unstructured data
    Can take more time than is available before next batch must be loaded
  Joining multiple data sets at query time is an expensive operation
  RDBMS scaling must be done vertically to larger and more expensive servers and storage solutions
  RDBMS clustering requires expensive networking and shared storage infrastructures
    Fibre Channel, InfiniBand, SAN, NAS
  Challenging to distribute data across data centers
    Replication strategies are add-ons and complex
  Strict Consistency requirement is enforced at the cost of write performance and availability (CAP Theorem)

20 Dr. Brewer's CAP Theorem: Pick 2
  CA: Oracle RAC, RDBMS
  CP: BigTable, Hadoop/HBase, MongoDB, Oracle NoSQL, Redis
  AP: Cassandra, CouchDB, Dynamo, Riak, SimpleDB

21 Scalability Comparison (Logarithmic Scale)
  [Chart: Terabytes and Server Nodes for MongoDB, Oracle RAC, Cassandra and Hadoop]
  Data labels: PB, 2000 Nodes at Facebook; TB, 100 Nodes at Craigslist; 71 TB, 48 Nodes at Amazon; 300 TB, 400 Nodes at Digital Reasoning

22 Scalability Comparison (Linear Scale)
  [Chart: Terabytes and Server Nodes for MongoDB, Oracle RAC, Cassandra and Hadoop]
  Data labels: PB, 2000 Nodes at Facebook; TB, 100 Nodes at Craigslist; 71 TB, 48 Nodes at Amazon; 300 TB, 400 Nodes at Digital Reasoning

23 Feature Comparison
  Cassandra
    Best of the NoSQLs for Cross-Data-Center Replication and High Availability
    Known to scale to 100s of Terabytes (but theoretically to Petabytes)
    Tunable Consistency at operation level for writes and reads. Availability model (AP).
    Primary and Secondary Indexes
    Queries are Real-Time (CQL, Thrift)
    No Join Support
    Masterless Peer-to-Peer Ring Architecture = No S.P.O.F.
    Provides most cost-effective HA and scalability of the NoSQLs
    Written in Java
    Minimum of 3 nodes recommended.
    Easy to install and set up on commodity hardware.
  Hadoop/HBase
    The current Gold Standard of the NoSQLs for Data Analysis
    Known to scale to Petabytes (1000s of Terabytes)
    Consistency model (CP)
    Hadoop Queries are Batch (MapReduce). HBase provides real-time queries similar to Cassandra.
    Joins are Possible
    Master-Slave Architecture = S.P.O.F. (Name/JobTracker Node)
    Written in Java
    Minimum of 5 nodes recommended.
    More challenging installation and setup.
    Warm Standby and Shared Storage Required for High Availability Failover, so higher infrastructure costs.
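Tunable consistency is commonly explained with the replica-overlap rule R + W > N, where N is the number of replicas of a record, W the number that must acknowledge a write, and R the number consulted on a read. That rule is general background on replicated stores rather than something stated on the slide, and the function names below are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(
    replication_factor: int, write_replicas: int, read_replicas: int
) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    n = 3  # assumed replication factor for the example
    q = quorum(n)
    print("quorum of", n, "replicas:", q)
    print("quorum writes + quorum reads overlap:", reads_see_latest_write(n, q, q))
    print("single-replica writes + reads overlap:", reads_see_latest_write(n, 1, 1))
```

Choosing smaller R and W per operation trades that overlap guarantee for lower latency and higher availability, which is the trade-off the CAP Theorem slide describes.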

24 Best of All Worlds: DataStax Enterprise
  Cassandra Real-Time Database
    Peer-to-Peer HA Architecture
    Cross-Data-Center Replication
    Real-Time, Low-Latency Queries
  Hadoop Analytics
    Map/Reduce, Hive, Pig (Joins)
  Solr Search
    Full-Text Search
    Rich Document Handling (Word, PDF)

25 Plus Cluster Management

26 Current Big Data Challenges
  Integrating Big Data with existing databases and BI/reporting systems.
    JDBC, ODBC
    sqoop
  Security and Encryption
    DataStax Enterprise 3.0 (In Beta)
      Transparent Data Encryption
      Internal and External Authentication
      Data Auditing

27 IDENTIFYING BIG DATA OPPORTUNITIES

28 Big Data Use Cases
  Context for Interactions and Transactions
    Reward Points
    Warranty Policies
    Social media chatter
    Survey response feedback
    Website requests
  Connection with Outside Patterns
    Weather Data
    Demographic Data
    Geographical Data
    Government Compliance Data
  Improving Disaster and Outage Response Times by Spotting Trends
  Compliance Checks and Audits
  Competitive Insights into How Your Products and Services (and your competition's) are used and perceived in the marketplace.
  Database Infrastructure Behind Mobile and Web Applications

29 Great Places to Look for Big Data Opportunities
  Server Logs
    Web server and app server logs
    Call center/phone system logs
  Product Data
    Performance data
    Sensor data
    Positional data
    Streamed live or captured in log files / data files
  Current RDBMS Archive/Purge Strategies
    What data are you deleting every day/month/year?
    Financial Data, Operational Data, Customer Interactions

30 Implementing Big Data
  Identify "Game Changing" Big Data opportunities.
  Define a business case.
  Identify existing business and functional capabilities.
  Augment existing capabilities with 3rd-party assistance.
  Conduct low-cost Proof of Concept project to demonstrate feasibility.

31 Low-Cost Proof of Concept
  Take advantage of a cloud platform like Amazon Web Services (AWS) and Amazon EC2.
    Run a multi-node cluster for less than $25/day.
    Get started instantly. Have a cluster up and running in only a few hours.
  NoSQL technologies are perfectly suited for the cloud-deployed model.
    Amazon Machine Images (AMIs) exist for most NoSQL products that can be started in just a few minutes.
  You can make it as secure as you need it to be.
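The $25/day figure can be turned into a per-node hourly ceiling when sizing a proof of concept. The sketch below is only arithmetic on the slide's budget: the three-node cluster size and the sample hourly rate are assumptions for illustration, and no actual cloud prices are implied.

```python
HOURS_PER_DAY = 24


def max_hourly_rate_per_node(daily_budget: float, node_count: int) -> float:
    """Highest per-node hourly price that keeps a 24-hour cluster within budget."""
    return daily_budget / (node_count * HOURS_PER_DAY)


def cluster_cost(hourly_rate_per_node: float, node_count: int, hours_running: float) -> float:
    """Total cost when paying only for the hours the servers are running."""
    return hourly_rate_per_node * node_count * hours_running


if __name__ == "__main__":
    nodes = 3  # assumed cluster size for the example
    ceiling = max_hourly_rate_per_node(25.0, nodes)
    print("per-node hourly ceiling:", round(ceiling, 4))
    # Hypothetical rate of 0.25 per node-hour, run for an 8-hour working session.
    print("8-hour session cost:", cluster_cost(0.25, nodes, 8))
```

Because billing follows running time, shutting the cluster down between test sessions lowers the cost further.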

32 Low-Cost Proof of Concept
  Now that you have a cluster up and running:
    Load up some test data. (Check out sqoop.)
    Get your HiveQL book in hand and start doing some analysis.
  Delete the servers once you are done.
    Only pay for the time the servers are running.
  You can always bring the cluster in-house for production, but you might find out it's more cost-effective to leave it in the Cloud!

33 (If we have time) BIG DATA CASE STUDY

34 Client Overview
  Mobile social networking startup
  Focused on families with kids
  Launching in Q
  Currently in Stealth Mode pending launch the first week of March, 2013
  Big Data Use Case: Infrastructure behind mobile app

35 The Challenge
  Big Data application
  Semi-structured and unstructured data
  Low latency (<100ms) for user experience
  24 x 7 high availability
  Cloud deployment (Amazon AWS)
  Analytical capability required

36 The Solution
  DataStax Enterprise Big Data Database Cluster
    Cassandra database for low-latency reads and writes
    Cluster architecture for high availability
    Tunable read and write consistency
    Integrated Hadoop workload support for analytics
    Integrated Solr workload support for search feature
    DataStax OpsCenter tool for cluster management
  Benefits
    High-performance reads and writes = good customer experience
    Only single cluster required for Cassandra, Hadoop and Solr
    Commercial-grade support
    Cost-effective solution
    Fast deployment (30 days)

37 Technical Details
  Installed DataStax Enterprise on Amazon AWS
    3 x M1.Large Nodes
    Will double to 6 nodes later in the year
    Each node will hold ~800GB of data
  Implemented monitoring and alerts
    Cluster stats collected every 15 seconds
    Stats stored in db and graphed
    Amazon SNS for notifications (email and SMS)
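The sizing numbers on this slide imply some quick arithmetic. The sketch below uses only the slide's figures (~800GB per node, 3 nodes growing to 6, stats every 15 seconds); the function names are hypothetical, and the replication factor is not given in the deck, so the result is raw capacity rather than unique data.

```python
GB_PER_NODE = 800             # "~800GB" per node, from the slide
SAMPLE_INTERVAL_SECONDS = 15  # cluster stats collected every 15 seconds


def cluster_capacity_gb(node_count: int, gb_per_node: int = GB_PER_NODE) -> int:
    """Raw data held across the cluster, before accounting for replication."""
    return node_count * gb_per_node


def samples_per_day(interval_seconds: int = SAMPLE_INTERVAL_SECONDS) -> int:
    """Number of monitoring samples each metric produces in 24 hours."""
    return 24 * 60 * 60 // interval_seconds


if __name__ == "__main__":
    print("3 nodes, raw GB:", cluster_capacity_gb(3))
    print("6 nodes, raw GB:", cluster_capacity_gb(6))
    print("samples per metric per day:", samples_per_day())
```

With replicas of each record stored on more than one node, the amount of unique data would be the raw figure divided by the replication factor.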

38 Amazon AWS and EC2

39 OpsCenter Cluster Management

40 Cluster Ring View

41 Performance Monitoring

42 Customized Dashboards

43 More Custom Dashboards

44 Q&A

45 More Reading

46 Thank you!

Data Services Advisory

Data Services Advisory Data Services Advisory Modern Datastores An Introduction Created by: Strategy and Transformation Services Modified Date: 8/27/2014 Classification: DRAFT SAFE HARBOR STATEMENT This presentation contains

More information

Firebird meets NoSQL (Apache HBase) Case Study

Firebird meets NoSQL (Apache HBase) Case Study Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

More information

Introduction to Apache Cassandra

Introduction to Apache Cassandra Introduction to Apache Cassandra White Paper BY DATASTAX CORPORATION JULY 2013 1 Table of Contents Abstract 3 Introduction 3 Built by Necessity 3 The Architecture of Cassandra 4 Distributing and Replicating

More information

Dominik Wagenknecht Accenture

Dominik Wagenknecht Accenture Dominik Wagenknecht Accenture Improving Mainframe Performance with Hadoop October 17, 2014 Organizers General Partner Top Media Partner Media Partner Supporters About me Dominik Wagenknecht Accenture Vienna

More information

INTRODUCTION TO CASSANDRA

INTRODUCTION TO CASSANDRA INTRODUCTION TO CASSANDRA This ebook provides a high level overview of Cassandra and describes some of its key strengths and applications. WHAT IS CASSANDRA? Apache Cassandra is a high performance, open

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015

NoSQL Databases. Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 NoSQL Databases Institute of Computer Science Databases and Information Systems (DBIS) DB 2, WS 2014/2015 Database Landscape Source: H. Lim, Y. Han, and S. Babu, How to Fit when No One Size Fits., in CIDR,

More information

III Big Data Technologies

III Big Data Technologies III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Real Time Big Data Processing

Real Time Big Data Processing Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure

More information

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect on AWS Services Overview Bernie Nallamotu Principle Solutions Architect \ So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney Introduction to Hadoop New York Oracle User Group Vikas Sawhney GENERAL AGENDA Driving Factors behind BIG-DATA NOSQL Database 2014 Database Landscape Hadoop Architecture Map/Reduce Hadoop Eco-system Hadoop

More information

No-SQL Databases for High Volume Data

No-SQL Databases for High Volume Data Target Conference 2014 No-SQL Databases for High Volume Data Edward Wijnen 3 November 2014 The New Connected World Needs a Revolutionary New DBMS Today The Internet of Things 1990 s Mobile 1970 s Mainfram

More information

Preparing Your Data For Cloud

Preparing Your Data For Cloud Preparing Your Data For Cloud Narinder Kumar Inphina Technologies 1 Agenda Relational DBMS's : Pros & Cons Non-Relational DBMS's : Pros & Cons Types of Non-Relational DBMS's Current Market State Applicability

More information

Big Data and Hadoop for the Executive A Reference Guide

Big Data and Hadoop for the Executive A Reference Guide Big Data and Hadoop for the Executive A Reference Guide Overview The amount of information being collected by companies today is incredible. Wal- Mart has 460 terabytes of data, which, according to the

More information

Applications for Big Data Analytics

Applications for Big Data Analytics Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail:

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies Big Data: Global Digital Data Growth Growing leaps and bounds by 40+% Year over Year! 2009 =.8 Zetabytes =.08

More information

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing Evaluating NoSQL for Enterprise Applications Dirk Bartels VP Strategy & Marketing Agenda The Real Time Enterprise The Data Gold Rush Managing The Data Tsunami Analytics and Data Case Studies Where to go

More information

Open source large scale distributed data management with Google s MapReduce and Bigtable

Open source large scale distributed data management with Google s MapReduce and Bigtable Open source large scale distributed data management with Google s MapReduce and Bigtable Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory

More information

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014 Highly available, scalable and secure data with Cassandra and DataStax Enterprise GOTO Berlin 27 th February 2014 About Us Steve van den Berg Johnny Miller Solutions Architect Regional Director Western

More information

Scalable Architecture on Amazon AWS Cloud

Scalable Architecture on Amazon AWS Cloud Scalable Architecture on Amazon AWS Cloud Kalpak Shah Founder & CEO, Clogeny Technologies kalpak@clogeny.com 1 * http://www.rightscale.com/products/cloud-computing-uses/scalable-website.php 2 Architect

More information

Big Data on Microsoft Platform

Big Data on Microsoft Platform Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4

More information

Structured Data Storage

Structured Data Storage Structured Data Storage Xgen Congress Short Course 2010 Adam Kraut BioTeam Inc. Independent Consulting Shop: Vendor/technology agnostic Staffed by: Scientists forced to learn High Performance IT to conduct

More information

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford SQL VS. NO-SQL Adapted Slides from Dr. Jennifer Widom from Stanford 55 Traditional Databases SQL = Traditional relational DBMS Hugely popular among data analysts Widely adopted for transaction systems

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise

Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise White Paper BY DATASTAX CORPORATION October 2013 1 Table of Contents Abstract 3 Introduction 3 The Growth in Multiple

More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores

More information

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld

#mstrworld. Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru. #mstrworld Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop? Customer Case

More information

Data Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com

Data Warehousing and Analytics Infrastructure at Facebook. Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com Data Warehousing and Analytics Infrastructure at Facebook Ashish Thusoo & Dhruba Borthakur athusoo,dhruba@facebook.com Overview Challenges in a Fast Growing & Dynamic Environment Data Flow Architecture,

More information

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?

More information

Big Data Analytics Platform @ Nokia

Big Data Analytics Platform @ Nokia Big Data Analytics Platform @ Nokia 1 Selecting the Right Tool for the Right Workload Yekesa Kosuru Nokia Location & Commerce Strata + Hadoop World NY - Oct 25, 2012 Agenda Big Data Analytics Platform

More information

Real World Big Data Architecture - Splunk, Hadoop, RDBMS

Real World Big Data Architecture - Splunk, Hadoop, RDBMS Copyright 2015 Splunk Inc. Real World Big Data Architecture - Splunk, Hadoop, RDBMS Raanan Dagan, Big Data Specialist, Splunk Disclaimer During the course of this presentagon, we may make forward looking

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success

The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success 1 Table of Contents Abstract... 3 Introduction... 3 Requirement #1 Smarter Customer Interactions... 4 Requirement

More information

Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1

Why NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1 Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots

More information

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce

Analytics in the Cloud. Peter Sirota, GM Elastic MapReduce Analytics in the Cloud Peter Sirota, GM Elastic MapReduce Data-Driven Decision Making Data is the new raw material for any business on par with capital, people, and labor. What is Big Data? Terabytes of

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

NoSQL Data Base Basics

NoSQL Data Base Basics NoSQL Data Base Basics Course Notes in Transparency Format Cloud Computing MIRI (CLC-MIRI) UPC Master in Innovation & Research in Informatics Spring- 2013 Jordi Torres, UPC - BSC www.jorditorres.eu HDFS

More information

Introduction to Big Data Training

Introduction to Big Data Training Introduction to Big Data Training The quickest way to be introduce with NOSQL/BIG DATA offerings Learn and experience Big Data Solutions including Hadoop HDFS, Map Reduce, NoSQL DBs: Document Based DB

More information

Hadoop vs Apache Spark

Hadoop vs Apache Spark Innovate, Integrate, Transform Hadoop vs Apache Spark www.altencalsoftlabs.com Introduction Any sufficiently advanced technology is indistinguishable from magic. said Arthur C. Clark. Big data technologies

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Enterprise Operational SQL on Hadoop Trafodion Overview

Enterprise Operational SQL on Hadoop Trafodion Overview Enterprise Operational SQL on Hadoop Trafodion Overview Rohit Jain Distinguished & Chief Technologist Strategic & Emerging Technologies Enterprise Database Solutions Copyright 2012 Hewlett-Packard Development

More information

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН Zettabytes Petabytes ABC Sharding A B C Id Fn Ln Addr 1 Fred Jones Liberty, NY 2 John Smith?????? 122+ NoSQL Database

More information

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION Syed Rasheed Solution Manager Red Hat Corp. Kenny Peeples Technical Manager Red Hat Corp. Kimberly Palko Product Manager Red Hat Corp.

More information

Cloud Scale Distributed Data Storage. Jürmo Mehine

Cloud Scale Distributed Data Storage. Jürmo Mehine Cloud Scale Distributed Data Storage Jürmo Mehine 2014 Outline Background Relational model Database scaling Keys, values and aggregates The NoSQL landscape Non-relational data models Key-value Document-oriented

More information

Hadoop and its Usage at Facebook. Dhruba Borthakur dhruba@apache.org, June 22 rd, 2009

Hadoop and its Usage at Facebook. Dhruba Borthakur dhruba@apache.org, June 22 rd, 2009 Hadoop and its Usage at Facebook Dhruba Borthakur dhruba@apache.org, June 22 rd, 2009 Who Am I? Hadoop Developer Core contributor since Hadoop s infancy Focussed on Hadoop Distributed File System Facebook

More information

Big Data: Beyond the Hype

Big Data: Beyond the Hype Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER Big Data: Beyond the Hype Why Big Data Matters to You By DataStax Corporation October 2011 Table of Contents Introduction...4 Big Data

More information

BIG DATA TOOLS. Top 10 open source technologies for Big Data

BIG DATA TOOLS. Top 10 open source technologies for Big Data BIG DATA TOOLS Top 10 open source technologies for Big Data We are in an ever expanding marketplace!!! With shorter product lifecycles, evolving customer behavior and an economy that travels at the speed

More information

Cloud Computing Now and the Future Development of the IaaS

Cloud Computing Now and the Future Development of the IaaS 2010 Cloud Computing Now and the Future Development of the IaaS Quanta Computer Division: CCASD Title: Project Manager Name: Chad Lin Agenda: What is Cloud Computing? Public, Private and Hybrid Cloud.

More information

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

Tips and Tricks for Using Oracle TimesTen In-Memory Database in the Application Tier

Tips and Tricks for Using Oracle TimesTen In-Memory Database in the Application Tier Tips and Tricks for Using Oracle TimesTen In-Memory Database in the Application Tier Simon Law TimesTen Product Manager, Oracle Meet The Experts: Andy Yao TimesTen Product Manager, Oracle Gagan Singh Senior

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA

ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call

More information

Big Data: Beyond the Hype. Why Big Data Matters to You. White Paper

By DataStax Corporation, October 2013. Table of Contents: Abstract; Introduction; Big Data and You; Big Data Is More Prevalent Than You

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Big Data transforms Business Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

Big Data Analytics for Space Exploration, Entrepreneurship and Policy Opportunities. Tiffani Crawford, PhD

Big Analytics Characteristics Large quantities of many data types Structured Unstructured Human Machine

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

If every image made and every word written from the earliest stirring of civilization

Hadoop & its Usage at Facebook

Dhruba Borthakur Project Lead, Hadoop Distributed File System dhruba@apache.org Presented at The Israeli Association of Grid Technologies July 15, 2009 Outline Architecture

Don't Let Your Shoppers Drop; 5 Rules for Today's Ecommerce A guide for ecommerce teams comprised of line-of-business managers and IT managers

White Paper by DataStax Corporation, August 2013. Table of Contents

Introduction to Multi-Data Center Operations with Apache Cassandra, Hadoop, and Solr WHITE PAPER

By DataStax Corporation, August 2012. Contents: Introduction; The Growth in Multiple Data Centers; Why

SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES

AWS GLOBAL INFRASTRUCTURE 10 Regions 25 Availability Zones 51 Edge locations WHAT

Analyzing Big Data with AWS

Peter Sirota, General Manager, Amazon Elastic MapReduce @petersirota What is Big Data? Computer generated data Application server logs (web sites, games) Sensor data (weather,

HDP Hadoop From concept to deployment.

Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

Native Connectivity to Big Data Sources in MSTR 10

Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

Oracle's Big Data solutions. Roger Wullschleger.

DBTA Workshop on Big Data, Cloud Data Management and NoSQL, 10 October 2012, Stade de Suisse, Berne. The following is intended to outline

Information Builders Mission & Value Proposition

10/06/2015 2015 MapR Technologies Economies of Scale & Increasing Returns (Note: Not to be confused with diminishing returns

Big Data: Beyond the Hype

Why Big Data Matters to You. White paper by DataStax Corporation, March 2012. Contents: Introduction; Big Data and You; Big Data Is More Prevalent Than You Think; Big

The Next Wave of Data Management. Is Big Data The New Normal?

Table of Contents: Introduction; Separating Reality and Hype; Why Are Firms Making IT Investments In Big Data?; Trends In Data Management

Oracle Big Data SQL Technical Update

Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

Analytics March 2015 White paper. Why NoSQL? Your database options in the new non-relational world

Contents: New types of apps are generating new types of data; A brief history of NoSQL

This Symposium brought to you by www.ttcus.com

Linkedin/Group: Technology Training Corporation @Techtrain Technology Training Corporation www.ttcus.com Big Data Analytics as a Service (BDAaaS) Big Data

From Spark to Ignition:

Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What's in Store For This Presentation? 1. MemSQL: A real-time database for

BENCHMARKING CLOUD DATABASES CASE STUDY on HBASE, HADOOP and CASSANDRA USING YCSB

Planet Size Data!? Gartner's 10 key IT trends for 2012 unstructured data will grow some 80% over the course of the next

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES

Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data

Introduction to NOSQL

Université Paris-Est Marne la Vallée, LIGM UMR CNRS 8049, France January 31, 2014 Motivations NOSQL stands for Not Only SQL Motivations Exponential growth of data set size (161Eo

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

WA2192 Introduction to Big Data and NoSQL EVALUATION ONLY

Web Age Solutions Inc. USA: 1-877-517-6540 Canada: 1-866-206-4644 Web: http://www.webagesolutions.com The following terms are trademarks of other companies: Java

Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores

Composite Software, October 2010. TABLE OF CONTENTS: INTRODUCTION; BUSINESS AND IT DRIVERS; NOSQL DATA STORES LANDSCAPE

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

Hadoop & its Usage at Facebook

Dhruba Borthakur Project Lead, Hadoop Distributed File System dhruba@apache.org Presented at the Storage Developer Conference, Santa Clara September 15, 2009 Outline Introduction

Introduction to Big Data and the Lambda Architecture

Marc Schöni Meinrad Weiss April 2014 BASEL BERN BRUGG LAUSANNE ZUERICH DUESSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MUNICH STUTTGART VIENNA 1 What

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

A PATH FOR HORIZING YOUR INNOVATIVE WORK OVERVIEW ON BIG DATA SYSTEMATIC TOOLS MR. SACHIN D. CHAVHAN 1, PROF. S. A. BHURA

Big Data and Data Science: Behind the Buzz Words

Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

A Survey on Big Data Concepts and Tools

D. Rajasekar 1, C. Dhanamani 2, S. K. Sandhya 3 1,3 PG Scholar, 2 Assistant Professor, Department of Computer Science and Engineering, Sri Krishna College of Engineering

Moving From Hadoop to Spark

Sujee Maniyam Founder / Principal @ www.elephantscale.com sujee@elephantscale.com Bay Area ACM meetup (2015-02-23) + HI, Featured in Hadoop Weekly #109 + About Me : Sujee

Hortonworks & SAS. Analytics everywhere. Hortonworks Inc. 2011-2014. All Rights Reserved

A change in focus. A shift in Advertising From mass branding A shift in Financial Services From Educated Investing A shift in Healthcare From mass treatment

A Brief Outline on Bigdata Hadoop

Twinkle Gupta 1, Shruti Dixit 2 RGPV, Department of Computer Science and Engineering, Acropolis Institute of Technology and Research, Indore, India Abstract- Bigdata is

Search and Real-Time Analytics on Big Data

Sewook Wee, Ryan Tabora, Jason Rutherglen Accenture & Think Big Analytics Strata New York October, 2012 Big Data: data becomes your core asset. It realizes its

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

Big Data: Tools and Technologies in Big Data

Jaskaran Singh Student Lovely Professional University, Punjab Varun Singla Assistant Professor Lovely Professional University, Punjab ABSTRACT Big data can

Distributed Storage Systems

John Leach john@brightbox.com twitter @johnleach Brightbox Cloud http://brightbox.com Our requirements Bright box has multiple zones (data centres) Should tolerate a zone failure