Applications for Big Data Analytics

Similar documents
Transforming the Telecoms Business using Big Data and Analytics

Big Data Explained. An introduction to Big Data Science.

Cloud Scale Distributed Data Storage. Jürmo Mehine

Databases 2 (VU) ( )

SQL VS. NO-SQL. Adapted Slides from Dr. Jennifer Widom from Stanford

So What s the Big Deal?

BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &

Big Data and Data Science: Behind the Buzz Words

MongoDB in the NoSQL and SQL world. Horst Rechner Berlin,

Lecture Data Warehouse Systems

Big Data Technologies. Prof. Dr. Uta Störl Hochschule Darmstadt Fachbereich Informatik Sommersemester 2015

Open source large scale distributed data management with Google s MapReduce and Bigtable

Exploiting Data at Rest and Data in Motion with a Big Data Platform

Peninsula Strategy. Creating Strategy and Implementing Change

Using Big Data to Explore New Opportunities. Fandhy Haristha Siregar, M.Kom, CIA, CRMA, CISA, CISM, CISSP, CEH, CEP-PM, QIA, COBIT5

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Integrating Big Data into the Computing Curricula

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, Viswa Sharma Solutions Architect Tata Consultancy Services

Вовченко Алексей, к.т.н., с.н.с. ВМК МГУ ИПИ РАН

NoSQL Data Base Basics

Open Source Technologies on Microsoft Azure

How To Scale Out Of A Nosql Database

The evolution of database technology (II) Huibert Aalbers Senior Certified Executive IT Architect

Big Data on AWS. Services Overview. Bernie Nallamotu Principle Solutions Architect

How To Handle Big Data With A Data Scientist

You should have a working knowledge of the Microsoft Windows platform. A basic knowledge of programming is helpful but not required.

Big Data. Introducción. Santiago González

Hurtownie Danych i Business Intelligence: Big Data

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

WA2192 Introduction to Big Data and NoSQL EVALUATION ONLY

Real Time Big Data Processing

Big Data: Opportunities & Challenges, Myths & Truths 資 料 來 源 : 台 大 廖 世 偉 教 授 課 程 資 料

Big Data Use Cases Update

How To Understand The Business Case For Big Data

Introduction to NoSQL Databases. Tore Risch Information Technology Uppsala University

Preparing Your Data For Cloud

Crazy NoSQL Data Integration with Pentaho

The NoSQL Ecosystem, Relaxed Consistency, and Snoop Dogg. Adam Marcus MIT CSAIL

Introduction to NOSQL

Evaluating NoSQL for Enterprise Applications. Dirk Bartels VP Strategy & Marketing

TABLE OF CONTENTS 1 Chapter 1: Introduction 2 Chapter 2: Big Data Technology & Business Case 3 Chapter 3: Key Investment Sectors for Big Data

Composite Data Virtualization Composite Data Virtualization And NOSQL Data Stores

Infrastructures for big data

TRAINING PROGRAM ON BIGDATA/HADOOP

Structured Data Storage

Are You Ready for Big Data?

NOSQL INTRODUCTION WITH MONGODB AND RUBY GEOFF

Industry Impact of Big Data in the Cloud: An IBM Perspective

Large Scale/Big Data Federation & Virtualization: A Case Study

Are You Ready for Big Data?

Enterprise Operational SQL on Hadoop Trafodion Overview

extensible record stores document stores key-value stores Rick Cattel s clustering from Scalable SQL and NoSQL Data Stores SIGMOD Record, 2010

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

Sunnie Chung. Cleveland State University

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

How To Use Big Data For Telco (For A Telco)

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

REAL-TIME BIG DATA ANALYTICS

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Slave. Master. Research Scholar, Bharathiar University

Big Data: Tools and Technologies in Big Data

Comparing SQL and NOSQL databases

Cloud Big Data Architectures

Getting Started Practical Input For Your Roadmap

Cloud & Big Data a perfect marriage? Patrick Valduriez

Big Data Solutions. Portal Development with MongoDB and Liferay. Solutions

3 Case Studies of NoSQL and Java Apps in the Real World

NoSQL Systems for Big Data Management

BIG DATA TOOLS. Top 10 open source technologies for Big Data

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

Big Data and Analytics (Fall 2015)

Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru

Tap into Hadoop and Other No SQL Sources

Document Oriented Database

Big Data Technologies Compared June 2014

A COMPARATIVE STUDY OF NOSQL DATA STORAGE MODELS FOR BIG DATA

Tutorial: Big Data Algorithms and Applications Under Hadoop KUNPENG ZHANG SIDDHARTHA BHATTACHARYYA

How To Store Data In Nosql

Development of Real-time Big Data Analysis System and a Case Study on the Application of Information in a Medical Institution

Sentimental Analysis using Hadoop Phase 2: Week 2

Microsoft Azure: Opção de Nuvem para Todo o Desenvolvedor. Danilo Bordini & Osvaldo Daibert

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

Defense Industry & Open Source & BigData

The emergence of big data technology and analytics

Can the Elephants Handle the NoSQL Onslaught?

Understanding NoSQL Technologies on Windows Azure

Introduction to Apache Cassandra

Search and Real-Time Analytics on Big Data

Why NoSQL? Your database options in the new non- relational world IBM Cloudant 1

Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

Comparison of the Frontier Distributed Database Caching System with NoSQL Databases

Software Engineering for Big Data. CS846 Paulo Alencar David R. Cheriton School of Computer Science University of Waterloo

Transcription:

Smarter Healthcare Applications for Big Data Analytics Multi-channel sales Finance Log Analysis Homeland Security Traffic Control Telecom Search Quality Manufacturing Trading Analytics Fraud and Risk Retail: Churn, NBO

Big Data Definition No single standard definition Big Data is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it

Big Data: 3V s

Volume (Scale) Data Volume 44x increase from 2009 2020 From 0.8 zettabytes to 35zb Data volume is increasing exponentially Exponential increase in collected/generated data

12+ TBs of tweet data every day data every day 30 billion RFID tags today (1.3B in 2005) 4.6 billion camera phones world wide? TBs of 100s of millions of GPS enabled devices sold annually 25+ TBs of log data every day 76 million smart meters in 2009 200M by 2014 2+ billion people on the Web by end 2011

Variety (Complexity) Relational Data (Tables/Transaction/Legacy Data) Text Data (Web) Semi-structured Data (XML) Graph Data Social Network, Semantic Web (RDF), Streaming Data You can only scan the data once A single application can be generating/collecting many types of data Big Public Data (online, weather, finance, etc) To extract knowledge all these types of data need to linked together

A Single View to the Customer Social Media Banking Finance Gaming Customer Our Known History Entertain Purchas e

Velocity (Speed) Data is begin generated fast and need to be processed fast Online Data Analytics Late decisions missing opportunities Examples E-Promotions: Based on your current location, your purchase history, what you like send promotions right now for store next to you Healthcare monitoring: sensors monitoring your activities and body any abnormal measurements require immediate reaction

Real-time/Fast Data Mobile devices (tracking all objects all the time) Social media and networks (all of us are generating data) Scientific instruments (collecting all sorts of data) Sensor technology and networks (measuring all kinds of data) The progress and innovation is no longer hindered by the ability to collect data But, by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion 11

Some Make it 4V s

The Model Has Changed The Model of Generating/Consuming Data has Changed Old Model: Few companies are generating data, all others are consuming data New Model: all of us are generating data, and all of us are consuming data

Big Data Solutions Map/Reduce Model Apache Hadoop HDFS MapReduce Hbase PIG Hive Nosql Databases Key/Value Column-oriented Document-oriented Graph Database XML Data

So many NoSQL options More than just the Elephant in the room Over 120+ types of NoSQL databases

CAP CA CP AP Highly-available consistency Enforced consistency Eventual consistency

Flavors of NoSQL

Graph Database Use for data with a lot of many-to-many relationships recursive self-joins when your primary objective is quickly finding connections, patterns and relationships between the objects within lots of data Examples: Neo4J, FreeBase (Google)

Column Database Wide, sparse column sets Schema-light Examples: Cassandra HBase BigTable GAE HR DS

Document Database (Mongo DB) Use for data that is document-oriented (collection of JSON documents) w/semi structured data Encodings include XML, YAML, JSON & BSON binary forms PDF, Microsoft Office documents -- Word, Excel ) Examples: MongoDB, CouchDB

Key / Value Database Schema-less State (Persistent or Volatile) Examples AWS Dynamo DB Project Voldemort

So which type of NoSQL? Back to CAP CA = SQL/RDBMS SQL Sever / SQL Azure Oracle MySQL CP = nosql/column Hadoop Big Table HBase MemCacheDB AP = nosql/document or key/value DynamoDB CouchDB Cassandra Voldemort

THINK