Age of Big Data. Presented by: Mohammad Iqbal, BCM-2014





Agenda
Big Data?
Evolution of Hadoop

Big Data?
Data so large that it becomes difficult to process using traditional systems.

Name       Symbol  Value (bytes)
Kilobyte   KB      10^3
Megabyte   MB      10^6
Gigabyte   GB      10^9
Terabyte   TB      10^12
Petabyte   PB      10^15
Exabyte    EB      10^18
Zettabyte  ZB      10^21
Yottabyte  YB      10^24
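The unit table above can be turned into a small helper. This is a minimal sketch, assuming the decimal (power-of-ten) values shown on the slide; the function name `humanize` is made up for illustration.

```python
# Decimal byte units from the slide's table, largest first.
UNITS = [
    ("YB", 10**24), ("ZB", 10**21), ("EB", 10**18), ("PB", 10**15),
    ("TB", 10**12), ("GB", 10**9), ("MB", 10**6), ("KB", 10**3),
]


def humanize(num_bytes: int) -> str:
    """Express a byte count in the largest unit whose value is at least 1."""
    for symbol, size in UNITS:
        if num_bytes >= size:
            return f"{num_bytes / size:.1f} {symbol}"
    return f"{num_bytes} B"


print(humanize(100 * 10**6))    # 100.0 MB
print(humanize(500 * 10**12))   # 500.0 TB
```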

Difficult to process with a traditional system (depends on the capability of the system):
Unable to send: 100 MB document
Unable to view: 100 GB document
Unable to edit: 100 TB document

Organization/context specific
The same 500 TB of text, audio, and video data per day can be Big Data for Company A and not Big Data for Company B. It depends on the capabilities of the organization.

Areas of challenges: capture, search, curation, sharing, storage, transfer, analysis, visualization.

Big Data: large and growing files, arriving at high speed, in various formats (V^3).
VELOCITY: data comes in at high speed.
VOLUME: this results in large files.
VARIETY: these files come in various formats.

Structured / unstructured data
Unstructured: 90%, mostly wasted.
Structured: 10%, used in decision making.
Challenge/opportunity: to analyze the data and extract meaningful information.

Users, applications, systems, and sensors generate large and growing files (Big Data files).

Generation points (examples): mobile devices, machines, sensors, microphones, cameras, readers/scanners, social media, science facilities, software/programs.

Sample events generating data
Every day, we create 2.5 exabytes of data, i.e., 2.5 billion GB; so much that 90% of the data in the world today has been created in the last few years alone.
CERN's atomic facility generates 40 TB of data per second.
Twitter generates 12 TB of data every day.
An Airbus A380 generates 10 TB every 30 minutes of flight; about 650 TB is generated in one flight.
In 2009 the total data in the world was estimated to be 1 ZB. By 2020 it is estimated to be 35 ZB. (Source: IBM.com)
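A quick arithmetic check of the quoted rates, as a minimal sketch. It takes the slide's figures at face value and assumes decimal units (1 GB = 10^9 bytes); the variable names are illustrative only.

```python
KB, MB, GB, TB, EB = 10**3, 10**6, 10**9, 10**12, 10**18
SECONDS_PER_DAY = 24 * 60 * 60

# 2.5 EB per day, as quoted on the slide (kept as an exact integer).
daily_bytes = 25 * EB // 10
daily_gb = daily_bytes // GB
print(f"{daily_gb:,} GB per day")   # 2,500,000,000 GB per day

# The same daily volume expressed as an average rate in TB per second.
per_second_tb = daily_bytes / SECONDS_PER_DAY / TB
print(f"{per_second_tb:.1f} TB per second on average")

# Twitter's quoted 12 TB per day as an average rate in MB per second.
twitter_mb_per_s = 12 * TB / SECONDS_PER_DAY / MB
print(f"{twitter_mb_per_s:.1f} MB per second on average")

# Growth implied by the 2009 (1 ZB) and 2020 (35 ZB) estimates.
growth_factor = 35 / 1
print(f"{growth_factor:.0f}x growth from 2009 to 2020")
```

The first line confirms that 2.5 exabytes is 2.5 billion GB, matching the slide.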

Collect, analyze, understand.

Applications
Companies gain an edge by collecting, analyzing, and understanding information.
Governments forecast events and take proactive actions.

Over time:
Traditional systems (e.g., RDBMS, SQL): not able to handle Big Data.
Newer tools (e.g., NoSQL): created to handle Big Data.

Traditional enterprise approach
A single powerful computer has a processing limit: only so much data can be processed.

Modern approach
The data is divided among several computations, and their outputs are merged into a combined result.
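The split-then-combine idea can be sketched with a word count. This is an illustration only, not Hadoop code: threads on one machine stand in for separate cluster nodes, and every function name here is made up for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(items, n_parts):
    """Split a list into at most n_parts roughly equal chunks."""
    if not items:
        return []
    size = -(-len(items) // n_parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    """One 'computation': count the words in its own chunk only."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def combine(partials):
    """Merge the partial counts into the combined result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(lines, workers=4):
    chunks = split(lines, workers)
    # Threads stand in for the separate machines of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return combine(partials)


sample = ["big data", "Big Data Hadoop", "hadoop"]
print(word_count(sample))
```

Each chunk is processed without knowledge of the others, so adding workers raises capacity instead of requiring one ever more powerful computer.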

Hadoop projects: Hive, MapReduce, HBase, Mahout, HDFS (file system), Pig, Oozie, Flume, Sqoop.
Source: hortonworks/hadoop/hdfs/.com/

MASTER: Job Tracker, Task Tracker, Name Node, Data Node
SLAVES: four machines, each with a Task Tracker and a Data Node
Inputs: application, data

MASTER: Job Tracker, Task Tracker, Name Node, Data Node
SLAVES: four machines, each with a Task Tracker and a Data Node
Inputs: application, data
The Name Node knows where the data resides.
Data can be taken directly.
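The role of the Name Node as a directory of data locations can be sketched with a toy model. This is not the Hadoop API: the classes `NameNode` and `DataNode`, their methods, and `read_file` are hypothetical. The sketch assumes that "taken directly" means the application reads blocks from the Data Nodes themselves once it knows where they are.

```python
class DataNode:
    """Toy slave node that holds blocks of data in memory."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def store(self, block_id, data):
        self.blocks[block_id] = data

    def read(self, block_id):
        return self.blocks[block_id]


class NameNode:
    """Toy master that stores only locations, never the data itself."""

    def __init__(self):
        self.locations = {}  # file name -> ordered list of (block_id, node)

    def register(self, filename, block_id, node):
        self.locations.setdefault(filename, []).append((block_id, node))

    def locate(self, filename):
        return list(self.locations[filename])


def read_file(name_node, filename):
    """Ask the Name Node where the blocks are, then read each one directly."""
    parts = []
    for block_id, node in name_node.locate(filename):
        parts.append(node.read(block_id))
    return b"".join(parts)


# Two slave nodes each hold one block of the same file.
node_a, node_b = DataNode("node-a"), DataNode("node-b")
master = NameNode()
node_a.store("blk-1", b"hello ")
node_b.store("blk-2", b"world")
master.register("greeting.txt", "blk-1", node_a)
master.register("greeting.txt", "blk-2", node_b)
print(read_file(master, "greeting.txt"))  # b'hello world'
```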

HDFS vs GFS
Similarity with the Google File System (GFS) and MapReduce.
Back in the 1990s, search was served by engines such as Excite, AltaVista, Lycos, and Infoseek.

Victory (timeline 1995 to 2000): Excite, AltaVista, Lycos

Evolution of Hadoop
2003: GFS paper released by Google.
2004: Google released its paper on MapReduce.
2005: Hadoop created by Doug Cutting & Cafarella at Yahoo! (Nutch search engine).
2006: Yahoo donated the project to Apache.
Source: Google & Nutch white papers

is here!!

Data scientists with just two years' experience can earn between $200,000 and $300,000 a year. (Wall Street Journal)
Anyone with "data science" in his or her job title on a LinkedIn page is going to get "100 recruiter emails a day." (Wall Street Journal)
Hadoop is a super hot up-and-coming "big data" technology. (Business Insider)
Many other data scientists, especially at data-driven companies such as Google, Amazon, Microsoft, Walmart, eBay, LinkedIn, and Twitter, have added to the tool kit and are looking to develop it further. (Harvard Business Review)
"People are slapping buzzwords on résumés and looking to get 50 or 100 percent more, and they're getting it," said Scott Gnau, president of Teradata Lab.

References
Dean, J. & Ghemawat, S. (2004). MapReduce: Simplified Data Processing on Large Clusters. google.com
Cutting, D. (2005). Nutch: A Flexible and Scalable Open-Source Web Search Engine. yahoo.com
Ghemawat, S. & Gobioff, H. (2003). The Google File System. google.com
https://www.ibm.com/developerworks/vn/library/contest/dwfreebooks/tim_hieu_big_/understanding_big.pdf [Accessed 27 Nov 2014]
http://www.businessinsider.com/10-tech-skills-that-will-instantly-net-you-100000-salary-2012-8?op=1 [Accessed 27 Nov 2014]
Big Data's High-Priests of Algorithms, http://online.wsj.com/articles/academicresearchers-find-lucrative-work-as-big-data-scientists-1407543088 [Accessed 27 Nov 2014]

Thank you for your attention.
Q/A