Intro to Big Data and Business Intelligence

Save this PDF as:
 WORD  PNG  TXT  JPG

Size: px
Start display at page:

Download "Intro to Big Data and Business Intelligence"

Transcription

1 Intro to Big Data and Business Intelligence Anjana Susarla Eli Broad College of Business What is Business Intelligence A Simple Definition: The applications and technologies transforming Business Data into Action Business intelligence (BI) is a business management term refers to applications and technologies which are used to gather, provide access to, and analyze data and information about their company operations. Business intelligence systems can help companies gain more comprehensive knowledge of the factors affecting their business, and help companies to make better business decisions. YouTube: What is BI? Guest lecture AESC310 1

2 Data, information, and knowledge Data a collection of raw value elements or facts used for calculating, reasoning, or measuring. Information the result of collecting and organizing data in a way that establishes relationship between data items, which thereby provides context and meaning Knowledge the concept of understanding information based on recognized patterns in a way that provides insight to information. 3 Driving force - Big Data A collection of data sets so large and complex that it becomes awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analysis, and visualization. The trend to larger data sets is due to the additional information derivable from analysis of a single large set of related data, as compared to separate smaller sets with the same total amount of data. 4 Guest lecture AESC310

3 5 ISQS7339, Fall 01 Zettabyte (ZB) A quantity of information or information storage capacity equal to 10 1 bytes or 1,000 exabytes. As of April 01, no storage system has achieved one zettabyte of information. The combined space of all computer hard drives in the world was estimated at approximately 160 exabytes in 006. Seagate reported selling 330 exabytes worth of hard drives during the 011 Fiscal Year. As of 009, the entireworld Wide Web was estimated to contain close to 500 exabytes.this is a half zettabyte. 1,000,000,000,000,000,000,000 bytes = bytes = 10 1 bytes 6 Guest lecture AESC310 3

4 Data Scale 7 Market "Big data" has increased the demand of information management specialists - major companies have spent more than $15 billion for this. This industry is worth more than $100 billion and growing at almost 10% a year. 4.6 billion mobile-phone subscriptions worldwide and between 1 billion and billion people accessing the internet. The world's effective capacity to exchange information through telecommunication networks was 81 petabytes in 1986, 471 petabytes in 1993,. exabytes in 000, 65 exabytes in 007 It is predicted that the amount of traffic flowing over the internet will reach 667 exabytes annually by Guest lecture AESC310 4

5 Approach - Cloud Computing Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). The name comes from the use of a cloud-shaped symbol as an abstraction for the complex infrastructure it contains in system diagrams. Cloud computing entrusts remote services with a user's data, software and computation. Buzzword: SaaS/IaaS/PaaS 9 Distributed business intelligence Deal with big data the open & distributed approach LAMP Hadoop MapReduce HDFS NOSQL Zookeeper Storm Guest lecture AESC310 5

6 Apache Hadoop An open-source software framework for storage and large scale processing of data-sets on clusters of commodity hardware. The Apache Hadoop framework is composed of the following modules : Hadoop Common - contains libraries and utilities needed by other Hadoop modules Hadoop Distributed File System (HDFS). HadoopYARN - a resource-management platform responsible for managing compute resources in clusters and using them for scheduling of users' applications. Hadoop MapReduce - a programming model for large scale data processing. Apache Hadoop's MapReduce and HDFS components originally derived respectively from Google's MapReduce and Google File System (GFS) papers. 1 1 A Multi-node Hadoop Cluster 1 Guest lecture AESC310 6

7 1 3 Guest lecture AESC310 7

8 1 6 ISQS 6339, Data Mgmt & BI Guest lecture AESC310 8

9 Guest lecture AESC310 9

10 1 9 ISQS 6339, Data Mgmt & BI Hadoop : Big data's big leap forward The new Hadoop is the Apache Foundation's attempt to create a whole new general framework for the way big data can be stored, mined, and processed. The biggest constraint on scale has been Hadoop s job handling. All jobs in Hadoop are run as batch processes through a single daemon called JobTracker, which creates a scalability and processing-speed bottleneck. Hadoop uses an entirely new job-processing framework built using two daemons: ResourceManager, which governs all jobs in the system, and NodeManager, which runs on each Hadoop node and keeps the ResourceManager informed about what's happening on that node. 0 Guest lecture AESC310 10

11 MapReduce.0 YARN (Yet Another Resource Negotiator) 1 The process of BI Data -> information -> knowledge -> actionable plans Data -> information: the process of determining what data is to be collected and managed and in what context Information -> knowledge: The process involving the analytical components, such as data warehousing, online analytical processing, data quality, data profiling, business rule analysis, and data mining Knowledge -> actionable plans: The most important aspect in a BI process Guest lecture AESC310 11

12 Actionable Knowledge An information asset retains its value on if the converted knowledge is actionable. Need some methods for extracting value from knowledge This is not a technical issue but an organizational one need empowered individuals in the organization to take the action There is an issue of Return on Investment (ROI) 3 BI Problems Structured Detecting Credit card fraud Setting Loan parameters Market segmentation/mass customization Deciding Marketing mix Customer Churn Reducing employee turnover Improving Quality/Efficiency Unstructured Data exploration Utilization of resources (stored knowledge) to maximum effectiveness 4 Guest lecture AESC310 1

13 5 Customer Analytics Customer profiling Targeted marketing Personalization Collaborative filtering Customer satisfaction Customer lifetime value Customer loyalty Sales Channel Analytics Marketing Sales performance and pipeline BI Applications 6 BI Applications () Supply Chain Analytics Supplier and vendor management Shipping Inventory control Distribution analysis Behavior Analysis Purchasing trends Web activity Fraud and abuse detection Customer attrition Social network analysis Guest lecture AESC310 13

14 The Evolution of Business Intelligence 1 st Generation Traditional analytics (query and reporting) nd Generation Traditional generation (OLAP, data warehousing).5 nd Generation New traditional generation 3 rd Generation - Advanced analytics Rules, predictive analytics and realtime data mining Stream analytics 7 Business Intelligence Classifications Stream Analytics* Real-time, continuous, sequential analysis (ranging from basic to advanced analytics) 3 rd -Generation BI * In lieu of stream analytics, embedded analytics, although architecturally different, could potentially play the same role Advanced Analytics/Optimization Rules Predictive Analytics Real-time and traditional Data Mining New Traditional Analytics.5-Gen Analytics (In-Memory OLAP, Search-Based) Source: Bill O Connell IBM, Aug 007 Traditional Analytics 1 st Generation Analytics (Query & Reporting) nd Generation Analytics (OLAP, Data Warehousing) Legacy BI Guest lecture AESC310 14

15 Big data is at the foundation of all the megatrends that are happening today, from social to mobile to cloud to gaming. Chris Lynch, Vertica Systems Big data is not about the data Gary King, Harvard University, making the point that while data is plentiful and easy to collect, the real value is in the analytics Guest lecture AESC310 15

16 There were 5 exabytes of information created between the dawn of civilization through 003, but that much information is now created every days. Eric Schmidt, of Google, said in 010 Information is the oil of the 1st century, and analytics is the combustion engine. Peter Sondergaard, Gartner Research Guest lecture AESC310 16

17 I keep saying that the sexy job in the next 10 years will be statisticians, and I m not kidding Hal Varian, Google You can have data without information, but you cannot have information without data. Daniel Keys Moran, computer programmer and science fiction author Guest lecture AESC310 17

18 Hiding within those mounds of data is knowledge that could change the life of a patient, or change the world. Atul Butte, Stanford School of Medicine Errors using inadequate data are much less than those using no data at all. Charles Babbage, inventor and mathematician Guest lecture AESC310 18

19 To call in the statistician after the experiment is done may be no more than asking him to perform a post mortem he may be able to say what the experiment died of. Ronald Fisher, biologist, geneticist and statistician Without big data, you are blind and deaf in the middle of a freeway Geoffrey Moore, management consultant and theorist Guest lecture AESC310 19

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014

Big Data. White Paper. Big Data Executive Overview WP-BD-10312014-01. Jafar Shunnar & Dan Raver. Page 1 Last Updated 11-10-2014 White Paper Big Data Executive Overview WP-BD-10312014-01 By Jafar Shunnar & Dan Raver Page 1 Last Updated 11-10-2014 Table of Contents Section 01 Big Data Facts Page 3-4 Section 02 What is Big Data? Page

More information

Transforming the Telecoms Business using Big Data and Analytics

Transforming the Telecoms Business using Big Data and Analytics Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

Hadoop implementation of MapReduce computational model. Ján Vaňo

Hadoop implementation of MapReduce computational model. Ján Vaňo Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed

More information

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN

Hadoop. MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Hadoop MPDL-Frühstück 9. Dezember 2013 MPDL INTERN Understanding Hadoop Understanding Hadoop What's Hadoop about? Apache Hadoop project (started 2008) downloadable open-source software library (current

More information

Business Intelligence

Business Intelligence Business Intelligence An Introduction to Business Intelligence http://www1.chihlee.edu.tw/teachers/chienhua/ Central Themes of the Lecture What is Business Intelligence? A Brief History of Business Intelligence

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Big Data for Investment Research Management

Big Data for Investment Research Management IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment Management firms turn big data into actionable

More information

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica

HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica HP Vertica at MIT Sloan Sports Analytics Conference March 1, 2013 Will Cairns, Senior Data Scientist, HP Vertica So What s the market s definition of Big Data? Datasets whose volume, velocity, variety

More information

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics Overview Big Data in Apache Hadoop - HDFS - MapReduce in Hadoop - YARN https://hadoop.apache.org 138 Apache Hadoop - Historical Background - 2003: Google publishes its cluster architecture & DFS (GFS)

More information

Using Data Mining and Machine Learning in Retail

Using Data Mining and Machine Learning in Retail Using Data Mining and Machine Learning in Retail Omeid Seide Senior Manager, Big Data Solutions Sears Holdings Bharat Prasad Big Data Solution Architect Sears Holdings Over a Century of Innovation A Fortune

More information

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012

Big Data Buzzwords From A to Z. By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords From A to Z By Rick Whiting, CRN 4:00 PM ET Wed. Nov. 28, 2012 Big Data Buzzwords Big data is one of the, well, biggest trends in IT today, and it has spawned a whole new generation

More information

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics

Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,

More information

Large scale processing using Hadoop. Ján Vaňo

Large scale processing using Hadoop. Ján Vaňo Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine

More information

Ramesh Bhashyam Teradata Fellow Teradata Corporation bhashyam.ramesh@teradata.com

Ramesh Bhashyam Teradata Fellow Teradata Corporation bhashyam.ramesh@teradata.com Challenges of Handling Big Data Ramesh Bhashyam Teradata Fellow Teradata Corporation bhashyam.ramesh@teradata.com Trend Too much information is a storage issue, certainly, but too much information is also

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management. Is Big Data The New Normal? The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

Customized Report- Big Data

Customized Report- Big Data GINeVRA Digital Research Hub Customized Report- Big Data 1 2014. All Rights Reserved. Agenda Context Challenges and opportunities Solutions Market Case studies Recommendations 2 2014. All Rights Reserved.

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

Big Data Explained. An introduction to Big Data Science.

Big Data Explained. An introduction to Big Data Science. Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of

More information

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage

www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage www.pwc.com/oracle Next presentation starting soon Business Analytics using Big Data to gain competitive advantage If every image made and every word written from the earliest stirring of civilization

More information

BIG DATA IN BUSINESS ENVIRONMENT

BIG DATA IN BUSINESS ENVIRONMENT Scientific Bulletin Economic Sciences, Volume 14/ Issue 1 BIG DATA IN BUSINESS ENVIRONMENT Logica BANICA 1, Alina HAGIU 2 1 Faculty of Economics, University of Pitesti, Romania olga.banica@upit.ro 2 Faculty

More information

Big Data Analytics(Hadoop) Prepared By : Manoj Kumar Joshi & Vikas Sawhney

Big Data Analytics(Hadoop) Prepared By : Manoj Kumar Joshi & Vikas Sawhney Big Data Analytics(Hadoop) Prepared By : Manoj Kumar Joshi & Vikas Sawhney General Agenda Understanding Big Data and Big Data Analytics Getting familiar with Hadoop Technology Hadoop release and upgrades

More information

Big data and its transformational effects

Big data and its transformational effects Big data and its transformational effects Professor Fai Cheng Head of Research & Technology September 2015 Working together for a safer world Topics Lloyd s Register Big Data Data driven world Data driven

More information

Modernizing Your Data Warehouse for Hadoop

Modernizing Your Data Warehouse for Hadoop Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking

More information

Big Data Zurich, November 23. September 2011

Big Data Zurich, November 23. September 2011 Institute of Technology Management Big Data Projektskizze «Competence Center Automotive Intelligence» Zurich, November 11th 23. September 2011 Felix Wortmann Assistant Professor Technology Management,

More information

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15

Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases. Lecture 15 Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 15 Big Data Management V (Big-data Analytics / Map-Reduce) Chapter 16 and 19: Abideboul et. Al. Demetris

More information

Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com

Introduction to Predictive Analytics. Dr. Ronen Meiri ronen@dmway.com Introduction to Predictive Analytics Dr. Ronen Meiri Outline From big data to predictive analytics Predictive Analytics vs. BI Intelligent platforms What can we do with it. The modeling process. Example

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

The 4 Pillars of Technosoft s Big Data Practice

The 4 Pillars of Technosoft s Big Data Practice beyond possible Big Use End-user applications Big Analytics Visualisation tools Big Analytical tools Big management systems The 4 Pillars of Technosoft s Big Practice Overview Businesses have long managed

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Extend your analytic capabilities with SAP Predictive Analysis

Extend your analytic capabilities with SAP Predictive Analysis September 9 11, 2013 Anaheim, California Extend your analytic capabilities with SAP Predictive Analysis Charles Gadalla Learning Points Advanced analytics strategy at SAP Simplifying predictive analytics

More information

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84 Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21 22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104 105 Application architecture, big data analytics

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

Copyright 2014, Neudesic. All rights reserved.

Copyright 2014, Neudesic. All rights reserved. 2 Accelerating Modernization Across the Enterprise Cloud Computing User Experience Enterprise Mobility Customer Relationship Management Business Analysis Managed Services Custom Application Development

More information

INTELLIGENT BUSINESS STRATEGIES WHITE PAPER

INTELLIGENT BUSINESS STRATEGIES WHITE PAPER INTELLIGENT BUSINESS STRATEGIES WHITE PAPER Improving Access to Data for Successful Business Intelligence Part 1: Meeting Today s Business Requirements in an Increasingly Complex Environment By Mike Ferguson

More information

Application Development. A Paradigm Shift

Application Development. A Paradigm Shift Application Development for the Cloud: A Paradigm Shift Ramesh Rangachar Intelsat t 2012 by Intelsat. t Published by The Aerospace Corporation with permission. New 2007 Template - 1 Motivation for the

More information

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Addressing Open Source Big Data, Hadoop, and MapReduce limitations Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?

More information

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap

Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

III JORNADAS DE DATA MINING

III JORNADAS DE DATA MINING III JORNADAS DE DATA MINING EN EL MARCO DE LA MAESTRÍA EN DATA MINING DE LA UNIVERSIDAD AUSTRAL PRESENTACIÓN TECNOLÓGICA IBM Alan Schcolnik, Cognos Technical Sales Team Leader, IBM Software Group. IAE

More information

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware

Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware Created by Doug Cutting and Mike Carafella in 2005. Cutting named the program after

More information

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services

Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012. Viswa Sharma Solutions Architect Tata Consultancy Services Hadoop Beyond Hype: Complex Adaptive Systems Conference Nov 16, 2012 Viswa Sharma Solutions Architect Tata Consultancy Services 1 Agenda What is Hadoop Why Hadoop? The Net Generation is here Sizing the

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A REVIEW ON BIG DATA MANAGEMENT AND ITS SECURITY PRUTHVIKA S. KADU 1, DR. H. R.

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop Lecture 32 Big Data 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop 1 2 Big Data Problems Data explosion Data from users on social

More information

BIG DATA: ARE YOU READY? Andy Kyiet Demand Flow Intelligence May, 2013

BIG DATA: ARE YOU READY? Andy Kyiet Demand Flow Intelligence May, 2013 BIG DATA: ARE YOU READY? Andy Kyiet Demand Flow Intelligence May, 2013 PERSONAL BACKGROUND Founder of the first specialist Service Management & Helpdesk System provider in Europe Past President of AFSMI

More information

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect

Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate

More information

BIG DATA CHALLENGES AND PERSPECTIVES

BIG DATA CHALLENGES AND PERSPECTIVES BIG DATA CHALLENGES AND PERSPECTIVES Meenakshi Sharma 1, Keshav Kishore 2 1 Student of Master of Technology, 2 Head of Department, Department of Computer Science and Engineering, A P Goyal Shimla University,

More information

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani

A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to

More information

BIG DATA What it is and how to use?

BIG DATA What it is and how to use? BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14

More information

Hadoop Big Data for Processing Data and Performing Workload

Hadoop Big Data for Processing Data and Performing Workload Hadoop Big Data for Processing Data and Performing Workload Girish T B 1, Shadik Mohammed Ghouse 2, Dr. B. R. Prasad Babu 3 1 M Tech Student, 2 Assosiate professor, 3 Professor & Head (PG), of Computer

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

Manifest for Big Data Pig, Hive & Jaql

Manifest for Big Data Pig, Hive & Jaql Manifest for Big Data Pig, Hive & Jaql Ajay Chotrani, Priyanka Punjabi, Prachi Ratnani, Rupali Hande Final Year Student, Dept. of Computer Engineering, V.E.S.I.T, Mumbai, India Faculty, Computer Engineering,

More information

The Big Data Deluge: Creating Serious Business Problems. Analytics: Harnessing Big Data Deluge to Acquire Business Power

The Big Data Deluge: Creating Serious Business Problems. Analytics: Harnessing Big Data Deluge to Acquire Business Power The Big Data Deluge: Creating Serious Business Problems Analytics: Harnessing Big Data Deluge to Acquire Business Power Predictive Analytics: The Holy Grail of Big Data Analytics The Predictive Analytics

More information

Big Data in Enterprise challenges & opportunities. Yuanhao Sun 孙 元 浩 yuanhao.sun@intel.com Software and Service Group

Big Data in Enterprise challenges & opportunities. Yuanhao Sun 孙 元 浩 yuanhao.sun@intel.com Software and Service Group Big Data in Enterprise challenges & opportunities Yuanhao Sun 孙 元 浩 yuanhao.sun@intel.com Software and Service Group Big Data Phenomenon 1.8ZB in 2011 2 Days > the dawn of civilization to 2003 750M Photos

More information

Big Data for Investment Research Management

Big Data for Investment Research Management IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment firms turn big data into actionable research

More information

BIG DATA TRENDS AND TECHNOLOGIES

BIG DATA TRENDS AND TECHNOLOGIES BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.

More information

Virtualizing Apache Hadoop. June, 2012

Virtualizing Apache Hadoop. June, 2012 June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING

More information

Data Warehouse design

Data Warehouse design Data Warehouse design Design of Enterprise Systems University of Pavia 10/12/2013 2h for the first; 2h for hadoop - 1- Table of Contents Big Data Overview Big Data DW & BI Big Data Market Hadoop & Mahout

More information

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014

Big Data Analytics. An Introduction. Oliver Fuchsberger University of Paderborn 2014 Big Data Analytics An Introduction Oliver Fuchsberger University of Paderborn 2014 Table of Contents I. Introduction & Motivation What is Big Data Analytics? Why is it so important? II. Techniques & Solutions

More information

Databases & Business Intelligence Part 1

Databases & Business Intelligence Part 1 Welcome back! We will have more fun. Databases & Business Intelligence Part 1 BUSA345 Lecture #8-1 Claire Hitosugi, PhD, MBA In the previous lecture We learned Define Open Source Software (OSS) and provide

More information

ANALYTICS BUILT FOR INTERNET OF THINGS

ANALYTICS BUILT FOR INTERNET OF THINGS ANALYTICS BUILT FOR INTERNET OF THINGS Big Data Reporting is Out, Actionable Insights are In In recent years, it has become clear that data in itself has little relevance, it is the analysis of it that

More information

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem:

Chapter 6 8/12/2015. Foundations of Business Intelligence: Databases and Information Management. Problem: Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Chapter 6 Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

Embedded inside the database. No need for Hadoop or customcode. True real-time analytics done per transaction and in aggregate. On-the-fly linking IP

Embedded inside the database. No need for Hadoop or customcode. True real-time analytics done per transaction and in aggregate. On-the-fly linking IP Operates more like a search engine than a database Scoring and ranking IP allows for fuzzy searching Best-result candidate sets returned Contextual analytics to correctly disambiguate entities Embedded

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Navigating Big Data business analytics

Navigating Big Data business analytics mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Wienand Omta Fabiano Dalpiaz 1 drs. ing. Wienand Omta Learning Objectives Describe how the problems of managing data resources

More information

Blueprints for Big Data Success

Blueprints for Big Data Success Blueprints for Big Data Success Succeeding with Four Common Scenarios Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest

More information

The Potential of Big Data in the Cloud. Juan Madera Technology Consultant juan.madera.jimenez@accenture.com

The Potential of Big Data in the Cloud. Juan Madera Technology Consultant juan.madera.jimenez@accenture.com The Potential of Big Data in the Cloud Juan Madera Technology Consultant juan.madera.jimenez@accenture.com Agenda How to apply Big Data & Analytics What is it? Definitions, Technology and Data Science

More information

Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices

Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices September 10-13, 2012 Orlando, Florida Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices Vishwanath Belur, Product Manager, SAP Predictive Analysis Learning

More information

Doing Multidisciplinary Research in Data Science

Doing Multidisciplinary Research in Data Science Doing Multidisciplinary Research in Data Science Assoc.Prof. Abzetdin ADAMOV CeDAWI - Center for Data Analytics and Web Insights Qafqaz University aadamov@qu.edu.az http://ce.qu.edu.az/~aadamov 16 May

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January 2015. Email: bdg@qburst.com Website: www.qburst.com Lambda Architecture Near Real-Time Big Data Analytics Using Hadoop January 2015 Contents Overview... 3 Lambda Architecture: A Quick Introduction... 4 Batch Layer... 4 Serving Layer... 4 Speed Layer...

More information

Talend Big Data. Delivering instant value from all your data. Talend 2014 1

Talend Big Data. Delivering instant value from all your data. Talend 2014 1 Talend Big Data Delivering instant value from all your data Talend 2014 1 I may say that this is the greatest factor: the way in which the expedition is equipped. Roald Amundsen race to the south pole,

More information

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and

More information

SECURITY MEETS BIG DATA. Achieve Effectiveness And Efficiency. Copyright 2012 EMC Corporation. All rights reserved.

SECURITY MEETS BIG DATA. Achieve Effectiveness And Efficiency. Copyright 2012 EMC Corporation. All rights reserved. SECURITY MEETS BIG DATA Achieve Effectiveness And Efficiency 1 IN 2010 THE DIGITAL UNIVERSE WAS 1.2 ZETTABYTES 1,000,000,000,000,000,000,000 Zetta Exa Peta Tera Giga Mega Kilo Byte Source: 2010 IDC Digital

More information

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE

INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE INTRODUCTION TO APACHE HADOOP MATTHIAS BRÄGER CERN GS-ASE AGENDA Introduction to Big Data Introduction to Hadoop HDFS file system Map/Reduce framework Hadoop utilities Summary BIG DATA FACTS In what timeframe

More information

Big Data Success Step 1: Get the Technology Right

Big Data Success Step 1: Get the Technology Right Big Data Success Step 1: Get the Technology Right TOM MATIJEVIC Director, Business Development ANDY MCNALIS Director, Data Management & Integration MetaScale is a subsidiary of Sears Holdings Corporation

More information

NextGen Infrastructure for Big DATA Analytics.

NextGen Infrastructure for Big DATA Analytics. NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conven4onal database systems. The data is too big, moves too fast, or doesn t fit the structures

More information

Changing the face of Business Intelligence & Information Management

Changing the face of Business Intelligence & Information Management 1300 530 335 info@c3businessolutions.com www.c3businesssolutions.com GPO Box 589 Melbourne VIC 3001 Australia ABN 35 122 885 465 White Paper Big Data Changing the face of Business Intelligence & Information

More information

Workshop on Hadoop with Big Data

Workshop on Hadoop with Big Data Workshop on Hadoop with Big Data Hadoop? Apache Hadoop is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly

More information

A Survey on Big Data Concepts and Tools

A Survey on Big Data Concepts and Tools A Survey on Big Data Concepts and Tools D. Rajasekar 1, C. Dhanamani 2, S. K. Sandhya 3 1,3 PG Scholar, 2 Assistant Professor, Department of Computer Science and Engineering, Sri Krishna College of Engineering

More information

WHITE PAPER. 2012 Big Data Survey Results

WHITE PAPER. 2012 Big Data Survey Results 2012 Big Data Survey Results Overview Meeting the challenges of big data has out-paced traditional warehousing technology and practices. Frustrated by the cost and pace of innovation associated with proprietary

More information

China Bank BigData Usecase Huawei FusionInsight Solution

China Bank BigData Usecase Huawei FusionInsight Solution China Bank BigData Usecase Huawei FusionInsight Solution Steven Yuan, Director of Bigdata Research and Development Agenda 1. BigData Trend 2. China Bank Business Challenges & Requirements 3. Huawei s BigData

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics

Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

The 3 questions to ask yourself about BIG DATA

The 3 questions to ask yourself about BIG DATA The 3 questions to ask yourself about BIG DATA Do you have a big data problem? Companies looking to tackle big data problems are embarking on a journey that is full of hype, buzz, confusion, and misinformation.

More information

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.

Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D. Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology

More information

Top 5 Future Travel Trends in BIG Data

Top 5 Future Travel Trends in BIG Data Top 5 Future Travel Trends in BIG Data Presented by Dave Lukas Grasp Technologies, VP/CSO There were 5 exabites of information created between the dawn of civilization through 2003, but that much information

More information

How Companies are! Using Spark

How Companies are! Using Spark How Companies are! Using Spark And where the Edge in Big Data will be Matei Zaharia History Decreasing storage costs have led to an explosion of big data Commodity cluster software, like Hadoop, has made

More information

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg

The Big Picture on Big Data. Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg The Big Picture on Big Data Princeton Section 307 Dinner Meeting December 11, 2013 Richard Herczeg Objective of Talk 1. Deliver a Primer on Big Data. 2. How does this emerging topic apply to Quality? 3.

More information

Data-Intensive Computing with Map-Reduce and Hadoop

Data-Intensive Computing with Map-Reduce and Hadoop Data-Intensive Computing with Map-Reduce and Hadoop Shamil Humbetov Department of Computer Engineering Qafqaz University Baku, Azerbaijan humbetov@gmail.com Abstract Every day, we create 2.5 quintillion

More information