Big Data Analytics. Prof. Dr. Lars Schmidt-Thieme
1 Big Data Analytics
Prof. Dr. Lars Schmidt-Thieme
Information Systems and Machine Learning Lab (ISMLL), Institute of Computer Science, University of Hildesheim, Germany
33. Sitzung des Arbeitskreises Informationstechnologie, Hildesheim
[based on original slides by Lucas Rego Drumond, ISMLL 2014]
2 Outline
1. What is Big Data?
2. Where to Find Big Data?
3. Big Data Applications
4. How to analyze Big Data?
5. Big Data at ISMLL, University of Hildesheim
6. Conclusions
5 1. What is Big Data? Some definitions:
- A collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
- Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
6 1. What is Big Data? Big Data Dimensions (the 4 Vs)
7 1. What is Big Data? Big Data is about:
- Storing and accessing large amounts of (unstructured/complex) data
  - querying
- Processing high-volume data streams
- Decision support based on large data
  - visualization, reporting
  - navigation / query interfaces
  - contextualization / sense making
- Building predictive models trained on large data
9 2. Where to Find Big Data? Big Data in Physics (CERN)
- The Large Hadron Collider has collected data from over 300 trillion proton-proton collisions
- Approx. 25 Petabytes per year
10 2. Where to Find Big Data? Big Data in Genetics (Ensembl)
- The Ensembl database contains the genome of humans and 50 other species
- only 250 GB
11 2. Where to Find Big Data? Big Data in the Web (Google)
- 3.3 billion searches per day (on average)
- 30 trillion unique URLs identified on the Web
- 20 billion sites crawled a day
- In 2008 Google processed more than 20 Petabytes of data per day
Source: Jeffrey Dean and Sanjay Ghemawat. MapReduce: simplified data processing on large clusters. Commun. ACM 51, 1 (January 2008).
12 2. Where to Find Big Data? Big Data in Social Media (Facebook)
- 1.28 billion users (1.23 billion monthly active in January 2014)
- Size of user data stored by Facebook: 300 Petabytes
- Average amount of data that Facebook takes in daily: 600 Terabytes
- Size of Facebook's Graph Search database: 700 Terabytes
13 2. Where to Find Big Data? Big Data in Social Media (Twitter)
- Average number of tweets per day: 58 million
- Number of Twitter search engine queries every day: 2.1 billion
- Total number of active registered Twitter users: 645,750,000
14 2. Where to Find Big Data? Big Data Public Datasets
- 1000 Genomes Project: DNA of 1700 humans, 200 TB
- Common Crawl Corpus: 5G web pages, 81 TB
- Wikipedia / Freebase: 1.9G subject/predicate/object triples, 250 GB
- Million Song Dataset: audio features of 1M songs, 280 GB
- OpenStreetMap: a map of earth, 90 GB
- 2000 US Census: US census data, 200 GB
- PubChem library: biological activities of small molecules, 230 GB
- NCDC weather data: daily measurements from 9000 stations, 20 GB
- Open Library: metadata of 20M books, 7 GB
- Twitter: 1.6G tweets, 0.6 GB
For comparison: CD 700 MB, DVD 4.7 to 17 GB, Blu-ray: … GB, hard disc 3 to 4 TB.
15 2. Where to Find Big Data? How Large is 1 Petabyte (PB)?
- 35.7M years if counted at one byte per second
- 254 years listening to music stored in CD quality (500 MB/h)
- 25 years watching DVDs (à 4.7 GB), but there are only TV movies on IMDB! (1.8M including TV episodes)
- can be stored on 341 hard disks à 3 TB / 90 € (30,000 €)
- 96 days to read sequentially from standard hard disks (1030 MBits/s)
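Several of the figures on this slide can be reproduced with a short calculation. The sketch below is only an illustration: it assumes binary units (1 PB = 2^50 bytes, 1 TB = 2^40 bytes, 1 Mbit = 2^20 bits) and a 365.25-day year, assumptions chosen because they appear to match the slide's numbers. All function names are hypothetical.

```python
# Rough back-of-the-envelope checks for "How large is 1 Petabyte?".
# Assumption (not stated on the slide): binary units and a 365.25-day year.

PB = 2**50                              # bytes in one petabyte (binary)
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def years_counting_bytes(bytes_per_second: float = 1.0) -> float:
    """Years needed to count through 1 PB at the given rate."""
    return PB / bytes_per_second / SECONDS_PER_YEAR


def disks_needed(disk_tb: float = 3.0) -> float:
    """Number of hard disks of `disk_tb` terabytes that hold 1 PB."""
    return PB / (disk_tb * 2**40)


def dvds_needed(dvd_gb: float = 4.7) -> float:
    """Number of DVDs of `dvd_gb` gigabytes that hold 1 PB."""
    return PB / (dvd_gb * 2**30)


def days_to_read(mbits_per_second: float = 1030.0) -> float:
    """Days to read 1 PB sequentially at the given throughput."""
    bits = PB * 8
    return bits / (mbits_per_second * 2**20) / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"counting one byte per second: {years_counting_bytes() / 1e6:.1f}M years")
    print(f"3 TB hard disks needed:       {round(disks_needed())}")
    print(f"cost at 90 EUR per disk:      {round(disks_needed()) * 90} EUR")
    print(f"4.7 GB DVDs needed:           {round(dvds_needed())}")
    print(f"sequential read time:         {days_to_read():.1f} days")
```

With decimal units (1 PB = 10^15 bytes) the results come out roughly 11% smaller, so the choice of unit convention matters when comparing such figures.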
17 3. Big Data Applications: What to do with Big Data?
Application examples:
- Test complex scientific hypotheses (physics, genetics)
- Index the web, including relevance feedback by users (web)
- Online personalized advertising (social media, esp. Google)
- Recommender systems (e-commerce, esp. Amazon)
- Media analysis, sentiment analysis, market research (social media), e.g., Obama campaign
18 3. Big Data Applications: More Applications & Case Studies
- T-Mobile USA: integrated Big Data across multiple IT systems to combine customer transaction and interaction data in order to better predict customer defections
  - By leveraging social media data along with transaction data from CRM and billing systems, customer defections are said to have been cut in half in a single quarter.
- US Xpress: collects data elements ranging from fuel usage to tire condition to truck engine operations to GPS information
  - Optimal fleet management
- McLaren's Formula One racing team: real-time car sensor data during car races
  - Real-time identification of issues with its racing cars
20 4. How to analyze Big Data? How to handle Big Data? The BI Approach
- Data Warehouse
- Static databases (snapshots)
- Structured data
- Centralized approaches
21 4. How to analyze Big Data? How to handle Big Data? The Distributed Approach
- Massive Parallelism
- Heterogeneous data sources
- Unstructured data
- Data streams
22 4. How to analyze Big Data? Challenges
Challenges in dealing with large volumes of data:
- Store and query large amounts of data in a distributed environment efficiently
  - distributed file systems
  - NoSQL databases, distributed databases
- Process distributed large data efficiently
  - execution environments
- Scale / distribute machine learning techniques
  - distributed learning algorithms (message passing)
  - ML execution environments
23 4. How to analyze Big Data? Execution Environments: MapReduce
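The diagram on this slide did not survive transcription. As a rough illustration of the MapReduce programming model, the sketch below runs the classic word-count example in a single Python process: a map step emits (key, value) pairs, a shuffle step groups values by key, and a reduce step aggregates each group. It is not the slide's content and does not use Hadoop; the function names are hypothetical, and a real execution environment would distribute these steps across many nodes.

```python
# Single-process sketch of the MapReduce model (word count).
# Illustrative only: no distribution, fault tolerance, or Hadoop API involved.
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values of one key into a single count."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over all documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each map call only sees one document and each reduce call only sees one key, both steps can in principle run in parallel on different machines, which is the property that makes the model suitable for large clusters.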
24 4. How to analyze Big Data? Execution Environments: GraphLab
26 5. Big Data at ISMLL, University of Hildesheim: Usage of Big Data in Cooperative Research Projects
- transportation
  - REDUCTION: Danish Taxi fleet ( vehicles, 1.5 million trips, 2.2 billion GPS measurements)
- e-commerce / recommender systems
  - Netflix dataset: 100 million transactions
  - Rossmann Online / Compra
- technology enhanced learning
  - Whizz Education (1200 exercises, 250,000 students, 30 million interactions)
- engineering data mining
  - Rolls Royce: jet engine vibration
  - Detectino: ground penetrating radar
27 5. Big Data at ISMLL, University of Hildesheim: ISMLL Compute Cluster
- 61 compute nodes
- 840 cores, 1288 threads
- 2.2 TB RAM
- 183 TB hard disk capacity
- 10 TFlops
special nodes:
- database server
- coprocessor compute node (240 cores, 960 threads, 4 TFlops)
software:
- simple scheduler (Sun Grid Engine)
- MapReduce (Hadoop)
28 5. Big Data at ISMLL, University of Hildesheim: Data Analytics, a New Study Programme in 2015
Columns: module (type, ECTS)
Term 1:
- Advanced Machine Learning (lecture, 6)
- Modern Optimization Techniques (lecture, 6)
- Programming Machine Learning (lab, 6)
- Seminar Data Analytics I (seminar, 4)
- application module I (misc., 6)
Term 2:
- Advanced Database Technologies (lecture, 6)
- Data and Privacy Protection (lecture, 3)
- methodological specialization I (lecture, 6)
- Distributed Machine Learning (lab, 6)
- Seminar Data Analytics II (seminar, 4)
- Project (part I) (project, 6)
Term 3:
- Planning and Optimal Control (lecture, 6)
- methodological specialization II (lecture, 6)
- Project (part II) (project, 9)
- Seminar Data Analytics III (seminar, 4)
- application module II (misc., 6)
Term 4:
- Master thesis and colloquium (thesis, 30)
Total: 120 ECTS (sum of the modules listed)
29 5. Big Data at ISMLL, University of Hildesheim: Data Analytics, a New Study Programme in 2015
- solid education on all fundamental aspects of data analytics at the state of the art
  - machine learning
  - analytical database technology & execution models
  - planning and control
- several methodological specializations: Bayesian Networks, Computer Vision, etc.
- integrated application area
  - media systems, software engineering, environmental sciences, computer linguistics, information sciences, psychology (several still requested)
- hands-on lab courses
- a deep and fun, two-term integrated group project
- fully internationally targeted (completely in English)
31 6. Conclusions: Big Data Chances
32 6. Conclusions: Big Data... and Risks
33 6. Conclusions
- Big Data Analytics addresses the analysis of large and complex data (volume, variety, velocity, veracity) at Petabyte scale (1 Petabyte = 1024 Terabytes).
- Big Data emerges naturally in many different domains (science, web, e-commerce, social media, robotics).
- Big Data requires massively parallel infrastructure to be analyzed in a timely manner:
  - data centers with hundreds and thousands of compute nodes
  - distributed databases, NoSQL databases
  - execution environments (MapReduce)
- To exploit big data in a principled and optimal way, machine learning methods have to be scaled and distributed. This requires innovations at the level of the learning algorithms as well as special ML execution environments.
Department of Computer Science University of Cyprus EPL646 Advanced Topics in Databases Lecture 14 Big Data Management IV: Big-data Infrastructures (Background, IO, From NFS to HFDS) Chapter 14-15: Abideboul
More informationData-intensive HPC: opportunities and challenges. Patrick Valduriez
Data-intensive HPC: opportunities and challenges Patrick Valduriez Big Data Landscape Multi-$billion market! Big data = Hadoop = MapReduce? No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard,
More informationDelivering new insights and value to consumer products companies through big data
IBM Software White Paper Consumer Products Delivering new insights and value to consumer products companies through big data 2 Delivering new insights and value to consumer products companies through big
More informationGetting to Know Big Data
Getting to Know Big Data Dr. Putchong Uthayopas Department of Computer Engineering, Faculty of Engineering, Kasetsart University Email: putchong@ku.th Information Tsunami Rapid expansion of Smartphone
More informationTapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connects to Hadoop?
More informationBringing Big Data Modelling into the Hands of Domain Experts
Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the
More informationSurfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics
Surfing the Data Tsunami: A New Paradigm for Big Data Processing and Analytics Dr. Liangxiu Han Future Networks and Distributed Systems Group (FUNDS) School of Computing, Mathematics and Digital Technology,
More informationA Study of Data Management Technology for Handling Big Data
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 9, September 2014,
More informationBig Data Systems CS 5965/6965 FALL 2014
Big Data Systems CS 5965/6965 FALL 2014 Today General course overview Q&A Introduction to Big Data Data Collection Assignment #1 General Course Information Course Web Page http://www.cs.utah.edu/~hari/teaching/fall2014.html
More informationUnderstanding Your Customer Journey by Extending Adobe Analytics with Big Data
SOLUTION BRIEF Understanding Your Customer Journey by Extending Adobe Analytics with Big Data Business Challenge Today s digital marketing teams are overwhelmed by the volume and variety of customer interaction
More informationBig Data and Hadoop with components like Flume, Pig, Hive and Jaql
Abstract- Today data is increasing in volume, variety and velocity. To manage this data, we have to use databases with massively parallel software running on tens, hundreds, or more than thousands of servers.
More informationCSE-E5430 Scalable Cloud Computing Lecture 2
CSE-E5430 Scalable Cloud Computing Lecture 2 Keijo Heljanko Department of Computer Science School of Science Aalto University keijo.heljanko@aalto.fi 14.9-2015 1/36 Google MapReduce A scalable batch processing
More informationDAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY
Big Data Analytics DAMA NY DAMA Day October 17, 2013 IBM 590 Madison Avenue 12th floor New York, NY Tom Haughey InfoModel, LLC 868 Woodfield Road Franklin Lakes, NJ 07417 201 755 3350 tom.haughey@infomodelusa.com
More information5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014
5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for
More informationBig Data. What is Big Data? Over the past years. Big Data. Big Data: Introduction and Applications
Big Data Big Data: Introduction and Applications August 20, 2015 HKU-HKJC ExCEL3 Seminar Michael Chau, Associate Professor School of Business, The University of Hong Kong Ample opportunities for business
More informationA Survey on Big Data Concepts and Tools
A Survey on Big Data Concepts and Tools D. Rajasekar 1, C. Dhanamani 2, S. K. Sandhya 3 1,3 PG Scholar, 2 Assistant Professor, Department of Computer Science and Engineering, Sri Krishna College of Engineering
More informationHere comes the flood Tools for Big Data analytics. Guy Chesnot -June, 2012
Here comes the flood Tools for Big Data analytics Guy Chesnot -June, 2012 Agenda Data flood Implementations Hadoop Not Hadoop 2 Agenda Data flood Implementations Hadoop Not Hadoop 3 Forecast Data Growth
More informationQLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM
QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment
More informationBig Data in Telco & Banking Analytics. Benjamin Sznajder IBM Research Haifa
Big Data in Telco & Banking Analytics Benjamin Sznajder IBM Research Haifa Agenda What is Big Data, Why Now IBM s approach Big Data in Banking industry A Telco scenario Bytes and bytes Megabyte: 1 minute
More informationReal Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
More informationBIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research &
BIG DATA Alignment of Supply & Demand Nuria de Lama Representative of Atos Research & Innovation 04-08-2011 to the EC 8 th February, Luxembourg Your Atos business Research technologists. and Innovation
More informationUNDERSTANDING THE BIG DATA PROBLEMS AND THEIR SOLUTIONS USING HADOOP AND MAP-REDUCE
UNDERSTANDING THE BIG DATA PROBLEMS AND THEIR SOLUTIONS USING HADOOP AND MAP-REDUCE Mr. Swapnil A. Kale 1, Prof. Sangram S.Dandge 2 1 ME (CSE), First Year, Department of CSE, Prof. Ram Meghe Institute
More informationBig Data Challenges in Bioinformatics
Big Data Challenges in Bioinformatics BARCELONA SUPERCOMPUTING CENTER COMPUTER SCIENCE DEPARTMENT Autonomic Systems and ebusiness Pla?orms Jordi Torres Jordi.Torres@bsc.es Talk outline! We talk about Petabyte?
More informationAligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed
More information