Unlocking the Full Potential of Big Data
Transcription
1 Unlocking the Full Potential of Big Data Lilli Japec, Frauke Kreuter JOS anniversary June 2015
2 The report is available at
3 Task Force Members:
- Lilli Japec, Co-Chair, Statistics Sweden
- Frauke Kreuter, Co-Chair, JPSM at the U. of Maryland, U. of Mannheim & IAB
- Marcus Berg, Stockholm University
- Paul Biemer, RTI International
- Paul Decker, Mathematica Policy Research
- Cliff Lampe, School of Information at the University of Michigan
- Julia Lane, American Institutes for Research
- Cathy O'Neil, Johnson Research Labs
- Abe Usher, HumanGeo Group
4 AAPOR (American Association for Public Opinion Research)
- a professional organization dedicated to advancing the study of public opinion, broadly defined, to include attitudes, norms, values, and behaviors
- promotes best practices and transparency
- works to educate its members as well as policy makers, the media, and the public at large to help them make better use of surveys and survey findings, and to inform them about new developments in the field
- other task force reports available on
5 Outline of our presentations
- What is Big Data?
- Paradigm shift
- Big Data activities in different organizations
- Skills required
- Big Data process and data quality
6 three main data sources UNTIL RECENTLY
7 Survey Data; Administrative Data; Experiments
8 NOW
9 US Aggregated Inflation Series, Monthly Rate, PriceStats Index vs. Official CPI. Accessed January 18, 2015 from the PriceStats website.
10 Number of vehicles detected in the Netherlands on December 1, 2011 created by Statistics Netherlands (Daas et al. 2013). The vehicle size is shown in different colors; black is small size, red is medium size and green is large size.
11 Social media sentiment (daily, weekly and monthly) in the Netherlands, June November The development of consumer confidence for the same period is shown in the insert (Daas and Puts 2014).
12 Big Data
13 Hope that found/organic data
- can replace or augment expensive data collections
- provide more (= better) data for decision making
- make information available in (nearly) real time
14 New paradigm
- New business model: federal agencies no longer major players
- New analytical model: outliers, fine-grained analysis, new units of analysis
- New sets of skills: computer scientists, citizen scientists
- Different cost structure
Source: Julia Lane
15 Eurostat Big Data Action Plan and Roadmap
- Pilots exploring the potential of selected big data sources
- The project will also include activities on: methodological frameworks, quality frameworks, metadata frameworks, IT infrastructures, communication, legal frameworks, ethical frameworks, skills and training, and experience sharing.
16 UNECE and Big Data
The Sandbox provides a computing environment to load Big Data sets and tools.
- Consumer price indices: experimenting with the computation of price indexes
- Mobile telephone data: statistics on tourism and daily commuting
- Smart meters: statistics on power consumption using data collected from smart meter readings
- Traffic loops: traffic statistics using data from traffic loops
- Social media: using Twitter data to analyze sentiment and tourism flows
- Job portals: computing statistics on job vacancies
- Web scraping: tested methods for automatically collecting data from web sources
17 UNECE Big Data Inventory
18 Statistics Netherlands: Roadmap BIG DATA
Two focus projects:
- the use of traffic loop data for transportation statistics
- the use of mobile phone data for daytime population and tourism statistics
Six other projects:
- the use of internet data for price statistics
- investigating the use of bank and credit card transactions
- the use of social media data for detecting trends in social cohesion
- the use of internet data for encoding enterprise purchases and sales
- investigating the use of smartcards of public transport for statistics
- the use of internet data for statistics about job vacancies
Source: Pieter Vlag, Statistics Netherlands
19 Examples from Statistics Sweden
- Scanner data to improve the Household Budget Survey
- Job vacancy statistics by scraping the web
- Evaluating the use of AIS (Automatic Identification System) data: cooperation between Statistics Sweden and the agency for Transport Analysis (Trafa), with research funding from the Swedish Innovation Agency (Vinnova)
20 Source: Moström and Justesen, Statistics Sweden One day data
21 What tasks are required to get there? SKILLS
22 We have to do this jointly
- Data Output/Access. Example: map visualization / privacy
- Data Analysis. Example: Hadoop MapReduce; High Frequency Data
- Data Curation/Storage. Example: Hadoop Distributed File System
- Data Generating Process. Examples: geolocated social media + survey + administrative data
- Research Questions. Examples: behavior of interest (migration/political participation/job searches)
23 Source: Abe Usher
24 Big words
- What is big data?
- What is the Hadoop Distributed File System (HDFS)?
- What is Hadoop MapReduce (MR)?
- How do you link surveys with big data?
Source: Abe Usher
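As a rough illustration of the MapReduce question on this slide, the following is a minimal single-machine sketch of the map, shuffle, and reduce pattern in plain Python. It does not use Hadoop itself; the function names and the two sample records are made up for illustration, and a real Hadoop job would distribute each phase across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run the three steps in sequence on an iterable of text records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical input records (e.g., short social media posts).
posts = ["big data is big", "survey data and big data"]
counts = map_reduce(posts)
print(counts)
```

The point of the pattern is that `map_phase` and `reduce_phase` only ever see one record or one key at a time, which is what lets a framework run them in parallel on separate machines.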
25 Computer scientist
- Data preparation
- MapReduce algorithms
- Python/R programming
- Hadoop ecosystem
System Administrator
- Storage systems (MySQL, HBase, Spark)
- Cloud computing: Amazon Web Services (AWS), Google Compute Engine
- Hadoop ecosystem
Source: Abe Usher
26 What do we know about the data generating process? RESEARCH
27 Veracity: Who? What? Why? Who is missing? Who is counted repeatedly? What is not said / measured? ...and why?
28 But (at least) one more V
29 Errors in Big Data: An Illustration (Terrorist Detector)
- Suppose 1 in 1,000,000 people are terrorists.
- The Big Data Terrorist Detector is 99.9% accurate.
- The detector says your friend Jack is a terrorist.
- What are the odds that Jack is really a terrorist?
Source: Paul Biemer
30 Errors in Big Data: An Illustration (Terrorist Detector)
- Suppose 1 in 1,000,000 people are terrorists.
- The Big Data Terrorist Detector is 99.9% accurate.
- The detector says your friend Jack is a terrorist.
- What are the odds that Jack is really a terrorist?
- Answer: about 1 in 1,000, i.e., 99.9% of the terrorist detections will be false!
Source: Paul Biemer
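The answer on this slide follows from Bayes' rule. The sketch below reproduces the arithmetic under one stated assumption that the slide does not spell out: "99.9% accurate" is read as both a 99.9% detection rate for true terrorists and a 0.1% false-alarm rate for everyone else. The function and variable names are illustrative only.

```python
def posterior_probability(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive detection) using Bayes' rule."""
    true_positives = prior * sensitivity
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# From the slide: 1 in 1,000,000 people is a terrorist.
prior = 1 / 1_000_000

# Assumption: "99.9% accurate" applies to both kinds of error.
sensitivity = 0.999          # P(flagged | terrorist)
false_positive_rate = 0.001  # P(flagged | not a terrorist)

p = posterior_probability(prior, sensitivity, false_positive_rate)
print(f"P(terrorist | flagged) = {p:.6f}")
print(f"Share of detections that are false = {1 - p:.4%}")
```

Because true terrorists are so rare, the few correct detections are swamped by false alarms from the very large innocent population, which is why the posterior probability comes out near 1 in 1,000.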
31 Big Data Process Map
- Generate: Source 1, Source 2, Source K
- ETL: Extract; Transform (Cleanse); Load (Store)
- Analyze: Filter/Reduction (Sampling); Computation/Analysis (Visualization)
Source: Paul Biemer
32 Big Data Process Map
- Generation: Source 1, Source 2, Source K. Errors include: low signal/noise ratio; lost signals; failure to capture; non-random (or nonrepresentative) sources; metadata that are lacking, absent, or erroneous.
- ETL: Extract; Transform (Cleanse); Load (Store)
- Analyze: Filter/Reduction (Sampling); Computation/Analysis (Visualization)
Source: Paul Biemer
33 Big Data Process Map
- Generation: Source 1, Source 2, Source K
- ETL: Extract; Transform (Cleanse); Load (Store). Errors include: specification error (including errors in metadata), matching error, coding error, editing error, data munging errors, and data integration errors.
- Analyze: Filter/Reduction (Sampling); Computation/Analysis (Visualization)
Source: Paul Biemer
34 Big Data Process Map
- Generation: Source 1, Source 2, Source K
- ETL: Extract; Transform (Cleanse); Load (Store)
- Analyze: Filter/Reduction (Sampling). Data are filtered, sampled or otherwise reduced. This may involve further transformations of the data. Errors include: sampling errors, selectivity errors (or lack of representativity), modeling errors.
- Analyze: Computation/Analysis (Visualization)
Source: Paul Biemer
35 Big Data Process Map
- Generation: Source 1, Source 2, Source K
- ETL: Extract; Transform (Cleanse); Load (Store)
- Analyze: Filter/Reduction (Sampling)
- Analyze: Computation/Analysis (Visualization). Errors include: modeling errors, inadequate or erroneous adjustments for representativity, computation and algorithmic errors.
Source: Paul Biemer
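The stages of the process map can be sketched as a chain of small functions. This is a toy illustration only: the records, field names, and cleansing rules are hypothetical, and the comments mark where the error types listed on the slides could enter.

```python
import statistics

# Hypothetical raw records from three sources; None marks a failure to capture.
RAW = [
    {"source": 1, "value": "10.5"},
    {"source": 1, "value": None},
    {"source": 2, "value": "9.0"},
    {"source": 2, "value": "9.0"},   # possible duplicate, kept here on purpose
    {"source": 3, "value": "n/a"},   # erroneous entry
    {"source": 3, "value": "12.0"},
]


def extract(raw):
    """Extract: keep only records for which a signal was captured."""
    return [r for r in raw if r["value"] is not None]


def transform(records):
    """Transform (cleanse): parse values, dropping those that cannot be parsed."""
    cleaned = []
    for r in records:
        try:
            cleaned.append({"source": r["source"], "value": float(r["value"])})
        except ValueError:
            continue  # editing decisions like this one can introduce errors
    return cleaned


def load(records, store):
    """Load (store): append cleansed records to a storage list."""
    store.extend(records)
    return store


def filter_reduce(store, sources):
    """Filter/reduction: restrict to some sources (a selectivity risk)."""
    return [r for r in store if r["source"] in sources]


def compute(records):
    """Computation/analysis: a simple mean as the final estimate."""
    return statistics.mean(r["value"] for r in records)


store = load(transform(extract(RAW)), [])
estimate = compute(filter_reduce(store, sources={1, 2}))
print(len(store), estimate)
```

Each function silently changes what the final estimate is based on: two of six records are lost before storage, a likely duplicate is counted twice, and one source is excluded at the filter step, so the result depends on choices made at every stage.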
36 POTENTIAL
37 We have to do this jointly
- Data Output/Access. Example: map visualization / privacy (Psychology, Law, Math & Comp, Business)
- Data Analysis. Example: Hadoop MapReduce; High Frequency Data (Economics, Social Sciences, Business, Math & Comp)
- Data Curation/Storage. Example: Hadoop Distributed File System (Math & Computer Science, Applied Statistics)
- Data Generating Process. Examples: geolocated social media + survey + administrative data (Social Science & Psychology, Humanities, Econ, Business)
- Research Questions. Examples: behavior of interest (migration/political participation/job searches) (Any field)
38 ...and think about the legal framework
More informationReport of the 2015 Big Data Survey. Prepared by United Nations Statistics Division
Statistical Commission Forty-seventh session 8 11 March 2016 Item 3(c) of the provisional agenda Big Data for official statistics Background document Available in English only Report of the 2015 Big Data
More informationRoadmap Talend : découvrez les futures fonctionnalités de Talend
Roadmap Talend : découvrez les futures fonctionnalités de Talend Cédric Carbone Talend Connect 9 octobre 2014 Talend 2014 1 Connecting the Data-Driven Enterprise Talend 2014 2 Agenda Agenda Why a Unified
More informationStudent Project 2 - Apps Frequently Installed Together
Student Project 2 - Apps Frequently Installed Together 42matters is a rapidly growing start up, leading the development of next generation mobile user modeling technology. Our solutions are used by big
More information#TalendSandbox for Big Data
Evalua&on von Apache Hadoop mit der #TalendSandbox for Big Data Julien Clarysse @whatdoesdatado @talend 2015 Talend Inc. 1 Connecting the Data-Driven Enterprise 2 Talend Overview Founded in 2006 BRAND
More informationThe Future of Business Analytics is Now! 2013 IBM Corporation
The Future of Business Analytics is Now! 1 The pressures on organizations are at a point where analytics has evolved from a business initiative to a BUSINESS IMPERATIVE More organization are using analytics
More informationPlay with Big Data on the Shoulders of Open Source
OW2 Open Source Corporate Network Meeting Play with Big Data on the Shoulders of Open Source Liu Jie Technology Center of Software Engineering Institute of Software, Chinese Academy of Sciences 2012-10-19
More informationBig Data: Challenges. Institute = Computational Crossroads. Azer Bestavros Founding Director. Big Data Cloud Security = Big Joke 6/2/2014
The Rafik B. Hariri Institute for Computing at Boston University The Hariri Institute at BU A world class center for discovery and innovation in computing and computational science & engineering Azer Bestavros
More informationOpen source Google-style large scale data analysis with Hadoop
Open source Google-style large scale data analysis with Hadoop Ioannis Konstantinou Email: ikons@cslab.ece.ntua.gr Web: http://www.cslab.ntua.gr/~ikons Computing Systems Laboratory School of Electrical
More informationAdvanced Big Data Analytics with R and Hadoop
REVOLUTION ANALYTICS WHITE PAPER Advanced Big Data Analytics with R and Hadoop 'Big Data' Analytics as a Competitive Advantage Big Analytics delivers competitive advantage in two ways compared to the traditional
More informationNEWLY EMERGING BEST PRACTICES FOR BIG DATA
2000-2012 Kimball Group. All rights reserved. Page 1 NEWLY EMERGING BEST PRACTICES FOR BIG DATA Ralph Kimball Informatica October 2012 Ralph Kimball Big is Being Monetized Big data is the second era of
More information