Task Force Members: Lilli Japec Frauke Kreuter Marcus Berg Paul Biemer Paul Decker Cliff Lampe
Transcription
1
2 Task Force Members: Lilli Japec, Co-Chair, Statistics Sweden; Frauke Kreuter, Co-Chair, JPSM at the U. of Maryland, U. of Mannheim & IAB; Marcus Berg, Stockholm University; Paul Biemer, RTI International; Paul Decker, Mathematica Policy Research; Cliff Lampe, School of Information at the University of Michigan; Julia Lane, American Institutes for Research; Cathy O'Neil, Johnson Research Labs; Abe Usher, HumanGeo Group. Acknowledgement: We are grateful for comments, feedback and editorial help from Eran Ben-Porath, Jason McMillan, and the AAPOR council members.
3 The report has four objectives: 1. to educate the AAPOR membership about Big Data (Section 3); 2. to describe the Big Data potential (Sections 4 and 7); 3. to describe the Big Data challenges (Sections 5 and 6); 4. to discuss possible solutions and research needs (Section 8).
4 Big Data AAPOR Task Force Source: Frauke Kreuter
5 until recently three main data sources
6 Survey Data Administrative Data Experiments Source: Frauke Kreuter
7 now
8 US Aggregated Inflation Series, Monthly Rate, PriceStats Index vs. Official CPI. Accessed January 18, 2015 from the PriceStats website.
9 Number of vehicles detected in the Netherlands on December 1, 2011 created by Statistics Netherlands (Daas et al. 2013). The vehicle size is shown in different colors; black is small size, red is medium size and green is large size.
10 Social media sentiment (daily, weekly and monthly) in the Netherlands, June November. The development of consumer confidence for the same period is shown in the inset (Daas and Puts 2014).
11 Big Data
12 Hope that found/organic data: can replace or augment expensive data collections; more (= better) data for decision making; information available in (nearly) real time. Source: Frauke Kreuter
13 But (at least) one more V
14 Thank You!
15 CHANGE IN PARADIGM AND RISKS INVOLVED Julia Lane New York University American Institutes for Research University of Strasbourg
16 Big Data definition: "Big Data is an imprecise description of a rich and complicated set of characteristics, practices, techniques, ethics, and outcomes all associated with data." (AAPOR) No canonical definition. By characteristics: Volume, Velocity, Variety (and Variability and Veracity). By source: found vs. made. By use: professionals vs. citizen science. By reach: datafication. By paradigm: Fourth paradigm. Source: Julia Lane
17 Motivation. New business model: federal agencies no longer major players. New analytical model: outliers; fine-grained analysis; new units of analysis. New sets of skills: computer scientists; citizen scientists. Different cost structure. Source: Julia Lane
18 New Frameworks Source: Ian Foster, University of Chicago
19 New kinds of analysis Source: Jason Owen Smith and UMETRICS data
20 Access for Research Source: Julia Lane
21 Value in other fields Source: Julia Lane
22 Source: Julia Lane
23 Core Questions What is the legal framework? What is the practical framework? What is the statistical framework? Source: Julia Lane
24 Legal Framework. Current legal structure inadequate: "The recording, aggregation, and organization of information into a form that can be used for data mining, here dubbed 'datafication', has distinct privacy implications that often go unrecognized by current law" (Strandburg). Assessment of harm from privacy inadequate. Privacy and big data are incompatible. Anonymity not possible. Informed consent not possible. Source: Julia Lane
25 Informed consent (Nissenbaum) Source: Julia Lane
26 Statistical Framework. Importance of valid inference. The role of statisticians/access. Inadequate current statistical disclosure limitation. Diminished role of federal statistical agencies. Limitations of survey. New analytical framework: mathematically rigorous theory of privacy; measurement of privacy loss; differential privacy. Source: Julia Lane
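The slide names differential privacy without showing what it looks like in practice. The following is a minimal sketch, not taken from the talk, of one standard construction (the Laplace mechanism) applied to a counting query. The function names, the toy age data, and the choice of epsilon are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a noisy count using the Laplace mechanism.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon. Smaller epsilon
    means more noise and a smaller measured privacy loss.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
ages = [23, 35, 41, 52, 67, 29, 38, 71, 45, 33]  # made-up data, for illustration only
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40+: {noisy:.2f}")
```

The released value is the true count (5 in this toy data) plus random noise, so any single release is inexact, while the average over many releases stays close to the truth.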
27 Some suggestions Source: Julia Lane
28 And a reminder of why Source: Julia Lane
29 Comments and questions
30 Skills Required to Integrate Big Data into Public Opinion Research Abe Usher Chief Technology Officer, HumanGeo
31 Outline Big data demystified Four layers of big data Skills required Easter eggs Source: Abe Usher
32 Courtesy of Google Trends: Big Data Today
33 Courtesy of Google Trends: Big Data & Public Opinion vs Pop Culture
34 Big data: De-mystified. What is big data? What is the Hadoop Distributed File System (HDFS)? What is Hadoop MapReduce (MR)? Source: Abe Usher
35 20th-century model of analysis. How can you identify a legitimate hip-hop artist (versus someone who just gets up and rhymes)? Tracy Marrow (aka Ice-T). Source: Abe Usher
36 20th-century model of analysis. How can you identify a legitimate hip-hop artist (versus someone who just gets up and rhymes)? "Game knows game, baby." Tracy Marrow (aka Ice-T). Source: Abe Usher
37 20th-century model of analysis. How can you identify a legitimate hip-hop artist (versus someone who just gets up and rhymes)? Tracy Marrow (aka Ice-T): If you have expert knowledge, then you are capable of answering complex questions by interpreting domain-specific information. [paraphrased] Source: Abe Usher
38 New model of analysis. Peter Gibbons hatches a plot to write a computer virus that grabs fractions of a penny from a corporate retirement account (Office Space). Takeaway point: Little bits of value (information) provide deep insights in the aggregate. Source: Abe Usher
39 New model of analysis. Takeaway point: Hadoop simplifies the creation of massive counting machines. Source: Abe Usher
40 Big Data: Layers. Source: Abe Usher
41 Big Data: Layers. Data Output (example: map visualization). Data Analysis (example: Hadoop MapReduce). Data Storage (example: Hadoop Distributed File System). Data Source(s) (examples: geolocated social media; proxy variable for behavior of interest). Source: Abe Usher
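The "massive counting machine" takeaway and the Data Analysis layer can be illustrated with a toy, single-process imitation of the MapReduce pattern. This is only a sketch of the map, shuffle, and reduce steps in plain Python. It does not use Hadoop or any Hadoop API, and the function names and sample posts are made up for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, 1) pair for every token in one input record."""
    for token in record.lower().split():
        yield token, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def map_reduce_count(records):
    """Run map, shuffle, and reduce in sequence on an in-memory list."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


posts = ["traffic on I-95", "heavy traffic downtown", "downtown parade"]
counts = map_reduce_count(posts)
print(counts)
```

In a real cluster the map and reduce calls run in parallel on many machines over data held in distributed storage. The logic per record and per key stays this simple, which is the point of the slide.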
42 Four roles related to big data: each provides different skills. Source: Abe Usher
43 Skills by role. Computer scientist: data preparation; MapReduce algorithms; Python/R programming; Hadoop ecosystem. System administrator: storage systems (MySQL, HBase, Spark); cloud computing (Amazon Web Services (AWS), Google Compute Engine); Hadoop ecosystem. Source: Abe Usher
44 Big data enables new insights into human behavior. Geolocated social media activity in Washington DC during a 15-minute time period generated by MR. TweetMap Source: Abe Usher
45 Contact information: Abe Usher
46 Easter Eggs: Google do a barrel roll; Google gravity; Google search in Klingon. Source: Abe Usher
47 Big Data Veracity: Error Sources and Inferential Risks Paul Biemer RTI International and University of North Carolina Source: Paul Biemer
48 Errors in Big Data: An Illustration. Suppose 1 in 1,000,000 people are terrorists. The Big Data Terrorist Detector is 99.9% accurate. The detector says your friend, Jack, is a terrorist. What are the odds that Jack is really a terrorist? Source: Paul Biemer
49 Errors in Big Data: An Illustration. Suppose 1 in 1,000,000 people are terrorists. The Big Data Terrorist Detector is 99.9% accurate. The detector says your friend, Jack, is a terrorist. What are the odds that Jack is really a terrorist? Answer: about 1 in 1000, i.e., 99.9% of the terrorist detections will be false! Source: Paul Biemer
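The answer follows from Bayes' rule. The sketch below reproduces the arithmetic under one assumption that the slide does not spell out: "99.9% accurate" is read as both a 99.9% true-positive rate and a 99.9% true-negative rate. The function name and parameter names are illustrative.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(condition | detector says positive), by Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Assumed reading of the slide: base rate 1 in 1,000,000,
# detector right 99.9% of the time for both groups.
p = posterior_given_positive(prevalence=1e-6, sensitivity=0.999, specificity=0.999)
print(f"P(terrorist | flagged) = {p:.6f}")
print(f"share of detections that are false = {1 - p:.4%}")
```

Under that reading the probability that a flagged person is really a terrorist is just under 0.001, so roughly 999 of every 1000 detections are false alarms. The rare base rate, not the detector's accuracy, drives the result.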
50 Some questions regarding Big Data veracity. What constitutes a Big Data error? What are the sources and causes of the errors? Do the error distributions vary by source? Are the errors systematic or variable or both? Example: systematic error in Google Flu Trends data. Source: Paul Biemer
51 Some questions regarding Big Data veracity. What constitutes a Big Data error? What are the sources and causes of the errors? Do the error distributions vary by source? Are the errors systematic or variable or both? How do the errors affect data analysis such as classifications, correlations, and regressions? How can analysts mitigate these effects? Source: Paul Biemer
52 Total Error Framework for Traditional Data Sets. Typical file structure: rows are records (Record #) for population units; columns are variables or features V1, V2, ..., VK. Source: Paul Biemer
53 Total Error Framework for Traditional Data Sets. Typical file structure: rows are records (Record #) for population units; columns are variables or features V1, V2, ..., VK. total error = row error + column error + cell error. Source: Paul Biemer
54 Possible Column and Cell Errors. Typical file structure: rows are records (Record #) for population units; columns are variables or features V1, V2, ..., VK. Misspecified variables = specification error. Variable values in error = content error. Variable values missing = missing data? Source: Paul Biemer
55 Possible Row Errors. Typical file structure: rows are records (Record #) for population units; columns are variables or features V1, V2, ..., VK. Missing records = undercoverage error. Non-population records = overcoverage. Duplicated records = duplication error. Source: Paul Biemer
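The three row errors can be checked mechanically when a list of target population identifiers is available. The sketch below is an illustration, not part of the presentation. It assumes each record carries a unique unit identifier, and the function name and example identifiers are hypothetical.

```python
from collections import Counter


def row_errors(population_ids, record_ids):
    """Classify row errors in a data file against a target population.

    undercoverage: population units with no record in the file
    overcoverage:  records for units outside the population
    duplication:   units that appear in the file more than once
    """
    population = set(population_ids)
    counts = Counter(record_ids)
    return {
        "undercoverage": sorted(population - set(counts)),
        "overcoverage": sorted(set(counts) - population),
        "duplication": sorted(i for i, n in counts.items() if n > 1),
    }


frame = [1, 2, 3, 4, 5]     # hypothetical target population
file_ids = [1, 2, 2, 4, 9]  # hypothetical record identifiers in the file
errors = row_errors(frame, file_ids)
print(errors)
```

Here units 3 and 5 are missing, unit 9 does not belong to the population, and unit 2 is duplicated. Column and cell errors need variable-level checks and are not covered by this sketch.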
56 Shortcomings of the Traditional Framework for Big Data. Big Data files are often not rectangular: hierarchically structured or unstructured. Data may be distributed across many databases: sometimes federated, but often not. Data sources may be quite heterogeneous: includes texts, sensors, transactions, and images. Errors generated by the Map/Reduce process may not lend themselves to column-row representations. Source: Paul Biemer
57 Big Data Process Map. Generate: Source 1, Source 2, ..., Source K. ETL: Extract; Transform (Cleanse); Load (Store). Analyze: Filter/Reduction (Sampling); Computation/Analysis (Visualization). Source: Paul Biemer
58 Big Data Process Map. Generation (Source 1, Source 2, ..., Source K). Errors include: low signal/noise ratio; lost signals; failure to capture; non-random (or nonrepresentative) sources; metadata that are lacking, absent, or erroneous. Source: Paul Biemer
59 Big Data Process Map. ETL: Extract; Transform (Cleanse); Load (Store). Errors include: specification error (including errors in metadata), matching error, coding error, editing error, data munging errors, and data integration errors. Source: Paul Biemer
60 Big Data Process Map. Analyze: Filter/Reduction (Sampling). Data are filtered, sampled or otherwise reduced. This may involve further transformations of the data. Errors include: sampling errors, selectivity errors (or lack of representativity), modeling errors. Source: Paul Biemer
61 Big Data Process Map. Analyze: Computation/Analysis (Visualization). Errors include: modeling errors, inadequate or erroneous adjustments for representativity, computation and algorithmic errors. Source: Paul Biemer
62 Implications for Data Analysis. Study of rare groups is problematic. Biased correlational analysis. Biased regression analysis. Coincidental correlations (illustration: "Stork Die-off Linked to Human Birth Decline"). Source: Paul Biemer
63 Implications for Data Analysis. Study of rare groups is problematic. Biased correlational analysis. Biased regression analysis. Coincidental correlations. Noise accumulation: inability to identify correlates. Incidental endogeneity: Cov(error, covariates). Source: Paul Biemer
64 Implications for Data Analysis. Study of rare groups is problematic. Biased correlational analysis. Biased regression analysis. Coincidental correlations. Noise accumulation: inability to identify correlates. Incidental endogeneity: Cov(error, covariates). These latter three issues are a concern even if the data could be regarded as error-free. Data errors can considerably exacerbate these problems. Current research is aimed at investigating these errors. Source: Paul Biemer
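Coincidental correlations arise even in error-free data once enough variables are screened. The simulation below is a small sketch of that point and is not from the presentation: every variable is independent random noise, yet the strongest correlation with the target grows as more variables are examined. The sample size, number of variables, and seed are arbitrary choices.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n_obs, n_vars, rng):
    """Largest |correlation| between a noise target and n_vars unrelated noise variables."""
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_vars):
        noise = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(target, noise)))
    return best


rng = random.Random(42)
few = max_spurious_correlation(n_obs=30, n_vars=5, rng=rng)
many = max_spurious_correlation(n_obs=30, n_vars=2000, rng=rng)
print(f"max |r| over 5 noise variables:    {few:.2f}")
print(f"max |r| over 2000 noise variables: {many:.2f}")
```

With thousands of candidate variables and only 30 observations, at least one pure-noise variable will typically look like a moderately strong correlate of the target, which is why wide data sets call for caution before reading meaning into the largest correlation found.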
65 Recommendations 1. Surveys and Big Data are complementary data sources, not competing data sources. There are differences between the approaches, but this should be seen as an advantage rather than a disadvantage. 2. AAPOR should develop standards for the use of Big Data in survey research when more knowledge has been accumulated.
66 3. AAPOR should start working with the private sector and other professional organizations to educate its members on Big Data. 4. AAPOR should inform the public of the risks and benefits of Big Data.
67 5. AAPOR should help remove the barrier associated with different uses of terminology. 6. AAPOR should take a leading role in working with federal agencies in developing a necessary infrastructure for the use of Big Data in survey research.
