ONS Big Data Project. Karen Gask Office for National Statistics
Transcription
1 ONS Big Data Project Karen Gask Office for National Statistics
2 Plan for today: introduce the ONS Big Data Project; provide a brief overview of our work to date; provide information about our future work plan.
3 What is Big Data? "Big data are high volume, high velocity, and high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization" (Gartner 2012). Volume: exceeds the limits of traditional column and row databases and is constantly growing. Velocity: arrives rapidly, often in real time. Variety: does not have a standard structure, e.g. text, images.
4 How is big data generated? Sensors gathering information, e.g. climate, traffic. Social media: posts, pictures and videos. Digital satellite images. Purchase transaction records. High volume administrative and transactional records. Mobile phone GPS signals.
5 What is the ONS Big Data Project? A project which aims to: investigate the potential for big data in official statistics while understanding the challenges; establish an ONS policy and longer term strategy which incorporates ONS's position within Government and internationally in this field; recommend next steps to support the strategy going forward. First phase from January 2014 to March 2015. Now extended for another year.
6 Big Data Project work packages: Management and Strategy; Stakeholder Engagement; Communication; Analysis and infrastructure. Pilots: Smart meters, Twitter, Prices, Mobile phones.
7 Pilot 1: Smart-type meters. Research question: investigate the potential of smart-type meter electricity data (high frequency, every 30 minutes) to model the likelihood of household occupancy patterns. More efficient response chasing. Data from smart-type meter trials in Great Britain and the Republic of Ireland. A range of potential methods identified. Need to be careful of privacy and ethics.
8 Smart-type Meter Energy Use Profiles [Figure: weekly energy use profiles in three panels labelled "Active week", "Inactive week" and "Inactive week?"]
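The distinction between an active and an inactive profile can be illustrated with a very simple heuristic. The slides only say that "a range of potential methods" was identified, so the sketch below is not the project's method. It assumes that an unoccupied home shows only a near-constant baseline load, and flags a run of half-hourly readings as inactive when its coefficient of variation is small. The function name, the threshold and the sample readings are all hypothetical.

```python
from statistics import mean, pstdev


def is_inactive(readings_kwh, cv_threshold=0.25):
    """Flag a run of half-hourly readings as 'inactive' when usage is nearly flat.

    Assumption (not from the slides): an unoccupied home shows only baseline
    load, so the coefficient of variation (std / mean) stays below a threshold.
    """
    avg = mean(readings_kwh)
    if avg == 0:
        return True  # no consumption at all
    return pstdev(readings_kwh) / avg < cv_threshold


# Hypothetical data: one day is 48 half-hourly readings in kWh.
flat_day = [0.05] * 48
busy_day = [0.05] * 14 + [0.60] * 4 + [0.10] * 18 + [0.80] * 8 + [0.05] * 4

print(is_inactive(flat_day), is_inactive(busy_day))
```

A single threshold like this would misclassify borderline cases such as the "Inactive week?" panel, which is one reason a real model would need more than one feature.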
9 Pilot 2: Mobile Phones. Mobile phone data to model population flows, e.g. commuting statistics. Building relationships with mobile network operators and other parts of UK Government. No data yet. Seeking better coordinated data access for Government. Privacy and ethics (again).
10 Pilot 3: Prices Project. Research question: to investigate how we can scrape prices data from the internet and how this data could be used within price statistics. ONS prices collection is manual. Web scraping promises more detailed, more frequent and cheaper data. Prototype web scrapers: 35 CPI/RPI item categories; 3 supermarkets; daily collection since April (around 6,500 a day).
11 Pilot 3: Prices by webscraping Rendered webpage: HTML code:... </div><div class="productlists" id="endfacets-1"><ul class="cf products line"><li id="p " class=" first"><div class="desc"><h3 class="inbasketinfocontainer"><a id="h " href="/groceries/product/details/?id= " class="si_pl_ title"><span class="image"><img src=" alt="" /><!----></span>warburtons Toastie Sliced White Bread 800G</a></h3><p class="limitedlife"><a href=" the freshest food to your door- Find out more ></a></p><div class="desccontent"><!----><div class="promo"><a href="/groceries/specialoffers/specialofferdetail/default.aspx?promoid=a " title="all products available for this offer" id="flyout promo-a pos" class="promoflyout"><span class="promoimgbox"><img src="/groceries/uiassets/i/sites/retail/superstore/online/product/pos/2for.png" class="promoflyout promo" alt="special Offer" id="flyout promo-a posimg" /></span><em>any 2 for 2.00</em></a><span> valid from 21/1/2014 until 10/2/2014</span></div><div class="tools"><div class="moreinfo"><a href="/groceries/product/details/?id= " class="midiflyout" id="flyout midi-0-"><img class="midiflyout hd" src=" nfoblue.gif" alt="" title="view product information" id="flyout midi-1-" /></a></div><!----><div class="links"><ul><li><a href=" class="shelfflyout active plaintooltip" id="s-tt " title="premium White Bread"> Rest of <span class="hide">premium White Bread <!----></span>shelf </a></li></ul></div></div></div></div><div class="quantity"><div class="content addtobasket"><p class="price"><span class="lineprice"> 1.45<!----></span><span class="linepriceabbr"> ( 0.18/100g)</span></p><h4 class="hide">add to basket</h4><form method="post" id="fmultisearch "...
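As a minimal sketch of how a price might be pulled out of markup like the fragment above, the following uses Python's standard-library `html.parser`. It is not the ONS prototype scraper. It assumes the price sits in a `span` whose class is exactly `lineprice`, as in the fragment, and the names `LinePriceParser` and `extract_line_prices` are hypothetical.

```python
from html.parser import HTMLParser


class LinePriceParser(HTMLParser):
    """Collect the numeric text found inside <span class="lineprice"> elements."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag == "span" and dict(attrs).get("class") == "lineprice":
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            # Tolerate an optional leading currency symbol.
            self.prices.append(float(text.lstrip("£")))


def extract_line_prices(html):
    """Return every line price found in an HTML string."""
    parser = LinePriceParser()
    parser.feed(html)
    return parser.prices


# Fragment modelled on the price element shown on the slide.
snippet = (
    '<p class="price"><span class="lineprice"> 1.45<!----></span>'
    '<span class="linepriceabbr"> ( 0.18/100g)</span></p>'
)
print(extract_line_prices(snippet))
```

Matching on a single class name is fragile: a site redesign silently breaks the scraper, so a production collector would also need checks that flag days on which no prices are found.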
12 Daily Price Index (Whiskey)
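The slides do not say how the daily index was calculated. One common choice for an unweighted index over matched products is the Jevons index, the geometric mean of price relatives, and the sketch below uses it purely as an illustration. The function name, product names and prices are hypothetical.

```python
from math import exp, log


def jevons_index(base_prices, current_prices):
    """Unweighted geometric mean of price relatives, scaled so the base day is 100.

    Only products present on both days are compared (matched-model approach).
    """
    common = base_prices.keys() & current_prices.keys()
    if not common:
        raise ValueError("no products in common between the two days")
    mean_log = sum(log(current_prices[p] / base_prices[p]) for p in common) / len(common)
    return 100 * exp(mean_log)


# Hypothetical scraped prices for two days.
day_0 = {"brand_a_70cl": 20.00, "brand_b_70cl": 25.00, "brand_c_1l": 30.00}
day_1 = {"brand_a_70cl": 22.00, "brand_b_70cl": 25.00}  # brand_c not found today

print(round(jevons_index(day_0, day_1), 2))
```

Restricting the comparison to matched products avoids treating a missing item as a price change, at the cost of ignoring products that enter or leave the sample.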
13 Pilot 4: Twitter Project. Research question: to investigate how to capture geo-located tweets from Twitter and how this data might provide insights on internal migration. 7 months of geo-located tweets within Great Britain (about 100 million data points). Methodology to infer place of usual residence: - Identify user anchor points by clustering tweets - Identify residential anchor points using AddressBase and nearest neighbour analysis
14 Lots of activity in different places but where does this person live?
15 Most likely lives here [Figure: raw data, derived clusters, cluster centroids and noise points, with a table of Cluster_id and Count]
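The clustering step for one user's tweets can be sketched as follows. The slides do not name the algorithm. Because the figure distinguishes clusters, centroids and noise, this sketch assumes a density-based method and implements a small DBSCAN in pure Python. Coordinates are treated as planar (for example, projected metres), and `eps`, `min_pts` and the sample points are hypothetical. The AddressBase matching step is not shown.

```python
from collections import Counter
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force search; includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels


def centroid(points, labels, cluster_id):
    """Mean position of the points assigned to one cluster."""
    members = [p for p, lab in zip(points, labels) if lab == cluster_id]
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)


# Hypothetical tweet locations for a single user.
tweets = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100), (101, 100), (100, 101), (50, 50)]
labels = dbscan(tweets, eps=2, min_pts=3)
counts = Counter(lab for lab in labels if lab != NOISE)
busiest = counts.most_common(1)[0][0]
print(labels, centroid(tweets, labels, busiest))
```

Taking the cluster with the highest tweet count as the likely home is only a first guess. The slides describe a further step that uses AddressBase and nearest neighbour analysis to decide which anchor points are residential.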
16 Use case: Student mobility
17 Presentation and dissemination challenges. Volume of data presents a challenge to presentation. Used R software to automate visualisation for the smart-type meter pilot. Visualising summary statistics (as with "small data"). Could use interactive visualisations (e.g. by using the free software Tableau). Appropriate and ethical use of data needs to be clearly communicated.
18 Use of R to display half hourly electricity data for a year [Figure: panels labelled "Active" and "Inactive"]
19 Use of R animations
20 Where to from here? Funding for 1 more year. Prioritisation of pilots and other activities (Continue? New?). Improve our understanding of technology. Team expansion (TBA). Establishment of Government Data Science Partnership to coordinate delivery and further development of data science in government.
21 Questions?
