Urban Analytics & Big Data Is Big Data Enough for Effective City Simulation?



Similar documents
Big Data, Little Data, Real-Time Streaming & Smart Cities

Representation & Visualization in GIS: Emerging Trends, New Methods. Michael Batty

Fares Policy In London: Impact on Bus Patronage

The SIMULACRA Modelling Framework: Applications to London and the South East

Innovations in London s transport: Big Data for a better customer experience

Visualising Space Time Dynamics: Plots and Clocks, Graphs and Maps

Travelling in UK Cities Today

3.2 Our customers and users tell us that they want four things:

What are the physical characteristics of York? Why problems do you think there could be in York use map evidence.

Spatio-Temporal Patterns of Passengers Interests at London Tube Stations

The New Mobility: Using Big Data to Get Around Simply and Sustainably

136 deaths in 2007 (Latest figures available) UK (129 in England) 2,458 serious injuries in 2007 in the UK source- National Office of Statistics

NFC in Transport for London

Monitoring and evaluation of walking and cycling (draft)

SKY BET CHAMPSIONSHIP PLAY-OFF SATURDAY 28 TH MAY 2016

Visiting our London office

The Mersey Gateway Project

Bartlett Summer Show 2015: ScanLAB Projects to document UK s leading architecture show with state of the art scanning technology

Big Data Collection and Utilization for Operational Support of Smarter Social Infrastructure

London 2012 Travel Demand Management Chris HANLEY

Transport for London. Travel in London, Supplementary Report: London Travel Demand Survey (LTDS)

TIBCO Spotfire Metrics Modeler User s Guide. Software Release 6.0 November 2013

Getting around London

Cycle Strategy

Around 33 million journeys are currently made on Metrolink every year. This is forecast to increase to 44 million journeys by 2019.

Think London - A Review

We ve enjoyed being with you through the days of the Plain Old Telephone System.

USE OF STATE FLEET VEHICLE GPS DATA FOR TRAVEL TIME ANALYSIS

Seamless journeys from door to door.

Learning Management System (LMS) Guide for Administrators

Cycle Network Modelling A new evidence-based approach to the creation of cycling strategy

Visit your Google Play or App Store and search for Right Move Perth. The App is FREE to download.

Traffic Management App. User Guide

Why build the Silvertown Tunnel?

NETGEAR genie Apps. User Manual. 350 East Plumeria Drive San Jose, CA USA. August v1.0

KANSAS TRUCK ROUTING INTELLIGENT PERMITTING SYSTEM

Journey to Work Patterns in the Auckland Region

Real Time Bus Monitoring System by Sharing the Location Using Google Cloud Server Messaging

Smartphone Applications for ITS

Measuring the Street:

TfL s Contactless Ticketing: Oyster and Beyond Transport for London. Lauren Sager Weinstein Head of Oyster Development

TravelOAC: development of travel geodemographic classifications for England and Wales based on open data

How to use your go card on the TransLink network. TransLink go card user guide

A new Garden Neighbourhood for West Guildford An opportunity for Smart Growth. university of surrey November 2013

Guardian Tracking Systems

Surveys If you want to find out what a large group of people think, the easiest thing to do is carry out a survey.

Buses in Essex County

Spatial Analysis of Accessibility to Health Services in Greater London

Bike sharing schemes (BSS)

The Road to Growth Is transport the driver of local economic development?

Data About Cities: Redefining Big, Recasting Small

Deployments not Standards to make IoT a success

NHSP:Online. Uploading Agency Rates. NHSP:Online. Agency Rates Admin V2.2 Issued December 2012 Page 1 of 31

Big Data for a better Customer Experience. Lauren Sager Weinstein Head of Analytics Transport for London

Rural Road Safety Policy in Korea: Lesson Learned

Predictive Analytics: Extracts from Red Olive foundational course

Rigorous Performance Testing on the Web. Grant Ellis Senior Performance Architect, Instart Logic

The role of stakeholders in the evolution of cycling infrastructure in London. Brian Deegan Design Engineer

Google Analytics workbook

siemens.com/mobility Travel smarter with electronic ticketing

SUSTAINABLE MOBILITY EVALUATION IN URBAN AREAS

AMAZON S3: ARCHITECTING FOR RESILIENCY IN THE FACE OF FAILURES Jason McHugh

Workplace travel surveys

Briefing document: How to create a Gantt chart using a spreadsheet

Managing Extreme Weather at Transport for London. ARCC Assembly - 12 June 2014 Helen Woolston, Transport for London Sustainability Coordinator

Demand for Long Distance Travel

Vehicle Tracking System Roadstar

Overview of the Travel Demand Forecasting Methodology

Analytics case study

Interoperability, Resilience & Availability

Human Enhancement. Human Enhancement Democs game (ver 2.0) Game instructions. a Democs game

Reference Manual. IQ System Database Manager Software

What lessons for future urban regeneration in East London can be learnt from projects in the past? Year 11 Geography fieldtrip.

MICROSOFT EXCHANGE MAIN CHALLENGES IT MANAGER HAVE TO FACE GSX SOLUTIONS

AT Metro Monthly Patronage May 2015

Tools and Operational Data Available st RESOLUTE Workshop, Florence

Nearly 38,000 vehicles cross the Shinnecock Canal on Sunrise Highway (NYS Route 27) daily, during peak summer months.

Quarter /15 (1 April 30 June 2014)

Keeping Glasgow and your

SEO AND CONTENT MANAGEMENT SYSTEM

Sustainable Travel Plan. Date: December Estates Department

IP/SIP Trunk Software User Guide

Basic Steps to Using the Wildlife Application

Climate Change Adaptation for London s Transport System

Fare Change 2014 Frequently Asked Questions & Answers

GTFS: GENERAL TRANSIT FEED SPECIFICATION

Secure Web Appliance. Reverse Proxy

CONTACTS & WORKING HOURS

Central London ongestion charging

UCL CENTRE FOR ADVANCED SPATIAL ANALYSIS OPEN DATA & VISUALISATION

Where On Earth Will Three Different Satellites Provide Simultaneous Coverage?

Magnum AVL GPS Fleet Tracking User Interface Help Guide

The Mersey Gateway Project

FURTHER PARTICULARS 1. ABOUT THE DEPARTMENT OF MANAGEMENT SCIENCE AND INNOVATION

Customer Satisfaction Index 2014

Inner London s economy a ward-level analysis of the business and employment base

Quarter /14 (1 July 30 September 2013)

OmniPlan for Mac Reviewer s Guide

Roads Task Force - Technical Note 10 What is the capacity of the road network for private motorised traffic and how has this changed over time?

Network Rail October 2007 Strategic Business Plan. Supporting Document. Demand Forecasting in the SBP

Transcription:

A Symposium Hosted by I/S: A Journal of Law and Policy for the Information Society, at Ohio State University Moritz College of Law Urban Analytics & Big Data Is Big Data Enough for Effective City Simulation? Michael Batty m.batty@ucl.ac.uk @jmichaelbatty http://www.complexcity.info/ http://www.casa.ucl.ac.uk/ March 21, 2014 Centre Centre for Advanced for Advanced Spatial Spatial Analysis, University College London If you wish to look at the PDF of this presentation it is available on one of my blogs http://www.spatialcomplexity.info/ Go to this URL and then click on the Big Data Future ICON At the top of the Right Column Menu 1

Smart Card Data Oyster Card Taps Tap at start and end of train journeys Tap at start only on buses Accepted at 695 Underground and rail stations, and on thousands of buses (9,000 daily average routes) 991 million Oyster Card taps over Summer 2012 this is big data 2

http://www.simulacra.info/ http://planefinder.net/ 3

http://planefinder.net/ http://planefinder.net/ 4

http://planefinder.net/ http://planefinder.net/ 5

6

Key Themes Changes: Big Data, Short Times, Fine Spatial Scales: The Next 5 Minutes or the Next 5 Years? How Do We Build Good Models of Cities Using Big Data? Aggregating Big Data? Is Big Data Right Data Exemplars The London Oyster Card Data Set: Disruption Simpler Systems: Public Bikes as Exemplar Long & Short Term Flooding: Tyndall Cities Project Big Disruptions in Infrastructure: Olympics,Cross Rail Regional Breakdown: Phase Transitions Attempting a Synthesis: What Next for Planning? 7

Changes: Big Data, Short Times, Fine Spatial Scales: The Next 5 Minutes or the Next 5 Years? Big data is distorting or at least changing our attention span from longer to shorter time periods in which intervention takes place This is because big data is largely but not exclusively based on massive volumes terabytes over very short time spans seconds at very precise spatial scales centimetres or less The focus thus turns to issues such as disruption of systems for individuals over 5 minutes to 1 hour to 1 day rather than change for aggregate groups of populations from 1 year to 5 years to 50 years. To an extent there is a general shift from equity to efficiency in thinking about cities in this way, in the short term. It is a mild change in focus and perhaps we shouldn t make too much of it Of course big data is also being streamed incessantly without any potential end limit although systems will inevitably change and there is no guarantee that big data will tell us about change on any and every time scale. Big data can also be traditionally collected data by manual means and a good working definition is anything that will not fit into an Excel spreadsheet the population of the US is big data and it might be collected from many sources, none of which are streamed in real time. This is getting to be big data, 300 million individual records say. Big data is also as unreliable as small data, perhaps more so and I will tell you of some problems in it as we go along. 8

How Do We Build Good Models of Cities Using Big Data? Aggregating Big Data? Is Big Data Right Data If our attention is shifted to the short term, the focus changes from issues of equity in aggregate populations to issues of efficiency in individual behaviours. The focus also shifts from equilibrium states to dynamic change. And this changes our models and perhaps even what we are interested in. In one sense, we really don t have good theories to use big data in the same ways that we have been using much more structured data such as those based on censuses My example of the Oyster card data masses of data but not quite right for thinking about transport planning, or at least the planning of new routes and stations. Fine for short term disruption and plans to counter that, perhaps?.. not yet clear We have to aggregate the data anyway to bring it to the point where we can use it to test our models. Our models rely on the law of large numbers, I am afraid and despite the focus these days on agent based models and micro simulation, at the end of the day validation depends on making sure the aggregate patterns are reproduced and explained. The models we have tended to work with are really a mixture of explanatory and predictive in Eytan Adar s words yesterday. However there is a point that with big data, streamed and used in real time or more passively, then we can build models closer to the data and by that I mean data driven models And some of my examples are just this explorations of the data that tell us new things about the world that help us think and predict even the short term. As I proceed, my examples become more model based and less big data based although our these represent a mix of big small fat thin and so on data 9

Because big data is all about time, this in the field of cities is giving us the potential to begin to think about new kinds of dynamics. Harvey Miller yesterday talked eloquently about dynamics and we do now stand at a threshold where we can begin to deal with all the paraphernalia that has come out of network science and complexity theory in the last 20 years Small worlds, tipping points, weak ties, key hubs, catastrophes, bifurcations,, order effects, infinite repercussions and so on, the science of phase transition, of complexity and most of all of redundancy in systems. It is the science of heart attacks, and societal collapse and much else besides. It is the science of how the flows and interactions, how social networks keep the city functioning. So big data does have the promise of these kinds of explanations but most big data has not been assembled into a form to do this much of this yet. The London Oyster Card Data Set Ok let me begin to illustrate these ideas with some examples which involve big data and networks. I will return to the examples of transport in London. Our Oyster card data set involves all tap in and tap outs by unique ID number with location and time for every person over a 6 month period 7 million + a day on just the tube, 50 million a week, 160 million a month, nearly 1 billion for every half year. This is very big data simply to explore it takes a lot of computer time. And that is an issue with us. It is very good data in that we can begin to get a very detailed sense of how people travel over time the routiness of travel but there are a lot of holes in the data too as I illustrated earlier. The network is complicated too. 10

Tube, Overground and National Rail Networks in London where Oyster cards can be used + all the bus routes Roth C., Kang S. M., Batty, M., and Barthelemy, M. (2011) Structure of Urban Movements: Polycentric Activity and Entangled Hierarchical Flows. PLoS ONE 6(1): e15923. doi:10.1371/journal.pone.0015923 11

Animation over 24 hours of speeded up position/time of tubes: How can we match this supply of vehicles from the API queries to the demand from the Oyster card data? 12

The effect of a bus strike Tuesday 22 nd May 2012, 09:00 Wednesday 23 rd May 2012, 09:00 The left image shows the effect of the bus strike on 22 nd May 2012, while the image on the right shows a normal day. 13

Disruption caused by closing Liverpool Street in terms of the graph of the tube network Closing Green Park shifts in betweenness centrality Bank & Monument Stations: 5 Lines and 2 Modes 60k Entries/Exits Weekdays 35k Entries/Exits Weekends 14

Particular Events: Weekdays, Saturdays and Sundays Entry at Camden Town (10 Mn. Intervals) Entry at Arsenal (10 Mn. Intervals) 400 900 Weekday Weekday 300 Saturday Sunday 800 Saturday Sunday 200 700 600 100 Number of Events 0 100 200 Number of Events 500 400 300 Events 200 300 400 Nightlife 100 0 500 2am 4am 6am 8am 10am 12pm 2pm 4pm 6pm 8pm 10pm 12am 2am 4am 100 2am 4am 6am 8am 10am 12pm 2pm 4pm 6pm 8pm 10pm 12am 2am 4am Time of Day Time of Day Entry at Bank (10 Mn. Intervals) Entry at Bayswater (10 Mn. Intervals) 1000 150 Weekday Weekday 800 Saturday Sunday Saturday Sunday 100 600 400 50 Number of Events 200 0 200 Number of Events 0 50 400 600 800 Work 100 Tourism? 1000 2am 4am 6am 8am 10am 12pm 2pm 4pm 6pm 8pm 10pm 12am 2am 4am 150 2am 4am 6am 8am 10am 12pm 2pm 4pm 6pm 8pm 10pm 12am 2am 4am Time of Day Time of Day Disruption on the Circle and District Lines Circle and District line part closure From Edgware Road to Aldgate/Aldgate East 19 th July 2012 07:49 to 12:04 1234022 Oyster Cards with regular pattern during disrupted time period travelled 15

London Underground and Rail Network Geographic form Increased Travel Time Greater than 2SD above mean increase on usual travel time for that Oyster Card Size equal to proportion of users that regularly travel from station during time period, and travelled that during disruption 16

Origin Changes Locations from where individuals changed from their usual origin station Destination Changes Locations from where individuals changed their usual destination station 17

Complete Switch to Bus Locations from where users replaced usual journey with a bus-only journey Partial Switch to Bus Locations from where users replaced a part of usual journey with a bus journey 18

Upton Park 0.79 Dagenham East 0.77 Dagenham Heathway 0.76 Total Proportion Disrupted Proportion of all 79103 users identified as being disrupted from usual patterns 19

Simpler Systems: Public Bike as Exemplars World wide case study from Ollie O Brien: 100 cities, 3 years of data Docking station status Journey records Looking at cultural behaviour Each docking station shown by a circle Blue = empty, Purple = ~50% full, Red = full Normally two graphs Weekday (normally Wednesday) Weekend (normally Saturday) Live versions at http://bikes.oobrien.com/ Let us look at the big map first of all schemes, then we will go live and look at Tel Aviv, I hope, then London 20

The Barclays Cycle Hire London Project BCH commenced operations in July 2010 with 5,000 bicycles and 315 docking stations distributed across the City of London and parts of eight London boroughs.[10] The coverage zone spans approximately 17 square miles (44 km2), roughly matching the Zone 1 Travelcard area. Currently there are some 8,000 'Boris Bikes' and 570 docking stations in the BCH scheme, which has been used for over 19 million journeys to date 21

As yet no records of demand from people logging on, so no management capabilities, but could happen probably from an App based software but maybe from the server 22

23

http://oobrien.com/vis/bikes/ 24

The Website: Real Time Visualisation of Origins and Destinations Activity http://bikes.oobrien.com/london/ http://bikes.oobrien.com/global.php http://bikes.oobrien.com/columbus/ 25

Your can play back the last couple of days from the animator for many of the cities where the data is captured online Initial Analyses Flows Origins and Destinations 26

Thre are quite a few visualisations on Vimeo which have been developed by James Cheshire and Martin Austwick where they have used shortest routes methods to figure out bike paths http://vimeo.com/19982736 And one from Jo Wood at City University http://vimeo.com/33712288 27

Long & Short Term Flooding: The Tyndall Cities Project: Big Models Rather than Big Data We have been involved in a large consortium project led by Newcastle Civ Eng to look at an integrated assessment of climate change in Greater London MoSeS http://www.casa.ucl.ac.uk/movies-weblog/googleearth.mov 28

Flooding from our 3D Virtual London Model Shifts in Traffic Accessibility if all Bridges across the Thames are Inoperable as far West as Hammersmith 29

Big Disruptions/Additions in Infrastructure: The Olympics games Regeneration of East London and Cross Rail The Impact of Additional Employment and Population at the Olympic Games Site in East London 30

Absolute Gains in Population (left) in East London but Relative% Gains in Population in West London (right) M. Batty (2012) Urban Regeneration as Self Organisation, Architectural Design, 215, 54 59 Cross Rail: The High Speed Rail Line from Maidenhead to Stratford 30 trains an hour each way 31

Regional Breakdown: Phase Transitions My last example shifts scale massively to the entire country. We are working on a problem essentially in the regional economic geography of the UK or at least England, Wales and Scotland examining how prosperous cities are with scale. This is called allometry of city sizes the idea is that as cities grow they generate economies of scale and to cut a long story short, their per capita income grows with size. This is called superlinearity in short their metabolism get faster with size the pace of life gets faster, congestion gets greater and crime gets greater per capita as does income, inventiveness and so on I cant go into this in any detailed but to do this we need to construct cities from the ground up to define them the work done in the US for MSAs indicates that for every doubling of city size, income more than doubles going up by 215% or so. In the UK we have not found this incomes do not go up superlinearly but linearly and this we believe is due to many factors it may well be that the UK urban system is much more integrated than the US more global and it may be that policy, historical path dependence of how the country has developed and so on is key we don t know but we need to define proper defintions of cities and this we are using percolation theory to do this let me give you a sense of how we can break up the regional system from the UK network. Now this is really big data in that the road system is coded at every intersection and link by the OS, or by OSM 32

log access log pop A Theory of City Size, Science, 21 June 2013: Vol. 340, 1418 1419 What we do is we decrease the distance threshold for places to connect with one another systematically, starting from any integrated system where we show the top ten clusters in terms of size This reveals the splits in the system eventually giving us cities 33

What we do is we decrease the distance threshold for places to connect with one another systematically, starting from any integrated system where we show the top ten clusters in terms of size This reveals the splits in the system eventually giving us cities What we do is we decrease the distance threshold for places to connect with one another systematically, starting from any integrated system where we show the top ten clusters in terms of size This reveals the splits in the system eventually giving us cities 34

What we do is we decrease the distance threshold for places to connect with one another systematically, starting from any integrated system where we show the top ten clusters in terms of size This reveals the splits in the system eventually giving us cities What we do is we decrease the distance threshold for places to connect with one another systematically, starting from any integrated system where we show the top ten clusters in terms of size This reveals the splits in the system eventually giving us cities 35

Attempting a Synthesis: What Next for Planning? Ok a longish talk I know but basically my point is that big data is changing our focus and planning in cities needs to respond The danger is that we might shift too radically big data gives is the opportunity to develop new more focussed tools for dealing with short terms changes but this does not negate long term change, our traditional focus. Big data is currently short data but as we accumulate it, it will become long data. In some senses, this enriches our ability to understand, manage and intervene. There is also a warning we need to embrace these new technologies and theories that enable us to make sense of this type of data, to integrate it if we can and to become extremely wary of its quality, as we have and continue to be about small data I am going to finish with a story about big data in 1955 You can get to see this story on one of our blogs run by Stephen Gray. It isn t so active now but it will be as he is going to use it to put up some software for his Twitter work. It is called the Big Data Tool Kit www.bigdatatoolkit.org And look at post October 3 rd 2012 36

Let me acknowledge my colleagues Ollie O Brien for the Bikes project Ed Manley for the Oyster Card Disruption studies Jon Reades for Visualising the Oyster Card Data Richard Milton for tube train data visualisations and linking LUTI models to Google Earth Steve Evans and Andy Hudson Smith for 3D visualisations Tyndall Cities Consortium for London LUTI model Elsa Arcaute, Pete Ferguson, Erez Hatna and Anders Johannson for the urban size and percolation work Just let me finish by giving my coordinates and remind you that this power point as a PDF is on one of our blogs at http://www.spatialcomplexity.info/ M. Batty, CASA, UCL, 90 Tottenham Court Road, London WC1E 6BT m.batty@ucl.ac.uk http://www.complexcity.info/ @jmichaelbatty 37

My New Book published November 2013 http://www.mitpress.mit.edu/ 38