BIG DATA FOR MODELLING 2.0
ENHANCING MODELS WITH MASSIVE REAL MOBILITY DATA
DATA INTEGRATION
Lorenzo Meschini, CEO, PTV SISTeMA
COST TU1004 Final Conference, Paris, 11 May 2015
www.ptvgroup.com
BIG DATA: TRADITIONAL DEFINITION
A collection of data too massive to be handled efficiently by traditional database tools and methods.
Big Data is not only a matter of non-trivial data sizes: it is rooted in the push to discover hidden, useful insights in data.
BIG DATA: THE THREE "V"s (Volume, Velocity, Variety)
- VOLUME is the sheer size of the data being collected.
- VELOCITY is the speed at which data flows into a business's infrastructure, and the ability of software solutions to receive and process that data quickly.
- VARIETY refers to the different data formats coming into your platform, and the challenge of taking raw, (un)structured data and organizing it.
BIG DATA: READY TO USE?
Three challenges besides data availability, from a business point of view:
- STORE: can you store the vast amounts of data being collected?
- PROCESS: can you organize, clean, and analyze the data collected?
- ACCESS: can you search and query this data in an organized manner?
FINDING HIDDEN DATA INSIGHTS
Once you get beyond storage and management, you still have the enormous task of creating actionable business intelligence (BI) from the datasets you've collected.
There are many types of analytic models, and different ways of providing infrastructure for this process. But the analytics solution must scale, too.
Ultimately, analytics tools rely on a great deal of reasoning and analysis to extract data patterns and insights, but this capacity means nothing to a business if it cannot then turn them into actionable intelligence.
SOME STATISTICS (2015)
- Big Data plans are underway for most organizations
- RDBMS still dominates the broader IT industry
- Almost all organizations expect their storage needs to grow exponentially
WHAT ABOUT BIG TRANSPORT & MOBILITY DATA?
The market already offers worldwide or continent-wide services and solutions based on the mobility trajectories or movements of individual vehicles and/or people.
Raw data sources:
- Vehicle trajectories from black boxes for insurance applications or vehicle location systems
- Vehicle trajectories from navigation systems
- Crowdsourcing from mobile phone apps
- Localization of mobile phones
Offered services / products:
- Real-time traffic monitoring & information
- Performance measures
- Maps
- Speed profiles and travel times on road segments
- Travel time matrices
- Observed OD matrices
- Trajectories
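As an illustration of how a product such as an observed OD matrix can be derived from trajectories, here is a minimal sketch. It assumes each trajectory has already been reduced to a start point and an end point, and that a lookup from points to traffic zones exists; the names `observed_od_matrix`, `trips` and `zone_of` are hypothetical and not taken from any PTV product.

```python
from collections import Counter

def observed_od_matrix(trips, zone_of):
    """Count trips per (origin zone, destination zone) pair.

    trips   -- iterable of (start_point_id, end_point_id) pairs (hypothetical format)
    zone_of -- dict mapping a point id to its traffic zone (hypothetical lookup)
    """
    od = Counter()
    for start, end in trips:
        origin = zone_of.get(start)
        destination = zone_of.get(end)
        # Skip trips whose endpoints fall outside the zoning system.
        if origin is None or destination is None:
            continue
        od[(origin, destination)] += 1
    return od

# Illustrative data only, not real observations.
zone_of = {"n1": "A", "n2": "A", "n3": "B", "n4": "C"}
trips = [("n1", "n3"), ("n2", "n3"), ("n3", "n4"), ("n1", "n9")]
od = observed_od_matrix(trips, zone_of)
```

The result counts only the observed sample of travellers, so in practice it would still need to be expanded to the full population, for instance with flow counts.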
WHAT ABOUT BIG PUBLIC TRANSPORT & MOBILITY DATA?
Public transport data are currently collected and stored on a local basis.
Raw data sources:
- Service plans
- PT vehicle trajectories from AVL and AVM systems
- PT events (delay / cancellation / rerouting)
- Ticket issuing / collection
- Crowdsourcing from mobile phone apps
Services produced are currently often confined to the entities collecting the data:
- Real-time information
- Performance and level-of-service measures
- Clearing
- Service planning (schedule)
Some companies are trying to bring services to a global level:
- Aggregating local data
CHALLENGES, ENABLERS AND OPPORTUNITIES
Challenges:
- Collecting PT data worldwide: data are held (owned?) by different authorities that won't provide them
- Going multimodal: collecting bike and pedestrian counts
- Mode of transport identification: car, bike and PT can be very similar in urban contexts
- Same trip, several transport systems
Enablers:
- Open data
- Crowdsourcing
- Internet of Things
Opportunities:
- Smart cities
PUBLIC TRANSPORT DATA MINING FOR MODELLING 2.0
[Diagram] Big Data (historical) on PuT
- Computer Science: pure statistical / machine learning approach
- Transportation Engineering: modelling approach + calibration by data
- Modelling 2.0
DATA DRIVEN MODELS - TODAY
Input: the same raw FCD data that today provide speed profiles
Output: calibrated traffic models + route choice
[Diagram] Optima Data Driven
- FCD raw trajectories
- Network graph
- Traffic zones
- Available flow counts
- Network attributes: free-flow speeds, capacities
- Demand: OD matrices
- Route choice: turning ratio (by destination zone)
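The route choice output, a turning ratio by destination zone, can be pictured with a small sketch. It assumes trajectories have already been map-matched into link sequences and tagged with their destination zone; the function name and the input format are hypothetical, and this is not a description of how Optima computes the ratios.

```python
from collections import defaultdict

def turning_ratios(paths):
    """Estimate turning ratios by destination zone.

    paths -- iterable of (destination_zone, [link_id, ...]) pairs, each a
             map-matched trajectory (hypothetical format).
    Returns {(destination_zone, from_link): {to_link: share}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for destination, links in paths:
        # Every consecutive pair of links is one observed turn.
        for from_link, to_link in zip(links, links[1:]):
            counts[(destination, from_link)][to_link] += 1
    ratios = {}
    for key, turns in counts.items():
        total = sum(turns.values())
        ratios[key] = {to_link: n / total for to_link, n in turns.items()}
    return ratios

# Illustrative data only.
paths = [
    ("Z1", ["a", "b", "d"]),
    ("Z1", ["a", "b", "d"]),
    ("Z1", ["a", "c", "d"]),
    ("Z2", ["a", "c", "e"]),
]
ratios = turning_ratios(paths)
```

Keying the ratios by destination zone means that vehicles leaving the same link can split differently depending on where they are heading, which is what lets the ratios act as a route choice model.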
DATA DRIVEN MODELS - TOMORROW
Input: the same raw FCD data that today provide speed profiles
Output: calibrated traffic models + route choice
[Diagram] Optima Data Driven
- Multimodal trips
- Network & service graph
- Traffic zones
- Multimodal flow counts
- Network attributes: free-flow speeds, transit capacities, waiting times, access / egress / interchange points
- Demand: OD matrices, modal split
- Route choice: turning ratio (by destination zone)
DATA DRIVEN MODELS: FUNCTIONAL OVERVIEW
[Flowchart, reconstructed from its labels]
ASSIGNMENT MATRIX UPDATE
- Inputs: observed vehicle trajectories, zones (origins / destinations), graph, day type definitions
- Map matching & speed calculation: link speeds by day type, splitting rates by destination and day type, observed matrices by day type
- Assignment matrix estimation: assignment matrix by day type
OD MATRIX UPDATE
- Inputs: assignment matrix by day type, zones (origins / destinations), flow measures, initial OD matrix
- OD matrix correction: corrected OD matrices by day type
GRAPH UPDATE
- Inputs: zones (origins / destinations), initial graph, link speeds by day type, corrected OD matrices by day type
- Speed, capacity and jam density correction: corrected graph
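The "link speeds by day type" product of the map matching step can be sketched as follows. The two-class day type definition, the record layout and the function names are assumptions made for illustration; the actual day type definitions are an input to the process and are not specified in the slides.

```python
from collections import defaultdict
from datetime import date

def day_type(d):
    """Hypothetical two-class day type definition: weekday vs. weekend."""
    return "weekend" if d.weekday() >= 5 else "weekday"

def link_speeds_by_day_type(traversals):
    """Mean speed (km/h) per (link, day type).

    traversals -- iterable of (link_id, date, length_m, travel_time_s) records,
                  assumed to come out of map matching.
    """
    length = defaultdict(float)
    time = defaultdict(float)
    for link, d, length_m, time_s in traversals:
        key = (link, day_type(d))
        length[key] += length_m
        time[key] += time_s
    # Total distance over total time, converted from m/s to km/h.
    return {k: 3.6 * length[k] / time[k] for k in length if time[k] > 0}

# Illustrative data only: two Monday traversals and one Saturday traversal.
traversals = [
    ("L1", date(2015, 5, 11), 500.0, 30.0),
    ("L1", date(2015, 5, 11), 500.0, 60.0),
    ("L1", date(2015, 5, 9), 500.0, 25.0),
]
speeds = link_speeds_by_day_type(traversals)
```

Dividing total distance by total time, rather than averaging the individual speeds, keeps slow traversals from being under-represented in the link speed.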
MODELLING 2.0: AN EXAMPLE
Creation of a graph model for transport assignment:
- Running Big Data analysis tools, you discover (from FCD probes, for example) that some streets should be included in the model because they are heavily used.
- Running online Big Data tools, you can update your model's parameters in real time: for example, for the route choice model, the turn probabilities at a given intersection.
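One possible way to update turn probabilities online is to blend the current values with the shares observed in the latest batch of FCD. The sketch below uses exponential smoothing, which is an assumption chosen for illustration and not a method stated in the slides; the function name and the `weight` parameter are hypothetical.

```python
def update_turn_probabilities(current, observed_counts, weight=0.2):
    """Blend current turn probabilities with newly observed turn counts.

    current         -- {to_link: probability}, summing to 1
    observed_counts -- {to_link: count} from the latest FCD batch
    weight          -- hypothetical smoothing factor in [0, 1]
    """
    total = sum(observed_counts.values())
    if total == 0:
        return dict(current)  # nothing observed, keep the model as is
    links = set(current) | set(observed_counts)
    return {
        link: (1 - weight) * current.get(link, 0.0)
        + weight * observed_counts.get(link, 0) / total
        for link in links
    }

# Illustrative data only: 8 of 10 observed vehicles turned left.
probs = {"left": 0.5, "right": 0.5}
probs = update_turn_probabilities(probs, {"left": 8, "right": 2})
```

Because both the current probabilities and the observed shares sum to one, the blended values also sum to one. A larger `weight` makes the model react faster to new data, at the cost of more noise.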
BIG DATA & DATA DRIVEN MODELS: FUTURE NEEDS
Big data can help improve the calibration and validation of all our models:
- Trip generation
- Trip distribution
- Mode choice
- Route choice
- Supply calibration
We need to conceive new calibration methodologies:
- Capacity and flow level recognition
- Transport system & mode recognition
- Path choice recognition
Thank you
Lorenzo Meschini
CEO, PTV SISTeMA
Realtime Solutions Director, PTV Group
lorenzo.meschini@ptvgroup.com