Making Sense of Urban Data
Anil Yazici, PhD
Camille Kamga, PhD
Data Sim @ Hasselt, Belgium, July 2013
NEW YORK CITY
Population, 2011 estimate: 8,244,910
Land area in square miles, 2010: 302.64
Persons per square mile, 2010: 27,012.5
TRANSPORTATION MODES IN NEW YORK CITY
Transit systems:
  Rail: commuter, subway systems (NYCT, PATH)
  Bus systems
  Ferries
Private automobile
Taxi
Dollar vans
Bicycle
Pedicabs
Walking
GPS-TAXI IN NEW YORK CITY
More than 13,200 taxis
2007: credit card payment and GPS system
What is urban data like?
Data flow while you are walking to the subway
Corner of St. Nicholas Av. & W 145th St.
February 2nd, ~5:2 PM
Smart Cities, Urban Science
Popular topics in both industry and academia:
  IBM, Cisco, Accenture
  CUSP (Center for Urban Science and Progress) at NYU-Poly
  Media Lab at MIT
Not confined to transportation: energy consumption, carbon footprint, waste management, emergency management
Basically, tackling all types of urban challenges with the help of data sources
TLC Yellow Cab Data
GPS dataset with more than 37 million taxi trips covering the period from January 1, 2009 to November 28, 2010.
Each trip record includes:
  trip origin and destination (lat-long)
  time of pick-up and drop-off
  number of passengers
  trip fare (base fare + surcharge + tolls)
  trip distance
How to make sense of taxi data?
Probe vehicle
  Travel time and congestion patterns
  Travel time variability/reliability
Passenger urban travel patterns
  Home to work, or business to another business?
  Morning rush hour vs. evening rush hour
Taxi as a taxi
  Taxi industry, a multi-billion-dollar business
  Taxis' role in the public transportation system
  Taxi demand and supply
  Medallion prices
  Fare hikes
  New technology: e-hail?
Any Drivers?
[Figure: heat maps by day of week (Monday to Sunday) and hour of day. Panels: average travel times; standard deviation of travel times; coefficient of variation (CoV) of travel times.]
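The day-of-week by hour-of-day summaries in this figure can be sketched as a simple group-and-aggregate step. The snippet below is only an illustration under stated assumptions: the `(pickup_datetime, travel_time_minutes)` record layout, the function name `dow_tod_stats`, and the toy values are all hypothetical, and the slides do not say how travel times were normalized before averaging.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

def dow_tod_stats(trips):
    """Group trips by (day of week, hour of day) and summarize travel times.

    `trips` is an iterable of (pickup_datetime, travel_time_minutes) pairs
    (a hypothetical layout). Returns {(day_name, hour): (mean, std, cov)}.
    """
    cells = defaultdict(list)
    for pickup, minutes in trips:
        cells[(DAYS[pickup.weekday()], pickup.hour)].append(minutes)

    stats = {}
    for key, values in cells.items():
        avg = mean(values)
        std = pstdev(values)  # population standard deviation within the cell
        cov = std / avg if avg else float("nan")
        stats[key] = (avg, std, cov)
    return stats

# Toy records, made up for illustration only.
sample_trips = [
    (datetime(2010, 3, 1, 8, 5), 10.0),   # Monday, 8AM
    (datetime(2010, 3, 1, 8, 40), 14.0),  # Monday, 8AM
    (datetime(2010, 3, 7, 3, 15), 6.0),   # Sunday, 3AM
]
stats = dow_tod_stats(sample_trips)
```

Each cell of the heat map then corresponds to one `(day, hour)` key in `stats`.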
Weather impacts?
What happens to your travel time when it is raining?
What about travel time variability?
WEATHER DATA
Data gathered from www.wunderground.com
Weather conditions are updated every hour, or sooner when the existing conditions change.
For each trip, the weather at the time of pick-up is taken as the weather condition for the entire trip.
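The pick-up-time rule above amounts to a "most recent observation at or before pick-up" lookup. A minimal sketch follows, assuming the weather log is a time-sorted list; the function name `weather_at` and the sample observations are hypothetical and do not come from the study.

```python
from bisect import bisect_right
from datetime import datetime

def weather_at(pickup, obs_times, obs_conditions):
    """Return the latest weather condition reported at or before `pickup`.

    `obs_times` must be sorted ascending and aligned with `obs_conditions`.
    The whole trip inherits this single condition, as the slide assumes.
    """
    i = bisect_right(obs_times, pickup) - 1
    if i < 0:
        return None  # no observation exists before this pick-up
    return obs_conditions[i]

# Hypothetical log: hourly updates plus one extra entry when conditions change.
obs_times = [
    datetime(2010, 3, 1, 8, 0),
    datetime(2010, 3, 1, 8, 25),  # change reported between hourly updates
    datetime(2010, 3, 1, 9, 0),
]
obs_conditions = ["Clear", "Light Rain", "Light Rain"]

trip_weather = weather_at(datetime(2010, 3, 1, 8, 40), obs_times, obs_conditions)
```

A trip that starts in clear weather and ends in rain is still labeled clear under this rule, which is the simplification the slide acknowledges.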
Assumptions
Percentage Change in Travel Time - Light Rain
[Figure: percentage change in average travel times under light rain, by day of week (Monday to Sunday) and hour of day; color scale from 0% to 20%.]
Percentage Change in Standard Deviation - Light Rain
[Figure: percentage change in standard deviation of travel times under light rain, by day of week (Monday to Sunday) and hour of day; color scale from -10% to 50%.]
Percentage Change in Coefficient of Variation - Light Rain
[Figure: percentage change in coefficient of variation of travel times under light rain, by day of week (Monday to Sunday) and hour of day; color scale from -10% to 25%.]
Travel Time, Congestion
Why do we (or should we) care?
Actual vs. perceived travel time
Objective vs. subjective travel time
Travel decisions: time, mode
Value of travel time (VOT)
  Toll pricing
  Economic evaluation of transport projects
Value of travel time reliability (VOR)
  Mode choice, route choice and more
How is the Value of Travel Time/Reliability Calculated? Stated Preference Surveys Revealed Preference Surveys
Subjective vs. Objective Travel Times
Subjective... based on what?
  Individual
  Time
One first needs to know the objective travel times in order to investigate the subjective travel times
A guideline for objective travel time patterns is needed
Taxi GPS data with almost 4 million records provide a good basis for objective travel times
Classification and Regression Trees (CART)
Homogeneous Time Periods Based on CART
[Figure: CART-estimated average travel times, standard deviation, and CoV, by day of week (Monday to Sunday) and hour of day.]
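To give a feel for how a regression tree carves the day into homogeneous periods, here is a deliberately reduced sketch. It splits a single ordered variable (hour of day) by greedy reduction in squared error. This is an assumption-laden toy, not the study's model: a full CART would also split on day of week and prune the tree, and the `min_gain` stopping rule and the hourly values are made up.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def split_periods(hours, values, min_gain=1.0):
    """Recursively split consecutive hours where squared error drops the most.

    Returns a list of (first_hour, last_hour, mean_value) leaves.
    `min_gain` is a hypothetical stopping threshold standing in for pruning.
    """
    total = sse(values)
    best_gain, best_k = 0.0, None
    for k in range(1, len(values)):
        gain = total - sse(values[:k]) - sse(values[k:])
        if gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None or best_gain < min_gain:
        return [(hours[0], hours[-1], sum(values) / len(values))]
    return (split_periods(hours[:best_k], values[:best_k], min_gain)
            + split_periods(hours[best_k:], values[best_k:], min_gain))

# Made-up average travel times for eight consecutive hours.
hours = [0, 1, 2, 3, 4, 5, 6, 7]
avg_times = [5.0, 5.0, 5.0, 9.0, 9.0, 9.0, 6.0, 6.0]
periods = split_periods(hours, avg_times)
```

On this toy input the procedure returns three periods (hours 0 to 2, 3 to 5, and 6 to 7), each with a constant estimated travel time, which is the kind of blocky pattern a CART-based heat map shows.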
Making Theoretical Sense: Travel Time Reliability Measures
The concept of reliability in general is based on component failures, which are relatively well defined for mechanical and production systems.
For transit systems with scheduled trips, failure can be defined based on on-time performance; for road networks, there is no such concept.
Many measures are suggested for road networks: variance, coefficient of variation, measures based on distribution quantiles, etc.
Let's see whether these measures (which supposedly measure the same phenomenon) are consistent.
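As a concrete reference, the three kinds of measures can be computed side by side for one time period. This is a sketch under assumptions: the quantile-based measure used here is the percentile width ratio (P90 - P50) / (P50 - P10), which is one common choice and may not be the exact "skew" definition used in the study, and both samples are invented.

```python
from statistics import mean, pvariance, quantiles

def reliability_measures(times):
    """Variance, coefficient of variation, and a quantile-based skew.

    ASSUMPTION: skew is taken as (P90 - P50) / (P50 - P10); the slides do
    not spell out which quantile-based measure was used.
    """
    avg = mean(times)
    var = pvariance(times)
    cov = var ** 0.5 / avg
    deciles = quantiles(times, n=10, method="inclusive")  # 9 cut points
    p10, p50, p90 = deciles[0], deciles[4], deciles[8]
    skew = (p90 - p50) / (p50 - p10) if p50 != p10 else float("inf")
    return {"var": var, "cov": cov, "skew": skew}

# Two made-up samples of travel times (minutes) for the same trip length.
symmetric = reliability_measures([10.0, 11.0, 12.0, 13.0, 14.0])
long_tail = reliability_measures([10.0, 11.0, 12.0, 14.0, 22.0])
```

Ranking time periods by each key in turn, and comparing the resulting orders, is one way to check whether the measures agree on which periods are unreliable.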
Consistency of Suggested Measures
[Figure: CoV, skew, and variance of travel times for Wednesday, Friday, and Sunday by hour of day, on a scale running from 1 (unreliable) to 0 (reliable).]
Something is wrong?
There is a need for more robust reliability measures
Making More Sense: Taxi Industry in New York City
More than 13,000 licensed taxicabs
Average of 660,000 passengers every day (Schaller Consulting, 2004)
Our findings indicate 680,000 to 895,000 passengers per day.
No obligation for taxi drivers to serve at any given time
Lease prices vary by day of the week and by day/night shift.
Taxi Industry in General Taxi industry is heavily regulated (e.g., taxi operating permits and licenses). Once the taxi supply is regulated and a market is formed through taxi licenses, a taxi fleet becomes an economic entity. The decision to issue new licenses or to permit fare changes requires research.
What to look for? How do ridership and taxi-supply levels change for different time periods and under various weather conditions? Does weather have an impact on the passenger pick-up rate and on hourly revenues for taxi drivers? What are the possible impacts of the recent New York City taxi-fare increase on taxi demand and supply equilibrium?
WHY?
UTRC's JFK Taxi Study: drivers don't come to JFK for pick-ups during rain. They make more money in Manhattan when it rains.
"All cabs are full!!" A Manhattan resident's cry during rain. Urban legend?
Individual taxi operators are income-target oriented and often reduce the amount of service as the demand increases (Schaller, 1999; Thaler, 1997)
What if not 17 percent?
Lease prices, medallion market
Taxi Ridership and Trip Statistics in NYC
                                     Mon      Tue      Wed      Thu      Fri      Sat      Sun
Average ridership                    679,29   762,55   787,112  838,973  864,911  895,725  734,566
Average number of trips              42,267   473,511  487,5    516,363  519,846  55,79    419,598
Average occupancy (passengers/taxi)  1.62     1.61     1.61     1.62     1.66     1.77     1.75
NYC subway and bus weekday average ridership is about 5.2 million and 2.5 million, respectively.
Average Ridership and Average Number of Drivers on Duty Per Hour
[Figure: average ridership per hour (axis in units of 10^4) and average number of drivers on duty per hour, by day of week (Mon to Sun), for each hour of the day.]
Average Distance per Trip and Number of Trips Per Hour per Driver
[Figure: average trip distance (miles) and average number of pick-ups per hour per driver, by day of week (Mon to Sun), for each hour of the day.]
What about the revenue?
Average Trip Distance and Pick-up Rate per Driver per Hour under Different Weather Conditions
[Figure: average pick-ups per hour per driver and average distance per trip (miles) for each hour of the day, under clear, light rain, moderate rain, heavy rain, light snow, moderate snow, and heavy snow conditions.]
Revenue per Hour per Driver for Different Weather Conditions
[Figure: average revenue per hour per driver (dollars) for each hour of the day, under clear, light rain, moderate rain, heavy rain, light snow, moderate snow, and heavy snow conditions.]
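Per-driver hourly pick-up rates and revenues like those plotted here can be derived from trip records roughly as follows. This is a sketch, not the study's procedure: the `(driver_id, pickup_datetime, fare_dollars)` layout is hypothetical, and the rule that a driver counts as on duty in any hour with at least one pick-up is an assumption, since the slides do not define it.

```python
from collections import defaultdict
from datetime import datetime

def per_driver_hourly(trips):
    """Summarize pick-ups and fare revenue per driver for each clock hour.

    `trips` is an iterable of (driver_id, pickup_datetime, fare_dollars).
    ASSUMPTION: a driver is "on duty" in an hour if they made at least one
    pick-up in that hour.
    """
    fares = defaultdict(float)
    pickups = defaultdict(int)
    drivers = defaultdict(set)
    for driver_id, pickup, fare in trips:
        hour = pickup.replace(minute=0, second=0, microsecond=0)
        fares[hour] += fare
        pickups[hour] += 1
        drivers[hour].add(driver_id)
    return {
        hour: {
            "drivers": len(drivers[hour]),
            "pickups_per_driver": pickups[hour] / len(drivers[hour]),
            "revenue_per_driver": fares[hour] / len(drivers[hour]),
        }
        for hour in fares
    }

# Made-up trips for two hypothetical drivers.
toy_trips = [
    ("A", datetime(2010, 3, 1, 8, 5), 12.0),
    ("A", datetime(2010, 3, 1, 8, 35), 8.0),
    ("B", datetime(2010, 3, 1, 8, 50), 10.0),
    ("B", datetime(2010, 3, 1, 9, 10), 15.0),
]
summary = per_driver_hourly(toy_trips)
eight_am = summary[datetime(2010, 3, 1, 8)]
```

Grouping the hourly results further by weather condition would give one curve per condition, as in the figure.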
Estimated Impacts of Taxicab Fare Increase
[Figure: revenue per hour per driver (dollars), split into base fare + surcharges and metered fare, under clear, light rain, moderate rain, heavy rain, light snow, moderate snow, and heavy snow conditions. Labeled increases range from 16.7% to 17.7% for 5AM-6AM and from 13.6% to 15.8% for 2PM-3PM.]
What sense did we make?
Substantial variations in ridership, trip distances and taxi supply across day-of-week and time-of-day (DOW-TOD) periods and weather conditions.
Minimum taxi supply is maintained at the level at which drivers receive approximately $20 per hour (excluding tips).
Under rain conditions:
  Higher pick-up rates and slightly shorter trips
  Higher hourly revenues
Snow conditions do not necessarily increase the hourly revenue per driver.
On Taxi Demand-Supply Balance
Taxi activity is mainly confined to Manhattan, and the whole available taxi inventory is not on the streets even at peak taxi demand periods.
Increasing the available number of taxis does not guarantee an increase in the level of service, especially if drivers have no guarantee of a certain minimum revenue (e.g., $20 per hour).
It is unlikely that weather conditions change the taxi supply at the beginning of the day or shift.
However, when inclement weather conditions are prolonged, drivers who reach their income targets may end their shifts early, leading to temporary taxi shortages.
Possible Implications of the Latest Fare Increase
After the latest fare adjustments, revenue increases are not experienced equally across all DOW-TOD periods.
The uneven revenue increases among different shift periods may eventually alter the corresponding lease prices.
The increase in revenues may affect taxi drivers' choice of which shift to be on duty.
More on Taxis (In Progress)
Taxi driver decision-making process for airport pick-ups
After each drop-off, the driver can either go to the airport (decision=1) or keep cruising for new passengers on the street (decision=0)
Binary decision: binary logistic regression
Factors that affect drivers' decision to go to JFK:
  Time of day
  Seriously?
  Day of week
  Borough
  Weather condition
  First trip?
  Short return ticket
Quantifying the Impacts
Factor               Impact
Time of day          ~1.5 times more likely during PM peak compared to AM peak
Day of week          ~¼ times less likely on Saturdays compared to Sundays
Borough              ~5 times less likely in Manhattan compared to Brooklyn
Weather condition    ~1/5 times less likely during rain
First trip?          ~4 times more likely if it is the first trip of the shift
Short return ticket  ~3 times more likely while in Queens and Brooklyn
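In a binary logistic regression, "times more likely" statements of this kind are usually read off as odds ratios, that is, the exponential of a fitted coefficient. The sketch below shows that link on invented data with a single binary factor (first trip of the shift or not). Everything here is an assumption for illustration: the counts, the plain gradient-ascent fitting routine `fit_logistic`, and the reading of the slide's figures as odds ratios.

```python
import math

def fit_logistic(xs, ys, lr=0.5, steps=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the factor
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up drop-offs: x = 1 if first trip of the shift, y = 1 if driver goes to JFK.
# Not first trip: 2 of 20 go to the airport. First trip: 6 of 20 go.
xs = [0] * 20 + [1] * 20
ys = [1] * 2 + [0] * 18 + [1] * 6 + [0] * 14

b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)

# Odds ratio computed directly from the counts, for comparison.
empirical = (6 / 14) / (2 / 18)
```

With one binary factor the fitted `odds_ratio` matches the ratio of observed odds, so an odds ratio near 4 would be reported as "about 4 times more likely" in the slide's wording.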
OK, Quantified Then What? Policies?
More Ongoing Research
No!
Why do people take a taxi?
Why pay more when you have a dense subway system?
Is the subway system really dense? Ask someone living in Queens or Red Hook
How does land use affect taxi ridership?
The end Die Ende Fin Son Thank you! Q&A