Visualizing More Than Twenty Years of Flight Data for the Raleigh-Durham International Airport




CS-BIGS 5(1): 51-57, 2012
http://www.bentley.edu/centers/csbigs/crotty.pdf

Visualizing More Than Twenty Years of Flight Data for the Raleigh-Durham International Airport

Michael T. Crotty
SAS Institute, USA

Over two decades of airline flight data for the Raleigh-Durham International airport (RDU) are examined. Visualizing the data reveals that there are multiple phases of air traffic activity at RDU, corresponding to the transition from being an American Airlines hub airport to being a non-hub airport serving a greater variety of airlines. The changing patterns involve the daily number of flights as well as the locations of the reciprocal flights flying in and out of RDU. An analysis of arrival and departure delay data is also presented.

Keywords: large data, visualization, data exploration

1. Introduction

In this paper, we investigate and visualize data for domestic flights that originated or terminated at Raleigh-Durham International airport (RDU). The data set contains over 2 million flights from October 1987 to December 2008. Some of the questions we used to direct the visualization project were:

Has the fast population growth of the Raleigh-Cary metropolitan area been reflected in the number of flights in and out of RDU?
How has the amount of air traffic changed over the 21 years of data?
What trends in RDU's air traffic can be found in the airline data provided?
How has the geographic distribution of airports with reciprocal traffic with RDU changed over time?
Has the on-time performance of RDU flights improved or worsened over the 21 years of data?

Starting from the questions listed above, our analysis split into four categories, which correspond to the next four sections of this paper. Section 2 describes the overall trends in the data, including descriptions of the four distinct phases of air traffic at RDU over the 21 year period. Section 3 describes the analysis of daily scheduled flights in and out of RDU.
Section 4 shows the changes in geographic distribution of RDU's reciprocal flights. Section 5 gives a brief analysis of flight delays at RDU over the 21 year period. Finally, Section 6 contains some final thoughts on this visualization project and possible future extensions to it.

With the exception of Section 5, which focuses on observed arrival and departure delays, this analysis looks only at scheduled flights. Therefore, there is no direct focus on the effect of the September 11, 2001 terrorist attacks on RDU air travel. However, there is some evidence of lengthy delays decreasing in 2001 before increasing dramatically from 2002 onward. This will be discussed in further detail in Section 5.

This analysis was originally presented at the 2009 Joint Statistical Meetings as an entry to the Data Expo 2009 poster competition sponsored by the American Statistical Association's Sections on Statistical Computing and Statistical Graphics. At the 2010 Joint Statistical Meetings, this analysis was presented in a topic contributed session entitled "Data Expo 2009: A Second Look at Flight Delays."

While some initial data preparation was done using the DATA step in SAS, the bulk of the work (including all the visualizations) was performed using JMP. Since JMP stores data in memory, it was necessary to subset the overall airline data set. Planes fly to and from RDU over the author's house every day, so confining the data to all flights in and out of RDU was a natural choice. JMP has recently made it easier to add maps to graphs, and the interactive nature of JMP makes it ideal for a data exploration project such as this one. The data and scripts for this analysis are available in the supplementary materials for this article.

2. Overall Trends Analysis

The population of the Raleigh-Cary metropolitan area grew steadily during the 21 year period; in fact, in 2009, Forbes.com named it the fastest growing metropolitan area in the country. We started our analysis by relating the yearly population of the area (using Wake County data as a proxy for the metropolitan area) to the number of scheduled flights per year and the total number of passengers going through RDU per year. Figure 1 displays this three-way comparison. We noted that the flight and passenger trends relate to each other fairly closely in the 1990s, but less so during the rest of the 21 year period. Also, the population trend does not correlate with either the flight or passenger trends until about 1995, after which point the flight and passenger trends track the population data better. This observation led to an investigation of the history of RDU and the airlines serving it. After investigating the history of RDU over the 21 year period, four phases of air traffic patterns began to emerge.
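The subsetting step and the yearly flight counts that feed the comparison in Figure 1 can be illustrated with a minimal Python sketch. This is not the author's SAS or JMP code. It assumes the flight records are CSV rows with columns named Year, Origin and Dest (names taken here as an assumption about the Data Expo 2009 files), and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import Counter

AIRPORT = "RDU"  # airport code used to subset the full data set


def rdu_flights(rows, airport=AIRPORT):
    """Yield only the rows whose origin or destination is the given airport."""
    for row in rows:
        if row["Origin"] == airport or row["Dest"] == airport:
            yield row


def flights_per_year(rows):
    """Count flights per calendar year (column name 'Year' is an assumption)."""
    return Counter(int(row["Year"]) for row in rows)


# Invented sample standing in for the yearly CSV files; not real data.
SAMPLE = """Year,Month,DayofMonth,UniqueCarrier,Origin,Dest
1990,1,3,AA,RDU,LGA
1990,1,3,AA,MIA,RDU
1990,1,4,DL,ATL,ORD
1991,6,9,US,RDU,CLT
"""

reader = csv.DictReader(io.StringIO(SAMPLE))
yearly = flights_per_year(rdu_flights(reader))
print(sorted(yearly.items()))
```

Filtering row by row in this way keeps only the RDU flights in memory, which is the same motivation the text gives for subsetting before loading the data into JMP.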
American Airlines (AA) used RDU as their central east coast hub up until late 1993; between late 1993 and early 1996, there was a transition period during which the hub was slowly closed down. After early 1996, there was a gradual recovery of flights and passengers. Finally, the years 2004 through 2008 were relatively stable, although there was still a fair amount of variation during these years. These four phases and their dates are listed in Table 1 and are used in the analyses of daily scheduled flights (Section 3) and of flight distribution (Section 4).

Figure 1. Comparison of annual trends in the population of Wake County (home of RDU), the number of flights at RDU and the number of passengers going through RDU. The number of passengers values are 100 times the values on the right axis. After the closing of the AA hub, the flights and passengers data track the population data better than during the years of RDU being an AA hub.

Figure 2 gives a plot of monthly trends in total flights in/out of RDU, the distinct number of airlines operating at RDU, and the number of AA flights. This plot shows the effects of the closing of the AA hub at RDU between September 1993 and April 1996. There is also a noticeable increase in the number of airlines in the Recovery and Stability phases of the data. One possible explanation for this is that after the AA hub closed, there was more room at the airport for other carriers to operate.

Figure 2. Time plot of the average number of daily scheduled flights per month in/out of RDU over the entire period of data as well as the average number of daily scheduled AA flights in/out of RDU. This clearly shows the initial effects of AA discontinuing its hub service. Also plotted is the number of airlines serving RDU each month. As the recovery of air traffic following the AA hub closing led to more stabilized current levels, the diversity of airlines serving RDU steadily increased.
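The four phases described above (and tabulated in Table 1) are defined by month boundaries, so assigning a flight date to its phase is a simple range lookup. The sketch below is only an illustration in Python; the names PHASES and phase_of are hypothetical and do not come from the author's scripts.

```python
from datetime import date

# (phase name, first month, last month) as (year, month) pairs,
# copied from the phase dates stated in the text.
PHASES = [
    ("American Airlines Hub", (1987, 10), (1993, 8)),
    ("Closing the Hub", (1993, 9), (1996, 4)),
    ("Recovery", (1996, 5), (2003, 12)),
    ("Stability", (2004, 1), (2008, 12)),
]


def phase_of(day):
    """Return the phase name for a date, or None if it is outside the data."""
    year_month = (day.year, day.month)
    for name, start, end in PHASES:
        # Tuples compare element by element, so this is a month-range test.
        if start <= year_month <= end:
            return name
    return None


print(phase_of(date(1993, 9, 1)))
```

Comparing (year, month) tuples avoids having to know the last day of each month when testing the end of a phase.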

Note that no comparison to national trends in carrier diversity was performed, so it could be that the number of airline carriers was increasing nationally over this time as well.

Figure 3. Map of airports that served as either a destination from RDU or an origin to RDU, sized and colored by the number of flights from 1987 to 2008. Airports with fewer than 100 flights and airports located outside the continental US were excluded from this map. See Table 2 for the number of flights that were affected by this exclusion.

The final analysis that was performed on the entire data set of scheduled flights was to look at the geographic distribution of the airports with flights to/from RDU in the 21 year period. Figure 3 shows this distribution for airports within the continental US; this restriction removed 13,249 flights to/from Puerto Rico and the US Virgin Islands. Any airports with fewer than 100 flights are excluded as well, which removed only 52 flights. Also, the original data set only contained flights entirely within the United States, so any international flights (to/from Canada or Europe) were excluded prior to obtaining the data. As one might expect, the bulk of the flights to/from RDU are with airports in the eastern part of the US. Section 4 explores how the geographic distribution of reciprocal airports changed over the 21 year period, broken down into the four phases listed in Table 1.

Table 1. The four phases of the RDU airline data and their dates.

Phase                   Start Date       End Date
American Airlines Hub   October 1987     August 1993
Closing the Hub         September 1993   April 1996
Recovery                May 1996         December 2003
Stability               January 2004     December 2008

3. Daily Flights Analysis

This section investigates patterns in daily scheduled flights, one phase of the data at a time. For each of the following four subsections, a compressed time plot was produced showing the number of scheduled flights for every day in the time period of the particular phase. The days are color coded by day of the week and grouped in vertical columns by month. There are also vertical reference lines to denote years. While this graphical technique might obscure the day-to-day time series appearance, the goal was to visualize the trends over larger temporal scales. At the same time, all the data points are still shown in a concise graph. Note that all of the graphs in this section use the same vertical scale and range; this is to allow for comparison across the four time periods.

A smoothing spline was fit to the data in each phase, to help visualize the trend over time. The spline was fit to the data as it is represented graphically: all data points in a single month are treated as having the same time value for the purposes of the spline fit.

3.1. American Airlines Hub (October 1987 to August 1993)

Figure 4 displays the daily scheduled flights for the first phase, corresponding to the time period when AA used RDU as one of their hubs. During this time, the number of flights is fairly steady. There is an unexplained dip and recovery in early 1991.

Figure 4. AA Hub (Oct 1987 to Aug 1993). Compressed time plot of number of flights per day by month with a smoothing spline fit. Points are colored by day of the week.

An interesting trend that shows up in this plot is that for most years, there is a very low data point, which is for a Thursday in November: Thanksgiving Day, which in most years has many fewer scheduled flights. Because the data are grouped by month, it is not easy to view the days around Thanksgiving, although for some years, the Friday after Thanksgiving is visible as being below the rest of the data points in November.
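To make the compressed time plot concrete, the following Python sketch shows one way its coordinates could be derived: each day's flight count is paired with a month index shared by every day in that month, plus the day of the week used for coloring. It also locates Thanksgiving Day, the fourth Thursday of November, so that the low November points can be checked. The function names are hypothetical, the input is assumed to be one date per scheduled flight, and the spline fit itself is not reproduced.

```python
from collections import Counter
from datetime import date, timedelta


def thanksgiving(year):
    """Return US Thanksgiving Day: the fourth Thursday of November."""
    first = date(year, 11, 1)
    # weekday(): Monday is 0, so Thursday is 3.
    offset = (3 - first.weekday()) % 7
    return first + timedelta(days=offset + 21)


def month_index(day, start_year=1987, start_month=10):
    """Months elapsed since October 1987; identical for all days in a month."""
    return (day.year - start_year) * 12 + (day.month - start_month)


def compressed_points(flight_dates):
    """Return sorted (month_index, weekday, flights) for each day with flights."""
    per_day = Counter(flight_dates)
    return sorted((month_index(d), d.weekday(), n) for d, n in per_day.items())


# Invented example: two flights on 1 Oct 1987 and one on 2 Oct 1987.
dates = [date(1987, 10, 1), date(1987, 10, 1), date(1987, 10, 2)]
print(compressed_points(dates))
print(thanksgiving(1990))
```

Because every day in a month shares one x value, a smoother fitted to these points sees the same layout as the plot, which matches the description of the spline fit given in the text.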

Finally, weekdays consistently have more scheduled flights than Saturdays and Sundays. This is not surprising, since there would be a lot of business travel during the week. This could also be shown with a monthplot, but then the variability of a particular day of the week within a month would be lost.

3.2. Closing the Hub (September 1993 to April 1996)

Figure 5 displays the daily scheduled flights for the second phase, corresponding to the time period when AA closed its hub at RDU. The most obvious aspect of the daily flights analysis for this phase is the dramatic decrease in the number of flights. This decrease corresponds with the closing of the AA hub at RDU, which was a gradual process that took about 2 years. By early 1996, the number of daily flights had bottomed out at between 100 and 150 flights per day. Throughout this period, the higher number of flights on weekdays continues to hold. However, the Thanksgiving Day effect is only strong in 1993, and is much less dramatic in 1994 and 1995. This is also the shortest phase of the data.

Figure 5. Closing the Hub (Sep 1993 to Apr 1996). Compressed time plot of number of flights per day by month with a smoothing spline fit. Points are colored by day of the week.

3.3. Recovery (May 1996 to December 2003)

From the shortest phase of the data, we move to the longest phase, which is a gradual recovery of flights over 8 years. In this phase, shown in Figure 6, we start to see the weekend effect change a bit. Sunday is still lower than weekdays, but Saturday is now even lower than Sunday. This trend also seems to intensify toward the end of this phase. The Thanksgiving Day effect is still present in this phase as well, although there is now less difference between Thanksgiving Day and other very low days in other months of the year. By the end of this third phase of the data, we see that the number of daily flights is back to around 300, which is near where it was for the AA Hub phase. Finally, as mentioned in Section 1, since this is an analysis of scheduled flights, it is difficult to see any noticeable effect of the September 11, 2001 terrorist attacks.

Figure 6. Recovery (May 1996 to Dec 2003). Compressed time plot of number of flights per day by month with a smoothing spline fit. Points are colored by day of the week.

3.4. Stability (January 2004 to December 2008)

The final phase of the data, shown in Figure 7, shows a lot of variability, but the daily number of flights continues to average between 300 and 400. This is actually slightly higher than the daily number of flights in the first phase when RDU was an AA hub. Interestingly, the weekday versus weekend difference is greatest in this phase, while the Thanksgiving Day trend is much harder to identify in many of the years in this phase. The gap between Saturday and Sunday flights, with Saturday the lower of the two, is even greater in this phase than in the Recovery phase. We do not have any theories on why there seems to be a 2 to 3 year cycle in the Stability phase, but the decline at the end of 2008 could be related to the start of the economic recession of the late 2000s in the United States.

Figure 7. Stability (Jan 2004 to Dec 2008). Compressed time plot of number of flights per day by month with a smoothing spline fit. Points are colored by day of the week.

4. Flight Distribution Analysis

This section breaks down the overall geographic distribution of flights to/from RDU that was presented in Figure 3 for each of the four phases of the data. Just as in Figure 3, airports with fewer than 100 flights were excluded from the maps, as were flights to/from Puerto Rico and the US Virgin Islands. See Table 2 for details on how many flights were excluded from each map based on these criteria.

Table 2. Number of excluded flights for Figures 8 through 11.

Phase                   Under 100 flights   Outside Continental US
American Airlines Hub   4                   9889
Closing the Hub         1                   3156
Recovery                10                  204
Stability               37                  0
Total                   52                  13249

At the beginning of the 21 year period when RDU was an AA hub (shown in Figure 8), all of the reciprocal airports are east of the Mississippi, with the exceptions of St. Louis and Dallas. Much of the traffic is to the north or south of RDU, consistent with the status of RDU being AA's central east coast hub between New York and Miami. Chicago, Dallas and Atlanta also have a large number of flights in the AA Hub phase.

Figure 8. AA Hub (Oct 1987 to Aug 1993). Map analogous to Figure 3 for the period that RDU served as an AA hub.

Figure 9. Closing the Hub (Sep 1993 to Apr 1996). Map analogous to Figure 3 for the period that AA scaled back hub service at RDU.

During the Closing the Hub phase (shown in Figure 9), there is not much change in the distribution of flights, with the exception that it becomes a bit sparser overall.

The Recovery and Stability phases are perhaps the most interesting in terms of geographic distribution. They show the rise of geographic diversity of reciprocal airports, primarily west of the Mississippi. The introduction of Southwest Airlines to RDU may account for many of these new destinations in the western US, including Chicago's Midway airport. Between Figures 10 and 11, there is a general trend of more flights to/from airports west of RDU and fewer east coast airports. However, in the Stability phase, the share of flights to/from the New York airports does seem to increase.

Figure 10. Recovery (May 1996 to Dec 2003). Map analogous to Figure 3 for the period that RDU traffic increased to 2008 levels.

Figure 11. Stability (Jan 2004 to Dec 2008). Map analogous to Figure 3 for the last five years, where RDU traffic has leveled off.

5. Delay Data Analysis

Our final analysis uses a different aspect of the data than the previous sections. Instead of looking at scheduled flights, we shift our focus to actual delays for both arrivals and departures. Specifically, we are interested in answering the question of whether delays at RDU have changed over the 21 year period.

The distribution of delays (measured in minutes) is necessarily a skewed distribution; flights can only arrive (or depart) so many minutes early, but they can be many minutes, or even hours, late in arriving or departing. To account for this skewness in the data, we approached the problem by looking at medians of the delays per month. The median delay is represented by the black line in Figures 12 and 13 for arrivals and departures, respectively.

After noticing that the median delay was fairly constant over the 21 year period for both arrivals and departures, we delved deeper into the data and looked at the 5th, 10th, and 25th percentiles on both the high and low side of the median (that is, the 5th, 10th, and 25th percentiles below it and the 75th, 90th, and 95th percentiles above it). Because of the skewness of the data, the three percentiles below the median are fairly close together and not terribly interesting. The three percentiles above the median, however, show that there has been an increase over time in the length of the longest delays for both arrivals and departures.

There are a few possible explanations for the increased length of longer delays. One is that there is more security to go through after September 11, 2001 that could be slowing down the system in general. Also, as can be inferred from Figure 1, after 2000, either flights operating out of RDU are simply using larger planes or they are filling the planes more efficiently; it is plausible that either of these could increase lengthy delays.

We fit a smoothing spline to the monthly percentiles as a way to clearly visualize the trend over time. Also, flights that were recorded as over 1 hour early or over 6 hours late were excluded as being suspect data. This only removed 47 arrivals and 33 departures out of over 2 million flights. As with other parts of the project, we have not compared any of these RDU trends to overall national trends.

Figure 12. Arrival Delays. Plot of percentiles of the monthly arrival delays (in minutes).

Figure 13. Departure Delays. Plot of percentiles of the monthly departure delays (in minutes).

6. Conclusions

Changes in the patterns of air traffic at RDU over 21 years (from 1987 to 2008) are explored. Most notably, American Airlines utilized RDU as a hub on the central east coast of the US until 1993. By 1996, the hub was completely gone, but a slow recovery started to take place. The levels of air traffic were more or less stable over the five year period ending in 2008. The population growth of the surrounding area only seems to correlate with air traffic levels in the post-hub time period. Also, as more airlines served the airport, more destinations were added. However, the longest delays for both arrivals and departures increased as well.

There are many possible future investigations that could be pursued. One would be to compare RDU's trends with national trends. The analysis performed here could even be replicated for other airports around the country. It would also be interesting to get more data from the RDU airport, aside from just flight data, that could help shed light on the validity of the speculative explanations for the trends presented here. With regard to the delay data, adding weather data could be very interesting in helping explain some of the variation in delays. There are also more data for RDU contained in the flight data provided by the Data Expo 2009 competition that could be further explored, especially in the on-time performance area. What airline, time of day, day of the week, and month of the year should you choose to travel on to minimize the chance of being delayed at RDU?
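Both the monthly percentile summaries of Section 5 and the closing question above come down to grouping delays by some key and summarizing each group. The Python sketch below illustrates that idea under stated assumptions: records are dictionaries with an ArrDelay field in minutes (an assumed name), the suspect-data rule follows the text (more than 1 hour early or more than 6 hours late is dropped), and percentiles use simple linear interpolation, which may not match the quantile definition JMP uses. All function names are hypothetical and the sample values are invented.

```python
from collections import defaultdict

# Percentiles on both sides of the median, as described in Section 5.
PERCENTILES = (5, 10, 25, 50, 75, 90, 95)


def percentile(sorted_values, p):
    """Percentile by linear interpolation between neighboring ranks."""
    if not sorted_values:
        raise ValueError("no data")
    position = (len(sorted_values) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    low_value = sorted_values[lower]
    return low_value + fraction * (sorted_values[upper] - low_value)


def is_plausible(delay_minutes):
    """Keep delays from 60 minutes early up to 360 minutes late."""
    return -60 <= delay_minutes <= 360


def delay_summary(records, key, delay_field="ArrDelay"):
    """Group plausible delays by key(record) and compute the percentiles."""
    groups = defaultdict(list)
    for record in records:
        delay = record[delay_field]
        if is_plausible(delay):
            groups[key(record)].append(delay)
    return {
        group: {p: percentile(sorted(values), p) for p in PERCENTILES}
        for group, values in groups.items()
    }


# Invented records; the 500 minute delay is dropped as suspect.
records = [
    {"Year": 2005, "Month": 1, "ArrDelay": d} for d in (-10, 0, 10, 20, 500)
]
summary = delay_summary(records, key=lambda r: (r["Year"], r["Month"]))
print(summary[(2005, 1)][50])
```

Passing a different key function, for example one returning the carrier code or the day of the week, would give the per-airline or per-weekday comparison that the question above asks for.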

Finally, it is clear that on-time performance is only one criterion when people book an airline ticket. It is tempting to think about combining other factors, namely ticket price, with this data set to better assess various regular flights in and out of a particular airport.

Acknowledgements: The author wishes to thank the editor and an anonymous reviewer for their helpful comments.

References

Data Expo 2009. Available at http://stat-computing.org/dataexpo/2009
City growth. Available at http://www.forbes.com
Passenger data. Available at http://en.wikipedia.org/wiki/Airline_hubs_at_RDU#Passenger_statistics
Population data. Available at http://www.google.com/publicdata
Flight data. Available at http://www.transtats.bts.gov/OT_Delay/OT_DelayCause1.asp
Airport data. Available at http://www.faa.gov
Airline data. Available at http://stat-computing.org/dataexpo/2009/supplemental-data.html
RDU history. Available at http://www.rdu.com/aboutrdu/history.htm
RDU history. Available at http://wikibin.org/articles/airlinehubs-at-rdu.html

Correspondence: Michael.Crotty@sas.com