HATRIS JTDB Query Tool

Reference manual

Version: May 2011

This reference manual contains information on the data sources and processes used to generate the fields presented in the results table of the JTDB Query Tool. Instructions on running the Query Tool are found in the HATRIS JTDB user guide.

The steps in the JTDB Query Tool are as follows:

1. Select dates (start and end dates);
2. Select time period (15-minute or hourly interval);
3. Select road (road number);
4. Select links;
5. Select output format;
6. Results.

This manual refers to these steps where appropriate.

Background to HATRIS

HATRIS is the Highways Agency (HA) Traffic Information System. HATRIS currently contains two distinct databases:

TRADS (Traffic Flow Database System). This contains count data from about 11,000 inductive loops installed across the HA network of strategic roads.

JTDB (Journey Time Database). This contains estimates of average journey time, average speed and total flow for every 15-minute interval throughout the year for about 2,500 junction-to-junction links on the HA network [1].

The HATRIS database is currently hosted by Capita Symonds, Carlisle. The data sources and reference data used to generate the JTDB Query Tool results table are loaded into the TRADS and JTDB databases and processed using algorithms which verify and filter the data, match the data to the HA network and aggregate the data to the required temporal and spatial resolution.

Traffic flow data has been collected by the HA for many years. The TRADS database was developed in 2000-02 and contains detailed traffic flow information, including information dating back to the mid 1980s for some locations. The JTDB was developed separately in 2003-04 by Mott MacDonald, supported by TRL, under contract to the HA. It contains link-related journey time and flow information for every date and 15-minute interval dating back to 1st September 2002.

In February 2005 Capita Symonds were contracted to further develop the two databases and bring them together as HATRIS. At the same time TRL were independently appointed as technical advisors for HATRIS.

[1] Journey times for a 15-minute period are defined as the average for all vehicles passing the upstream end of the link in the 15-minute period. Flow is a count of the number of these vehicles.

New flow data is added to TRADS continuously and an additional month of data is added to JTDB one month in arrears. The data is normally available for access by the query tool early in the following month (e.g. March data will be available in early May).

JTDB Query Tool data sources

The flow data in JTDB is harvested from TRADS. If 15-minute flows are available, these are used; otherwise the hourly flow is divided by 4. The flow data in TRADS originates from three separate systems of inductive loops, all owned by the HA, and is processed by Traffic Information Services (TiS). The sources are as follows:

1. HA loops known as TAME loops - these were the original loops used for TRADS in the past;
2. HA MIDAS loops - one loop on each link of the HA network where MIDAS is installed is designated as a counting loop;
3. NTCC loops installed from 2004 onwards for the National Traffic Control Centre (contractor TiS) - these have replaced many of the TAME loops.

The current source data used to estimate the 15-minute journey times and speeds for the links of the HA network available through the JTDB Query Tool is as follows:

1. The HA MIDAS [2] system of inductive loops at nominal 500m intervals on approximately 40% of the motorway network - 1-minute average speeds and flows for the loop sites are used;
2. [3] ANPR (Automatic Number Plate Recognition) journey time data, available on most all-purpose trunk roads and some motorways - journey times for individual vehicles between the ANPR cameras are used. Please note: ANPR source data was not supplied after December 2012;
3. ITIS [4] spot speed data obtained from vehicles fitted with GPS (Global Positioning System) satellite tracking devices (used up to December 2007) - hourly average speeds for links of the ITIS network are used;
4. GPS satellite tracking data (used from January 2008 onwards) - speed and location data for individual vehicles observed at 10-second intervals are used.
[2] Motorway Incident Detection and Automatic Signalling
[3] Ltd
[4] ITIS Holdings plc

JTDB reference data

The process used to generate the results tables uses reference data which is either fixed or updated at intervals. The main table of reference data relates to the HATRIS link definition of the HA network. Fields are listed in Table 1. A fixed look-up table is used to convert the speed limit to a free flow speed, as shown in Table 2.

Table 1: HATRIS network

Field name | Label in query tool results table | Description
RRef | Link ID | The unique ID of the link. References starting with LM are motorways and those starting with AL are A roads.
DfTNumber | - | Road number
startlink | - | DfT number of link at start node
endlink | - | DfT number of link at end node
UniqueID | Link | ID for Link. From [DfTNumber], [StartLink] and [EndLink] concatenated.
natureofroad | - | Single or Dual Carriageway (used to derive the [freeflowspeed], see Table 2).
length | Link Length | The length of the link in kilometres
speedlimit | - | Speed limit (km/h); may be a length-weighted average if the speed limit changes along the link. Used to derive [freeflowspeed] (see Table 2).
freeflowspeed | Used to derive Total Delay | Free flow speed (km/h); may be a length-weighted average if the speed limit changes along the link (see Table 2). The values used are standard DfT values for different carriageway types/classes of road.
StartX | - | 6 digit OSGR easting coordinate of link start point
StartY | - | 6 digit OSGR northing coordinate of link start point
EndX | - | 6 digit OSGR easting coordinate of link end point
EndY | - | 6 digit OSGR northing coordinate of link end point
ValidFrom | - | Date from which this link definition is valid (inclusive)
ValidTo | - | Date to which this link definition is valid (inclusive)
LatestChangeDate | - | Latest date upon which this link definition was changed. Used for table maintenance.

Table 2: The free-flow speed look-up table

Road class | Number of carriageways | Speed limit (mph) | Free flow speed (km/h)
All Purpose (A) | Single | 30 | 35.40548
All Purpose (A) | Single | 40 | 51.49888
All Purpose (A) | Single | 50 | 64.37360
All Purpose (A) | Single | 60 | 72.42030
All Purpose (A) | Single | 70 | 99.77908
All Purpose (A) | Dual | 30 | 40.23350
All Purpose (A) | Dual | 40 | 48.28020
All Purpose (A) | Dual | 50 | 59.54558
All Purpose (A) | Dual | 60 | 91.73238
All Purpose (A) | Dual | 70 | 99.77908
Motorway (M) | Dual | 30 | 48.28020
Motorway (M) | Dual | 40 | 64.37360
Motorway (M) | Dual | 50 | 80.46700
Motorway (M) | Dual | 60 | 96.56040
Motorway (M) | Dual | 70 | 107.82580

At Step 3 of the Query Tool selection process a road number is selected. The list displayed refers to the [DfTNumber] in the HATRIS network table.

At Step 4 of the selection process for the Query Tool a table of links is displayed from which to select those required. This table is based on the HATRIS network table (Table 1), as shown in Table 3.
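The Table 2 look-up can be illustrated with a short Python sketch. The dictionary layout, the function names and the use of the LM/AL prefix of [RRef] to pick the road class are assumptions made for illustration; they are not taken from the HATRIS implementation. The sketch only handles the tabulated speed limits and does not attempt the length-weighted averaging mentioned in Table 1.

```python
# Illustrative sketch only: names and structure are hypothetical.
# Values are copied from Table 2 (speed limit in mph -> free flow speed in km/h).
FREE_FLOW_KMH = {
    ("A", "Single"): {30: 35.40548, 40: 51.49888, 50: 64.37360, 60: 72.42030, 70: 99.77908},
    ("A", "Dual"): {30: 40.23350, 40: 48.28020, 50: 59.54558, 60: 91.73238, 70: 99.77908},
    ("M", "Dual"): {30: 48.28020, 40: 64.37360, 50: 80.46700, 60: 96.56040, 70: 107.82580},
}


def road_class_from_rref(rref: str) -> str:
    """Return 'M' for references starting with LM and 'A' for those starting with AL."""
    if rref.startswith("LM"):
        return "M"
    if rref.startswith("AL"):
        return "A"
    raise ValueError(f"Unrecognised link reference: {rref!r}")


def free_flow_speed_kmh(road_class: str, carriageway: str, speed_limit_mph: int) -> float:
    """Look up the free flow speed for a tabulated speed limit."""
    try:
        return FREE_FLOW_KMH[(road_class, carriageway)][speed_limit_mph]
    except KeyError:
        raise ValueError("Combination not present in Table 2") from None
```

For example, `free_flow_speed_kmh("M", "Dual", 70)` returns the motorway value of 107.82580 km/h from the last row of Table 2.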

Table 3: Relationship between HATRIS network fields and table displayed at Step 4 of query tool

Field of HATRIS network table | Field of table shown at Step 4 of query tool selection process (Select Links) | Description
RRef | Road Ref | The unique ID of the link. References starting with LM are motorways and those starting with AL are A roads.
StartX | Start X | 6 digit OSGR easting coordinate of link start point
StartY | Start Y | 6 digit OSGR northing coordinate of link start point
EndX | End X | 6 digit OSGR easting coordinate of link end point
EndY | End Y | 6 digit OSGR northing coordinate of link end point
UniqueID | | ID for Link. From [DfTNumber], [StartLink] and [EndLink] concatenated.
length | Length | The length of the link in kilometres
speedlimit (units km/h) | Speed Limit (units mph) | Speed limit; may be a length-weighted average if the speed limit changes along the link.

A look-up table is used to associate the TRADS loop data with the links of the HATRIS network. This can be a complex operation because for some links count data from TRADS must be manipulated to generate the flow on the link.

Table 4: TRADS site to HATRIS link look-up table

Field name | Format | Description
Rref | String | HATRIS link reference
DFTNo | String | DfT road number
AreaNo | Integer | The Highways Agency's area, in the range 1 to 14, or 25 to 34
SiteNo | String | TRADS site identifier, if this is a MIDAS or TAME site; otherwise null. If the first character is V this is a virtual [5] TRADS site.
TMUNo | String | TRADS site identifier, if this is an NTCC (TMU) site; otherwise null. If the first character is V this is a virtual TRADS site.
Action | String | Text code which describes how this site should be combined with others which map to the same HATRIS link.
ValidFrom | Date/Time | Date from which the match is valid
ValidTo | Date/Time | Date to which the match is valid

The Action column determines how the data from the specified TRADS site should be combined with any other site mapped to the same HATRIS link and valid on the same date:

a    average with other site/sites;
=    equals (where only one matching site);
+    add (in combination with another site);
-    subtract (in combination with another site);
h    halve;
ha   halve and average with another site/sites.

[5] A virtual site is a combination of other sites (for example the sum of sites covering one lane only).
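The Action codes can be illustrated with a minimal Python sketch. The manual does not define the exact arithmetic, so the interpretation here is an assumption: "+" and "-" sites are summed with their signs, "a" sites are averaged, "h" halves a single site, and "ha" halves each site before averaging. The function name and the input format are hypothetical.

```python
# Illustrative sketch only: one possible reading of the Action codes in Table 4.
def combine_sites(sites):
    """Combine (action, flow) pairs for the sites mapped to one HATRIS link.

    `sites` is a list of (action, flow) tuples for one link and one interval.
    """
    if not sites:
        raise ValueError("No sites supplied")
    actions = {action for action, _ in sites}
    flows = [flow for _, flow in sites]

    if actions == {"="}:
        if len(sites) != 1:
            raise ValueError("'=' expects exactly one matching site")
        return flows[0]
    if actions <= {"+", "-"}:
        # Assumed: signed sum of the contributing sites.
        return sum(flow if action == "+" else -flow for action, flow in sites)
    if actions == {"a"}:
        return sum(flows) / len(flows)
    if actions == {"h"}:
        if len(sites) != 1:
            raise ValueError("'h' is assumed to apply to a single site")
        return flows[0] / 2
    if actions == {"ha"}:
        # Assumed: halve each site, then average the halves.
        return sum(flow / 2 for flow in flows) / len(flows)
    raise ValueError(f"Unsupported combination of actions: {sorted(actions)}")
```

Under these assumptions, a link fed by one "+" site counting 100 vehicles and one "-" site counting 30 vehicles would be assigned a flow of 70.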

Outline of processing

The processing of data from each source follows the same general method, with some exceptions. These are detailed in the description of each stage. The stages are shown in Figure 1.

Figure 1: Stages in the processing of data

(1) Load one month of source data
(2) Verify and filter data
(3) Generate 15-minute data and in-fill
(4) Match source network to HATRIS network
(5) Aggregate data to generate 15-minute data for each HATRIS link
(6) Generate output fields for Results table

(1) Load

The MIDAS, ANPR, GPS (ITIS before January 2008) and TRADS data is loaded into the database. Only the data for the TRADS sites in the TRADS site to HATRIS link look-up table (Table 4) are loaded.

(2) Verify and filter

MIDAS: If there are fewer than 1380 valid (not zero or 255) 1-minute flow values (out of 1440 in one day), the record is marked as invalid.
ANPR: Data for links marked as bad are removed and filtering software is used to remove outliers.
GPS: Not filtered at the load stage (see (3) below).
ITIS: Filtering rules are used to exclude invalid data.
TRADS: No filtering undertaken (data is validated by the data supplier TiS).

(3) Generate 15-minute data and in-fill

Data is processed to generate journey times for 15-minute intervals. This process varies according to the source.

MIDAS: Average journey times for MIDAS Sections are generated by simulating the passage of vehicles setting off at 1-minute intervals and averaging the 15 estimates. The MIDAS Sections are a one-to-one match to the HATRIS links.
ANPR: Average journey times are calculated for all vehicles passing the start of the camera-to-camera link in the 15-minute interval.
GPS: Observations are matched to HATRIS links and filtered using rules to reject journeys for vehicles which may have stopped on the link or made a diversion. The journey time for each vehicle is calculated from the average observation speeds and averaged for all vehicles passing the start of the link in the 15-minute interval.
ITIS: The hourly data is applied to each of the 15-minute intervals.
TRADS: 15-minute data is loaded.

Finally, the data for each of the sources is in-filled to generate a value for every 15-minute interval in the month. The in-filling process is described in Appendix A.
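The MIDAS daily validity rule from stage (2) can be sketched in Python as follows. The function name and the treatment of missing readings (None) as invalid are assumptions for illustration; the manual only states that values of zero or 255 are not valid and that at least 1380 of the 1440 daily values must be valid.

```python
# Illustrative sketch only: function name and handling of None are hypothetical.
INVALID_VALUES = (0, 255)
MIN_VALID_PER_DAY = 1380  # out of 1440 one-minute values in a day


def is_valid_midas_day(minute_flows):
    """Return True if a day of 1-minute flow values has enough valid readings."""
    valid = sum(
        1 for value in minute_flows
        if value is not None and value not in INVALID_VALUES
    )
    return valid >= MIN_VALID_PER_DAY
```

A day with exactly 1380 valid readings passes the check, because the rule only rejects records with fewer than 1380.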

(4) Match data to HATRIS network

The matching differs for each source.

MIDAS: The MIDAS Sections are a one-to-one match to the HATRIS links.
ANPR: The matching table is used to generate journey times for HATRIS links.
GPS: Generated in previous stage (3).
ITIS: The matching table is used to generate journey times for HATRIS links.
TRADS: The TRADS site to HATRIS link look-up table (Table 4) is used.

(5) Aggregate

The aggregation process differs for each source.

MIDAS: No aggregation is necessary.
ANPR: Where more than one link matches a HATRIS link, the overlap lengths are used to aggregate values.
GPS: Generated in previous stage (3).
ITIS: Where more than one ITIS link matches a HATRIS link, the values are averaged.
TRADS: The rules recorded in the TRADS site to HATRIS link look-up table (Table 4) are used.

(6) Output to results table

The aggregation stage for each of the journey time sources generates 15-minute data for links of the HATRIS network. Only one estimate of the journey time and speed is output to the results tables. The preference rules are:

1. For motorways (excluding M6 Toll):
   a. Use MIDAS data, including in-filled;
   b. Use GPS data, including in-filled (ITIS data before January 2008).
2. For non-motorways and M6 Toll:
   a. Use ANPR data, including in-filled;
   b. Use GPS data, including in-filled (ITIS data before January 2008).

There are minor differences between the 15-minute and hourly results tables selected at Step 2. Table 5 and Table 6 show the 15-minute and hourly tables of results and the source of the data. Appendices B, C and D contain the details of the fields [Day Type Id], [Quality] and [Total Flow, avg'd over day type]. When the results are exported to a csv file at Step 6, three more fields are included to define the time interval and day type (see Table 7).
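The preference rules in stage (6) can be sketched as a small Python function. The function signature, the source labels and the representation of the M6 Toll as the string "M6 Toll" are assumptions for illustration; the manual does not say how the rules are coded.

```python
# Illustrative sketch only: signature and labels are hypothetical.
from datetime import date

GPS_START = date(2008, 1, 1)  # GPS data replaces ITIS data from January 2008


def preferred_source(is_motorway, road_number, interval_date, available):
    """Pick the journey time source for a link and interval.

    `available` is the set of sources that have a value (observed or
    in-filled) for the link and interval.
    """
    first = "MIDAS" if is_motorway and road_number != "M6 Toll" else "ANPR"
    second = "GPS" if interval_date >= GPS_START else "ITIS"
    for source in (first, second):
        if source in available:
            return source
    return None  # no journey time estimate can be output
```

For a motorway link in 2010 with both MIDAS and GPS values, the sketch returns "MIDAS"; if only GPS is present it falls back to "GPS".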

Table 5: Table of results for 15-minute interval selection

Field name | Description | Source
Link ID | The unique ID of the link. | [Rref] from HATRIS network table.
Link | A description of the link. | [UniqueID] from HATRIS network table.
Link Length (km) | The length of the link. | [Length] from HATRIS network table.
Date | The date associated with the record (dd/mm/yyyy). | Date associated with 15-minute data.
Time Period Id | The time associated with the record. | Time interval associated with 15-minute data. In range 0 to 95, corresponding with 00:00:00-00:14:59 to 23:45:00-23:59:59.
Day Type Id | A category of days. | In range 0-14 (8 and 10 are not used); see Appendix B for definition.
Quality | The quality of the journey time data for the record (High, Medium or Low). | See Appendix C for details.
Avg Travel Time (secs) | The average journey time for the link. | Estimate of average journey time for preferred source.
Avg Travel Speed (km/h) | The average speed for the link. | Estimate of average speed for preferred source.
Total Flow (vehicles) | The total number of vehicles on the link. | Estimate of flow from TRADS.
Total Flow, avg'd over day type (vehicles) | Adjusted flow based on average for the day type. | See Appendix D for details.
Total Delay (vehicle secs) | The total delay for the set of vehicles. | The [Avg Travel Time] minus the travel time at the [freeflowspeed] (from HATRIS network table), multiplied by the [Total Flow, avg'd over day type]. Set to zero if the calculation yields a negative number.
Vehicle kilometres (km) | The total distance covered by all vehicles. | [Total Flow, avg'd over day type] multiplied by [Link Length].
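The derived fields in Table 5 can be worked through with a short Python sketch. The function and parameter names are hypothetical; the arithmetic follows the Source column, with the free flow travel time taken as link length divided by free flow speed, converted to seconds.

```python
# Illustrative sketch only: function and parameter names are hypothetical.
def time_period_start(time_period_id):
    """Convert a Time Period Id (0 to 95) to the start time of its interval."""
    if not 0 <= time_period_id <= 95:
        raise ValueError("Time Period Id must be in the range 0 to 95")
    hours, quarter = divmod(time_period_id, 4)
    return f"{hours:02d}:{quarter * 15:02d}"


def total_delay(avg_travel_time_s, link_length_km, free_flow_kmh, adjusted_flow):
    """Total delay in vehicle seconds, floored at zero."""
    free_flow_time_s = link_length_km / free_flow_kmh * 3600.0
    delay = (avg_travel_time_s - free_flow_time_s) * adjusted_flow
    return max(delay, 0.0)


def vehicle_kilometres(adjusted_flow, link_length_km):
    """Total distance covered by all vehicles on the link."""
    return adjusted_flow * link_length_km
```

As a worked example, a 2 km link with a free flow speed of 100 km/h has a free flow travel time of 72 seconds. With an average travel time of 90 seconds and an adjusted flow of 200 vehicles, the total delay is (90 - 72) x 200 = 3600 vehicle seconds and the vehicle kilometres are 400.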

Table 6: Table of results for hourly interval selection

Field name | Description | Source
Link ID | The unique ID of the link. | [Rref] from HATRIS network table.
Link | A description of the link. | [UniqueID] from HATRIS network table.
Link Length (km) | The length of the link. | [Length] from HATRIS network table.
Date | The date associated with the record (dd/mm/yyyy). | Date associated with 15-minute data.
Time Period Id | The time associated with the record. | Time interval associated with first 15 minutes of hour. In range 0 to 95, corresponding with 00:00:00-00:14:59 to 23:45:00-23:59:59.
Day Type Id | A category of days. | In range 0-14 (8 and 10 are not used); see Appendix B for definition.
High Quality | Count of High Quality journey time data. | Count of 15-minute intervals in hour with High Quality journey time data. See Appendix C for definition of High Quality.
Medium Quality | Count of Medium Quality journey time data. | Count of 15-minute intervals in hour with Medium Quality journey time data. See Appendix C for definition of Medium Quality.
Low Quality | Count of Low Quality journey time data. | Count of 15-minute intervals in hour with Low Quality journey time data. See Appendix C for definition of Low Quality.
Avg Travel Time (secs) | The average journey time for the link. | Average of values for each 15-minute interval.
Avg Travel Speed (km/h) | The average speed for the link. | Average of values for each 15-minute interval.
Total Flow (vehicles) | The total number of vehicles on the link. | Sum of values for each 15-minute interval.
Total Flow, avg'd over day type (vehicles) | Adjusted flow based on average for the day type. | Sum of values for each 15-minute interval (see Appendix D for details).
Total Delay (vehicle secs) | The total delay for the set of vehicles. | Sum of values for each 15-minute interval.
Vehicle kilometres (km) | The total distance covered by all vehicles. | Sum of values for each 15-minute interval.
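The relationship between the 15-minute and hourly tables can be sketched in Python. The dictionary keys and the function name are hypothetical, and the averages are assumed to be simple unweighted means of the four 15-minute values, which is how the Source column of Table 6 reads.

```python
# Illustrative sketch only: record layout and names are hypothetical.
from collections import Counter


def aggregate_hour(quarters):
    """Combine four consecutive 15-minute records into one hourly record."""
    if len(quarters) != 4:
        raise ValueError("An hour needs exactly four 15-minute records")
    quality = Counter(q["quality"] for q in quarters)
    return {
        # The hourly record carries the id of the first 15 minutes of the hour.
        "time_period_id": quarters[0]["time_period_id"],
        "high_quality": quality["High"],
        "medium_quality": quality["Medium"],
        "low_quality": quality["Low"],
        "avg_travel_time": sum(q["travel_time"] for q in quarters) / 4,
        "avg_travel_speed": sum(q["speed"] for q in quarters) / 4,
        "total_flow": sum(q["flow"] for q in quarters),
        "adjusted_flow": sum(q["adjusted_flow"] for q in quarters),
        "total_delay": sum(q["delay"] for q in quarters),
        "vehicle_km": sum(q["vehicle_km"] for q in quarters),
    }
```

The three quality counts in the result always add up to 4, one for each 15-minute interval in the hour.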

Table 7: Additional fields shown in csv file

Field name | Description | Source
Time Period | The range of times. | For [Time Period Id] 0 to 95 the corresponding values are 00:00-00:15 to 23:45-00:00.
Day | The range of days of the week relating to the day type. | See Appendix B.
Day Category | A description of the day type. | See Appendix B.

Appendix A: In-filling

In-filling of journey times and flows is designed to generate a value for every link of the source networks (the HATRIS network in the case of GPS data) where values are missing. The reason could be that the equipment was faulty or that there were no observations recorded.

For in-filling journey times, a set of data with which to in-fill is identified as follows:

1. If there are observed values for the link in the 15-minute intervals immediately before and/or immediately after the missing time interval on the same day, vertical in-filling can be performed. For all sources except ITIS, time intervals ±1 are used, followed by ±2 if there is no observed data for ±1. For ITIS data, only one search of time intervals ±4 is made, as the data is hourly.
2. If there are no observed values available in step 1, horizontal in-filling is used. A set of days is assembled from the previous 5 years of data for the same day type (see Table 8). The days selected are the 10 nearest to the day being in-filled, or to the day exactly one year before, for which there was observed data. If D is the date (represented as a serial number with units of 1 day) being processed and d is the date of another record, then the proximity of date d to date D is D - d. The values used for in-filling under the nearest-10 system are the ten with the lowest proximity to either the day being in-filled or the day exactly one year (365 days [6]) before it.

It is possible that neither of these methods returns 10 records. In this case, the number of records available is used. If no records meet the criteria then a NULL value is returned, indicating that in-filling has failed.

The average journey time to use for in-filling is the median of the records in the set (note that for vertical in-filling a mean or median will produce the same result because the sample size is either 1 or 2).

Vertical in-filling is not used for flow data. The method described above is used, with some modifications to maximise the availability of a value for every site and 15-minute interval. The set of data used to in-fill flow for a given record is the 10 days nearest to the day being in-filled, or to the same day the previous year, of the same day type (or the primary-related day type, see Table 8). The primary-related day type is only used if no values are identified using the day type. If no data are available then in-filling cannot take place and the flow is assigned a NULL value. The in-filled value is the median of the 15-minute flows for the days selected.

[6] 366 in a leap year
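The two journey time in-filling steps can be sketched in Python. Several details are assumptions for illustration: the function names and input formats are hypothetical, the candidate days passed to the horizontal step are assumed to be already restricted to the same link, 15-minute interval and day type, and the proximity measure is taken as the smaller of the distance to the day being in-filled and the distance to the day 365 days before it (the leap year adjustment in the footnote is ignored).

```python
# Illustrative sketch only: names, inputs and the proximity measure are assumptions.
from statistics import median


def vertical_infill(day_values, index, offsets=(1, 2)):
    """In-fill from neighbouring intervals on the same day.

    `day_values` holds one value per 15-minute interval, None where missing.
    Use offsets=(4,) for hourly ITIS data.
    """
    for k in offsets:
        found = [
            day_values[j]
            for j in (index - k, index + k)
            if 0 <= j < len(day_values) and day_values[j] is not None
        ]
        if found:
            return median(found)
    return None


def horizontal_infill(target, candidates, n=10, year_days=365, max_years=5):
    """In-fill from the nearest days of the same day type.

    `candidates` maps a date to the observed value for the same link and
    interval; only earlier dates within `max_years` are considered.
    """
    scored = []
    for day, value in candidates.items():
        gap = (target - day).days  # D - d
        if value is None or gap <= 0 or gap > max_years * year_days:
            continue
        proximity = min(gap, abs(gap - year_days))
        scored.append((proximity, gap, value))
    scored.sort()
    chosen = [value for _, _, value in scored[:n]]
    return median(chosen) if chosen else None  # None stands in for NULL
```

A caller would try `vertical_infill` first and fall back to `horizontal_infill` only when it returns None, mirroring steps 1 and 2 above.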

Appendix B: Day types

The day type categories and primary-related day types used for in-filling flows are shown in Table 8.

Table 8: Day types

Day type | Description | Primary-related type | Day
0 | First working day of normal week | 7 | Monday - Friday
1 | Normal working Tuesday | 9 | Tuesday
2 | Normal working Wednesday | 9 | Wednesday
3 | Normal working Thursday | 9 | Thursday
4 | Last working day of normal week | 6 | Thursday - Friday
5 | Saturday, but excluding days falling within type 14 | None | Saturday
6 | Sunday, but excluding days falling within type 14 | None | Sunday
7 | First day of week - school holidays, but excluding days falling within type 12, 13 or 14 | 0 | Monday - Friday
9 | Middle of week - school holidays, but excluding days falling within type 12, 13 or 14 | 2 | Tuesday - Thursday
11 | Last day of week - school holidays, but excluding days falling within type 12, 13 or 14 | 4 | Thursday - Friday
12 | Bank Holidays, including Good Friday, but excluding days falling within type 14 | None | Any
13 | Christmas period holidays between Christmas Day and New Year's Day (not Saturday or Sunday) | None | Monday - Friday
14 | Christmas Day or New Year's Day | None | Any

Appendix C: Quality assignment rules

The criteria used to assign the data quality levels High, Medium and Low vary between the sources. The definitions are given in Table 9.

Table 9: Quality assignment criteria

MIDAS
  High: (1) Observed data; (2) A minimum of 1 loop per 1km of link.
  Medium: (1) Not observed data; (2) A minimum of 1 loop per 1km of link.
  Low: Criteria for Medium quality not achieved.

ANPR
  High: (1) Observed data; (2) For links matching HATRIS link: (a) Average sample >5; (b) Minimum sample >0.
  Medium: (1) Not observed data; (2) For links matching HATRIS link: (a) Average sample >5; (b) Minimum sample >0.
  Low: Criteria for Medium quality not achieved.

ITIS
  High: Not applicable.
  Medium: (1) Observed data; (2) For ITIS links matching HATRIS link: (a) Average car sample and average HGV sample >5; (b) Minimum car sample and minimum HGV sample >0.
  Low: Criteria for Medium quality not achieved.

GPS
  High: (1) Observed data; (2) Sample > 6.
  Medium: (1) Observed data; (2) 6 ≥ Sample > 4.
  Low: Criteria for Medium quality not achieved.

Appendix D: Adjusted flow

The field [Total Flow, avg'd over day type] is known variously as adjusted flow or estimated flow. This is an indicative value of flow for each link and day type (see Appendix B for definition of day types), which is used for calculations of delay. It is needed because during an incident the flow drops due to drivers diverting or deciding not to travel. To accommodate this behaviour, the delay is calculated using the adjusted flow, which represents the normal flow on the link at that time on a day of the same day type.

The adjusted flow is the median of the observed flow values for 10 records selected using very similar rules to those used for in-filling (see Appendix A). The 10 days nearest to the day being processed, or to the same day the previous year, of the same day type for which there was non-null data are selected (i.e. in-filled data is used). The primary-related day types are not used for calculating the adjusted flow.

If D is the date (represented as a serial number with units of 1 day) being processed and d is the date of another record, then the proximity of date d to date D is D - d. The values used under the nearest-10 system are the ten with the lowest proximity to either the day being processed or the day exactly one year (365 days [7]) before it.

[7] 366 in a leap year
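As an illustration of how the criteria in Appendix C translate into a rule, the GPS row of Table 9 can be sketched in Python. The function name and arguments are hypothetical, and the Medium band is read as a sample size greater than 4 and no more than 6, which is the range left over once the High criterion is excluded.

```python
# Illustrative sketch only: function name and arguments are hypothetical.
def gps_quality(observed, sample):
    """Assign High, Medium or Low quality to a GPS journey time record."""
    if observed and sample > 6:
        return "High"
    if observed and sample > 4:
        return "Medium"
    # Criteria for Medium quality not achieved (includes in-filled records).
    return "Low"
```

Under this reading, an observed record with a sample of 6 is Medium, and any in-filled GPS record is Low regardless of sample size.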