1 Develp Agency SPF Frm SafetyAnalystWiki Cntents [hide] 1 Safety Perfrmance Functins 1.1 What SPFs Are Needed 1.2 Functinal Frm f SPFs 1.3 Data Needs fr Develpment f SPFs 1.4 Statistical Assumptins and Sftware 1.5 References Safety Perfrmance Functins This dcument prvides guidance n the develpment f safety perfrmance functins (SPFs) fr use with the SafetyAnalyst sftware. SPFs are prvided in SafetyAnalyst and autmatically calibrated using each agency s data, s it is nt necessary fr agencies t develp their wn SPFs. Hwever, since sme agencies may prefer t implement SafetyAnalyst using SPFs develped with their wn agency s data; this mem prvides guidance n the apprpriate prcedures fr SPF develpment. SPFs are regressin relatinships between target accident frequencies and traffic vlumes that can be used t predict the expected lng-term accident frequency fr a site. SPFs are used in the EB methdlgies that estimate the safety perfrmance f sites in several f the analytical tls prvided with SafetyAnalyst. Thus, SPFs are essential t the functining f thse tls. What SPFs Are Needed Within SafetyAnalyst, SPFs are needed fr three types f sites (radway segments, intersectins, and ramps) and fr several subtypes f thse site types. Site type and subtypes are based upn radway characteristics and ther site characteristics. The site types and subtypes fr which SPFs are needed are as fllws: Radway Segments Rural tw-lane highway segments Rural multilane undivided highway segments Rural multilane divided highway segments Rural freeway segments 4 lanes Rural freeway segments 6+ lanes Rural freeway segments within an interchange area 4 lanes Rural freeway segments within an interchange area 6+ lanes Urban tw-lane arterial segments Urban multilane undivided arterial segments Urban multilane divided arterial segments Urban ne-way arterial segments Urban freeway segments 4 lanes Urban freeway segments 6 lanes
2 Urban freeway segments 8+ lanes Urban freeway segments within an interchange area 4 lanes Urban freeway segments within an interchange area 6 lanes Urban freeway segments within an interchange area 8+ lanes Intersectins Rural three-leg intersectins with minr-rad STOP cntrl Rural three-leg intersectins with signal cntrl Rural fur-leg intersectins with minr-rad STOP cntrl Rural fur-leg intersectins with all-way STOP cntrl Rural fur-leg intersectins with signal cntrl Urban three-leg intersectins with minr-rad STOP cntrl Urban three-leg intersectins with signal cntrl Urban fur-leg intersectins with minr-rad STOP cntrl Urban fur-leg intersectins with all-way STOP cntrl Urban fur-leg intersectins with signal cntrl Ramps Rural diamnd ff-ramps Rural diamnd n-ramps Rural parcl lp ff-ramps Rural parcl lp n-ramps Rural free-flw lp ff-ramps Rural free-flw lp n-ramps Rural direct r semidirect cnnectin ramps Urban diamnd ff-ramps Urban diamnd n-ramps Urban parcl lp ff-ramps Urban parcl lp n-ramps Urban free-flw lp ff-ramps Urban free-flw lp n-ramps Urban direct r semidirect cnnectin ramps Fr each site subtype, SPFs are needed bth fr ttal (TOT) (i.e., all accident severity levels cmbined) and fatal-and-injury (FI) (i.,e., all accidents in which a fatality ccurred and all accidents in which a persnal injury f any severity level ccurred) accidents. N SPFs are needed t estimate the frequency f fatal and severe injury (FS) accidents and prpertydamage-nly (PDO) accidents. Fr FS accident frequencies, estimates are calculated within SafetyAnalyst by using the FI SPF multiplied by the prprtin f FS accidents ut f all FI accidents. Similarly, when PDO accident frequency values are needed, they are determined within SafetyAnalyst as the difference between TOT and FI accident frequencies. Functinal Frm f SPFs The SPFs needed fr SafetyAnalyst predict accident frequency as a functin f annual average daily traffic (ADT) vlume alne. Fr radway segments and ramps, the independent variable representing traffic vlume is the ADT f the radway segment r ramp. Fr intersectins, tw independent variables represent traffic vlume, the ADTs f the tw intersecting rads (classified as the majr and minr rad, where the majr rad is typically defined as the rad with the higher ADT).
3 The dependent variable f the SPFs (i.e., the variable whse value is predicted by the mdel) is accidents per mile per year fr radway segments and ramps and accidents per year fr intersectins. The functinal frm fr radway segment SPFs is: where: ( 1 ) κ = predicted number f target accidents per mile per year ADT = average annual daily traffic vlume (veh/day) fr the radway segment fr bth directins f travel cmbined Since κ is expressed in terms f target accidents per mile per year, Equatin (1) is equivalent t: where: ( 2 ) N = predicted number f target accidents per site per year L = length (in miles) f the radway segment The functinal frm fr intersectin SPFs is: where: ( 3 ) κ = predicted number f target accidents per intersectin per year MajADT = average annual daily traffic vlume n the majr rad (veh/day) fr bth directins f travel cmbined MinADT = average annual daily traffic vlume n the minr rad (veh/day) fr bth directins f travel cmbined The functinal frm fr ramp SPFs is: where: ( 4 ) κ = predicted number f target accidents per mile per year ADT = average annual daily traffic vlume (veh/day) fr the ramp Since κ is expressed in terms f target accidents per mile per year, Equatin (4) is equivalent t: ( 5 )
4 where: N = predicted number f target accidents per site per year L = length (in miles) f the ramp In all three equatins, a, b, and c represent the regressin parameters that are estimated frm the available data. Since the default SPFs prvided within SafetyAnalyst were develped using data frm multiple states, a calibratin prcedure is included as part f the data imprt and preprcessing prcedures. Yearly calibratin factrs are calculated within SafetyAnalyst during the data imprt and preprcessing prcedures using an agency s wn accident and traffic vlume data and the default SPFs prvided with SafetyAnalyst. The calibratin factrs are intended t accunt fr differences in accident patterns in different gegraphical areas that are nt directly addressed by the SPFs and prvide accident predictins that are cmparable t the estimates that a highway agency wuld btain frm SPFs develped using its wn accident recrds system. The yearly calibratin factr fr a given year and site subtype is calculated as the rati f the sum f bserved accidents fr all sites fr a specific site subtype t the sum f the predicted accidents fr the same sites using the ADT and accident cunt values fr that year. When an agency develps its wn SPFs fr use within SafetyAnalyst, the yearly calibratin factrs fr use with these agency defined SPFs are by definitin 1.0. Data Needs fr Develpment f SPFs All f the data needed fr the creatin f SPFs is the same data that is needed t perate and use the SafetyAnalyst sftware. As a result, it is highly recmmended that an agency imprt their data int SafetyAnalyst, run the preprcessing prgrams n the data, then exprt the data fr use in the develpment f SPFs. Alternatively, users may independently create the necessary data as described in the remaining part f this sectin. The creatin f SPFs requires data n the lcatin infrmatin, radway characteristics, traffic vlumes, and accidents. Each f these types f data and their prcessing requirements are described next. The radway characteristics, which shuld be the current characteristics at a site, are used t define the site types and subtypes identified in the first sectin f this dcument. The lgic and data used t create the subtypes in SafetyAnalyst ut f agency data are described in the Site Subtype Assignment dcument n the SafetyAnalyst Wiki. Please nte that SafetyAnalyst des this assignment autmatically with agency data when imprting the data. Hwever, users will have t assign site subtypes themselves when using their wn data t develp SPFs. Additinally, segment length data is needed fr radway segments and ramps. Sites with missing length shuld nt be used t develp SPFs. Finally, radway segments may als require sme additinally prcessing based n their gemetrics. Regressin mdels fr radway segments rely n bserved accidents at a site. As radway accidents ccur infrequently ver a perid f time, statistically significant regressin mdels fr radway segments are ften nly btained by cnsidering radway segments f a certain minimal length, usually between 0.04 and 0.1 miles. Fr sme agencies t btain valid mdels, smaller radway segment recrds may need t be cmbined int lnger radway segments t meet that minimum length; such lnger segments shuld be as hmgeneus as pssible with respect t key variables such as: area type, terrain, functinal class, number f thrugh lanes, auxiliary lanes, lane width, median type, median
5 width, shulder type, shulder width, access cntrl, driveway density, ADT, psted speed limit, tw-way vs. ne-way peratin, bikeway, and interchange influence area. Traffic vlume data shuld be supplied fr each calendar year fr which accident data are used in SafetyAnalyst. Fr radway segments, traffic vlume shuld be fr bth directins, as the SPFs predict accidents fr bth directins. Sme states keep separate recrds fr traffic vlumes and accidents in each directin f travel n divided highways. If a state develps an SPF fr ne directin f travel n a divided highway in the fllwing frm: ( 6 ) It shuld be cnverted t a tw-directinal SPF fr use in SafetyAnalyst as fllws: ( 7 ) where value f the intercept term wuld be entered int SafetyAnalyst as 2(0.5) b e α. Fr intersectins, yearly ADT are needed fr bth the majr and minr rads, where the majr rad is generally defined t be the rad with the highest ADT. If tw majr-r minrrad legs f an intersectin have different ADTs, the higher f the tw ADT values shuld be used. Years fr which ADT are missing can be estimated by interplatin r extraplatin. Sites that are missing ADT infrmatin fr all years shuld be excluded frm SPF develpment. Lcatin infrmatin is needed fr sites t match/merge accidents recrds t the sites. Accidents shuld be assigned t nly ne site and be related t the type f site. Fr example, radway segments shuld nt be matched with any intersectin-related r ramprelated accidents. Accidents that ccur at an intersectin (within the curbline limits) r that are classified as intersectin-related and ccur within 250 ft f an intersectin shuld be assigned t the intersectin. Accidents shuld be assigned t radway segments and ramps by matching the lcatin infrmatin. If an accident ccurs n the pint between tw radway segments then it shuld be assigned t either the beginning segment r ending segment s that all similarly situated accidents are assigned in the same way. If an agency has difficulty in fllwing these rules fr accident assignment exactly because f data limitatins, they shuld be fllwed as clsely as pssible. Accident data shuld als include infrmatin n the severity f the accident. All reprted accidents, including fatal, injury, and prperty-damage-nly accidents shuld be included in the develpment f the ttal accident SPFs. The accident severity data shuld then be used t identify which accidents shuld be cnsidered in the develpment f fatal and injury SPFs. Accident data shuld be prvided fr a minimum f three years (preferably five) up t a maximum f ten years. The histrical accident data shuld include whle calendar years fr each year f data. Statistical Assumptins and Sftware SPFs are usually develped with negative binmial regressin analysis, but can be develped with Pissn regressin analysis, depending n the relatinship between the mean and variance in the data. When accident frequency data have the same mean and variance, then accident frequency fllws a Pissn distributin. Alternatively, when the accident variance exceeds the mean, r the data are verdispersed, then accident frequency fllws a negative binmial (NB) distributin. In fact, the variance f a NB distributin is a functin f its mean (i.e., the mean plus a dispersin parameter multiplied by the square f
6 the mean). In this way, when the dispersin parameter nears zer, the NB distributin appraches the Pissn distributin. Since mst accident frequencies are verdispersed, NB regressin is typically used. Hwever, Pissn regressin is an acceptable substitute if the dispersin parameter is near zer. The parameters f these distributins can be indirectly estimated using a generalized linear mdel t btain the mdel regressin cefficients shwn in the functinal frm sectin f this dcument. Generalized linear mdels are extensins f cnventinal linear mdels where the dependant variable is related t the linear independent variables thrugh a nnlinear link functin and the dependent variable is generated frm a distributin functin in the expnential family. Several cmmercially available statistical sftware packages ffer generalized linear mdel prcedures that can be used t estimate the regressin cefficients. In particular, the use f the prcedure, PROC GENMOD 1, f the statistical package SAS will be described in this sectin. In the GENMOD prcedure, the mdel regressin cefficients are estimated by the methd f maximum likelihd using a ridge-stabilized Newtn-Raphsn algrithm. The asympttic nrmality f maximum likelihd estimates is used t btain test f significance f the parameters and gdness f fit measures fr the mdels. T perfrm an analysis in this prcedure, several data items will need t be specified: dataset, dependent variable and distributin, independent variable(s), link functin, and ffset factr. The dataset used by the prcedure shuld be rganized as ne recrd per site and cntain the site subtype, dependent variables, independent variable(s), and ffset factr. The minimum number f sites needed fr mdel cnvergence is dependent n a number f accidents ccurring at the sites. Site types that experience relatively high accident frequencies, such as urban intersectins, will require fewer sites and/r fewer years f accident histry. In cntrast, rural radway segments generally need mre sites and/r mre years f accident histry. Pissn and negative binmial regressin are perfrmed with a lg link functin, which is smetimes referred t as lglinear mdeling. A lg linear relatinship between the accidents and ADT ensures that predictins frm the fitted mdel are psitive. The dependent variables, r variables whse values are predicted by the mdel, are the number f TOT r FI accidents that are predicted t have ccurred at each site during the histry perid. Cnsequently, fr SPF develpment in SafetyAnalyst there shuld be tw accident variables in the dataset, ne cntaining the cunt f all (i.e., TOT) accidents ccurring at the site during the histry perid and ne cntaining a similar cunt f fatal and injury (i.e., FI) accidents. Hwever, individual mdels are t be created fr each dependent variable. Traffic vlume(s) are the nly independent variable(s) cnsidered in the SPF frms shwn in Equatins 1 thrugh 3. As shwn in these equatins, these variables have a nnlinear relatinship with predicted accidents. As a result, the natural lgarithm f these variables shuld be used in the mdeling prcedure rather than the actual values. There des nt need t be a placehlder variable in the data fr the intercept, as it is autmatically prvided by the sftware. The ffset factr, r scale factr, serves t nrmalize the predicted accidents t a per mile per year basis, since accident frequencies, nt accident rates, are used in the mdel. Fr radway segments and ramps, the natural lgarithm f the length f the segment multiplied by the number f accident calendar years is included as a scale factr. Intersectins use the
7 natural lgarithm f the number f accident calendar years as a scale factr s predicted accidents are given n a per site per year basis. Once the dataset has been assembled fr regressin analysis, and the usual data quality checks fr utliers and mdel assumptins are cnducted, the fllwing SAS cde may be used t generate the SPF mdel cefficients: PROC GENMOD; BY SiteSubtype; MODEL TtAcc=lgADT / LINK=Lg DIST=NEGBIN OFFSET=lgLengthYrs; In this specificatin, TtAcc and lgadt are the dependent and independent variables (respectively) defined in the dataset, lglengthyrs is the ffset value defined in the dataset, DIST= ptin specifies the negative binmial distributin, LINK= ptin specifies lg-linear regressin mdel, and the BY SiteSubtype ptin creates a separate mdel fr each change in value f the SiteSubtype variable defined in the dataset. Several ther ptins are available that culd be specified in the abve statements t cntrl the algrithm and display additinal statistics (e.g., type I tests, type III tests, cnfidence intervals, etc.). In particular, using the cmbined ptins f DIST=NEGBIN SCALE=0 NOSCALE, can test fr verdispersin in a Pissn mdel. Overdispersin is assessed by testing whether the negative binmial dispersin parameter is equal t zer (i.e., when the negative binmial distributin is equivalent t the Pissn distributin). Mdel cnvergence, gdness-f-fit, and statistical significance f the cefficients can be assessed with the utput autmatically generated by the sftware. While all f the details f these cmpnents cannt be described in this dcument, sme basic guidelines n excluding mdels fr use in SafetyAnalyst are prvided in the remainder f this sectin. Cefficients frm mdels that d nt cnverge shuld nt be used in SafetyAnalyst. Nncnvergence f the Newtn-Raphsn algrithm can ccur fr sme datasets. Pr perfrmance can be the result f a number f factrs: ill-cnditined data (e.g., data that are extremely large r extremely small), a nn-psitive definite Hessian matrix can indicate linear dependencies amng the parameters, mdel misspecificatin, r vilatins f the errr assumptins. Hwever, the mst prbable explanatin fr nncnvergence is lack f event data (i.e., small accident cunts). Asympttic r large sample inference used by this prcedure, which is based n maximizing the likelihd functin, may nt be apprpriate in situatins where the ttal number f accidents in any site subtype grup is small. Fr thse grupings, exact Pissn regressin is a mre statistically valid apprach t get regressin estimates and p-values. Hwever, this capability is nt currently prvided in this prcedure, rather LgXact 4.1 sftware shuld be used. There are several gdness-f-fit statistics available by the prcedure. Hwever, the preferred fit statistics used by transprtatin researchers are usually calculated frm utput generated by the prcedure rather than the prcedure itself. Fr example, the Freeman- Tukey R 2 cefficient 2 (RFT 2 ) has been presented in Appendix J f the User Manual fr the SPFs that have been develped fr SafetyAnalyst. This gdness-f-fit statistic is nt available within the sftware prcedure and must be calculated utside f the prcedure. As bserved in Appendix J, selectin f mdels fr use with SafetyAnalyst was nt limited by the value f this gdness-f-fit statistic, particularly if it was the nly available mdel. Hwever, an agency develping their wn SPFs may cmpare and cnsider this gdnessf-fit statistic when chsing between their wn SPFs r the nes prvided by SafetyAnalyst.
8 Cefficient estimates prvided by the prcedure shuld be assessed by their magnitude and directin as well as their statistical significance. Fr example, a negative cefficient fr ADT, indicating accidents decrease as ADT increase, is prbably never apprpriate. Als, the significance level used t assess cefficient estimates is mre relaxed in transprtatin research. A significance level f 10% is generally used t assess cefficient estimates fr ADT n radways, ramps, and the majr rad f intersectins. Hwever, a significance level f 20% is usually cnsidered fr minr rad ADT n intersectins. Finally, the estimate f the dispersin parameter shuld always be psitive. That is, a negative value fr the dispersin parameter shuld indicate that there are prblems with the mdel, even thugh n warnings are issued by the SAS prcedure. Hwever, if the value is clse t zer, then the mdel shuld be re-tried assuming a Pissn distributin. Since the EB-methdlgy used in SafetyAnalyst requires a value fr the dispersin parameter, SPFs generated frm Pissn distributins shuld assume a small psitive value, like 0.3. References 1 t010.htm 2 Fridstrm, L., et al. Measuring the Cntributin f Randmness, Expsure, Weather, and Daylight t the Variatin in Rad Accident Cunts, Accident Analysis and Preventin (1995), Vl. 27, pp