A STRATIFIED SAMPLING PLAN FOR BILLING ACCURACY IN HEALTHCARE SYSTEMS

Jirachai Buddhakulsomsiri (a), Parthana Parthanadee (b), Swatantra Kachhal (a)
(a) Department of Industrial and Manufacturing Systems Engineering, University of Michigan-Dearborn
(b) Department of Agro-Industry Technology Management, Kasetsart University, Thailand

Abstract

This study presents a stratified sampling plan for estimating the accuracy of billing performance for the claims submitted to third party payers in healthcare systems. The population consists of hospital claims with amounts ranging from zero, hundreds, thousands, to rare high million dollars. Accuracy of the billing process is estimated by auditing a sample of claims with two measurements: the overall percent accuracy and the total dollar accuracy. Difficulties in constructing the sampling plan arise when the number of strata and their boundaries are unknown, and when the two measurements require different sampling schemes. The proposed sampling plan is designed to perform effectively for estimating both measurements. It determines an overall sample size and tests various numbers of strata to find an appropriate stratification. The optimal stratum boundary points are found using the rectangular stratification method on the claim dollar amounts. The overall sample is then assigned to strata with a mixed strategy between the proportional and optimal allocations, and finally the accuracy estimates and their precisions are obtained. The sampling plan is tested on an actual population obtained from the insurance industry with simulated claim errors. The results show the effectiveness of the plan for both accuracy measurements.

Introduction

In healthcare systems, an interesting problem commonly encountered is the estimation of the accuracy performance in processing the hospital claims submitted to third party payers. Important characteristics of the claim population include highly positive skewness of the claim dollar amounts and a relatively low processing error rate. Two measures of accuracy performance are the population percent accuracy and the total dollar accuracy.
The percent accuracy is the percentage of the claims in the population that are processed correctly. The total dollar accuracy measures the dollar amount that should have been claimed. Statistical sampling is a widely used tool for this estimation. The sampling plan usually implemented for a population with such characteristics is stratified random sampling. In this sampling method, all claims in the population are divided into several subpopulations, called strata, from each of which a simple random sample is independently selected. The sampled claims are then audited so that the accuracy measures can be estimated. Stratification can usually provide a higher precision in the estimation if the population can be efficiently divided into homogeneous strata. However, when the population strata structure is unclear, two critical questions that must be addressed are the number of strata and their boundary points. In addition, when there are two different measures to be estimated simultaneously, as in this case, the sample must be appropriately assigned to strata so that both measures can be estimated with high precision.

This paper proposes a stratified sampling procedure that can be used to estimate both accuracy measures from the same sample. The steps in constructing the sampling plan, including determining the overall sample size, forming the strata, choosing the appropriate number of strata, and estimating the accuracy, are presented. An example of implementing the proposed plan on an actual hospital claim population is also provided.

Literature Review

Designing a stratified random sampling plan for accounting data involves a number of decisions. When the population strata structure is unknown, the auditor must first select the stratifying variable for constructing strata. Then, the number of strata and their boundaries must be determined. For the sample allocation, the Neyman or optimum allocation has been proven to minimize the variance of the estimate for a fixed total sample size and is often used in practice.
However, it can only be approximated, since the stratum variances, which are required in the computation, are typically not available. The number of strata is usually decided by weighing the precision gain from stratification as the number of strata increases against the cost of stratified sampling (Cochran, 1977). Homogeneous strata can be constructed by setting the boundaries so that a minimum variability within every stratum is attained. The set of equations to determine the optimum boundaries, under Neyman allocation, for a given number of strata was derived by Dalenius (1957). However, the equations are difficult to compute; thus, a number of approximate methods for strata boundary construction have been proposed by many researchers. Four such methods are briefly described next.

Mahalanobis (1952) and Hansen et al. (1953) propose a method to construct the strata boundaries by making the product of the stratum weight and the stratum mean (or the aggregate value) of the stratifying variable equal for all strata. Dalenius and Hodges (1959) showed that approximately optimum boundaries can be obtained by constructing equal intervals on the cumulative of the square roots of the frequency distribution of the stratifying variable. Ekman (1959) constructed the strata boundaries by equalizing the product of stratum weight and stratum range. Sethi (1963) constructed tables of the optimum boundaries for some standard continuous distributions, under Neyman, equal, and proportional allocations. If the distribution of the study population resembles one of the standard distributions, the boundaries are obtainable from the tables.

Hess et al. (1966) studied four sample allocation methods and four boundary construction methods on data for medical hospitals, which form a highly positively skewed population. It was found that, with optimum stratification, the equal allocation and the Neyman allocation perform the best in gaining variance reduction, while among the methods for constructing boundaries, Ekman's method, Sethi's method with adjustment, and Dalenius and Hodges's method with adjustment were found to perform the best. An empirical study of sampling on accounting populations was conducted by Neter and Loebbecke (1975). Four accounting populations were used to generate several study populations with various error rates. The behaviors of estimators on those generated populations under various sampling plans were reported. The study showed that satisfactory results were achieved by using stratified sampling on the accounting data.

Proposed Stratified Sampling Plan

The study populations of interest consist of hospital claims with amounts ranging from zero, hundreds, thousands, to rare high million dollars. The summary statistics of one of the populations are provided in Table 1.
The statistics show that the population is highly positively skewed, with a huge standard deviation. This characteristic is common for populations of accounting data (Neter and Loebbecke, 1975). The objective of the sampling is to collect evidential information to fairly assess the accuracy of claim processing operations on a quarterly basis. In this study, an error is defined as the difference between the processed amount and the audit amount of a claim. An overpaid claim implies a positive error, whereas an underpaid claim implies a negative error. From past estimates of accuracy, it was found that the populations have very low error rates (i.e. close to %), and there is no strong evidence that the error rate for high dollar claims is substantially different from that of low dollar claims. Based on this information and expert opinion, the following assumptions are made:

(1) The error rate is statistically the same throughout the population, regardless of the claim amounts,
(2) There is no significant correlation between the claim amount and the error amount, and
(3) An overpaid error amount cannot exceed its processed claim amount, whereas an underpaid error amount may.

Table 1: Frequency distribution and summary statistics of the claim amount population in one quarter (3 months)

| Claim amount ($) | Number of claims |
|---|---|
| 0 | ,59,067 |
| 0.01 to ,000 | 6,73,69 |
| ,000.0 to 0,000 | 40,89 |
| 0,000.0 to 00,000 | 8,03 |
| more than 00,000 | |
| Total | 9,38,80 |

| Statistic | Value |
|---|---|
| Total claim amount | $ 806,400,496 |
| Mean * | $ 63.7 |
| Standard deviation * | $ ,09.73 |
| Skewness * | 5.04 |
| Maximum | $ 67,796.59 |
| Minimum | $ 0 |

* These statistics are calculated only on non-zero dollar claims.

Two measurements of interest, the overall percent accuracy and the total dollar accuracy of the claims, are to be estimated using the proposed stratified sampling plan. The purpose of using stratification is to gain more precision in estimating the total dollar accuracy measurement. Stratification can reduce the effect of skewness in the claim amount population, which will lead to a reduction in the overall standard error of the estimates.
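The variance-reduction argument can be checked numerically. The sketch below is illustrative only: it draws a synthetic, positively skewed population from a lognormal distribution (an assumption, not the claim data of this study), cuts it at its quartiles (an arbitrary choice of boundaries), and compares the textbook variance of the estimated mean under simple random sampling with that under stratified sampling with proportional allocation. All function and variable names are hypothetical.

```python
import random
import statistics


def srs_variance_of_mean(values, n):
    """Variance of the sample mean under simple random sampling, with the fpc."""
    N = len(values)
    return (1 - n / N) * statistics.variance(values) / n


def stratified_variance_of_mean(values, boundaries, n):
    """Variance of the stratified mean under proportional allocation.

    With n_h = n * N_h / N the variance reduces to
    (1 - n/N) / n * sum_h W_h * S_h^2, where W_h = N_h / N.
    """
    N = len(values)
    strata = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        k = sum(v > b for b in boundaries)  # index of the stratum containing v
        strata[k].append(v)
    within = sum(len(s) / N * statistics.variance(s) for s in strata if len(s) > 1)
    return (1 - n / N) * within / n


# Synthetic skewed population (assumed lognormal; parameters are arbitrary).
random.seed(0)
population = [random.lognormvariate(4.0, 1.5) for _ in range(20000)]

# Three cut points at the quartiles give four strata.
quartiles = statistics.quantiles(population, n=4)

v_srs = srs_variance_of_mean(population, 500)
v_str = stratified_variance_of_mean(population, quartiles, 500)
print(f"stratified / SRS variance ratio: {v_str / v_srs:.3f}")
```

The ratio printed at the end is below one because stratification removes the between-strata part of the variance; how far below depends entirely on the assumed population and boundaries.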
Although the main reason for using stratified sampling is to better estimate the total dollar accuracy, the sampling plan will be used for both measurements. The procedure is described in the following subsections.

The variable chosen for stratification is the processed amount of hospital claims. There are two reasons for this choice: (1) the data are easy to obtain, and (2) the data can be used directly in calculating the measurements for assessing the accuracy of claim processing operations.

Determine the Overall Sample Size, n

The overall sample size n consists of three sample components, which are the sample sizes for (1) the zero dollar stratum, (2) the non-zero dollar strata, and (3) the rare high dollar stratum that will be 100% audited. The sample sizes are determined as follows. Table 2 lists the notation used in the calculation.
Table 2: Notation for the determination of sample sizes

| Symbol | Definition |
|---|---|
| n_a, n_b, n_c | sample sizes for estimating the population percent accuracy, for estimating the total dollar accuracy, and for 100% auditing of the rare high dollar stratum |
| N | the population size |
| n_a0 | the initial sample size calculated without the finite population correction (fpc) factor |
| P | the expected percent accuracy (from past quarter performance) |
| d | the desired precision level |
| Z_{α/2} | the standard normal variate associated with the level of confidence α |
| SD | the advance estimate of the standard deviation of the population |
| P_0 | the proportion of zero dollar claims in the population |

(1) Determine n_a: When processed claims are classified into two classes, correctly processed claims and processed claims with errors, the sampling procedure is called attribute sampling. The sample size for attribute sampling, n_a, can be calculated using Equations (1) and (2) as follows:

$$ n_{a0} = \frac{Z_{\alpha/2}^{2}\, P(1-P)}{d^{2}} \tag{1} $$

$$ n_a = \frac{n_{a0}}{1 + \dfrac{n_{a0}}{N}} \tag{2} $$

In a situation where the population size is very large, n_a0 can be used as an approximate sample size.

(2) Determine n_b: The sample size for total dollar accuracy is based on the use of interval estimation, as shown in Equation (3).

$$ n_b = \left( \frac{N\, Z_{\alpha/2}\, SD}{d} \right)^{2} \tag{3} $$

The desired precision parameter d in Equation (3) is defined as the amount of dollar error that the auditor is willing to accept in the estimation process. The advance estimate of the population standard deviation, SD, can be obtained from the error and the claim audit amounts from past samples using the difference estimation method.

(3) Determine n_c: The sample size n_c is obtained by setting a cut-off point for the rare high dollar stratum, which contains all claims with amounts higher than the cut-off point. In general, the auditor may choose the cut-off point depending on the amount of effort or resources allocated to the rare high dollar stratum. In this study, $00,000 was used as suggested by experts in the field.

After the three sample sizes are determined, the overall sample size consists of (1) the portion of n_a that is assigned to the zero dollar stratum using proportional allocation, (2) the maximum of n_b and the remainder of n_a, and (3) n_c; see Equation (4).
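The three components and their combination, as described in the paragraph above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and all input values are hypothetical, and Equation (3) is read here as n_b = (N Z SD / d)^2, the usual sample size for estimating a population total.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal variate Z_{alpha/2} for a confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def attribute_sample_size(P, d, N, confidence=0.95):
    """Equations (1) and (2): n_a0 without the fpc, then n_a with it."""
    z = z_value(confidence)
    n_a0 = z ** 2 * P * (1 - P) / d ** 2
    n_a = n_a0 / (1 + n_a0 / N)
    return n_a0, n_a


def dollar_sample_size(N, SD, d, confidence=0.95):
    """Equation (3), assumed to be n_b = (N * Z * SD / d)^2."""
    z = z_value(confidence)
    return (N * z * SD / d) ** 2


def overall_sample_size(n_a, n_b, n_c, P0):
    """Zero dollar share of n_a, plus the larger of n_b and the rest of n_a, plus n_c."""
    return P0 * n_a + max(n_b, (1 - P0) * n_a) + n_c


# Hypothetical inputs, chosen only to exercise the formulas.
n_a0, n_a = attribute_sample_size(P=0.97, d=0.015, N=1_000_000)
n_b = dollar_sample_size(N=1_000_000, SD=50.0, d=5_000_000)
n = overall_sample_size(n_a, n_b, n_c=10, P0=0.25)

# Sample sizes are rounded up to whole claims in practice.
print(math.ceil(n_a0), math.ceil(n_a), math.ceil(n_b), math.ceil(n))
```

With a large population the fpc changes n_a only marginally, which is why n_a0 is often an adequate approximation.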
$$ n = P_0\, n_a + \max\left[\, n_b,\ (1-P_0)\, n_a \,\right] + n_c \tag{4} $$

It is important to emphasize that in some situations where the desired precision is high, the sample size calculated using this procedure may require far more resources than are available to the auditor. An alternative and more practical way is to use the auditor's judgment to determine the overall sample size based on the amount of resources (e.g., time and man-hours) that are available. Nevertheless, it is strongly suggested that the overall sample size be at least n_a + n_c, so that at least the percent accuracy is estimated with the desired precision.

Design the Sampling Plan

Designing the sampling plan involves two major steps: forming the strata and allocating the overall sample. Two critical and interrelated decisions to be made in forming the strata are (1) the number of strata, and (2) the locations of the boundary points between strata.

(1) Number of Strata: Research studies in stratification suggest that the precision of the estimate is ordinarily not improved significantly by using beyond 0 strata (Arens and Loebbecke, 1981). Significant gains in precision usually are obtained from the first few strata; hence, forming only a few strata (perhaps 5 to 0 strata) will typically yield most of the possible gains from stratification.

(2) Stratum Boundary Points: To determine the boundary points, the approximate rectangular method, which implements the equal cumulative √f(y) rule (i.e. equal cumulative square root of frequency), is used. The reason for choosing this rule is that it can be easily implemented and has been shown to work well on populations with strong positive skewness (Hess et al., 1966, and Neter and Loebbecke, 1975). The procedure is as follows.

Step 1: Arbitrarily choose a number of intervals L (e.g. 00, 00). The larger the L, the finer the scale for the boundary points.

Step 2: Set up L intervals of claim amounts, each with an associated interval width ω_i, where i denotes the interval index. For convenience, the cut-off point for the rare high dollar stratum may be used in this step by setting each interval width equal to the cut-off point divided by L.
Note that the ω_i need not be of equal size.

Step 3: Count the number of claims in each interval, N_i.

Step 4: Calculate the frequency, f(y) = ω_i N_i, the square root of the frequency, √f(y), and the cumulative of the square root of the frequency for each interval.

[Footnote: The equal cumulative √f(y) rule was first proposed by Dalenius and Hodges (1959), later modified by Cochran (1963), and tested by Hess et al. (1966).]
Step 5: Determine the total value of the cumulative √f(y).

Step 6: Divide the total cumulative √f(y) by the desired number of strata H. The stratum boundary points are the boundary points of the intervals that are approximately equal in width on the cumulative √f(y) scale.

The proposed allocation method is a mixed strategy between two common sampling allocations: proportional allocation and optimal allocation. In general, proportional allocation should be used when different parts of the population are to be proportionally represented in the sample. It is therefore appropriate for estimating the percent accuracy, since it is assumed that the percent accuracy is approximately the same throughout the population. In contrast, optimal allocation should be used when the variability of the measurement varies significantly across strata, which is the case for dollar accuracy. Optimal allocation assigns sample sizes to strata in proportion to the strata variability. That is, the strata with larger claim amounts have more variability than the smaller ones; thus, to increase the overall precision of the estimate, the sampling fractions in those strata should be increased. For each stratum h, the stratum standard deviation S_h and the number of claims in the stratum N_h are determined. The sample size for total dollar accuracy is then optimally allocated, giving the stratum sample size n_h, using Equation (5) if the remaining sample units equal n_b, or Equation (6) otherwise.

$$ n_h = n_b \cdot \frac{N_h S_h}{\sum_{k=1}^{H} N_k S_k} \tag{5} $$

$$ n_h = (1-P_0)\, n_a \cdot \frac{N_h S_h}{\sum_{k=1}^{H} N_k S_k} \tag{6} $$

Calculate the Estimates of the Accuracy Performance and Their Precisions

Table 3 presents the notation used in the estimation formulas. The population percent accuracy P̂ is estimated as follows. First, V̂ can be estimated using Equation (7),

$$ \hat{V} = \sum_{h=0}^{H+1} \sum_{i=1}^{n_h} v_{hi} \tag{7} $$

P̂ can be estimated using the traditional method as in Equation (8), or using Agresti's method (Agresti and Coull, 1998) as in Equation (9), which adds an adjustment to the traditional estimation method. It is important to note that in this calculation the stratified sample is treated as if it were a simple random sample because of the assumption that the error rate is constant throughout the population.
$$ \hat{P} = \frac{\hat{V}}{n} \tag{8} $$

$$ \hat{P} = \frac{\hat{V} + Z_{\alpha/2}^{2}/2}{n + Z_{\alpha/2}^{2}} \tag{9} $$

The (1-α)100% confidence interval of P̂ can be calculated using Equation (10),

$$ \hat{P} \pm Z_{\alpha/2} \sqrt{\frac{\hat{P}(1-\hat{P})}{n}} \tag{10} $$

Table 3: Notation for the estimation calculation

| Symbol | Definition |
|---|---|
| h | stratum index; h = 0, 1, 2, ..., H, H+1, where 0 denotes the zero dollar stratum, 1 to H denote the non-zero dollar strata, and H+1 denotes the rare high dollar stratum with 100% audit rate |
| x_hi | the processed amount of the i-th claim in the sample taken from stratum h |
| x̄_h | the mean processed amount of claims in the sample taken from stratum h |
| y_hi | the audit amount of the i-th claim in the sample taken from stratum h |
| ȳ_h | the mean audit amount of claims in the sample taken from stratum h |
| s_h^2 | the variance of the audit amount of claims in the sample taken from stratum h |
| e_hi | the error amount of the i-th claim in the sample taken from stratum h |
| v_hi | the binary variable indicating whether the i-th claim in the sample taken from stratum h is processed correctly |
| N_h | the size of stratum h |
| X_hi | the processed amount of the i-th claim in stratum h |
| X_h | the total processed amount of all claims in stratum h |
| X | the total processed amount of all claims in the population |
| V̂ | the total number of correct claims in all strata combined |
| P̂ | the estimate of the percent accuracy for the whole population |
| Ŷ | the estimate of the total dollar accuracy for the whole population |
| s^2(Ŷ) | the estimate of the variance of Ŷ |

The population total dollar accuracy is estimated using the difference estimation method. First, X_h is calculated using Equation (11),

$$ X_h = \sum_{i=1}^{N_h} X_{hi} \tag{11} $$

Ŷ and s^2(Ŷ) can be estimated using Equations (12) and (13), respectively, where n_h is the number of sampled claims in stratum h,

$$ \hat{Y} = \sum_{h=1}^{H+1} \left[ X_h + N_h \left( \bar{y}_h - \bar{x}_h \right) \right] \tag{12} $$

$$ s^{2}(\hat{Y}) = \sum_{h=1}^{H+1} \frac{N_h (N_h - n_h)}{n_h (n_h - 1)} \left[ \sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)^{2} + \sum_{i=1}^{n_h} (y_{hi} - \bar{y}_h)^{2} - 2 \sum_{i=1}^{n_h} (x_{hi} - \bar{x}_h)(y_{hi} - \bar{y}_h) \right] \tag{13} $$

Finally, the (1-α)100% confidence interval of Ŷ can be calculated using Equation (14),

$$ \hat{Y} \pm Z_{\alpha/2} \sqrt{s^{2}(\hat{Y})} \tag{14} $$

Computational Results

The Population

To test the proposed stratified sampling plan, an actual population of hospital claims is used. Since the true percent accuracy and the total dollar accuracy of the population are unknown, simulated errors were used to establish the target audit values for both measurements. Table 4 summarizes the population target values.

Table 4: Total audit values of the population

| Item | Value |
|---|---|
| Percent error rate | 3% |
| Total processed amount | $ 806,400,496 |
| Total simulated overpaid | $,55,04 |
| Total simulated underpaid | $ (3,3,768) |
| Total target audit amount | $ 798,07,60 |

The Tested Sampling Plans

For plan testing purposes, the overall sample size was chosen arbitrarily at 500, according to the auditor's current practice at the insurance company that provided the population data. The two types of plan tested are the simple random sampling (SRS) plan and the proposed plan with non-zero dollar strata ranging from to 0.

Overall Percent Accuracy

The test results are shown in Table 5, with all calculations done at the 95% confidence level. The estimate of percent error is obtained using both the traditional method and Agresti's method. Note that the calculations for the precision of the percent error are based on Agresti's estimate. The bases for plan comparisons are the percent error estimates and their precision. Each result is an average of 00 estimates from 00 samples. Only the results from the proposed stratified sampling plans with 3 to 10 strata (i.e. 1 to 8 strata for non-zero dollar claims) are shown in the table. From the results, it was found that all plans perform statistically the same in terms of the estimate and its precision. The traditional method slightly underestimates the percent error, whereas Agresti's method slightly overestimates it.
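The direction of the difference between the two estimators follows from Equations (8) and (9): the adjustment adds Z²/2 to the count and Z² to the sample size, which pulls the estimate toward 0.5 and therefore raises a small error rate. The sketch below illustrates this for a hypothetical audit result of 15 erroneous claims in a sample of 500; it applies the formulas to the error count rather than the correct count, which is equivalent because the two counts sum to n. The function name and inputs are assumptions, and the half-width uses n in the denominator as Equation (10) is read here.

```python
import math
from statistics import NormalDist


def proportion_estimates(count, n, confidence=0.95):
    """Traditional and Agresti-type point estimates of a proportion.

    Returns the traditional estimate count / n, the adjusted estimate
    (count + z^2 / 2) / (n + z^2), and the half-width of the confidence
    interval computed from the adjusted estimate.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    traditional = count / n
    agresti = (count + z ** 2 / 2) / (n + z ** 2)
    half_width = z * math.sqrt(agresti * (1 - agresti) / n)
    return traditional, agresti, half_width


# Hypothetical sample: 15 claims with errors out of 500 audited claims.
traditional, agresti, half_width = proportion_estimates(count=15, n=500)
print(f"traditional {traditional:.4f}, adjusted {agresti:.4f}, +/- {half_width:.4f}")
```

For rates well below 0.5 the adjusted estimate is always the larger of the two, so it is the more conservative choice when reporting an error rate.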
To be conservative in the estimation, Agresti's method is therefore recommended.

Total Dollar Accuracy

The bases for plan comparison for total dollar accuracy are (1) the average percent off-target, calculated from the differences between the processed claim amounts and the estimated audit amounts from the samples, and (2) the relative standard error, which is the ratio of the standard error of the estimate to the total audit value for the population. From Table 5, all plans perform statistically the same with respect to the average percent off-target. However, it is clear that the stratified sampling plans perform better than the SRS plan in terms of the relative standard error. Using the reduction in the relative standard error when the number of strata for the non-zero dollar strata is ranging from to 0, it was found that the appropriate number of strata for non-zero dollar claims ($0.01 to $00,000) should be approximately 8 (i.e. the total number of strata is 10). That is, adding more strata did not yield much more improvement in the precision.

Table 5: Estimate of performance accuracy for the population with 3% target error rate

| Plan | # of strata | Percent accuracy: Traditional | Percent accuracy: Agresti's | Percent accuracy: Precision | Total dollar accuracy: Relative Standard Error | Total dollar accuracy: % Off-Target |
|---|---|---|---|---|---|---|
| SRS | | 3.05% | 3.35% | .58% | 0.74% | 0.03% |
| Stratified sampling | 3 | .6% | 3.00% | .50% | 0.66% | 0.6% |
| | 4 | .77% | 3.4% | .53% | 0.66% | 0.05% |
| | 5 | .66% | 3.00% | .50% | 0.60% | -0.0% |
| | 6 | .87% | 3.3% | .55% | 0.6% | -0.03% |
| | 7 | .7% | 3.06% | .5% | 0.54% | 0.0% |
| | 8 | .73% | 3.3% | .53% | 0.5% | 0.0% |
| | 9 | .83% | 3.% | .55% | 0.58% | -0.0% |
| | 10 | .68% | 3.06% | .5% | 0.54% | 0.04% |

Finally, Table 6 presents the strata boundaries (from the rectangular method) and the allocation of the sample size of 500 (from optimal allocation) for the stratified sampling plan with 10 strata.

Table 6: The ten-stratum sampling plan with n = 500

| Stratum | Boundary | Stratum Sample Size |
|---|---|---|
| 1 | $0 | 3 |
| 2 | ($0, $40] | 3 |
| 3 | ($40, $0] | 47 |
| 4 | ($0, $50] | 34 |
| 5 | ($50, $650] | 39 |
| 6 | ($650, $,570] | 39 |
| 7 | ($,570, $3,960] | 39 |
| 8 | ($3,960, $0,430] | 39 |
| 9 | ($0,430, $00,000) | 89 |
| 10 | [$00,000, ∞) | |

Conclusion

The problem of estimating the accuracy performance for a population of hospital claims is presented in this paper. When the population strata structure is unknown, the rectangular stratification method is applied to the claim amounts to determine the optimal boundary points between strata, and the precision gain in the estimation process (i.e. the reduction in relative standard error) is used to identify the appropriate number of strata. The proposed sampling plan also implements a mixed strategy between proportional and optimal allocations so that both percent accuracy and total dollar accuracy can be estimated simultaneously. The plan is tested with an actual population obtained from the insurance industry. The test results show the effectiveness of the proposed plan in estimating both accuracy measures.

References

Agresti, A. and Coull, B. A. (1998). Approximate Is Better than Exact for Interval Estimation of Binomial Proportions, The American Statistician, 52 (2), 119-126.
Arens, A. A., and Loebbecke, J. K. (1981). Applications of Statistical Sampling to Auditing, Prentice Hall, Inc., New Jersey.
Cochran, W. G. (1977). Sampling Techniques, 3rd edition. John Wiley and Sons, Inc. New York.
Dalenius, T. (1957). Sampling in Sweden. Contributions to the Methods and Theories of Sample Survey Practice. Almqvist and Wiksell, Stockholm.
Dalenius, T. and Hodges, J. L., Jr. (1959).
Minimum Variance Stratification, Journal of the American Statistical Association, 54 (285), 88-101.
Ekman, G. (1959). An Approximation Useful in Univariate Stratification, The Annals of Mathematical Statistics, 30, 219-229.
Hansen, M. L., Hurwitz, W. N., and Madow, W. G. (1953). Sample Survey Methods and Theory, Volume. John Wiley and Sons, Inc. New York.
Hess, I., Sethi, V. K., and Balakrishnan, T. R. (1966). Stratification: A Practical Investigation, Journal of the American Statistical Association, 61 (313), 74-90.
Mahalanobis, P. C. (1952). Some Aspects of the Design of Sample Surveys, Sankhyā, 12, 1-7.
Neter, J., and Loebbecke, J. K. (1975). Behavior of Major Statistical Estimators in Sampling Accounting Populations: An Empirical Study. American Institute of Certified Public Accountants, Inc. New York.
Sethi, V. K. (1963). A Note on Optimum Stratification of Populations for Estimating the Population Means, The Australian Journal of Statistics, 5, 20-33.

Biographical Sketch

Jirachai Buddhakulsomsiri is an Assistant Professor in the Department of Industrial and Manufacturing Systems Engineering at the University of Michigan-Dearborn. He earned his Ph.D. degree in Industrial Engineering from Oregon State University in 2003. His research areas of interest include applied statistics, data mining, and project scheduling and management.

Parthana Parthanadee received her Ph.D. degree (2004) from the Department of Industrial and Manufacturing Engineering at Oregon State University, USA. Her research interests include applied operations research, vehicle routing, scheduling, and production planning. She is currently a faculty member in the Department of Agro-Industry Technology Management at Kasetsart University, Bangkok, Thailand.

Swatantra Kachhal, Ph.D., is a Professor and Chair of the Department of Industrial and Manufacturing Systems Engineering at the University of Michigan-Dearborn. He is a Fellow of the Institute of Industrial Engineers and a Fellow of the Healthcare Information and Management Systems Society. Dr. Kachhal is a past President of the Society for Health Systems and is currently serving as Technical Vice President for Society and Divisions for the Institute of Industrial Engineers.