Generation of Synthetic Data Sets for Evaluating the Accuracy of Knowledge Discovery Systems


Daniel R. Jeske
University of California, Department of Statistics, Riverside, CA

Behrokh Samadi
Lucent Technologies, Performance Analysis Department, Holmdel, NJ

Pengyue J. Lin, Lan Ye, Sean Cox, Rui Xiao, Ted Younglove, Minh Ly, Doug Holt, Ryan Rich
University of California, Riverside, CA

ABSTRACT
Information Discovery and Analysis Systems (IDAS) are designed to correlate multiple sources of data and use data mining techniques to identify potential events that could occur in the future. Application domains for IDAS are numerous and include the emerging area of homeland security. Developing test cases for an IDAS requires background data sets into which hypothetical future scenarios can be overlaid. The IDAS can then be measured in terms of false positive and false negative error rates. Obtaining the test data sets can be an obstacle due to both privacy issues and also the time and cost associated with collecting multiple instances of a diverse set of data sources. In this paper, we give an overview of the design and architecture of an IDAS Data Set Generator (IDSG) that will enable a fast and comprehensive test of an IDAS. The IDSG will generate data using syntactical rule-based algorithms and also semantic graphs that represent interdependencies between attributes. The IDSG tool will feature a default set of attribute generation capabilities as well as a wizard that allows users to grow the scope of data the tool can generate. We describe the data generation algorithms for some default attributes such as name, gender, age, address, phone number, driver's license number, social security number, credit card number, income level, occupation, and credit card transactions. The credit card transaction example illustrates the use of semantic graphs to capture attribute dependencies, and also illustrates our approach for approximating high dimensional multivariate distributions.

Categories and Subject Descriptors
D.2.5 [Testing and Debugging]: Testing Tools. D.2.4 [Software/Program Verification]: Statistical Methods, Validation.
I.2.4 [Knowledge Representation Formalisms and Methods]: Semantic Graphs.

General Terms
Algorithms, Design, Verification

Keywords
Information Discovery, Data Mining, Data Generation.

1. INTRODUCTION
1.1 Background
Modern data collection methods in combination with the rapid increases in information technology have made possible the assembling of extensive data sets on individuals. Information Discovery and Analysis Systems (IDAS) form an important tool for turning large quantities of data that have been collected into information that can be used. IDASs are designed to correlate multiple sources of data and use data mining techniques to find relationships within disparate data sets that could be used to predict events that occur in the future. An IDAS extracts information from data by finding patterns, threads, and relationships. To do this, IDASs use a hybrid of statistical and artificial intelligence methodologies such as pattern recognition, classification, categorization, and learning, and employ data mining tools based on neural networks, Bayesian networks, classification schemes, and regression models, among others, to cull out and aggregate information across different data sources. IDASs have been a major asset to business applications such as fraud prevention [2,7] and are in use in the medical field in a wide variety of applications including help in diagnosis [5] and analysis of medical videos [13]. A recent survey by the US General Accounting Office found that 52 federal agencies are conducting or plan to conduct 199 separate data mining efforts, with 131 of these currently operational [4]. It is believed that IDASs could be equally effective for intelligence applications such as providing leading indicators of terrorist acts.

1.2 Motivating Problem
A critical technical issue with IDASs is their ability to provide accurate inference. Given the diversity of techniques used to develop an IDAS, it is desirable to have a baseline approach for testing their ability to make accurate inference as well as their ability to deal with large input data sets with varying degrees of accuracy.
An important part of testing of IDASs is the generation of synthetic data sets for use in test cases. Developing test cases for an IDAS requires background data sets into which hypothetical future scenarios can be overlaid. The IDAS can then be measured in terms of false positive and false negative error rates. Obtaining the test data sets can be an obstacle due to both privacy issues and also the time and cost

associated with collecting multiple instances of a diverse set of data sources. In this paper, we give an overview of the design and architecture of an IDAS Data Set Generator (IDSG) that will enable a fast and comprehensive test of an IDAS. This IDSG is currently under development for the Department of Homeland Security (DHS) by the University of California, Riverside and Lucent Technologies. The specific goal is to design and develop a prototype system to explore the feasibility of synthesizing data sets for testing the effectiveness of IDASs designed to uncover potential threat scenarios. Such a data synthesis capability is an attractive alternative for testing an IDAS when realistic data sets are not readily available due to privacy or restrictive access issues, or when the challenge and cost of amassing diverse data from multiple sources is imposing. Further, a synthetic data capability allows various features and capabilities of an IDAS to be exercised in multiple test cases, thus permitting a meaningful figure of merit of its efficacy to be established.

There are several difficulties in the generation of IDAS test data in general, and for the DHS project in particular. General difficulties include:
Multivariate data: IDASs use large numbers of data fields with unknown correlations between fields.
Multiple data types: Data for these systems can be continuous, nominal, ordinal, textual, or image data.
Lack of available high dimension data: Readily available data, the type needed to quickly develop test cases, typically are presented in two or at most three dimensions.
DHS specific difficulties include:
Privacy issues: The Department of Defense, Office of the Inspector General issued a report finding that the Defense Advanced Research Projects Agency (DARPA) had not adequately considered privacy concerns associated with the Total Information Awareness Program terrorism data mining project [3].
Security issues: The need to maintain security for the DHS precludes direct interaction with the client to identify relevant data fields.
Lack of training data: Because of security needs, training data for evaluating the effectiveness of the data synthesis tools are not available.

The problem of developing a data set generator that can create realistic data for all possible IDAS data input types for all possible IDASs is formidable, and really is not our goal. Instead, our approach is to synthesize data sets of sufficient quality to enable IDAS evaluation. Our vision is an experimental platform that generates data sets containing user-designed threat scenario information as well as background data whose quality can be increased as needed by imposing relationships, correlations, and other statistical conditions on the data attributes within or across data sets.

1.3 Objective
Examples of generation of synthetic data through various means for testing purposes are available in many areas of research. For example, synthetic data sets were generated for testing a Robust/Resistant crystallographic refinement procedure using two sets of Gaussian random errors, a Gaussian error distribution contaminated with a fraction of draws from a high variance Gaussian distribution, and a long-tailed random error distribution [8]. Synthetic data has been generated for use in ground truth testing of different protein spot detection software packages through the use of a Gaussian mixture model obtained from training data [9]. Synthetic handwriting data have been generated using random perturbations of actual handwriting for use in testing cursive handwriting recognition systems [10]. Alternatively, the US Census Bureau is exploring the use of synthetic data as a means of protecting the confidentiality of the public while maintaining statistical quality [1]. In multivariate cases, particularly where there are complex covariance structures requiring an understanding of the interdependencies between variables, the data synthesis process becomes more complex. Central to the approach in this paper is the development of an object-oriented information model to represent data objects that readily permit imposing additional data structure and conditions on data attributes. Our synthetic data generation design is based on the use of semantic graphs.
For this project, a semantic graph is defined as a graphical depiction of the interrelationships between data fields of interest. Semantic graphs have been used in various fields for summarizing complex relationships between multiple factors. For example, document summaries have been produced using semantic sub-graphs to represent the various characteristics of the document but in a vastly reduced size [6]. Complex relationships between pairs of nouns in sentences have been summarized using automatic graph algorithms [11] as well. A key aspect of our objective to synthesize evaluative IDAS input data is to provide a scalable design that can be adapted to increase data quality, add data types, and address computational constraints dictated by our users. The rest of this paper is organized as follows. Section 2 describes the general architecture used in our data generation approach. Section 3 provides a few illustrative applications of our proposed design, with an example drawn from generating credit card transaction data. Section 4 summarizes the principal findings to date of this project.

2. ARCHITECTURE
2.1 User Scenario
To better demonstrate the functionality of IDSG, we present a typical user scenario in Figure 1. As shown in the figure, the user initially enters information on the type of data set, tables, table relationships, the attributes within each table, and the attribute relationship.

Figure 1. A Typical User Scenario

At this point, the user needs to define the relationships amongst the attributes. This will be done by a SemanticGraphBuilder. As an example, for a credit card transaction data set, the user may require two tables, CardHolder and CardTransaction, with a 1-to-many relationship between the CardHolder and the CardTransaction tables. The table CardHolder has attributes: Name, Address, TelephoneNumber, Date-of-Birth, Gender, Income, CardNumber, CustomerSinceDate. The CardTransaction table includes attributes CardNumber, TransactionDate, TransactionAmount, PurchasedItem, and BusinessName. For selection of the attributes, the user will be provided with a set of pre-defined attributes to choose from. However, if desired, an IDSG Wizard will allow the user to define new attributes. Once the above two sets of information are provided, IDSG may build the DataSetSchema. For the schema, the user needs to input the multiplicity relationship between the tables. In this example, the user needs to specify how many transaction records per CardHolder/card to generate. For this, IDSG will provide a list of probability distributions to select from. On the screen a partially constructed semantic graph will appear (Figure 2). The graph is partially constructed since some of the attributes or relationships may not have been predefined and thus need to be specified by the user. For example, the relationship between Age and Income may already exist, but the user may decide to add an additional relationship between Gender and Income. This will add an additional link to the semantic graph between the Gender and Income attributes and open a window for specification of the type of relationship. Alternatively, the user may decide to maintain the structure of the graph (nodes and edges) and only modify parameters. For example, the user may decide to change the parameters for the Gender distribution to 40-60, if that is the ratio of Male to Female credit card holders.

Figure 2. A Partially Constructed Semantic Graph

Once the schema and the semantic graph are represented, the user provides the information on other parameters.
Examples include the number of records in each table, the duration of time for the events, the geographic area (for addresses), the nationality (for names), and the type of items purchased (e.g., airline, hotels, and car rentals only). The above parameters further constrain the type of data generated. Once these parameters are input, the IDSG data generator module can generate the data sets and store them in some specific format. Initially, this format is a comma separated file.

2.2 Semantic Graphs and Related Tables
A key step in our methodology is the use of a semantic graph to represent dependency relationships among different data sets. While semantic graphs typically are used to summarize preexisting data, in our methodology the semantic graph serves as a guide for the creation of the synthetic data. The semantic graph is also used to illustrate the data generation sequence to be used in the synthesizing process. The first phase of the data generation process is the construction of the semantic graph. A key assumption in our methodology is that semantic graphs can adequately capture the relationships within the data. The IDSG tool uses a semantic graph that describes these data relationships. The basic elements of our semantic graphs are data attributes and directed dependencies. Ovals in the semantic graph represent various data attributes, and arrows indicate the dependency relationship between two data attributes. The number of arrows coming into a data attribute dictates the dimension of an inner table that holds information about conditional probability distributions for the data attribute. The conditional distribution of the data attribute is theoretically specified once the values of all of the upper level data attributes are known. Data attributes without incoming arrows, such as illustrated in Figure 3a, will be assigned a distribution from either the system or the user. Since this type of data object does not have incoming arrows, it is independent of other data attributes. The generation of this type of data attribute is done with simple random sampling.
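The generation sequence implied by a semantic graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph, the attribute names, and the probabilities below are hypothetical and are not taken from the IDSG tool. The graph is stored as a mapping from each attribute to the attributes it depends on, a topological sort gives an order in which parents are always generated before children, and attributes without incoming arrows are drawn by simple random sampling.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical semantic graph: attribute -> attributes it depends on.
parents = {
    "Gender": [],
    "AgeGroup": [],
    "Income": ["AgeGroup", "Gender"],
}

# Made-up distributions for the attributes without incoming arrows.
root_distributions = {
    "Gender": {"Male": 0.5, "Female": 0.5},
    "AgeGroup": {"18-34": 0.3, "35-54": 0.4, "55+": 0.3},
}


def generation_order(graph):
    """Order attributes so that every parent precedes its children."""
    return list(TopologicalSorter(graph).static_order())


def sample_root(name, rng):
    """Simple random sampling from the distribution of a root attribute."""
    dist = root_distributions[name]
    levels = list(dist)
    weights = [dist[level] for level in levels]
    return rng.choices(levels, weights=weights, k=1)[0]


rng = random.Random(0)
order = generation_order(parents)
# Only the independent attributes are generated here; dependent ones
# would be drawn from their inner (conditional) tables afterwards.
record = {name: sample_root(name, rng) for name in order if not parents[name]}
```

Attributes with incoming arrows, such as the hypothetical `Income` node, would be filled in next by looking up the conditional distribution that matches the already generated parent values.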

Figure 3. a) No incoming dependencies, b) single incoming dependency, and c) two incoming dependencies

To generate a data object with a single incoming arrow, such as Figure 3b, we will assume its upper level data attribute A is already generated and takes the value a. Then we can generate a realization of data attribute B according to the conditional probability distribution of B, given A = a, which is specified in the inner table associated with data attribute B. To generate a data attribute with multiple incoming arrows, such as Figure 3c, we need the conditional distributions of C, given all the possible combinations of values for A and B. The set of conditional distributions for C, given A and B, is shown in Table 1. Table 1 would, in theory, be the inner table associated with data attribute C. The table entry $p_{ij}$ is a $k \times 1$ probability vector associated with the k levels of C, given $A = a_i$ and $B = b_j$. Unfortunately, high dimensional distribution tables are difficult to obtain in practice without doing extensive customized survey work. The type of information that is more easily found is the set of (weighted) row and column averages shown in Table 1 as $\{p_{i\cdot}\}_{i=1}^{n}$ and $\{p_{\cdot j}\}_{j=1}^{m}$, which correspond to the conditional distributions of C, given A and given B, respectively. In Section 2.3, we describe an approximation to $p_{ij}$ of the form $\hat{p}_{ij} = \alpha p_{i\cdot} + (1-\alpha) p_{\cdot j}$, for $0 \le \alpha \le 1$. Clearly $\hat{p}_{ij}$ is also a probability vector, and the fact that it depends only on the more easily obtained row and column averages of Table 1 makes the estimator usable. Given $A = a_i$ and $B = b_j$, a value for C is then generated from the distribution $\hat{p}_{ij}$. In the next section, we elaborate on what value to use for the mixing parameter $\alpha$.

2.3 Approximating High-Dimensional Data Dependencies
As described in the previous section, we propose to approximate $p_{ij}$ with $\hat{p}_{ij} = \alpha p_{i\cdot} + (1-\alpha) p_{\cdot j}$. These approximations are an important part of our IDSG since obtaining tables such as Table 1 is difficult in practice, whereas obtaining information such as $\{p_{i\cdot}\}_{i=1}^{n}$ and $\{p_{\cdot j}\}_{j=1}^{m}$ is relatively easier. The motivation for approximations of the form $\hat{p}_{ij} = \alpha p_{i\cdot} + (1-\alpha) p_{\cdot j}$ increases when higher dimensional semantic graphs are utilized.
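For example, if a variable D has three inputs, say A, B, and C, then a 3-dimensional table with entries $p_{ijk}$ would be needed. Here, $p_{ijk}$ would correspond to the probability distribution for D, given $A = a_i$, $B = b_j$ and $C = c_k$. Such a table would be extremely difficult to obtain. However, it is quite conceivable that row, column and planar averages $\{p_{i\cdot\cdot}\}_{i=1}^{n}$, $\{p_{\cdot j\cdot}\}_{j=1}^{m}$ and $\{p_{\cdot\cdot k}\}_{k=1}^{r}$ would be available, and then $\hat{p}_{ijk} = \alpha_k p_{i\cdot\cdot} + \beta_k p_{\cdot j\cdot} + (1 - \alpha_k - \beta_k) p_{\cdot\cdot k}$ could be used to approximate $p_{ijk}$. For the 3-dimensional case, there are two mixing parameters $0 \le \alpha_k, \beta_k \le 1$. In the next section, we describe our methodology for determining the values of mixing parameters, describing the approach for the case of two dimensions where a single parameter $0 \le \alpha \le 1$ is required. Extensions to the higher dimensional cases are straightforward.

Table 1. Conditional Distributions for C, given A and B
A \ B            $b_1$       $b_2$       ...   $b_m$       Row Average
$a_1$            $p_{11}$    $p_{12}$    ...   $p_{1m}$    $p_{1\cdot}$
$a_2$            $p_{21}$    $p_{22}$    ...   $p_{2m}$    $p_{2\cdot}$
...
$a_n$            $p_{n1}$    $p_{n2}$    ...   $p_{nm}$    $p_{n\cdot}$
Column Average   $p_{\cdot 1}$   $p_{\cdot 2}$   ...   $p_{\cdot m}$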
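As a concrete reading of the two-input approximation, the following Python sketch mixes a row average and a column average into an estimated distribution. The function name `mix` and all numbers are hypothetical, chosen only to show that the result is again a probability vector; they do not come from the paper's tables.

```python
def mix(alpha, row_avg, col_avg):
    """Approximate p_ij by alpha * p_i. + (1 - alpha) * p_.j, entry by entry."""
    if alpha > 1.0 or 0.0 > alpha:
        raise ValueError("alpha must lie in [0, 1]")
    if len(row_avg) != len(col_avg):
        raise ValueError("both vectors need one entry per level of C")
    return [alpha * r + (1.0 - alpha) * c for r, c in zip(row_avg, col_avg)]


# Made-up inputs for an attribute C with three levels.
p_row = [0.6, 0.3, 0.1]  # hypothetical distribution of C given A = a_i
p_col = [0.2, 0.5, 0.3]  # hypothetical distribution of C given B = b_j
p_hat = mix(0.4, p_row, p_col)  # entries still sum to 1
```

Because the estimate is a convex combination of two probability vectors, its entries stay non-negative and sum to one for any mixing parameter between 0 and 1.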

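2.3.1 Algorithm for Mixing Parameter
The weighted row and column averages shown in Table 1 are functions of weights derived from the joint distribution for A and B, which is shown in Table 2. Here, $\pi_{ij} = \Pr(A = a_i, B = b_j)$.

Table 2. Joint Distribution of A and B
A \ B            $b_1$         $b_2$         ...   $b_m$         Marginal for A
$a_1$            $\pi_{11}$    $\pi_{12}$    ...   $\pi_{1m}$    $\pi_{1\cdot}$
$a_2$            $\pi_{21}$    $\pi_{22}$    ...   $\pi_{2m}$    $\pi_{2\cdot}$
...
$a_n$            $\pi_{n1}$    $\pi_{n2}$    ...   $\pi_{nm}$    $\pi_{n\cdot}$
Marginal for B   $\pi_{\cdot 1}$   $\pi_{\cdot 2}$   ...   $\pi_{\cdot m}$

It follows that the weighted row and column averages shown in Table 1 can be expressed as
$$p_{i\cdot} = \frac{\pi_{i1}}{\pi_{i\cdot}} p_{i1} + \frac{\pi_{i2}}{\pi_{i\cdot}} p_{i2} + \cdots + \frac{\pi_{im}}{\pi_{i\cdot}} p_{im},$$
$$p_{\cdot j} = \frac{\pi_{1j}}{\pi_{\cdot j}} p_{1j} + \frac{\pi_{2j}}{\pi_{\cdot j}} p_{2j} + \cdots + \frac{\pi_{nj}}{\pi_{\cdot j}} p_{nj}.$$

In general applications, the probability vectors $p_{ij}$ and $\pi = (\pi_{11}, \pi_{12}, \ldots, \pi_{nm})$ will not be known. We model the $p_{ij}$ vectors as independent identically distributed k-dimensional uniform Dirichlet random vectors, and $\pi$ as an nm-dimensional uniform Dirichlet random vector that is independent of all the $p_{ij}$. In some sense, our model corresponds to a Bayesian noninformative prior, though our use of the model is not within the typical Bayesian paradigm. Rather, we use this approach to define a structural characterization of the $p_{ij}$ and $\pi$ values that we can optimize against. While we cannot expect the Dirichlet assumptions to hold across all applications, it can be argued that it is the most reasonable thing to do in the absence of any other information. For a given realization of $(\{p_{ij}\}, \pi)$ we can find the optimal value of $\alpha$ by minimizing the total error sum of squares
$$Q = \sum_{i=1}^{n} \sum_{j=1}^{m} \left\| p_{ij} - \alpha p_{i\cdot} - (1-\alpha) p_{\cdot j} \right\|^2 .$$
It is easy to show that the minimizing $\alpha$, over the interval [0, 1], is the value
$$\hat{\alpha} = \min\left\{ 1, \max\left[ 0, \frac{\sum_{i}\sum_{j} (p_{ij} - p_{\cdot j})^{T} (p_{i\cdot} - p_{\cdot j})}{\sum_{i}\sum_{j} (p_{i\cdot} - p_{\cdot j})^{T} (p_{i\cdot} - p_{\cdot j})} \right] \right\}.$$
The quantity $\hat{\alpha}$ is a random variable, varying through the Dirichlet distributions used for the $p_{ij}$ and $\pi$. The distribution of $\hat{\alpha}$ depends on n, m and k, but does not depend on either $\pi_{ij}$ or $p_{ij}$. Our choice for the mixing parameter is the mean value of $\hat{\alpha}$, which we obtain numerically by the following algorithm:
1) Simulate nm independent realizations of $p_{ij}$ from a k-dimensional uniform Dirichlet distribution
2) Simulate one realization of $\pi$ from an nm-dimensional uniform Dirichlet distribution
3) Compute $\hat{\alpha}$
4) Repeat steps (1)-(3) a large number (e.g., 10,000) of times
5) Compute the average of the generated values of $\hat{\alpha}$

Table 3 shows the mean values for $\hat{\alpha}$ for selected choices of n, m and k. Similar tables can easily be built for higher dimensional problems.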
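The five simulation steps above can be sketched with the Python standard library alone. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, a uniform Dirichlet draw is obtained by normalizing independent unit-rate gamma draws, and the degenerate case in which all row and column averages coincide is assumed to map to a mixing value of 0.

```python
import random


def uniform_dirichlet(dim, rng):
    """One draw from a uniform Dirichlet distribution (all parameters 1)."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]


def alpha_hat(p, pi):
    """Clipped least-squares mixing value for one realization.

    p[i][j] is the k-vector p_ij; pi[i][j] is Pr(A = a_i, B = b_j).
    """
    n, m, k = len(p), len(p[0]), len(p[0][0])
    # Weighted row averages p_i. and column averages p_.j
    row = [[sum(pi[i][j] * p[i][j][c] for j in range(m)) / sum(pi[i])
            for c in range(k)] for i in range(n)]
    col = []
    for j in range(m):
        weight = sum(pi[i][j] for i in range(n))
        col.append([sum(pi[i][j] * p[i][j][c] for i in range(n)) / weight
                    for c in range(k)])
    num = den = 0.0
    for i in range(n):
        for j in range(m):
            for c in range(k):
                d = row[i][c] - col[j][c]
                num += (p[i][j][c] - col[j][c]) * d
                den += d * d
    if den == 0.0:
        return 0.0  # assumption: every alpha gives the same Q, so pick 0
    return min(1.0, max(0.0, num / den))


def mean_alpha(n, m, k, reps=10_000, seed=0):
    """Monte Carlo estimate of the mean of alpha-hat for given (n, m, k)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        p = [[uniform_dirichlet(k, rng) for _ in range(m)] for _ in range(n)]
        flat = uniform_dirichlet(n * m, rng)
        pi = [flat[i * m:(i + 1) * m] for i in range(n)]
        total += alpha_hat(p, pi)
    return total / reps
```

Calling `mean_alpha(3, 2, 2)` would give one entry of a table like Table 3; the exact figure depends on the seed and the number of repetitions.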
In a higher dimensional case where three inputs A, B and C drive the value of a 4th variable D, the table of mixing parameters would have columns for n, m, r and k, where r represents the number of levels for C and k is the number of levels of D. The algorithm described above is easily extended to compute the mean values of the two mixing parameters $\hat{\alpha}_k$ and $\hat{\beta}_k$.

Table 3. Mixing Parameter as a Function of (n, m, k) [columns: n, m, k, Mean of $\hat{\alpha}$]

2.3.2 Illustration
We now illustrate the procedure described in the previous section with a simple example involving expenditure patterns as a function of Age and Gender. In this example, Age and Gender correspond to the variables A and B, respectively. The variable C, expenditure type, is a two-level categorical variable that identifies an expenditure as either necessary (N) or unnecessary (U). Figure 4 is an appropriate semantic graph for this example, and we have n=3, m=2, and k=2. Our objective is to simulate a value for C, given we have already simulated values for A and B. In order to implement our methodology, we need the two tables shown in Table 4 and Table 5. Table 4 gives the proportion of N and U expenditures for each gender, and Table 5 gives the

proportion of N and U expenditures for each of the three age groups.

Table 4. Probability of N and U Expenditures for Gender [columns: Male, Female]
Table 5. Probability of N and U Expenditures for Age Group [columns: Age Group 1, Age Group 2, Age Group 3]

Suppose we wanted to simulate expenditure type for a Male in a particular Age Group, that is, for given levels $a_i$ and $b_j$ of A and B. Referring to Table 3, we find the mixing parameter for (k, n, m) = (2, 3, 2) is 0.44. We would thus estimate $p_{ij}$ using $\hat{p}_{ij} = 0.44\, p_{i\cdot} + (1 - 0.44)\, p_{\cdot j}$, whose two components are $\hat{\Pr}(C = N \mid \text{Male, Age Group})$ and $\hat{\Pr}(C = U \mid \text{Male, Age Group})$, with $p_{i\cdot}$ and $p_{\cdot j}$ taken from Tables 4 and 5. A simulated value of C, given $A = a_i$ and $B = b_j$, is then obtained by sampling randomly according to these two probabilities.

2.4 Output Formats
The IDSG must be capable of outputting generated synthetic datasets in portable formats. These data sets may be stored in volumes of CDs or DVDs. The datasets will be dynamically generated based on user specified parameters and are potentially very large, which means that portability, simplicity, and flexibility are the three main requirements. Storage of output data in a database, in flat files/CSV, or in XML are some of the main choices. Storage of the output in a database has enough flexibility to allow the user to preserve all data relationships and also to optimize the dataset size. But an Application Programming Interface (API) will be required for the IDAS, which increases the complexity and time of system development. We have chosen comma separated values (CSV) as the initial choice of IDSG for the output format since it is portable, simple and flexible. However, this requires the output of all data in flat format to preserve the data relationships. This will increase the dataset size significantly. This disadvantage can be offset by cheap storage media and requiring only one-time importing to an IDAS. Future versions of IDSG may include database formats as output. This decision will be made at some point in the future when we have more knowledge of users' IDASs and more development time.

3. ILLUSTRATIVE APPLICATIONS
3.1 Name Generation
In order to generate names, we first need to find a names database that we can sample from. We obtained our sets of names from online sources such as the Census Bureau. The names are then stored in the master database as male first name, female first name, and last name in separate files. Similar files are created for each nationality. When the user runs the generator, it will go through the database and randomly pick out a last name, then combine it with either a first name from the male list or the female list to generate a full name with gender. The user of the tool can also specify which nationality of origin the names should have, which will generate a more realistic dataset. The generator will go through the database and combine last name files from the specific nationality with first name files to generate names of the desired nationality. This process can be illustrated by using a semantic graph (Figure 6).

Figure 4. Name Generation Schematic. [nodes: Nationality, Gender, Last Name, First Name]

3.2 Credit Card Number Generation
To generate realistic credit card numbers, we use the semantic graph shown in Figure 7.

Figure 5. Credit card number semantic graph [nodes: Card Type, Card Issuer, MII, Credit Card Number]

The first digit on a credit card is the Major Industry Identifier (MII), which represents the source from where the credit card was issued. There are ten issuer categories, but for our purpose we will only use four of them for randomly generating a credit card number. For example, a credit card number starting with 6 is assigned for merchandising and banking purposes, such as in the case of the Discover card. Credit card numbers starting with 4

and 5 are used for banking and financing purposes, as in the case of Visa and MasterCard. The last issuer category we used is 3, which represents travel and entertainment use, for instance the American Express card. Table 6 is an overview of the rules for numbering credit cards. The first six numbers, including the MII, represent the issuer identifier. The rest of the digits on the credit card represent the cardholder's account number.

Table 6. Credit Card Parameters
Issuer             Identifier       Length (Numbers)
Discover           6011xx           16
Mastercard         51xxxx-55xxx     16
American Express   34xxx, 37xxx     15
Visa               4xxxxx           13, 16

The credit card number generation process begins by randomly generating the card type (Master, Visa, American Express, or Discover) depending on the probability distribution for the type of cards in circulation. After a card type is picked, an appropriate MII number is assigned to the card, which includes a randomly sampled number, up to 5 digits long, depending on the issuer type. The final step in the process is to generate a random string of numbers whose length depends on the card type generated. We use the Luhn algorithm to check whether or not the credit card number we randomly generated is valid. To use this algorithm, we first double the value of alternating digits of the card number, starting with the first digit for a 16-digit number (in general, every second digit counting leftward from the digit next to the last one). Next, we subtract 9 from those doubled numbers that are greater than 9. Finally, we sum the transformed numbers together with the digits that were not doubled and then divide by 10. If there is no remainder, the card number is considered valid. A Visa number that leaves no remainder under this check is therefore a valid Visa credit card number.

3.3 Credit Card Transaction Generation
In this section, we illustrate the generation of one particular aspect of a credit card purchase transaction: the expenditure category for a 30-year old person with an income of $50,000. Our example makes use of Figure 4, which shows how expenditure category (variable C) depends on age group (variable A) and income level (variable B). Given the estimated probabilities of the categories, a uniform random number generator can be used to randomly pick one of the categories according to those probabilities.

Table 7. Detailed Joint Distribution of Expenditure by Age.
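Picking a category with a uniform random number generator amounts to inverse-CDF sampling: the unit interval is cut into consecutive pieces whose lengths are the category probabilities, and the piece containing the uniform draw decides the outcome. The sketch below is illustrative only; the function name `pick_category` is hypothetical and the probabilities are made up, not the values of Tables 7 and 8.

```python
import random
from itertools import accumulate


def pick_category(categories, probabilities, u):
    """Return the category whose cumulative-probability interval contains u."""
    for category, upper in zip(categories, accumulate(probabilities)):
        if upper > u:
            return category
    return categories[-1]  # guards against rounding when u is very close to 1


categories = ["Food", "Housing", "Apparel", "Transportation",
              "Health Care", "Entertainment", "Other"]
# Made-up probabilities for illustration; they sum to 1.
p_hat = [0.15, 0.30, 0.05, 0.20, 0.05, 0.05, 0.20]

rng = random.Random(42)
draw = pick_category(categories, p_hat, rng.random())
```

Repeating the draw once per transaction would give expenditure categories whose long-run frequencies follow the supplied probabilities.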
Expenditure category has 7 levels: Food, Housing, Apparel, Transportation, Health Care, Entertainment, and Other. Table 7 shows the probability of each expenditure category as a function of Age Group. The columns of Table 7 correspond to the $\{p_{i\cdot}\}_{i=1}^{7}$ vectors in Section 2. Table 8 shows the probability of each expenditure category as a function of Income Level. The columns of Table 8 correspond to the $\{p_{\cdot j}\}_{j=1}^{9}$ vectors in Section 2. Tables 7 and 8 were obtained from the U.S. Census Bureau. To simulate an expenditure category for a 30-year old person with an Income Level of \$50,000, we use the algorithm described in Section 2.3, taking (k, n, m) = (6, 9, 7), to find that the mixing parameter is 0.47. The estimated probability distribution for expenditure category is thus $\hat{p}_{i8} = 0.47\, p_{i\cdot} + (1 - 0.47)\, p_{\cdot 8}$, where $a_i$ is the age group that contains age 30 and $b_8$ is the income level that contains \$50,000. The elements of $\hat{p}_{i8}$ represent the estimated probabilities of each expenditure category for this person.

Table 8. Joint Distribution of Expenditure by Income.

4. SUMMARY
We have described a design and architecture for a test data generation tool that can facilitate building test cases for IDAS tools. Our tool enables an IDAS developer to overcome time and cost issues associated with gathering real data to build test cases by making synthetic data available. Synthetic test data of sufficient quality is a viable economic alternative to gathering real data sets, which in many cases might not even be possible. The development interval for our tool extends through the calendar year 2005. We anticipate our data generation platform will play an important role in comparing the effectiveness of different IDASs. By overlaying specific scenarios on our background data sets, an IDAS evaluator can measure competing IDASs in terms of false negative and false positive error rates.

5. REFERENCES
[1] Abowd, J.M. and Lane, J.I. Synthetic Data and Confidentiality Protection. U.S. Census Bureau, LEHD Program Technical Paper No. TP-2003-10, (2003).
[2] Chan, P.K., Fan, W., Prodromidis, A.L., and Stolfo, S.J. Distributed Data Mining in Credit Card Fraud Detection. IEEE Intelligent Systems 14(6), (1999).
[3] Department of Defense, Office of the Inspector General. Information Technology Management: Terrorism Information Awareness Program. Report No. D (2004).
[4] General Accounting Office. Data Mining: Federal Efforts Cover a Wide Range of Uses. GAO (2004).

[5] Kusiak, A., Kernstine, K.H., Kern, J.A., McLaughlin, K.A., and Tseng, T.L. Data Mining: Medical and Engineering Case Studies. Proceedings of the Industrial Engineering Research 2000 Conference, Cleveland, Ohio, May 21-23, (2000), 1-7.
[6] Leskovec, J., Grobelnik, M., and Milic-Frayling, N. Learning Sub-structures of Document Semantic Graphs for Document Summarization. LinkKDD 2004, August 2004, Seattle WA, USA. (2004).
[7] Ormerod, T., Morley, N., Ball, L., Langley, C., and Spenser, C. Using Ethnography To Design a Mass Detection Tool (MDT) For The Early Discovery of Insurance Fraud. CHI 2003, April 5-10, 2003, Ft. Lauderdale, Florida, USA. ACM (2003).
[8] Prince, E., and Nicholson, W.L. A Test of a Robust/Resistant Refinement Procedure on Synthetic Data. Acta Cryst., A39, (1983).
[9] Rogers, M., Graham, J., and Tonge, R.P. Using Statistical Image Models for Objective Evaluation of Spot Detection in Two-Dimensional Gels. Proteomics, June, 3(6) (2003).
[10] Varga, T. and Bunke, H. Generation of Synthetic Training Data for an HMM-based Handwriting Recognition System. Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR 2003). IEEE Computer Society (2003).
[11] Widdows, D. and Dorow, B. A Graph Model for Unsupervised Lexical Acquisition. 19th International Conference on Computational Linguistics (COLING 19). Taipei, August (2002).
[12] Yun, W.T., Stefanova, L., Mitra, A.K., and Krishnamurti, T.N. Multi-Model Synthetic Superensemble Prediction System. Acta Cryst., A39, (1983).
[13] Zhu, X., Aref, W.G., Fan, J., Catlin, A.C., and Elmagarmid, A.K. Medical Video Mining for Efficient Database Indexing, Management, and Access. IEEE Int. Conf. on Data Engineering (ICDE 03), Bangalore, India, March 5-March 8, (2003).


More information

Review: Classification Outline

Review: Classification Outline Data Miig CS 341, Sprig 2007 Decisio Trees Neural etworks Review: Lecture 6: Classificatio issues, regressio, bayesia classificatio Pretice Hall 2 Data Miig Core Techiques Classificatio Clusterig Associatio

More information

Domain 1: Identifying Cause of and Resolving Desktop Application Issues Identifying and Resolving New Software Installation Issues

Domain 1: Identifying Cause of and Resolving Desktop Application Issues Identifying and Resolving New Software Installation Issues Maual Widows 7 Eterprise Desktop Support Techicia (70-685) 1-800-418-6789 Domai 1: Idetifyig Cause of ad Resolvig Desktop Applicatio Issues Idetifyig ad Resolvig New Software Istallatio Issues This sectio

More information

The Binomial Multi- Section Transformer

The Binomial Multi- Section Transformer 4/15/21 The Bioial Multisectio Matchig Trasforer.doc 1/17 The Bioial Multi- Sectio Trasforer Recall that a ulti-sectio atchig etwork ca be described usig the theory of sall reflectios as: where: Γ ( ω

More information

Regression with a Binary Dependent Variable (SW Ch. 11)

Regression with a Binary Dependent Variable (SW Ch. 11) Regressio with a Biary Deedet Variable (SW Ch. 11) So far the deedet variable (Y) has bee cotiuous: district-wide average test score traffic fatality rate But we might wat to uderstad the effect of X o

More information

INTEGRATED TRANSFORMER FLEET MANAGEMENT (ITFM) SYSTEM

INTEGRATED TRANSFORMER FLEET MANAGEMENT (ITFM) SYSTEM INTEGRATED TRANSFORMER FLEET MANAGEMENT (ITFM SYSTEM Audrius ILGEVICIUS Maschiefabrik Reihause GbH, [email protected] Alexei BABIZKI Maschiefabrik Reihause GbH [email protected] ABSTRACT The

More information

ODBC. Getting Started With Sage Timberline Office ODBC

ODBC. Getting Started With Sage Timberline Office ODBC ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

CDAS: A Crowdsourcing Data Analytics System

CDAS: A Crowdsourcing Data Analytics System CDAS: A Crowdsourcig Data Aalytics Syste Xua Liu,MeiyuLu, Beg Chi Ooi, Yaya She,SaiWu, Meihui Zhag School of Coputig, Natioal Uiversity of Sigapore, Sigapore College of Coputer Sciece, Zhejiag Uiversity,

More information

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the. Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).

More information

Subject CT5 Contingencies Core Technical Syllabus

Subject CT5 Contingencies Core Technical Syllabus Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S CONTROL CHART FOR THE CHANGES IN A PROCESS Supraee Lisawadi Departmet of Mathematics ad Statistics, Faculty of Sciece ad Techoology, Thammasat

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling

GSR: A Global Stripe-based Redistribution Approach to Accelerate RAID-5 Scaling : A Global -based Redistributio Approach to Accelerate RAID-5 Scalig Chetao Wu ad Xubi He Departet of Electrical & Coputer Egieerig Virgiia Coowealth Uiversity {wuc4,xhe2}@vcu.edu Abstract Uder the severe

More information

Pre-Suit Collection Strategies

Pre-Suit Collection Strategies Pre-Suit Collectio Strategies Writte by Charles PT Phoeix How to Decide Whether to Pursue Collectio Calculatig the Value of Collectio As with ay busiess litigatio, all factors associated with the process

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Determining the sample size

Determining the sample size Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors

More information

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation HP 1C Statistics - average ad stadard deviatio Average ad stadard deviatio cocepts HP1C average ad stadard deviatio Practice calculatig averages ad stadard deviatios with oe or two variables HP 1C Statistics

More information

Best of security and convenience

Best of security and convenience Get More with Additioal Cardholders. Importat iformatio. Add a co-applicat or authorized user to your accout ad you ca take advatage of the followig beefits: RBC Royal Bak Visa Customer Service Cosolidate

More information

CCH Accountants Starter Pack

CCH Accountants Starter Pack CCH Accoutats Starter Pack We may be a bit smaller, but fudametally we re o differet to ay other accoutig practice. Util ow, smaller firms have faced a stark choice: Buy cheaply, kowig that the practice

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

Business Rules-Driven SOA. A Framework for Multi-Tenant Cloud Computing

Business Rules-Driven SOA. A Framework for Multi-Tenant Cloud Computing Lect. Phd. Liviu Gabriel CRETU / SPRERS evet Traiig o software services, Timisoara, Romaia, 6-10 dec 2010 www.feaa.uaic.ro Busiess Rules-Drive SOA. A Framework for Multi-Teat Cloud Computig Lect. Ph.D.

More information

Confidence Intervals for Two Proportions

Confidence Intervals for Two Proportions PASS Samle Size Software Chater 6 Cofidece Itervals for Two Proortios Itroductio This routie calculates the grou samle sizes ecessary to achieve a secified iterval width of the differece, ratio, or odds

More information

1 Adaptive Control. 1.1 Indirect case:

1 Adaptive Control. 1.1 Indirect case: Adative Control Adative control is the attet to redesign the controller while online, by looking at its erforance and changing its dynaic in an autoatic way. Adative control is that feedback law that looks

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Unit 8: Inference for Proportions. Chapters 8 & 9 in IPS

Unit 8: Inference for Proportions. Chapters 8 & 9 in IPS Uit 8: Iferece for Proortios Chaters 8 & 9 i IPS Lecture Outlie Iferece for a Proortio (oe samle) Iferece for Two Proortios (two samles) Cotigecy Tables ad the χ test Iferece for Proortios IPS, Chater

More information

Distributed Storage Allocations for Optimal Delay

Distributed Storage Allocations for Optimal Delay Distributed Storage Allocatios for Optial Delay Derek Leog Departet of Electrical Egieerig Califoria Istitute of echology Pasadea, Califoria 925, USA derekleog@caltechedu Alexadros G Diakis Departet of

More information

INVESTMENT PERFORMANCE COUNCIL (IPC) Guidance Statement on Calculation Methodology

INVESTMENT PERFORMANCE COUNCIL (IPC) Guidance Statement on Calculation Methodology Adoptio Date: 4 March 2004 Effective Date: 1 Jue 2004 Retroactive Applicatio: No Public Commet Period: Aug Nov 2002 INVESTMENT PERFORMANCE COUNCIL (IPC) Preface Guidace Statemet o Calculatio Methodology

More information

Evaluation of Different Fitness Functions for the Evolutionary Testing of an Autonomous Parking System

Evaluation of Different Fitness Functions for the Evolutionary Testing of an Autonomous Parking System Evaluatio of Differet Fitess Fuctios for the Evolutioary Testig of a Autoomous Parkig System Joachim Wegeer 1, Oliver Bühler 2 1 DaimlerChrysler AG, Research ad Techology, Alt-Moabit 96 a, D-1559 Berli,

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means)

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means) CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:

More information

GOAL PROGRAMMING BASED MASTER PLAN FOR CYCLICAL NURSE SCHEDULING

GOAL PROGRAMMING BASED MASTER PLAN FOR CYCLICAL NURSE SCHEDULING Joural of Theoretical ad Applied Iforatio Techology 5 th Deceber 202. Vol. 46 No. 2005-202 JATIT & LLS. All rights reserved. ISSN: 992-8645 www.jatit.org E-ISSN: 87-395 GOAL PROGRAMMING BASED MASTER PLAN

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

Now here is the important step

Now here is the important step LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"

More information

The Stable Marriage Problem

The Stable Marriage Problem The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV [email protected] 1 Itroductio Imagie you are a matchmaker,

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

DAME - Microsoft Excel add-in for solving multicriteria decision problems with scenarios Radomir Perzina 1, Jaroslav Ramik 2

DAME - Microsoft Excel add-in for solving multicriteria decision problems with scenarios Radomir Perzina 1, Jaroslav Ramik 2 Itroductio DAME - Microsoft Excel add-i for solvig multicriteria decisio problems with scearios Radomir Perzia, Jaroslav Ramik 2 Abstract. The mai goal of every ecoomic aget is to make a good decisio,

More information

Chapter 7: Confidence Interval and Sample Size

Chapter 7: Confidence Interval and Sample Size Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

More information

Research Method (I) --Knowledge on Sampling (Simple Random Sampling)

Research Method (I) --Knowledge on Sampling (Simple Random Sampling) Research Method (I) --Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

Plug-in martingales for testing exchangeability on-line

Plug-in martingales for testing exchangeability on-line Plug-i martigales for testig exchageability o-lie Valetia Fedorova, Alex Gammerma, Ilia Nouretdiov, ad Vladimir Vovk Computer Learig Research Cetre Royal Holloway, Uiversity of Lodo, UK {valetia,ilia,alex,vovk}@cs.rhul.ac.uk

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

One-sample test of proportions

One-sample test of proportions Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:

More information

Detecting Voice Mail Fraud. Detecting Voice Mail Fraud - 1

Detecting Voice Mail Fraud. Detecting Voice Mail Fraud - 1 Detectig Voice Mail Fraud Detectig Voice Mail Fraud - 1 Issue 2 Detectig Voice Mail Fraud Detectig Voice Mail Fraud Several reportig mechaisms ca assist you i determiig voice mail fraud. Call Detail Recordig

More information

Baan Service Master Data Management

Baan Service Master Data Management Baa Service Master Data Maagemet Module Procedure UP069A US Documetiformatio Documet Documet code : UP069A US Documet group : User Documetatio Documet title : Master Data Maagemet Applicatio/Package :

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

INVESTMENT PERFORMANCE COUNCIL (IPC)

INVESTMENT PERFORMANCE COUNCIL (IPC) INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks

More information

GoVal Group Government Consulting and Valuation Advisory Group. real. Real expertise. Real choices. Real value.

GoVal Group Government Consulting and Valuation Advisory Group. real. Real expertise. Real choices. Real value. GoVal Group Govermet Cosultig ad Valuatio Advisory Group real. Real expertise. Real choices. Real value. Novogradac s GoVal Group Specialized cosultig services from a idustry leader. real choices. A uique

More information

Forecasting. Forecasting Application. Practical Forecasting. Chapter 7 OVERVIEW KEY CONCEPTS. Chapter 7. Chapter 7

Forecasting. Forecasting Application. Practical Forecasting. Chapter 7 OVERVIEW KEY CONCEPTS. Chapter 7. Chapter 7 Forecastig Chapter 7 Chapter 7 OVERVIEW Forecastig Applicatios Qualitative Aalysis Tred Aalysis ad Projectio Busiess Cycle Expoetial Smoothig Ecoometric Forecastig Judgig Forecast Reliability Choosig the

More information

GIS and analytic hierarchy process for land evaluation

GIS and analytic hierarchy process for land evaluation GIS ad aalytic hierarchy process for lad evaluatio Dr. Le Cah DINH Sub-Natioal Istitute of Agricultural Plaig ad Proectio Vieta [email protected] Assoc. Prof. Dr. Tra Trog DUC Vieta Natioal Uiversity -

More information

For customers Key features of the Guaranteed Pension Annuity

For customers Key features of the Guaranteed Pension Annuity For customers Key features of the Guarateed Pesio Auity The Fiacial Coduct Authority is a fiacial services regulator. It requires us, Aego, to give you this importat iformatio to help you to decide whether

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

A GUIDE TO LEVEL 3 VALUE ADDED IN 2013 SCHOOL AND COLLEGE PERFORMANCE TABLES

A GUIDE TO LEVEL 3 VALUE ADDED IN 2013 SCHOOL AND COLLEGE PERFORMANCE TABLES A GUIDE TO LEVEL 3 VALUE ADDED IN 2013 SCHOOL AND COLLEGE PERFORMANCE TABLES Cotets Page No. Summary Iterpretig School ad College Value Added Scores 2 What is Value Added? 3 The Learer Achievemet Tracker

More information

How To Solve The Homewor Problem Beautifully

How To Solve The Homewor Problem Beautifully Egieerig 33 eautiful Homewor et 3 of 7 Kuszmar roblem.5.5 large departmet store sells sport shirts i three sizes small, medium, ad large, three patters plaid, prit, ad stripe, ad two sleeve legths log

More information

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM

PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUS-MALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics

More information

A Cyclical Nurse Schedule Using Goal Programming

A Cyclical Nurse Schedule Using Goal Programming ITB J. Sci., Vol. 43 A, No. 3, 2011, 151-164 151 A Cyclical Nurse Schedule Usig Goal Prograig Ruzzaiah Jeal 1,*, Wa Rosaira Isail 2, Liog Choog Yeu 3 & Ahed Oughalie 4 1 School of Iforatio Techology, Faculty

More information

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig

More information

the product of the hook-lengths is over all boxes of the diagram. We denote by d (n) the number of semi-standard tableaux:

the product of the hook-lengths is over all boxes of the diagram. We denote by d (n) the number of semi-standard tableaux: O Represetatio Theory i Coputer Visio Probles Ao Shashua School of Coputer Sciece ad Egieerig Hebrew Uiversity of Jerusale Jerusale 91904, Israel eail: [email protected] Roy Meshula Departet of Matheatics

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

How to use what you OWN to reduce what you OWE

How to use what you OWN to reduce what you OWE How to use what you OWN to reduce what you OWE Maulife Oe A Overview Most Caadias maage their fiaces by doig two thigs: 1. Depositig their icome ad other short-term assets ito chequig ad savigs accouts.

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

The Forgotten Middle. research readiness results. Executive Summary

The Forgotten Middle. research readiness results. Executive Summary The Forgotte Middle Esurig that All Studets Are o Target for College ad Career Readiess before High School Executive Summary Today, college readiess also meas career readiess. While ot every high school

More information

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows:

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows: Subettig Subettig is used to subdivide a sigle class of etwork i to multiple smaller etworks. Example: Your orgaizatio has a Class B IP address of 166.144.0.0 Before you implemet subettig, the Network

More information

SOLAR POWER PROFILE PREDICTION FOR LOW EARTH ORBIT SATELLITES

SOLAR POWER PROFILE PREDICTION FOR LOW EARTH ORBIT SATELLITES Jural Mekaikal Jue 2009, No. 28, 1-15 SOLAR POWER PROFILE PREDICTION FOR LOW EARTH ORBIT SATELLITES Chow Ki Paw, Reugath Varatharajoo* Departet of Aerospace Egieerig Uiversiti Putra Malaysia 43400 Serdag,

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature. Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.

More information

Estimating Probability Distributions by Observing Betting Practices

Estimating Probability Distributions by Observing Betting Practices 5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Ekkehart Schlicht: Economic Surplus and Derived Demand

Ekkehart Schlicht: Economic Surplus and Derived Demand Ekkehart Schlicht: Ecoomic Surplus ad Derived Demad Muich Discussio Paper No. 2006-17 Departmet of Ecoomics Uiversity of Muich Volkswirtschaftliche Fakultät Ludwig-Maximilias-Uiversität Müche Olie at http://epub.ub.ui-mueche.de/940/

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

A Supply Chain Game Theory Framework for Cybersecurity Investments Under Network Vulnerability

A Supply Chain Game Theory Framework for Cybersecurity Investments Under Network Vulnerability A Supply Chai Gae Theory Fraework for Cybersecurity Ivestets Uder Network Vulerability Aa Nagurey, Ladier S. Nagurey, ad Shivai Shukla I Coputatio, Cryptography, ad Network Security, N.J. Daras ad M.T.

More information

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics

Chair for Network Architectures and Services Institute of Informatics TU München Prof. Carle. Network Security. Chapter 2 Basics Chair for Network Architectures ad Services Istitute of Iformatics TU Müche Prof. Carle Network Security Chapter 2 Basics 2.4 Radom Number Geeratio for Cryptographic Protocols Motivatio It is crucial to

More information

Automatic Tuning for FOREX Trading System Using Fuzzy Time Series

Automatic Tuning for FOREX Trading System Using Fuzzy Time Series utomatic Tuig for FOREX Tradig System Usig Fuzzy Time Series Kraimo Maeesilp ad Pitihate Soorasa bstract Efficiecy of the automatic currecy tradig system is time depedet due to usig fixed parameters which

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information