
A statistical perspective on data mining

Jonathan Hosking, Edwin Pednault and Madhu Sudan

IBM T. J. Watson Research Center, Yorktown Heights, N.Y., U.S.A.

Abstract

Data mining can be regarded as a collection of methods for drawing inferences from data. The aims of data mining, and some of its methods, overlap with those of classical statistics. However, there are some philosophical and methodological differences. We examine these differences, and we describe three approaches to machine learning that have developed largely independently: classical statistics, Vapnik's statistical learning theory, and computational learning theory. Comparing these approaches, we conclude that statisticians and data miners can profit by studying each other's methods and using a judiciously chosen combination of them.

Keywords: classification, frequentist inference, PAC learning, statistical learning theory.

1 Introduction: a statistician looks at data mining

The recent upsurge of interest in the field variously known as data mining, knowledge discovery or machine learning¹ has taken many statisticians by surprise. Data mining attacks such problems as obtaining efficient summaries of large amounts of data, identifying interesting structures and relationships within a data set, and using a set of previously observed data to construct predictors of future observations. Statisticians have well established techniques for attacking all of these problems. Exploratory data analysis, a field particularly associated with J. W. Tukey [18], is a collection of methods for summarizing and identifying patterns in data. Many statistical models exist for explaining relationships in a data set or for making predictions: cluster analysis, discriminant analysis and nonparametric regression can be used in many data mining problems. It is therefore tempting for a statistician to regard data mining as no more than a branch of statistics.

Nonetheless, the problems and methods of data mining have some distinct features of their own. Data sets can be very much larger than is usual in statistics, running to hundreds of gigabytes or terabytes. Data analyses are on a correspondingly larger scale, often requiring days of computer time to fit a single model. There are differences of emphasis in the approach to modeling: compared with statistics, data mining pays less attention to the large-sample asymptotic properties of its inferences and more to the general philosophy of "learning", including consideration of the complexity of models and of the computations that they require. Some modeling techniques, such as rule-based methods, are difficult to fit into the classical statistical framework, and others, such as neural networks, have an extensive methodology and terminology that has developed largely independently of input from statisticians.

¹ Unfortunately, "data mining" is a pejorative term to statisticians, who use it to describe the fitting of a statistical model that is unjustifiably elaborate for a given data set (e.g. [11]). "Machine learning" is probably better, though "learning" is a loaded term.

This paper is a brief introduction to some of the similarities and differences between statistics and data mining. In Section 2 we observe some of the differences between the statistical and data-mining approaches to data analysis and modeling. In Sections 3 to 5 we describe in more detail some approaches to machine learning that have arisen in three more-or-less disjoint academic communities: classical statistics, the statistical learning theory of V. Vapnik, and computational learning theory. Section 6 contains some comparisons and conclusions.

2 Statistics and data mining

2.1 Features of data mining

Both statistics and data mining are concerned with drawing inferences from data. The aim of the inference may be understanding the patterns of correlation and causal links among the data values ("explanation"), or making predictions of future data values ("generalization"). Classical statistics has developed an approach, described further in Section 3 below, that involves specifying a model for the probability distribution of the data and making inferences in the form of probability statements. Data-mining methods have in many cases been developed for problems that do not fit easily into the framework of classical statistics and have evolved in isolation from statistics. Even when applied to familiar statistical problems such as classification and regression, they retain some distinct features. We now mention some features of the data-mining approaches and their typical implementations.

Complex models. Some problems involve complex interactions between feature variables, with no simple relationships being apparent in the data. Character recognition is a good example; given a 16 × 16 array of pixels, it is difficult to formulate a comprehensible statistical model that can identify the character that corresponds to a given pattern of dots. Data-mining techniques such as neural networks and rule-based classifiers have the capacity to model complex relationships and should have better prospects of success in complex problems.

Large problems. By the standards of classical statistics, data mining often deals with very large data sets ($10^4$ to $10^7$ examples). This is in some cases a consequence of the use of complex models, for which large amounts of data are needed to derive secure inferences. In consequence, issues of computational complexity and scalability of algorithms are often of great importance in data mining.

Many discrete variables. Data sets that contain a mixture of continuous and discrete-valued variables are common in practice. Most multivariate analysis methods in statistics are designed for continuous variables. Many data mining methods are more tolerant of discrete-valued variables. Indeed, some rule-based approaches use only discrete variables and require continuous variables to be discretized.

Wide use of cross-validation. Data-mining methods often seek to minimize a loss function expressed in terms of prediction error: for example, in classification problems the loss function might be the misclassification rate on a set of examples not used in the model-fitting procedure. Prediction error is often estimated by cross-validation, a technique known to statistics but used much more widely in data mining. Minimization of the prediction error estimated by cross-validation is a powerful technique that can be used in a nested fashion (the "wrapper method" [7]) to optimize several aspects of the model. These include various parameters that might otherwise be chosen arbitrarily (e.g., the amount of pruning of a decision tree, or the number of neighbors to use in a nearest-neighbor classifier) and the choice of which feature variables are relevant for classification and which can be eliminated from the model.

Few comparisons with simple statistical models. When data mining methods are used on problems to which classical statistical methods are also applicable, direct comparison of the approaches is possible but seems rarely to be performed. Some comparisons have found that the greater complexity of data mining methods is not always justifiable: Ripley [16] cites several examples. Statistical methods are particularly likely to be preferable when fairly simple models are adequate and the important variables can be identified before modeling. This is a common situation in biomedical research, for example. In this context Vach et al. [19] compared neural networks and logistic regression and concluded that the use of neural networks "does not necessarily imply any progress: they fail in translating their increased flexibility into an improved estimation of the regression function due to insufficient sample sizes, they do not give direct insight to the influence of single covariates, and they are lacking uniqueness and reproducibility".

2.2 Classification: an illustrative problem

A common problem in statistics and data mining is to use observations on a set of "feature variables" to predict the value of a "class variable". This problem corresponds to statistical models for classification when the class variable takes a discrete set of values and for regression when the values of the class variable cover a continuous range. To illustrate the range of approaches available in statistics and data mining we consider the classification problem. Many different methods are used for classification. The classical statistical approach is discriminant analysis; starting from this one can list various data-mining methods in decreasing order of their resemblance to classical statistical modeling. More details of many of these methods can be found in [13]. We denote the class variable by $y$ and the feature variables by the vector $x = [x_1 \ldots x_f]$. It is sometimes convenient to think of the feature variables as coordinates of a "feature space", with the aim of the analysis being to partition the feature space into regions corresponding to the different classes (values of $y$).

Linear/quadratic/logistic discriminant analysis. Discriminant analysis is a classical statistical technique based on statistical models containing, usually, relatively few parameters. The modeling procedure seeks linear or quadratic combinations of the feature variables that identify the boundaries between classes. The most detailed theory applies to cases in which the features are continuous-valued and, within each class, approximately normally distributed.

Projection pursuit. For classification problems, projection pursuit can be thought of as a generalization of logistic discrimination that also involves linear combinations of features but also includes nonlinear transformations of these linear combinations, with the probability of a feature vector $x$ belonging to class $k$ being modeled as

$$\sum_{m=1}^{M} \beta_{km}\, \phi_m\Bigl(\sum_{j=1}^{f} \alpha_{mj} x_j\Bigr). \qquad (1)$$

Here the $\phi_m$ are prespecified scatterplot smoothing functions, chosen in part for their speed of computation. The nonlinearities and often large numbers of parameters in the model lead one to regard projection pursuit as a "neostatistical" rather than a classical statistical technique.
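As a rough numerical illustration of representation (1), the sketch below evaluates $\sum_m \beta_{km}\,\phi_m(\sum_j \alpha_{mj} x_j)$ for a single class $k$. The function names (`logistic`, `ridge_score`), the parameter values and the data are hypothetical, and the logistic function is used here only as a stand-in for the smoothing functions $\phi_m$; an actual projection-pursuit fit would estimate all of these quantities from data.

```python
import math


def logistic(t):
    """Logistic transformation 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def ridge_score(x, alpha, beta_k, phi=logistic):
    """Evaluate sum_m beta_k[m] * phi(sum_j alpha[m][j] * x[j])."""
    total = 0.0
    for alpha_m, beta_km in zip(alpha, beta_k):
        # Linear combination of the features for the m-th projection.
        projection = sum(a * xj for a, xj in zip(alpha_m, x))
        # Nonlinear transformation, weighted by the class-specific coefficient.
        total += beta_km * phi(projection)
    return total


# Hypothetical example: f = 3 features, M = 2 projections.
x = [0.5, -1.0, 2.0]
alpha = [[1.0, 0.0, 0.5], [-0.5, 1.0, 0.0]]
beta_k = [0.6, 0.4]
print(ridge_score(x, alpha, beta_k))
```

With nonnegative coefficients that sum to one, as assumed here, the result lies between 0 and 1 and can be read as a class probability; nothing in the sketch enforces that constraint in general.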

Radial basis functions. Radial basis functions form another kind of nonlinear neostatistical model. The probability of a feature vector $x$ belonging to class $k$ is modeled as

$$\sum_{m=1}^{M} \beta_m\, \phi\bigl(\|x - c_m\| / \sigma_m\bigr).$$

Here $\|x - c_m\|$ is the distance from point $x$ in feature space to the $m$th center $c_m$, $\sigma_m$ is a scale factor, and $\phi$ is a basis function, often chosen to be the Gaussian function $\phi(r) = \exp(-r^2)$.

Neural networks. A common form of neural network for the classification problem, the multilayer feedforward network, can be thought of as a model similar to (1). However, the $\phi_m$ transformations are different (generally the logistic function $\phi_m(t) = 1/\{1 + \exp(-t)\}$ is used) and more than one layer of logistic transformations may be applied. Neural networks are recognizably close to neostatistical models, but a unique methodology and terminology for neural networks has developed that is unfamiliar to statisticians.

Graphical models. Graphical models, also known as Bayesian networks, belief functions, or causal diagrams, involve the specification of a network of links between feature and class variables. The links specify relations of statistical dependence between particular features; equally importantly, absence of a direct link between two features is an assertion of their conditional independence given the other features appearing in the network. Links in the network can be interpreted as causal relations between features (though this is not always straightforward, as exemplified by the discussion in [15]), which can yield particularly informative inferences. For realistic problems, graphical models involve large numbers of parameters and do not fit well into the framework of classical statistical inference.

Nearest-neighbor methods. At its simplest, the $k$-nearest neighbor procedure assigns a class to point $x$ in feature space according to the majority vote of the $k$ nearest data points to $x$. This is a smoothing procedure, and will be effective when class probabilities vary smoothly over the feature space. Questions arise as to the choice of $k$ and of an appropriate distance measure in feature space. These issues are not easily expressed in terms of classical statistical models. Model specification is therefore determined by maximizing classification accuracy on a set of training data rather than by formally specifying and fitting a statistical model.

Decision trees. A decision tree is a succession of partitions of feature space, each partition usually based on the value taken by a single feature, until the partitions are so fine that each corresponds to a single value of the class variable. This formulation bears little resemblance to classical parametric statistical models. Choice of the best tree representation is obtained by comparing different trees in terms of their predictive accuracy, estimated by cross-validation, and their complexity, often measured by minimum description length.

Rules. Rule-based methods seek to assign class labels to subregions of feature space according to logical criteria such as

if $x_1 = 3$ and $x_2 \ge 15$ and $x_2 < 30$ then $y = 1$.

Individual rules can be complex and hard to interpret subjectively. Rule-generation methods often involve parameters whose optimal values are unknown. The methods cannot be expressed in terms of classical statistical models, and the parameter values are optimized, as for decision trees, by consideration of a rule set's predictive accuracy and complexity.
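The nearest-neighbor procedure, together with the use of cross-validation to settle an otherwise arbitrary parameter such as $k$, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method taken from the paper: the function names (`knn_predict`, `loo_error`, `choose_k`), the squared Euclidean distance, the use of leave-one-out cross-validation and the toy data are all choices made here for the example.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to query.

    train is a list of (features, label) pairs; squared Euclidean
    distance is an assumed choice of distance measure.
    """
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def loo_error(data, k):
    """Leave-one-out cross-validation estimate of the misclassification rate."""
    errors = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out the i-th example
        if knn_predict(rest, features, k) != label:
            errors += 1
    return errors / len(data)


def choose_k(data, candidates):
    """Return the candidate k with the smallest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(data, k))


# Made-up two-class data in a two-dimensional feature space.
data = [
    ((0.0, 0.1), 0), ((0.2, 0.0), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),
    ((1.0, 1.1), 1), ((1.2, 0.9), 1), ((0.9, 1.3), 1), ((1.1, 1.0), 1),
]
best_k = choose_k(data, [1, 3, 5])
print(best_k, loo_error(data, best_k))
```

Odd values of $k$ are used so that a two-class vote cannot tie. On real data the candidate values and the distance measure would themselves need justification, which is exactly the difficulty noted above.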

The foregoing list illustrates a wide range of statistical and data-mining approaches to the classification problem. However, each approach requires at some stage the selection of appropriate features $x_1, \ldots, x_f$. It can be argued that this similarity between the approaches outweighs all of their differences. Any given data set may contain irrelevant or poorly measured features which only add noise to the analysis and should for efficiency's sake be deleted; some dependences between class and features may be most succinctly expressed in terms of a function of several features rather than by a single feature. No method can be expected to perform well if it does not use the most informative features: "garbage in, garbage out".

Explicit feature selection criteria have been developed for several of the methods described above. These range from criteria based on significance tests for statistical models to measures based on the impurity of the conditional probability distribution of the class variable given the features, used in decision-tree and rule-based classifiers [10]. As noted above, the "wrapper" method is a powerful and widely applicable technique for feature selection.

Construction of new features can be explicit or implicit. Some techniques such as principal-components regression explicitly form linear combinations of features that are then used as new feature variables in the model. Conversely, the linear combinations $\sum_j \alpha_{mj} x_j$ of features that appear in the representation (1) for projection-pursuit and neural-network classifiers are implicit constructed features. Construction of nonlinear combinations of features is generally a matter for subjective judgement.

3 Classical statistical modeling

In this section we give a brief summary of the classical "frequentist" approach to statistical modeling and scientific inference. A detailed account of the theory is given by Cox and Hinkley [2]. The techniques used in applied statistical analyses are described in more specialized texts such as [4] for classification problems and [27] for regression. We assume that inference focuses on a data vector $z$ with the available data $z_i$, $i = 1, \ldots, \ell$, being $\ell$ instances of $z$. In many problems, such as regression and classification, the data vector $z$ is decomposed into $z = [x, y]$ and $y$ is modeled as a function of the $x$ values.

3.1 Model specification

A statistical model is the specification of a frequency distribution $p(z)$ for the elements of the data vector $z$. This enables "what happened" (the observed data vector) to be quantitatively compared with "what might have happened, but didn't" (other potentially observable data vectors).

In regression and classification problems the conditional distribution of $y$ given $x$, $p(y \mid x)$, is of interest; the frequency distribution of $x$ may or may not be relevant. In most statistical regression analyses the model has the form

$$y = f(x) + e \qquad (2)$$

where $e$ is an error term having mean zero and some probability distribution; i.e., it is assumed that the relationship between $y$ and $x$ is observed with error. The alternative specification in which the functional relationship $y = f(x)$ is exact and uncertainty arises only when predicting $y$ at hitherto unobserved values of $x$ is much less common: one example is the interpolation of random spatial processes by kriging [8].

In classical statistics, model specification has a large subjective component. Candidates for the distribution of $z$, or the form of the relationship between $y$ and $x$, may be obtained from inspection of the data, from familiarity with relations established by previous analysis of similar data sets, or from a scientific theory that entails particular relations between elements of the data vector.

3.2 Estimation

Model specification generally involves an unknown parameter vector $\theta$. This is typically estimated by the maximum-likelihood procedure: the joint probability density function of the data, $p(z; \theta)$, is maximized over $\theta$. Maximum-likelihood estimation can be regarded as minimization of the loss function $-\log p(z; \theta)$. When the data are assumed to be a set of independent and identically distributed vectors $z_i$, $i = 1, \ldots, \ell$, this loss function is

$$-\sum_{i=1}^{\ell} \log p(z_i; \theta).$$

When the data vector is decomposed as $z = [x, y]$, the observed data are similarly decomposed as $z_i = [x_i, y_i]$, and the loss function (negative log-likelihood) is

$$-\sum_{i=1}^{\ell} \log p(y_i \mid x_i; \theta).$$

If the conditional distribution of $y_i$ given $x_i$ is Normal with mean a function of $x_i$, $f(x_i; \theta)$, and variance independent of $i$, this loss function is equivalent to the sum of squares

$$\sum_{i=1}^{\ell} \{y_i - f(x_i; \theta)\}^2.$$

The justification for maximum-likelihood estimation is asymptotic: the estimators are consistent and efficient as the sample size $\ell$ increases to infinity. Except for certain models whose analysis is particularly simple, classical statistics has little to say about finite-sample properties of estimators and predictors.

Assessment of the accuracy of estimated parameters is an important part of frequentist inference. Estimates of accuracy are typically expressed in terms of confidence regions. In frequentist inference the parameter $\theta$ is regarded as fixed but unknown, and does not have a probability distribution. Instead one considers hypothetical repetitions of the process of generation of data from the model with a fixed value $\theta_0$ of the parameter vector, followed by computation of $\hat\theta$, the maximum-likelihood estimator of $\theta$. Over these repetitions a probability distribution for $\hat\theta$ will be built up. Likelihood theory provides an asymptotic large-sample approximation to this distribution. From it one can determine a region $C(\hat\theta)$, depending on $\hat\theta$, of the space of possible values of $\theta$, that contains the true value $\theta_0$ with probability $\gamma$ (no matter what this true value may be). $C(\hat\theta)$ is then a confidence region for $\theta$ with confidence level $\gamma$. The size of the region is a measure of the accuracy with which the parameter can be estimated.

Confidence regions can also be obtained for subsets of the model parameters and for predictions made from the model. These too are asymptotic large-sample approximations. Confidence statements for parameters and predictions are valid only on the assumption that the model is correct, i.e. that for some value of $\theta$ the specified frequency distribution $p(z; \theta)$ for $z$ accurately represents the relative frequencies of all of the possible values of $z$. If the model is false, predictions may be inaccurate and estimated parameters may not be meaningful.
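The equivalence between maximum likelihood under Normal errors and least squares, stated in Section 3.2, can be checked numerically. The sketch below assumes a one-parameter model $f(x; \theta) = \theta x$ with known error standard deviation; the function names, the grid of candidate values and the data are made up for illustration.

```python
import math


def sum_of_squares(theta, xs, ys):
    """Sum of squared residuals for the model f(x; theta) = theta * x."""
    return sum((y - theta * x) ** 2 for x, y in zip(xs, ys))


def negative_log_likelihood(theta, xs, ys, sigma=1.0):
    """Negative log-likelihood when y_i ~ Normal(theta * x_i, sigma**2)."""
    n = len(xs)
    constant = 0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
    return constant + sum_of_squares(theta, xs, ys) / (2.0 * sigma ** 2)


# Made-up data scattered around y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Candidate theta values 1.00, 1.01, ..., 3.00.
grid = [i / 100.0 for i in range(100, 301)]
theta_ml = min(grid, key=lambda t: negative_log_likelihood(t, xs, ys))
theta_ls = min(grid, key=lambda t: sum_of_squares(t, xs, ys))

# Closed-form least-squares estimate for a line through the origin.
theta_closed_form = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(theta_ml, theta_ls, theta_closed_form)
```

Because the negative log-likelihood is a constant plus a positive multiple of the sum of squares, both criteria pick the same grid point, and that point is the grid value closest to the closed-form estimate.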

3.3 Diagnostic checking

Inadequacy of a statistical model may arise from three sources. Overfitting occurs when the model is unjustifiably elaborate, with the model structure in part representing merely random noise in the data. Underfitting is the converse situation, in which the model is an oversimplification of reality with additional structure being needed to describe the patterns in the data. A model may also be inadequate through having the wrong structure: for example, a regression model may relate $y$ linearly to $x$ when the correct physical relation is linear between $\log y$ and $\log x$.

Comparison of parameters with their estimated accuracy provides a check against overfitting. If the confidence region for a parameter includes the value zero, then a simpler model in which the parameter is dropped will usually be deemed adequate.

In the frequentist framework, underfitting by a statistical model is typically assessed by diagnostic goodness-of-fit tests. A statistic $T$ is computed whose distribution can be found, either exactly or as a large-sample asymptotic approximation, under the assumption that the model is correct. If the computed value of $T$ is in the extreme tail of its distribution there is an indication of model inadequacy: either the model is wrong or something very unusual has occurred. An extreme value of $T$ often (but not always) suggests a particular direction in which the model is inadequate, and a way of modifying the model to correct the inadequacy.

Many diagnostic plots and statistics have been devised for particular statistical models. Though not used in formal goodness-of-fit tests, they can be used as the basis of subjective judgements of model adequacy, for identification either of underfitting or of incorrect model structure. For example, the residuals from a regression model that is correctly specified will be approximately independently distributed; if a plot of residuals against the fitted values shows any noticeable structure, this is an indication of model inadequacy and may suggest some way in which the model should be modified.

Diagnostic plots are also used to identify data values that are unusual in some respect. Unusual observations may be outliers, values that are discordant with the pattern of the other data values, or influential values, which are such that a small change in the data value will have a large effect on the estimated values of the model parameters. Such data points merit close inspection to check whether the outliers may have arisen from faulty data collection or transcription, and whether the influential values have been measured with sufficient accuracy to justify conclusions drawn from the model and its particular estimated parameter values. In analyses in which there is the option of collecting additional data at controlled points, for example when modeling the relation $y = f(x)$ where $x$ can be fixed and the corresponding value of $y$ observed, the most informative $x$ values at which to collect more data will be in the neighborhood of outlying and influential data points.

3.4 Model building as an iterative procedure

The sequence of specification, estimation and checking lends itself to an iterative procedure in which model inadequacy revealed by diagnostic checks suggests a modified model specification designed to correct the inadequacy; the modified model is then itself estimated and checked, and the cycle is repeated until a satisfactory model is obtained. This procedure often has a large subjective component, arising from the model specifications and the choice of diagnostic checks. However, formal procedures to identify the best model can be devised if the class of candidate models can be specified a priori. This is the case, for example, when the candidates form a sequence of nested models $M_1, \ldots, M_m$, whose parameter vectors $\theta^{(1)}, \ldots, \theta^{(m)}$ are such that every element of $\theta^{(j)}$ is also included in $\theta^{(j+1)}$. Careful control over the procedure is necessary in order to ensure that inferences are valid, for example that confidence regions for the parameters in the final model have the correct coverage probability.

Classical frequentist statistics has little to say about the choice between nonnested models, for example whether a regression model $y = \theta^{(1)}_1 x_1 + \theta^{(1)}_2 x_2$ is superior to an alternative model $\log y = \theta^{(2)}_1 x_1 + \theta^{(2)}_2 x_3$. Such decisions are generally left as a matter of subjective judgement based on the quality of fit of the models, their ease of interpretation and their concordance with known physical mechanisms relating the variables in the model.

Once a satisfactory model has been obtained, further inferences and predictions are typically based on the assumption that the final model is correct. This is problematical in two respects. In many situations one may believe that the true distribution of $z$ has a very complex structure to which any statistical model is at best an approximation. Furthermore, the statistical properties of parameter estimators in the final model may be affected by the fact that several models have been estimated and tested on the same set of data, and failure to allow for this can lead to inaccurate inferences.

As an example of this last problem, we consider stepwise regression. This is a widely used procedure for identifying the best statistical model, in this case deciding which elements of the $x$ component of the data vector should appear in the regression model (2). Because random variability can cause $x$ variables that are actually unrelated to $y$ to appear to be statistically significant, the estimated regression coefficients of the variables selected for the final model tend to be overestimates of the absolute magnitude of the true parameter values. This "selection bias" leads to underestimation of the variability of the error term in the regression model, which can lead to poor results when the final model is used for prediction. In practice it is often better to use all of the available variables rather than a stepwise procedure for prediction [14].

3.5 Recent developments

Developments in statistical theory since the 1970s have addressed some of the difficulties with the classical frequentist approach. Akaike's information criterion [17], and related measures of Schwarz and Rissanen, provide likelihood-based comparisons of nonnested models. Development of robust estimators [6] has made inference less susceptible to outliers and influential data values. Greater use of nonlinear models enables a wider range of $x$-$y$ relationships to be accurately modeled. Simulation-based methods such as the bootstrap [3] enable better assessment of accuracy in finite samples.

4 Vapnik's statistical learning theory

One reason that classical statistical modeling has a large subjective component is that most of the mathematical techniques used in the classical approach assume that the form of the correct model is known and that the problem is to estimate its parameters. In data mining, on the other hand, the form of the correct model is usually unknown. In fact, discovering an adequate model, even if its form is not exactly correct, is often the purpose of the analysis. This situation is also faced in classical statistical modeling and has led to the creation of the diagnostic checks discussed earlier. However, even with these diagnostics, the classical approach does not provide firm mathematical guidance when comparing different types of models. The question of model adequacy must still be decided subjectively based on the judgment and experience of the data analyst.

This latter source of subjectivity has motivated Vapnik and Chervonenkis [24, 25, 26] to develop a mathematical basis for comparing models of different forms and for estimating their relative adequacies. This body of work, now known as statistical learning theory, presumes that the form of the correct model is truly unknown and that the goal is to identify the best possible model from a given set of models. The models need not be of the same form and none of them need be correct. In addition, comparisons between models are based on finite-sample statistics, not asymptotic statistics as is usually the case in the classical approach. This shift of emphasis to finite samples enables overfitting to be quantitatively assessed. Thus, the underlying premise of statistical learning theory closely matches the situation actually faced in data mining.

4.1 Model specification

As in classical statistical modeling, models for the data must be specified by the analyst. However, instead of specifying a single (parametric) model whose form is then assumed to be correct, a series of competing models must be specified, one of which will be selected based on an examination of the data. In addition, a preference ordering over the models must also be specified. This preference ordering is used to address the issue of overfitting. In practice, models with fewer parameters or degrees of freedom are preferable to those with more, since they are less likely to overfit the data. When applying statistical learning theory, one searches for the most preferable model that best explains the data.

4.2 Estimation

Estimation plays a central role in statistical learning theory just as it does in classical statistical modeling; however, what is being estimated is quite different. In the classical approach, the form of the model is assumed to be known and, hence, emphasis is placed on estimating its parameters. In statistical learning theory, the correct model is assumed to be unknown and emphasis is placed on estimating the relative performance of competing models so that the best model can be selected.

The relative performance of competing models is measured through the use of loss functions. The negative log-likelihood functions employed in classical statistical modeling are also used in statistical learning theory when comparing probability distributions. However, other loss functions are also considered for different kinds of modeling problems.

In general, statistical learning theory considers the loss $Q(z; \theta)$ between a data vector $z$ and a specific model $\theta$. In the case of a parametric family of models, the notation introduced earlier is extended so that $\theta$ defines both the specific parameters of the model and the parametric family to which the model belongs. In this way, models from different families can be compared. When modeling the joint probability density of the data, the appropriate loss function is the same joint negative log-likelihood used in classical statistical modeling:

$$Q(z; \theta) = -\log p(z; \theta).$$

Similarly, when the data vector $z$ can be decomposed into two components, $z = [x, y]$, and we are interested in modeling the conditional probability distribution of $y$ as a function of $x$, then the conditional negative log-likelihood is the appropriate loss function:

$$Q(z; \theta) = -\log p(y \mid x; \theta).$$

On the other hand, if we are not interested in the actual distribution of $y$ but only in constructing a predictor $f(x; \theta)$ for $y$ that minimizes the probability of making an incorrect prediction, then the 0/1 loss function used in pattern recognition is appropriate:

$$Q(z; \theta) = \begin{cases} 0, & \text{if } f(x; \theta) = y, \\ 1, & \text{if } f(x; \theta) \ne y. \end{cases}$$
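To make the loss functions concrete, the sketch below evaluates the conditional negative log-likelihood and the 0/1 loss on a few data vectors $z = [x, y]$ with a binary class variable. The one-parameter logistic model, the threshold rule used as the predictor, and all function names are assumptions introduced for the example; they do not come from the paper.

```python
import math

THETA = 2.0  # hypothetical parameter value of a one-parameter logistic model


def prob_of_one(x, theta=THETA):
    """Model probability that y = 1 given x."""
    return 1.0 / (1.0 + math.exp(-theta * x))


def predictor(x, theta=THETA):
    """Predict the more probable class under the model."""
    return 1 if prob_of_one(x, theta) >= 0.5 else 0


def conditional_nll(z, theta=THETA):
    """Conditional negative log-likelihood -log p(y | x; theta) for binary y."""
    x, y = z
    p = prob_of_one(x, theta)
    return -math.log(p if y == 1 else 1.0 - p)


def zero_one_loss(z, theta=THETA):
    """0/1 loss: 0 if the predictor reproduces y, 1 otherwise."""
    x, y = z
    return 0 if predictor(x, theta) == y else 1


for z in [(1.5, 1), (-0.2, 1), (-1.0, 0)]:
    print(z, zero_one_loss(z), conditional_nll(z))
```

The two losses can disagree about how bad a prediction is: the 0/1 loss treats every misclassification alike, whereas the negative log-likelihood grows as the model becomes more confident in the wrong class.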

10 wealreadyknewallofthestatisticalpropertiesofthedata.ifthedatavectorzisgeneratedbya Ingeneral,Q(z;)canbechosendependingonthenatureofthemodelingproblemonefaces.Its minimizestheexpectedlossr()withrespecttof(z),where randomprocessaccordingtotheprobabilitymeasuref(z),thenthebestmodelistheonethat lossesimplybettermodelsofthedata. purposeistomeasuretheperformanceofamodelsothatthebestmodelcanbeselected.theonly requirementfromthepointofviewofstatisticallearningtheoryisthat,byconvention,smaller Oncealossfunctionhasbeenselected,identifyingthebestmodelwouldberelativelyeasyif utilitymeasureoftheoutcomegiventhedecision.utilitymeasuresprovideanumericalencoding ofuncertaintyoneiswillingtoacceptinchoosingariskydecisionthathasalowprobabilityof ofwhichoutcomesarepreferredoverothers,aswellasaquantitativemeasurementofthedegree nologyofdecisiontheory,isadecisionvector,zisanoutcome,andq(z;)isthe(negative) ThemodelthatminimizesR()isoptimalfromadecision-theoreticpointofview.Inthetermi- R()=ZQ(z;)dF(z): onemustchoosethemostsuitablemodelonecanidentifybasedonasetofobserveddatavectors probabilitymeasuref(z)thatdenesthestatisticalpropertiesofthedataisunknown.instead, measure thatis,thebestmodelgiventhelossfunction. utilityr()producesanoptimaldecisionconsistentwiththeriskpreferencesdenedbytheutility obtainingahighlydesirableoutcomeversusamoreconservativedecisionwithahighprobability ofamoderateoutcome.choosingthedecisionvectorthathasthebestexpected(negative) distributed,theaveragelossremp(;`)fortheobserveddatacanbeusedasanempiricalestimator zi,i=1;:::;`.assumingthattheobservedvectorsarestatisticallyindependentandidentically oftheexpectedloss,where Unfortunately,inpractice,theexpectedlossR()cannotbecalculateddirectlybecausethe modelsand/ortheirparametersareselectedbyoptimizingnumericalcriteriaofthisgeneralform. 
StatisticallearningtheorypresumesthatmodelsarechosenbyminimizingRemp(;`).Notethat thispresumptionisconsistentwithstandardmodel-ttingproceduresusedinstatisticsinwhich doesminimizingtheaverageempiricallossremp(;`)yieldmodelsthatalsominimizetheexpected Thefundamentalquestionofstatisticallearningtheoryisthefollowing:underwhatconditions Remp(;`)=1``Xi=1Q(zi;): fortheexpectedlosses,notfortheparameters.theexpectedlossr()foramodelisregarded expressedintermsofcondenceregions;however,inthiscase,condenceregionsareconstructed isarandomquantitythatwecansample,sinceitsvaluedependsonthevaluesoftheobserved asxedbutunknown,sincetheprobabilitymeasuref(z)thatdenesthestatisticalpropertiesof byconsideringtheaccuracyoftheempiricallossestimate.asinclassicalstatistics,accuracyis thedatavectorsisxedbutunknown.ontheotherhand,theaverageempiricallossremp(;`) lossr(),sincethelatteriswhatweactuallywanttoaccomplish?thisquestionisanswered datavectorszi,i=1;:::;`,usedinitscalculation.statisticallearningtheorythereforeconsiders condenceregionsforr()givenremp(;`). distinguishesstatisticallearningtheoryfromclassicalstatistics.oneofthefundamentaltheorems modelsareselectedbyminimizingaverageempiricalloss.thislattercaveatisthekeyissuethat dierencebetweentheexpectedandaverageempiricallosseswhiletakingintoaccountthefactthat Toconstructthesecondenceregions,weneedtoconsidertheprobabilitydistributionofthe 10

of statistical learning theory shows that, in order to account for the fact that models are selected by minimizing average empirical loss, one must consider the maximum difference between the expected and average empirical losses; that is, one must consider the distribution of

$$\sup_{\alpha \in \Lambda} \bigl( R(\alpha) - R_{\mathrm{emp}}(\alpha, \ell) \bigr),$$

where $\Lambda$ is the set of models one is selecting from.

The reason that the maximum difference must be considered has to do with the phenomenon of overfitting. Intuitively speaking, overfitting occurs when the set of models to choose from has so many degrees of freedom that one can find a model that fits the noise in the data but does not adequately reflect the underlying relationships. As a result, one obtains a model that looks good relative to the training data but that performs poorly when applied to new data. This mathematically corresponds to a situation in which the average empirical loss $R_{\mathrm{emp}}(\alpha, \ell)$ substantially underestimates the expected loss $R(\alpha)$. Although there is always some probability that the average empirical loss will underestimate the expected loss for a fixed model, both the probability and the degree of underestimation are increased by the fact that we explicitly search for the model that minimizes $R_{\mathrm{emp}}(\alpha, \ell)$. Because of this search, the maximum difference between the expected and average empirical losses is the quantity that governs the confidence region.

The landmark contribution of Vapnik and Chervonenkis is a series of probability bounds that they have developed to construct small-sample confidence regions for the expected loss given the average empirical loss. The resulting confidence regions differ from those obtained in classical statistics in three respects. First, they do not assume that the chosen model is correct. Second, they are based on small-sample statistics and are not asymptotic approximations as is typically the case. Third, a uniform method is used to take into account the degrees of freedom in the set of models one is selecting from, independent of the forms of those models. This method is based on a measurement known as the Vapnik-Chervonenkis (VC) dimension.

The VC dimension of a set of models can conceptually be thought of as the maximum number of data vectors for which one is pretty much guaranteed to find a model that fits exactly. For example, the VC dimension of a linear regression or discriminant model is equal to the number of terms in the model (i.e., the number of degrees of freedom in the classical sense), since $n$ linear terms can be used to exactly fit $n$ points. The actual definition of VC dimension is more general and does not formally require an exact fit; nevertheless, the intuitive insights gained by thinking about the consequences of exact fits are often valid with regard to VC dimension. For example, one consequence is that in order to avoid overfitting the number of data samples should substantially exceed the VC dimension of the set of models to choose from; otherwise, one could obtain an exact fit to arbitrary data.

Because VC dimension is defined in terms of model fitting and numbers of data points, it is equally applicable to linear, nonlinear and nonparametric models, and to combinations of dissimilar model families. This includes neural networks, classification and regression trees, classification and regression rules, radial basis functions, Bayesian networks, and virtually any other model family imaginable. In addition, VC dimension is a much better indicator of the ability of models to fit arbitrary data than is suggested by the number of parameters in the models. There are examples of models with only one parameter that have infinite VC dimension and, hence, are able to exactly fit any set of data [22, 23]. There are also models with billions of parameters that have small VC dimensions, which enables one to obtain reliable models even when the number of data samples is much less than the number of parameters. VC dimension coincides with the number of parameters only for certain model families, such as linear regression/discriminant models. VC dimension therefore offers a much more general notion of degrees of freedom than is found in classical statistics.

In the probability bounds obtained by Vapnik and Chervonenkis, the size of the confidence region is largely determined by the ratio of the VC dimension to the number of data vectors. For example, if the loss function $Q(z, \alpha)$ is the 0/1 loss used in pattern recognition, then with probability at least $1 - \eta$,

$$R_{\mathrm{emp}}(\alpha, \ell) - \frac{\sqrt{E}}{2} \;\le\; R(\alpha) \;\le\; R_{\mathrm{emp}}(\alpha, \ell) + \frac{E}{2}\left(1 + \sqrt{1 + \frac{4 R_{\mathrm{emp}}(\alpha, \ell)}{E}}\right),$$

where

$$E = 4\,\frac{h}{\ell}\left(\ln\frac{2\ell}{h} + 1\right) - \frac{4}{\ell}\,\ln\frac{\eta}{4},$$

and where $h$ is the VC dimension of the set of models to choose from. Note that the ratio of the VC dimension $h$ to the number of data vectors $\ell$ is the dominant term in the definition of $E$ and, hence, in the size of the confidence region for $R(\alpha)$. Other families of loss functions have analogous confidence regions involving the quantity $E$.

The concept of VC dimension and confidence bounds for various families of loss functions are discussed in detail in books by Vapnik [21, 22, 23]. The remarkable properties of these bounds are that they make no assumptions about the probability distribution $F(z)$ that defines the statistical properties of the data vectors, they are valid for small sample sizes, and they are dependent only on the VC dimension of the set of models and on the properties of the loss function employed. The bounds are therefore applicable for an extremely wide range of modeling problems and for any family of models imaginable.

4.3 Model selection

As discussed at the beginning of this section, the data analyst is expected to provide not just a single parametric model, but an entire series of competing models ordered according to preference, one of which will be selected based on an examination of the data. The results of statistical learning theory are then used to select the most preferable model that best explains the data.
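To give a feel for the size of the confidence region for the 0/1 loss, the following minimal Python sketch evaluates the quantity E and the resulting upper bound on the expected loss. It assumes the standard form of Vapnik's bound [22], namely E = 4(h/l)(ln(2l/h) + 1) - (4/l) ln(eta/4) and R <= R_emp + (E/2)(1 + sqrt(1 + 4 R_emp / E)); the function names and the symbol eta for the confidence parameter are illustrative choices, not notation fixed by this paper.

```python
import math


def vc_epsilon(h, n_samples, eta):
    """Quantity E for the 0/1 loss (assumed form, after Vapnik [22]).

    h         -- VC dimension of the set of models
    n_samples -- number of data vectors (l in the text)
    eta       -- confidence parameter; the bound holds with probability >= 1 - eta
    """
    l = n_samples
    return 4.0 * (h / l) * (math.log(2.0 * l / h) + 1.0) - (4.0 / l) * math.log(eta / 4.0)


def expected_loss_upper_bound(r_emp, h, n_samples, eta):
    """Upper end of the confidence region for the expected loss R."""
    e = vc_epsilon(h, n_samples, eta)
    return r_emp + (e / 2.0) * (1.0 + math.sqrt(1.0 + 4.0 * r_emp / e))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show how the ratio h / l drives the bound.
    for h in (10, 50, 200):
        print(h, round(expected_loss_upper_bound(0.10, h, 1000, 0.05), 3))
```

With h = 10, 1000 data vectors and eta = 0.05, E is about 0.27; when the average empirical loss is zero the upper bound reduces to E itself, which shows how strongly the ratio of VC dimension to sample size controls the width of the region.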
The selection process has two components: one is to determine a cut-off point in the preference ordering, the other is to select the model with the smallest average empirical loss $R_{\mathrm{emp}}(\alpha, \ell)$ from among those models that occur before the cut-off. As the cut-off point is advanced through the preference ordering, both the set of models that appear before the cut-off and the VC dimension of this set steadily increase. This increase in VC dimension has two effects. The first effect is that with more models to choose from one can usually obtain a better fit to the data; hence, the minimum average empirical loss steadily decreases. The second effect is that the size of the confidence region for the expected loss $R(\alpha)$ steadily increases because the size is governed by the VC dimension. To choose a cut-off point in the preference ordering, Vapnik and Chervonenkis advocate minimizing the upper bound on the confidence region for the expected loss; that is, minimize the worst-case estimate of $R(\alpha)$. For example, if the 0/1 loss function were being used, one would choose the cut-off so as to minimize the upper bound in the inequality presented above for a desired setting of the confidence parameter $\eta$. The model that minimizes the average empirical loss $R_{\mathrm{emp}}(\alpha, \ell)$ for those models that occur before the chosen cut-off is then selected as the most suitable model for the data.

The overall approach is illustrated by the graph in Figure 1. The process balances the ability to find increasingly better fits to the data against the danger of overfitting and thereby selecting a poor model. The preference ordering provides the necessary structure in which to compare competing models while at the same time taking into account their effective degrees of freedom (i.e., VC dimension). The result is a model that minimizes the worst-case loss on future data. The process itself attempts to maximize the rate of convergence to an optimum model as the quantity of available data increases.

Figure 1: Expected loss and average empirical loss as a function of the preference cut-off. (Vertical axis: loss. Horizontal axis: preference cut-off. Curves: upper bound on expected loss; minimum average empirical loss. The best cut-off is marked.)

4.4 Use of validation data

One drawback to the Vapnik-Chervonenkis approach is that it can be difficult to determine the VC dimension of a set of models, especially for the more exotic types of models. Even for simple linear regression/discriminant models, the situation is not entirely straightforward. The relationship stated above that the VC dimension is equal to the number of terms in such a model is actually an upper bound on the VC dimension. If the models are written in a certain canonical form, then the VC dimension is also bounded by the quantity $R^2 A^2 + 1$, where $R$ is the radius of the smallest sphere that encloses the available data vectors and $A^2$ is the sum of the squares of the coefficients of the model in its canonical form. As Vapnik has shown [22], this additional bound on the VC dimension makes it possible to obtain linear regression/discriminant models whose VC dimensions are orders of magnitude smaller than the number of terms, even if the models contain billions of terms. This fact is extremely fortunate because it offers a means of avoiding the "curse of dimensionality," enabling reliable models to be obtained even in high-dimensional spaces by basing the preference ordering of the models on the sum of the squares of the model coefficients.
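The cut-off selection of Section 4.3 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a definitive implementation: it assumes that, for each cut-off point in the preference ordering, the analyst already knows the VC dimension of the models before the cut-off and the minimum average empirical loss attained among them, and it uses the 0/1-loss form of the bound from Vapnik [22]. The function names and the numbers in the demonstration are hypothetical.

```python
import math


def upper_bound(r_emp, h, l, eta):
    """Assumed 0/1-loss upper bound on the expected loss (after Vapnik [22])."""
    e = 4.0 * (h / l) * (math.log(2.0 * l / h) + 1.0) - (4.0 / l) * math.log(eta / 4.0)
    return r_emp + (e / 2.0) * (1.0 + math.sqrt(1.0 + 4.0 * r_emp / e))


def choose_cutoff(vc_dims, min_emp_losses, l, eta=0.05):
    """Return (index of best cut-off, list of bounds).

    vc_dims[k]        -- VC dimension of the models before cut-off k (non-decreasing)
    min_emp_losses[k] -- smallest average empirical loss among those models
    l                 -- number of data vectors
    """
    bounds = [upper_bound(r, h, l, eta) for h, r in zip(vc_dims, min_emp_losses)]
    best = min(range(len(bounds)), key=bounds.__getitem__)
    return best, bounds


if __name__ == "__main__":
    # Made-up values: empirical loss falls as the cut-off advances,
    # while the VC dimension (and hence the confidence term) grows.
    best, bounds = choose_cutoff([1, 5, 20, 100], [0.40, 0.20, 0.15, 0.14], l=1000)
    print(best, [round(b, 3) for b in bounds])
```

The two opposing trends in the inputs mirror the two curves of Figure 1: the empirical loss alone would favour the last cut-off, while the bound penalizes the growth in VC dimension and selects an earlier one.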
In cases where the VC dimension of a set of models is difficult to determine, the expected loss can be estimated using resampling techniques [3]. In the simplest of these approaches, the available set of data is randomly divided into training and validation sets. The training set is used first to select the best-fitting model for each cut-off point in the preference ordering. The validation set is then used to estimate the expected losses of the selected models by calculating their average empirical losses on the validation data. Finally, the model with the smallest upper bound for the expected loss on the validation data is chosen as the most suitable model.

Because only a finite number of models are evaluated on the validation set (models with continuous parameters imply an infinite set of models), it is very easy to obtain confidence bounds for the expected losses of these models independent of their exact forms and without having to worry about VC dimension [22]. In particular, the same equations for the confidence bounds are used as before, except that $E$ now has the value

$$E = \frac{2}{\ell_v}\ln N - \frac{2}{\ell_v}\ln \eta,$$

where $N$ is the number of models evaluated against the validation set and $\ell_v$ is the size of the validation set. Moreover, because the number $N$ of such models is typically small relative to the size $\ell_v$ of the validation set, one can obtain tight confidence regions for the expected losses of these models given their average empirical losses on the validation data. Since the same underlying principles are at work, this approach exhibits the same kind of relationship between the expected and average empirical losses as that shown in Figure 1.

Although this validation-set approach has an advantage in that it is relatively easy to obtain expected loss estimates, it has the disadvantage that dividing the available data into subsets decreases the overall accuracy of the resulting estimates. This decrease in accuracy is usually not much of a concern when data is plentiful. However, when the sample size is small, fitting models to all of the data and calculating the VC dimension for all relevant sets of models becomes more attractive.

5 Computational learning theory and PAC learning

The statistical theory of minimization of loss functions provides a general analysis of the conditions under which a class of models is learnable. The theory reduces the task of learning to that of solving a minimization problem.

The perfect complement to this theory would be an efficient algorithm for every class of models that finds the model minimizing the average empirical loss on the samples $z_1, \ldots, z_\ell$. Before even defining efficiency formally (we shall do so soon), we point out that such efficient algorithms are not known to exist. Furthermore, the widespread belief is that such algorithms will not exist for many classes of models. As we shall elaborate on presently, this turns out to be related to the famous question from computational complexity theory: is NP = P?

Given that the answer to this question is most probably negative, the next best hope would be to characterize the model classes for which efficient algorithms do exist. Unfortunately, such characterizations are also ruled out due to the inherent undecidability of such questions. In view of these barriers, it becomes clear that the question of whether a given model class allows for an efficient algorithm to solve the minimization problem has to be tackled on an individual basis.

The computational theory of learning, initiated by Valiant's work in 1984, is devoted to the analysis of these problems. We cover some of the salient results in this area in this brief survey. There are plenty of results that show how to solve such minimization problems for various classes of models. These show the diversity within the area of computational learning. We shall, however, focus on results that tend to unify the area. Thus most of this survey is focused on formulating the right definition for the computational setting and examining several parameters and attributes of the model.

5.1 Computational model of learning

The complexity of a computational task is the number of elementary steps (addition, subtraction, multiplication, division, comparison, etc.) it takes to perform the computation. This is studied as a function of the input and output size of the function to be computed. The well-entrenched and well-studied notion of efficiency is that of polynomial time: an algorithm is considered efficient if the number of elementary operations it performs is bounded by some fixed polynomial in the input and output sizes. The class of problems which can be solved by such efficient algorithms is denoted by P (for polynomial time). This shall be our notion of efficiency as well.

In order to study the computational complexity of the learning problem, we have to define the input and output sizes carefully. The input to the learning task is a collection of vectors $z_1, \ldots, z_\ell \in \mathbb{R}^n$, but $\ell$ itself may be thought of as a parameter to be chosen by the learning algorithm. Similarly, the output of the learning algorithm is again a representation of the model, the choice of which may be left unclear by the problem. The choice could easily allow an inefficient algorithm to pass as efficient, by picking an unnecessarily large number of samples or an unnecessarily verbose representation of the hypothesis. In order to circumvent such difficulties, one forces the running time of the algorithm to be polynomial in $n$ (the input size of a single sample) and the size of [...] with $\ell$, at least not directly. But the smallest $\ell$ required to guarantee good convergence grows [...] consistent with the data will be at least $d$. Thus indirectly this does allow the running time to be a polynomial in $\ell$.

[...] algorithm is that, with high probability (bounded away from 1 by a confidence parameter $\delta$), the learning algorithm produces a hypothesis whose prediction ability is very close (given by an accuracy parameter $\epsilon$) [...] allowed to be a polynomial in $1/\epsilon$ and $1/\delta$ as well.

The above discussion can now be formalized in the following definition, which is popularly [...] function $Q$ and a source of random vectors $z \in \mathbb{R}^n$ that follow some unknown distribution $F(z)$, a (generalized) PAC learning algorithm is one that takes two parameters $\epsilon$ (the accuracy parameter) and $\delta$ (the confidence parameter), reads $\ell$ random examples $z_1, \ldots, z_\ell$ as input, the choice of $\ell$ being decided by the algorithm, and outputs a model (hypothesis) $h(z_1, \ldots, z_\ell)$, possibly from a [...] such that

$$\Pr_F\Bigl\{ [z_1, \ldots, z_\ell] \in \mathbb{R}^{n\ell} : R(h(z_1, \ldots, z_\ell)) \le \inf_{\alpha} R(\alpha) + \epsilon \Bigr\} \ge 1 - \delta,$$

where $R(\alpha)$ is the same expected loss considered in statistical learning theory. The algorithm is said to be efficient if its running time is bounded by a polynomial in $n$, $1/\epsilon$, $1/\delta$ and the representation [...]

While the notion of generalized PAC learning (cf. [5]) is itself general enough to study any learning problem, in this survey we shall focus on the Boolean pattern-recognition problems typically examined in computational learning theory. Here the data vector $z$ is partitioned into a vector $x \in \{0,1\}^{n-1}$ and a bit $y \in \{0,1\}$ that is to be predicted. The model is given by a function $f : \{0,1\}^{n-1} \to \{0,1\}$, and the loss function $Q(z, \alpha)$ of a vector $z = [x, y]$ is 0 if $f(x) = y$ and 1 otherwise. Hence the accuracy parameter $\epsilon$ represents the maximum prediction error desired for the model.

5.2 Intractable learning problems

Henceforth we focus on problems for which $Q(z, \alpha)$ is computable efficiently (i.e., $f(x)$ is computable efficiently). [...] well-studied computational class NP. NP consists of problems that can be solved efficiently by an algorithm that is allowed to make nondeterministic choices. In the case of learning, the nondeterministic machine can nondeterministically guess the $\alpha$ that minimizes the loss, thus solving the problem easily. Of course, the idea of an algorithm that makes nondeterministic choices is merely a mathematical abstraction and is not efficiently realizable. The importance of the computational class NP comes from the fact that it captures many widely studied problems such as the traveling salesperson problem or the graph coloring problem. Even more important is the notion of NP-hardness: a problem is NP-hard if the existence of an efficient (polynomial-time) algorithm to solve it would imply a polynomial-time algorithm to solve every problem in NP. The famous question "Is NP = P?" asks exactly this question: do NP-hard problems have efficient algorithms to solve them?

It is easy to show that several PAC learning problems are NP-hard if the hypothesis class is restricted (to something fixed). A typical example is that of learning a pattern-recognition problem: [...] "3-term DNF". It can be shown that learning 3-term DNF formulae with 3-term DNF is NP-hard. Interestingly, however, it is possible to efficiently learn a broader class, "3CNF", which contains 3-term DNF. Thus this NP-hardness result is not pointing to any inherent computational bottleneck in the task of learning; it merely advocates a judicious choice of the hypothesis class to make the learning problem tractable.

It is harder to show that a class of problems is hard to learn independent of the representation of choice for the output. In order to show the hardness of such problems one needs to assume something stronger than NP $\neq$ P. A common assumption here is that there exist functions which are easy to compute, but hard to invert, even on randomly chosen instances. Such functions are common in cryptography, and in particular lie at the heart of well-known cryptosystems such as RSA. If this assumption is true, it implies that NP $\neq$ P. Under this assumption it is possible to show that pattern recognition problems where the pattern is generated by a deterministic finite automaton (or hidden Markov model) are hard to learn under some distributions on the space of the data vectors. Recent results also show that patterns generated by constant-depth Boolean circuits are hard to learn under the uniform distribution.

In summary, the negative results shed new light on two aspects of learning. Learning is easier, i.e., more tractable, when no restrictions are placed on the model used to describe the given data. Furthermore, the complexity of the learning process is definitely dependent on the underlying distribution according to which we wish to learn.
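To make the remark about 3CNF concrete, here is a minimal Python sketch of the standard elimination technique for learning a k-CNF formula from labelled Boolean examples. It is a textbook method from the computational learning literature (see, e.g., Kearns and Vazirani [9]), offered as an illustration and not as an algorithm spelled out in this paper. The hypothesis starts as the conjunction of all clauses with at most k literals, and every clause falsified by a positive example is deleted. The names `all_clauses`, `learn_kcnf` and `predict` are hypothetical, and `n` here counts the Boolean inputs x only.

```python
from itertools import combinations, product


def all_clauses(n, k=3):
    """All disjunctions of at most k literals over n Boolean variables.

    A clause is a tuple of (index, sign) pairs; sign True stands for x_i,
    sign False for NOT x_i. There are O((2n)^k) clauses, polynomial in n.
    """
    clauses = []
    for size in range(1, k + 1):
        for idxs in combinations(range(n), size):
            for signs in product([True, False], repeat=size):
                clauses.append(tuple(zip(idxs, signs)))
    return clauses


def clause_true(clause, x):
    """A clause is satisfied if at least one of its literals is satisfied."""
    return any(bool(x[i]) == sign for i, sign in clause)


def learn_kcnf(examples, n, k=3):
    """examples: iterable of (x, y), x a 0/1 tuple of length n, y in {0, 1}."""
    hypothesis = all_clauses(n, k)
    for x, y in examples:
        if y == 1:
            # Keep only the clauses that this positive example satisfies.
            hypothesis = [c for c in hypothesis if clause_true(c, x)]
    return hypothesis


def predict(hypothesis, x):
    """The hypothesis is the conjunction of all surviving clauses."""
    return int(all(clause_true(c, x) for c in hypothesis))
```

Because every 3-term DNF formula can be rewritten as a 3CNF formula (one literal from each term per clause), the clauses of the target are never deleted, so the learned hypothesis can only err by predicting 0 where the target is 1. The design choice is exactly the one described above: enlarging the hypothesis class from 3-term DNF to 3CNF is what makes a simple polynomial-time procedure possible.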

5.3 PAC learning algorithms

We now move to some lessons learnt from positive results in learning. The first of these focuses on the role of the parameters $\epsilon$ and $\delta$ in the definition of learning. As we will see, these are not very critical to the learning process. The second issue we will consider is the role of "classification noise" in learning, and we present an alternate model which shows more robustness towards such noise.

The strength of weak learning. Of the two fuzz parameters, $\epsilon$ and $\delta$, used in the definition of PAC learning, it seems clear that $\epsilon$ (the accuracy) is more significant than $\delta$ (the confidence), especially for pattern recognition problems. For such problems, given an algorithm which can learn a model with probability, say, 2/3 (or any confidence strictly greater than 1/2), it is easy to boost the confidence of getting a good hypothesis as follows. Pick a parameter $k$ and run the learning algorithm $k$ times, producing a new hypothesis each time. Denote these hypotheses by $h_1, \ldots, h_k$. Use for the new prediction the algorithm whose prediction on any vector $x$ is the majority vote of the predictions of $h_1, \ldots, h_k$. It is easy to show, by an application of the law of large numbers, that the majority vote is $\epsilon$-inaccurate with probability $1 - \exp(-ck)$ for some $c > 0$.

The accuracy parameter, on the other hand, does not appear to allow such simple boosting. It is unclear how one could use a learning algorithm which can learn to predict a model with inaccuracy 1/3 to get a new algorithm which can predict a model with inaccuracy 1%. However, if we are lucky enough to be able to find learning algorithms which learn to predict with inaccuracy 1/3, independent of the distribution from which the data vectors are picked, then we could use the same learning algorithm on the region where our earlier predictions are inaccurate to boost our accuracy. Of course, the problem is that we don't know where our earlier predictions were wrong (if we knew we would change our prediction!). Though it appears that this reasoning has led us back to square one, it turns out not to be the case. In 1990, Schapire showed how to turn this intuition into a boosting result for the accuracy parameter as well. This result demonstrates a surprising robustness of PAC learning: weak learning (with inaccuracy barely below 1/2) is equivalent to strong learning (with inaccuracy arbitrarily close to 0). However, we stress that this equivalence holds only when the weak learning algorithm works independently of the underlying data distribution.

Learning with noise. Most results in computational learning start by assuming that the data is observed with no prediction noise. This is not an assumption justified by reality. It is usually made to get a basic understanding of the problem. However, in order to make a computational learning result useful in practice, one must allow for noise. Numerous examples are known where an algorithm which learns without classification noise can be converted into one that can tolerate some amount of noise as well. However, this is not universally true. To understand why some algorithms are tolerant to errors while others are not, a model of learning called the statistical query model was proposed by Kearns in 1992. This model restricts a learning algorithm in the following way: instead of actually seeing data vectors $z$ as sampled from the space, the learning algorithm works with an oracle and gets to ask "statistical" questions about the data vectors. A typical statistical query asks for the probability that an event defined over the data space occurs for a vector chosen at random from the distribution under which we are attempting to learn. Further, the query is presented with a tolerance parameter $\tau$. The oracle responds with the probability of the event to within an additive error of $\tau$. It is easy to see how to simulate this oracle, given access to random samples of the data. Furthermore, it is easy to see how to simulate this oracle even when the data vectors come with some classification noise, but less than $\tau$. Thus learning with access only to a statistical query oracle is a sufficient condition for learning with classification noise. Almost all known algorithms that learn with classification noise can be shown to learn in the statistical query model. Thus this model provides a good standpoint from which to analyse the effectiveness of a potential learning strategy when attempting to learn in the presence of noise.

Alternate models for learning. This survey has focused on the PAC model since it is close to the spirit of data mining. However, a large body of work in computational learning focuses on models other than the PAC model. This body of work considers learning when one is allowed to ask questions about the data one is trying to learn. Consider for instance a handwriting recognition program which generates some patterns and asks the teacher to indicate what letter each pattern seems to resemble. It is conceivable that such learning programs may be more efficient than passive handwriting recognition programs. A class of learning algorithms that behave in this manner has been studied under the label of learning with queries. Other models for learning that have been studied capture scenarios of supervised learning and learning in an online setting.

5.4 Further reading

We have given a very informal sketch of the various new questions posed by studying the process of learning, or fitting models to given data, from the point of view of computation. Due to space limitations, we do not give a complete list of references to the sources of the results mentioned above. The interested reader is referred to the text on this subject by Kearns and Vazirani [9] for a detailed coverage of the topics above with complete references. Other surveys on this topic include those by Valiant [20] and Angluin [1]. Finally, a number of different lecture notes are now available online on this topic. This survey has in particular used those of Mansour [12], which include pointers to other useful home pages for tracking recent developments in computational learning and their applicability to practical scenarios.

6 Conclusions

The foregoing sections illustrate some differences of approach between classical statistics and data mining methods that originated in computer science and engineering. Table 1 summarizes what we regard as the principal issues in data analysis that would be considered by statisticians and data miners.

Table 1: Statisticians' and data miners' issues in data analysis.

Statisticians' issues      Data miners' issues
Model specification        Accuracy
Parameter estimation       Generalizability
Diagnostic checks          Model complexity
Model comparison           Computational complexity
Asymptotics                Speed of computation

In addition, the approaches of statistical learning theory and computational learning theory provide productive extensions of classical statistical inference. The inference procedures of classical statistics involve repeated sampling under a given statistical model; they allow for variation across data samples but not for the fact that in many cases the choice of model is dependent on the data. Statistical learning theory bases its inferences on repeated sampling from an unknown distribution of the data, and allows for the effect of model choice, at least within a prespecified class of models that could in practice be very large. The PAC-learning results from computational learning theory seek to identify modeling procedures that have a high probability of near-optimality over all possible distributions of the data. However, the majority of the results assume that the data are noise-free and that the target concept is deterministic. Even with these simplifications, useful positive results for near-optimal modeling are difficult to obtain, and for some modeling problems only negative results have been obtained.

To some extent, the differences between statistical and data-mining approaches to modeling and inference are related to the different kinds of problems on which these approaches have been used. For example, statisticians tend to work with relatively simple models for which issues of computational speed have rarely been a concern. Some of the differences, however, present opportunities for statisticians and data miners to learn from each other's approaches. Statisticians would do well to downplay the role of asymptotic accuracy estimates based on the assumption that the correct model has been identified, and instead give more attention to estimates of predictive accuracy obtained from data separate from those used to fit the model. Data miners can benefit by learning from statisticians' awareness of the problems caused by outliers and influential data values, and by making greater use of diagnostic statistics and plots to identify irregularities in the data and inadequacies in the model.

As noted earlier, statistical methods are particularly likely to be preferable when fairly simple models are adequate and the important variables can be identified before modeling. In problems with large data sets in which the relation between class and feature variables is complex and poorly understood, data mining methods offer a better chance of success. However, many practical problems fall between these extremes, and the variety of available models for data analysis, exemplified by those listed in Section 2.2, offers no sharp distinction between statistical and data mining methods. No single method is likely to be obviously best for a given problem, and use of a combination of approaches offers the best chance of making secure inferences. For example, a rule-based classifier might use additional feature variables formed from linear combinations of features computed implicitly by logistic discriminant or a neural-network classifier. Inferences from several distinct families of models can be combined, either by weighting the models' predictions or by an additional stage of modeling in which predictions from different models are themselves used as input features ("stacked generalization" [28]). The overall conclusion is that statisticians and data miners can profit by studying each other's methods and using a judiciously chosen combination of them.

Acknowledgements

We are happy to acknowledge helpful discussions with several participants at the Workshop on Data Mining and its Applications, Institute of Mathematics and its Applications, Minneapolis, November 1996 (J.H.), many conversations with Vladimir Vapnik (E.P.), and comments and pointers from Yishay Mansour, Dana Ron and Ronitt Rubinfeld (M.S.).

References

[1] Angluin, D. (1992). Computational learning theory: survey and selected bibliography. In Proceedings of the Twenty-Fourth Annual Symposium on Theory of Computing, 351-369. ACM.
[2] Cox, D. R., and Hinkley, D. V. (1986). Theoretical statistics. London: Chapman and Hall.
[3] Efron, B. (1981). The jackknife, the bootstrap, and other resampling plans, CBMS Monograph 38. Philadelphia, Pa.: SIAM.
[4] Hand, D. J. (1981). Discrimination and classification. Chichester, U.K.: Wiley.
[5] Haussler, D. (1990). Decision theoretic generalizations of the PAC learning model. In Algorithmic Learning Theory, eds. S. Arikawa, S. Goto, S. Ohsuga, and T. Yokomori, pp. 21-41. New York: Springer-Verlag.
[6] Huber, P. J. (1981). Robust statistics. New York: Wiley.
[7] John, G., Kohavi, R., and Pfleger, K. (1994). Irrelevant features and the subset selection problem. In Machine Learning: Proceedings of the Eleventh International Conference, pp. 121-129. San Mateo, Calif.: Morgan Kaufmann.
[8] Journel, A. G., and Huijbregts, C. J. (1978). Mining geostatistics. London: Academic Press.
[9] Kearns, M. J., and Vazirani, U. V. (1994). An introduction to computational learning theory. Cambridge, Mass.: MIT Press.
[10] Kononenko, I., and Hong, S. J. (1997). Attribute selection for modeling. Future Generation Computer Systems, this issue.
[11] Lovell, M. C. (1983). Data mining. Review of Economics and Statistics, 65, 1-12.
[12] Mansour, Y. Lecture notes on learning theory. Available from http://
[13] Michie, D., Spiegelhalter, D. J., and Taylor, C. C. (eds.) (1994). Machine learning, neural and statistical classification. Hemel Hempstead, U.K.: Ellis Horwood.
[14] Miller, A. J. (1983). Contribution to the discussion of "Regression, prediction and shrinkage" by J. B. Copas. Journal of the Royal Statistical Society, Series B, 45, 346-347.
[15] Pearl, J. (1995). Causal diagrams for empirical research. Biometrika, 82, 669-710.
[16] Ripley, B. D. (1994). Comment on "Neural networks: a review from a statistical perspective" by B. Cheng and D. M. Titterington. Statistical Science, 9, 45-48.
[17] Sakamoto, Y., Ishiguro, M., and Kitagawa, G. (1986). Akaike information criterion statistics. Dordrecht, Holland: Reidel.
[18] Tukey, J. W. (1977). Exploratory data analysis. Reading, Mass.: Addison-Wesley.
[19] Vach, W., Rossner, R., and Schumacher, M. (1996). Neural networks and logistic regression: part II. Computational Statistics and Data Analysis, 21, 683-701.
[20] Valiant, L. (1991). A view of computational learning theory. In Computation and Cognition: Proceedings of the First NEC Research Symposium, 32-51. Philadelphia, Pa.: SIAM.
[21] Vapnik, V. N. (1982). Estimation of dependencies based on empirical data. New York: Springer-Verlag.
[22] Vapnik, V. N. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
[23] Vapnik, V. N. (to appear, 1997). Statistical learning theory. New York: Wiley.
[24] Vapnik, V. N., and Chervonenkis, A. Ja. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16, 264-280. Originally published in Doklady Akademii Nauk USSR, 181 (1968).
[25] Vapnik, V. N., and Chervonenkis, A. Ja. (1981). Necessary and sufficient conditions for the uniform convergence of means to their expectations. Theory of Probability and its Applications, 26, 532-553.
[26] Vapnik, V. N., and Chervonenkis, A. Ja. (1991). The necessary and sufficient conditions for consistency of the method of empirical risk minimization. Pattern Recognition and Image Analysis, 1, 284-305. Originally published in Yearbook of the Academy of Sciences of the USSR on Recognition, Classification, and Forecasting, 2 (1989).
[27] Weisberg, S. (1985). Applied regression analysis, 2nd edn. New York: Wiley.
[28] Wolpert, D. (1992). Stacked generalization. Neural Networks, 5, 241-


More information

Recent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago

Recent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago Recent Developments of Statistical Application in Finance Ruey S. Tsay Graduate School of Business The University of Chicago Guanghua Conference, June 2004 Summary Focus on two parts: Applications in Finance:

More information

Nutrition and Biochemistry. Pr. Max. 100. Th. Max. 25. Sign. of HOD

Nutrition and Biochemistry. Pr. Max. 100. Th. Max. 25. Sign. of HOD FINAL RESULT OF INTERNAL ASSESSMENT: -May-June /Nov-Dec Examination 20. For Academic Year 2007-08 onwards. Faculty : First Basic B.Sc. () Name of the College : Phone No : Name of s Seat Anatomy & Physiology

More information

How To Calculate The Power Of A Cluster In Erlang (Orchestra)

How To Calculate The Power Of A Cluster In Erlang (Orchestra) Network Traffic Distribution Derek McAvoy Wireless Technology Strategy Architect March 5, 21 Data Growth is Exponential 2.5 x 18 98% 2 95% Traffic 1.5 1 9% 75% 5%.5 Data Traffic Feb 29 25% 1% 5% 2% 5 1

More information

270107 - MD - Data Mining

270107 - MD - Data Mining Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 015 70 - FIB - Barcelona School of Informatics 715 - EIO - Department of Statistics and Operations Research 73 - CS - Department of

More information

Chapter 4: Statistical Hypothesis Testing

Chapter 4: Statistical Hypothesis Testing Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin

More information

Expectations and Future Direction of MOP Guidelines Matthew Newton, Principal Officer Rehabilitation Standards Division of Resources & Energy

Expectations and Future Direction of MOP Guidelines Matthew Newton, Principal Officer Rehabilitation Standards Division of Resources & Energy Expectations and Future Direction of MOP Guidelines Matthew Newton, Principal Officer Rehabilitation Standards Division of Resources & Energy Mine Rehab Conference 2014 Best Practice Ecological Rehabilitation

More information
