Improving Rooftop Detection in Aerial Images Through Machine Learning

†Institute for the Study of Learning and Expertise, 2164 Staunton Court, Palo Alto, CA 94306
‡Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, CA 94305

Abstract

In this paper, we examine the use of machine learning to improve a rooftop detection process, which is one step in a vision system that recognizes buildings in overhead imagery. We review the problem of analyzing aerial images and describe an existing vision system that automates the recognition of buildings in such images. After this, we briefly review two well-known learning algorithms, representing different inductive biases, that we selected to improve rooftop detection. An important aspect of this problem is that the data sets are highly skewed and the cost of mistakes differs for the two classes, so we evaluate the algorithms under varying misclassification costs using ROC analysis. We report three sets of experiments designed to illuminate facets of applying machine learning to the image analysis task. One set of studies focuses on within-image learning, in which both training and testing data are derived from the same image. Another addresses between-image learning, in which training and testing sets come from different images. A final set investigates learning using all available image data in an effort to determine the best performing method. Experimental results demonstrate that useful generalization occurs when training and testing on data derived from images that differ in location and in aspect. Furthermore, they demonstrate that, under most conditions and across a range of misclassification costs, a trained naive Bayesian classifier exceeded, by as much as a factor of two, the predictive accuracy of nearest neighbor and of a handcrafted linear classifier, the solution currently being used in the building detection system. Analysis of learning curves reveals that naive Bayes achieved superiority using as little as 6% of the available training data.

1. Introduction

The number of images available to image analysts is growing rapidly, and will soon outpace their ability to process them. Computational aids will be required to filter this flood of images and focus the analyst's attention on interesting events, but current image understanding systems are not yet robust enough to support this process. Successful image understanding relies on knowledge, and despite theoretical progress, implemented vision systems still rely on heuristic methods and consequently remain fragile. Handcrafted knowledge about when and how to use particular vision operations can give acceptable results on some images but not others.

In this paper, we explore the use of machine learning as a means for improving knowledge used in the vision process, and thus for producing more robust software. Recent applications of machine learning in business and industry (Langley & Simon 1995) hold useful lessons for applications in image analysis. A key idea in applied machine learning involves building an advisory system that recommends actions but gives final control to a human user, with each decision generating a training case, gathered in an unobtrusive way, for use in learning. This setting for knowledge acquisition is similar to the scenario in which an image analyst interacts with a vision system, finding some system analyses acceptable and others uninteresting or in error. The aim of our research program is to embed machine learning into this interactive process of image analysis.

This adaptive approach to computer vision promises to greatly reduce the number of decisions that image analysts must make per picture, thus improving their ability to deal with a high flow of images. Moreover, the resulting systems should adapt their knowledge to the preferences of individuals in response to feedback from those users. The overall effect should be a new class of systems for image analysis that reduce the workload on human analysts and give them more reliable results, thus speeding the image analysis process.
In the sections that follow, we report progress on using machine learning to improve decision making at one stage in an existing image understanding system. We begin by explaining the task domain, identifying buildings in aerial photographs, and then describe the vision system designed for this task. Next, we review two well-known algorithms for supervised learning that hold potential for improving the reliability of image analysis in this domain. After this, we report the design of experiments to evaluate these methods and the results of those studies. In closing, we discuss related and future work.

2. Nature of the Image Analysis Task

The image analyst interprets aerial images of ground sites with an eye to unusual activity or other interesting behavior. The images under scrutiny are usually complex, involving many objects arranged in a variety of patterns. Overhead images of Fort Hood, Texas, collected as part of the RADIUS project (Firschein & Strat 1997), are typical of a military base and include buildings in a range of sizes and shapes, major and minor roadways, sidewalks, parking lots, vehicles, and vegetation. A common task faced by the image analyst is to detect change at a site as reflected in differences between two images, as in the number of buildings, roads, and vehicles. This in turn requires the ability to recognize examples from each class of interest. In this paper, we focus on the performance task of identifying buildings in satellite photographs.

Aerial images can vary across a number of dimensions. The most obvious factors concern viewing parameters, such as distance from the site (which affects size and resolution) and viewing angle (which affects perspective and visible surfaces). But other variables also influence the nature of the image, including the time of day (which affects contrast and shadows), the time of year (which affects foliage), and the site itself (which determines the shapes of viewed objects). Taken together, these factors introduce considerable variability into the images that confront the analyst.

In turn, this variability can significantly complicate the task of recognizing object classes. Although a building or vehicle will appear different from alternative perspectives and distances, the effects of such transformations are reasonably well understood. But variations due to time of day, the season, and the site are more serious. Shadows and foliage can hide edges and obscure surfaces, and buildings at distinct sites may have quite different structures and layouts. Such variations serve as mere distractions to the human image analyst, yet they provide serious challenges to existing computer vision systems.

This suggests a natural task for machine learning: given aerial images as training data, acquire knowledge that improves the reliability of such an image analysis system. However, we cannot study this task in the abstract. We must explore the effect of specific induction algorithms on particular vision software. In the next two sections, we briefly review one such system for image analysis and two learning methods that might give it more robust behavior.

3. An Architecture for Image Analysis

Lin and Nevatia (1996) report a computer vision package, called the Buildings Detection and Description System (Budds), for the analysis of ground sites in aerial images. Like many programs for image understanding, their system operates in a series of processing stages. Each step involves aggregating lower level features into higher level ones, eventually reaching hypotheses about the locations and descriptions of buildings. We will consider these stages in the order that they occur.
Starting at the pixel level, Budds uses an edge detector to group pixels into edgels, and then invokes a line finder to group edgels into lines. Junctions and parallel lines are identified and combined to form three-sided structures or "Us". The algorithm then groups selected Us and junctions to form parallelograms. Each such parallelogram constitutes a hypothesis about the position and orientation of the roof for some building, so we may call this step rooftop generation.

After the system has completed the above aggregation process, a rooftop selection stage evaluates each rooftop candidate to determine whether it has sufficient evidence to be retained. The aim of this process is to remove candidates that do not correspond to actual buildings. Ideally, the system will reject most spurious candidates at this point, although a final verification step may still collapse duplicate or overlapping rooftops. This stage may also exclude candidates if there is no evidence of three-dimensional structure, such as shadows and walls.

Analysis of the system's operation suggested that rooftop selection held the most promise for improvement through machine learning, because this stage must deal with many spurious rooftop candidates. This process takes into account both local and global criteria. Local support comes from features such as lines and corners that are close to a given parallelogram. Since these suggest walls and shadows, they provide evidence that the candidate corresponds to an actual building.

Global criteria consider containment, overlap, and duplication of candidates. Using these evaluation criteria, the set of rooftop candidates is reduced to a more manageable size. The individual constraints applied in this process have a solid foundation in both theory and practice.

The problem is that we have only heuristic knowledge about how to combine these constraints. Moreover, such rules of thumb are currently crafted by hand, and they do not fare well on images that vary in their global characteristics, such as contrast and amount of shadow. However, methods from machine learning, to which we now turn, may be able to induce better conditions for selecting or rejecting candidate rooftops. If these acquired heuristics are more accurate than the existing handcrafted solutions, they will improve the reliability of the rooftop selection process.

4. A Review of Three Learning Techniques

We can formulate the task of acquiring rooftop selection heuristics in terms of supervised learning. In this process, training cases of some concept are labeled as to their class. In rooftop selection, only two classes exist, rooftop and non-rooftop, which we will refer to as positive and negative examples of the concept "rooftop". Each instance consists of a number of attributes and their associated values, along with a class label. These labeled instances constitute training data that are provided as input to an inductive learning routine, which generates concept descriptions designed to distinguish the positive examples from the negative ones. These knowledge structures state the conditions under which the concept, in this case "rooftop", is satisfied.
In a previous study (Maloof et al. 1997), we evaluated a variety of machine learning methods for the rooftop detection task and selected the two that showed promise of achieving a balance between the true positive and false positive rates: nearest neighbor and naive Bayes. These methods use different representations, performance schemes, and learning mechanisms for supervised concept learning, and exhibit different inductive biases, meaning that each algorithm acquires certain concepts more easily than others.

The nearest-neighbor method (e.g., Aha, Kibler, & Albert 1991) uses an instance-based representation of knowledge that simply retains training cases in memory. This approach classifies new instances by finding the "nearest" stored case, as measured by some distance metric, then predicting the class associated with that case. For numeric attributes, a common metric (which we use in our studies) is Euclidean distance. In this framework, learning involves nothing more than storing each training instance, along with its associated class. Although this method is quite simple and has known sensitivity to irrelevant attributes, in practice it performs well in many domains. Some versions select the k closest cases and predict the majority class; here we will focus on the "simple" nearest neighbor scheme, which uses only the nearest case for prediction.

The naive Bayesian classifier (e.g., Langley, Iba, & Thompson 1992) stores a probabilistic concept description for each class. This description includes an estimate of the class probability and the estimated conditional probabilities of each attribute value given the class. The method classifies new instances by computing the posterior probability of each class using Bayes' rule, combining the stored probabilities by assuming that the attributes are independent given the class, and predicting the class with the highest posterior probability. Like nearest neighbor, naive Bayes has known limitations, such as sensitivity to attribute correlations and an inability to represent multiple decision regions, but in practice it behaves well on many natural domains.

Figure 1. Visualization interface for labeling rooftop candidates. The system presents candidates to a user who labels them by clicking either the `roof' or `non-roof' button. It also incorporates a simple learning algorithm to provide feedback to the user about the statistical properties of a candidate, based on previously labeled examples.

Currently, Budds uses a handcrafted linear classifier for rooftop detection (Lin & Nevatia 1996), which is equivalent to a perceptron classifier (e.g., Zurada 1992). Although we did not train this method as we did naive Bayes and nearest neighbor, we included it in our evaluation for the purpose of comparison. This method represents concepts using a collection of weights w and a threshold θ. To classify an instance, which we represent as a vector of n numbers x, we compute the output o of the classifier using the formula:

    o = +1 if Σ_{i=1}^{n} w_i x_i > θ, and o = -1 otherwise.

For our application, the classifier predicts the positive class if the output is +1 and predicts the negative class otherwise. There are a number of established methods for training perceptrons, but our preliminary studies suggested that they fared worse than the manually set weights, so we did not use the learned perceptrons here. Henceforth, we will refer to the handcrafted linear classifier used in Budds as the "Budds classifier".
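To make the two learned methods, nearest neighbor and naive Bayes, concrete, the following sketch shows both in Python. It is a minimal illustration, not the implementation used in the study; in particular, the per-attribute Gaussian density estimates and all names are our assumptions, since the text does not specify how naive Bayes models the continuous features.

```python
import math
from collections import defaultdict

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def nearest_neighbor_predict(cases, query):
    """Simple 1-NN: predict the class of the stored case closest to the query."""
    _, label = min(cases, key=lambda case: euclidean(case[0], query))
    return label

class GaussianNaiveBayes:
    """Naive Bayes for continuous attributes: one class prior plus one
    (mean, variance) estimate per attribute and class (the Gaussian
    assumption is ours)."""

    def fit(self, cases):
        by_class = defaultdict(list)
        for x, label in cases:
            by_class[label].append(x)
        total = sum(len(rows) for rows in by_class.values())
        self.priors = {c: len(rows) / total for c, rows in by_class.items()}
        self.stats = {}
        for c, rows in by_class.items():
            per_attr = []
            for col in zip(*rows):
                mu = sum(col) / len(col)
                var = sum((v - mu) ** 2 for v in col) / len(col)
                per_attr.append((mu, max(var, 1e-9)))  # floor avoids zero variance
            self.stats[c] = per_attr
        return self

    def posterior(self, x):
        """Class posteriors via Bayes' rule, assuming the attributes are
        independent given the class; normalized before returning."""
        scores = {}
        for c, prior in self.priors.items():
            p = prior
            for v, (mu, var) in zip(x, self.stats[c]):
                p *= math.exp(-(v - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
            scores[c] = p
        norm = sum(scores.values()) or 1.0
        return {c: s / norm for c, s in scores.items()}

    def predict(self, x):
        post = self.posterior(x)
        return max(post, key=post.get)
```

Both classifiers take training data as a list of (attribute vector, label) pairs; for the rooftop task each vector would hold the nine Budds features described in Section 5.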

Table 1. Characteristics of the images and data sets. We began with a nadir and an oblique image of an area of Fort Hood, Texas, and derived three subimages from each that contained concentrations of buildings. We then used Budds to extract rooftop candidates and labeled each as either a positive or negative example of the concept "rooftop".

5. Generating, Representing, and Labeling Rooftop Candidates

We were interested in how well the various induction algorithms could learn to classify rooftop candidates in aerial images. This required three things: a set of images that contain buildings, some means to generate and represent plausible rooftops, and labels for each such candidate.

As our first step, we selected two images, FHOV1027 and FHOV625, of Fort Hood, Texas, which were collected as part of the RADIUS program (Firschein & Strat 1997). These images cover the same area but were taken from different viewpoints, one from a nadir angle and the other from an oblique angle. We subdivided each image into three subimages, focusing on locations that contained concentrations of buildings, to maximize the number of positive rooftop candidates. This gave us three pairs of images, each pair covering the same area but viewed from different aspects.

Our aim was to improve Budds, so we used this system to generate candidate rooftops for each image, producing six data sets. Following Lin and Nevatia (1996), the data sets described each rooftop candidate in terms of nine continuous features that summarize the evidence gathered from the various levels of analysis. For example, positive indications for the existence of a rooftop included evidence for edges and corners, the degree to which a candidate's opposing lines are parallel, support for the existence of orthogonal trihedral vertices, and shadows near the corners of the candidate. Negative evidence included the existence of lines that cross the candidate, L-junctions adjacent to the candidate, similarly adjacent T-junctions, gaps in the candidate's edges, and the degree to which enclosing lines failed to form a parallelogram.

We should note that induction algorithms are often sensitive to the features one uses to describe the data, and we make no claims that these nine attributes are the best ones for recognizing rooftops in aerial images. However, because our aim was to improve the robustness of Budds, we needed to use the same features as Lin and Nevatia's handcrafted classifier. Moreover, it seemed unlikely that we could devise better features than the system's authors had developed during years of research.

The third problem, labeling the generated rooftop candidates, proved the most challenging and the most interesting. Budds itself classifies each candidate, but since we were trying to improve on its ability, we could not use those labels. Thus, we tried an approach in which an expert specified the vertices of actual rooftops in the image, then we automatically labeled candidates as positive or negative depending on the distance of their vertices from the nearest actual rooftop's corners. We also tried a second scheme that used the number of candidate vertices that fell within a region surrounding the actual rooftop. Unfortunately, upon inspection, neither approach gave us satisfactory labeling results.

Analysis revealed the difficulties with using such relations to actual rooftops in the labeling process. One is that they ignore information about the candidate's shape; a good rooftop should be a parallelogram, yet nearness of vertices is neither sufficient nor necessary for this form. A second drawback is that they ignore other information contained in the nine Budds attributes, such as shadows and crossing lines. The basic problem is that such methods deal only with the two-dimensional space that describes location within the image, rather than the nine-dimensional space that we want the vision system to use in classifying a candidate.

Reluctantly, we concluded that manual labeling by a human was necessary, but this task was daunting, as each image produced thousands of candidate rooftops. To support the process, we implemented an interactive labeling system in Java, shown in Figure 1, that successively displays each extracted rooftop to the user. The system draws each candidate over the portion of the image from which it was extracted, then lets the user click buttons for `roof' or `non-roof' to label the example.
The visual interface itself incorporates a simple learning mechanism, nearest neighbor, designed to improve the labeling process. As the system obtains feedback from the user about positive and negative examples, it divides unlabeled candidates into three classes: likely rooftops, unlikely rooftops, and unknown. The interface displays likely rooftops using green rectangles, unlikely rooftops as red rectangles, and unknown candidates as blue rectangles. The system includes a sensitivity parameter¹ that affects how certain the system must be before it proposes a label. After displaying a rooftop, the user either confirms or contradicts the system's prediction by clicking either the `roof' or `non-roof' button. The simple learning mechanism then uses this information to improve subsequent predictions of candidate labels.

Our intent was that, as the interface gained experience with the user's labels, it would display fewer and fewer candidates about which it was uncertain, and thus speed up the later stages of interaction. Informal studies suggested that the system achieves this aim: by the end of the labeling session, the user typically confirms nearly all of the interface's recommendations. However, because we were concerned that our use of nearest neighbor might bias the labeling process in favor of this algorithm during later studies, we generated the data used in Section 7 by setting the sensitivity parameter so that the system presented all candidates as uncertain. Even handicapped in this manner, the interface required only about five hours to label the 17,829 roof candidates extracted from the six images. This comes to about one second per candidate, which still seems quite efficient.

In summary, what began as the simple task of labeling visual data led us to some of the more fascinating issues in our work. To incorporate supervised concept learning into vision systems, which can generate thousands of candidates per image, we must develop methods to reduce the burden of labeling these data. In future work, we intend to measure more carefully the ability of our adaptive labeling system to speed this process. We also plan to explore extensions that use the learned classifier to order candidate rooftops (showing the least certain ones first) and even to filter candidates before they are passed on to the user (automatically labeling the most confident ones).

1. The user can set this parameter using the slider bar and number field in the bottom right corner of Figure 1.
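The triage performed by the interface can be sketched as follows. This is a hypothetical reconstruction under our own assumptions: the text says only that a nearest-neighbor mechanism divides candidates into likely, unlikely, and unknown, and that a sensitivity parameter controls how certain it must be before proposing a label; the margin rule and all names below are ours.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def triage(candidate, labeled, sensitivity):
    """Assign an unlabeled candidate to 'likely', 'unlikely', or 'unknown'
    (green, red, or blue in the interface) from its nearest labeled
    neighbor of each class. The margin-based rule is an assumption."""
    d_roof = min(euclidean(candidate, x) for x, lab in labeled if lab == 'roof')
    d_non = min(euclidean(candidate, x) for x, lab in labeled if lab == 'non-roof')
    margin = abs(d_roof - d_non) / ((d_roof + d_non) or 1.0)
    if margin < sensitivity:      # raising sensitivity yields more 'unknown' labels
        return 'unknown'
    return 'likely' if d_roof < d_non else 'unlikely'
```

Under this reading, setting the sensitivity near its maximum would present essentially every candidate as uncertain, which matches how the data for Section 7 were generated.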

Techniques such as selective sampling (e.g., Freund et al. 1997) and uncertainty sampling (Lewis & Catlett 1994) should prove useful toward these ends.

6. Cost-Sensitive Learning and Skewed Data

Two aspects of the rooftop selection task influenced our approach to implementation and evaluation. First, Budds works in a bottom-up manner, so if the system discards a rooftop, it cannot retrieve it later. Consequently, errors on the rooftop class (false negatives) are more expensive than errors on the non-rooftop class (false positives), so it is better to retain a false positive than to discard a false negative. The system has the potential for discarding false positives in later stages of processing, when it can draw upon accumulated evidence, such as the existence of walls and shadows. However, since false negatives cannot be recovered, we need to minimize errors on the rooftop class.

Second, we have a severely skewed data set, with training examples distributed non-uniformly across classes (781 rooftops vs. 17,048 non-rooftops). Given such skewed data, most induction algorithms have difficulty learning to predict the minority class. Moreover, we have established that errors on our minority class (rooftops) are most expensive, and the extreme skew only increases such errors. This interaction between skewed class distribution and unequal error costs occurs in many computer vision applications, in which a vision system generates thousands of candidates but only a handful correspond to objects of interest. It also holds in many other applications of machine learning, such as fraud detection (Fawcett & Provost 1997), discourse analysis (Soderland & Lehnert 1994), and telecommunications risk management (Ezawa, Singh, & Norton 1996).
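A quick calculation with the class counts above shows why this skew makes overall accuracy a misleading yardstick:

```python
positives, negatives = 781, 17_048      # rooftop vs. non-rooftop candidates
total = positives + negatives           # the 17,829 labeled candidates

# A degenerate classifier that always predicts "non-rooftop" is right on
# every negative and wrong on every positive...
majority_accuracy = negatives / total   # about 0.956

# ...so it scores roughly 95% overall while detecting zero rooftops:
true_positive_rate = 0 / positives      # 0.0
```

This arithmetic is behind the 95 percent hit rate quoted in the next subsection, and it motivates measuring the two classes separately.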
These issues raise two challenges. First, they suggest the need for modified learning algorithms that can achieve high accuracy on the minority class. Second, they require an experimental methodology that lets us compare different methods on domains like rooftop detection, in which the classes are skewed and errors have different costs. In the remainder of this section, we further clarify the nature of the problem, after which we propose some cost-sensitive learning methods and an approach to experimental evaluation.

6.1 Favoritism Toward the Majority Class

In a previous study (Maloof et al. 1997), we evaluated several algorithms without taking into account the cost of classification errors and got confusing experimental results. Some methods, like the standard error-driven algorithm for revising perceptron weights (e.g., Zurada 1992), learned to always predict the majority class. The naive Bayesian classifier found a more comfortable trade-off between the true positive and false positive rates, but still favored the majority class.² For data sets that are skewed, an inductive method that learns to predict the majority class will often have a higher overall accuracy than a method that finds a balance between true positive and false positive rates. Indeed, always predicting the majority class for our problem yields a hit rate of 95 percent, which makes it a misleading measure of performance.

2. Covering algorithms, like AQ15 (Michalski et al. 1986) or CN2 (Clark & Niblett 1989), may be less susceptible to skewed data sets, but this is highly dependent on their rule selection criteria.

This bias toward the majority class only causes difficulty when we care more about errors on the minority class. For the rooftop domain, if the error costs for the two classes were the same, then we would not care on which class we made errors, provided we minimized the total number of mistakes. Nor would there be any problem if mistakes on the majority class were more expensive, since most learning methods are biased toward minimizing such errors anyway. On the other hand, if the class distribution runs counter to the relative cost of mistakes, as in our domain, then we must do something to compensate, both in the learning algorithm itself and in measuring its performance.

Breiman et al. (1984) note the close relation between the distribution of classes and the relative cost of errors. In particular, they point out that one can mitigate the bias against the minority class by duplicating examples of that class in the training data. This also helps explain why most induction methods give more weight to accuracy on the majority class, since skewed training data implicitly places more weight on errors for that class. In response, several researchers have explored approaches that alter the distribution of training data in various ways, including use of weights to bias the performance element (Cardie & Howe 1997), removing unimportant examples from the majority class (Kubat & Matwin 1997), and `boosting' the examples in the under-represented class (Freund & Schapire 1996). However, as we will see shortly, one can also modify the algorithms themselves to more directly respond to error costs.

6.2 Cost-Sensitive Learning Methods

Empirical comparisons among machine learning algorithms seldom focus on the cost of classification errors, possibly because most learning methods do not provide ways to take such costs into account. Happily, some researchers have explored variations on standard algorithms that effectively bias the method in favor of one class over others. For example, Lewis and Catlett (1994) introduced a loss ratio into C4.5 (Quinlan 1993) to bias it toward under-represented classes. Pazzani et al. (1994) have also done some preliminary work along these lines, which they describe as addressing the costs of different error types. Their method finds the minimum-cost classifier for a variety of problems using a set of hypothetical error costs. Turney (1995) presents results from an empirical evaluation of algorithms that take into account both the cost of tests to measure attributes and the cost of classification error.

When implementing cost-sensitive learning methods, the basic idea is to change the way the algorithm treats instances from the more expensive class relative to the other instances, either during the learning process or at the time of testing. In essence, we want to incorporate a cost heuristic into the algorithms so we can bias them toward making mistakes on the less costly class rather than on the more expensive class.

To accomplish this, we defined a cost for each class on the range [0.0, 1.0] that indicates the relative cost of making a mistake on one class versus another. Zero indicates that errors cost nothing, whereas one means that errors are maximally expensive. To incorporate a cost heuristic into the algorithms, we chose to modify the performance element of the algorithms, rather than the learning element, by using the cost heuristic to adjust the decision boundary at which the algorithm selects one class versus the other.

Recall that naive Bayes predicts the class with the highest posterior probability as computed using Bayes' rule, so we want the cost heuristic to bias prediction in favor of the more expensive class. For a cost parameter c_j ∈ [0.0, 1.0], we computed the expected cost λ_j for the class ω_j using the formula:

    λ_j = P(ω_j | x) + c_j (1 - P(ω_j | x)),

where x is the query, and P(ω_j | x) is the posterior probability of the jth class given the query. The cost-sensitive version of naive Bayes predicts the class ω_j with the least expected cost λ_j.

Nearest neighbor, as normally used, predicts the class of the example that is closest to the query. Therefore, the cost heuristic should have the effect of moving the query point closer to the closest example of the more expensive class. The magnitude of this change should be proportional to the magnitude of the cost parameter. Therefore, we computed the expected cost λ_j for the class ω_j using the formula:

    λ_j = d_E(x, x_j) - c_j d_E(x, x_j),

where x_j is the closest neighbor from class ω_j to the query point, and d_E(x, y) is the Euclidean distance function. The cost-sensitive version of nearest neighbor predicts the class with the least expected cost. This modification also works for k nearest neighbor, which considers the k closest neighbors when classifying unknown instances.

Finally, because our modifications focused on the performance elements rather than on the learning algorithms, we can make similar changes to the Budds classifier. Since this classifier uses a linear discriminant function, we want the cost heuristic to adjust the threshold so the hyperplane of discrimination is farther from the hypothetical region of examples of the more expensive class, thus enlarging the decision region of that class. The degree to which the algorithm adjusts the threshold is again dependent on the magnitude of the cost parameter. The adjusted threshold θ' is computed by:

    θ' = θ - Σ_{j=1}^{2} sgn(ω_j) c_j λ_j,

where θ is the original threshold for the linear discriminant function, sgn(ω_j) returns positive for the positive class and negative for the negative class, and λ_j is the maximum value the weighted sum can take for the jth class. The cost-sensitive version of the Budds classifier predicts the positive class if the weighted sum of an instance's attributes surpasses the adjusted threshold θ'; otherwise, it predicts the negative class.
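As a concrete illustration of these performance-element modifications, here is a minimal sketch of the cost-sensitive nearest-neighbor rule just described (the function and variable names are ours, not the study's implementation):

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cost_sensitive_nn(cases, query, costs):
    """For each class j, find the distance d to its closest stored case and
    compute the expected cost lambda_j = d - c_j * d, i.e. (1 - c_j) * d;
    predict the class with the least expected cost. A cost c_j near 1 makes
    class j appear arbitrarily close, biasing prediction toward the class
    whose errors are more expensive."""
    expected = {}
    for label, c_j in costs.items():    # costs maps class -> c_j in [0.0, 1.0]
        d = min(euclidean(x, query) for x, lab in cases if lab == label)
        expected[label] = d - c_j * d
    return min(expected, key=expected.get)
```

Because the adjustment touches only the decision rule, the stored cases themselves are untouched, and the same idea carries over to k nearest neighbor.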
6.3 ROC Analysis for Evaluating Performance

Our second challenge was to identify an experimental methodology that would let us compare the behavior of our cost-sensitive learning methods on the rooftop data. We have already seen that comparisons based on overall accuracy are not sufficient for domains that involve non-uniform costs or skewed distributions. Rather, we must separately measure accuracy on both classes, in terms of false positives and false negatives. Given information about the relative costs of errors, say from conversations with domain experts or from a domain analysis, we could then compute a weighted accuracy for each algorithm that takes cost into account (e.g., Pazzani et al. 1994; Fawcett & Provost 1997).

However, in this case, we had no access to image analysts or enough information about the results of their interpretations to determine the actual costs for the domain. In such situations, rather than aiming for a single performance measure, as typically done in machine learning experiments, a natural solution is to evaluate each learning method over a range of cost settings. ROC (Receiver Operating Characteristic) analysis (Swets 1988) provides a framework for carrying out such comparisons. The basic idea is to systematically vary some aspect of the situation, such as the cost ratio or the class distribution, and to plot the true positive rate against the false positive rate for each situation. Although researchers have used such ROC curves in signal detection and psychophysics for decades (e.g., Green & Swets 1974; Egan 1975), this technique has only recently begun to filter into machine learning research (e.g., Ezawa, Singh, & Norton 1996; Maloof et al. 1997; Provost & Fawcett 1997).

Figure 2. An idealized Receiver Operating Characteristic (ROC) curve.

Figure 2 shows an idealized ROC curve generated by varying the cost parameter of a cost-sensitive learning algorithm. The lower left corner of the figure represents the situation in which mistakes on the negative class are maximally expensive (i.e., c+ = 0.0 and c- = 1.0). Conversely, the upper right corner of the ROC graph represents the situation in which mistakes on the positive class are maximally expensive (i.e., c+ = 1.0 and c- = 0.0). By varying over the range of cost parameters and plotting the classifier's true positive and false positive rates, we produce a series of points that represents the algorithm's accuracy trade-off. The point (0, 1) is where classification is perfect, with a false positive rate of zero and a true positive rate of one, so we want ROC curves that "push" toward this corner.

Traditional ROC analysis uses area under the curve as the preferred measure of performance, with curves that cover larger areas generally being viewed as better (Hanley & McNeil 1982; Swets 1988). Given the skewed nature of the rooftop data, and the different but imprecise costs of errors on the two classes, we decided to use area under the ROC curve as the dependent variable in our experimental studies. This measure raises problems when two curves have similar areas but are dissimilar and asymmetric, and thus occupy different regions of the ROC space. In such cases, other types of analysis are more useful (e.g., Provost & Fawcett 1997), but area under the curve appears to be most appropriate when curves have similar shapes and when one is nested within the other. As we will see, this relation typically holds for our cost-sensitive algorithms in the rooftop detection domain.
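The procedure for generating one such curve, sweeping the cost parameter and recording one (false positive rate, true positive rate) point per setting, can be sketched generically. This is a hypothetical harness of our own; `predict` stands in for any cost-sensitive classifier that accepts a positive-class cost:

```python
def roc_points(predict, test_cases, settings=11):
    """Vary the positive-class cost from 0.0 to 1.0 (fixing the other class's
    cost at zero, since costs are relative) and record the classifier's
    false positive and true positive rates at each setting."""
    pos = sum(1 for _, lab in test_cases if lab == 'pos')
    neg = len(test_cases) - pos
    points = []
    for i in range(settings):
        c_pos = i / (settings - 1)
        tp = sum(1 for x, lab in test_cases
                 if lab == 'pos' and predict(x, c_pos) == 'pos')
        fp = sum(1 for x, lab in test_cases
                 if lab == 'neg' and predict(x, c_pos) == 'pos')
        points.append((fp / neg, tp / pos))
    return sorted(set(points))
```

At c_pos = 0.0 a well-behaved classifier sits near the lower left corner of the ROC space, and at c_pos = 1.0 near the upper right, matching the idealized curve of Figure 2.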

12 RooftopDetectionThroughMachineLearning True Positive Rate True Positive Rate Figure3.ROCcurvesforImages1and2.Weraneachmethodbytrainingandtestingusingdataderived fromthesameimageoverarangeofmisclassicationcosts.weconductedtensuchrunsand Naive Bayes Naive Bayes 0.2 Nearest Neighbor 0.2 Nearest Neighbor Budds Classifier Budds Classifier ExperimentalStudies butdierentaspects:image1isanadirview,whileimage2isanoblique. plottedtheaveragetruepositiveandfalsepositiverates.theseimagesareofthesamelocation False Positive Rate False Positive Rate (rooftopcandidates)separatefromthoseusedtotestthelearnedclassiers.aswewillsee,the Astypicallydoneinsuchstudies,ineachexperimentwetrainedtheinductionmethodsondata Toinvestigatetheuseofmachinelearningforthetaskofrooftopdetection,weconductedexperimentsusingthecost-sensitiveversionsofnaiveBayes,nearestneighbor,andtheBuddsclassier. experimentsdieredinwhetherthetrainingandtestcasescamefromthesameordistinctimages, whichletusexaminedierentformsofgeneralizationbeyondthetrainingdata. Ourrstexperimentalstudyexaminedhowthevariousmethodsbehavedgivenwithin-imagelearning,thatis,whengeneralizingtotestcasestakenfromthesameimageonwhichwetrainedthem. 7.1Within-ImageLearning rooftopclassierswouldhavelargerareasthanthoseofthebuddsclassier. 
Ourresearchhypothesiswasthatthelearnedclassierswouldbemoreaccurate,overarangeof misclassicationcosts,thanthehandcraftedlinearclassier.becauseourmeasureofperformance wasareaundertheroccurve,thistranslatesintoapredictionthattheroccurvesofthelearned andfalsepositiveratesfortenruns.sincecostsarerelative(i.e.,c+=0:0andc?=0:5isequivalent toc+=0:25andc?=0:75)andourdomaininvolvedonlytwoclasses,wevariedthecostparameter foronlyoneclassatatimeandxedtheotheratzero.eachruninvolvedpartitioningthedataset Foreachimageandmethod,wevariedtheerrorcostsandmeasuredtheresultingtruepositive set.becausethebuddsclassierwashand-congured,ithadnotrainingphase,soweappliedit inthetrainingset,andevaluatingtheresultingconceptdescriptionsusingthedatainthetest directlytotheinstancesinthetestset.foreachcostsettingandeachclassier,weplottedthe randomlyintotraining(60%)andtest(40%)sets,runningthelearningalgorithmsontheinstances similarresults,butbothfarebetterthanthebuddsclassier.ratherthanpresentthecurves averagefalsepositiverateagainsttheaveragetruepositiverateoverthetenruns. Figure3presentstheROCcurvesforImages1and2.NaiveBayesandnearestneighborgive

Rather than present the curves for the remaining four images, we follow Swets (1988) and report, in Table 2, the area under each ROC curve, which we approximated by summing the areas of the trapezoids defined by each pair of adjacent points on the ROC curve. For all images except Image 6, naive Bayes produced curves with areas greater than those for the Budds classifier, thus generally supporting our research hypothesis. On Images 4, 5, and 6, nearest neighbor did worse than the handcrafted method, which runs counter to our prediction.

Table 2. Results for within-image experiments. For each image, we generated ROC curves by training and testing each method over a range of costs. We used the approximate area under the curve as the measure of performance; areas appear with 95% confidence intervals. Naive Bayes performed best overall, with the Budds classifier outperforming nearest neighbor on three of the six images.

              Approximate Area under ROC Curve
              Naive Bayes    Nearest Neighbor    Budds Classifier
  Image 1
  Image 2
  Image 3
  Image 4
  Image 5
  Image 6

7.2 Between-Image Learning

We geared our next set of experiments more toward the goals of image analysis. Recall that our motivating problem is the large number of images that the analyst must process. In order to alleviate this burden, we want to apply knowledge learned from some images to many other images. But we have already noted that several dimensions of variation pose problems for transferring such learned knowledge to new images. For example, one viewpoint of a given site can differ from other viewpoints of the same site in orientation or in angle from the perpendicular. Images taken at different times and images of different areas present similar issues.

We designed experiments to let us better understand how the knowledge learned from one image generalizes to other images that differ along such dimensions. Our hypothesis here was a refined version of the previous one: classifiers learned from one set of images would be more accurate on unseen images than handcrafted classifiers. However, we also expected that between-image learning would give lower accuracy than the within-image situation, since differences across images would make generalization more difficult.
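The area approximation used for Table 2, and for all of the comparisons that follow, is compact enough to state as code. The sketch below (summing trapezoids over adjacent ROC points, after Swets 1988) is our illustration, not part of the Budds system:

```python
def roc_area(points):
    """Approximate area under an ROC curve by summing the areas of the
    trapezoids defined by each pair of adjacent (FPR, TPR) points."""
    # Anchor the curve at (0, 0) and (1, 1), then order by false positive rate.
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid: width * mean height
    return area

# A perfect classifier's curve passes through (0, 1).
print(roc_area([(0.0, 1.0)]))                              # -> 1.0
# Chance-level operating points lie on the diagonal.
print(roc_area([(0.25, 0.25), (0.5, 0.5), (0.75, 0.75)]))  # -> 0.5
```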
One experiment focused on how the methods generalize over aspect. Recall from Table 1 that we had images from two aspects (i.e., nadir and oblique) and from three locations. This let us train the learning algorithms on an image from one aspect and test on an image from another aspect but from the same location. As an example, for the nadir aspect, we chose Image 1 and then tested on Image 2, which is an oblique image of the same location. We ran the algorithms in this manner using the images from each location, while varying their cost parameters and measuring their true positive and false positive rates. We then averaged these measures across the three locations and plotted the results as ROC curves, as shown in Figure 4. The areas under these curves and their 95% confidence intervals appear in Table 3.
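The train-on-one-aspect, test-on-the-other protocol can be sketched as follows; the image keys and the train/roc_point callables are placeholders for the paper's data and cost-sensitive learners, not actual identifiers from the system:

```python
def aspect_runs(images, train, roc_point):
    """Sketch of the aspect-generalization protocol: for each location,
    train on one aspect and test on the other aspect of the same location.
    images maps location -> {'nadir': data, 'oblique': data}; returns the
    per-location (FPR, TPR) results for both directions of generalization."""
    runs = {"oblique->nadir": [], "nadir->oblique": []}
    for location, pair in images.items():
        model = train(pair["oblique"])
        runs["oblique->nadir"].append(roc_point(model, pair["nadir"]))
        model = train(pair["nadir"])
        runs["nadir->oblique"].append(roc_point(model, pair["oblique"]))
    # Each list holds one result per location; the paper averages the three.
    return runs
```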

Figure 4. ROC curves for experiments that tested generalization over aspect. Left: For each location, we trained each method on the oblique image and tested the resulting concept descriptions on the nadir image. We plotted the average true positive and false positive rates. Right: We followed a similar methodology, except that we trained the methods on the nadir images and tested on the oblique images. (Axes: false positive rate versus true positive rate; curves shown for naive Bayes, nearest neighbor, and the Budds classifier.)

One obvious conclusion is that the nadir images appear to pose an easier problem than the oblique images, since the curves for testing on nadir candidates are generally higher than those for testing on data from oblique images. For example, Table 3 shows that naive Bayes generates a curve with an area of 0.878 for the nadir images, but produces a curve with an area of 0.842 for the oblique images. The other two methods show a similar degradation in performance when generalizing from nadir to oblique images rather than from oblique to nadir images.

Upon comparing the behavior of the different methods, we find that, for oblique-to-nadir generalization, naive Bayes (with an area under the ROC curve of 0.878) performs better than the Budds classifier, with an area of 0.837, which in turn did better than nearest neighbor (0.795). For nadir-to-oblique generalization, naive Bayes performs slightly better than the Budds classifier; they produce areas of 0.842 and 0.831, respectively. Nearest neighbor's curve in this situation covers an area of 0.785, which is considerably smaller.

A second experiment examined generalization over location. To this end, we trained the learning methods on pairs of images from one aspect and tested on the third image from the same aspect. As an example, for the nadir images, one of the three learning runs involved training on rooftop candidates from Images 1 and 3, then testing on candidates from Image 5. We ran each of the algorithms across a range of costs, measuring the false positive and true positive rates. We plotted the averages of these measures across all three learning runs for one aspect in an ROC curve, as shown in Figure 5.
In this context, we again see evidence that the oblique images presented a more difficult recognition task than the nadir aspect, since the oblique areas are less than those for the nadir images. Comparing the behavior of the various methods, Table 3 shows that, for the nadir aspect, naive Bayes (with an area of 0.901) performs slightly better than the Budds classifier. As before, both did better than nearest neighbor, which yielded an area of 0.819 under its ROC curve. When generalizing over location on the oblique images, naive Bayes and the Budds classifier produced ROC curves with equal areas of 0.831. These were considerably better than nearest neighbor's, which had an area of 0.697.

Figure 5. ROC curves for the experiment that tested generalization over location. Left: For each pair of images from the nadir aspect, we trained the methods on that pair and tested the resulting concept descriptions on the third image. We then plotted the average true positive and false positive rates. Right: We applied the same methodology using the images for the oblique aspect. (Axes: false positive rate versus true positive rate; curves shown for naive Bayes, nearest neighbor, and the Budds classifier.)

Thus, the results with the naive Bayesian classifier support our main hypothesis. In all experimental conditions this method fared better than or equal to the Budds linear classifier. On the other hand, nearest neighbor typically gave worse results than the handcrafted rooftop detector, which went against our original expectations.

Recall that we also anticipated that generalizing across images would give lower accuracies than generalizing within images. To test this hypothesis, we must compare the results from these experiments with those from the within-image experiments (see Table 3). Simple calculation shows that, for the within-image condition (Table 2), naive Bayes produced an average ROC area of 0.9 for the nadir images and 0.851 for the oblique images. Similarly, nearest neighbor averaged 0.851 for the nadir images and 0.791 for the oblique images. Most of these areas are substantially higher than the analogous areas that resulted when these methods generalized across location and aspect.
The one exception is that naive Bayes actually did equally well when generalizing over location for the nadir image, but the results generally support our prediction.

Also note that naive Bayes' performance degraded less than that of nearest neighbor when generalizing to unseen images. This can be seen by comparing the differences between each method's performance in the within-image condition and in the between-image conditions. For example, naive Bayes' average degradation in performance over all experimental conditions was 0.013, while nearest neighbor's was 0.047. This constitutes further evidence that naive Bayes is better suited for this domain, at least when operating over the nine features used in our experiments.

7.3 Learning from All Available Images

Our next study used all of the rooftop candidates generated from the six Fort Hood images, since we wanted to replicate our previous results in a situation similar to the one we envision in practice, which would draw on training cases from all images. Based on the earlier experiments, we anticipated that the naive Bayesian classifier would yield an ROC curve of greater area than those of the other methods.
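The degradation figures quoted in Section 7.2 can be checked against the areas reported in the preceding sections. The pairing of each between-image condition with its within-image baseline below is our reading of the text, not a table from the paper:

```python
# Average ROC areas from the within-image experiments (Section 7.1), and the
# four between-image conditions (Section 7.2), for naive Bayes (nb) and
# nearest neighbor (nn).
within  = {"nb": {"nadir": 0.900, "oblique": 0.851},
           "nn": {"nadir": 0.851, "oblique": 0.791}}
between = {"nb": {"aspect_nadir": 0.878, "aspect_oblique": 0.842,
                  "location_nadir": 0.901, "location_oblique": 0.831},
           "nn": {"aspect_nadir": 0.795, "aspect_oblique": 0.785,
                  "location_nadir": 0.819, "location_oblique": 0.697}}

def mean_degradation(method):
    """Average drop from the within-image baseline over the four
    between-image conditions (testing aspect given by the condition name)."""
    drops = [within[method][cond.split("_")[1]] - area
             for cond, area in between[method].items()]
    return sum(drops) / len(drops)

print(abs(mean_degradation("nb") - 0.013) < 0.001)  # True: matches 0.013
print(abs(mean_degradation("nn") - 0.047) < 0.001)  # True: matches 0.047
```

The check also shows why nearest neighbor's figure must be 0.047 rather than 0.47: a drop of 0.47 would exceed the entire gap between its areas and chance.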

Table 3. Results for between-image experiments. We again used the approximate area under the ROC curve as the measure of performance, along with 95% confidence intervals. Naive Bayes performed the best, while the Budds classifier generally outperformed nearest neighbor. The labels 'nadir' and 'oblique' indicate the testing condition. We derived analogous results for the within-image experiments by averaging the results for each condition.

                      Aspect Experiment     Location Experiment    Within Image
                      Nadir     Oblique     Nadir     Oblique      Nadir     Oblique
  Naive Bayes         0.878     0.842       0.901     0.831        0.900     0.851
  Nearest Neighbor    0.795     0.785       0.819     0.697        0.851     0.791
  Budds Classifier    0.837     0.831                 0.831

Combining the rooftop candidates from all six images gave us 17,829 instances, 781 labeled positive and 17,048 labeled negative. We ran each algorithm ten times over a range of costs. For each run and set of cost parameters, we randomly split the data into training (60%) and testing (40%) sets, then averaged the results for each cost level over its ten runs.

Figure 6 shows the resulting ROC curves, which plot the true positive and false positive rates, whereas Table 4 gives the approximate area under these curves. As anticipated, naive Bayes performed the best overall, producing a curve with an area of 0.85. Nearest neighbor fared slightly better than the Budds classifier, yielding an area of 0.801, compared to 0.787 for the latter.

In practice, image analysts will not evaluate a classifier's performance using area under the ROC curve but, rather, will have specific error costs in mind, even if they cannot state them formally. We have used ROC curves because we do not know these costs in advance, but we can inspect the behavior of the various classifiers at different points on these curves to gain further insight into how much the learned classifiers are likely to aid analysts during actual use.
For example, consider the behavior of the naive Bayesian classifier when it achieves a true positive rate of 0.84 and a false positive rate of 0.27, the third diamond from the right in Figure 6. To obtain the same true positive rate, the Budds classifier produced a 0.62 false positive rate. This means that, for a given true positive rate, naive Bayes reduced the false positive rate by more than half over the handcrafted classifier. Hence, for the images we considered, the naive Bayesian classifier would have rejected 5,969 more non-rooftops than Budds. Similarly, by fixing the false positive rate, naive Bayes improved the true positive rate by 0.12 over the Budds classifier. In this case, the Bayesian classifier would have found 86 more rooftops than Budds would have detected.

7.4 Rates of Learning

We were also interested in the behavior of the learning methods as they processed increasing amounts of training data. Our long-term goal is to embed the learned classifier in an interactive system that supports an image analyst. For this reason, we would prefer a learning algorithm that achieves high accuracy from relatively few training cases, since this should reduce the load on the human analyst.
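The operating-point comparison in Section 7.3 reduces to arithmetic on the reported rates and candidate counts. The sketch below uses the rounded two-decimal rates, so it approximates rather than reproduces the exact counts quoted in the text:

```python
neg, pos = 17_048, 781   # labeled non-rooftop and rooftop candidates

# At a true positive rate of 0.84, naive Bayes operates at a false positive
# rate of 0.27 versus 0.62 for the Budds classifier; the difference scales
# to thousands of candidates the analyst need not review.
print(round((0.62 - 0.27) * neg))   # ~5967; the reported 5,969 comes from
                                    # the unrounded operating-point rates

# Fixing the false positive rate instead, naive Bayes gains 0.12 in true
# positive rate; the reported count of 86 rooftops likewise reflects
# unrounded rates, while the two-decimal figures give a rough estimate.
print(round(0.12 * pos))            # ~94
```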

Figure 6. ROC curves for the experiment using all available image data. We ran each method over a range of costs using a training set (60%) and a testing set (40%) and averaged the true positive and false positive rates over ten runs. Naive Bayes produced the curve with the largest area, but nearest neighbor also yielded a curve larger in area than that for the Budds classifier. (Axes: false positive rate versus true positive rate.)

To this end, we carried out a final experiment in which we systematically varied the number of training cases available to the learning method. We again used all of the available rooftop candidates, splitting the data into training (60%) and test (40%) sets, but further dividing the training set randomly into ten subsets (10%, 20%, ..., 100%). We ran the learning algorithms on each of the training subsets and evaluated the acquired concept descriptions on the reserved testing data, averaging our results over 25 separate training/test splits.

Figure 7 shows the resulting learning curves, each point of which corresponds to the average area under the ROC curves for a given number of training cases. As expected, the learning curve for the Budds classifier is flat, since it involves no training and we simply applied it to the same test set for each number of training cases. However, nearest neighbor produces a curve that starts below that of the Budds classifier and then surpasses it after seeing 70% of the training data. Naive Bayes shows similar improvement with increasing amounts of training data, but its performance was better than the Budds classifier from the start, after observing only 10% of the training data. This equates to roughly 6% of the available data and is less than the amount of data derived from one image. Not only was naive Bayes the best performing method, but it was also able to achieve this performance using very little of the available training data.
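One plausible reading of this learning-curve protocol, with nested training subsets drawn from each random split, is sketched below; the function and its callables are our illustration, not the paper's code:

```python
import random

def learning_curve(candidates, train_and_score, n_splits=25):
    """Protocol sketch for Figure 7: for each random 60/40 split, train on
    nested 10%, 20%, ..., 100% portions of the training data, score each
    model on the held-out 40%, and average over the splits."""
    fractions = [f / 10 for f in range(1, 11)]
    totals = dict.fromkeys(fractions, 0.0)
    for _ in range(n_splits):
        shuffled = random.sample(candidates, len(candidates))
        cut = int(0.6 * len(shuffled))
        train, test = shuffled[:cut], shuffled[cut:]
        for f in fractions:
            subset = train[:int(f * len(train))]   # nested training subsets
            totals[f] += train_and_score(subset, test)
    return {f: total / n_splits for f, total in totals.items()}
```

Because the subsets are nested prefixes of one shuffled training set, each point on the curve adds data to the previous point rather than drawing an independent sample.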
7.5 Summary

From the within-image experiments, in which we trained and tested the learning methods using data derived from the same image, it was apparent that at least one machine learning method, naive Bayes, showed promise of improving the rooftop detection task over the handcrafted linear classifier. The results from this experiment also established baseline performance conditions for the methods because they controlled for differences in aspect and location.

In an effort to test the learning methods for their ability to generalize to unseen images, we found that rooftop detection for oblique images posed a more difficult problem than for nadir images. This could be because Budds was initially developed using nadir images and then extended to handle oblique images. Thus, the features may be biased toward nadir-view rooftops. A more likely explanation is that oblique images are simply harder than nadir images. Nevertheless, under all circumstances, the performance of naive Bayes was equal to or better than that of the handcrafted linear classifier. Finally, we also discovered that the performance of the methods degraded when generalizing to unseen images, but that the performance of naive Bayes degraded less than that of nearest neighbor.

Our final experiment used all of the available image data for learning and demonstrated that naive Bayes and nearest neighbor outperformed the Budds classifier. Further analysis of specific points on the ROC curves revealed that naive Bayes improved upon the false positive rate of the handcrafted solution by more than a factor of two for true positive rates of 0.84 and higher. Learning curves demonstrated that naive Bayes achieved superior performance using very little of the available training data.

Table 4. Results for the experiment using all of the image data. We split the data into training (60%) and test (40%) sets and ran each method over a range of costs. We then computed the average area under the ROC curve and 95% confidence intervals over ten runs.

  Classifier            Approximate Area
  Naive Bayes           0.85
  Nearest Neighbor      0.801
  Budds Classifier      0.787

8. Related Work

Research on learning in computer vision has become increasingly common in recent years. Some work in visual learning takes an image-based approach (e.g., Beymer & Poggio 1996), in which the images themselves, usually normalized or transformed in some way, are used as input to a learning process, which is responsible for forming the intermediate representations necessary to transform the pixels into a decision or classification. Researchers have used this approach extensively for face and gesture recognition (e.g., Chan, Nasrabadi, & Mirelli 1996; Gutta et al. 1996; Osuna, Freund, & Girosi 1997; Segen 1994), although it has seen other applications as well (e.g., Nayar & Poggio 1996; Pomerleau 1996; Viola 1993).
Figure 7. Learning curves for area under the ROC curve using all available image data. We ran each method on increasing amounts of training data and evaluated the resulting concept descriptions on reserved testing data. Each point is an average of ten runs. (Axes: percentage of training data versus average area under the ROC curve; curves shown for naive Bayes, nearest neighbor, and the Budds classifier.)

A slightly different approach relies on handcrafted vision routines to extract relevant image features, based on intensity or shape properties, then learns to recognize the desired objects using machine-produced classifiers. Shepherd (1983) used decision-tree induction to classify shapes of chocolates for an industrial vision application. Cromwell and Kak (1991) took a similar approach for recognizing electrical components, such as transistors, resistors, and capacitors. Maloof and Michalski (1997) examined various methods of learning shape characteristics for detecting blasting caps in x-ray images, whereas additional work (Maloof et al. 1996) discussed learning in a multi-step vision system for the same detection problem.

Several researchers have also investigated learning for three-dimensional vision systems. Papers by Conklin (1993), Connell and Brady (1987), Cook et al. (1993), Provan, Langley, and Binford (1996), and Sengupta and Boyer (1993) all describe inductive approaches aimed at improving object recognition. The aim here is to learn the three-dimensional structure that characterizes an object or object class, rather than its appearance. Another line of research, which falls midway between this approach and image-based schemes, instead attempts to learn a small set of characteristic views, each of which can be used to recognize an object from a different perspective (e.g., Gros 1993; Pope & Lowe 1996).

Most work on visual learning ignores the importance of misclassification costs, but our work along these lines has some precedents. In particular, Draper, Brodley, and Utgoff (1994) incorporate the cost of errors into their algorithm for constructing and pruning multivariate decision trees. They tested this approach on the task of labeling pixels from outdoor images for use by a road-following vehicle. They determined that, in this context, labeling a road pixel as non-road was more costly than the reverse, and showed experimentally that their method could reduce such errors on novel test pixels. Woods, Bowyer, and Kegelmeyer (1996), as well as Rowley, Baluja, and Kanade (1996), report similar work that takes into account the cost of errors.

Much of the research on visual learning uses images of scenes or objects viewed at eye level (e.g., Draper 1997; Teller & Veloso 1997). One exception is Connell and Brady's (1987) work on learning structural descriptions of airplanes from aerial views. Their method converted training images into semantic networks, which it then generalized by comparing them to descriptions of other instances. However, the authors do not appear to have tested experimentally their algorithm's ability to accurately classify objects in new images. Another example is the SKICAT system (Fayyad et al. 1996), which catalogs celestial objects, such as galaxies and stars, using images from the Second Palomar Observatory Sky Survey.

A related system, JARTool (Fayyad et al. 1996), also analyzes aerial images, in this case to detect Venusian volcanos using synthetic aperture radar from the Magellan spacecraft. Asker and Maclin (1997) extend JARTool by using an ensemble of 48 neural networks to improve performance. Using ROC curves, they demonstrate that the ensemble achieved better performance than either the individual learned classifiers or the one used originally in JARTool. They also document some of the difficulties associated with applying machine learning techniques to real-world problems, such as feature selection and instance labeling, which were similar to problems we encountered.

Finally, Draper (1996) reports a careful study of learning in the context of analyzing aerial images. His approach adapts methods for reinforcement learning to assign credit in a multi-stage recognition procedure (for software similar to Budds), then uses an induction method (backpropagation in neural networks) to learn conditions on operator selection. He presents initial results on a RADIUS task that also involves the detection of roofs. Our framework shares some features with Draper's approach, but assumes that learning is directed by feedback from a human expert. We predict that our supervised method will be more computationally tractable than his use of reinforcement learning, which is well known for its high complexity. Our approach does require more interaction with users, but we believe this interaction will be unobtrusive if cast within the context of an advisory system for image analysis.

9. Concluding Remarks

Although this study has provided some insight into the role of machine learning in image analysis, much still remains to be done. For example, we may want to consider other measures of performance that take into account the presence of multiple valid candidates for a given rooftop. Classifying one of these candidates correctly is sufficient for the purpose of image analysis.

In addition, although the rooftop selection stage was a natural place to start in applying our methods, we intend to work at both earlier and later levels of the building detection process. The goal here is not only to increase classification accuracy, which could be handled entirely by candidate selection, but also to reduce the complexity of processing by removing poor candidates before they are aggregated into larger structures. With this aim in mind, we plan to extend our work to all levels of the image understanding process. We must address a number of issues before we can make progress on these other stages. One involves identifying the cost of different errors at each level and taking this into account in our modified induction algorithms. Another concerns whether we should use the same induction algorithm at each level or different methods at each stage.

As we mentioned earlier, in order to automate the collection of training data for learning, we also hope to integrate learning routines into Budds. This system was not designed initially to be interactive, but we intend to modify it so that the image analyst can accept or reject recommendations made by the image understanding system, generating training data in the process. At intervals, the system would invoke its learning algorithms, producing revised knowledge that would alter the system's behavior in the future and, hopefully, reduce the user's need to make corrections. The interactive labeling system described in Section 5 could serve as an initial model for this interface.

In conclusion, our studies suggest that machine learning has an important role to play in improving the accuracy, and thus the robustness, of image analysis systems. However, we need additional experiments to give a better understanding of the factors affecting between-image generalization, and we need to extend learning to additional levels of the image understanding process. Also, before we can build a system that truly aids the human image analyst, we must further develop unobtrusive ways to collect training data to support learning.


More information

However,duetoboththescaleandthecomplexityoftheInternet,itisunlikelythatameasure-

However,duetoboththescaleandthecomplexityoftheInternet,itisunlikelythatameasure- Part1:AServer-BasedMeasurementInfrastructure NetworkPerformanceMeasurementandAnalysis Y.ThomasHou (ConceptPaper) AsInternettraccontinuestogrowexponentially,itisessentialforboththeusersandserviceproviders

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

Average rate of change of y = f(x) with respect to x as x changes from a to a + h:

Average rate of change of y = f(x) with respect to x as x changes from a to a + h: L15-1 Lecture 15: Section 3.4 Definition of the Derivative Recall the following from Lecture 14: For function y = f(x), the average rate of change of y with respect to x as x changes from a to b (on [a,

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

WEKA. Machine Learning Algorithms in Java

WEKA. Machine Learning Algorithms in Java WEKA Machine Learning Algorithms in Java Ian H. Witten Department of Computer Science University of Waikato Hamilton, New Zealand E-mail: ihw@cs.waikato.ac.nz Eibe Frank Department of Computer Science

More information

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Is a Data Scientist the New Quant? Stuart Kozola MathWorks Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by

More information

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences

More information

Role of Neural network in data mining

Role of Neural network in data mining Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)

More information

Chapter 6. The stacking ensemble approach

Chapter 6. The stacking ensemble approach 82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

More information

Predictive Data modeling for health care: Comparative performance study of different prediction models

Predictive Data modeling for health care: Comparative performance study of different prediction models Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar

More information

Homework 4 Statistics W4240: Data Mining Columbia University Due Tuesday, October 29 in Class

Homework 4 Statistics W4240: Data Mining Columbia University Due Tuesday, October 29 in Class Problem 1. (10 Points) James 6.1 Problem 2. (10 Points) James 6.3 Problem 3. (10 Points) James 6.5 Problem 4. (15 Points) James 6.7 Problem 5. (15 Points) James 6.10 Homework 4 Statistics W4240: Data Mining

More information

A semi-supervised Spam mail detector

A semi-supervised Spam mail detector A semi-supervised Spam mail detector Bernhard Pfahringer Department of Computer Science, University of Waikato, Hamilton, New Zealand Abstract. This document describes a novel semi-supervised approach

More information

Using Artificial Intelligence to Manage Big Data for Litigation

Using Artificial Intelligence to Manage Big Data for Litigation FEBRUARY 3 5, 2015 / THE HILTON NEW YORK Using Artificial Intelligence to Manage Big Data for Litigation Understanding Artificial Intelligence to Make better decisions Improve the process Allay the fear

More information

Course Description This course will change the way you think about data and its role in business.

Course Description This course will change the way you think about data and its role in business. INFO-GB.3336 Data Mining for Business Analytics Section 32 (Tentative version) Spring 2014 Faculty Class Time Class Location Yilu Zhou, Ph.D. Associate Professor, School of Business, Fordham University

More information

On the Relative Value of Cross-Company and Within-Company Data for Defect Prediction

On the Relative Value of Cross-Company and Within-Company Data for Defect Prediction Empirical Software Engineering manuscript No. () On the Relative Value of Cross-Company and Within-Company Data for Defect Prediction Burak Turhan Tim Menzies Ayse Bener Justin Distefano Received: Sept

More information

Data Mining Yelp Data - Predicting rating stars from review text

Data Mining Yelp Data - Predicting rating stars from review text Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority

More information

MACHINE LEARNING IN HIGH ENERGY PHYSICS

MACHINE LEARNING IN HIGH ENERGY PHYSICS MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!

More information

203.4770: Introduction to Machine Learning Dr. Rita Osadchy

203.4770: Introduction to Machine Learning Dr. Rita Osadchy 203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:

More information

Sentiment analysis using emoticons

Sentiment analysis using emoticons Sentiment analysis using emoticons Royden Kayhan Lewis Moharreri Steven Royden Ware Lewis Kayhan Steven Moharreri Ware Department of Computer Science, Ohio State University Problem definition Our aim was

More information

Applying Classifier Algorithms to Organizational Memory to Build an Attrition Predictor Model

Applying Classifier Algorithms to Organizational Memory to Build an Attrition Predictor Model Applying Classifier Algorithms to Organizational Memory to Build an Attrition Predictor Model K. M. SUCEENDRAN 1, R. SARAVANAN 2, DIVYA ANANTHRAM 3, Dr.S.POONKUZHALI 4, R.KISHORE KUMAR 5, Dr.K.SARUKESI

More information

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek

Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...

More information

Lecture: Mon 13:30 14:50 Fri 9:00-10:20 ( LTH, Lift 27-28) Lab: Fri 12:00-12:50 (Rm. 4116)

Lecture: Mon 13:30 14:50 Fri 9:00-10:20 ( LTH, Lift 27-28) Lab: Fri 12:00-12:50 (Rm. 4116) Business Intelligence and Data Mining ISOM 3360: Spring 203 Instructor Contact Office Hours Course Schedule and Classroom Course Webpage Jia Jia, ISOM Email: justinjia@ust.hk Office: Rm 336 (Lift 3-) Begin

More information

Network Intrusion Detection Using a HNB Binary Classifier

Network Intrusion Detection Using a HNB Binary Classifier 2015 17th UKSIM-AMSS International Conference on Modelling and Simulation Network Intrusion Detection Using a HNB Binary Classifier Levent Koc and Alan D. Carswell Center for Security Studies, University

More information

Supervised model-based visualization of high-dimensional data

Supervised model-based visualization of high-dimensional data Intelligent Data Analysis 4 (2000) 213 227 IOS Press Supervised model-based visualization of high-dimensional data Petri Kontkanen, Jussi Lahtinen, Petri Myllymäki, Tomi Silander and Henry Tirri Complex

More information

How To Prevent Network Attacks

How To Prevent Network Attacks Ali A. Ghorbani Wei Lu Mahbod Tavallaee Network Intrusion Detection and Prevention Concepts and Techniques )Spri inger Contents 1 Network Attacks 1 1.1 Attack Taxonomies 2 1.2 Probes 4 1.2.1 IPSweep and

More information

Data Mining Classification: Decision Trees

Data Mining Classification: Decision Trees Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

2.0. Specification of HSN 2.0 JavaScript Static Analyzer

2.0. Specification of HSN 2.0 JavaScript Static Analyzer 2.0 Specification of HSN 2.0 JavaScript Static Analyzer Pawe l Jacewicz Version 0.3 Last edit by: Lukasz Siewierski, 2012-11-08 Relevant issues: #4925 Sprint: 11 Summary This document specifies operation

More information

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus

More information

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE: COS-STAT-747 Principles of Statistical Data Mining

More information

Keywords: Data Warehouse, Data Warehouse testing, Lifecycle based testing, performance testing.

Keywords: Data Warehouse, Data Warehouse testing, Lifecycle based testing, performance testing. DOI 10.4010/2016.493 ISSN2321 3361 2015 IJESC Research Article December 2015 Issue Performance Testing Of Data Warehouse Lifecycle Surekha.M 1, Dr. Sanjay Srivastava 2, Dr. Vineeta Khemchandani 3 IV Sem,

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

Mining HTML Pages to Support Document Sharing in a Cooperative System

Mining HTML Pages to Support Document Sharing in a Cooperative System Mining HTML Pages to Support Document Sharing in a Cooperative System Donato Malerba, Floriana Esposito and Michelangelo Ceci Dipartimento di Informatica, Università degli Studi via Orabona, 4-70126 Bari

More information

Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware

Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware Cumhur Doruk Bozagac Bilkent University, Computer Science and Engineering Department, 06532 Ankara, Turkey

More information

Conditional anomaly detection methods for patient management alert systems

Conditional anomaly detection methods for patient management alert systems Conditional anomaly detection methods for patient management alert systems Keywords: anomaly detection, alert systems, monitoring, health care applications, metric learning Michal Valko michal@cs.pitt.edu

More information

GoldenBullet in a Nutshell

GoldenBullet in a Nutshell GoldenBullet in a Nutshell Y. Ding, M. Korotkiy, B. Omelayenko, V. Kartseva, V. Zykov, M. Klein, E. Schulten, and D. Fensel Vrije Universiteit Amsterdam, De Boelelaan 1081a, 1081 HV Amsterdam, NL From:

More information

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

More information

Monday Morning Data Mining

Monday Morning Data Mining Monday Morning Data Mining Tim Ruhe Statistische Methoden der Datenanalyse Outline: - data mining - IceCube - Data mining in IceCube Computer Scientists are different... Fakultät Physik Fakultät Physik

More information

SOPS: Stock Prediction using Web Sentiment

SOPS: Stock Prediction using Web Sentiment SOPS: Stock Prediction using Web Sentiment Vivek Sehgal and Charles Song Department of Computer Science University of Maryland College Park, Maryland, USA {viveks, csfalcon}@cs.umd.edu Abstract Recently,

More information

Model Selection. Introduction. Model Selection

Model Selection. Introduction. Model Selection Model Selection Introduction This user guide provides information about the Partek Model Selection tool. Topics covered include using a Down syndrome data set to demonstrate the usage of the Partek Model

More information

B2.53-R3: COMPUTER GRAPHICS. NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions.

B2.53-R3: COMPUTER GRAPHICS. NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions. B2.53-R3: COMPUTER GRAPHICS NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions. 2. PART ONE is to be answered in the TEAR-OFF ANSWER

More information

Choosing software metrics for defect prediction: an investigation on feature selection techniques

Choosing software metrics for defect prediction: an investigation on feature selection techniques SOFTWARE PRACTICE AND EXPERIENCE Softw. Pract. Exper. 2011; 41:579 606 Published online in Wiley Online Library (wileyonlinelibrary.com)..1043 Choosing software metrics for defect prediction: an investigation

More information

Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum

Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum Statistical Validation and Data Analytics in ediscovery Jesse Kornblum Administrivia Silence your mobile Interactive talk Please ask questions 2 Outline Introduction Big Questions What Makes Things Similar?

More information

Comparison of Classification Techniques for Heart Health Analysis System

Comparison of Classification Techniques for Heart Health Analysis System International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-04, Issue-02 E-ISSN: 2347-2693 Comparison of Classification Techniques for Heart Health Analysis System Karthika

More information

Linear Classification. Volker Tresp Summer 2015

Linear Classification. Volker Tresp Summer 2015 Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong

More information

Scaling Up the Accuracy of Naive-Bayes Classiers: a Decision-Tree Hybrid. Ron Kohavi. Silicon Graphics, Inc. 2011 N. Shoreline Blvd. ronnyk@sgi.

Scaling Up the Accuracy of Naive-Bayes Classiers: a Decision-Tree Hybrid. Ron Kohavi. Silicon Graphics, Inc. 2011 N. Shoreline Blvd. ronnyk@sgi. Scaling Up the Accuracy of Classiers: a Decision-Tree Hybrid Ron Kohavi Data Mining and Visualization Silicon Graphics, Inc. 2011 N. Shoreline Blvd Mountain View, CA 94043-1389 ronnyk@sgi.com Abstract

More information

Predicting Flight Delays

Predicting Flight Delays Predicting Flight Delays Dieterich Lawson jdlawson@stanford.edu William Castillo will.castillo@stanford.edu Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing

More information

CSCI567 Machine Learning (Fall 2014)

CSCI567 Machine Learning (Fall 2014) CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Inferring Formal Titles in Organizational Email Archives

Inferring Formal Titles in Organizational Email Archives Inferring Formal Titles in Organizational Email Archives Galileo Mark S. Namata Jr. namatag@cs.umd.edu Department of Computer Science/UMIACS, University of Maryland, College Park, MD 2742, USA Lise Getoor

More information

Date : July 28, 2015

Date : July 28, 2015 Date : July 28, 2015 Awesome(Team( 2! Who"are"we?" Menish Gupta Lukas Osborne Founder!&!CEO! 9+!years!@!Amex!! 5!years!@!Startups!in!NYC! B.S.!/!M.S.!Comp!Sci.!NJIT! Data!Science! 7!PublicaIons! 5!years!@!CISMM!Labs!

More information

Automatic Text Processing: Cross-Lingual. Text Categorization

Automatic Text Processing: Cross-Lingual. Text Categorization Automatic Text Processing: Cross-Lingual Text Categorization Dipartimento di Ingegneria dell Informazione Università degli Studi di Siena Dottorato di Ricerca in Ingegneria dell Informazone XVII ciclo

More information

Diploma Of Computing

Diploma Of Computing Diploma Of Computing Course Outline Campus Intake CRICOS Course Duration Teaching Methods Assessment Course Structure Units Melbourne Burwood Campus / Jakarta Campus, Indonesia March, June, October 022638B

More information

DATA MINING IN FINANCIAL MARKETS STEPHEN EVANS. Presented to the Faculty of the Graduate School of

DATA MINING IN FINANCIAL MARKETS STEPHEN EVANS. Presented to the Faculty of the Graduate School of DATA MINING IN FINANCIAL MARKETS by STEPHEN EVANS Presented to the Faculty of the Graduate School of The University of Texas at Arlington in Partial Fulfillment of the Requirements for the Degree of MASTER

More information

Detecting Internet Worms Using Data Mining Techniques

Detecting Internet Worms Using Data Mining Techniques Detecting Internet Worms Using Data Mining Techniques Muazzam SIDDIQUI Morgan C. WANG Institute of Simulation & Training Department of Statistics and Actuarial Sciences University of Central Florida University

More information

On the Role of Data Mining Techniques in Uncertainty Quantification

On the Role of Data Mining Techniques in Uncertainty Quantification On the Role of Data Mining Techniques in Uncertainty Quantification Chandrika Kamath Lawrence Livermore National Laboratory Livermore, CA, USA kamath2@llnl.gov USA/South America Symposium on Stochastic

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information