Improving Rooftop Detection in Aerial Images Through Machine Learning

†Institute for the Study of Learning and Expertise, 2164 Staunton Court, Palo Alto, CA 94306
‡Robotics Laboratory, Department of Computer Science, Stanford University, Stanford, CA 94305

Abstract

In this paper, we examine the use of machine learning to improve a rooftop detection process, which is one step in a vision system that recognizes buildings in overhead imagery. We review the problem of analyzing aerial images and describe an existing vision system that automates the recognition of buildings in such images. After this, we briefly review two well-known learning algorithms, representing different inductive biases, that we selected to improve rooftop detection. An important aspect of this problem is that the data sets are highly skewed and the cost of mistakes differs for the two classes, so we evaluate the algorithms under varying misclassification costs using ROC analysis. We report three sets of experiments designed to illuminate facets of applying machine learning to the image analysis task. One set of studies focuses on within-image learning, in which both training and testing data are derived from the same image. Another addresses between-image learning, in which training and testing sets come from different images. A final set investigates learning using all available image data in an effort to determine the best performing method. Experimental results demonstrate that useful generalization occurs when training and testing on data derived from images that differ in location and in aspect. Furthermore, they demonstrate that, under most conditions and across a range of misclassification costs, a trained naive Bayesian classifier exceeded, by as much as a factor of two, the predictive accuracy of nearest neighbor and of a handcrafted linear classifier, the solution currently being used in the building detection system. Analysis of learning curves reveals that naive Bayes achieved superiority using as little as 6% of the available training data.
1. Introduction

The number of images available to image analysts is growing rapidly, and will soon outpace their ability to process them. Computational aids will be required to filter this flood of images and focus the analyst's attention on interesting events, but current image understanding systems are not yet robust enough to support this process. Successful image understanding relies on knowledge, and despite theoretical progress, implemented vision systems still rely on heuristic methods and consequently remain fragile. Handcrafted knowledge about when and how to use particular vision operations can give acceptable results on some images but not others.

In this paper, we explore the use of machine learning as a means for improving knowledge used in the vision process, and thus for producing more robust software. Recent applications of machine learning in business and industry (Langley & Simon 1995) hold useful lessons for applications in image analysis. A key idea in applied machine learning involves building an advisory system that recommends actions but gives final control to a human user, with each decision generating a training case, gathered in an unobtrusive way, for use in learning. This setting for knowledge acquisition is similar to the scenario in which an image analyst interacts with a vision system, finding some system analyses acceptable and others uninteresting or in error. The aim of our research program is to embed machine learning into this interactive process of image analysis.

This adaptive approach to computer vision promises to greatly reduce the number of decisions that image analysts must make per picture, thus improving their ability to deal with a high flow of images. Moreover, the resulting systems should adapt their knowledge to the preferences of individuals in response to feedback from those users. The overall effect should be a new class of systems for image analysis that reduces the workload on human analysts and gives them more reliable results, thus speeding the image analysis process.
In the sections that follow, we report progress on using machine learning to improve decision making at one stage in an existing image understanding system. We begin by explaining the task domain, identifying buildings in aerial photographs, and then describe the vision system designed for this task. Next, we review two well-known algorithms for supervised learning that hold potential for improving the reliability of image analysis in this domain. After this, we report the design of experiments to evaluate these methods and the results of those studies. In closing, we discuss related and future work.

2. Nature of the Image Analysis Task

The image analyst interprets aerial images of ground sites with an eye to unusual activity or other interesting behavior. The images under scrutiny are usually complex, involving many objects arranged in a variety of patterns. Overhead images of Fort Hood, Texas, collected as part of the RADIUS project (Firschein & Strat 1997), are typical of a military base and include buildings in a range of sizes and shapes, major and minor roadways, sidewalks, parking lots, vehicles, and vegetation. A common task faced by the image analyst is to detect change at a site as reflected in differences between two images, as in the number of buildings, roads, and vehicles. This in turn requires the ability to recognize examples from each class of interest. In this paper, we focus on the performance task of identifying buildings in satellite photographs.
Aerial images can vary across a number of dimensions. The most obvious factors concern viewing parameters, such as distance from the site (which affects size and resolution) and viewing angle (which affects perspective and visible surfaces). But other variables also influence the nature of the image, including the time of day (which affects contrast and shadows), the time of year (which affects foliage), and the site itself (which determines the shapes of viewed objects). Taken together, these factors introduce considerable variability into the images that confront the analyst.

In turn, this variability can significantly complicate the task of recognizing object classes. Although a building or vehicle will appear different from alternative perspectives and distances, the effects of such transformations are reasonably well understood. But variations due to time of day, the season, and the site are more serious. Shadows and foliage can hide edges and obscure surfaces, and buildings at distinct sites may have quite different structures and layouts. Such variations serve as mere distractions to the human image analyst, yet they provide serious challenges to existing computer vision systems.

This suggests a natural task for machine learning: given aerial images as training data, acquire knowledge that improves the reliability of such an image analysis system. However, we cannot study this task in the abstract. We must explore the effect of specific induction algorithms on particular vision software. In the next two sections, we briefly review one such system for image analysis and two learning methods that might give it more robust behavior.

3. An Architecture for Image Analysis

Lin and Nevatia (1996) report a computer vision package, called the Buildings Detection and Description System (Budds), for the analysis of ground sites in aerial images. Like many programs for image understanding, their system operates in a series of processing stages. Each step involves aggregating lower level features into higher level ones, eventually reaching hypotheses about the locations and descriptions of buildings. We will consider these stages in the order that they occur.
Starting at the pixel level, Budds uses an edge detector to group pixels into edgels, and then invokes a line finder to group edgels into lines. Junctions and parallel lines are identified and combined to form three-sided structures or "Us". The algorithm then groups selected Us and junctions to form parallelograms. Each such parallelogram constitutes a hypothesis about the position and orientation of the roof for some building, so we may call this step rooftop generation.

After the system has completed the above aggregation process, a rooftop selection stage evaluates each rooftop candidate to determine whether it has sufficient evidence to be retained. The aim of this process is to remove candidates that do not correspond to actual buildings. Ideally, the system will reject most spurious candidates at this point, although a final verification step may still collapse duplicate or overlapping rooftops. This stage may also exclude candidates if there is no evidence of three-dimensional structure, such as shadows and walls.

Analysis of the system's operation suggested that rooftop selection held the most promise for improvement through machine learning, because this stage must deal with many spurious rooftop candidates. This process takes into account both local and global criteria. Local support comes from features such as lines and corners that are close to a given parallelogram. Since these suggest walls and shadows, they provide evidence that the candidate corresponds to an actual building.
Global criteria consider containment, overlap, and duplication of candidates. Using these evaluation criteria, the set of rooftop candidates is reduced to a more manageable size. The individual constraints applied in this process have a solid foundation in both theory and practice.

The problem is that we have only heuristic knowledge about how to combine these constraints. Moreover, such rules of thumb are currently crafted by hand, and they do not fare well on images that vary in their global characteristics, such as contrast and amount of shadow. However, methods from machine learning, to which we now turn, may be able to induce better conditions for selecting or rejecting candidate rooftops. If these acquired heuristics are more accurate than the existing handcrafted solutions, they will improve the reliability of the rooftop selection process.

4. A Review of Three Learning Techniques

We can formulate the task of acquiring rooftop selection heuristics in terms of supervised learning. In this process, training cases of some concept are labeled as to their class. In rooftop selection, only two classes exist, rooftop and non-rooftop, which we will refer to as positive and negative examples of the concept "rooftop". Each instance consists of a number of attributes and their associated values, along with a class label. These labeled instances constitute training data that are provided as input to an inductive learning routine, which generates concept descriptions designed to distinguish the positive examples from the negative ones. These knowledge structures state the conditions under which the concept, in this case "rooftop", is satisfied.
In a previous study (Maloof et al. 1997), we evaluated a variety of machine learning methods for the rooftop detection task and selected the two that showed promise of achieving a balance between the true positive and false positive rates: nearest neighbor and naive Bayes. These methods use different representations, performance schemes, and learning mechanisms for supervised concept learning, and exhibit different inductive biases, meaning that each algorithm acquires certain concepts more easily than others.

The nearest-neighbor method (e.g., Aha, Kibler, & Albert 1991) uses an instance-based representation of knowledge that simply retains training cases in memory. This approach classifies new instances by finding the "nearest" stored case, as measured by some distance metric, then predicting the class associated with that case. For numeric attributes, a common metric (which we use in our studies) is Euclidean distance. In this framework, learning involves nothing more than storing each training instance, along with its associated class. Although this method is quite simple and has known sensitivity to irrelevant attributes, in practice it performs well in many domains. Some versions select the k closest cases and predict the majority class; here we will focus on the "simple" nearest neighbor scheme, which uses only the nearest case for prediction.

The naive Bayesian classifier (e.g., Langley, Iba, & Thompson 1992) stores a probabilistic concept description for each class. This description includes an estimate of the class probability and the estimated conditional probabilities of each attribute value given the class. The method classifies new instances by computing the posterior probability of each class using Bayes' rule, combining the stored probabilities by assuming that the attributes are independent given the class and predicting
the class with the highest posterior probability. Like nearest neighbor, naive Bayes has known limitations, such as sensitivity to attribute correlations and an inability to represent multiple decision regions, but in practice it behaves well on many natural domains.

Figure 1. Visualization interface for labeling rooftop candidates. The system presents candidates to a user, who labels them by clicking either the `roof' or `non-roof' button. It also incorporates a simple learning algorithm to provide feedback to the user about the statistical properties of a candidate based on previously labeled examples.

Currently, Budds uses a handcrafted linear classifier for rooftop detection (Lin & Nevatia 1996), which is equivalent to a perceptron classifier (e.g., Zurada 1992). Although we did not train this method as we did naive Bayes and nearest neighbor, we included it in our evaluation for the purpose of comparison. This method represents concepts using a collection of weights w and a threshold θ. To classify an instance, which we represent as a vector of n numbers x, we compute the output o of the classifier using the formula:

o = +1 if Σ_{i=1}^{n} w_i x_i > θ, and o = −1 otherwise.

For our application, the classifier predicts the positive class if the output is +1 and predicts the negative class otherwise. There are a number of established methods for training perceptrons, but our preliminary studies suggested that they fared worse than the manually set weights, so we did not use the learned perceptrons here. Henceforth, we will refer to the handcrafted linear classifier used in Budds as the "Budds classifier".
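To make the three decision rules concrete, here is a minimal Python sketch, not the authors' implementation. The 1-NN and threshold rules follow the descriptions above directly; the naive Bayes variant models each continuous attribute with a class-conditional Gaussian, which is an assumption on our part, since the paper does not specify how it estimates probabilities for continuous attributes.

```python
import math

def nearest_neighbor_predict(query, instances, labels):
    """Simple 1-NN: predict the label of the closest stored case (Euclidean)."""
    dists = [math.dist(query, inst) for inst in instances]
    return labels[dists.index(min(dists))]

def linear_predict(x, w, theta):
    """Budds-style rule: +1 if the weighted sum exceeds the threshold, else -1."""
    return +1 if sum(wi * xi for wi, xi in zip(w, x)) > theta else -1

def naive_bayes_fit(instances, labels):
    """Estimate a class prior and per-attribute (mean, variance) for each class.
    Gaussian class-conditional densities are an assumption, not from the paper."""
    model = {}
    for c in set(labels):
        rows = [x for x, y in zip(instances, labels) if y == c]
        prior = len(rows) / len(instances)
        stats = []
        for j in range(len(rows[0])):
            vals = [r[j] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals) or 1e-9
            stats.append((mu, var))
        model[c] = (prior, stats)
    return model

def naive_bayes_predict(model, x):
    """Predict the class with the highest posterior, attributes assumed
    independent given the class (log space avoids underflow)."""
    def log_post(prior, stats):
        lp = math.log(prior)
        for xj, (mu, var) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (xj - mu) ** 2 / (2 * var)
        return lp
    return max(model, key=lambda c: log_post(*model[c]))
```

In the rooftop domain, each instance x would be the nine continuous Budds features described in Section 5, with labels +1 (rooftop) and −1 (non-rooftop).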
Table 1. Characteristics of the images and data sets. We began with a nadir and an oblique image of an area of Fort Hood, Texas, and derived three subimages from each that contained concentrations of buildings. We then used Budds to extract rooftop candidates and labeled each as either a positive or negative example of the concept "rooftop".

[Table body: for each of the six images, the original image (FHOV1027, nadir, or FHOV625, oblique), its location and aspect, and the numbers of positive and negative examples; the individual counts were not recovered.]

5. Generating, Representing, and Labeling Rooftop Candidates

We were interested in how well the various induction algorithms could learn to classify rooftop candidates in aerial images. This required three things: a set of images that contain buildings, some means to generate and represent plausible rooftops, and labels for each such candidate.

As our first step, we selected two images, FHOV1027 and FHOV625, of Fort Hood, Texas, which were collected as part of the RADIUS program (Firschein & Strat 1997). These images cover the same area but were taken from different viewpoints, one from a nadir angle and the other from an oblique angle. We subdivided each image into three subimages, focusing on locations that contained concentrations of buildings, to maximize the number of positive rooftop candidates. This gave us three pairs of images, each pair covering the same area but viewed from different aspects.

Our aim was to improve Budds, so we used this system to generate candidate rooftops for each image, producing six data sets. Following Lin and Nevatia (1996), the data sets described each rooftop candidate in terms of nine continuous features that summarize the evidence gathered from the various levels of analysis. For example, positive indications for the existence of a rooftop included evidence for edges and corners, the degree to which a candidate's opposing lines are parallel, support for the existence of orthogonal trihedral vertices, and shadows near the corners of the candidate. Negative evidence included the existence of lines that cross the candidate, L-junctions adjacent to the candidate, similarly adjacent T-junctions, gaps in the candidate's edges, and the degree to which enclosing lines failed to form a parallelogram.

We should note that induction algorithms are often sensitive to the features one uses to describe the data, and we make no claims that these nine attributes are the best ones for recognizing rooftops in aerial images. However, because our aim was to improve the robustness of Budds, we needed to use the same features as Lin and Nevatia's handcrafted classifier. Moreover, it seemed unlikely that we could devise better features than the system's authors had developed during years of research.

The third problem, labeling the generated rooftop candidates, proved the most challenging and the most interesting. Budds itself classifies each candidate, but since we were trying to improve on its ability, we could not use those labels. Thus, we tried an approach in which an expert
7 RooftopDetectionThroughMachineLearning 6 aregionsurroundingtheactualrooftop.unfortunately,uponinspectionneitherapproachgaveus positiveornegativedependingonthedistanceoftheirverticesfromthenearestactualrooftop's corners.wealsotriedasecondschemethatusedthenumberofcandidateverticesthatfellwithin speciedtheverticesofactualrooftopsintheimage,thenweautomaticallylabeledcandidatesas satisfactorylabelingresults. process.oneisthattheyignoreinformationaboutthecandidate'sshape;agoodrooftopshould beaparallelogram,yetnearnessofverticesisneithersucientornecessaryforthisform.a seconddrawbackisthattheyignoreotherinformationcontainedintheninebuddsattributes, Analysisrevealedthedicultieswithusingsuchrelationstoactualrooftopsinthelabeling two-dimensionalspacethatdescribeslocationwithintheimage,ratherthanthenine-dimensional suchasshadowsandcrossinglines.thebasicproblemisthatsuchmethodsdealonlywiththe daunting,aseachimageproducedthousandsofcandidaterooftops.tosupporttheprocess,we spacethatwewantthevisionsystemtouseinclassifyingacandidate. eachextractedrooftoptotheuser.thesystemdrawseachcandidateovertheportionoftheimage implementedaninteractivelabelingsysteminjava,showninfigure1,thatsuccessivelydisplays Reluctantly,weconcludedthatmanuallabelingbyahumanwasnecessary,butthistaskwas fromwhichitwasextracted,thenletstheuserclickbuttonsfor`roof'or`non-roof'tolabelthe example. 
The visual interface itself incorporates a simple learning mechanism, nearest neighbor, designed to improve the labeling process. As the system obtains feedback from the user about positive and negative examples, it divides unlabeled candidates into three classes: likely rooftops, unlikely rooftops, and unknown. The interface displays likely rooftops using green rectangles, unlikely rooftops as red rectangles, and unknown candidates as blue rectangles. The system includes a sensitivity parameter¹ that affects how certain the system must be before it proposes a label. After displaying a rooftop, the user either confirms or contradicts the system's prediction by clicking either the `roof' or `non-roof' button. The simple learning mechanism then uses this information to improve subsequent predictions of candidate labels.

1. The user can set this parameter using the slider bar and number field in the bottom right corner of Figure 1.

Our intent was that, as the interface gained experience with the user's labels, it would display fewer and fewer candidates about which it was uncertain, and thus speed up the later stages of interaction. Informal studies suggested that the system achieves this aim: by the end of the labeling session, the user typically confirms nearly all of the interface's recommendations. However, because we were concerned that our use of nearest neighbor might bias the labeling process in favor of this algorithm during later studies, we generated the data used in Section 7 by setting the sensitivity parameter so that the system presented all candidates as uncertain. Even handicapped in this manner, the interface required only about five hours to label the 17,829 roof candidates extracted from the six images. This comes to under one second per candidate, which still seems quite efficient.

In summary, what began as the simple task of labeling visual data led us to some of the more fascinating issues in our work. To incorporate supervised concept learning into vision systems, which can generate thousands of candidates per image, we must develop methods to reduce the burden of labeling these data. In future work, we intend to measure more carefully the ability of our adaptive labeling system to speed this process. We also plan to explore extensions that use the learned classifier to order candidate rooftops (showing the least certain ones first) and even to filter
candidates before they are passed on to the user (automatically labeling the most confident ones). Techniques such as selective sampling (e.g., Freund et al. 1997) and uncertainty sampling (Lewis & Catlett 1994) should prove useful toward these ends.

6. Cost-Sensitive Learning and Skewed Data

Two aspects of the rooftop selection task influenced our approach to implementation and evaluation. First, Budds works in a bottom-up manner, so if the system discards a rooftop, it cannot retrieve it later. Consequently, errors on the rooftop class (false negatives) are more expensive than errors on the non-rooftop class (false positives), so it is better to retain a false positive than to discard a false negative. The system has the potential for discarding false positives in later stages of processing, when it can draw upon accumulated evidence, such as the existence of walls and shadows. However, since false negatives cannot be recovered, we need to minimize errors on the rooftop class.

Second, we have a severely skewed data set, with training examples distributed non-uniformly across classes (781 rooftops vs. 17,048 non-rooftops). Given such skewed data, most induction algorithms have difficulty learning to predict the minority class. Moreover, we have established that errors on our minority class (rooftops) are most expensive, and the extreme skew only increases such errors. This interaction between skewed class distribution and unequal error costs occurs in many computer vision applications, in which a vision system generates thousands of candidates but only a handful correspond to objects of interest. It also holds in many other applications of machine learning, such as fraud detection (Fawcett & Provost 1997), discourse analysis (Soderland & Lehnert 1994), and telecommunications risk management (Ezawa, Singh, & Norton 1996).
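Given the class counts above, it is easy to verify how well a trivial classifier that always predicts the majority (non-rooftop) class would score on overall accuracy:

```python
positives, negatives = 781, 17048      # rooftops vs. non-rooftops (Section 6)
total = positives + negatives          # the 17,829 labeled candidates
majority_accuracy = negatives / total  # accuracy of always predicting "non-rooftop"
print(f"{majority_accuracy:.1%}")      # ~95.6%, yet every rooftop is missed
```

This is the roughly 95 percent hit rate discussed below, and it illustrates why overall accuracy is a misleading measure on this data set.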
These issues raise two challenges. First, they suggest the need for modified learning algorithms that can achieve high accuracy on the minority class. Second, they require an experimental methodology that lets us compare different methods on domains like rooftop detection, in which the classes are skewed and errors have different costs. In the remainder of this section, we further clarify the nature of the problem, after which we propose some cost-sensitive learning methods and an approach to experimental evaluation.

6.1 Favoritism Toward the Majority Class

In a previous study (Maloof et al. 1997), we evaluated several algorithms without taking into account the cost of classification errors and got confusing experimental results. Some methods, like the standard error-driven algorithm for revising perceptron weights (e.g., Zurada 1992), learned to always predict the majority class. The naive Bayesian classifier found a more comfortable trade-off between the true positive and false positive rates, but still favored the majority class.² For data sets that are skewed, an inductive method that learns to predict the majority class will often have a higher overall accuracy than a method that finds a balance between true positive and false positive rates. Indeed, always predicting the majority class for our problem yields a hit rate of 95 percent, which makes it a misleading measure of performance.

2. Covering algorithms, like AQ15 (Michalski et al. 1986) or CN2 (Clark & Niblett 1989), may be less susceptible to skewed data sets, but this is highly dependent on their rule selection criteria.

This bias toward the majority class only causes difficulty when we care more about errors on the minority class. For the rooftop domain, if the error costs for the two classes were the same, then we would not care on which class we made errors, provided we minimized the total number of mistakes. Nor would there be any problem if mistakes on the majority class were more expensive, since most learning methods are biased toward minimizing such errors anyway. On the other hand, if the class distribution runs counter to the relative cost of mistakes, as in our domain, then we must do something to compensate, both in the learning algorithm itself and in measuring its performance.

Breiman et al. (1984) note the close relation between the distribution of classes and the relative cost of errors. In particular, they point out that one can mitigate the bias against the minority class by duplicating examples of that class in the training data. This also helps explain why most induction methods give more weight to accuracy on the majority class, since skewed training data implicitly places more weight on errors for that class. In response, several researchers have explored approaches that alter the distribution of training data in various ways, including use of weights to bias the performance element (Cardie & Howe 1997), removing unimportant examples from the majority class (Kubat & Matwin 1997), and `boosting' the examples in the under-represented class (Freund & Schapire 1996). However, as we will see shortly, one can also modify the algorithms themselves to more directly respond to error costs.

6.2 Cost-Sensitive Learning Methods

Empirical comparisons among machine learning algorithms seldom focus on the cost of classification errors, possibly because most learning methods do not provide ways to take such costs into account. Happily, some researchers have explored variations on standard algorithms that effectively bias the method in favor of one class over others. For example, Lewis and Catlett (1994) introduced a loss ratio into C4.5 (Quinlan 1993) to bias it toward under-represented classes. Pazzani et al. (1994) have also done some preliminary work along these lines, which they describe as addressing the costs of different error types. Their method finds the minimum-cost classifier for a variety of problems using a set of hypothetical error costs. Turney (1995) presents results from an empirical evaluation of algorithms that take into account both the cost of tests to measure attributes and the cost of classification error.

When implementing cost-sensitive learning methods, the basic idea is to change the way the algorithm treats instances from the more expensive class relative to the other instances, either during the learning process or at the time of testing. In essence, we want to incorporate a cost heuristic into the algorithms so we can bias them toward making mistakes on the less costly class rather than on the more expensive class.

To accomplish this, we defined a cost for each class on the range [0.0, 1.0] that indicates the relative cost of making a mistake on one class versus another. Zero indicates that errors cost nothing, whereas one means that errors are maximally expensive. To incorporate a cost heuristic into the algorithms, we chose to modify the performance element of the algorithms, rather than the learning element, by using the cost heuristic to adjust the decision boundary at which the algorithm selects one class versus the other.

Recall that naive Bayes predicts the class with the highest posterior probability as computed using Bayes' rule, so we want the cost heuristic to bias prediction in favor of the more expensive class. For a cost parameter c_j ∈ [0.0, 1.0], we computed the expected cost λ_j for the class ω_j using the formula:
λ_j = p(ω_j | x) + c_j (1 − p(ω_j | x))

where x is the query, and p(ω_j | x) is the posterior probability of the jth class given the query. The cost-sensitive version of naive Bayes predicts the class ω_j with the least expected cost λ_j.

Nearest neighbor, as normally used, predicts the class of the example that is closest to the query. Therefore, the cost heuristic should have the effect of moving the query point closer to the closest example of the more expensive class. The magnitude of this change should be proportional to the magnitude of the cost parameter. Therefore, we computed the expected cost λ_j for the class ω_j using the formula:

λ_j = d_E(x, x_j) − c_j d_E(x, x_j)

where x_j is the closest neighbor from class ω_j to the query point, and d_E(x, y) is the Euclidean distance function. The cost-sensitive version of nearest neighbor predicts the class with the least expected cost. This modification also works for k nearest neighbor, which considers the k closest neighbors when classifying unknown instances.

Finally, because our modifications focused on the performance elements rather than on the learning algorithms, we can make similar changes to the Budds classifier. Since this classifier uses a linear discriminant function, we want the cost heuristic to adjust the threshold so the hyperplane of discrimination is farther from the hypothetical region of examples of the more expensive class, thus enlarging the decision region of that class. The degree to which the algorithm adjusts the threshold is again dependent on the magnitude of the cost parameter. The adjusted threshold θ′ is computed by:

θ′ = θ − Σ_{j=1}^{2} sgn(ω_j) c_j λ_j

where θ is the original threshold for the linear discriminant function, sgn(ω_j) returns positive for the positive class and negative for the negative class, and λ_j is the maximum value the weighted sum can take for the jth class. The cost-sensitive version of the Budds classifier predicts the
positive class if the weighted sum of an instance's attributes surpasses the adjusted threshold θ′; otherwise, it predicts the negative class.

6.3 ROC Analysis for Evaluating Performance

Our second challenge was to identify an experimental methodology that would let us compare the behavior of our cost-sensitive learning methods on the rooftop data. We have already seen that comparisons based on overall accuracy are not sufficient for domains that involve non-uniform costs or skewed distributions. Rather, we must separately measure accuracy on both classes, in terms of false positives and false negatives. Given information about the relative costs of errors, say from conversations with domain experts or from a domain analysis, we could then compute a weighted accuracy for each algorithm that takes cost into account (e.g., Pazzani et al. 1994; Fawcett & Provost 1997).

However, in this case, we had no access to image analysts or enough information about the
results of their interpretations to determine the actual costs for the domain. In such situations, rather than aiming for a single performance measure, as typically done in machine learning experiments, a natural solution is to evaluate each learning method over a range of cost settings.

Figure 2. An idealized Receiver Operating Characteristic (ROC) curve, plotting true positive rate against false positive rate.

ROC (Receiver Operating Characteristic) analysis (Swets 1988) provides a framework for carrying out such comparisons. The basic idea is to systematically vary some aspect of the situation, such as the cost ratio or the class distribution, and to plot the true positive rate against the false positive rate for each situation. Although researchers have used such ROC curves in signal detection and psychophysics for decades (e.g., Green & Swets 1974; Egan 1975), this technique has only recently begun to filter into machine learning research (e.g., Ezawa, Singh, & Norton 1996; Maloof et al. 1997; Provost & Fawcett 1997).

Figure 2 shows an idealized ROC curve generated by varying the cost parameter of a cost-sensitive learning algorithm. The lower left corner of the figure represents the situation in which mistakes on the negative class are maximally expensive (i.e., c+ = 0.0 and c− = 1.0). Conversely, the upper right corner of the ROC graph represents the situation in which mistakes on the positive class are maximally expensive (i.e., c+ = 1.0 and c− = 0.0). By varying over the range of cost parameters and plotting the classifier's true positive and false positive rates, we produce a series of points that represents the algorithm's accuracy trade-off. The point (0, 1) is where classification is perfect, with a false positive rate of zero and a true positive rate of one, so we want ROC curves that "push" toward this corner.

Traditional ROC analysis uses area under the curve as the preferred measure of performance, with curves that cover larger areas generally being viewed as better (Hanley & McNeil 1982; Swets 1988). Given the skewed nature of the rooftop data, and the different but imprecise costs of errors on the two classes, we decided to use area under the ROC curve as the dependent variable in our experimental studies. This measure raises problems when two curves have similar areas but are dissimilar and asymmetric, and thus occupy different regions of the ROC space. In such cases, other types of analysis are more useful (e.g., Provost & Fawcett 1997), but area under the curve appears to be most appropriate when curves have similar shapes and when one is nested within the other. As we will see, this relation typically holds for our cost-sensitive algorithms in the rooftop detection domain.
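Before turning to the experiments, the three cost-sensitive adjustments of Section 6.2 can be sketched by transcribing the formulas directly (a minimal sketch, not the authors' code; the class keys and example values are illustrative):

```python
import math

def cost_sensitive_bayes(posteriors, costs):
    """Predict the class minimizing lambda_j = p(w_j|x) + c_j * (1 - p(w_j|x)),
    following the naive Bayes formula of Section 6.2.
    `posteriors` and `costs` map class label -> value in [0, 1]."""
    lam = {j: posteriors[j] + costs[j] * (1 - posteriors[j]) for j in posteriors}
    return min(lam, key=lam.get)

def cost_sensitive_nn(query, nearest_per_class, costs):
    """Predict the class minimizing lambda_j = (1 - c_j) * d_E(x, x_j):
    the cost heuristic shrinks the distance to the nearest example of an
    expensive class. `nearest_per_class` maps class -> its closest stored case."""
    lam = {j: (1 - costs[j]) * math.dist(query, xj)
           for j, xj in nearest_per_class.items()}
    return min(lam, key=lam.get)

def adjusted_threshold(theta, costs, max_weighted_sum):
    """theta' = theta - sum over j of sgn(w_j) * c_j * lambda_j, where
    lambda_j is the maximum weighted sum for class j and sgn is +1 for the
    positive class, -1 for the negative class (keys +1 and -1 here)."""
    return theta - sum(sgn * costs[sgn] * max_weighted_sum[sgn]
                       for sgn in (+1, -1))
```

For example, raising the cost of the positive class lowers the adjusted threshold, which enlarges the positive decision region, as the text describes.

```python
# With c+ = 0.5, c- = 0.0 and maximum weighted sums of 4.0 and 3.0,
# the threshold drops from 2.0 to 0.0.
print(adjusted_threshold(2.0, {+1: 0.5, -1: 0.0}, {+1: 4.0, -1: 3.0}))
```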
Figure 3. ROC curves for Images 1 and 2, plotting true positive rate against false positive rate for naive Bayes, nearest neighbor, and the Budds classifier. We ran each method by training and testing using data derived from the same image over a range of misclassification costs. We conducted ten such runs and plotted the average true positive and false positive rates. These images are of the same location but different aspects: Image 1 is a nadir view, while Image 2 is an oblique.

7. Experimental Studies

To investigate the use of machine learning for the task of rooftop detection, we conducted experiments using the cost-sensitive versions of naive Bayes, nearest neighbor, and the Budds classifier. As typically done in such studies, in each experiment we trained the induction methods on data (rooftop candidates) separate from those used to test the learned classifiers. As we will see, the experiments differed in whether the training and test cases came from the same or distinct images, which let us examine different forms of generalization beyond the training data.

7.1 Within-Image Learning

Our first experimental study examined how the various methods behaved given within-image learning, that is, when generalizing to test cases taken from the same image on which we trained them. Our research hypothesis was that the learned classifiers would be more accurate, over a range of misclassification costs, than the handcrafted linear classifier. Because our measure of performance was area under the ROC curve, this translates into a prediction that the ROC curves of the learned rooftop classifiers would have larger areas than those of the Budds classifier.

For each image and method, we varied the error costs and measured the resulting true positive and false positive rates for ten runs. Since costs are relative (i.e., c+ = 0.0 and c− = 0.5 is equivalent to c+ = 0.25 and c− = 0.75) and our domain involved only two classes, we varied the cost parameter for only one class at a time and fixed the other at zero. Each run involved partitioning the data set randomly into training (60%) and test (40%) sets, running the learning algorithms on the instances in the training set, and evaluating the resulting concept descriptions using the data in the test set. Because the Budds classifier was hand-configured, it had no training phase, so we applied it directly to the instances in the test set. For each cost setting and each classifier, we plotted the average false positive rate against the average true positive rate over the ten runs.

Figure 3 presents the ROC curves for Images 1 and 2. Naive Bayes and nearest neighbor give similar results, but both fare better than the Budds classifier. Rather than present the curves
Table 2. Results for within-image experiments. For each image, we generated ROC curves by training and testing each method over a range of costs. We used the approximate area under the curve as the measure of performance, which appears with 95% confidence intervals. Naive Bayes performed best overall, with the Budds classifier outperforming nearest neighbor on three of the six images.

Rather than present the curves for the remaining four images, we follow Swets (1988) and report, in Table 2, the area under each ROC curve, which we approximated by summing the areas of the trapezoids defined by each pair of adjacent points in the ROC curve. For all images except for Image 6, naive Bayes produced curves with areas greater than those for the Budds classifier, thus generally supporting our research hypothesis. On Images 4, 5, and 6, nearest neighbor did worse than the handcrafted method, which runs counter to our prediction.

7.2 Between-Image Learning

We geared our next set of experiments more toward the goals of image analysis. Recall that our motivating problem is the large number of images that the analyst must process. In order to alleviate this burden, we want to apply knowledge learned from some images to many other images. But we have already noted that several dimensions of variation pose problems to transferring such learned knowledge to new images. For example, one viewpoint of a given site can differ from other viewpoints of the same site in orientation or in angle from the perpendicular. Images taken at different times and images of different areas present similar issues.

We designed experiments to let us understand better how the knowledge learned from one image generalizes to other images that differ along such dimensions. Our hypothesis here was a refined version of the previous one: classifiers learned from one set of images would be more accurate on unseen images than handcrafted classifiers. However, we also expected that between-image learning would give lower accuracy than the within-image situation, since differences across images would make generalization more difficult.
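The trapezoidal approximation described above can be sketched as follows; this is our own minimal implementation, with curve points given as (false positive rate, true positive rate) pairs:

```python
def roc_area(points):
    # Approximate area under an ROC curve by summing the trapezoids
    # defined by each pair of adjacent (fpr, tpr) points.
    pts = sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# A three-point curve from (0, 0) through (0.2, 0.8) to (1, 1):
print(roc_area([(0.0, 0.0), (0.2, 0.8), (1.0, 1.0)]))  # ~0.80
```

A curve hugging the top-left corner approaches area 1.0, while the chance diagonal gives 0.5, which is why larger areas indicate better classifiers across the cost range.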
One experiment focused on how the methods generalize over aspect. Recall from Table 1 that we had images from two aspects (i.e., nadir and oblique) and from three locations. This let us train the learning algorithms on an image from one aspect and test on an image from another aspect but from the same location. As an example, for the nadir aspect, we chose Image 1 and then tested on Image 2, which is an oblique image of the same location. We ran the algorithms in this manner using the images from each location, while varying their cost parameters and measuring their true positive and false positive rates. We then averaged these measures across the three locations and plotted the results as ROC curves, as shown in Figure 4. The areas under these curves and their 95% confidence intervals appear in Table 3.
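At each cost setting, the evaluation reduces to computing these two rates from a classifier's predictions on the held-out candidates; a minimal helper (the naming is ours):

```python
def tpr_fpr(preds, labels):
    # True positive rate: fraction of rooftops (label 1) correctly accepted.
    # False positive rate: fraction of non-rooftops (label 0) wrongly accepted.
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

# Toy check: 3 of 4 rooftops found, 1 of 6 non-rooftops falsely accepted.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
preds  = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
print(tpr_fpr(preds, labels))  # (0.75, ~0.167)
```

Repeating this at each cost setting, then averaging the pairs across runs, yields the points that trace out each ROC curve.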
Figure 4. ROC curves for experiments that tested generalization over aspect. Left: For each location, we trained each method on the oblique image and tested the resulting concept descriptions on the nadir image. We plotted the average true positive and false positive rates. Right: We followed a similar methodology, except that we trained the methods on the nadir images and tested on the oblique images.

One obvious conclusion is that the nadir images appear to pose an easier problem than the oblique images, since the curves for testing on nadir candidates are generally higher than those for testing on data from oblique images. For example, Table 3 shows that naive Bayes generates a curve with an area of 0.878 for the nadir images, but produces a curve with an area of 0.842 for the oblique images. The other two methods show a similar degradation in performance when generalizing from nadir to oblique images rather than from oblique to nadir images.

Upon comparing the behavior of different methods, we find that, for oblique-to-nadir generalization, naive Bayes (with an area under the ROC curve of 0.878) performs better than the Budds classifier, with an area of 0.837, which in turn did better than nearest neighbor (0.795). For nadir-to-oblique generalization, naive Bayes performs slightly better than the Budds classifier; they produced areas of 0.842 and 0.831, respectively. Nearest neighbor's curve in this situation covers an area of 0.785, which is considerably smaller.

A second experiment examined generalization over location. To this end, we trained the learning methods on pairs of images from one aspect and tested on the third image from the same aspect. As an example, for the nadir images, one of the three learning runs involved training on rooftop candidates from Images 1 and 3, then testing on candidates from Image 5. We ran each of the algorithms across a range of costs, measuring the false positive and true positive rates. We plotted the averages of these measures across all three learning runs for one aspect in an ROC curve, as shown in Figure 5.
In this context, we again see evidence that the oblique images presented a more difficult recognition task than the nadir aspect, since the oblique areas are less than those for the nadir images. Comparing the behavior of the various methods, Table 3 shows that, for the nadir aspect, naive Bayes performs slightly better than the Budds classifier, which gave areas of 0.901 and 0.…, respectively. As before, both did better than nearest neighbor, which yielded an area of 0.819 under its ROC curve. When generalizing over location on the oblique images, naive Bayes and the Budds classifier produced ROC curves with equal areas of 0.831. These were considerably better than nearest neighbor's, which had an area of 0.697.
Figure 5. ROC curves for experiments that tested generalization over location. Left: For each pair of images for the nadir aspect, we trained the methods on that pair and tested the resulting concept descriptions on the third image. We then plotted the average true positive and false positive rates. Right: We applied the same methodology using the images for the oblique aspect.

Thus, the results with the naive Bayesian classifier support our main hypothesis. In all experimental conditions this method fared better than or equal to the Budds linear classifier. On the other hand, the behavior of nearest neighbor typically gave worse results than the handcrafted rooftop detector, which went against our original expectations.

Recall that we also anticipated that generalizing across images would give lower accuracies than generalizing within images. To test this hypothesis, we must compare the results from these experiments with those from the within-image experiments (see Table 3). Simple calculation shows that, for the within-image condition (Table 2), naive Bayes produced an average ROC area of 0.9 for the nadir images and 0.851 for the oblique images. Similarly, nearest neighbor averaged 0.851 for the nadir images and 0.791 for the oblique images. Most of these areas are substantially higher than the analogous areas that resulted when these methods generalized across location and aspect. The one exception is that naive Bayes actually did equally well when generalizing over location for the nadir image, but the results generally support our prediction.

Also note that naive Bayes' performance degraded less than that of nearest neighbor when generalizing to unseen images. This can be seen by comparing the differences between each method's performance in the within-image condition and in the between-image conditions. For example, naive Bayes' average degradation in performance over all experimental conditions was 0.013, while nearest neighbor's was 0.047. This constitutes further evidence that naive Bayes is better suited for this domain, at least when operating over the nine features used in our experiments.

7.3 Learning from All Available Images

Our next study used all of the rooftop candidates generated from the six Fort Hood images, since we wanted to replicate our previous results in a situation similar to that we envision being used in practice, which would draw on training cases from all images. Based on the earlier experiments, we anticipated that the naive Bayesian classifier would yield an ROC curve of greater area than those of the other methods.
Table 3. Results for between-image experiments. We again used the approximate area under the ROC curve as the measure of performance, along with 95% confidence intervals. Naive Bayes performed the best, while the Budds classifier generally outperformed nearest neighbor. The labels 'Nadir' and 'Oblique' indicate the testing condition. We derived analogous results for the within-image experiments by averaging the results for each condition.

Combining the rooftop candidates from all six images gave us 17,829 instances, 781 labeled positive and 17,048 labeled negative. We ran each algorithm ten times over a range of costs. For each run and set of cost parameters, we randomly split the data into training (60%) and testing (40%) sets, then averaged the results for each cost level over its ten runs.

Figure 6 shows the resulting ROC curves, which plot the true positive and false positive rates, whereas Table 4 gives the approximate area under these curves. As anticipated, naive Bayes performed the best overall, producing a curve with area 0.85. Nearest neighbor fared slightly better than the Budds classifier, yielding an area of 0.801, compared to 0.787 for the latter.

In practice, image analysts will not evaluate a classifier's performance using area under the ROC curve but, rather, will have specific error costs in mind, even if they cannot state them formally. We have used ROC curves because we do not know these costs in advance, but we can inspect the behavior of the various classifiers at different points on these curves to give further insight into how much the learned classifiers are likely to aid analysts during actual use.
For example, consider the behavior of the naive Bayesian classifier when it achieves a true positive rate of 0.84 and a false positive rate of 0.27, the third diamond from the right in Figure 6. To obtain the same true positive rate, the Budds classifier produced a 0.62 false positive rate. This means that, for a given true positive rate, naive Bayes reduced the false positive rate by more than half over the handcrafted classifier. Hence, for the images we considered, the naive Bayesian classifier would have rejected 5,969 more non-rooftops than Budds. Similarly, by fixing the false positive rate, naive Bayes improved the true positive rate by 0.12 over the Budds classifier. In this case, the Bayesian classifier would have found 86 more rooftops than Budds would have detected.

7.4 Rates of Learning

We were also interested in the behavior of the learning methods as they processed increasing amounts of training data. Our long-term goal is to embed the learned classifier in an interactive system that supports an image analyst. For this reason, we would prefer a learning algorithm that achieves high accuracy from relatively few training cases, since this should reduce the load on the human analyst.
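Such operating-point comparisons amount to reading the false positive rate off each averaged curve at a fixed true positive rate. A sketch using linear interpolation; the interior points are the two operating points reported in the text, while the (0, 0) and (1, 1) endpoints and the function name are our own illustrative additions:

```python
def fpr_at_tpr(curve, target):
    # Given an ROC curve as (fpr, tpr) points, linearly interpolate the
    # false positive rate at a target true positive rate.
    pts = sorted(curve, key=lambda p: p[1])
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if y0 <= target <= y1:
            return x0 if y1 == y0 else x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return pts[-1][0]

naive_bayes = [(0.0, 0.0), (0.27, 0.84), (1.0, 1.0)]
budds       = [(0.0, 0.0), (0.62, 0.84), (1.0, 1.0)]
print(fpr_at_tpr(naive_bayes, 0.84), fpr_at_tpr(budds, 0.84))  # ~0.27 ~0.62
```

The gap of 0.35 in false positive rates, applied to the 17,048 negative candidates, accounts for the several thousand additional non-rooftops rejected; the exact figure of 5,969 reported above presumably comes from the unrounded rates.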
Figure 6. ROC curve for the experiment using all available image data. We ran each method over a range of costs using a training set (60%) and a testing set (40%) and averaged the true positive and false positive rates over ten runs. Naive Bayes produced the curve with the largest area, but nearest neighbor also yielded a curve larger in area than that for the Budds classifier.

To this end, we carried out a final experiment in which we systematically varied the number of training cases available to the learning method. We again used all of the available rooftop candidates, splitting the data into training (60%) and test (40%) sets, but further dividing the training set randomly into ten subsets (10%, 20%, ..., 100%). We ran the learning algorithms on each of the training subsets and evaluated the acquired concept descriptions on the reserved testing data, averaging our results over 25 separate training/test splits.

Figure 7 shows the resulting learning curves, each point of which corresponds to the average area under the ROC curves for a given number of training cases. As expected, the learning curve for the Budds classifier is flat, since it involves no training and we simply applied it to the same test set for each number of training cases. However, nearest neighbor produces a curve that starts below that of the Budds classifier and then surpasses it after seeing 70% of the training data. Naive Bayes shows similar improvement with increasing amounts of training data, but its performance was better than the Budds classifier from the start, after observing only 10% of the training data. This equates to roughly 6% of the available data and is less than the amount of data derived from one image. Not only was naive Bayes the best performing method, but it was also able to achieve this performance using very little of the available training data.
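The protocol behind these curves (nested training subsets evaluated on a fixed held-out set, averaged over repeated splits) can be sketched on synthetic data. Everything below is illustrative rather than the paper's setup: a 1-nearest-neighbor stand-in, two Gaussian classes, and plain accuracy instead of ROC area:

```python
import random

def nn_label(train, x):
    # 1-nearest neighbor: label of the closest training case.
    return min(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))[1]

def accuracy(train, test):
    return sum(nn_label(train, x) == y for x, y in test) / len(test)

random.seed(0)
# Synthetic two-class data: class 0 centered at the origin, class 1 shifted.
data = [([random.gauss(2 * y, 1.0), random.gauss(2 * y, 1.0)], y)
        for y in (0, 1) for _ in range(100)]

fracs = (0.1, 0.25, 0.5, 1.0)
runs = []
for _ in range(5):                        # repeated train/test splits
    random.shuffle(data)
    train, test = data[:120], data[120:]  # 60% / 40%
    runs.append([accuracy(train[:int(f * len(train))], test) for f in fracs])

curve = [sum(col) / len(col) for col in zip(*runs)]  # average per subset size
```

Plotting `curve` against `fracs` gives the learning curve; a handcrafted classifier with no training phase would appear as a horizontal line on the same axes.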
7.5 Summary

From the within-image experiments, in which we trained and tested the learning methods using data derived from the same image, it was apparent that at least one machine learning method, naive Bayes, showed promise of improving the rooftop detection task over the handcrafted linear classifier. The results from this experiment also established baseline performance conditions for the methods because they controlled for differences in aspect and location.

In an effort to test the learning methods for their ability to generalize to unseen images, we found that rooftop detection for oblique images posed a more difficult problem than for nadir images. This could be because Budds was initially developed using nadir images and then extended to handle oblique images. Thus, the features may be biased toward nadir-view rooftops. A more likely explanation is that oblique images are simply harder than nadir images. Nevertheless, under all circumstances, the performance of naive Bayes was equal to or better than that of the handcrafted linear classifier. Finally, we also discovered that the performance of the methods degraded when generalizing to unseen images, but that the performance of naive Bayes degraded less than that of nearest neighbor.

Our final experiment used all of the available image data for learning and demonstrated that naive Bayes and nearest neighbor outperformed the Budds classifier. Further analysis of specific points on the ROC curves revealed that naive Bayes improved upon the false positive rate of the handcrafted solution by more than a factor of two for true positive rates of 0.84 and higher. Learning curves demonstrated that naive Bayes achieved superior performance using very little of the available training data.

Table 4. Results for the experiment using all of the image data. We split the data into training (60%) and test (40%) sets and ran each method over a range of costs. We then computed the average area under the ROC curve and 95% confidence intervals over ten runs.

    Classifier          Approximate Area
    Naive Bayes         0.85
    Nearest Neighbor    0.801
    Budds Classifier    0.787

8. Related Work

Research on learning in computer vision has become increasingly common in recent years. Some work in visual learning takes an image-based approach (e.g., Beymer & Poggio 1996), in which the images themselves, usually normalized or transformed in some way, are used as input to a learning process, which is responsible for forming the intermediate representations necessary to transform the pixels into a decision or classification. Researchers have used this approach extensively for face and gesture recognition (e.g., Chan, Nasrabadi, & Mirelli 1996; Gutta et al. 1996; Osuna, Freund, & Girosi 1997; Segen 1994), although it has seen other applications as well (e.g., Nayar & Poggio 1996; Pomerleau 1996; Viola 1993).
A slightly different approach relies on handcrafted vision routines to extract relevant image features, based on intensity or shape properties, then learns to recognize desired objects using these machine-produced features. Shepherd (1983) used decision-tree induction to classify shapes of chocolates for an industrial vision application. Cromwell and Kak (1991) took a similar approach for recognizing electrical components, such as transistors, resistors, and capacitors. Maloof and Michalski (1997) examined various methods of learning shape characteristics for detecting blasting caps in X-ray images, whereas additional work (Maloof et al. 1996) discussed learning in a multi-step vision system for the same detection problem.

Figure 7. Learning curves for area under the ROC curve using all available image data. We ran each method on increasing amounts of training data and evaluated the resulting concept descriptions on reserved testing data. Each point is an average of ten runs.

Several researchers have also investigated learning for three-dimensional vision systems. Papers by Conklin (1993), Connell and Brady (1987), Cook et al. (1993), Provan, Langley, and Binford (1996), and Sengupta and Boyer (1993) all describe inductive approaches aimed at improving object recognition. The aim here is to learn the three-dimensional structure that characterizes an object or object class, rather than its appearance. Another line of research, which falls midway between this approach and image-based schemes, instead attempts to learn a small set of characteristic views, each of which can be used to recognize an object from a different perspective (e.g., Gros 1993; Pope & Lowe 1996).

Most work on visual learning ignores the importance of misclassification costs, but our work along these lines has some precedents. In particular, Draper, Brodley, and Utgoff (1994) incorporate the cost of errors into their algorithm for constructing and pruning multivariate decision trees. They tested this approach on the task of labeling pixels from outdoor images for use by a road-following vehicle. They determined that, in this context, labeling a road pixel as non-road was more costly than the reverse, and showed experimentally that their method could reduce such errors on novel test pixels. Woods, Bowyer, and Kegelmeyer (1996), as well as Rowley, Baluja, and Kanade (1996), report similar work that takes into account the cost of errors.

Much of the research on visual learning uses images of scenes or objects viewed at eye level (e.g., Draper 1997; Teller & Veloso 1997). One exception is Connell and Brady's (1987) work on learning structural descriptions of airplanes from aerial views. Their method converted training images into semantic networks that it then generalized by comparing to descriptions of other instances. However, the authors do not appear to have tested experimentally their algorithm's ability to accurately classify objects in new images. Another example is the SKICAT system (Fayyad et al. 1996), which catalogs celestial objects, such as galaxies and stars, using images from the Second Palomar Observatory Sky Survey.

A related system, JARTool (Fayyad et al. 1996), also analyzes aerial images, in this case to detect Venusian volcanos, using synthetic aperture radar on the Magellan spacecraft. Asker and Maclin (1997) extend JARTool by using an ensemble of 48 neural networks to improve performance. Using ROC curves, they demonstrate that the ensemble achieved better performance than either the individual learned classifiers or the one used originally in JARTool. They also document some of the difficulties associated with applying machine learning techniques to real-world problems, such as feature selection and instance labeling, which were similar to problems we encountered.

Finally, Draper (1996) reports a careful study of learning in the context of analyzing aerial images. His approach adapts methods for reinforcement learning to assign credit in a multi-stage recognition procedure (for software similar to Budds), then uses an induction method (backpropagation in neural networks) to learn conditions on operator selection. He presents initial results on a RADIUS task that also involves the detection of roofs. Our framework shares some features with Draper's approach, but assumes that learning is directed by feedback from a human expert. We predict that our supervised method will be more computationally tractable than his use of reinforcement learning, which is well known for its high complexity. Our approach does require more interaction with users, but we believe this interaction will be unobtrusive if cast within the context of an advisory system for image analysis.

9. Concluding Remarks

Although this study has provided some insight into the role of machine learning in image analysis, much still remains to be done. For example, we may want to consider other measures of performance that take into account the presence of multiple valid candidates for a given rooftop. Classifying one of these candidates correctly is sufficient for the purpose of image analysis.

In addition, although the rooftop selection stage was a natural place to start in applying our methods, we intend to work at both earlier and later levels of the building detection process. The goal here is not only to increase classification accuracy, which could be handled entirely by candidate selection, but also to reduce the complexity of processing by removing poor candidates before they are aggregated into larger structures. With this aim in mind, we plan to extend our work to all levels of the image understanding process. We must address a number of issues before we can make progress on these other stages. One involves identifying the cost of different errors at each level, and taking this into account in our modified induction algorithms. Another concerns whether we should use the same induction algorithm at each level or use different methods at each stage.

As we mentioned earlier, in order to automate the collection of training data for learning, we also hope to integrate learning routines into Budds. This system was not designed initially to be interactive, but we intend to modify it so that the image analyst can accept or reject recommendations made by the image understanding system, generating training data in the process. At intervals, the system would invoke its learning algorithms, producing revised knowledge that would alter the system's behavior in the future and, hopefully, reduce the user's need to make corrections. The interactive labeling system described in Section 5 could serve as an initial model for this interface.

In conclusion, our studies suggest that machine learning has an important role to play in improving the accuracy, and thus the robustness, of image analysis systems. However, we need additional experiments to give a better understanding of the factors affecting between-image generalization, and we need to extend learning to additional levels of the image understanding process. Also, before we can build a system that truly aids the human image analyst, we must further develop unobtrusive ways to collect training data to support learning.
Acknowledgements

The authors thank Ram Nevatia, Andres Huertas, and Andy Lin for their assistance in obtaining the images and data used for experimentation and providing valuable comments and advice. We would also like to thank Dan Shapiro for discussions about decision theory and Wayne Iba for his assistance with naive Bayes. This work was conducted at the Institute for the Study of Learning and Expertise and in the Computational Learning Laboratory, Center for the Study of Language and Information, at Stanford University. The research was supported by the Defense Advanced Research Projects Agency, under grant N…, administered by the Office of Naval Research, and by Sun Microsystems through a generous equipment grant.

References

Aha, D.; Kibler, D.; and Albert, M. 1991. Instance-based learning algorithms. Machine Learning 6:37–66.

Asker, L., and Maclin, R. 1997. Feature engineering and classifier selection: a case study in Venusian volcano detection. In Proceedings of the Fourteenth International Conference on Machine Learning, 3–11. San Francisco, CA: Morgan Kaufmann.

Beymer, D., and Poggio, T. 1996. Image representations for visual learning. Science 272:1905–1909.

Bradley, A. 1997. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognition 30:1145–1159.

Breiman, L.; Friedman, J.; Olshen, R.; and Stone, C. 1984. Classification and regression trees. Belmont, CA: Wadsworth.

Cardie, C., and Howe, N. 1997. Improving minority class prediction using case-specific feature weights. In Proceedings of the Fourteenth International Conference on Machine Learning, 57–65. San Francisco, CA: Morgan Kaufmann.

Chan, L.; Nasrabadi, N.; and Mirelli, V. 1996. Multi-stage target recognition using modular vector quantizers and multilayer perceptrons. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 114–119. Los Alamitos, CA: IEEE Press.

Clark, P., and Niblett, T. 1989. The CN2 induction algorithm. Machine Learning 3:261–284.

Conklin, D. 1993. Transformation-invariant indexing and machine discovery for computer vision. In Working Notes of the AAAI Fall Symposium on Machine Learning in Computer Vision, 10–14. Menlo Park, CA: AAAI Press.

Connell, J., and Brady, M. 1987. Generating and generalizing models of visual objects. Artificial Intelligence 31:159–183.

Cook, D.; Hall, L.; Stark, L.; and Bowyer, K. 1993. Learning combination of evidence functions in object recognition. In Working Notes of the AAAI Fall Symposium on Machine Learning in Computer Vision, 139–143. Menlo Park, CA: AAAI Press.

Cromwell, R., and Kak, A. 1991. Automatic generation of object class descriptions using symbolic learning techniques. In Proceedings of the Ninth National Conference on Artificial Intelligence, 710–717.

Draper, B.; Brodley, C.; and Utgoff, P. 1994. Goal-directed classification using linear machine decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence 16(9):888–893.

Draper, B. 1996. Learning grouping strategies for 2D and 3D object recognition. In Proceedings of the Image Understanding Workshop, 1447–1454. San Francisco, CA: Morgan Kaufmann.

Draper, B. 1997. Learning control strategies for object recognition. In Ikeuchi, K., and Veloso, M., eds., Symbolic Visual Learning. New York, NY: Oxford University Press. 49–76.

Egan, J. 1975. Signal detection theory and ROC analysis. New York, NY: Academic Press.

Ezawa, K.; Singh, M.; and Norton, S. 1996. Learning goal-oriented Bayesian networks for telecommunications risk management. In Proceedings of the Thirteenth International Conference on Machine Learning, 139–147. San Francisco, CA: Morgan Kaufmann.

Fawcett, T., and Provost, F. 1997. Adaptive fraud detection. Data Mining and Knowledge Discovery 1:291–316.

Fayyad, U.; Smyth, P.; Burl, M.; and Perona, P. 1996. Learning to catalog science images. In Nayar, S., and Poggio, T., eds., Early Visual Learning. New York, NY: Oxford University Press. 237–268.

Firschein, O., and Strat, T., eds. 1997. RADIUS: image understanding for imagery intelligence. San Francisco, CA: Morgan Kaufmann.

Freund, Y., and Schapire, R. 1996. Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning, 148–156. San Francisco, CA: Morgan Kaufmann.

Freund, Y.; Seung, H.; Shamir, E.; and Tishby, N. 1997. Selective sampling using the Query by Committee algorithm. Machine Learning 28:133–168.

Green, D., and Swets, J. 1974. Signal detection theory and psychophysics. New York, NY: Robert E. Krieger Publishing.

Gros, P. 1993. Matching and clustering: two steps towards automatic object model generation in computer vision. In Working Notes of the AAAI Fall Symposium on Machine Learning in Computer Vision, 40–44. Menlo Park, CA: AAAI Press.

Gutta, S.; Huang, J.; Imam, I.; and Weschler, H. 1996. Face and hand gesture recognition using hybrid classifiers. In Proceedings of the Second International Conference on Automatic Face and Gesture Recognition, 164–169. Los Alamitos, CA: IEEE Press.

Hanley, J., and McNeil, B. 1982. The meaning and use of the area under a Receiver Operating Characteristic (ROC) curve. Radiology 143:29–36.

Kubat, M., and Matwin, S. 1997. Addressing the curse of imbalanced training sets: one-sided selection. In Proceedings of the Fourteenth International Conference on Machine Learning, 179–186. San Francisco, CA: Morgan Kaufmann.

Langley, P., and Simon, H. 1995. Applications of machine learning and rule induction. Communications of the ACM 38:55–64.

Langley, P.; Iba, W.; and Thompson, K. 1992. An analysis of Bayesian classifiers. In Proceedings of the Tenth National Conference on Artificial Intelligence, 223–228. Menlo Park, CA: AAAI Press.

Lewis, D., and Catlett, J. 1994. Heterogeneous uncertainty sampling for supervised learning. In Proceedings of the Eleventh International Conference on Machine Learning, 148–156. San Francisco, CA: Morgan Kaufmann.

Lin, C., and Nevatia, R. 1996. Building detection and description from monocular aerial images. In Proceedings of the Image Understanding Workshop, 461–468. San Francisco, CA: Morgan Kaufmann.

Maloof, M., and Michalski, R. 1997. Learning symbolic descriptions of shape for object recognition in X-ray images. Expert Systems with Applications 12:11–20.

Maloof, M.; Duric, Z.; Michalski, R.; and Rosenfeld, A. 1996. Recognizing blasting caps in X-ray images. In Proceedings of the Image Understanding Workshop, 1257–1261. San Francisco, CA: Morgan Kaufmann.

Maloof, M.; Langley, P.; Sage, S.; and Binford, T. 1997. Learning to detect rooftops in aerial images. In Proceedings of the Image Understanding Workshop, 835–845. San Francisco, CA: Morgan Kaufmann.

Michalski, R.; Mozetic, I.; Hong, J.; and Lavrac, N. 1986. The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. In Proceedings of the Fifth National Conference on Artificial Intelligence, 1041–1045. Menlo Park, CA: AAAI Press.

Nayar, S., and Poggio, T., eds. 1996. Early Visual Learning. New York, NY: Oxford University Press.

Osuna, E.; Freund, R.; and Girosi, F. 1997. Training Support Vector Machines: an application to face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 130–136. Los Alamitos, CA: IEEE Press.

Pazzani, M.; Merz, C.; Murphy, P.; Ali, K.; Hume, T.; and Brunk, C. 1994. Reducing misclassification costs. In Proceedings of the Eleventh International Conference on Machine Learning, …–225. San Francisco, CA: Morgan Kaufmann.

Pomerleau, D. 1996. Neural network vision for robot driving. In Nayar, S., and Poggio, T., eds., Early Visual Learning. New York, NY: Oxford University Press. 161–….

Pope, A., and Lowe, D. 1996. Learning probabilistic appearance models for object recognition. In Nayar, S., and Poggio, T., eds., Early Visual Learning. New York, NY: Oxford University Press. 67–97.

Provan, G.; Langley, P.; and Binford, T. 1996. Probabilistic learning of three-dimensional object models. In Proceedings of the Image Understanding Workshop, 1403–1413. San Francisco, CA: Morgan Kaufmann.

Provost, F., and Fawcett, T. 1997. Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, 43–48. Menlo Park, CA: AAAI Press.

Quinlan, J. 1993. C4.5: Programs for machine learning. San Francisco, CA: Morgan Kaufmann.

Rowley, H.; Baluja, S.; and Kanade, T. 1996. Neural network-based face detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 203–208. Los Alamitos, CA: IEEE Press.

Segen, J. 1994. GEST: a learning computer vision system that recognizes hand gestures. In Michalski, R., and Tecuci, G., eds., Machine Learning: A Multistrategy Approach, volume 4. San Francisco, CA: Morgan Kaufmann. 621–634.

Sengupta, K., and Boyer, K. 1993. Incremental model base updating: learning new model sites. In Working Notes of the AAAI Fall Symposium on Machine Learning in Computer Vision, 1–5. Menlo Park, CA: AAAI Press.

Shepherd, B. 1983. An appraisal of a decision tree approach to image classification. In IJCAI-83, 473–475.

Soderland, S., and Lehnert, W. 1994. Corpus-driven knowledge acquisition for discourse analysis. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 827–832.

Swets, J. 1988. Measuring the accuracy of diagnostic systems. Science 240:1285–1293.

Teller, A., and Veloso, M. 1997. PADO: a new learning architecture for object recognition. In Ikeuchi, K., and Veloso, M., eds., Symbolic Visual Learning. New York, NY: Oxford University Press. 77–112.

Turney, P. 1995. Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. Journal of Artificial Intelligence Research 2:369–409.

Viola, P. 1993. Feature-based recognition of objects. In Working Notes of the AAAI Fall Symposium on Machine Learning in Computer Vision, 60–64. Menlo Park, CA: AAAI Press.

Woods, K.; Bowyer, K.; and Kegelmeyer, W. 1996. Combination of multiple classifiers using local accuracy estimates. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 391–396. Los Alamitos, CA: IEEE Press.

Zurada, J. 1992. Introduction to artificial neural systems. St. Paul, MN: West Publishing.
Mining a Corpus of Job Ads Workshop Strings and Structures Computational Biology & Linguistics Jürgen Jürgen Hermes Hermes Sprachliche Linguistic Data Informationsverarbeitung Processing Institut Department
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
Data Mining Techniques for Prognosis in Pancreatic Cancer
Data Mining Techniques for Prognosis in Pancreatic Cancer by Stuart Floyd A Thesis Submitted to the Faculty of the WORCESTER POLYTECHNIC INSTITUE In partial fulfillment of the requirements for the Degree
Big Data: a new era for Statistics
Big Data: a new era for Statistics Richard J. Samworth Abstract Richard Samworth (1996) is a Professor of Statistics in the University s Statistical Laboratory, and has been a Fellow of St John s since
Combining Global and Personal Anti-Spam Filtering
Combining Global and Personal Anti-Spam Filtering Richard Segal IBM Research Hawthorne, NY 10532 Abstract Many of the first successful applications of statistical learning to anti-spam filtering were personalized
KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics
ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE- CQAS- 747- Principles of
Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm
Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm Markus Goldstein and Andreas Dengel German Research Center for Artificial Intelligence (DFKI), Trippstadter Str. 122,
Spam Detection on Twitter Using Traditional Classifiers M. McCord CSE Dept Lehigh University 19 Memorial Drive West Bethlehem, PA 18015, USA
Spam Detection on Twitter Using Traditional Classifiers M. McCord CSE Dept Lehigh University 19 Memorial Drive West Bethlehem, PA 18015, USA [email protected] M. Chuah CSE Dept Lehigh University 19 Memorial
CS 207 - Data Science and Visualization Spring 2016
CS 207 - Data Science and Visualization Spring 2016 Professor: Sorelle Friedler [email protected] An introduction to techniques for the automated and human-assisted analysis of data sets. These
Bayes Theorem & Diagnostic Tests Screening Tests
Bayes heorem & Screening ests Bayes heorem & Diagnostic ests Screening ests Some Questions If you test positive for HIV, what is the probability that you have HIV? If you have a positive mammogram, what
T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577
T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or
Data Mining in Weka Bringing It All together
Data Mining in Weka Bringing It All together Predictive Analytics Center of Excellence (PACE) San Diego Super Computer Center, UCSD Data Mining Boot Camp 1 Introduction The project assignment demonstrates
Search Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
Predicting Student Performance by Using Data Mining Methods for Classification
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance
Anti-Spam Filter Based on Naïve Bayes, SVM, and KNN model
AI TERM PROJECT GROUP 14 1 Anti-Spam Filter Based on,, and model Yun-Nung Chen, Che-An Lu, Chao-Yu Huang Abstract spam email filters are a well-known and powerful type of filters. We construct different
Using Dalvik opcodes for Malware Detection on Android
Using Dalvik opcodes for Malware Detection on Android José Gaviria de la Puerta, Borja Sanz, Igor Santos and Pablo García Bringas DeustoTech Computing, University of Deusto [email protected], [email protected],
Math. Rounding Decimals. Answers. 1) Round to the nearest tenth. 8.54 8.5. 2) Round to the nearest whole number. 99.59 100
1) Round to the nearest tenth. 8.54 8.5 2) Round to the nearest whole number. 99.59 100 3) Round to the nearest tenth. 310.286 310.3 4) Round to the nearest whole number. 6.4 6 5) Round to the nearest
Identifying Peer-to-Peer Traffic Based on Traffic Characteristics
Identifying Peer-to-Peer Traffic Based on Traffic Characteristics Prof S. R. Patil Dept. of Computer Engineering SIT, Savitribai Phule Pune University Lonavala, India [email protected] Suraj Sanjay Dangat
Data Mining for Network Intrusion Detection
Data Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining for Network Intrusion Detection p.1/55 Overview This is important for defense in depth Much
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Average rate of change of y = f(x) with respect to x as x changes from a to a + h:
L15-1 Lecture 15: Section 3.4 Definition of the Derivative Recall the following from Lecture 14: For function y = f(x), the average rate of change of y with respect to x as x changes from a to b (on [a,
Machine Learning with MATLAB David Willingham Application Engineer
Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the
WEKA. Machine Learning Algorithms in Java
WEKA Machine Learning Algorithms in Java Ian H. Witten Department of Computer Science University of Waikato Hamilton, New Zealand E-mail: [email protected] Eibe Frank Department of Computer Science
Is a Data Scientist the New Quant? Stuart Kozola MathWorks
Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by
A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering
A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences
Role of Neural network in data mining
Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
Predictive Data modeling for health care: Comparative performance study of different prediction models
Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath [email protected] National Institute of Industrial Engineering (NITIE) Vihar
Homework 4 Statistics W4240: Data Mining Columbia University Due Tuesday, October 29 in Class
Problem 1. (10 Points) James 6.1 Problem 2. (10 Points) James 6.3 Problem 3. (10 Points) James 6.5 Problem 4. (15 Points) James 6.7 Problem 5. (15 Points) James 6.10 Homework 4 Statistics W4240: Data Mining
A semi-supervised Spam mail detector
A semi-supervised Spam mail detector Bernhard Pfahringer Department of Computer Science, University of Waikato, Hamilton, New Zealand Abstract. This document describes a novel semi-supervised approach
Using Artificial Intelligence to Manage Big Data for Litigation
FEBRUARY 3 5, 2015 / THE HILTON NEW YORK Using Artificial Intelligence to Manage Big Data for Litigation Understanding Artificial Intelligence to Make better decisions Improve the process Allay the fear
Course Description This course will change the way you think about data and its role in business.
INFO-GB.3336 Data Mining for Business Analytics Section 32 (Tentative version) Spring 2014 Faculty Class Time Class Location Yilu Zhou, Ph.D. Associate Professor, School of Business, Fordham University
On the Relative Value of Cross-Company and Within-Company Data for Defect Prediction
Empirical Software Engineering manuscript No. () On the Relative Value of Cross-Company and Within-Company Data for Defect Prediction Burak Turhan Tim Menzies Ayse Bener Justin Distefano Received: Sept
Data Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University [email protected] Chetan Naik Stony Brook University [email protected] ABSTRACT The majority
MACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
Sentiment analysis using emoticons
Sentiment analysis using emoticons Royden Kayhan Lewis Moharreri Steven Royden Ware Lewis Kayhan Steven Moharreri Ware Department of Computer Science, Ohio State University Problem definition Our aim was
Applying Classifier Algorithms to Organizational Memory to Build an Attrition Predictor Model
Applying Classifier Algorithms to Organizational Memory to Build an Attrition Predictor Model K. M. SUCEENDRAN 1, R. SARAVANAN 2, DIVYA ANANTHRAM 3, Dr.S.POONKUZHALI 4, R.KISHORE KUMAR 5, Dr.K.SARUKESI
Recommender Systems: Content-based, Knowledge-based, Hybrid. Radek Pelánek
Recommender Systems: Content-based, Knowledge-based, Hybrid Radek Pelánek 2015 Today lecture, basic principles: content-based knowledge-based hybrid, choice of approach,... critiquing, explanations,...
Lecture: Mon 13:30 14:50 Fri 9:00-10:20 ( LTH, Lift 27-28) Lab: Fri 12:00-12:50 (Rm. 4116)
Business Intelligence and Data Mining ISOM 3360: Spring 203 Instructor Contact Office Hours Course Schedule and Classroom Course Webpage Jia Jia, ISOM Email: [email protected] Office: Rm 336 (Lift 3-) Begin
Network Intrusion Detection Using a HNB Binary Classifier
2015 17th UKSIM-AMSS International Conference on Modelling and Simulation Network Intrusion Detection Using a HNB Binary Classifier Levent Koc and Alan D. Carswell Center for Security Studies, University
How To Prevent Network Attacks
Ali A. Ghorbani Wei Lu Mahbod Tavallaee Network Intrusion Detection and Prevention Concepts and Techniques )Spri inger Contents 1 Network Attacks 1 1.1 Attack Taxonomies 2 1.2 Probes 4 1.2.1 IPSweep and
Data Mining Classification: Decision Trees
Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous
A Content based Spam Filtering Using Optical Back Propagation Technique
A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT
2.0. Specification of HSN 2.0 JavaScript Static Analyzer
2.0 Specification of HSN 2.0 JavaScript Static Analyzer Pawe l Jacewicz Version 0.3 Last edit by: Lukasz Siewierski, 2012-11-08 Relevant issues: #4925 Sprint: 11 Summary This document specifies operation
Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.
Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus
Keywords: Data Warehouse, Data Warehouse testing, Lifecycle based testing, performance testing.
DOI 10.4010/2016.493 ISSN2321 3361 2015 IJESC Research Article December 2015 Issue Performance Testing Of Data Warehouse Lifecycle Surekha.M 1, Dr. Sanjay Srivastava 2, Dr. Vineeta Khemchandani 3 IV Sem,
E-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce
Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware
Application of Data Mining based Malicious Code Detection Techniques for Detecting new Spyware Cumhur Doruk Bozagac Bilkent University, Computer Science and Engineering Department, 06532 Ankara, Turkey
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,
Monday Morning Data Mining
Monday Morning Data Mining Tim Ruhe Statistische Methoden der Datenanalyse Outline: - data mining - IceCube - Data mining in IceCube Computer Scientists are different... Fakultät Physik Fakultät Physik
SOPS: Stock Prediction using Web Sentiment
SOPS: Stock Prediction using Web Sentiment Vivek Sehgal and Charles Song Department of Computer Science University of Maryland College Park, Maryland, USA {viveks, csfalcon}@cs.umd.edu Abstract Recently,
Model Selection. Introduction. Model Selection
Model Selection Introduction This user guide provides information about the Partek Model Selection tool. Topics covered include using a Down syndrome data set to demonstrate the usage of the Partek Model
B2.53-R3: COMPUTER GRAPHICS. NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions.
B2.53-R3: COMPUTER GRAPHICS NOTE: 1. There are TWO PARTS in this Module/Paper. PART ONE contains FOUR questions and PART TWO contains FIVE questions. 2. PART ONE is to be answered in the TEAR-OFF ANSWER
Statistical Validation and Data Analytics in ediscovery. Jesse Kornblum
Statistical Validation and Data Analytics in ediscovery Jesse Kornblum Administrivia Silence your mobile Interactive talk Please ask questions 2 Outline Introduction Big Questions What Makes Things Similar?
Linear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
Scaling Up the Accuracy of Naive-Bayes Classiers: a Decision-Tree Hybrid. Ron Kohavi. Silicon Graphics, Inc. 2011 N. Shoreline Blvd. ronnyk@sgi.
Scaling Up the Accuracy of Classiers: a Decision-Tree Hybrid Ron Kohavi Data Mining and Visualization Silicon Graphics, Inc. 2011 N. Shoreline Blvd Mountain View, CA 94043-1389 [email protected] Abstract
Predicting Flight Delays
Predicting Flight Delays Dieterich Lawson [email protected] William Castillo [email protected] Introduction Every year approximately 20% of airline flights are delayed or cancelled, costing
CSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Date : July 28, 2015
Date : July 28, 2015 Awesome(Team( 2! Who"are"we?" Menish Gupta Lukas Osborne Founder!&!CEO! 9+!years!@!Amex!! 5!years!@!Startups!in!NYC! B.S.!/!M.S.!Comp!Sci.!NJIT! Data!Science! 7!PublicaIons! 5!years!@!CISMM!Labs!
Automatic Text Processing: Cross-Lingual. Text Categorization
Automatic Text Processing: Cross-Lingual Text Categorization Dipartimento di Ingegneria dell Informazione Università degli Studi di Siena Dottorato di Ricerca in Ingegneria dell Informazone XVII ciclo
Diploma Of Computing
Diploma Of Computing Course Outline Campus Intake CRICOS Course Duration Teaching Methods Assessment Course Structure Units Melbourne Burwood Campus / Jakarta Campus, Indonesia March, June, October 022638B
Detecting Internet Worms Using Data Mining Techniques
Detecting Internet Worms Using Data Mining Techniques Muazzam SIDDIQUI Morgan C. WANG Institute of Simulation & Training Department of Statistics and Actuarial Sciences University of Central Florida University
On the Role of Data Mining Techniques in Uncertainty Quantification
On the Role of Data Mining Techniques in Uncertainty Quantification Chandrika Kamath Lawrence Livermore National Laboratory Livermore, CA, USA [email protected] USA/South America Symposium on Stochastic
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification
Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati [email protected], [email protected]
Chapter 4. Probability and Probability Distributions
Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the
