Modelingand Rendering Architecture from Photographs: Ahybrid geometry and imagebased approach


 Eileen Wilson
 1 years ago
 Views:
Transcription
1 Modelingand Rendering Architecture from Photographs: Ahbrid geometr and imagebased approach PaulE. Debevec CamilloJ. Talor JitendraMalik UniversitofCaliforniaat Berkele 1 ABSTRACT Wepresentanewapproachformodelingandrenderingeistingarchitectural scenesfrom asparsesetof still photographs. Our modeling approach, which combines both geometrbasedand imagebased techniques, has two components. The first component is a photogrammetricmodelingmethodwhichfacilitatestherecoverof thebasicgeometrofthephotographedscene. Ourphotogrammetric modelingapproachis effective, convenient,androbustbecause it eploits the constraints that are characteristic of architectural scenes. Thesecondcomponentis amodelbasedstereo algorithm, which recovers how the real scene deviates from the basic model. Bmakinguseofthemodel,ourstereotechniquerobustlrecovers accuratedepthfrom widelspacedimagepairs. Consequentl,our approachcanmodellargearchitecturalenvironmentswithfarfewer photographs than current imagebased modeling approaches. For producingrenderings,wepresentviewdependentteturemapping, amethodof compositingmultiple views of ascenethat better simulates geometricdetailonbasicmodels. Ourapproachcanbeused to recovermodelsfor useineither geometrbasedorimagebased rendering sstems. Wepresent results that demonstrate our approach sabilittocreaterealisticrenderingsofarchitecturalscenes from viewpointsfar from theoriginal photographs. CR Descriptors: I.2.10 [Artificial Intelligence]: Vision and Scene Understanding  Modeling and recover of phsical attributes; I.3.7 [Computer Graphics]: ThreeDimensional GraphicsandRealismColor,shading,shadowing,andtetureI.4.8[Image Processing]: Scene Analsis Stereo; J.6 [ComputerAided Engineering]: Computeraideddesign(CAD). 1 INTRODUCTION Efforts to model the appearance and dnamics of the real world have producedsome of the most compelling imager in computer graphics. In particular, efforts to model architectural scenes, from the Amiens Cathedral to the Giza Pramids to Berkele s Soda Hall, have produced impressive walkthroughs and inspiring flbs. Clearl, it is anattractive applicationtobe ableto eplorethe world sarchitectureunencumberedb fences,gravit, customs,or jetlag. 1 Computer Science Division, Universit of California at Berkele, Berkele,CA alsohttp://www.cs.berkele.edu/ debevec/research (a) Geometr Based user input Modeling Program model Rendering Algorithm renderings teture maps (b) Hbrid Approach images basic model Model Based Stereo depth maps Image Warping renderings user input Photogrammetric Modeling Program (c) Image Based images Stereo Correspondence depth maps Image Warping renderings (user input) Unfortunatel, current geometrbased methods (Fig. 1a) of modeling eisting architecture, in which amodeling program is used to manuall position the elements of the scene, have several drawbacks. First,theprocessisetremellaborintensive,tpicall involving surveing the site, locating and digitizing architectural plans(ifavailable),orconvertingeistingcaddata(again,ifavailable). Second,it is difficulttoverifwhethertheresultingmodelis accurate. Most disappointing,though, is that the renderings of the resultingmodelsarenoticeablcomputergenerated;eventhosethat emploliberalteturemappinggenerallfailtoresemblerealphotographs. Figure 1: Schematic of how our hbrid approach combines geometrbasedandimagebasedapproachestomodelingandrenderingarchitecturefromphotographs. Recentl, creating models directl from photographs has received increased interest in computer graphics. Since real images areusedas input, suchanimagebasedsstem(fig. 1c) hasanadvantagein producingphotorealistic renderings as output. Some of the most promisingof thesesstems[16, 13] rel on the computer visiontechniqueofcomputationalstereopsistoautomaticalldeterminethestructureofthescenefromthemultiplephotographsavailable. As aconsequence,however,thesesstemsareonl as strong as the underling stereoalgorithms. This has causedproblems because stateoftheart stereo algorithms have anumber of significantweaknesses;inparticular,thephotographsneedtoappearver similar for reliable results to be obtained. Becauseof this, current imagebasedtechniquesmustusemancloselspacedimages,and insomecasesemplosignificantamountsofuserinputforeachimagepair to supervisethe stereoalgorithm. In this framework, capturingthedataforarealisticallrenderablemodelwouldrequirean impracticalnumberofcloselspacedphotographs,andderivingthe depthfromthephotographscouldrequireanimpracticalamountof userinput. Theseconcessionstotheweaknessof stereoalgorithms bode poorl for creating largescale, freel navigable virtual environmentsfrom photographs. Ourresearchaims tomaketheprocessofmodelingarchitectural
2 scenes more convenient, more accurate, and more photorealistic thanthemethodscurrentlavailable. Todothis,wehavedeveloped anewapproachthatdrawsonthestrengthsofbothgeometrbased andimagebasedmethods,asillustratedinfig. 1b. Theresultisthat ourapproachtomodelingandrenderingarchitecturerequiresonla sparsesetofphotographsandcanproducerealisticrenderingsfrom arbitrar viewpoints. In our approach, abasic geometric model of the architectureis recoveredinteractivelwith aneastousephotogrammetricmodelingsstem,novelviewsarecreatedusingviewdependentteturemapping,andadditional geometricdetail canbe recovered automaticall through stereo correspondence. The final images can be rendered with current imagebased rendering techniques. Because onl photographs are required, our approach to modeling architecture is neither invasive nor does it require architecturalplans,cadmodels,orspecializedinstrumentationsuchas surveingequipment,gpssensorsor rangescanners. 1.1 Background and Related Work The process of recovering 3D structure from 2D images has been acentral endeavorwithin computer vision, and the processof rendering such recovered structures is asubject receiving increased interest in computer graphics. Although no general technique eists to derivemodelsfrom images,fourparticular areasof research haveprovidedresultsthatareapplicabletotheproblemofmodeling andrendering architectural scenes. Theare: Camera Calibration, Structure from Motion, Stereo Correspondence, and ImageBased Rendering Camera Calibration Recovering3D structure from images becomes asimpler problem whenthecamerasusedarecalibrated,thatis,themappingbetween imagecoordinatesanddirectionsrelativeto eachcamerais known. This mappingis determined b,among other parameters,the camera sfocal length and its pattern of radial distortion. Camera calibrationisawellstudiedproblembothinphotogrammetrandcomputer vision;somesuccessfulmethods include[20] and[5]. While there has beenrecent progress in the useof uncalibratedviews for 3D reconstruction [7], we have found camera calibration to be a straightforward processthatconsiderablsimplifiestheproblem Structure from Motion Given the 2D projection of apoint in the world, its position in 3D spacecouldbe anwhereon ara etendingout in aparticular direction from the camera soptical center. However, when the projections of asufficient numberof points in the world are observed inmultipleimagesfromdifferentpositions,itistheoreticallpossibletodeducethe3dlocationsofthepoints aswellas thepositions oftheoriginal cameras,uptoanunknownfactor ofscale. This problem has been studied in the area of photogrammetr for theprincipal purposeof producingtopographicmaps. In 1913, Kruppa[10] provedthefundamentalresultthat giventwo viewsof five distinct points, one could recover the rotation and translation betweenthetwocamerapositionsaswellasthe3dlocationsofthe points(uptoascalefactor). Sincethen,theproblem smathematical andalgorithmicaspectshavebeeneploredstartingfromthefundamentalworkofullman[21] andlonguethiggins[11],intheearl 1980s. Faugeras sbook[6]overviewsthestateoftheartasof1992. So far, ake realization has been that the recover of structure is versensitivetonoiseinimagemeasurementswhenthetranslation betweentheavailablecamerapositions is small. Attention has turned to using more than two views with image streammethodssuchas[19]orrecursiveapproaches(e.g. [1]). [19] showsecellentresultsforthecaseoforthographiccameras,butdirect solutions for the perspective case remain elusive. In general, linear algorithms for the problem fail to make use of all available informationwhilenonlinearminimizationmethodsarepronetodifficultiesarisingfromlocalminimaintheparameterspace. Analternativeformulation oftheproblem[17] useslines ratherthanpoints as image measurements, but the previousl stated concerns were shown to remain largel valid. For purposes of computer graphics,thereis etanotherproblem: themodelsrecoveredbthesealgorithms consistof sparsepoint fields or individual line segments, whicharenot directlrenderableassolid3dmodels. In our approach, we eploit the fact that we are tring to recovergeometricmodelsofarchitecturalscenes,notarbitrar threedimensional point sets. This enables us to include additional constraints not tpicall availableto structurefrom motion algorithms and to overcomethe problems of numerical instabilit that plague suchapproaches. Ourapproachis demonstratedinausefulinteractivesstemfor buildingarchitecturalmodelsfrom photographs Stereo Correspondence The geometricaltheor of structure from motion assumesthat one isabletosolvethecorrespondenceproblem,whichistoidentifthe pointsin twoor moreimagesthatareprojections ofthesamepoint in the world. In humans, correspondingpoints in the two slightl differing images on the retinas are determined bthe visual corte intheprocesscalledbinocularstereopsis. Years of research(e.g. [2, 4, 8, 9, 12, 15]) have shownthat determiningstereocorrespondencesbcomputeris difficultproblem. Ingeneral,currentmethodsaresuccessfulonlwhentheimagesare similarinappearance,asinthecaseofhumanvision,whichis usuallobtainedbusingcamerasthatarecloselspacedrelativetothe objectsinthescene. Whenthedistancebetweenthecameras(often called the baseline) becomes large, surfaces in the images ehibit different degreesof foreshortening,different patterns of occlusion, andlargedisparitiesintheirlocationsinthetwoimages,allofwhich makesit muchmoredifficult for the computertodeterminecorrect stereocorrespondences.unfortunatel,thealternativeofimproving stereocorrespondencebusingimagestakenfromnearblocations has the disadvantagethat computingdepthbecomesver sensitive tonoisein imagemeasurements. In this paper,we showthat havinganapproimatemodelof the photographedscenemakes it possibleto robustl determine stereo correspondences from images taken from widel varing viewpoints. Specificall, the model enables us to warp the images to eliminate unequalforeshorteningandto predict major instancesof occlusionbeforetringto findcorrespondences ImageBased Rendering In animagebasedrenderingsstem, the modelconsistsof aset of images of asceneand their correspondingdepth maps. When the depth of ever point in an image is known, the image can be rerenderedfrom annearbpoint ofview bprojectingthepiels of the image to their proper 3D locations and reprojecting them onto anew image plane. Thus, anew image of the sceneis created b warping the images accordingto their depthmaps. Aprincipal attraction of imagebasedrendering is that it offers amethodof rendering arbitraril comple scenes with aconstantamount of computationrequiredperpiel. Usingthis propert,[23] demonstrated howregularl spacedsntheticimages (with their computeddepth maps)couldbewarpedandcompositedinrealtimetoproduceavirtualenvironment. Morerecentl,[13]presentedarealtimeimagebasedrendering sstem that used panoramic photographs with depth computed, in part,fromstereocorrespondence.onefindingofthepaperwasthat etracting reliable depth estimates from stereo is ver difficult. The method was nonetheless able to obtain acceptable results for nearbviews usinguserinput to aidthe stereodepthrecover: the correspondencemapforeachimagepairwasseededwith100to500 usersuppliedpointcorrespondencesandalsopostprocessed.even 2
3 with userassistance,theimagesusedstill hadtobecloselspaced; thelargestbaselinedescribedinthepaperwas fivefeet. Therequirementthat samplesbe closetogether is aserious limitation to generating afreel navigable virtual environment. Covering the size of just one cit block would require thousands of panoramic images spaced five feet apart. Clearl, acquiring so manphotographsisimpractical. Moreover,evenadenselatticeof groundbasedphotographswouldonlallowrenderingstobegeneratedfrom withinafewfeetoftheoriginal cameralevel,precluding anvirtualflbsofthescene. Etendingthedenselatticeofphotographs into three dimensionswould clearl makethe acquisition process even more difficult. The approachdescribedin this paper takesadvantageof thestructureinarchitecturalscenessothatit requiresonlasparsesetofphotographs. Foreample,ourapproach hasieldedavirtualflaroundofabuildingfrom justtwelvestandardphotographs. 1.2 Overview In this paper we present three new modeling and rendering techniques: photogrammetric modeling, viewdependent teture mapping, andmodelbasedstereo. Weshow how thesetechniquescan be used in conjunction to ield aconvenient, accurate, and photorealistic methodof modelingandrendering architecture from photographs. Inourapproach,thephotogrammetricmodelingprogram isusedtocreateabasicvolumetricmodelofthescene,whichisthen usedto constrainstereomatching. Our renderingmethod compositesinformationfrommultipleimageswithviewdependentteturemapping. Our approach is successful becauseit splits the task of modelingfrom imagesintotaskswhichareeasilaccomplishedb aperson(butnotacomputeralgorithm), andtaskswhichareeasil performedbacomputeralgorithm (butnotaperson). In Section 2, we present our photogrammetric modeling method. Inessence,wehaverecastthestructurefrommotionproblem not as the recover of individual point coordinates, but as the recoveroftheparametersofaconstrainedhierarchofparametric primitives. The result is that accurate architectural models can be recoveredrobustlfromjustafewphotographsandwithaminimal numberof usersuppliedcorrespondences. InSection3,wepresentviewdependentteturemapping,and showhowitcanbeusedtorealisticallrendertherecoveredmodel. Unlike traditional teturemapping, in which asingle static image is used to color in each face of the model, viewdependent teturemappinginterpolates betweentheavailablephotographsofthe scenedependingon the user spoint of view. This results in more lifelike animations that better capturesurfacespecularities andunmodeledgeometricdetail. Lastl, in Section 4, we present modelbased stereo, which is usedtoautomaticallrefineabasicmodelofaphotographedscene. This techniquecanbeusedto recoverthestructure ofarchitectural ornamentation that would be difficult to recover with photogrammetricmodeling. Inparticular,weshowthatprojectingpairsofimages onto an initial approimatemodel allows conventionalstereo techniques to robustl recover ver accurate depth measurements from imageswith widelvaringviewpoints. 2 Photogrammetric Modeling In this sectionwepresentour methodfor photogrammetricmodeling, in which the computer determines the parameters of ahierarchicalmodelof parametricpolhedralprimitives toreconstructthe architectural scene. Wehaveimplemented this methodin Faç ade, aneastouseinteractivemodelingprogramthatallows theuserto constructageometricmodelofascenefromdigitizedphotographs. WefirstoverviewFaç adefromthepointofviewoftheuser,thenwe describeour model representation, andthen we eplain our reconstructionalgorithm. Lastl,wepresentresultsfromusingFaç adeto reconstructseveralarchitectural scenes. 2.1 The User s View Constructing ageometric model of an architectural scene using Faç adeisanincrementalandstraightforwardprocess. Tpicall,the userselectsasmallnumberofphotographstobeginwith,andmodelsthesceneonepieceatatime. Theusermarefinethemodeland includemoreimagesintheprojectuntilthemodelmeetsthedesired levelofdetail. Fig. 2(a)and(b)showsthetwotpesofwindowsusedinFaç ade: image viewers and model viewers. The user instantiates the componentsof the model, marks edgesin the images, andcorresponds theedgesintheimagestotheedgesinthemodel. Wheninstructed, Faç adecomputesthesizesandrelativepositionsofthemodelcomponentsthat bestfit theedgesmarkedinthephotographs. Components of the model, called blocks, are parameterized geometric primitives such as boes, prisms, and surfaces of revolution. Abo,foreample,isparameterizedbits length,width,and height. The user models the scene as acollection of such blocks, creatingnewblockclassesas desired. Of course,the userdoesnot needto specif numerical values for the blocks parameters, since thesearerecoveredbtheprogram. The userma chooseto constrainthe sizesand positions of an oftheblocks. InFig. 2(b),mostoftheblockshavebeenconstrained to have equal length and width. Additionall, the four pinnacles havebeenconstrainedto havethesameshape. Blocksmaalsobe placedin constrainedrelations to oneother. For eample,manof theblocksinfig. 2(b) havebeenconstrainedtositcenteredandon topoftheblockbelow. Suchconstraintsarespecifiedusingagraphical3Dinterface. Whensuchconstraintsareprovided,theareused tosimplif thereconstructionproblem. The user marks edge features in the images using apointandclickinterface;agradientbasedtechniqueasin[14] canbeusedto align the edges with subpiel accurac. Weuse edge rather than point features since the are easier to localize and less likel to be completel obscured. Onl asection of each edge needs to be marked,making it possibleto usepartiall visibleedges. For each markededge,the user alsoindicates the correspondingedgein the model. Generall,accuratereconstructionsareobtainedifthereare as man correspondences in the images as there are free camera andmodelparameters. Thus,Faç adereconstructsscenesaccuratel evenwhenjustaportionof thevisibleedgesandmarkedintheimages, and when just aportion of the model edges are given correspondences. Atantime,theusermainstructthecomputertoreconstructthe scene. The computer then solves for the parameters of the model that causeit to align with the markedfeatures in the images. During thereconstruction,the computercomputesanddisplas thelocationsfromwhichthephotographsweretaken. Forsimplemodels consistingofjustafewblocks,afullreconstructiontakesonlafew seconds;for more complemodels, it cantakeafew minutes. For this reason,theusercaninstruct thecomputerto emplofaster but less precisereconstructionalgorithms (see Sec. 2.4) during theintermediatestagesof modeling. Toverifthetheaccuracoftherecoveredmodelandcamerapositions,Faç adecanprojectthemodelintotheoriginalphotographs. Tpicall, the projected model deviates from the photographs b lessthanapiel. Fig. 2(c)showstheresultsofprojectingtheedges ofthemodelinfig. 2(b) intotheoriginal photograph. Lastl,the usermageneratenovelviews of themodelbpositioningavirtualcameraatandesiredlocation. Faç adewillthenuse theviewdependentteturemappingmethodofsection3torender anovelviewofthescenefromthedesiredlocation. Fig. 2(d)shows anaerial renderingof thetowermodel. 2.2 Model Representation Thepurposeofourchoiceofmodelrepresentationistorepresentthe sceneas asurfacemodelwith as fewparametersas possible: when 3
4 (a) (b) (c) (d) Figure 2: (a) Aphotographof the Campanile, Berkele sclocktower,with markededges shown in green. (b) The model recoveredb our photogrammetricmodelingmethod. Althoughonltheleftpinnaclewasmarked,theremainingthree(includingonenotvisible)wererecovered from smmetrical constraints in the model. Our method allows an number of images to be used, but in this case constraints of smmetr madeit possibleto recoveranaccurate3d modelfromasinglephotograph. (c) Theaccuracof themodelis verifiedbreprojectingit into the original photographthroughthe recoveredcameraposition. (d) Asnthetic viewof the Campanilegeneratedusing the viewdependent teturemappingmethoddescribedin Section3. Arealphotographfrom this position wouldbe difficult to takesincethe cameraposition is 250feet abovetheground. themodelhasfewerparameters,theuserneedstospeciffewercorrespondences,andthecomputercanreconstructthemodelmoreefficientl. In Faç ade, the sceneis representedas aconstrainedhierarchical model of parametric polhedral primitives, called blocks. Each block has asmall set of parameters which serve to define its size and shape. Each coordinate of each verte of the block is thenepressedaslinearcombinationoftheblock sparameters,relative to an internal coordinate frame. For eample, for the wedge block in Fig. 3, the coordinates of the verte P o are written in terms of theblockparameters width, height, andlength as P o = (,width;,height;length ) T.Eachblockis alsogivenanassociatedboundingbo. wedge_height P 0 Bounding Bo z wedge_width wedge_depth Figure3: Awedgeblockwith its parametersandboundingbo. roof z first_store z ground_plane z z entrance roof ground_plane g (X) 1 first_store g (X) 2 entrance (a) (b) Figure 4: (a) Ageometric model of asimple building. (b) The model shierarchical representation. The nodes in the tree represent parametric primitives (called blocks) while the links contain thespatialrelationshipsbetweentheblocks. TheblocksinFaç adeareorganizedinahierarchicaltreestructure asshowninfig. 4(b). Eachnodeofthetreerepresentsanindividual block,whilethelinksinthetreecontainthespatialrelationshipsbetweenblocks,calledrelations. Suchhierarchicalstructuresarealso usedintraditional modelingsstems. Therelationbetweenablockanditsparentismostgenerallrepresentedas arotationmatri R andatranslationvector t. This representationrequiressiparameters: threeeachfor R and t. Inarchitectural scenes,however,therelationship betweentwo blocks usuall has asimple form that canbe representedwith fewer parameters, andfaçadeallowstheusertobuildsuchconstraintson R and t intothemodel. Therotation R betweenablockandits parentcan bespecifiedinoneofthreewas: first,asanunconstrainedrotation, requiring three parameters; second,as arotation about aparticular coordinateais,requiringjust oneparameter;or third, as afiedor nullrotation, requiringnoparameters. Likewise, Faç ade allows for constraints to be placed on each component of the translation vector t. Specificall, the user can constrainthe boundingboes of two blocks to align themselves in some manner along eachdimension. For eample, in order to ensurethattheroof blockin Fig. 4lies ontopof thefirststor block, the usercan require thatthe maimum etentof the firststor block be equal to the minimum etent of the roof block. With this constraint, the translation along the ais is computed (t = (firststor MA X, roof MIN )) rather than representedas aparameterof themodel. Eachparameterof eachinstantiatedblockis actuallareference to anamedsmbolic variable, as illustrated in Fig. 5. As aresult, two parameters of different blocks (or of the same block) can be equatedbhavingeachparameterreferencethesamesmbol. This facilit allows theuser to equatetwo or more of the dimensionsin amodel, which makes modeling smmetrical blocks and repeated structuremoreconvenient. Importantl,theseconstraintsreducethe numberofdegreesoffreedominthemodel,which,aswewillshow, simplifiesthestructure recoverproblem. Once the blocks and their relations have beenparameterized, it is straightforward to derive epressions for the world coordinates of the block vertices. Consider the set of edges which link aspecific block in the model to the ground plane as shown in Fig. 4. 4
5 ToappearintheSIGGRAPH 96 conferenceproceedings BLOCKS Block1 tpe: wedge length width height name: "building_length" value: 20.0 name: "building_width" value: 10.0 VARIABLES Thus, the unknown model parameters and camera positions are computedbminimizing O withrespecttothesevariables. Oursstemusesthe the error function Err i from [17], describedbelow. Block2 tpe: bo length width height name:"roof_height" value: 2.0 name: "first_store_height" value: 4.0 Figure5: Representationofblockparametersassmbolreferences. Asinglevariablecanbereferencedbthemodelinmultipleplaces, allowingconstraintsofsmmetrto beembeddedin themodel. Let g 1(X);:::;g n(x) representtherigidtransformationsassociated with each of these links, where X represents the vector of all the model parameters. The world coordinates P w(x) of aparticular blockverte P (X) is then: P w(x) =g 1(X):::g n(x)p (X) (1) Similarl, the world orientation v w(x) of aparticular line segment v(x) is: v w(x) =g 1(X):::g n(x)v(x) (2) Intheseequations,thepointvectors P and P w andtheorientation vectors v and v w arerepresentedin homogeneouscoordinates. Modelingthescenewithpolhedralblocks,asopposedtopoints, line segments, surface patches, or polgons, is advantageousfor a numberof reasons: Mostarchitecturalscenesarewellmodeledbanarrangement ofgeometricprimitives. Blocksimplicitlcontaincommonarchitecturalelementssuch asparallel lines andright angles. Manipulatingblockprimitives is convenientsincetheareat asuitablhighlevelofabstraction;individualfeaturessuchas points andlines arelessmanageable. Asurface model of the scene is readil obtained from the blocks,sothereis noneedtoinfersurfacesfrom discretefeatures. Modelinginterms ofblocksandrelationshipsgreatlreduces the number of parameters that the reconstruction algorithm needsto recover. Thelastpointiscrucialtotherobustnessofourreconstructionalgorithm andtheviabilit of ourmodelingsstem,andis illustrated bestwith aneample. Themodelin Fig. 2is parameterizedbjust 33variables(the unknowncamerapositionaddssimore). If each blockin the scenewere unconstrained(in its dimensionsandposition),themodelwouldhave240parameters;ifeachlinesegmentin the scenewere treated independentl,the model wouldhave2,896 parameters. Thisreductioninthenumberofparametersgreatlenhancesthe robustnessandefficiencof themethodas comparedto traditionalstructurefrommotionalgorithms. Lastl,sincethenumber of correspondencesneededto suitabl overconstrainthe minimizationis roughlproportionaltothenumberofparametersinthe model,thisreductionmeansthatthenumberofcorrespondencesrequiredoftheuseris manageable. 2.3 Reconstruction Algorithm Our reconstruction algorithm works b minimizing an objective function O that sums the disparit betweenthe projected edges of the model and the edges markedin the images, i.e. O = P Err i where Err i represents the disparit computed for edge feature i. image plane Camera Coordinate Sstem m z image edge <R, t> v World Coordinate Sstem z d 3D line h(s) h 1 ( 1, 1 ) predicted line: m + m + m z f = 0 P(s) Observed edge segment h 2 ( 2, 2 ) (a) (b) Figure 6: (a) Projection of astraight line onto acamera simage plane. (b) Theerrorfunctionusedin thereconstructionalgorithm. Theheavlinerepresentstheobservededgesegment(markedbthe user)andthelighterlinerepresentsthemodeledgepredictedbthe currentcameraandmodelparameters. Fig. 6(a) shows how astraight line in the model projects onto the image plane of acamera. The straight line can be defined b a pair of vectors hv;d i where v represents the direction of the line and d representsapointontheline. Thesevectorscanbecomputed fromequations2and1respectivel. Thepositionofthecamerawith respecttoworldcoordinatesisgivenintermsofarotationmatri R j andatranslationvector t j.thenormalvectordenotedb m inthe figureis computedfrom thefollowing epression: m = R j(v (d, t j)) (3) Theprojection of theline ontothe imageplaneis simpl theintersectionoftheplanedefinedb m withtheimageplane,locatedat z =,f where f is thefocallengthofthecamera. Thus,theimage edgeis definedbtheequation m + m, m zf =0. Fig. 6(b)showshowtheerror betweentheobservedimageedge f( 1; 1); ( 2; 2)g and the predicted image line is calculated for eachcorrespondence. Points onthe observededgesegmentcanbe parameterized b a single scalar variable s 2 [0;l] where l is the lengthoftheedge. Welet h(s) bethefunctionthatreturnstheshortestdistancefromapointonthesegment, p(s),tothepredictededge. With thesedefinitions,thetotal errorbetweentheobservededge segmentandthepredictededgeis calculatedas: Err i = where: Z l 0 A = B = h 2 (s)ds = m ( = m ;m ;m z) T l 3 (h2 1+h 1h 2+h 2 2) = m T (A T BA)m l 0 1:5 3(m 2 + 1m 2 ) 0:5 Thefinalobjectivefunction O isthesumoftheerrortermsresulting from eachcorrespondence. Weminimize O using avariant of thenewtonraphsonmethod,whichinvolvescalculatingthegradient andhessianof O with respectto the parameters of the camera (4) 5
6 and the model. As we have shown, it is simple to construct smbolicepressionsfor m intermsoftheunknownmodelparameters. The minimization algorithm differentiates these epressions smbolicall to evaluate the gradient and Hessian after each iteration. The procedureis inepensivesince the epressions for d and v in Equations2and1haveaparticularlsimpleform. 2.4 Computing an Initial Estimate TheobjectivefunctiondescribedinSection2.3sectionisnonlinear with respecttothemodelandcameraparametersandconsequentl can have local minima. If the algorithm begins at arandom location in the parameter space,it stands little chanceof convergingto thecorrectsolution. Toovercomethis problemwehavedeveloped amethodto directl computeagoodinitial estimate for the model parametersandcamerapositionsthatisnearthecorrectsolution. In practice, our initial estimate method consistentl enables the nonlinearminimization algorithm toconvergetothecorrectsolution. Ourinitialestimatemethodconsistsoftwoproceduresperformed in sequence. The first procedure estimates the camera rotations whilethesecondestimatesthecameratranslationsandtheparametersofthemodel. Bothinitialestimateproceduresarebaseduponan eaminationof Equation3. From this equationthe following constraints canbededuced: m T R jv 0 = (5) m T 0 R j(d, = t j ) (6) Givenanobservededge u ij themeasurednormal m 0 totheplane passingthroughthecameracenteris: m 0 = 1 1,f! 2 2,f Fromtheseequations,weseethatanmodeledgesofknownorientationconstrainthe possiblevaluesfor R j.sincemost architectural models containmansuchedges(e.g. horizontalandvertical lines), each camera rotation can be usuall be estimated from the modelindependentofthemodelparametersandindependentofthe camera slocationinspace. Ourmethoddoesthisbminimizingthe following objectivefunction O 1 thatsums theetentsto whichthe rotations R j violatetheconstraintsarisingfrom Equation5: O 1 = X i! (7) (m T R jv i) 2 ; v i 2f^; ^; ^zg (8) Once initial estimates for the camera rotations are computed, Equation 6is used to obtain initial estimates of the model parameters and camera locations. Equation 6reflects the constraint that allofthepointsonthelinedefinedbthetuple hv; di shouldlieon theplanewithnormalvector m passingthroughthecameracenter. This constraintis epressedinthefollowing objectivefunction O 2 where P i(x) and Q i(x) areepressionsfortheverticesofanedge ofthemodel. O 2 = X i (m T R j (P i(x), t j)) 2 +(m T R j (Q i(x), t j)) 2 (9) In the specialcasewhere all of the blockrelations in the model have aknown rotation, this objective function becomes asimple quadraticform whichis easilminimizedbsolvingasetoflinear equations. Oncetheinitialestimateisobtained,thenonlinearminimization overtheentireparameterspaceisappliedtoproducethebestpossiblereconstruction. Tpicall,theminimization requiresfewer than ten iterations and adjusts the parameters of the modelb at most a few percentfrom the initial estimates. The edges of the recovered models tpicall conform to the original photographs to within a piel. Figure7: Threeoftwelvephotographsusedtoreconstructtheentire eterior of UniversitHigh Schoolin Urbana, Illinois. Thesuperimposedlines indicatetheedgestheuserhas marked. (a) (c) Figure 8: The high school model, reconstructedfrom twelve photographs. (a) Overheadview.(b) Rearview.(c) Aerialviewshowingtherecoveredcamerapositions. Twonearlcoincidentcameras can be observed in front of the building; their photographs were takenfromthesecondstorofabuildingacrossthestreet. Figure 9: Asnthetic view of Universit High School. This is a framefromananimationof flingaroundtheentirebuilding. (b) 6
7 (a) (b) (c) Figure10: ReconstructionofHooverTower,Stanford,CA(a)Originalphotograph,withmarkededgesindicated. (b)modelrecovered fromthesinglephotographshownin(a). (c)teturemappedaerial viewfromthevirtualcamerapositionindicatedin (b). Regionsnot seenin(a) areindicatedin blue. 2.5 Results Fig. 2showed the results of using Faç ade to reconstruct aclock tower from asingle image. Figs. 7and 8show the results of using Faç ade to reconstruct ahigh schoolbuilding from twelve photographs. (Themodelwas originall constructedfrom just fiveimages;theremainingimageswereaddedtotheprojectforpurposesof generatingrenderingsusingthetechniquesofsection3.) Thephotographsweretakenwithacalibrated35mmstillcamerawithastandard50mmlensanddigitizedwiththePhotoCDprocess. Imagesat the piel resolution were processedto correctfor lens distortion,thenfiltereddownto pielsforuseinthemodelingsstem. Fig. 8showssomeviewsoftherecoveredmodeland camerapositions,andfig. 9showsasntheticviewofthebuilding generatedbthetechniqueinsec. 3. Fig. 10 shows the reconstruction of another tower from asingle photograph. The domewas modeledspeciallsincethe reconstructionalgorithmdoesnotrecovercurvedsurfaces. Theuserconstrainedatwoparameterhemisphereblocktositcenteredontopof thetower, andmanualladjustedits heightandwidth toalign with the photograph. Eachof the models presentedtook approimatel four hourstocreate. 3 ViewDependent TetureMapping In this sectionwe present viewdependentteturemapping, an effective method of rendering the scene that involves projecting the originalphotographsontothemodel. Thisformofteturemapping is most effective when the model conforms closel to the actual structureof thescene,andwhentheoriginal photographsshowthe sceneinsimilarlightingconditions. InSection4wewill showhow viewdependent teturemapping can be used in conjunction with modelbasedstereotoproducerealisticrenderingswhentherecoveredmodelonlapproimatelmodels thestructureofthescene. Since the camera positions of the original photographs are recoveredduringthe modelingphase,projectingthe imagesonto the model is straightforward. In this section we first describehow we projectasingleimageontothemodel,andthenhowwemergeseveral image projections to render the entire model. Unlike traditional teturemapping, our method projects different images onto themodeldependingontheuser sviewpoint. Asaresult,ourviewdependentteture mapping can give abetter illusion of additional geometricdetailinthemodel. 3.1 Projecting asingle Image Theprocessof teturemappingasingleimageontothemodelcan be thought of as replacing each camera with aslide projector that projects the original imageontothe model. Whenthemodel is not conve,itis possiblethatsomeparts ofthemodelwill shadowotherswithrespecttothecamera. Whilesuchshadowedregionscould bedeterminedusinganobjectspacevisiblesurfacealgorithm,oran imagespaceracastingalgorithm, weuseanimagespaceshadow mapalgorithm basedon[22] sinceit is efficientlimplementedusingzbuffer hardware. Fig. 11,upperleft, showsthe results of mappingasingle image onto the high school building model. The recoveredcamera position for the projected imageis indicatedin the lower left corner of theimage. Becauseofselfshadowing,noteverpointonthemodel within thecamera sviewingfrustum is mapped. 3.2 Compositing Multiple Images In general, each photograph will view onl apiece of the model. Thus,it is usuallnecessartousemultiple imagesinorder torendertheentire modelfrom anovelpointof view. Thetop imagesof Fig. 11showtwodifferent imagesmappedontothemodelandrenderedfromanovelviewpoint. Somepielsarecoloredinjustoneof the renderings, while some are coloredin both. Thesetwo renderings can be merged into acompositerendering b consideringthe correspondingpiels intherenderedviews. If apielis mappedin onlonerendering,itsvaluefromthatrenderingisusedinthecomposite. If it is mappedinmorethanonerendering,therendererhas todecidewhichimage(or combinationof images)to use. It wouldbeconvenient,ofcourse,iftheprojectedimageswould agree perfectl where the overlap. However, the images will not necessarilagreeifthereisunmodeledgeometricdetailinthebuilding,orifthesurfacesofthebuildingehibitnonlambertianreflection. In this case, the best image to use is clearl the one with the viewingangleclosesttothat ofthe renderedview. However,using theimageclosestinangleateverpielmeansthatneighboringrenderedpielsmabesampledfrom differentoriginalimages. When thishappens,specularitandunmodeledgeometricdetailcancause visible seams in the rendering. Toavoid this problem, we smooth thesetransitions throughweightedaveragingas infig. 12. Figure 11: The process of assembling projected images to form a composite rendering. The top two pictures show two images projectedontothemodelfromtheir respectiverecoveredcamerapositions. Thelower left pictureshowstheresultsof compositingthese two renderingsusing our viewdependentweighting function. The lower right picture shows the results of compositing renderingsof all twelve original images. Some piels near the front edge of the roof not seen in an image have beenfilled in with the holefilling algorithmfrom[23]. Even with this weighting, neighboring piels can still be sampledfromdifferentviewsattheboundarofaprojectedimage,since thecontribution of animagemustbezerooutsideits boundar. To 7
8 ToappearintheSIGGRAPH 96 conferenceproceedings view 1 virtual view view 2 a 2 a 1 (a) (b) model Figure 12: The weighting function used in viewdependentteture mapping. The piel in the virtual view correspondingto the point on the model is assignedaweightedaverageof the corresponding pielsinactualviews1and2. Theweights w 1 and w 2 areinversel inversel proportional to the magnitude of angles a 1 and a 2. Alternatel,moresophisticatedweightingfunctionsbasedonepected foreshorteningandimageresamplingcanbeused. addressthis, thepielweights are rampeddownneartheboundar of the projectedimages. Althoughthis method does not guarantee smoothtransitionsinallcases,wehavefoundthatiteliminatesmost artifacts inrenderingsandanimations arisingfrom suchseams. If an original photograph features an unwanted car, tourist, or otherobjectinfrontofthearchitectureofinterest,theunwantedobjectwill beprojectedontothesurfaceofthemodel. Topreventthis from happening,theusermamaskouttheobjectbpaintingover theobstructionwithareservedcolor. Therenderingalgorithm will thensettheweights foranpielscorrespondingtothemaskedregionstozero,anddecreasetheweightsofthepielsneartheboundarasbeforetominimizeseams. Anregionsinthecompositeimagewhichareoccludedineverprojectedimagearefilledinusing theholefillingmethodfrom [23]. Inthediscussionsofar,projectedimageweightsarecomputedat everpielofeverprojectedrendering. Sincetheweightingfunction is smooth (though not constant) across flat surfaces, it is not generallnot necessarto computeit for ever pielof everface of the model. For eample, using asingle weight for each face of the model, computed at the face s center, produces acceptable results. B coarselsubdividing large faces, the results are visuall indistinguishablefromthecasewhereauniqueweightis computed foreverpiel. Importantl,this techniquesuggestsarealtimeimplementation of viewdependent teture mapping using ateturemapping graphics pipeline to render the projected views, and  channelblendingtocompositethem. Forcomplemodelswheremostimagesareentireloccludedfor the tpical view, it canbe ver inefficient to project ever original photographtothenovelviewpoint. Someefficienttechniquestodetermine suchvisibilit aprioriinarchitectural scenesthroughspatial partitioning arepresentedin [18]. 4 ModelBasedStereopsis ThemodelingsstemdescribedinSection2allowstheusertocreateabasicmodel of ascene,but in generalthescenewill haveadditionalgeometricdetail(suchasfriezesandcornices)notcaptured in the model. In this section we present anew method of recoveringsuchadditionalgeometricdetailautomaticallthroughstereo correspondence, which we call modelbased stereo. Modelbased stereodiffers fromtraditional stereointhatit measureshowtheactual scenedeviatesfrom the approimatemodel,rather thantring tomeasurethestructureofthescenewithoutanpriorinformation. Themodelservesto placethe imagesinto acommonframe of referencethat makes the stereo correspondencepossibleevenfor im (c) (d) Figure13: Viewdependentteturemapping. (a)adetailviewofthe highschoolmodel. (b)arenderingofthemodelfromthesamepositionusingviewdependentteturemapping. Notethatalthoughthe modeldoesnotcapturetheslightlrecessedwindows,thewindows appear properlrecessedbecausethe teturemap is sampled primarilfromaphotographwhichviewedthewindowsfromapproimatelthesamedirection. (c) Thesamepieceofthemodelviewed fromadifferent angle, usingthe sameteturemap as in (b). Since thetetureisnotselectedfromanimagethatviewedthemodelfrom approimatelthesameangle,therecessedwindowsappearunnatural. (d) Amore natural result obtained b using viewdependent teturemapping. Sincethe angleof view in (d) is different than in (b),adifferentcompositionoforiginalimagesisusedtoteturemap themodel. agestakenfrom relativelfar apart. Thestereocorrespondenceinformationcanthenbeusedtorendernovelviewsofthesceneusing imagebasedrenderingtechniques. As in traditional stereo, given two images (which we call the ke andoffset), modelbasedstereo computes the associateddepth map for the ke imageb determining correspondingpoints in the keandoffset images. Likemanstereoalgorithms, ourmethodis correlationbased,inthatitattemptstodeterminethecorresponding point in the offset image b comparingsmall piel neighborhoods aroundthepoints. Assuch,correlationbasedstereoalgorithmsgenerall require the neighborhoodof each point in the ke image to resemblethe neighborhoodof its correspondingpoint in the offset image. The problem we face is that when the ke and offset images are taken from relativel far apart, as is the casefor our modeling method, corresponding piel neighborhoods can be foreshortened verdifferentl. InFigs. 14(a)and(c),pielneighborhoodstoward theright of the keimageareforeshortenedhorizontallb nearl afactorof fourin theoffsetimage. The ke observation in modelbased stereo is that even though two images of the same scenema appear ver different, the appearsimilarafterbeingprojectedontoanapproimatemodelofthe scene. In particular,projectingtheoffsetimageontothemodeland viewingitfromthepositionofthekeimageproduceswhatwecall thewarpedoffsetimage,whichappearsversimilar tothe keimage. The geometricall detailed scenein Fig. 14 was modeled as twoflatsurfaceswithourmodelingprogram,whichalsodetermined therelativecamerapositions. Asepected,thewarpedoffsetimage (Fig. 14(b)) ehibits the same pattern of foreshorteningas the ke image. In modelbased stereo, piel neighborhoods are compared be tweenthekeandwarpedoffsetimagesratherthanthekeandoff 8
9 (a) KeImage (b)warpedoffsetimage (c)offset Image (d) ComputedDisparitMap Figure 14: (a) and (c) Two images of the entrance to Peterhouse chapel in Cambridge, UK. The Faç ade program was used to model the faç adeandgroundas aflat surfacesandto recoverthe relativecamerapositions. (b) Thewarpedoffset image, producedb projectingthe offset imageontothe approimatemodelandviewingit fromthepositionof thekecamera. This projectioneliminates mostof thedisparit andforeshorteningwithrespectto the keimage,greatlsimplifingstereocorrespondence.(d) Anunediteddisparitmapproducedbour modelbasedstereoalgorithm. setimages. Whenacorrespondenceisfound,itissimpletoconvert itsdisparittothecorrespondingdisparitbetweenthekeandoffset images, from which the point sdepth is easil calculated. Fig. 14(d)showsadisparitmapcomputedfor thekeimagein (a). Thereductionofdifferencesinforeshorteningis justoneofseveral wasthatthe warpedoffset imagesimplifies stereocorrespondence. Someother desirableproperties of thewarpedoffset image are: An point in the scenewhich lies on the approimate model willhavezerodisparitbetweenthekeimageandthewarped offset image. Disparitiesbetweenthekeandwarpedoffsetimagesareeasil convertedto adepthmapfor thekeimage. Depth estimates are far less sensitive to noise in image measurementssinceimages takenfrom relativel farapart canbe compared. Placeswherethemodeloccludesitself relative tothekeimagecanbedetectedandindicatedin thewarpedoffset image. Alinear epipolargeometr (Sec. 4.1) eists betweenthe ke and warped offset images, despite the warping. In fact, the epipolar lines of the warped offset image coincide with the epipolarlines ofthekeimage. 4.1 ModelBased Epipolar Geometr In traditional stereo, the epipolar constraint(see [6]) is often used to constrain the search for corresponding points in the offset image to searching along an epipolar line. This constraint simplifies stereo not onl b reducing the search for eachcorrespondenceto onedimension,butalsobreducingthechanceofselectingafalse matches. Inthissectionweshowthattakingadvantageoftheepipolarconstraintisnomoredifficultinmodelbasedstereocase,despite thefactthat theoffsetimageis nonuniforml warped. Fig. 15shows the epipolargeometr for modelbasedstereo. If weconsiderapoint P inthescene,thereisauniqueepipolarplane whichpassesthrough P andthe centersof the keandoffset cameras. Thisepipolarplaneintersects thekeandoffsetimageplanes in epipolar lines e k and e o.if we considerthe projection p k of P ontothekeimageplane,theepipolarconstraintstatesthatthecorrespondingpointin the offset imagemust lie somewherealongthe offset image sepipolarline. Inmodelbasedstereo,neighborhoodsinthekeimagearecomparedtothewarpedoffsetimageratherthantheoffsetimage. Thus, to makeuseof the epipolar constraint, it is necessarto determine wherethepielsontheoffsetimage sepipolarlineprojecttointhe warpedoffsetimage. Thewarpedoffsetimageisformedbprojectingtheoffsetimageontothemodel,andthenreprojectingthemodel ontotheimageplaneofthe kecamera. Thus,the projection p o of P intheoffset imageprojects ontothemodelat Q, andthenreprojects to q k in the warpedoffset image. Since eachof theseprojectionsoccurswithintheepipolarplane,anpossiblecorrespondence ke / warped offset image p p o q k epipolar lines k Ke Camera approimate model e k Q P actual structure epipolar plane e o offset image Offset Camera Figure15: Epipolargeometrfor modelbasedstereo. for p k inthekeimagemustlie onthekeimage sepipolarline in thewarpedoffset image. Inthecasewheretheactualstructureand the modelcoincideat P, p o is projectedto P andthen reprojected to p k,ieldingacorrespondencewithzerodisparit. Thefactthattheepipolargeometrremainslinearafterthewarping step also facilitates the use of the ordering constraint [2, 6] throughadnamicprogrammingtechnique. 4.2 Stereo Results and Rerendering While the warping step makes it dramaticall easier to determine stereo correspondences,astereo algorithm is still necessarto actualldeterminethem. Thealgorithm wedevelopedtoproducethe imagesinthis paperis describedin[3]. Onceadepthmaphasbeencomputedfor aparticular image,we can rerender the scene from novel viewpoints using the methods described in [23, 16, 13]. Furthermore, when several images and theircorrespondingdepthmapsareavailable,wecanusetheviewdependent teturemapping method of Section 3to composite the multiple renderings. The novel views of the chapel faç ade in Fig. 16wereproducedthroughsuchcompositingof fourimages. 5 Conclusion and Future Work Toconclude,wehavepresentedanew,photographbasedapproach to modeling and rendering architectural scenes. Our modeling approach, which combines both geometrbasedand imagebased modeling techniques, is built from two components that we have developed. The first component is an eastouse photogrammet 9
10 Figure16: Novelviewsofthescenegeneratedfromfouroriginalphotographs. Theseareframesfromananimatedmovieinwhichthefaç ade rotatescontinuousl. Thedepthis computedfrommodelbasedstereoandtheframesaremadebcompositingimagebasedrenderingswith viewdependentteturemapping. ric modeling sstem which facilitates the recover of abasic geometricmodelofthephotographedscene. Thesecondcomponentis amodelbasedstereo algorithm, which recovers preciselhow the realscenediffers fromthebasicmodel. Forrendering,wehavepresentedviewdependentteturemapping,whichproducesimagesb warpingandcompositingmultiple views of thescene. Throughjudicioususeofimages,models,andhumanassistance,ourapproach is more convenient, more accurate, and more photorealistic than current geometrbased or imagebased approaches for modeling andrenderingrealworld architecturalscenes. Thereareseveralimprovementsandetensionsthatcanbemade toourapproach. First,surfacesofrevolutionrepresentanimportant componentofarchitecture(e.g. domes,columns,andminarets)that arenot recoveredin ourphotogrammetricmodelingapproach. (As noted,thedomeinfig. 10was manuallsizedbtheuser.) Fortunatel,therehas beenmuchwork(e.g. [24]) thatpresents methods of recoveringsuchstructures from image contours. Curved model geometrisalsoentirelconsistentwithourapproachtorecovering additionaldetailwith modelbasedstereo. Second, our techniques should be etended to recognize and modelthephotometricproperties ofthematerials inthescene. The sstem should be able to make better use of photographs taken in varinglighting conditions,and it shouldbe ableto renderimages of the sceneas it would appearat antime of da, in anweather, andwithanconfigurationofartificiallight. Alread,therecovered modelcanbeusedtopredictshadowinginthescenewithrespectto anarbitrar light source. However,afull treatment of the problem will require estimating the photometricproperties (i.e. the bidirectionalreflectancedistributionfunctions)ofthesurfacesinthescene. Third,itisclearthatfurtherinvestigationshouldbemadeintothe problem of selectingwhich original images to use whenrendering anovelviewofthescene. Thisproblemis especialldifficultwhen theavailableimagesaretakenatarbitrarlocations. Ourcurrentsolution to this problem, the weighting functionpresentedin Section 3, still allows seams to appearin renderings and doesnot consider issuesarisingfromimageresampling. Anotherformofviewselectionis requiredtochoosewhichpairsofimagesshouldbematched torecoverdepthin themodelbasedstereoalgorithm. Lastl, it will clearl be an attractive application to integrate the modelscreatedwith thetechniquespresentedin this paperinto forthcomingrealtime imagebasedrenderingsstems. Acknowledgments This research was supported b anational Science Foundation Graduate Research Fellowship and grants from Interval Research Corporation, the California MICRO program, and JSEP contract F C The authors also wish to thank Tim Hawkins, CarloSéquin,DavidForsth,andJianboShifortheirvaluablehelp inrevisingthis paper. References [1] Ali AzarbaejaniandAle Pentland. Recursiveestimationof motion,structure, andfocallength.ieeetrans.patternanal.machineintell.,17(6): ,June [2] H. H. Baker and T.O. Binford. Depth fromedgeand intensit based stereo. In ProceedingsoftheSeventhIJCAI,Vancouver,BC,pages ,1981. [3] PaulE.Debevec,CamilloJ. Talor,andJitendraMalik. Modelingandrendering architecturefrom photographs: Ahbridgeometrandimagebasedapproach. TechnicalReportUCB//CSD ,U.C.Berkele,CSDivision,Januar1996. [4] D.J.Fleet, A.D.Jepson,andM.R.M.Jenkin. Phasebaseddisparitmeasurement. CVGIP:ImageUnderstanding,53(2): ,1991. [5] Oliver Faugeras and Giorgio Toscani. The calibration problem for stereo. In ProceedingsIEEECVPR86,pages15 20,1986. [6] Olivier Faugeras. ThreeDimensionalComputerVision. MITPress, [7] Olivier Faugeras, Stephane Laveau, Luc Robert, Gabriella Csurka, and Cril Zeller. 3d reconstructionof urban scenes from sequencesof images. TechnicalReport2572,INRIA,June1995. [8] W.E. L.Grimson. FromImagesto Surface. MITPress, [9] D. Jones and J. Malik. Computational framework for determining stereo correspondencefrom aset of linear spatial filters. Image and Vision Computing, 10(10): ,December1992. [10] E. Kruppa. Zur ermittlungeinesobjectesauszwei perspektivenmitinnererorientierung. Sitz.Ber.Akad.Wiss., Wien, Math.Naturw.Kl., Abt. Ila., 122: ,1913. [11] H.C. LonguetHiggins. Acomputeralgorithm for reconstructingascene from two projections. Nature,293: ,September1981. [12] D.MarrandT.Poggio.Acomputationaltheorofhumanstereovision.ProceedingsoftheRoalSocietofLondon,204: ,1979. [13] LeonardMcMillan andgarbishop. Plenopticmodeling:An imagebasedrenderingsstem. InSIGGRAPH 95,1995. [14] EricN.MortensenandWilliam A.Barrett. Intelligentscissorsforimagecomposition. In SIGGRAPH 95,1995. [15] S. B. Pollard,J. E. W.Mahew,andJ. P.Frisb. Astereo correspondencealgorithmusingadisparitgradientlimit. Perception,14: ,1985. [16] R. Szeliski. Image mosaicing for telerealit applications. In IEEE Computer GraphicsandApplications,1996. [17] Camillo J. Talor and David J. Kriegman. Structureand motion from line segments in multiple images. IEEE Trans. Pattern Anal. Machine Intell., 17(11), November1995. [18] S.J.Teller,CelesteFowler,ThomasFunkhouser,andPatHanrahan. Partitioning and orderinglarge radiosit computations. In SIGGRAPH 94, pages , [19] Carlo Tomasiand TakeoKanade. Shapeand motionfrom imagestreams under orthograph:afactorizationmethod. InternationalJournalofComputerVision, 9(2): ,November1992. [20] Roger Tsai. Aversatile cameracalibration techniquefor high accurac3d machinevision metrologusingofftheshelftv camerasand lenses. IEEEJournal ofroboticsandautomation,3(4): ,August1987. [21] S.Ullman.TheInterpretationofVisualMotion.TheMITPress,Cambridge,MA, [22] LWilliams. Casting curved shadows on curved surfaces. In SIGGRAPH 78, pages ,1978. [23] LanceWilliams andericchen. Viewinterpolationforimagesnthesis. InSIG GRAPH 93,1993. [24] Mourad Zerroug and Ramakant Nevatia. Segmentation and recover of shgcs fromarealintensitimage. InEuropeanConferenceonComputerVision,pages ,
From Few to Many: Illumination Cone Models for Face Recognition Under Variable Lighting and Pose. Abstract
To Appear in the IEEE Trans. on Pattern Analysis and Machine Intelligence From Few to Many: Illumination Cone Models for Face Recognition Under Variable Lighting and Pose Athinodoros S. Georghiades Peter
More informationDistinctive Image Features from ScaleInvariant Keypoints
Distinctive Image Features from ScaleInvariant Keypoints David G. Lowe Computer Science Department University of British Columbia Vancouver, B.C., Canada lowe@cs.ubc.ca January 5, 2004 Abstract This paper
More informationRendering Synthetic Objects into Legacy Photographs
Rendering Synthetic Objects into Legacy Photographs Kevin Karsch Varsha Hedau David Forsyth Derek Hoiem University of Illinois at UrbanaChampaign {karsch1,vhedau2,daf,dhoiem}@uiuc.edu Figure 1: With only
More informationPoint Matching as a Classification Problem for Fast and Robust Object Pose Estimation
Point Matching as a Classification Problem for Fast and Robust Object Pose Estimation Vincent Lepetit Julien Pilet Pascal Fua Computer Vision Laboratory Swiss Federal Institute of Technology (EPFL) 1015
More informationAutomatic Photo Popup
Automatic Photo Popup Derek Hoiem Alexei A. Efros Martial Hebert Carnegie Mellon University Figure 1: Our system automatically constructs a rough 3D environment from a single image by learning a statistical
More informationRandomized Trees for RealTime Keypoint Recognition
Randomized Trees for RealTime Keypoint Recognition Vincent Lepetit Pascal Lagger Pascal Fua Computer Vision Laboratory École Polytechnique Fédérale de Lausanne (EPFL) 1015 Lausanne, Switzerland Email:
More informationDude, Where s My Card? RFID Positioning That Works with Multipath and NonLine of Sight
Dude, Where s My Card? RFID Positioning That Works with Multipath and NonLine of Sight Jue Wang and Dina Katabi Massachusetts Institute of Technology {jue_w,dk}@mit.edu ABSTRACT RFIDs are emerging as
More informationToappearin thesiggraph98conferenceproceedings
Toappearin thesiggraph98conferenceproceedings Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Imagebased Graphics with Global Illumination and High Dynamic Range Photography Paul
More informationA Theory of Shape by Space Carving
International Journal of Computer Vision 38(3), 199 218, 2000 c 2000 Kluwer Academic Publishers. Manufactured in The Netherlands. A Theory of Shape by Space Carving KIRIAKOS N. KUTULAKOS Department of
More informationParallel Tracking and Mapping for Small AR Workspaces
Parallel Tracking and Mapping for Small AR Workspaces Georg Klein David Murray Active Vision Laboratory Department of Engineering Science University of Oxford ABSTRACT This paper presents a method of estimating
More informationModelling with Implicit Surfaces that Interpolate
Modelling with Implicit Surfaces that Interpolate Greg Turk GVU Center, College of Computing Georgia Institute of Technology James F O Brien EECS, Computer Science Division University of California, Berkeley
More informationTHIS CHAPTER INTRODUCES the Cartesian coordinate
87533_01_ch1_p001066 1/30/08 9:36 AM Page 1 STRAIGHT LINES AND LINEAR FUNCTIONS 1 THIS CHAPTER INTRODUCES the Cartesian coordinate sstem, a sstem that allows us to represent points in the plane in terms
More informationShape from Shading: A Survey Ruo Zhang, PingSing Tsai, James Edwin Cryer and Mubarak Shah Computer Vision Lab School of Computer Science University of Central Florida Orlando, FL 32816 shah@cs.ucf.edu
More informationMultiview matching for unordered image sets, or How do I organize my holiday snaps?
Multiview matching for unordered image sets, or How do I organize my holiday snaps? F. Schaffalitzky and A. Zisserman Robotics Research Group University of Oxford fsm,az @robots.ox.ac.uk Abstract. There
More informationVisual Categorization with Bags of Keypoints
Visual Categorization with Bags of Keypoints Gabriella Csurka, Christopher R. Dance, Lixin Fan, Jutta Willamowski, Cédric Bray Xerox Research Centre Europe 6, chemin de Maupertuis 38240 Meylan, France
More informationONE of the great open challenges in computer vision is to
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 29, NO. 1, JANUARY 2007 65 Tracking People by Learning Their Appearance Deva Ramanan, Member, IEEE, David A. Forsyth, Senior Member,
More informationA Survey of Augmented Reality
In Presence: Teleoperators and Virtual Environments 6, 4 (August 1997), 355385. A Survey of Augmented Reality Abstract Ronald T. Azuma Hughes Research Laboratories 3011 Malibu Canyon Road, MS RL96 Malibu,
More informationMetric Measurements on a Plane from a Single Image
TR26579, Dartmouth College, Computer Science Metric Measurements on a Plane from a Single Image Micah K. Johnson and Hany Farid Department of Computer Science Dartmouth College Hanover NH 3755 Abstract
More informationA Lowcost Attack on a Microsoft CAPTCHA
A Lowcost Attack on a Microsoft CAPTCHA Jeff Yan, Ahmad Salah El Ahmad School of Computing Science, Newcastle University, UK {Jeff.Yan, Ahmad.SalahElAhmad}@ncl.ac.uk Abstract: CAPTCHA is now almost
More informationA Flexible New Technique for Camera Calibration
A Flexible New Technique for Camera Calibration Zhengyou Zhang December 2, 1998 (updated on December 14, 1998) (updated on March 25, 1999) (updated on Aug. 1, 22; a typo in Appendix B) (updated on Aug.
More informationTHE PROBLEM OF finding localized energy solutions
600 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 45, NO. 3, MARCH 1997 Sparse Signal Reconstruction from Limited Data Using FOCUSS: A Reweighted Minimum Norm Algorithm Irina F. Gorodnitsky, Member, IEEE,
More informationA Taxonomy and Evaluation of Dense TwoFrame Stereo Correspondence Algorithms
A Taxonomy and Evaluation of Dense TwoFrame Stereo Correspondence Algorithms Daniel Scharstein Richard Szeliski Dept. of Math and Computer Science Microsoft Research Middlebury College Microsoft Corporation
More informationNot Seeing is Also Believing: Combining Object and Metric Spatial Information
Not Seeing is Also Believing: Combining Object and Metric Spatial Information Lawson L.S. Wong, Leslie Pack Kaelbling, and Tomás LozanoPérez Abstract Spatial representations are fundamental to mobile
More informationCan we calibrate a camera using an image of a flat, textureless Lambertian surface?
Can we calibrate a camera using an image of a flat, textureless Lambertian surface? Sing Bing Kang 1 and Richard Weiss 2 1 Cambridge Research Laboratory, Compaq Computer Corporation, One Kendall Sqr.,
More informationUsing Many Cameras as One
Using Many Cameras as One Robert Pless Department of Computer Science and Engineering Washington University in St. Louis, Box 1045, One Brookings Ave, St. Louis, MO, 63130 pless@cs.wustl.edu Abstract We
More informationFeature Sensitive Surface Extraction from Volume Data
Feature Sensitive Surface Extraction from Volume Data Leif P. Kobbelt Mario Botsch Ulrich Schwanecke HansPeter Seidel Computer Graphics Group, RWTHAachen Computer Graphics Group, MPI Saarbrücken Figure
More informationModeling by Example. Abstract. 1 Introduction. 2 Related Work
Modeling by Example Thomas Funkhouser, 1 Michael Kazhdan, 1 Philip Shilane, 1 Patrick Min, 2 William Kiefer, 1 Ayellet Tal, 3 Szymon Rusinkiewicz, 1 and David Dobkin 1 1 Princeton University 2 Utrecht
More informationAn Experimental Comparison of MinCut/MaxFlow Algorithms for Energy Minimization in Vision
In IEEE Transactions on PAMI, Vol. 26, No. 9, pp. 11241137, Sept. 2004 p.1 An Experimental Comparison of MinCut/MaxFlow Algorithms for Energy Minimization in Vision Yuri Boykov and Vladimir Kolmogorov
More informationDecomposing a Scene into Geometric and Semantically Consistent Regions
Decomposing a Scene into Geometric and Semantically Consistent Regions Stephen Gould Dept. of Electrical Engineering Stanford University sgould@stanford.edu Richard Fulton Dept. of Computer Science Stanford
More informationWhich Edges Matter? Aayush Bansal RI, CMU. Devi Parikh Virginia Tech. Adarsh Kowdle Microsoft. Larry Zitnick Microsoft Research
Which Edges Matter? Aayush Bansal RI, CMU aayushb@cmu.edu Andrew Gallagher Cornell University andrew.c.gallagher@gmail.com Adarsh Kowdle Microsoft adkowdle@microsoft.com Devi Parikh Virginia Tech parikh@vt.edu
More information