Perceiving and Recognizing Objects

Philosophy
  Marr (1982), Vision
    What is the purpose of vision? To create a 3D map of the world in the brain.
  Gibson (1966)
    Vision is for action: navigation, manipulation, exploration.

Stages of vision
  early vision: visual feature extraction
  mid-level vision: finding edges, primitive grouping, and segmenting the world
  high-level vision: recognizing objects and understanding scenes

Middle vision
  A loosely defined stage that comes after basic features have been extracted from the image (low-level vision) and before object recognition and scene understanding (high-level vision).

Perceiving and Recognizing Objects: outline
  finding edges
  grouping and texture segmentation
  figure-ground assignment
  parts and wholes
  object recognition
  objects in the brain

What Do You See?
Why is it difficult?
  [Figure: (a) A house. (b) Cells in primary visual cortex respond to the local features of the house. (c) A more complicated scene.]

Object Ambiguity

More Ambiguity: Accidental Views
  accidental viewpoint: a viewing position that introduces regularity in the image that is not present in the visual world
Every Image Is Ambiguous

Perceiving Objects and Forms
  Finding edges
  Gestalt approach
    principles of perceptual organization
    figure-ground separation
  Object recognition theories
    Object-centered: Biederman, Recognition by Components
    Viewer-centered: Bülthoff, Edelman & Poggio, image-based interpolation

Task and Historical Perspective
  Surface perception
    a complete 3D representation or map of the scene
    edges, motion, stereo, etc.
  Segment/parse into objects
    Which points in the map belong to the same objects?
    segment into figure and ground
  Recognize and identify objects
    represent, remember, and match to memory
  Scene perception
    conglomerations of objects
    layout

Finding Edges
Finding Edges
  Early vision: oriented line detectors (V1)
  computer algorithms / simple cells

Illusory Contours
  illusory contour: a contour that is perceived even though luminance does not change from one side of the contour to the other

Finding Edges (cont'd)
  Filters: information at high/low spatial frequencies

Meaning in the Edges
  Non-accidental features provide clues to object structure
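The idea of oriented line detectors can be sketched with two small difference filters. This is a toy illustration only: the 3x3 kernels, the 4x4 image, and the function name filter_response are assumptions made for the example, not a model of V1 simple cells.

```python
def filter_response(image, kernel):
    """Slide a small kernel over the image (valid region only) and
    return the grid of summed, weighted luminance values."""
    kh, kw = len(kernel), len(kernel[0])
    rows = len(image) - kh + 1
    cols = len(image[0]) - kw + 1
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += kernel[i][j] * image[r + i][c + j]
            row.append(total)
        out.append(row)
    return out


# Hypothetical kernels, loosely analogous to orientation-tuned detectors.
VERTICAL_EDGE = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
HORIZONTAL_EDGE = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]

# Toy image: dark left half (0), bright right half (1),
# so the only luminance change is a vertical edge.
IMAGE = [[0, 0, 1, 1] for _ in range(4)]

vertical = filter_response(IMAGE, VERTICAL_EDGE)      # [[3, 3], [3, 3]]
horizontal = filter_response(IMAGE, HORIZONTAL_EDGE)  # [[0, 0], [0, 0]]
```

The vertical-edge filter responds strongly and the horizontal-edge filter stays silent. An illusory contour is a case where perception reports an edge even though a luminance filter like this one would give no response.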
Perceptual Organization
  How we see surfaces and objects from images
    knowledge and experience? Statistics of the world?
    principles of perceptual organization: more statistics? Bayesian approach?
    attention
    performance factors

Perceptual Committees and Cues
  All other things being equal...

Pandemonium
  Oliver Selfridge (1959)
  simple model of letter recognition
  demons loosely represent neurons; each level represents a different brain area
  perception by committee

Gestalt Psychology
  "The whole is greater than the sum of its parts"
  Wertheimer, Köhler, Koffka (1920s-1950s); Palmer and Rock (1990s)
  reaction to the structuralist school of psychology
  structuralism: a school of thought that held that complex objects or perceptions could be understood by the analysis of components

Laws of perceptual organization
  Prägnanz (good figure): stimuli should be interpreted so that the resulting form is as simple as possible
  Good continuation: form continuous lines
  Similarity: similar shapes, orientations, and colors should be grouped together
  Proximity: close things should be grouped together
  Common fate: things moving in the same direction should be grouped together
  Meaningfulness and familiarity: groups should look familiar

Good figure
  form as simple as possible
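The "perception by committee" idea behind Pandemonium can be sketched in a few lines. The feature names, the three-letter alphabet, and the scoring rule below are all assumptions invented for illustration; they are not Selfridge's actual feature set or weights.

```python
# Hypothetical feature inventory for three letters.
LETTER_FEATURES = {
    "A": {"left_oblique", "right_oblique", "horizontal_bar"},
    "H": {"vertical_left", "vertical_right", "horizontal_bar"},
    "L": {"vertical_left", "horizontal_base"},
}


def cognitive_demons(observed):
    """Each letter demon 'shouts' louder the better the features reported
    by the feature demons match its own letter (assumed scoring rule:
    hits minus false alarms, scaled by the number of wanted features)."""
    shouts = {}
    for letter, wanted in LETTER_FEATURES.items():
        hits = len(wanted & observed)
        false_alarms = len(observed - wanted)
        shouts[letter] = (hits - false_alarms) / len(wanted)
    return shouts


def decision_demon(shouts):
    """The decision demon picks whichever letter demon shouts loudest."""
    return max(shouts, key=shouts.get)


# Feature demons report two vertical strokes and a horizontal bar.
observed = {"vertical_left", "vertical_right", "horizontal_bar"}
shouts = cognitive_demons(observed)
winner = decision_demon(shouts)  # "H"
```

No single demon decides alone: every letter demon responds to some degree, and the outcome depends on the relative strength of the whole committee.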
Rules for Linking Contours
  Good continuation: group elements to form smoothly continuing lines

Texture Segmentation
  Similarity: group elements that are similar together

Texture Segmentation (cont'd)
  Proximity: (a) group elements that are proximal; (b) no grouping by conjunctions of color and form
  Proximity grouping can be overruled by grouping by common region or by connectedness
  Parallelism/symmetry (weaker grouping principles): group parallel and symmetric elements together

Dynamic Grouping Principles
  Common fate: group elements moving in the same direction together
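Grouping by proximity can be sketched as a simple rule: neighbouring elements join the same group when the gap between them is small enough. The one-dimensional dot positions and the max_gap threshold are hypothetical values chosen for the example.

```python
def group_by_proximity(positions, max_gap):
    """Group dot positions along a line: a dot joins the current group
    if it lies within max_gap of the previous dot, otherwise it starts
    a new group."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups


# Toy row of dots with two wide gaps (after 2 and after 8).
dots = [0, 1, 2, 6, 7, 8, 12, 13]
groups = group_by_proximity(dots, max_gap=1.5)
# [[0, 1, 2], [6, 7, 8], [12, 13]]
```

This sketch uses distance alone. It does not capture the point that proximity can be overruled by common region or connectedness, which would require extra cues beyond position.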
Dynamic Grouping Principles (cont'd)
  Synchrony: group elements changing at the same time together

Modern Gestaltism
  common region: elements perceived to be part of a larger region group together
  connectedness: elements that are connected to each other group together

Which Gestalt principles are at work?

Figure-Ground Separation
Figure-Ground Segmentation
  What is the to-be-recognized object and what is the background?

Figure-Ground Segmentation (cont'd)
  Gestalt figure-ground assignment principles: surroundedness, size, symmetry, parallelism
    surroundedness: the figure is surrounded
    size: the figure tends to be smaller
    symmetry: figures are more commonly symmetric
    parallelism: figures form parallel lines

Meaning and Figure-Ground Assignment
  Object recognition starts before figure-ground assignment finishes

Occlusion
  Which circles are figures and which are holes?
  T-junctions indicate occlusion
  Y- and arrow-junctions indicate corners of objects
  Except with accidental views!

Object Recognition: High-Level Vision
  Processes in object recognition
    Low-level vision: determine the features present in the image
    Mid-level vision: group features into objects
    High-level vision: match perceived representations to encoded representations
  What is the representation?
    image?
    extracted image-based features?
    extracted object-centered 3D shape? (Marr, 1982)

Naive Template Theory
  Lock-and-key representations
  Problem: you would need too many templates!
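The "too many templates" problem with lock-and-key matching can be illustrated with a toy pixel comparison. The 4x4 grids and the match_score function are assumptions made for this sketch; real template schemes could use other similarity measures.

```python
def match_score(image, template):
    """Fraction of pixels on which the image and the template agree."""
    total = 0
    agree = 0
    for img_row, tpl_row in zip(image, template):
        for a, b in zip(img_row, tpl_row):
            total += 1
            if a == b:
                agree += 1
    return agree / total


# Hypothetical stored template for the letter T ('#' = ink, '.' = blank).
TEMPLATE_T = [
    "###.",
    ".#..",
    ".#..",
    "....",
]

# The same T shifted one pixel to the right.
SHIFTED_T = [
    ".###",
    "..#.",
    "..#.",
    "....",
]

exact = match_score(TEMPLATE_T, TEMPLATE_T)   # 1.0
shifted = match_score(SHIFTED_T, TEMPLATE_T)  # 0.625
```

A one-pixel shift already drops the score well below a perfect match, so a pure template scheme would need a separate template for every position, size, and orientation of every object.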
Object and Shape Recognition Theories
  Direct analysis of shapes
  Problems
    Photometric problems: illumination, viewpoint, shadows, highlights
    Object setting: isolation, occlusion
    Rigid vs. non-rigid (animated) objects
  Shape invariants
    Properties of shape common to all views
    Feature list that specifies the object
    Good: some success in limited situations
    Bad: not generally applicable
  Structural description
    Find parts
    Identify parts
    Describe structural relations among the parts

Object and Shape Recognition Theories (continued)
  Examples: bottom-up approaches
  Feature hierarchies
    Pandemonium model (Selfridge, 59)
  Generalized cones as parts (Marr & Nishihara, 77; Marr, 82)
    Raw primal sketch
    2½-D sketch
    3D object-centered representations

Object and Shape Recognition Theories (continued)
  Recognition by components (Biederman, 86)
    Geons (about 50)
      least changeable with viewpoint
      maximize image features that generalize
    psychological evidence: accidental and non-accidental views

Structural Description Theory
  Geons (n = 36 or so)
  Represent the structure of an object, not what it looks like from one view
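A structural description (find parts, identify parts, describe the relations among them) can be sketched as a set of part-relation-part triples. The part names, relation names, and the two stored objects below are hypothetical labels for illustration; the sketch also assumes the parts have already been extracted from the image, which is the hard step.

```python
# Each stored object is a set of (part, relation, part) triples.
# These labels are hypothetical stand-ins for geon-like parts.
STORED_DESCRIPTIONS = {
    "cup": frozenset({("curved_handle", "side_of", "cylinder")}),
    "pail": frozenset({("curved_handle", "top_of", "cylinder")}),
}


def recognize(description):
    """Return the stored object whose structural description matches
    exactly, or None if nothing matches."""
    for name, stored in STORED_DESCRIPTIONS.items():
        if stored == description:
            return name
    return None


# The same two parts, differing only in how they are related.
seen = frozenset({("curved_handle", "top_of", "cylinder")})
label = recognize(seen)  # "pail"
```

Because the description records parts and their relations rather than pixels, it stays the same across any view in which those parts and relations can still be recovered, which is the sense in which such a representation is not tied to one view.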
The Effect of Viewpoint
  Problems with structural theories
    RBC predicts viewpoint-invariant recognition
    some empirical studies have found viewpoint-invariant recognition
      usually with common objects (chairs, tables, etc.)
    some empirical studies have found viewpoint-dependent recognition
      usually with synthetic objects

Object and Shape Recognition Theories (continued)
  Image-based models
    interpolation models (Poggio & Edelman, 91)
      2D image analysis
      Store multiple views
      Interpolate in image space
      Special or canonical views
    alignment models (Ullman, 1990s)
      Within a category: solve correspondence
      Align to a special view
      Transform from 2D to 3D
      Match

Object versus face recognition
  object recognition
    entry level or basic level categorization: "that is a bird"
    superordinate level: "that is an animal"
    subordinate level: "that is a cardinal"
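The store-multiple-views idea can be sketched as a familiarity score that is high near stored views and falls off with distance from them. This is only loosely in the spirit of interpolation models: the two-number "view" vectors, the Gaussian weighting, and the sigma parameter are assumptions for this example, not the published model.

```python
import math


def view_familiarity(probe, stored_views, sigma=1.0):
    """Sum of Gaussian bumps centred on each stored view. Probes close to
    stored views (or between nearby ones) score high; probes far from
    every stored view score near zero."""
    total = 0.0
    for view in stored_views:
        dist_sq = sum((p - v) ** 2 for p, v in zip(probe, view))
        total += math.exp(-dist_sq / (2 * sigma ** 2))
    return total


# Hypothetical image-space feature vectors for two stored views.
STORED_VIEWS = [(0.0, 0.0), (2.0, 0.0)]

between = view_familiarity((1.0, 0.0), STORED_VIEWS)  # novel view between stored ones
far = view_familiarity((5.0, 0.0), STORED_VIEWS)      # novel view far from both
```

Under these assumptions a novel view that lies between stored views is treated as familiar, while a distant view is not, which is one way to picture viewpoint-dependent recognition.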
Levels of Object Categorization
  What are these objects?
  Rosch & Mervis (1976)
    1. entry (basic) level: birds
    2. subordinate level: sparrow/ostrich
    3. superordinate level: animals

The Entry Level: A Special Case
  Which label comes to mind first?
  The entry-level term may be determined by a perceptual committee

Face Recognition
  Object versus face recognition
    face recognition (within-category processing)
      entry level or basic level categorization: "that is a face"
      superordinate level: "that is a human"
      subordinate level: "that is George"

Face Recognition (cont'd)
  What's wrong with this picture?
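The three levels of categorization can be pictured as a small hierarchy in which each label points to its more general category. The PARENT table below uses the examples from these slides; the table and the category_chain function are an illustrative data structure, not a claim about how categories are stored in the brain.

```python
# Each label maps to the next more general category.
PARENT = {
    "cardinal": "bird",
    "sparrow": "bird",
    "ostrich": "bird",
    "bird": "animal",
}


def category_chain(label):
    """Walk upward from a label to its most general category."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


chain = category_chain("cardinal")
# ["cardinal", "bird", "animal"]: subordinate, entry (basic), superordinate
```

For most objects the middle label is the one that comes to mind first. For faces the useful label is usually the most specific one, the individual.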
Object Recognition in the Brain

Objects in the Brain
  What system (ventral stream, parvocellular): object identification (inferotemporal cortex)
  Klüver and Bucy (1938, 39)
    lesioning of the temporal lobe in monkeys
    monkeys did not know what they were seeing ("psychic blindness")

Objects in the Brain
  Where/how system (dorsal stream, magnocellular): object localization/manipulation (parietal cortex, TPO junction)
  human stroke victims: agnosia

What and where: double dissociation
  Mishkin and Ungerleider (1982), monkeys
    what task: object recognition
    where task: spatial localization

    lesion           what task   where task
    temporal lobe    fails       fine
    parietal lobe    fine        fails

Face Recognition (cont'd)
  Special processes may be involved in identifying individual faces
  prosopagnosia: selective inability to recognize faces
Grandmother Cells
  Could a single neuron be responsible for recognizing your grandmother?

Functional neuroimaging data
  fusiform face area (FFA) responds to faces more than objects
    Kanwisher, McDermott & Chun (1997)
  special module in the brain?
    Modular representation: Kanwisher et al. (1997)
    Distributed representation: Haxby et al. (2001)
    Expertise-based representation: Gauthier & Tarr (1999)