The influence of scene organization on attention: Psychophysics and electrophysiology


Kanwisher-16 9/16/03 6:26 PM Page 321

FUNCTIONAL NEUROIMAGING OF VISUAL COGNITION

Chapter 16

The influence of scene organization on attention: Psychophysics and electrophysiology

Mitchell Valdes-Sosa, Maria A. Bobes, Valia Rodríguez, Yanelis Acosta, Alejandro Pérez, Jorge Iglesias, and Mayelin Borrego

Abstract

Many studies have examined the neural mechanisms of visual attention by recording event-related potentials (ERPs). In most of these studies, objects are abruptly presented to the observer, often in a rapid succession of events. Similar stimuli are used in the psychophysical technique known as rapid serial visual presentation (RSVP). We argue that in RSVP (and related ERP paradigms) it is difficult to evince object-based attentional effects and to study the influence of scene organization. Rapid serial object transformation (RSOT) is offered as an alternative. In RSOT, several objects are presented over seconds and are monitored to detect brief sequential mutations of their attributes (e.g. changes in motion direction affecting illusory transparent surfaces, or local form transformations of objects defined by shape and color). We show that when two events engage the same object in rapid succession, both can be attended to effectively. In contrast, events that affect different objects compete and produce an attentional blink (AB). The AB is associated with smaller ERPs that are possibly generated in early visual extrastriate cortex. This suggests that the competition between these visual events is played out at the level of feature codes, but is flexibly guided by higher-order representations similar to an object-file.

16.1 Competition for attention

Many, if not all, of the visual scenes encountered daily are full of numerous objects, and each object, in turn, can comprise multiple parts. Moreover, collections of objects can form perceptual groups. For each observed scene, these diverse entities compete for visual attention. Considerable progress has been made in understanding how exogenous cues (like stimulus salience) and endogenous factors (like a person's beliefs and goals) interact in the resolution of this competition (Desimone and Duncan 1995; Duncan 1996). For example, highly salient and more relevant objects are preferred over less salient or less relevant ones, allowing the winners preferential access to action and memory systems in the brain and to consciousness. The biasing of competition in favor of some stimuli over others is particularly clear in experiments showing endogenous selection of the inputs coming from one location, with rejection of those originating in other sites of the visual field. This type of selection leads to faster and more accurate detection or recognition of stimuli presented at the privileged location in a number of different paradigms (e.g. Eriksen and Hoffman 1973; Jonides 1981; LaBerge 1983; Posner and Cohen 1984). Results of this nature, reported in a host of studies, have given support to the classical spotlight metaphor of visuo-spatial attention.

More recently it has been understood that competition for attention is also influenced by the organization of the visual entities contained in a scene. The structure of the scene can constrain the allocation (and re-allocation) of attention. It has been shown that it is easier to divide attention between two features of one object than between two features of different objects, even if the two objects are in close spatial proximity (Duncan 1984; Vecera and Farah 1994; Lavie and Driver 1996; Behrmann et al. 1998; Valdes-Sosa et al. 1998b).
In fact, attention to one attribute of an object or surface may entail some degree of obligatory attention to other attributes of the same object (Duncan 1984, 1996; Egly et al. 1994; He and Nakayama 1995). Moreover, competition between objects can turn into cooperation (Driver and Baylis 1989) if they are perceptually grouped (e.g. if they move in the same direction). However, within the same object a cost for dividing attention between clearly differentiable parts has been reported (Vecera et al. 2000).

16.2 Neural mechanisms of attention competition

Understanding of the neural mechanisms of attentional competition is rapidly increasing in several directions, especially for the thoroughly studied situation in which spatial locations are selected. Neuroimaging has demonstrated that both pre-stimulus and stimulus-elicited activity in visual extrastriate areas are enhanced within the retinotopic sectors corresponding to the selected region of visual space (Mangun et al. 2000; Chapter 15, this volume). These effects may also be present in striate cortex (Brefczynski and DeYoe 1999; Gandhi et al. 1999). Research in several labs (see reviews by Hillyard and Anllo-Vento 1998; Hillyard et al. 1998; and Chapter 19, this volume) has established that event-related potentials (ERPs) elicited by stimuli flashed at the attended locations are enhanced in amplitude relative to ERPs elicited by stimuli presented elsewhere. The earliest components affected are P1 and N1. The P1 begins as early as 80 ms after stimulus onset, and can be modeled by sources in early extrastriate cortex (Heinze et al. 1994; Mangun et al. 1997; Martinez et al. 1999; Di Russo et al. 2001). The N1 begins at about 150 ms after the stimulus, and is a more complex mixture of subcomponents, including several sources in extrastriate cortex and in more frontal regions (Di Russo et al. 2001). Note that these early ERP effects are only obtained under conditions of high perceptual load (Lavie 1995; Luck and Hillyard 2000), in which the time to process potentially competing stimuli is limited. In other words, the early sensory modulation indexed by the P1 and N1 attentional effects is only present when it is difficult to switch attention in time between different input sources.

Despite convincing psychophysical evidence for object-based attention (cited above), much less work has been performed on this topic with neuroimaging techniques (e.g. Arrington et al. 2000). The paucity of ERP studies on the topic is even more acute, hence little is known about the timing of the processes involved. This, in part, is a consequence of the dominant form of stimulus presentation used in ERP research. In contrast with other neuroimaging methods, ERPs are best elicited by abruptly presented, brief, and discrete stimulus variations (Hillyard and Picton 1987). Furthermore, the successive stimuli must be presented under strict time limits to enhance perceptual load (if selective attention is of interest). It is not surprising that the stimuli used to study the sharing of attention over time and those used to elicit attentional ERP effects are so similar, an issue we examine next.

16.3 The attentional blink, RSVP, and RSOT

The temporal constraints in sharing attention between objects (see review by Egeth and Yantis 1997) have been studied using a technique known as rapid serial visual presentation (RSVP). In RSVP, objects are presented briefly, and generally one at a time. The objects can be words, letters, or pictures. In one variant (see Fig. 16.1A), a series of stimuli (say letters) are presented in rapid succession, usually at the same site. The stream is divided into targets and distracters. Recognition of a first target (T1) within the stream hampers processing of a subsequently appearing target (T2) until several hundred milliseconds have transpired, or until several distracters have intervened between the targets (Raymond et al. 1992; Shapiro et al. 1994; Chun 1997). This phenomenon has been dubbed the attentional blink (AB). The attentional nature of the AB is attested by asking the observers to ignore T1 (thereby focusing attention on T2), in which case performance on T2 discrimination improves (Fig. 16.1B). This rules out sensory masking as an explanation of the interference (Egeth and Yantis 1997). A more austere variant of RSVP uses only two stimuli (see Fig. 16.1A) that are masked after a short presentation. A long-lasting impediment in the identification of the second stimulus is also found in this paradigm (Duncan et al. 1994; Ward et al. 1996). Again, the interference is present only when attention is paid to the first stimulus.

[Figure 16.1. Panel (A): RSVP stream of distracters and targets (D1, T1, D2, D3, T2, D4) and the two-stimulus variant (T1, mask, SOA, T2, mask) plotted against time. Panel (B): % correct versus SOA or number of items, for focused attention (ignore T1) and divided attention (process T1 and T2). Panel (C): RSOT timechart in which T1 affects object token 1 and T2 affects object token 2.]

Fig. 16.1 Traditional variants of rapid serial visual presentation (RSVP). (A) A series of many objects presented in rapid succession, each substituting (and, if at the same site, also masking) its predecessor; in the other variant (below), only two objects are presented, each with a post-mask that interrupts processing and limits the temporal availability of stimulus information. (B) Idealized graph of basic RSVP findings. Accuracy is reported as the percent correct of T2 identification given correct T1 identification. Attending to T1 hampers processing of T2 up to a stimulus onset asynchrony (SOA) of about 300-500 ms. This effect is eliminated if T1 is ignored. (C) Rapid serial object transformations (RSOT). Object tokens (e.g. small shapes or surfaces) are presented for longer lifetimes than in RSVP. Targets consist of brief events that transform the objects without destroying their spatio-temporal continuity, as shown in the timechart. Here masking of the event is accomplished by reverting the transformation and returning the object to its baseline condition.
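
The accuracy measure described for Fig. 16.1B (percent correct on T2, counted only over trials with a correct T1 report) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the trial record layout `(soa_ms, t1_correct, t2_correct)`, the function name `t2_given_t1_accuracy`, and the example trials are all hypothetical.

```python
from collections import defaultdict


def t2_given_t1_accuracy(trials):
    """Percent correct on T2, counting only trials with a correct T1 report.

    `trials` is an iterable of (soa_ms, t1_correct, t2_correct) tuples.
    This record layout is hypothetical, chosen only for illustration.
    Returns {soa_ms: percent_correct}; SOAs without any correct-T1 trial
    are omitted.
    """
    kept = defaultdict(int)  # trials with T1 correct, per SOA
    hits = defaultdict(int)  # of those, trials with T2 also correct
    for soa_ms, t1_correct, t2_correct in trials:
        if not t1_correct:
            continue  # T1 errors are excluded from the denominator
        kept[soa_ms] += 1
        if t2_correct:
            hits[soa_ms] += 1
    return {soa: 100.0 * hits[soa] / kept[soa] for soa in sorted(kept)}


# Made-up trials, not data from the chapter.
example_trials = [
    (300, True, False),
    (300, True, True),
    (300, False, True),  # dropped: T1 was reported incorrectly
    (500, True, True),
    (500, True, True),
]
print(t2_given_t1_accuracy(example_trials))
```

Conditioning on a correct T1 report keeps trials on which T1 was not processed from inflating the T2 score, so the measure reflects the cost of actually attending to T1.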

The duration of this interference can be as long as half a second (but see Moore et al. 1996). The data from both paradigms indicate that resolving the attentional competition between objects can take a considerable time. An attractive aspect of RSVP is the controlled timing of visual input. This allows relatively direct measurements of the ability to re-allocate attention over time (Duncan et al. 1994), in contrast with the awkward assumptions needed when estimating this ability with visual search tasks (see Wolfe 1998 for a discussion). However, note that in all variants of RSVP the objects have short lifetimes (Fig. 16.1A). In the visual world, many (perhaps most) objects are relatively long lived within a scene. Also, an input of only one object per instant of time is perhaps infrequent in real-life scenarios. Furthermore, the creation of new objects may capture attention (Yantis 1998), and the disappearance of stimuli may disengage attention (Mackeben and Nakayama 1993) automatically. Consequently, the onset and offset of stimuli in RSVP may impose peculiar temporal dynamics on the allocation of attention that are not always present in natural scenes. Natural scenes usually have a complex structure, where it may be necessary to shift attention between aspects of the same part of an object, different parts of the same object, as well as between different (and simultaneously present) objects or groups of objects. Therefore, it is difficult to study the influence of scene organization on attention with RSVP. RSVP shares important features with the experimental designs used in many ERP experiments on visual attention (e.g. the use of briefly flashed objects and fast stimulus presentation rates), which might explain why electrophysiological signatures of object-based attention have been so elusive.
To observe these effects, a longer permanence of the objects giving structure to the scene is probably required. A different approach is needed, one that would allow us to combine multiple objects within a visual scene (so that perceptual organization can be examined) while conserving the controlled timing of visual input from RSVP (so that the temporal dynamics of attention can be studied). One alternative, which we have named rapid serial object transformation (RSOT), consists of presenting several objects simultaneously in a display for relatively prolonged durations. Observers monitor the scene, and the accuracy in identifying events that transform these objects is measured. The critical aspect is how accuracy is affected by the duration of the interval between the events (see Fig. 16.1C). The objects may change shape, color, position, or any other attribute. Despite this mutability, each object token survives as an entity. The individuality of each object is not at issue. These stimuli instantiate the concept of an object-file (Kahneman and Treisman 1984; Kahneman et al. 1992), discussed later in this chapter. RSOT captures certain aspects of real-life scenes, and many examples are readily available. Imagine talking with several colleagues (springing a new idea) while monitoring their faces for changes in expression. Or imagine tracking the position of several vehicles when driving in heavy traffic. With RSOT (just as with RSVP) the timing of attentional shifts can be measured directly from the duration of the interference between the recognition of two events.
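
The idea of reading the timing of an attentional shift from the duration of the interference can be sketched as a simple recovery criterion on accuracy-by-SOA curves. The sketch below is an assumption-laden illustration rather than the procedure used in the studies cited: the function name `interference_duration`, the 5-point `criterion`, and all numbers are hypothetical, and with a handful of sampled SOAs the result is only an upper bound on the true recovery time.

```python
def interference_duration(soas_ms, baseline_acc, test_acc, criterion=5.0):
    """Shortest sampled SOA from which the accuracy cost stays at or below `criterion`.

    The cost at each SOA is baseline_acc - test_acc, in percentage points
    (for example, same-object accuracy minus different-object accuracy).
    Returns None if the cost never shows a sustained recovery.
    The criterion value is an arbitrary, hypothetical choice.
    """
    points = sorted(zip(soas_ms, baseline_acc, test_acc))
    recovered_at = None
    for soa, base, test in points:
        cost = base - test
        if cost <= criterion:
            if recovered_at is None:
                recovered_at = soa  # candidate recovery point
        else:
            recovered_at = None  # cost reappeared, so recovery was not sustained
    return recovered_at


# Made-up accuracies (percent correct), not data from the chapter.
soas = [100, 300, 500, 800, 1100]
same_object = [90, 91, 90, 92, 91]
different_object = [60, 70, 82, 89, 90]
print(interference_duration(soas, same_object, different_object))
```

Requiring the cost to stay below the criterion at all longer SOAs avoids declaring recovery on a single noisy point.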

But in contrast to RSVP, perceptual organization can be manipulated by varying the relationships between groups of objects, objects, parts of objects, or even features of one object. Also, the events transforming the objects can be used as triggers to elicit ERPs. Note that the capture of attention by changes in pre-existing objects may not be mandatory (Hillstrom and Yantis 1994). Object transformations have been used before in some studies of attention (e.g. Yantis and Jonides 1984; Sears and Pylyshyn 2000), and in studies of eye movements during reading or scene perception (e.g. Henderson and Ferreira 1990; Morris et al. 1990).

16.4 RSOT under conditions of extreme competition: Transparent motion

We have used RSOT of transparent motion, a situation characterized by strong perceptual competition between two illusory entities or object tokens. When two sets of dots move in different directions within the same region of visual space, an illusion is generated: two surfaces are seen sliding one across the other (separated in depth, with an ambiguous order, but seen as very close together). These surfaces are rivals for attention. We have shown that it is easier to divide attention between two features of the same transparent surface than between identical attributes of different surfaces. This holds for simultaneous judgments about direction and speed of motion (Valdes-Sosa et al. 1998b), and about direction of motion and the shape of the moving elements (Rodriguez et al. 2002). These results extend the early demonstration of object-based attention by Duncan (1984). Building on these results, an RSOT paradigm was created. Two sets of dots (colored green and red respectively) were rotated in opposite directions around a colored fixation point (see Valdes-Sosa et al. 2000 for more details).
After a baseline period of rotation, the dots from one surface briefly changed their direction of motion undergoing a translation (T1). The illusion is that the flow of rotating dots suddenly heads in another direction, and then recovers the original motion. After a variable T1-T2 stimulus onset asynchrony (SOA), either the same or the other surface was affected by an additional change in direction of motion (T2). The task was to report the direction of translation for both targets. The surface (red or green) affected by T1 was cued in advance by the color of the fixation point, whereas the surface affected by T2 was unpredictable. Note that the two illusory surfaces were continuously present (albeit with mutations in their direction of motion) and that the set of possible directions of translations was the same for both surfaces. The first target event was discriminated accurately under all conditions (Fig. 16.2). If the second target event (T2) affected the same surface as T1, it was also judged accurately. In sharp contrast, a large impairment in performance was observed if T2 affected the other surface (Figure 16.2). This two-surface cost persisted until about 500 ms after T1. The difficulty was not only in discrimination, but also affected detection

of the T2 probe motion (Pinilla et al. 2001). In other words, an AB was obtained when two events concerned different object tokens (surfaces), but not when they concerned the same object token (Fig. 16.2). Foreknowledge of the direction of the attentional shift did not ameliorate the two-surface cost (Cobo et al. 1999). Recently this pattern of results has been replicated independently (see Chapter 18, this volume).

[Figure 16.2. Two panels, First probe and Second probe: accuracy (%) plotted against SOA (300, 450, 600, 950, and 1200 ms) for the Same and Different surface conditions, mean ± SE.]

Fig. 16.2 Performance in the transparent motion RSOT paradigm. The mean accuracy for T1 discrimination for 10 observers is shown in the panel on the left, and the corresponding values for T2 are shown in the panel to the right. Error bars represent the standard error (reproduced with permission from Valdes-Sosa et al. 2000).

Blaser et al. (2000) recently performed a related RSOT experiment with a pair of superimposed (transparent) Gabor patches that smoothly (but unpredictably) changed orientation, spatial frequency, and color (note the similarity to conditions for monocular rivalry). The attributes held by one Gabor could (over time) become assigned to the other Gabor. Unpredictably, small jumps in the pattern of change would occur, which were to be reported. In line with our results, carrying out pairs of judgments for the same Gabor patch was easier than performing the identical judgments but on two different patches.

16.5 Neural correlates of attentional competition in transparent motion

The neural basis of the AB in the transparent motion RSOT paradigm was studied by means of ERP recordings (Pinilla et al. 2001). The responses elicited by the two target events (Fig. 16.3) both contained a negative wave known as N200, which has been

previously described in relation to motion onsets or direction changes (Kuba and Kubová 1992; Bach and Ullrich 1994; Torriente et al. 1999). The N200 elicited by T2 (but not the preceding positivity) was significantly reduced in amplitude when the targets engaged different surfaces, relative to when they affected the same surface (Fig. 16.3). This suggests attenuated sensory processing of the T2 during the AB in this paradigm, since the N200 is considered an index of activity within motion-specific visual cortices (Kuba and Kubová 1992; Bach and Ullrich 1994). Interestingly, when observers sustain attention on the same illusory surface for a long period of time, and the perceptual load is increased, all of the ERP components are essentially suppressed (Valdes-Sosa et al. 1998a). This suggests that perhaps larger suppressive effects could be found at shorter T1-T2 SOAs (where the AB is larger).

[Figure 16.3. ERP waveforms with P1 and N1 marked and T2 onset indicated: A, two probes on the same surface; B, two probes on different surfaces; C, only one probe; and the difference waves A-C and B-C. Scale bar 1 µV; time axis 0 to 400 ms.]

Fig. 16.3 Grand average ERPs in the transparent motion RSOT paradigm, for 10 observers, and recorded at the right posterior temporal electrode (T6). The timing of T1 corresponds to the origin of the time axis, whereas that of T2 is indicated by the vertical line. First row: ERPs from same-surface trials. Second row: ERPs from different-surface trials. Third row: ERPs from trials in which only T1 was presented. The fourth and fifth rows show the difference waveforms obtained by subtracting the T1-only response from the responses associated with same- and different-surface trials. In this and subsequent graphs positive deflections are plotted up. (Reproduced with permission from Pinilla et al. 2001).

A further characterization of the ERPs in the transparent motion RSOT paradigm has been recently carried out (Rodríguez et al. submitted). Niedeggen et al. (2002) have pointed out that if the AB is related to sensory suppression, then ERPs indexing activity in early visual areas should be smaller for trials on which identification fails. Therefore in this new study, the trials in which T2 was correctly and incorrectly identified were

averaged separately (correct T1 report was required in all cases). Also, high-density ERPs (120 channels) were recorded in order to ascertain the scalp distribution of the N200 (and of its attentional modulation) more exactly. Larger N200 amplitudes (associated with T2) were again obtained for same- relative to different-surface trials. Additionally, we found that the N200 was much larger for trials with correct responses than for trials with incorrect responses (Rodríguez et al. 2003), in both the same- and different-surface conditions. This indicates that the N200 was smaller on trials in which the identification of T2 failed in both these conditions, and that the proportion of trials with such failures is simply higher when attention must switch between surfaces. The N200 attentional effect is therefore best estimated by subtracting the incorrect/different-surface response from the correct/same-surface response (instead of the subtraction used in Fig. 16.3). The scalp distribution of the peak of the N200 modulation reflected in this difference waveform presented a maximum amplitude over right posterior temporal sites (see Fig. 16.4A).

[Figure 16.4. Panel (A): scalp maps in back, top, and right views, with a 0.45 µV scale. Panel (B): glass-brain views with left/right (L, R) and anterior/posterior (A, P) labels, with a scale from 50% to 100%.]

Fig. 16.4 (A) Scalp distribution of the attentional effect on the N200 in the transparent motion RSOT experiment (grand average of data from 10 observers). The data were obtained from a high-density electrode array (120 channels) and correspond to the voltage at the latency of the peak of the N200 component. The attentional effect was estimated by subtracting the ERPs related to incorrectly identified T2 stimuli in the different-surface condition from the ERPs related to correctly identified T2 stimuli in the same-surface condition. (B) Current sources for the scalp distribution shown in Fig. 16.4A, as estimated by the VARETA method and represented within the Talairach space in a glass brain (for details, see Rodríguez et al. 2003). Note that the right side of the brain is on the left. The estimated current density at each unitary source is represented as a percentage of the magnitude of the voxel with the largest response (values lower than 50% are not shown).

The intracranial generators of this negativity were modeled with a data-driven method employing multiple distributed current sources, known as VARETA (for more complete descriptions see Muller et al. 1998; Picton et al. 1999; Bosch-Bayard et al. 2001). The scalp distribution is consistent with bilateral generators located in lateral

ventral extrastriate cortex, and at the junction of the temporal and occipital lobes (Fig. 16.4B). There seems to be more activation on the right side, with sources in lateral occipito-temporal extrastriate, parietal, and frontal cortices. The Talairach coordinates of the largest current source are near a site identified in PET and fMRI studies as the motion-processing area MT+ (see review by Culham et al. 2001). The somewhat larger involvement of the right hemisphere is consistent with previous ERP and fMRI studies of motion processing (for more details see Rodríguez et al. 2003). These modeling efforts can only suggest that the AB in our paradigm is related to a reduction of activity in MT+ and other extrastriate areas, but this conclusion would be congruent with the object-based attentional suppression of activity (evinced with transparent motion) that has been demonstrated in similar cortical areas by several neuroimaging studies (O'Craven et al. 1997, 1999; Watanabe et al. 1998). Moreover, recordings of neurons in cortical area V4 have been obtained recently in monkeys trained to observe transparent surfaces similar to those described here. These data support the idea that suppressive interactions within extrastriate cortex are at play during the AB in our paradigm (see Chapter 18, this volume). To summarize, there is suggestive evidence that the AB in the transparent motion RSOT task is associated with reduced neural activity in MT/MST and perhaps in V4, which are relatively early visual extrastriate areas. Hence, under some conditions early sensory filtering (i.e. attenuation of sensory motion processing) may contribute to the AB.

16.6 RSOT based on object shape transformations

The transparent motion design serves as a strong challenge to the classical spotlight metaphor of attention.
But, although objects do overlap and transparency is present in many natural scenes, the extreme form of perceptual competition described up to now is not the rule. It is therefore necessary to examine the influence of scene organization on attention when the competing stimuli are not transparently overlapped (and thus more separated in space), which represents the mainstream situation in visual attention experiments. Moreover, we would like to demonstrate that the object-based modulation of the AB found with transparent motion is not a quirk of the dorsal stream of visual extrastriate areas (Desimone and Ungerleider 1989), limited to situations where the object tokens are defined by relative motion and the target events consist of changes in motion direction. Therefore we now turn to studies using stationary object tokens specified by shape and color that do not overlap completely, and to transformations consisting of changes in local shape. Color and shape are thought to be preferentially processed in the ventral visual stream (Desimone and Ungerleider 1989). For this, we (Pinilla and Valdes-Sosa submitted) modified the stimuli designed by Behrmann and coworkers (Behrmann et al. 1998). Two overlapping bar shapes were presented (see Fig. 16.5A). The tips of the bars were shaped so that erasing one set of pixels would reveal a form consisting of two bumps, whereas erasing another set of pixels would reveal three bumps. Thus two potential tip forms (with either two or three bumps) could emerge from the baseline stimulus.

[Figure 16.5: (A) timing of a trial (baseline, event 1, variable interval, event 2, baseline); (B) mean accuracy (%) as a function of the SOA between events (100, 300, 500, 800, and 1100 ms) for event 1 and for event 2 at the same place, at the same object/different place, and at a different object; (C) control experiments.]

Fig. 16.5 (A) RSOT paradigm based on object shape transformations. Two overlapped bars (of different colors) were presented, and the tips could mutate into two or three bumps. Observers were cued about which object would be affected first by the color of the fixation point. The top row depicts a same-location trial (T1 and T2 affect the same tip). The middle row represents a same-object trial (T1 and T2 affect different tips from the same object). The bottom row represents a different-object trial (T1 and T2 affect tips from different objects). (B) Accuracy in identifying targets in the shape transformation RSOT paradigm, which was high for T1 in all conditions and at all SOA values. This was also true for T2 identification when it affected the same tip as T1. There was an AB when T2 affected a different tip than the tip affected by T1; however, if the two tips belonged to the same object, the AB was shorter than when the two tips belonged to different objects. (C) Control conditions. In the column to the left, the same shape transformations already described were used, and the response required was the same, but all the tips were isolated from each other (the colored pixels from the middle of the two bars were reallocated to disconnect them). In the column to the right, the transformations consisted of flashing a white rectangle upon the background bars. The observer had to determine which dimension (vertical or horizontal) was larger.

After the presentation of the baseline stimulus on each trial (see Fig. 16.5A), a brief change in shape (i.e. the unmasking of one of the two bump shapes) occurred at a randomly selected tip. This comprised the T1 event. The bar to be affected by T1 was pre-cued by the color of the fixation point. After a variable SOA, a second shape change took place at a randomly selected (and unpredictable) tip. In all cases T1 was accurately identified. Also, T2 was accurately discriminated for all the SOAs explored when it engaged the same tip as T1 (Fig. 16.5B). In other words, the AB was absent for successive events taking place at the same object-part. In contrast, interference between T1 and T2 was found when they involved different tips of the same object. However, this AB only lasted until about 300 ms. Note that the elongated form of the bars possibly facilitated the segregation of the tips as distinct parts of the overall object (see Vecera et al. 2000 for congruent findings). Importantly, interference between T1 and T2 lasted longer when these events concerned the tips of distinct objects, with an AB lasting up to 500 ms (Fig. 16.5B). Note that the distance between tips on different objects is shorter than between tips of the same object; hence the larger AB for between- versus within-object attentional shifts cannot be explained by purely spatial mechanisms. The pattern of results was the same when trials with eye movements (monitored with infrared spectacles) were discarded from the analysis. When the continuity of the bars was broken (see Fig. 16.5C), the speed for attentional shifts between all tip-pairs was equivalent.
If the stimulus was changed to white rectangles briefly flashed on the bar tips (Fig. 16.5C), whose shape had to be discriminated, the object-based advantage was absent. A tendency for a shorter AB between events on tips of different objects was found, perhaps related to the shorter distance between these pairs of tips. The last finding is particularly relevant when designing ERP studies of object-based attention. Although perhaps optimal for eliciting large responses with adequate signal-to-noise ratio, sudden-onset stimuli (that may be perceived as new objects unrelated to the previous structure of the scene) are not useful in uncovering organizational constraints on the distribution of attention. A related result has been described for a hybrid experiment, where in one condition the successive frames in an RSVP were different objects, whereas in another condition successive frames were perceived as a single rotating object (in effect RSOT). The AB was diminished for the single-object compared with the different-object condition (Raymond 2003).

16.7 ERPs in an RSOT paradigm using shape transformations

The findings of the previous section were confirmed in an alternative RSOT design, which also served to measure ERPs (Valdes-Sosa et al. 2003). In this design (schematized in Fig. 16.6A), five small diamonds are presented as holes in two non-overlapping objects. The objects were of different colors and shapes. One diamond was in the center of the display and served as the fixation point, as well as the medium for creating T1 on all trials. It always belonged to an oval that also included one of the diamonds on the horizontal meridian. The other three diamonds were included within a different curved object.

[Figure 16.6: (A) time chart of a trial (baseline, T1, baseline, T2, baseline; labeled durations 300 ms, 30 ms, variable ISI, 100 ms, 300 ms); (B) the two mirror layouts (oval to the right, oval to the left), with within-object and between-object shifts marked; (C) polar plots of accuracy (rings at 40%, 60%, and 80%) for the top, left, right, and down diamonds.]

Fig. 16.6 (A) Time chart for an RSOT paradigm based on local shape transformations. Two global objects were depicted on a monitor screen for the duration of the complete trial. Target events were brief changes of one of five diamond-shaped holes within the two objects: the loss of one corner (see inset). T1 was always a 30 ms modification of the central diamond, which also served as fixation point. T2, with a 100 ms duration, could affect any of the four eccentric diamonds (left, right, top, and down), with equal probability. T2 event duration was selected to produce about 80% correct responses in most subjects when presented alone. In the example, T1 affects the central, and T2 affects the rightmost, diamond. (B) Two mirror layouts used. In one case, depicted above, the shift of attention to the left crosses an object boundary, whereas the shift to the right stays within the same object. In the layout depicted below, a leftward shift of attention remains within the object, whereas a rightward shift moves between objects. (C) Polar plots of accuracy in identifying T2 (given accurate T1 recognition). The mean percent correct for 10 observers is plotted for each of the eccentric diamonds at the corresponding angle. Results for the two stimulus configurations of Fig. 16.6B are overlaid. A 380 ms SOA between T1 and T2 was used.

The centers of the four peripheral diamonds were placed at the same retinal eccentricity (about 1.6°), in a cross-like layout. The serial transformations consisted of the brief disappearance of one of the four corners of a diamond (see inset in Fig. 16.6A), an alteration which observers had to describe by pressing the appropriate arrow keys of the computer keyboard. Each trial included an event at the center diamond (T1), and a subsequent event at one of the peripheral diamonds (T2). T1 was uninformative about the upcoming location of T2. Two stimulus configurations were used (see Fig. 16.6B): a layout that included the central diamond in the same object (oval) as the rightmost diamond, and another layout which included the central diamond in the same object as the leftmost diamond. Accordingly, the local events were always the same, but the perceptual linkage between these events varied for the two configurations. The first question asked was whether T2 was discriminated with equal accuracy when it concerned the same object as T1 or a different object. In an experiment with a fixed SOA between T1 and T2 (380 ms), accuracy of T1 recognition was uniformly high (over 96% in all conditions and observers). Note that overall accuracy is lower in this design than in the one from the previous section, perhaps due to the smaller size of the shape mutations. T2 discrimination was about 20% more accurate when the affected peripheral diamond was part of the same object containing the central diamond (the site of T1) than when it was not connected to the central diamond. In other words, the AB was smaller when the two target events concerned the same object (see polar plot in Fig. 16.6C), confirming the conclusions drawn in Section 16.6. Equivalent results were obtained when the trials with eye movements were discarded.
The temporal dynamics of the AB were measured in a second task with the same stimuli but with a variable T1-T2 SOA. The results are illustrated in Fig. 16.7, where data from both stimulus configurations are collapsed. One block was performed with focused attention (ignoring T1) and another block was performed with divided attention (attending to both T1 and T2). During focused attention, all T1 were the same in order to make the event easier to ignore. Blocks were presented in a counterbalanced manner across subjects. As expected, T1 identification was accurate. For within-object shifts of attention, the mean accuracy of T2 recognition was slightly above 60% correct identification of shape changes (Fig. 16.7, left panel). This mean accuracy did not vary as a function of SOA, and was only slightly affected by division of attention. No cost was found for this condition at 100 ms, a finding seemingly at variance with the results from the previous section (see Fig. 16.5B). We believe that intra-object grouping cues (and thus intra-object facilitation) were stronger in the stimuli of this new experiment. Note that division into parts is difficult for the oval used here, but not for the bars used before (compare Figs 16.5A and 16.6A). However, this idea has to be tested systematically by varying the concavity of the region connecting the two diamonds in the oval.

[Figure 16.7: mean accuracy (%, 0 to 100) as a function of SOA (100, 250, 500, and 900 ms), in separate panels for the same-object and different-object conditions, with separate curves for divided and focused attention. Values are mean ± SE; *P < 0.05, **P < 0.001.]

Fig. 16.7 Accuracy of T2 identification as a function of the T1-T2 SOA, and of the type of attentional shift (within- and between-object shifts as defined in Fig. 16.6B). Data for T2 collapsed over the right and left diamonds are shown, plotted separately for the blocks on which T1 was ignored or identified. Results for within-object attentional shifts are depicted in the left panel, and results for between-object shifts are shown in the right panel. Only trials with correct T1 identification were included.

On the other hand, under divided attention there was a substantial AB at short T1-T2 SOAs for between-object attentional shifts (Fig. 16.7, right panel). This impairment was ameliorated as the SOA increased. When T1 was ignored (focused attention), there was a facilitation of T2 identification, which was most pronounced at the SOA of 250 ms. This shows that sensory masking cannot explain the performance impairment for T2 discrimination during the AB. Interestingly, trying to ignore T1 produced little benefit for the SOA of 100 ms. Given the short time between T1 and T2, this outcome may be due to exogenous cueing, which automatically calls attention to visual entities and cannot be voluntarily suppressed by the observer (Jonides 1981; Remington et al. 1992). Exogenous processes have been shown to operate on object-based attention (Macquistan 1997). We also demonstrated that when the targets in the RSOT consisted of shapes flashed briefly on top of the figures in the scene (new objects), then the object-based modulation of the AB was largely attenuated, in line with the results of Pinilla and Valdes-Sosa (2003).
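The behavioral measure used in this section, T2 accuracy computed only over trials with correct T1 identification and broken down by type of attentional shift and SOA, can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the trial record layout (`shift`, `soa_ms`, `t1_correct`, `t2_correct`), the function name, and the toy trials are all assumptions introduced here.

```python
from collections import defaultdict


def t2_accuracy_given_t1(trials):
    """Return {(shift, soa_ms): proportion of correct T2 reports},
    counting only trials on which T1 was identified correctly."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for trial in trials:
        if not trial["t1_correct"]:
            continue  # T2 accuracy is conditional on a correct T1 report
        key = (trial["shift"], trial["soa_ms"])
        counts[key] += 1
        hits[key] += int(trial["t2_correct"])
    return {key: hits[key] / counts[key] for key in counts}


# Invented toy trials, for illustration only (not data from the chapter).
toy_trials = [
    {"shift": "within", "soa_ms": 100, "t1_correct": True, "t2_correct": True},
    {"shift": "within", "soa_ms": 100, "t1_correct": True, "t2_correct": False},
    {"shift": "between", "soa_ms": 100, "t1_correct": True, "t2_correct": False},
    {"shift": "between", "soa_ms": 100, "t1_correct": True, "t2_correct": False},
    {"shift": "between", "soa_ms": 100, "t1_correct": False, "t2_correct": True},
    {"shift": "between", "soa_ms": 900, "t1_correct": True, "t2_correct": True},
]
accuracy = t2_accuracy_given_t1(toy_trials)
```

Under this scheme, an AB would appear as lower conditional accuracy at short SOAs than at long ones, and the object-based effect as a larger deficit for between-object than for within-object shifts.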
Is the AB described in this section reflected in the ERP associated with T2? An experiment with eight observers was performed using high-density ERP recordings and a constant T1-T2 SOA of 380 ms, an interval at which the AB was expected to be large for the between-object attentional shifts, as confirmed by the data (Valdes-Sosa et al. 2003). The ERPs related to T2 presented at the rightmost diamond, and from one electrode (left posterior temporal site), are shown in Fig. 16.8A. The N230 component elicited by T2 is significantly larger for the trials in which attention remained within the same object than for trials when attention shifted between objects (P < 0.01 in the latency range from 170 to 265 ms). An equivalent result was obtained for the responses related to T2 at the leftmost diamond. Also, responses from trials in which T2 was correctly identified were significantly larger than responses from trials in which T2 was missed. This establishes a tight link between the AB and N230 amplitude.

[Figure 16.8: (A) ERPs at the T5 electrode in response to the right T2 event (calibration 1.7 µV, 200 ms), with traces for same-object correct, same-object incorrect, different-object correct, and different-object incorrect trials; N1 and the early N230 are marked; (B) back and top views of the scalp distribution of the early N230 subcomponent for left T2 (145 ms) and right T2 (170 ms), scale 0.45 µV; (C) estimated current sources, scale 50% to 100%.]

Fig. 16.8 Attentional effects on the ERPs related to T2 in the shape-change RSOT task (modified from Valdes-Sosa et al. 2003). (A) ERPs from the left posterior temporal electrode (T5 of the 10/20 system), plotted separately for trials with within- and between-object attentional shifts, and for correct and incorrect T2 identifications. Difference waveforms were obtained by subtracting the ERPs related to incorrectly identified between-object T2s from ERPs related to correctly identified within-object T2s (for details, see Valdes-Sosa et al. 2003). These waveforms therefore represent the modulation related to the AB. Recordings are the grand averages from eight observers and were obtained from 120 electrodes. (B) Scalp distributions of the voltage modulation affecting the early N230 subcomponent, presented separately for trials when T2 was presented at the leftmost and rightmost diamonds. (C) Current sources for the scalp distributions of the attentional effect on the early N230 subcomponent shown in Fig. 16.8B, as estimated by the VARETA method and represented with the same conventions as in Fig. 16.4B.

The scalp distribution of the modulation of N230 was studied by obtaining the difference waveform resulting from the subtraction of the ERPs related to incorrectly identified/different-object trials from ERPs related to correctly identified/same-object trials. The N230 contained at least three subcomponents. The earliest contribution to N230 was a large contralateral negativity with maximum amplitudes at posterior sites (see Fig. 16.8B). This subcomponent was present for the time region corresponding to the descending limb of N230, at about 145 ms for stimuli on the left and at about 170 ms for stimuli on the right. Additional, and somewhat later, contributions were found at ipsilateral and more frontal sites. This resembles the structure of the N1 elicited by the onset of the pattern stimuli used in previous studies of visuospatial attention (Di Russo et al. 2001). All of these subcomponents were affected by the AB. The sources of the earliest N230 subcomponent, modeled with the VARETA method, are shown in Fig. 16.8C. The generators estimated as most active were located in several visual extrastriate areas contralateral to the side on which T2 was presented. A later time region (on the ascending limb of N230 at about 245 ms) was also examined. For this region, in which the ipsilateral and frontal subcomponents were more evident at the scalp, the modeled sources are displaced frontally and laterally (for more details see Valdes-Sosa et al. 2003). If we accept that N230 amplitude reflects the strength with which sensory information is represented in visual extrastriate cortex, then these results indicate that the AB in the shape-change RSOT is related to early suppression of this information, in a manner similar to that demonstrated in the transparent motion design. The association of an AB with early sensory suppression is therefore not limited to objects and events defined by motion but seems a more general phenomenon.
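The difference waveform described in this section (ERPs for correctly identified same-object trials minus ERPs for incorrectly identified different-object trials) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the published analysis pipeline: the function names, the single-channel list representation of epochs, the sampling latencies, and the toy voltages are invented here; only the subtraction itself and the 170 to 265 ms window come from the text.

```python
def average_erp(epochs):
    """Average a list of equal-length single-trial epochs, sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def difference_wave(attended_epochs, blinked_epochs):
    """ERP(correct, same-object) minus ERP(incorrect, different-object)."""
    attended = average_erp(attended_epochs)
    blinked = average_erp(blinked_epochs)
    return [a - b for a, b in zip(attended, blinked)]


def mean_amplitude(wave, times_ms, start_ms, end_ms):
    """Mean of the waveform over samples with start_ms <= t <= end_ms."""
    window = [v for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return sum(window) / len(window)


# Invented toy epochs in microvolts at hypothetical latencies (not real data).
times_ms = [100, 150, 200, 250, 300]
same_object_correct = [[0.0, -1.0, -3.0, -2.0, 0.0],
                       [0.0, -1.0, -5.0, -2.0, 0.0]]
different_object_incorrect = [[0.0, -1.0, -2.0, -1.0, 0.0],
                              [0.0, -1.0, -2.0, -1.0, 0.0]]

diff = difference_wave(same_object_correct, different_object_incorrect)
effect = mean_amplitude(diff, times_ms, 170, 265)
```

With these toy numbers the mean difference in the window is negative, which is the direction expected when the N230 (a negative deflection) is larger for correctly identified same-object trials. A real analysis would apply the same subtraction at each electrode before mapping the scalp distribution.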
16.8 Discussion and conclusions

By using RSOT designs, we have measured directly the timing of attentional interference within multiobject visual scenes. This timing depends on the perceptual linkage between successive target events. It is of very short duration and small magnitude when attention shifts within an object. It is larger, producing an AB, when attention shifts between objects. We can therefore generalize the principle that it is easier to divide attention to attributes within a single object than between different objects (Duncan 1984) to situations where the attributes are not simultaneously present. The amplitude of the early N200 and N230 components was reduced during the object-based AB, a suppressive effect originally described as a signature for spatial attention (Hillyard and Anllo-Vento 1998). Therefore, the neural processes reflected by these ERP components are determined not only by local stimulation, but also by more global properties of the scene, including constraints on how attention can shift between different objects. These ERP effects are correlated with the subject's perceptual reports. Since the modulated components probably originate in early visual extrastriate cortex, this result suggests early suppression (or attenuation) of sensory information during the AB.

Previous ERP studies of the AB have obtained different results. One study using a traditional RSVP design (Luck et al. 1996; Vogel et al. 1998) found no modulation of the P1 or N1. In another study, using an RSOT with moving dots, a smaller N200 was found during the AB, but the amplitude of this component was not related to the observer's perceptual accuracy (Niedeggen et al. 2002). These discrepancies could be due to several reasons. First, Luck et al. used a salient (abrupt-onset) probe superimposed on T2 to characterize neural reactivity during the AB. This type of stimulus may capture attention automatically (see Yantis 1998) and therefore always elicit a large P1 and N1. Secondly, the possibility remains that the AB found in more traditional RSVP paradigms is of a different nature from the AB found in RSOT paradigms, due to different attentional dynamics, or the use of alphanumeric symbols as stimuli. On the other hand, the perceptual load in the study by Niedeggen and coworkers is lower than in our transparent motion task, which may lead to later attentional competition. Further research is needed to clarify the causes for the discrepancies. We have presented evidence consistent with selective filtering of sensory information in early extrastriate areas during the AB (see also Chapter 18, this volume). But exactly what is filtered? The results in the shape-change paradigm might be explained by a modified spatial filter, as posited by the grouped array hypothesis (Vecera and Farah 1994; Lavie and Driver 1996; Arrington et al. 2000). By this view, the spotlight of attention is warped to accommodate the shape of the attended object within a spatiotopic representation of the visual field. Therefore, at this moment a parsimonious account of these data can be reached by simply modifying location-based theories to accommodate an influence of perceptual organization.
A critical test of this account will be to study the effects of displacing the objects within the scene. However, we have argued in more detail elsewhere (Valdes-Sosa et al. 2000) that several candidates for attentional filtering (including representations of locations, elementary features, and two- or three-dimensional grouped arrays) are inconsistent with the results of the transparent motion paradigm (the same considerations are valid for the findings of Blaser et al. 2000). In brief, the spatial superposition of the competing transparent surfaces precludes selection of one surface based on locations, or a grouped array. A direct measurement of the difficulty in shifting attention between stationary sets of dots separated in depth, or of the additional difficulty produced by separating the illusory surfaces in transparent motion, reveals very weak effects (see Valdes-Sosa et al. 2000). More importantly, these results also rule out fixed filters selecting specific values along some sensory dimension (such as direction of motion or color). For example, we presented evidence that signals in MT are attenuated during the AB in the transparent motion RSOT paradigm. This is achievable by inhibiting cortical columns selective for the direction of motion of T2 on that trial. Nevertheless, since the possible directions for T2 are identical for the same- and different-surface conditions, a flexible mechanism for setting the filters is needed, changing from trial to trial. This is even more necessary in the experiment reported by Blaser and coworkers, given that the features of their two Gabors could interchange, and thus any attribute (or combination of attributes) potentially belonged to either of the two objects present in the scene. In other words, it is necessary to re-specify the settings of lower-order filters for elementary attributes as an object changes its aspect. Therefore, although the influence of scene organization on attention in our experiments is probably played out in early extrastriate cortex (where the different N1 components are generated), information from a higher-order representation is needed to explain the large flexibility in filter settings. A theoretical approach in consonance with these considerations postulates mental representations named object-files (Kahneman and Treisman 1984; Kahneman et al. 1992). Object-files are codes for episodic information that bridge the variations in location and attributes of the same object entity. Therefore the identity and continuity of an object token are preserved. The object-file hypothesis stipulates that a mechanism is needed to update the attributes bound to a particular token. A higher-order representation could control the activity of extrastriate areas in a top-down manner, possibly mediated by the massive feedback projections these areas receive from other cortical sites (Lamme and Roelfsema 2000). This possibility is ignored by most theoretical accounts of the AB, which use strictly feedforward cognitive architectures (considered critically in the context of attention research in Chapter 14, this volume). This has been aptly elucidated by Chun and Wolfe (2003) with a conveyor-belt metaphor. Imagine that perception delivers information, somewhat like dropping objects on to a conveyor-belt, at a rate faster than a subsequent stage can use (e.g. entry into visual short-term memory), similar to a slow unloading of the belt.
The difference in speed of the two operations would create the AB. But just as conveyor-belts move in only one direction, this type of bottleneck cannot explain the sensory suppression described here. Our results can be accommodated (admittedly loosely at this point) within the framework of the integrated competition theory (Duncan 1996). When several objects are present in a scene we can suppose that an invariant neuronal representation of each token is set up. The appropriate local features are somehow bound to each token code. When a change occurs in an object, new features appear and others disappear from the scene. Some neurons are activated and others are deactivated in different visual cortices. The new cells must somehow undergo binding to the representation of their object token, thus exchanging facilitation with other units already linked to the object and entering into inhibitory interactions with units representing other objects. This last idea is central to the integrated competition theory. In our experiments, attention is first drawn to the object affording T1. This entails a momentary activation of cells representing new features, their binding to the object representation, and the ensuing activation of other units within this representation. Due to suppressive interactions, the neurons representing the other object are momentarily inhibited. Both effects would endure for a period after T1 offset. If a second event, T2, arrives before these changes dissipate, the ease with which the new features will be processed will be affected. If the new features are bound to the activated object, their coding units will benefit from its facilitation. If the features are bound to the suppressed object, then the neurons representing them will inherit the corresponding inhibition. Of course, this proposal begs the question as to how exactly object-files are coded in the brain. The problem of how the brain represents objects is a complex issue (see Kanwisher and Treisman 1998, and Chapter 4, this volume), and there is no firm solution at hand. However, further studies with RSOT (in addition to capturing some interesting traits of real-life scenes) could perhaps further an understanding of the nature of object representations. This type of study could help to identify the neural activity that is invariant for the same object token, distinguishing it from the more volatile codes representing mutable aspects of the object, and thus contribute to unraveling the puzzle of how objects are represented in the brain.

Acknowledgements

The authors thank Greysi Horta, Belkis Alonso, and Carlos Suarez-Murias for technical assistance, Lourdes Diaz Comas for software development, and Mitchell Valdes-Bobes for help with typing. This work was supported by a grant from the Human Frontier Science Program.

Notes

1 The recording of neurons in awake monkeys is contributing to the increased knowledge of the mechanisms of visual attention but is beyond the scope of this chapter (see Chapter 18, this volume).

References

Ahlfors, S.P., Simpson, G.V., Dale, A.M., Belliveau, J.W., Liu, A.K., Korvenoja, A. et al. (1999). Spatiotemporal activity of a cortical network for processing visual motion revealed by MEG and fMRI. Journal of Neurophysiology, 82, 2545-55.

Arrington, C.M., Carr, T.H., Mayer, A.R., and Rao, S.M. (2000). Neural mechanisms of visual attention: Object-based selection of a region in space. Journal of Cognitive Neuroscience, 12, 106-17.

Bach, M. and Ullrich, D. (1994). Motion adaptation governs the shape of motion-evoked cortical potentials. Vision Research, 34, 1541-7.

Behrmann, M., Zemel, R.S., and Mozer, M.C. (1998). Object-based attention and occlusion: Evidence from normal participants and a computational model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1011-36.

Blaser, E., Pylyshyn, Z., and Holcombe, A. (2000). Tracking an object through feature space. Nature, 408, 196-9.

Bosch-Bayard, J., Valdes-Sosa, P., Virues-Alba, T., Aubert-Vázquez, E., John, E.R., Harmony, T., Riera-Díaz, J., and Trujillo-Barreto, N. (2001). 3D statistical parametric mapping of EEG source spectra by means of variable resolution electromagnetic tomography (VARETA). Clinical Electroencephalography, 32, 47-61.