From manual gesture to speech: A gradual transition




Maurizio Gentilucci (a) and Michael C. Corballis (b)

(a) Department of Neuroscience, University of Parma, Parma I-43100, Italy
(b) Department of Psychology, University of Auckland, Private Bag 92019, Auckland, New Zealand

Neuroscience and Biobehavioral Reviews 30 (2006) 949–960. Received 6 October 2005; received in revised form 15 February 2006; accepted 16 February 2006.
www.elsevier.com/locate/neubiorev
doi:10.1016/j.neubiorev.2006.02.004
Corresponding author: M.C. Corballis. Tel.: +64 9 373 7599; fax: +64 9 373 7450. E-mail address: m.corballis@auckland.ac.nz

Abstract

There are a number of reasons to suppose that language evolved from manual gestures. We review evidence that the transition from primarily manual to primarily vocal language was a gradual process, and is best understood if it is supposed that speech is itself a gestural system rather than an acoustic system, an idea captured by the motor theory of speech perception and by articulatory phonology. Studies of primate premotor cortex, and, in particular, of the so-called mirror system, suggest a double hand/mouth command system that may have evolved initially in the context of ingestion, and later formed a platform for combined manual and vocal communication. In humans, speech is typically accompanied by manual gesture, speech production itself is influenced by executing or observing hand movements, and manual actions also play an important role in the development of speech, from the babbling stage onwards. The final stage at which speech became relatively autonomous may have occurred late in hominid evolution, perhaps with a mutation of the FOXP2 gene around 100,000 years ago.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Speech; Gesture; Mirror system; FOXP2 gene; Evolution

Contents

1. Introduction
2. The gestural-origins theory
   2.1. The argument from signed language
   2.2. Primate origins
   2.3. The mirror system
   2.4. A gradual switch?
3. Connections between hand and mouth: empirical evidence
4. Evolutionary speculations
   4.1. When did the changes occur?
   4.2. The human revolution
5. Conclusion
Acknowledgments
References

1. Introduction

Language is composed of symbols, which bear little or no physical relation to the objects, actions, or properties they represent. This poses problems in the understanding

of how language evolved, since it is not obvious how abstract symbols could become associated with aspects of the real world. One theory, proposed by Paget (1930) and called schematopoeia, holds that spoken words arose initially from parallels between sound and meaning. For example, in many languages vowels are open in words coding something large, but closed in words coding something small (gr/a/nde vs. p/i/ccolo; note too that /a/ is pronounced differently in the words "large" and "small"). Nevertheless, most of the things we talk about cannot be represented iconically through sound, and with very few exceptions (zanzara, buzz, hum) the actual sounds of most words convey nothing of what they mean. This raises the paradox that was well expressed by Rousseau (1775/1964), who remarked that "Words would seem to have been necessary to establish the use of words" (pp. 148–149).

In this article we argue that the problem is to some extent alleviated if it is supposed that language evolved from manual gestures rather than from vocalizations, since manual actions can provide more obvious iconic links with objects and actions in the physical world. Early proponents of this view were the 18th-century philosophers de Condillac (1971/1756) and Vico (1953/1744), but it has been put forward, with variations, many times since (e.g., Arbib, 2005; Armstrong, 1999; Armstrong et al., 1995; Corballis, 1992, 2002; Donald, 1991; Givón, 1995; Hewes, 1973; Rizzolatti and Arbib, 1998; Ruben, 2005).

The remainder of this article is in three parts. First, we outline the arguments for the gestural-origins theory. Second, we present recent data demonstrating close links between movements of the hand and mouth, adding support to the theory. Third, we speculate as to the possible sequence of events in our evolutionary history that may have led to the replacement of a visuo-manual system by an auditory-vocal one.

2. The gestural-origins theory

2.1. The argument from signed language

Part of the argument is based on the fact that the signed languages of the deaf are entirely manual and facial, but display most, at least, of the essential linguistic properties of spoken language (Emmorey, 2002; Neidle et al., 2000; Stokoe, 1960). It is well recognized that signs are fundamentally different from gestures of the sort that occur in everyday life, independently of any linguistic function, and which are iconic or indexical rather than symbolic. Despite the symbolic nature of signs, though, there is also an analog, iconic component to signed languages, suggesting a link to a more iconic form of communication. In the course of evolution, then, pantomimes of actions might have incorporated gestures that are analog representations of objects or actions (Donald, 1991), but through time these gestures may have lost their analog features and become abstract. The shift over time from iconic gestures to arbitrary symbols is termed conventionalisation. It appears to be common to both human and animal communication systems, and is probably driven by increased economy of reference (Burling, 1999).

Nevertheless, some have argued that the properties of sign languages are fundamentally different from those of speech, suggesting that the two may have evolved independently.
For example, it has been claimed that signed language does not exhibit duality of patterning (e.g., Armstrong et al., 1995), which Hockett (1960) proposed as one of the distinguishing features of language. In speech, duality refers to the distinction between phonology, in which elements of meaningless sound are combined to form meaningful units called morphemes, and syntax, in which morphemes are combined to form higher-order linguistic entities. For signed language, Stokoe (1991) proposed a theory of "semantic phonology," in which the components of signs are themselves meaningful, thus precluding duality in the strict sense. More recent sign-language models, though, suggest that the sublexical units of signs are not meaningful, and use the term phonology to apply equally to sign languages and to speech (e.g., Brentari, 1998; Liddell and Johnson, 1989; Sandler, 1989; Van der Hulst, 1993). The four basic phonological categories of American Sign Language (ASL), known as parameters, are handshape, location, movement, and orientation of the hands (Emmorey, 2002), and the same elements have been identified in Italian Sign Language (LIS; Volterra, 2004/1987). As evidence that these are independent of meaning, it has been shown that deaf signers show a "tip-of-the-fingers" (TOF) effect comparable to the "tip-of-the-tongue" (TOT) effect shown by speakers. The TOT is induced when speakers cannot retrieve a word they know they know, but can often correctly produce one or more phonemes (usually the first). Similarly, TOF refers to a state in which the signer cannot produce a sign she/he knows, but correctly produces one or more parameters of the target. Just as TOT depends on a distinction between semantics and phonology in speech, so TOF indicates a similar distinction in signed language, supporting duality of structure (Thompson et al., 2005).

Another difference lies in the nature of the lexical units. Although many signs have lost their iconic form, sign languages retain iconic or analog components that have led some authors to doubt that spoken language could have evolved from gestures (e.g., Talmy, in press). In particular, sign languages have a classifier subsystem that is analog and gradient in character, and that has no parallel in spoken languages. This system applies primarily to the representation of spatial attributes like motion and location (Emmorey, 2002). For example, a signer might represent the motion of a car passing a tree by making the sign for a car (thumb raised, index and middle fingers extended forward) with the dominant hand, and a tree (forearm upright and five fingers extended) with the nondominant hand, and then moving the dominant hand horizontally across the torso past the nondominant hand.

The movement could be varied to indicate different kinds of motion, say upwards or downwards to represent a slope, quickly or slowly to represent speed, and so forth. These representations of motion are analog in the sense that they map directly onto the actual motion that is referred to, and the representations of the car and the tree have at least a vestigial iconic aspect. In spoken language, by contrast, different characteristics of motion are represented categorically, using morphemes or phrases such as pass, climb, move quickly, etc., and the words car and tree in no way resemble the objects themselves. Talmy (in press) suggests that if spoken language had evolved from a manual system, one would expect a continuation of more analog representation, perhaps with rising pitch to indicate climbing motion, rapid speech to indicate fast motion, and so on. The fact that spoken languages are almost entirely dependent on discrete, recombinant representations suggests, according to Talmy, that language evolved through the vocal-auditory channel, with the visuo-manual system a secondary form.

Of course, signed and spoken languages are end-states, and need not represent intermediate stages of transition. We shall argue that the differences between signed and spoken languages have to do primarily with the medium through which language is expressed, rather than with the nature of language itself, and that these differences do not preclude a gradual transition from one form to another. The visuo-manual medium lends itself to efficient representation of spatial concepts in analog fashion, and to a greater degree of parallel transmission than is possible using voicing. The auditory-vocal medium, in contrast, lacks an effective spatial dimension, and forces serial transmission. Of course some degree of analog representation is possible, and is sometimes used, as in expressions such as "up, up, up and away," or "he's wa-a-a-ay too young to understand." On the whole, though, it is probably much more efficient to use the combinatorial capacities of the vocal system to create categorical representations.

Although the nature of the differences between signed and spoken languages remains somewhat controversial, it now seems reasonably clear that they share the same underlying structure. Emmorey (2002) summarizes as follows: "The research strategy of comparing signed and spoken languages makes it possible to tease apart which phonological entities arise from the modality of articulation and perception and which properties arise from the nature of the expression system of human language, regardless of modality. The results thus far suggest that basic phonological entities such as distinctive features, segments, and syllables do not arise because language is spoken; that is, they do not arise from the nature of speech. Although the detailed structure of these entities differs (e.g., distinctive features in signed language are based on manual, rather than oral, articulation) they appear to play the same organizational role for both signed and spoken languages" (pp. 41–42).

We suggest, then, that there is sufficient commonality between sign language and speech to give credence to the idea that language evolved from manual gestures. The question of how language might have been transformed from something resembling sign language to vocal speech is considered in Section 3 below.
2.2. Primate origins

Neurophysiological evidence suggests that nonhuman primates have little if any cortical control over vocalization, which is critical to speech. This implies that the common ancestor of humans and chimpanzees was much better preadapted to develop a voluntary communication system based on visible gestures rather than sounds. Ploog (2002) documents two neural systems for vocal behavior, a cingulate pathway and a neocortical pathway. In nonhuman primates vocalization is largely, if not exclusively, dependent on the cingulate system. The neocortical system is progressively developed for voluntary control of manual movements, including relatively independent finger movements, from monkeys to apes to humans, and is indispensable for voluntary control (e.g., Hepp-Raymond, 1988). Only in humans is the neocortical system developed for precise voluntary control of the muscles of the vocal cords. Monkeys do make extensive use of facial expressions for communication, but these are more obviously gestural than language-like (Van Hooff, 1962, 1967). Attempts to teach language to great apes have achieved much greater success in communicating in language-like fashion through manual signs than in acquiring anything resembling vocal language (e.g., Gardner and Gardner, 1969; Savage-Rumbaugh et al., 1998), which is further evidence that voluntary control is more highly developed manually than vocally in our closest primate relatives. The human equivalents of primate vocalizations are probably emotionally based sounds like laughing, crying, grunting, or shrieking, rather than words. With the emergence of bipedalism in the hominid line some 6 million years ago, the hands were freed from locomotion, providing a potential boost to the evolution of manual communication.

2.3. The mirror system

Further support for gestural origins comes from the discovery of neurons in area F5 in the ventral premotor cortex of the monkey that fire when the animal makes movements to grasp an object with the hand or mouth (Rizzolatti et al., 1988). Another set of neurons in the ventral premotor cortex of the monkey, dubbed mirror neurons, fire also when the animal observes another individual making the same movements (Ferrari et al., 2003; Gallese et al., 1996; Rizzolatti et al., 1996). More recent discoveries, based on both neurophysiological

recordings in primates and functional brain imaging in humans, have identified a more general mirror system, involving temporal and parietal as well as frontal regions, that is specialized for the perception and understanding of biological motion (Rizzolatti et al., 2001). In monkeys this system has been demonstrated primarily for reaching and grasping movements, although it also maps certain movements, such as tearing paper or cracking nuts, onto the sounds of those movements (Kohler et al., 2002). So far, there is no clear evidence for a mapping of the production of vocalizations onto the perception of vocalizations. However, this mapping is implicit in humans in the so-called motor theory of speech perception (Liberman et al., 1967), which holds that we understand speech in terms of how it is produced rather than in terms of its acoustic properties.

More detailed study of area F5 suggests further specializations of relevance to the understanding of manual action. This area is located in the rostral part of the ventral premotor cortex, and consists of two main sectors, one located on the dorsal convexity (F5c), the other on the posterior bank of the inferior arcuate sulcus (F5ab). Both sectors receive strong input from the second somatosensory area (SII) and area PF. In addition, F5ab is the selective target of parietal area AIP (for a review, see Rizzolatti and Luppino, 2001). Single-neuron recording studies have shown not only that F5 neurons code specific actions, such as grasping, holding, or tearing, but also that many of them code specific types of hand shaping, such as the precision grip. It is worth noting that hand shape is an important component of human signed languages (e.g., Emmorey, 2002). F5 neurons frequently discharge when the grasping action is performed with the mouth as well as with the hand (Rizzolatti et al., 1988; see Fig. 1). These neurons may be functionally involved in preparing the mouth to grasp the object when the hand grasps it (Gentilucci et al., 2001), thereby encoding the goal of the action (taking possession of the object; Rizzolatti et al., 1988). From an evolutionary point of view, they may have been instrumental in the transfer of the gestural communication system from the hand to the mouth (see below).

Fig. 1. Study of a neuron responding to grasping with the hand and the mouth. The left upper panel shows the approximate location of area F5 on a lateral view of the monkey brain. (A) Neuron discharge during grasping with the mouth. (B) Neuron discharge during grasping with the hand contralateral to the recording side. (C) Neuron discharge during grasping with the ipsilateral hand. Rasters and histograms are aligned with the moment when the animal touches the food.

F5 neurons can fire during specific phases of the grasp, and some of them, known as canonical neurons, are activated simply by the presentation of a graspable object (Murata et al., 1997; Rizzolatti et al., 1988). Canonical neurons, which are mostly located in sector F5ab, are distinct from the mirror neurons described above, which are found generally in sector F5c. Nevertheless, mirror neurons also frequently respond to grasping actions, whether executed or observed, and may be sensitive to the particular type of grip used in the action (Gallese et al., 1996), but they do not respond to the simple presentation of a graspable object. As of now, no electrophysiological data (for example, using multielectrode recording techniques) are available showing temporal and functional relationships between canonical and mirror neurons.

Because the mirror system is activated when observing and executing the same hand action, it can be considered to be involved in understanding the meaning of the action (Gallese et al., 1996). It might therefore have provided the link between actor and observer that also exists between sender and receiver of messages. Rizzolatti and Arbib (1998) proposed that the mirror system was used as an initial communication system in language evolution. Indeed, a comparable mirror system has been inferred also in modern humans, based on evidence from electroencephalography (Muthukumaraswamy et al., 2004), magnetoencephalography (Hari et al., 1998), transcranial magnetic stimulation (Fadiga et al., 1995), and functional magnetic resonance imaging (Iacoboni et al., 1999). Area F5 is also considered the homologue of Broca's area in the human brain (Rizzolatti and Arbib, 1998), and the mirror system in general corresponds quite closely with the cortical circuits, usually on the left side of the human brain, that are involved in language, whether spoken or signed. The perception and production of language might therefore be considered part of the mirror system, and indeed part of the more general system by which visuo-motor (and audio-motor) integration is used in the understanding of biological motion.

2.4. A gradual switch?

As anticipated earlier, a critical question for the theory that language evolved from manual gestures is how the medium of language shifted from a manuo-visual system to a vocal-acoustic one. It is likely that this switch was not an abrupt one, but was rather a gradual change, in which language evolved initially as a largely manual system, but facial and vocal elements were gradually introduced, and evolved to the point that vocalization became the predominant mode (Corballis, 2002). McNeill (1992) has pointed out, though, that even today speech-synchronized manual gestures should be considered part of language, so the dominance of speech is not complete.

The argument for continuity between manual and vocal language is supported by evidence that speech itself is fundamentally gestural. This idea is captured by the motor theory of speech perception (Liberman et al., 1967), and by what has more recently become known as articulatory phonology (Browman and Goldstein, 1995). In this view speech is regarded, not as a system for producing sounds, but rather as a system for producing articulatory gestures, through the independent action of six articulatory organs: the lips, the velum, the larynx, and the blade, body, and root of the tongue.
This approach is based largely on the fact that the basic units of speech, known as phonemes, do not exist as discrete units in the acoustic signal (Joos, 1948), and are not discretely discernible in mechanical recordings of sound, as in a sound spectrograph (Liberman et al., 1967). One reason for this is that the acoustic signals corresponding to individual phonemes vary widely, depending on the contexts in which they are embedded. In particular, the formant transitions for a particular phoneme can be quite different, depending on the neighboring phonemes. Yet we can perceive speech at remarkably high rates, up to at least 10–15 phonemes per second, which seems at odds with the idea that some complex, context-dependent transformation is necessary. Indeed, even relatively simple sound units, such as tones or noises, cannot be perceived at comparable rates (Warren et al., 1969), which further suggests that a different principle underlies the perception of speech. The conceptualization of speech as gesture overcomes these difficulties, at least to some extent, since the articulatory gestures that give rise to speech partially overlap in time (co-articulation), which makes possible the high rates of production and perception (Studdert-Kennedy, 2005).
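The point that phonemes are not discretely recoverable from the acoustic record is easy to see with standard signal-processing tools. The following is a minimal sketch, not part of the original article: it assumes SciPy and Matplotlib are installed, and a hypothetical mono recording "syllables.wav" of a short utterance. The wideband spectrogram it plots shows continuous formant trajectories and overlapping transitions rather than segment boundaries.

```python
# Minimal sketch: visualize why phonemes are not discrete in the signal.
# "syllables.wav" is a hypothetical recording; any short utterance
# (e.g., /ba da ga/) will do.
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

fs, x = wavfile.read("syllables.wav")       # sampling rate (Hz), samples
x = x.astype(np.float64)
if x.ndim > 1:                              # keep a single channel
    x = x[:, 0]

# Wideband analysis: a short (~5 ms) window gives the time resolution
# needed to see formant transitions between neighboring phonemes.
f, t, Sxx = spectrogram(x, fs=fs,
                        nperseg=int(0.005 * fs),
                        noverlap=int(0.004 * fs))

plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12), shading="auto")
plt.ylim(0, 5000)                           # speech formants lie below ~5 kHz
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.title("Wideband spectrogram: formants glide continuously")
plt.show()
```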

MacNeilage (1998) has drawn attention to the similarity between human speech and primate sound-producing facial gestures such as lip smacks, tongue smacks, and teeth chatters. Ferrari et al. (2003) recorded discharge both from mirror neurons in monkeys during the lip smack, which is the most common facial gesture in monkeys, and from other mirror neurons in the same area during mouth movements related to eating. This suggests that nonvocal facial gestures may indeed be transitional between visual gesture and speech. This is supported by the increasing recognition that gestures of the face, and more particularly of the mouth, are components of sign languages, and are distinct from mouthing, where the signer silently produces the spoken word simultaneously with the sign that has the same meaning. Mouth gestures have been studied primarily in European signed languages, and schemes for the phonological composition of mouth movements have been proposed for Swedish (Bergman and Wallin, 2001), English (Sutton-Spence and Day, 2001), and Italian (Ajello et al., 2001) Sign Languages. Mouth gestures can serve to disambiguate hand gestures, and as part of more general facial gestures provide the equivalent of prosody in speech (Emmorey, 2002). This work is still in its infancy, but suggests an evolutionary scenario in which mouth movements gradually assumed dominance over hand movements, and were eventually accompanied by voicing and movements of the tongue and vocal tract. Thus, we suggest, speech was born.

One interesting class of mouth gestures constitutes what is known as echo phonology, in which movements of the mouth parallel movements of the hand. For example, the mouth may open and close in synchrony with the opening and closing of the hand (Woll, 2002). This may reflect a fundamental relationship between hand and mouth, which we explore in the next section.

3. Connections between hand and mouth: empirical evidence

Recent evidence suggests not only that speech is itself gestural, but that there are intimate connections between hand and mouth, in monkeys as well as in humans. As we have seen, the mirror system in the monkey is related to both arm (Gallese et al., 1996; Rizzolatti et al., 1996) and mouth actions (Ferrari et al., 2003). This suggests that gestures of the mouth might have been added to the manual system to form a combined manuo-facial gestural system. Up to now a mirror system has been documented only for arm and mouth actions, and the anatomical closeness of hand and mouth cells in the premotor cortex may relate to the involvement of both effectors in common goals. Since food is acquired mainly by using the hand and mouth, it is important for the animal's maintenance to extract the meaning and aim of an action from visual analysis.

Area F5 of the monkey premotor cortex also includes a class of neurons that discharge when the animal grasps an object with either the hand or the mouth (Rizzolatti et al., 1988). Gentilucci et al. (2001) infer a similar class of neurons in humans. They showed that when subjects were instructed to open their mouths while grasping objects, the size of the mouth opening increased with the size of the grasped object; conversely, when they opened their hands while grasping objects with their mouths, the size of the hand opening also increased with the size of the object. Mirror neurons have also been recorded in the parietal cortex of the monkey, which is strictly connected with F5; some neurons discharge when the monkey observes a grasp action performed with the hand and when the monkey executes a grasp action with the hand or the mouth (Gallese et al., 2002). In the evolution of communication, this mechanism of double command to hand and mouth could have been instrumental in the transfer of a communication system, based on the mirror system, from movements of the hand to movements of the mouth.

Grasping movements of the hand also affect the kinematics of speech itself. Grasping larger objects (Gentilucci et al., 2001) and bringing them to the mouth (Gentilucci et al., 2004a) induce selective increases in parameters of the lip kinematics and voice spectra of syllables pronounced simultaneously with action execution. Even observing another individual grasping larger objects or bringing them to the mouth affects the lip kinematics and the voice spectra of syllables simultaneously pronounced by the viewer (Gentilucci, 2003; Gentilucci et al., 2004a, b). Again, then, action observation induces the same effects as action execution. The effects on voicing and lip kinematics depend on the arm movement itself, and not on the nature of the grasped objects. Indeed, the same effects were found whether fruits or geometrical solids were presented, and even when no object was presented (i.e., when the action was pantomimed; see Fig. 3).
By using the mirror system, an individual observing an arm action can automatically and covertly execute the same action in order to interpret its meaning. For manual actions functionally related to oro-facial actions, the motor command is sent also to the mouth, and reaches the threshold for execution when the mouth is already activated to pronounce the syllable. Gentilucci and colleagues (Gentilucci, 2003; Gentilucci et al., 2001, 2004b) observed that execution/observation of a grasp with the hand activates a command to grasp with the mouth, which modifies the posture of the anterior mouth articulation according to the hand shape used to grasp objects of different sizes. This, in turn, affects formant 1 (F1) of the voice spectra, which is related to mouth aperture (Fig. 2). Conversely, execution/observation of the bringing-to-the-mouth action probably induces an internal mouth movement (such as chewing or swallowing), which affects tongue displacement according to the size of the object being brought to the mouth (Gentilucci et al., 2004a, b). This, in turn, modifies formant 2 (F2) of the voice spectra, which is related to tongue position (Fig. 3).
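F1 and F2 are standard acoustic measures, and the relations appealed to here (F1 tracks mouth aperture, F2 tracks tongue position) are textbook acoustic phonetics. For readers unfamiliar with how such values are obtained, the sketch below shows one conventional estimation route, linear predictive coding (LPC), applied to a steady vowel segment. This is a generic illustration under stated assumptions (librosa installed, a hypothetical file "vowel.wav", a rule-of-thumb model order), not the analysis pipeline used in the studies cited above.

```python
# Hedged sketch: estimate F1 and F2 from a steady vowel segment via LPC.
# Illustrative only; not the authors' measurement procedure.
import numpy as np
import librosa

y, sr = librosa.load("vowel.wav", sr=10000)   # hypothetical file; 10 kHz
                                              # keeps formants below Nyquist
y = y * np.hamming(len(y))                    # window the segment
y = np.append(y[0], y[1:] - 0.97 * y[:-1])    # pre-emphasis flattens spectrum

order = 2 + sr // 1000                        # common rule of thumb
a = librosa.lpc(y, order=order)               # all-pole vocal-tract model

roots = np.roots(a)
roots = roots[np.imag(roots) > 0]             # one root per conjugate pair
freqs = np.sort(np.angle(roots) * sr / (2 * np.pi))

# The lowest two resonances approximate F1 (mouth aperture) and
# F2 (tongue position); production pipelines also filter on bandwidth
# to discard spurious low-frequency poles.
print(f"F1 ~ {freqs[0]:.0f} Hz, F2 ~ {freqs[1]:.0f} Hz")
```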

On the basis of these results we propose that, early in language evolution, communication signals related to the meaning of actions (e.g., taking possession of an object by grasping it, or bringing an edible object to the mouth) might have been associated with the activity of particular articulatory organs of the mouth that were later co-opted for speech. The possibility that actions directed to a target might have been used to communicate is supported by the finding that the observation of pantomimes influences speech in the same way that observation of the corresponding real actions does (Gentilucci et al., 2004a; see Fig. 3). The strict relationship between representations of actions and spoken language is supported also by neuroimaging studies, which show activation of Broca's area when representing meaningful arm gestures (Buccino et al., 2001; Decety et al., 1997; Gallagher and Frith, 2004; Grèzes et al., 1998). Motor imagery of hand movements has also been shown to activate both Broca's area and left ventral premotor areas (Gerardin et al., 2000; Grafton et al., 1996; Hanakawa et al., 2003; Kuhtz-Buschbeck et al., 2003; Parsons et al., 1995).

The course of events in the evolution of language may be paralleled by those in the development of language in children, which also appears to involve the system of observation/execution of actions directed to a target (Gentilucci et al., 2004b). This is in accordance with the notion that a strict relationship exists between early speech development in children and several aspects of manual activity, such as communicative gestures (Volterra et al., 2005; Bates and Dick, 2002). For example, canonical babbling in children aged from 6 to 8 months is accompanied by rhythmic hand movements (Masataka, 2001). Manual gestures predate early development of speech in children, and predict later success even up to the two-word level (Iverson and Goldin-Meadow, 2005). Word comprehension in children between 8 and 10 months and word production between 11 and 13 months are accompanied by deictic and recognition gestures, respectively (Bates and Snyder, 1987; Volterra et al., 1979). From a behavioral point of view, words and manual gestures are communicative signals which, according to McNeill (1992), are synchronized with one another, suggesting that speech and gesture form a single, integrated system.

Fig. 2. Effects of the observation of a grasping action on the lip kinematics and the voice spectra of the syllable BA (/ba/) pronounced during action observation. Children (C) and adults (A) participated in the study. The object grasped by the actor was either a cherry (small object) or an apple (large object). (A) Hand shaping used by the actor when grasping the cherry and the apple. (B) Participants' lip kinematics. The upper panels show examples of the time course of lip opening and closing during syllable pronunciation, i.e., the curves show the time course of the distance between two markers placed on the upper and lower lip, respectively. Triangles and circles refer to lip movements when observing the grasp of the cherry and of the apple, respectively. The lower panels show the values of kinematic parameters averaged across subjects. Peak velocity of lip aperture and maximal lip aperture significantly increased when observing the grasp of the apple as compared with the grasp of the cherry. No significant interaction between age group (adults vs. children) and fruit (cherry vs. apple) was found. (C) Parameters of the voice spectra of the syllable BA pronounced during grasp observation. The panels show the mean values of formant 1 (F1) and formant 2 (F2). F1 significantly increased when observing the grasp of the apple in comparison with the grasp of the cherry. This effect was greater in children than in adults. *Significant main effect of fruit. **Significant interaction between age group and fruit. Bars are standard errors (SE).

Further support for the integration of speech and gesture comes from Bernardis and Gentilucci (2006), who showed that voice spectra parameters of words pronounced simultaneously with execution of the corresponding-in-meaning gesture increased in comparison with those resulting from word pronunciation alone. This was not observed when the gesture was meaningless. Conversely, pronouncing words slowed down the simultaneous execution of the gesture, which did not occur when pseudowords were pronounced. These effects of voice enhancement and arm inhibition were interpreted as due to a process of transferring some aspects of the message (such as the intention to interact closely) from the gesture to the word (Bernardis and Gentilucci, 2006).
On the other hand, the verbal response to a message expressed by the combination of word and gesture is different from the response to either communication signal alone. In fact, the voice spectra of words pronounced in response to simultaneously listening to and observing the speaker making the corresponding-in-meaning gesture are enhanced, just as they are by the simultaneous production of both word and gesture (Bernardis and Gentilucci, 2006). Broca's area is probably involved in the simultaneous control of gestures and word pronunciation. Indeed, the effects of gesture observation on word pronunciation, described above, were extinguished during temporary inactivation of this area using repetitive transcranial magnetic stimulation (Gentilucci et al., in press).

In summary, the connections between hand and mouth reviewed above may have been established initially in the context of ingestive movements of the mouth, and the acts of grasping and bringing food to the mouth, but adapted later for communication. MacNeilage (1998) has suggested that speech itself originated from repetitive ingestive

movements of the mouth. This may well be correct, but we suggest that it is only half the story, since it neglects the important role, in primates at least, of hand and arm movements in eating.

Fig. 3. Effects of the execution and observation of the bringing-to-the-mouth action on the voice spectra of the syllable BA (/ba/) pronounced during action execution/observation. The object brought to the mouth was either a cherry (small object) or an apple (large object). Upper panels show the execution effects. During the task the participant (shown in the panel) executed the action and simultaneously either pronounced the syllable BA or emitted a vocalization unrelated to Italian (/œ/). Lower panels show the observation effects. During the task, the participant observed the actor (shown in the panels) executing the action and simultaneously pronounced the syllable BA. (A) Execution and pronunciation of BA. (B) Execution and vocalization unrelated to Italian. (C) Observation of the action and pronunciation of BA. (D) Observation of a pantomime of the action and pronunciation of BA. Note that during the pantomime no object was presented, nor did the mouth open. (E) Observation of a pantomime of the action executed with a nonbiological arm (i.e., a shape of the arm) and pronunciation of BA. F1: formant 1. F2: formant 2. Gray (cherry) and black (apple) bars refer to the object brought to the mouth. Execution of bringing the apple to the mouth induced an increase in F2 as compared with the action executed with the cherry (A). The action did not affect a vocalization unrelated to Italian (B). Observation of the action (C) and of a pantomime of the action (D) induced the same effects as execution of these actions. No effect was found when the pantomime was executed with a nonbiological arm (E).

4. Evolutionary speculations

4.1. When did the changes occur?

Although the connections between hand and mouth were probably well established in our primate forebears, fully articulate vocalization may not have been possible until fairly late in hominid evolution, and perhaps not until the emergence of our own species, Homo sapiens. As we have seen, there is little if any cortical control over vocalization in nonhuman primates (Ploog, 2002), and it has proven virtually impossible to teach chimpanzees anything approaching human speech (Hayes, 1952). Moreover, fossil evidence suggests that the alterations to the vocal tract (e.g., D. Lieberman, 1998; Lieberman et al., 1972) and to the mechanisms of breath control (MacLarnon and Hewitt, 1999, 2004) necessary for articulate speech were not completed until late in hominid evolution, and perhaps only with the emergence of our own species, H. sapiens, which is dated at some 170,000 years ago (Ingman et al., 2000).

A further clue comes from the study of an extended family in England, known as the KE family. Half of the members of this family are affected by a disorder of speech and language, which is evident from the affected child's first attempts to speak and persists into adulthood (Vargha-Khadem et al., 1995). The disorder is now known to be due to a point mutation on the FOXP2 gene (forkhead box P2) on chromosome 7 (Fisher et al., 1998; Lai et al., 2001). For normal speech to be acquired, two functional copies of this gene seem to be necessary. The nature of the deficit, and therefore the role of the FOXP2 gene, have been debated.
Some have argued that the FOXP2 gene is involved in the development of morphosyntax (Gopnik, 1990), and it has even been identified more broadly as "the grammar gene" (Pinker, 1994), although Pinker (2003) has since recognized that other genes probably also played a role in the

evolution of grammar. Subsequent investigation suggests, however, that the core deficit in affected members of the KE family is one of articulation, with grammatical impairment a secondary outcome (Watkins et al., 2002a). The gene may therefore play a role in the incorporation of vocal articulation into the mirror system, but have little to do with grammar itself (Corballis, 2004a). This is supported by a study in which fMRI was used to record brain activity in both affected and unaffected members of the KE family while they covertly generated verbs in response to nouns (Liégeois et al., 2003). Whereas unaffected members showed the expected activity concentrated in Broca's area in the left hemisphere, affected members showed relative underactivation in both Broca's area and its right-hemisphere homologue, as well as in other cortical language areas. They also showed overactivation bilaterally in regions not associated with language. However, there was bilateral activation in the posterior superior temporal gyrus; the left side of this area overlaps Wernicke's area, which is important in the comprehension of language. This suggests that affected members may have generated words in terms of their sounds, rather than in terms of articulatory patterns. Their deficits were not attributable to any difficulty with verb generation itself, since affected and unaffected members did not differ in their ability to generate verbs overtly, and the patterns of brain activity were similar to those recorded during covert verb generation. Another study, based on structural MRI, showed morphological abnormalities in the same areas (Watkins et al., 2002b).

The FOXP2 gene is highly conserved in mammals; in humans it differs in only three places from that in the mouse, but two of the three changes occurred on the human lineage after the split from the common ancestor with the chimpanzee and bonobo. A recent estimate of the date of the more recent of these mutations suggests that it occurred "since the onset of human population growth, some 10,000–100,000 years ago" (Enard et al., 2002, p. 871). If this is so, fully articulate vocal language may not have emerged until after the appearance of our species, H. sapiens, some 170,000 years ago in Africa. This is not to say that the FOXP2 gene was the only gene involved in the switch to an autonomously vocal system; rather, it was probably just the final step in a series of progressive changes.

Selective changes to the vocal tract, breathing, and cortical control of vocal language suggest that there must have been selective pressure to replace a system largely based on manual and facial gestures with one that could rely almost exclusively on vocalization, albeit with manual accompaniments. Why, then, would such pressure have existed? One factor may have been the greater energy requirements associated with gesture; we have anecdotal evidence from those attending courses in sign language that the instructors required regular massages in order to meet the sheer physical demands of sign-language expression. The physiological costs of speech, in contrast, are so low as to be nearly unmeasurable (Russell et al., 1998). The switch would also have allowed communication at night, or when speakers and listeners are out of visual contact, and would have freed the hands for other activities, including the use and manufacture of tools.
Vocal language allows people to speak and use tools at the same time, leading perhaps to pedagogy (Corballis, 2002). This may well have been one of the factors underlying the so-called human revolution, which we discuss next.

4.2. The human revolution

The human revolution (Mellars and Stringer, 1989) refers to the dramatic appearance of more sophisticated tools, bodily ornamentation, art, and perhaps music, dating from some 40,000 years ago in Europe, and probably earlier in Africa (McBrearty and Brooks, 2000; Oppenheimer, 2003). Despite some imprecision in the estimates of the dates of both the FOXP2 mutation and the human revolution, the two dates are fairly close, suggesting that the mutation of the FOXP2 gene may have been the final step in the evolution of autonomous speech. This raises the possibility that the final incorporation of vocalization into the mirror system was critical to the emergence of modern human behavior in the Upper Paleolithic (Corballis, 2004b).

The human revolution is more commonly attributed to the emergence of symbolic language itself than to the emergence of speech (e.g., Klein et al., 2004; Mellars, 2004). This implies that language must have evolved very late, and quite suddenly, in hominid evolution. Some have associated it with the arrival of our own species, H. sapiens, about 170,000 years ago. Bickerton (1995), for example, writes that "true language, via the emergence of syntax, was a catastrophic event, occurring within the first few generations of Homo sapiens sapiens." Crow (2002) has similarly proposed that the emergence of language was part of the speciation event that gave rise to H. sapiens. Associating the emergence of language with the human revolution suggests that language may have emerged even later, as proposed by Klein et al. (2004), although there is still debate over the extent and time frame of the human revolution (e.g., McBrearty and Brooks, 2000).

Given the complexity of syntax, still not fully understood by linguists, it seems unlikely that these "big bang" theories of language evolution can be correct. It seems much more likely that language evolved incrementally, perhaps beginning with the emergence of the genus Homo around 2 million years ago. Pinker and Bloom (1990) argue, contrary to earlier views expressed by Chomsky (1975), that language evolved incrementally through natural selection, and Jackendoff (2002) has proposed a series of stages through which this might have occurred. In something of a change of stance for Chomsky, Hauser et al. (2002) have also highlighted a continuity between primate and human communication, again suggesting the gradual evolution of human language, although they do not consider the

possibility that language evolved from manual and facial gestures, nor do they speculate as to precisely when the uniquely human component (what they call the "faculty of language in the narrow sense") emerged in hominid evolution. If syntactic language evolved gradually over the past 2 million years, then it seems reasonable to suppose that it was already well developed by the time H. sapiens appeared a mere 170,000 or so years ago. As we have seen, it now seems likely that the FOXP2 gene has to do with oral motor control rather than with syntax.

One may question whether the switch to a fully autonomous vocal language could have brought about an effect as apparently profound as the human revolution. As noted above, speech would have freed the hands, enhancing pedagogy, which itself may be a uniquely human characteristic (e.g., Csibra and Gergely, in press). More generally, changes in the medium of communication have had deep influences on our material culture. Without the advent of writing, and the later development of mathematical notation, for example, we would surely not have had modern contrivances such as the automobile or the supersonic jet. The Internet may well prove to have comparable effects. We suggest, then, that the switch from a manuo-facial to a vocal means of communication would have especially enhanced material culture, including the manufacture and use of tools. Indeed, it is primarily in material culture that the human revolution is manifest, whereas the earlier evolution of language itself may have been expressed in, and perhaps driven by, complex social interaction, or what has been called cultural cognition (Tomasello et al., 2005). The social component may be less visible in the archeological record. The human revolution may therefore give a false impression of the evolution of the human mind itself.

5. Conclusion

In conclusion, a system based on iconic and progressively symbolic gestures evolved from an initial gestural communication system based on pantomimes of actions. Grammar might have evolved as sequences of hand and arm gestures increased in complexity. In line with Corballis's (2002) proposal, at the various stages of evolution arm postures were integrated with mouth articulation postures by the double hand/mouth command system. Autonomy of speech from the arm-gesture communication system, at least to the point that language can be understood through speech alone, was probably reached when the alterations of the vocal tract and of vocal control necessary for articulate speech were completed. Only at this point could the signal be carried autonomously by the vocal system. This stage may not have been reached until the emergence of our own species, H. sapiens, and may have been facilitated by the mutation of the FOXP2 gene within the past 100,000 years.

Acknowledgments

This work was supported by a grant from MIUR (Ministero dell'Istruzione Universitaria e della Ricerca) to M.G. We thank Karen Emmorey, Michael Studdert-Kennedy, and Len Talmy for helpful discussion, although they do not necessarily agree with our conclusions.

References

Ajello, R., Mazzoni, L., Nicolai, F., 2001. Linguistic gestures: mouthing in Italian Sign Language (LIS). In: Sutton-Spence, R., Boyes-Braem, P. (Eds.), The Hands are the Head of the Mouth: The Mouth as Articulator in Sign Language. Signum-Verlag, Hamburg, Germany, pp. 231–246.
Arbib, M.A., 2005. From monkey-like action recognition to human language: an evolutionary framework for neurolinguistics. Behavioral and Brain Sciences 28, 105–168.
Armstrong, D.F., 1999. Original Signs: Gesture, Sign, and the Source of Language. Gallaudet University Press, Washington, DC.
Armstrong, D.F., Stokoe, W.C., Wilcox, S.E., 1995. Gesture and the Nature of Language. Cambridge University Press, Cambridge, MA.
Bates, E., Dick, F., 2002. Language, gesture, and the developing brain. Developmental Psychobiology 40, 293–310.
Bates, E., Snyder, L.S., 1987. The cognitive hypothesis in language development. In: Uzgiris, I.C., McVicker Hunt, J. (Eds.), Infant Performance and Experience: New Findings with the Ordinal Scales. University of Illinois Press, Urbana, IL, pp. 168–204.
Bergman, B., Wallin, L., 2001. A preliminary analysis of visual mouth segments in Swedish Sign Language. In: Sutton-Spence, R., Boyes-Braem, P. (Eds.), The Hands are the Head of the Mouth: The Mouth as Articulator in Sign Language. Signum-Verlag, Hamburg, Germany, pp. 51–68.
Bernardis, P., Gentilucci, M., 2006. Speech and gesture share the same communication system. Neuropsychologia 44, 178–190.
Bickerton, D., 1995. Language and Human Behavior. University of Washington Press, Seattle, WA.
Brentari, D., 1998. A Prosodic Model of Sign Language Phonology. MIT Press, Cambridge, MA.
Browman, C.P., Goldstein, L.F., 1995. Dynamics and articulatory phonology. In: van Gelder, T., Port, R.F. (Eds.), Mind as Motion. MIT Press, Cambridge, MA, pp. 175–193.
Buccino, G., Binkofski, F., Fink, G.R., Fadiga, L., Fogassi, L., Gallese, V., et al., 2001. Action observation activates premotor and parietal areas in a somatotopic manner: an fMRI study. European Journal of Neuroscience 13, 400–404.
Burling, R., 1999. Motivation, conventionalization, and arbitrariness in the origin of language. In: King, B.J. (Ed.), The Origins of Language: What Nonhuman Primates Can Tell Us. School of American Research Press, Santa Fe, NM, pp. 307–350.
Chomsky, N., 1975. Reflections on Language. Pantheon, New York.
de Condillac, E.B., 1971. An Essay on the Origin of Human Knowledge: Being a Supplement to Mr. Locke's Essay on the Human Understanding (a facsimile reproduction of the 1756 translation by T. Nugent of Condillac's 1747 essay). Scholars' Facsimiles and Reprints, Gainesville, FL.
Corballis, M.C., 1992. On the evolution of language and generativity. Cognition 44, 197–226.
Corballis, M.C., 2002. From Hand to Mouth: The Origins of Language. Princeton University Press, Princeton, NJ.
Corballis, M.C., 2004a. FOXP2 and the mirror system. Trends in Cognitive Sciences 8, 95–96.
Corballis, M.C., 2004b. The origins of modernity: was autonomous speech the critical factor? Psychological Review 111, 543–552.

ARTICLE IN PRESS M. Gentilucci, M.C. Corballis / Neuroscience and Biobehavioral Reviews 30 (2006) 949 960 959 The Speciation of Modern Homo Sapiens. Oxford University Press, Oxford, UK, pp. 197 216. Csibra, G., Gergely, G., in press. Social learning and social cognition: The case for pedagogy. In: Johnson, M.H., Munakata, Y. (Eds.), Processes of Change in Brain and Cognitive Development. Attention and Performance XXI. Oxford University Press, Oxford, UK. Decety, J., Grezes, J., Costes, N., Perani, D., Jeannerod, M., Procyk, E., et al., 1997. Brain activity during observation of actions. Influence of action content and subject s strategy. Brain 120, 1763 1777. Donald, M., 1991. Origins of the Modern Mind. Harvard University Press, Cambridge, MA. Emmorey, K., 2002. Language, Cognition, and Brain: Insights from Sign Language Research. Erlbaum, Hillsdale, NJ. Enard, W., Przeworski, M., Fisher, S.E., Lai, C.S.L., Wiebe, V., Kitano, T., et al., 2002. Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418, 869 871. Fadiga, L., Fogassi, L., Pavesi, G., Rizzolatti, G., 1995. Motor facilitation during action observation a magnetic stimulation study. Journal of Neurophysiology 73, 2608 2611. Ferrari, P.F., Gallese, V., Rizzolatti, G., Fogassi, L., 2003. Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. European Journal of Neuroscience 17, 1703 1714. Fisher, S.E., Vargha-Khadem, F., Watkins, K.E., Monaco, A.P., Pembrey, M.E., 1998. Localisation of a gene implicated in a severe speech and language disorder. Nature Genetics 18, 168 170. Gallagher, H.L., Frith, C.D., 2004. Dissociable neural pathways for the perception and recognition of expressive and instrumental gestures. Neuropsychologia 42, 1725 1736. Gallese, V., Fadiga, L., Fogassi, L., Rizzolatti, G., 1996. Action recognition in the premotor cortex. Brain 119, 593 609. Gallese, V., Fadiga, L., Fogassi, L., Rizzolatti, G., 2002. Action representation and the inferior parietal lobule. In: Prinz, W., Hommel, B. (Eds.), Common Mechanisms in Perception and Action, Attention and Performance XIX, III Action perception and imitation. Oxford University Press, Oxford, UK, pp. 334 355. Gardner, R.A., Gardner, B.T., 1969. Teaching sign language to a chimpanzee. Science 165, 664 672. Gentilucci, M., 2003. Grasp observation influences speech production. European Journal of Neuroscience 17, 179 184. Gentilucci, M., Benuzzi, F., Gangitano, M., Grimaldi, S., 2001. Grasp with hand and mouth: a kinematic study on healthy subjects. Journal of Neurophysiology 86, 1685 1699. Gentilucci, M., Santunione, P., Roy, A.C., Stefanini, S., 2004a. Execution and observation of bringing a fruit to the mouth affect syllable pronunciation. European Journal of Neuroscience 19, 190 202. Gentilucci, M., Stefanini, S., Roy, A.C., Santunione, P., 2004b. Action observation and speech production: study on children and adults. Neuropsychologia 42, 1554 1567. Gentilucci, M., Bernardis, P., Crisi, G., Dalla Volta, R., in press. Repetitive transcranial stimulation of Broca s area affects verbal responses to gesture observation. Journal of Cognitive Neuroscience. Gerardin, E., Sirigu, A., Lehericy, S., Poline, J.B., Gaymard, B., Marsault, C., et al., 2000. Partially overlapping neural networks for real and imagined hand movements. Cerebral Cortex 10, 1093 1104. Givo` n, T., 1995. Functionalism and Grammar. Benjamins, Philadelphia, PA. Gopnik, M., 1990. Feature-blind grammar and dysphasia. 
Gopnik, M., 1990. Feature-blind grammar and dysphasia. Nature 344, 715.
Grafton, S.T., Arbib, M.A., Fadiga, L., Rizzolatti, G., 1996. Localization of grasp representations in humans by positron emission tomography. 2. Observation compared with imagination. Experimental Brain Research 112, 103-111.
Grèzes, J., Costes, N., Decety, J., 1998. Top-down effect of strategy on the perception of human biological motion: a PET investigation. Cognitive Neuropsychology 15, 553-582.
Hanakawa, T., Immisch, I., Toma, K., Dimyan, M.A., Van Gelderen, P., Hallett, M., 2003. Functional properties of brain areas associated with motor execution and imagery. Journal of Neurophysiology 89, 989-1002.
Hari, R., Forss, N., Avikainen, S., Kirveskari, E., Salenius, S., Rizzolatti, G., 1998. Activation of human primary motor cortex during action observation: a neuromagnetic study. Proceedings of the National Academy of Sciences of the USA 95, 15061-15065.
Hauser, M.D., Chomsky, N., Fitch, W.T., 2002. The faculty of language: what is it, who has it, and how did it evolve? Science 298, 1569-1579.
Hayes, C., 1952. The Ape in Our House. Gollancz, London.
Hepp-Reymond, M.-C., 1988. Functional organization of motor cortex and its participation in voluntary movements. In: Steklis, H.D., Erwin, J. (Eds.), Comparative Primate Biology, Vol. 4: Neurosciences. Alan R. Liss, New York, pp. 501-624.
Hewes, G.W., 1973. Primate communication and the gestural origins of language. Current Anthropology 14, 5-24.
Hockett, C.F., 1960. The origin of speech. Scientific American 203 (3), 88-96.
Iacoboni, M., Woods, R.P., Brass, M., Bekkering, H., Mazziotta, J.C., Rizzolatti, G., 1999. Cortical mechanisms of human imitation. Science 286, 2526-2528.
Ingman, M., Kaessmann, H., Pääbo, S., Gyllensten, U., 2000. Mitochondrial genome variation and the origin of modern humans. Nature 408, 708-713.
Iverson, J.M., Goldin-Meadow, S., 2005. Gesture paves the way for language development. Psychological Science 16, 367-371.
Jackendoff, R., 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford University Press, Oxford, UK.
Joos, M., 1948. Acoustic Phonetics. Language Monograph No. 23. Linguistic Society of America, Baltimore, MD.
Klein, R.G., Avery, G., Cruz-Uribe, K., Halkett, D., Parkington, J.E., Steele, T., et al., 2004. The Ysterfontein 1 Middle Stone Age site, South Africa, and early human exploitation of coastal resources. Proceedings of the National Academy of Sciences of the USA 101, 5708-5715.
Kohler, E., Keysers, C., Umiltà, M.A., Fogassi, L., Gallese, V., Rizzolatti, G., 2002. Hearing sounds, understanding actions: action representation in mirror neurons. Science 297, 846-848.
Kuhtz-Buschbeck, J.P., Mahnkopf, C., Holzknecht, C., Siebner, H., Ulmer, S., Jansen, O., 2003. Effector-independent representations of simple and complex imagined finger movements: a combined fMRI and TMS study. European Journal of Neuroscience 18, 3375-3387.
Lai, C.S., Fisher, S.E., Hurst, J.A., Vargha-Khadem, F., Monaco, A.P., 2001. A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413, 519-523.
Liberman, A.M., Cooper, F.S., Shankweiler, D.P., Studdert-Kennedy, M., 1967. Perception of the speech code. Psychological Review 74, 431-461.
Liddell, S., Johnson, R., 1989. American Sign Language: the phonological base. Sign Language Studies 64, 197-277.
Lieberman, D., 1998. Sphenoid shortening and the evolution of modern human cranial shape. Nature 393, 158-162.
Lieberman, P., Crelin, E.S., Klatt, D.H., 1972. Phonetic ability and related anatomy of the newborn and adult human, Neanderthal man, and the chimpanzee. American Anthropologist 74, 287-307.
Liégeois, F., Baldeweg, T., Connelly, A., Gadian, D.G., Mishkin, M., Vargha-Khadem, F., 2003. Language fMRI abnormalities associated with FOXP2 gene mutation. Nature Neuroscience 6, 1230-1237.
MacLarnon, A., Hewitt, G., 1999. The evolution of human speech: the role of enhanced breathing control. American Journal of Physical Anthropology 109, 341-363.
MacLarnon, A., Hewitt, G., 2004. Increased breathing control: another factor in the evolution of human language. Evolutionary Anthropology 13, 181-197.
MacNeilage, P.F., 1998. The frame/content theory of evolution of speech production. Behavioral and Brain Sciences 21, 499-546.
Masataka, N., 2001. Why early linguistic milestones are delayed in children with Williams syndrome: late onset of hand banging as a possible rate-limiting constraint on the emergence of canonical babbling. Developmental Science 4, 158-164.
McBrearty, S., Brooks, A.S., 2000. The revolution that wasn't: a new interpretation of the origin of modern human behavior. Journal of Human Evolution 39, 453-563.
McNeill, D., 1992. Hand and Mind: What Gestures Reveal about Thought. University of Chicago Press, Chicago, IL.
Mellars, P.A., 2004. Neanderthals and the modern human colonization of Europe. Nature 432, 461-465.
Mellars, P.A., Stringer, C.B. (Eds.), 1989. The Human Revolution: Behavioural and Biological Perspectives on the Origins of Modern Humans. Edinburgh University Press, Edinburgh.
Murata, A., Fadiga, L., Fogassi, L., Gallese, V., Raos, V., Rizzolatti, G., 1997. Object representation in the ventral premotor cortex (area F5) of the monkey. Journal of Neurophysiology 78, 2226-2230.
Muthukumaraswamy, S.D., Johnson, B.W., McNair, N.A., 2004. Mu rhythm modulation during observation of an object-directed grasp. Cognitive Brain Research 19, 195-201.
Neidle, C., Kegl, J., MacLaughlin, D., Bahan, B., Lee, R.G., 2000. The Syntax of American Sign Language. MIT Press, Cambridge, MA.
Oppenheimer, S., 2003. Out of Eden: The Peopling of the World. Constable, London.
Paget, R., 1930. Human Speech: Some Observations, Experiments and Conclusions as to the Nature, Origin, Purpose and Possible Improvement of Human Speech. Kegan Paul, Trench, Trubner & Co., New York, NY.
Parsons, L.M., Fox, P.T., Downs, J.H., Glass, T., Hirsch, T.B., Martin, C.C., et al., 1995. Use of implicit motor imagery for visual shape discrimination as revealed by PET. Nature 375, 54-58.
Pinker, S., 1994. The Language Instinct. Morrow, New York.
Pinker, S., 2003. Language as an adaptation to the cognitive niche. In: Christiansen, M.H., Kirby, S. (Eds.), Language Evolution. Oxford University Press, Oxford, UK, pp. 16-37.
Pinker, S., Bloom, P., 1990. Natural language and natural selection. Behavioral and Brain Sciences 13, 707-784.
Ploog, D., 2002. Is the neural basis of vocalisation different in non-human primates and Homo sapiens? In: Crow, T.J. (Ed.), The Speciation of Modern Homo Sapiens. Oxford University Press, Oxford, UK, pp. 121-135.
Rizzolatti, G., Arbib, M.A., 1998. Language within our grasp. Trends in Neurosciences 21, 188-194.
Rizzolatti, G., Luppino, G., 2001. The cortical motor system. Neuron 31, 889-901.
Rizzolatti, G., Camarda, R., Fogassi, L., Gentilucci, M., Luppino, G., Matelli, M., 1988. Functional organization of inferior area 6 in the macaque monkey. II. Area F5 and the control of distal movements. Experimental Brain Research 71, 491-507.
Rizzolatti, G., Fadiga, L., Gallese, V., Fogassi, L., 1996. Premotor cortex and the recognition of motor actions. Cognitive Brain Research 3, 131-141.
Rizzolatti, G., Fogassi, L., Gallese, V., 2001. Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews Neuroscience 2, 661-670.
Rousseau, J.J., 1775/1964. Discours sur l'origine et les fondements de l'inégalité parmi les hommes [Discourse on the origin and foundations of inequality among men]. In: Gagnebin, B., Raymond, M. (Eds.), Oeuvres Complètes, Vol. 3. Gallimard, Paris.
Ruben, R.J., 2005. Sign language: its history and contribution to the understanding of the biological nature of language. Acta Oto-Laryngologica 125, 464-467.
Russell, B.A., Cerny, F.J., Stathopoulos, E.T., 1998. Effects of varied vocal intensity on ventilation and energy expenditure in women and men. Journal of Speech, Language and Hearing Research 41, 239-248.
Sandler, W., 1989. Phonological Representation of the Sign: Linearity and Nonlinearity in American Sign Language. Foris Publications, Dordrecht, The Netherlands.
Savage-Rumbaugh, S., Shanker, S.G., Taylor, T.J., 1998. Apes, Language, and the Human Mind. Oxford University Press, New York.
Stokoe, W.C., 1960. Sign Language Structure: An Outline of the Visual Communication Systems of the American Deaf. Linstok Press, Silver Spring, MD.
Stokoe, W.C., 1991. Semantic phonology. Sign Language Studies 71, 107-114.
Studdert-Kennedy, M., 2005. How did language go discrete? In: Tallerman, M. (Ed.), Language Origins: Perspectives on Evolution. Oxford University Press, Oxford, UK, pp. 48-67.
Sutton-Spence, R., Day, L., 2001. Mouthings and mouth gestures in British Sign Language. In: Boyes-Braem, P., Sutton-Spence, R. (Eds.), The Hands are the Head of the Mouth: The Mouth as Articulator in Sign Languages. Signum-Verlag, Hamburg, Germany, pp. 69-86.
Talmy, L., in press. Recombinance in the evolution of language. In: Cihlar, J.E., Kaiser, D., Kimbara, I., Franklin, A. (Eds.), Proceedings of the 39th Annual Meeting of the Chicago Linguistic Society. Chicago Linguistic Society, Chicago, IL.
Thompson, R., Emmorey, K., Gollan, T.H., 2005. "Tip of the fingers" experiences by deaf signers. Psychological Science 16, 856-860.
Tomasello, M., Carpenter, M., Call, J., Behne, T., Moll, H., 2005. Understanding and sharing intentions: the origins of cultural cognition. Behavioral and Brain Sciences 28, 635-673.
Van der Hulst, H., 1993. Units in the analysis of signs. Phonology 10, 209-241.
Van Hooff, J.A.R.A.M., 1962. Facial expressions in higher primates. Symposia of the Zoological Society of London 8, 97-125.
Van Hooff, J.A.R.A.M., 1967. The facial displays of the catarrhine monkeys and apes. In: Morris, D. (Ed.), Primate Ethology. Weidenfeld and Nicolson, London, pp. 7-68.
Vargha-Khadem, F., Watkins, K.E., Alcock, K.J., Fletcher, P., Passingham, R., 1995. Praxic and nonverbal cognitive deficits in a large family with a genetically transmitted speech and language disorder. Proceedings of the National Academy of Sciences of the USA 92, 930-933.
Vico, G.B., 1953/1744. La Scienza Nuova [The New Science]. Laterza, Bari.
Volterra, V., 2004/1987. La lingua italiana dei segni [The Italian Sign Language]. Il Mulino, Bologna.
Volterra, V., Bates, E., Benigni, L., Bretherton, I., Camaioni, L., 1979. First words in language and action: a qualitative look. In: Bates, E., Benigni, L., Bretherton, I., Camaioni, L., Volterra, V. (Eds.), The Emergence of Symbols: Cognition and Communication in Infancy. Academic Press, New York, pp. 141-222.
Volterra, V., Caselli, M.C., Capirci, O., Pizzuto, E., 2005. Gesture and the emergence and development of language. In: Tomasello, M., Slobin, D.I. (Eds.), Beyond Nature-Nurture: Essays in Honor of Elizabeth Bates. Lawrence Erlbaum Associates, Mahwah, NJ, pp. 3-40.
Warren, R.M., Obusek, C.J., Farmer, R.M., Warren, R.P., 1969. Auditory sequence: confusion of patterns other than speech or music. Science 164, 586-587.
Watkins, K.E., Dronkers, N.F., Vargha-Khadem, F., 2002a. Behavioural analysis of an inherited speech and language disorder: comparison with acquired aphasia. Brain 125, 452-464.
Watkins, K.E., Vargha-Khadem, F., Ashburner, J., Passingham, R.E., Connelly, A., Friston, K.J., et al., 2002b. MRI analysis of an inherited speech and language disorder: structural brain abnormalities. Brain 125, 465-478.
Woll, B., 2002. The sign that dares to speak its name: echo phonology in British Sign Language (BSL). In: Boyes-Braem, P., Sutton-Spence, R. (Eds.), The Hands are the Head of the Mouth: The Mouth as Articulator in Sign Languages. Signum-Verlag, Hamburg, Germany, pp. 87-98.