Noam Chomsky: Aspects of the Theory of Syntax
Notes by Julia Krysztofiak
May 16, 2006

1 Methodological preliminaries

1.1 Generative grammars as theories of linguistic competence

The study is concerned with the syntactic component of generative grammar, i.e. with the rules that specify the well-formed strings of minimal syntactically functioning units (formatives) and assign structural information to these strings and to strings that deviate from well-formedness.

Distinction between competence (the speaker-hearer's knowledge of the language) and performance (the actual use of language in concrete situations).

Aim of grammar: to assign to each of an infinite range of sentences a structural description that indicates how the sentence is understood by the ideal speaker-hearer.

1.2 Toward a theory of performance

Acceptable refers to utterances that are perfectly natural and immediately comprehensible without paper-and-pencil analysis, and in no way bizarre or outlandish. The term is not to be confused with grammatical: acceptability belongs to the study of performance, grammaticalness to the study of competence.

The most obvious formal property of utterances is their bracketing:

1. Nested constructions: [I called [the man who wrote [the book that you told me about]]]
2. Self-embedded constructions: [who the boy [who the students recognized] pointed out] (nested and of the same type)
3. Multiple-branching constructions: [[John], [Bill], [Tom] visited us last night]
4. Left-branching constructions: [[[[John's] brother's] father's] uncle]
5. Right-branching constructions: [this is [the cat that caught [the rat that stole [the cheese that stinks]]]]

The most acceptable constructions are the multiple-branching ones, while nesting and self-embedding contribute to unacceptability. This follows from a hypothesis about the limitations of memory: the perceptual device has a stock of analytic procedures available to it, and it cannot utilize a procedure φ while still executing φ.

1.3 The organization of generative grammar

Generative grammar is a system able to iterate, i.e. to generate indefinitely many structures. The phonological component determines the phonetic form of the structures generated by the syntactic rules. The semantic component determines the semantic interpretation of a sentence. Each component utilizes information provided by the syntactic component.

The syntactic component of a grammar must specify, for each sentence, a deep structure that determines its semantic interpretation and a surface structure that determines its phonetic interpretation. D-structure and S-structure are not identical. The idea of generative grammar is that they are distinct and that the S-structure is determined by repeated application of certain formal operations (grammatical transformations) to objects of a more elementary sort. If so, then the syntactic component must generate both S- and D-structures for each sentence and must interrelate them.

Base of the syntactic component := system of rules that generate a highly restricted set of basic strings, each with a base Phrase-marker.

Base Phrase-marker := structural description of a basic string; the elementary unit of which D-structure is constituted.

Basis of a sentence := a sequence of base Phrase-markers underlying a sentence, generated by the base of the syntactic component.
Transformational subcomponent := system of rules that generates a sentence with its S-structure from its basis.

Kernel sentences := sentences that involve a minimum of transformational apparatus in their generation.

1.4 Justification of grammars

A grammar can be justified on two levels: the level of descriptive adequacy (it correctly describes the linguistic facts) and the level of explanatory adequacy (the linguistic theory with which it is associated selects this grammar over others that are also descriptively adequate; this problem touches the question of a theory of language acquisition and an account of the innate abilities that make acquisition possible).

1.5 Formal and substantive universals

What are the initial assumptions concerning the nature of language that a child brings to language learning, and how detailed is the innate schema (the general definition of "grammar") that gradually becomes more explicit and differentiated as the child learns the language?

Traditional grammar makes claims about substantive universals (e.g. Jakobson), namely that certain items of a particular kind must be drawn in any language from a fixed class of items (e.g. phonetic features, syntactic categories, specific objects distinguished by vocabulary) common to all languages. Generative grammar makes claims about formal universals, e.g. that the grammars of all languages share transformational rules which map semantically interpreted D-structures into phonetic structures.

1.6 Further remarks on descriptive and explanatory theories

Requirements for a linguistic theory that aims at explanatory adequacy are to provide:

1. an enumeration of the class s_1, s_2, ... of possible sentences
2. an enumeration of the class SD_1, SD_2, ... of possible structural descriptions (each structural description uniquely specifies a sentence, but not conversely)
3. an enumeration of the class G_1, G_2, ... of possible generative grammars
4. specification of a function f such that SD_f(i,j) is the structural description assigned to sentence s_i by grammar G_j, for any i, j
5. specification of a function m such that m(i) is an integer associated with grammar G_i as its value (with, let us say, lower value indicated by higher number): a method of evaluating possible grammars

A theory that satisfies only conditions (1)-(4) is at most descriptively adequate.

1.7 On evaluation procedures

Simplicity is not a good measure for evaluating grammars; simplicity is an empirical matter. The main problem in constructing an evaluation measure is that of determining which generalizations about language are significant ones; an evaluation measure must be selected in such a way as to favor these.

1.8 Linguistic theory and language learning

A theorist of language is given an empirical pairing of collections of primary linguistic data and associated grammars, constructed by a device on the basis of such data. The goal for the theorist is to determine the intrinsic properties of a device that, given the linguistic data as input, returns the associated grammar as output.

1.9 Generative capacity and its linguistic relevance

A distinction between the weak generative capacity and the strong generative capacity of a theory of language structure: a grammar weakly generates a set of sentences and strongly generates a set of structural descriptions. Let T be a linguistic theory that provides a class of grammars G_1, G_2, ..., where G_i weakly generates the language L_i and strongly generates the system of structural descriptions Σ_i. Then the class {L_1, L_2, ...} constitutes the weak generative capacity and the class {Σ_1, Σ_2, ...} constitutes the strong generative capacity of the theory T. A linguistic theory is descriptively adequate if its strong generative capacity includes a system of structural descriptions for every natural language.
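The weak/strong distinction in 1.9 can be made concrete with a toy construction of my own (not from the text): two grammars that weakly generate the same language a, aa, aaa, ... but strongly generate different structural descriptions, one right-branching and one left-branching.

```python
# Sketch (illustrative assumption, not Chomsky's formalism): a grammar is a dict
# mapping the nonterminal "S" to a list of right-hand sides (tuples of symbols).

def derive(rules, depth):
    """All (terminal string, bracketed tree) pairs derivable from S in <= depth expansions."""
    if depth == 0:
        return []
    results = []
    for rhs in rules["S"]:
        # expand each symbol: terminals stand for themselves, "S" recurses
        expansions = [[("a", "a")] if sym == "a" else derive(rules, depth - 1)
                      for sym in rhs]
        combos = [[]]                       # cartesian product of the expansions
        for options in expansions:
            combos = [c + [o] for c in combos for o in options]
        for combo in combos:
            string = "".join(s for s, _ in combo)
            tree = "[S " + " ".join(t for _, t in combo) + "]"
            results.append((string, tree))
    return results

g1 = {"S": [("a",), ("a", "S")]}            # right-branching: [S a [S a]]
g2 = {"S": [("a",), ("S", "a")]}            # left-branching:  [S [S a] a]

weak1 = {s for s, _ in derive(g1, 3)}
weak2 = {s for s, _ in derive(g2, 3)}
strong1 = {t for _, t in derive(g1, 3)}
strong2 = {t for _, t in derive(g2, 3)}

print(weak1 == weak2)      # True: same set of sentences (weakly equivalent)
print(strong1 == strong2)  # False: different structural descriptions
```

Since the two grammars agree on every string but disagree on the trees assigned, a theory that could only distinguish them by weak generative capacity would miss exactly what 1.9 says matters for descriptive adequacy.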
2 Categories and relations in syntactic theory

2.1 The scope of the base

Traditional grammar describes sentences in terms of VP (Verb Phrase), NP (Noun Phrase), Object, Subject, the Subject-Verb relation, the Verb-Object relation, and so on. The topic of Chapter 2 is to answer the following questions: How may descriptions of this sort be formally presented in a structural description? How can such structural descriptions be generated by a system of explicit rules?

2.2 Aspects of Deep structure

2.2.1 Categorization

(Base) Phrase-marker := structure represented as a derivation tree.

Rewriting rules := the natural mechanism for generating Phrase-markers; rules of the form

    A → Z / X __ Y

meaning: category A is realized as the string Z when it is in an environment consisting of X to the left and Y to the right.

W-derivation of V := a sequence of strings in which W is the first and V is the last string, each string of the sequence derived from the preceding one by application of a rewriting rule.

Terminal string := V is a terminal string if there is an #S#-derivation of #V#, where S is the initial symbol of the grammar (representing "sentence") and # is the boundary symbol (regarded as a grammatical formative).

Phrase structure grammar := grammar with an unordered set of rewriting rules.

Context-free grammar := grammar whose rewriting rules have X and Y null.

Sequential derivation := derivation formed by a series of rule applications that preserve the ordering of the rules within the grammar. Suppose that the grammar consists of the sequence of rules R_1, ..., R_n and that the sequence #S#, #X_1#, ..., #X_m# is a derivation of the terminal string #X_m#. For this to be a sequential derivation it must hold that if rule R_i was used to form line #X_j# from the line
that precedes it, then no rule R_k, for k > i, can have been used to form a line #X_l#, for l < j, from the line #X_{l-1}#.

Note: only sequential derivations are generated by the sequence of rules constituting this part of the base.

2.2.2 Functional notions

Functional notions like Subject and Predicate should be distinguished from categorial notions like NP, VP, Verb. The idea is that we can reduce functional notions to (define them in terms of) configurations of major categories.

Major category := a category that dominates a string ...X..., where X is a lexical category.
Lexical category := a category which appears on the left in a lexical rule.
Lexical rule := a rule that introduces a lexical formative (e.g. N → boy).

Grammatical relation: a substring U of W bears the grammatical relation [B, A] to a substring V of W if V is dominated by a node A which directly dominates YBZ, and U is dominated by the occurrence of B in YBZ (Y and Z possibly null). Examples of grammatical relations: Subject-of: [NP, S]; Predicate-of: [VP, S]; Direct-Object-of: [NP, VP]; Main-Verb-of: [V, VP].

2.2.3 Syntactic features

The problem. Traditional grammar may describe sentences in terms of Abstract Nouns, Count Nouns, Animate Nouns, Transitive Verbs, Aspects, and so on. The problem of presentation: to what extent should such subcategorization be provided by the syntactic component? The problem of justification: are semantic considerations relevant (and to what extent) in determining such subcategorizations?

Although the question of justification is beyond the scope of the discussion, one must be aware that not all aberrations of grammaticalness come from the syntax ("the boy elapsed", "John is owning a house"). Semantic and syntactic considerations can't be sharply distinguished. However, a descriptively adequate grammar must account for phenomena like these in terms of the structural descriptions provided by its syntactic and semantic components.
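The machinery of 2.2.1-2.2.2 can be sketched concretely. In this toy example (the grammar, the sentence, and the helper names are my own illustrative assumptions), a Phrase-marker is a nested tuple and grammatical relations are read off configurationally, exactly as in the [B, A] definition above:

```python
# A Phrase-marker as a nested tuple (category, child, child, ...);
# leaves are plain strings (formatives).
TREE = ("S",
        ("NP", ("Det", "the"), ("N", "boy")),
        ("VP", ("V", "reads"),
               ("NP", ("Det", "the"), ("N", "book"))))

def terminals(node):
    """The terminal string dominated by a node."""
    if isinstance(node, str):
        return node
    return " ".join(terminals(child) for child in node[1:])

def bearers(tree, relation):
    """All substrings bearing the grammatical relation [B, A]:
    strings dominated by a B that is directly dominated by an A."""
    b, a = relation
    found = []
    def walk(node):
        if isinstance(node, str):
            return
        if node[0] == a:
            for child in node[1:]:
                if not isinstance(child, str) and child[0] == b:
                    found.append(terminals(child))
        for child in node[1:]:
            walk(child)
    walk(tree)
    return found

print(bearers(TREE, ("NP", "S")))   # Subject-of: ['the boy']
print(bearers(TREE, ("NP", "VP")))  # Direct-Object-of: ['the book']
print(bearers(TREE, ("V", "VP")))   # Main-Verb-of: ['reads']
```

Note how the same category label NP yields different relations depending on configuration: the NP directly under S is the Subject, the NP directly under VP the Direct Object, which is the point of defining functional notions configurationally rather than as primitive categories.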
As to the question of presentation, one must determine whether the burden of providing structural descriptions of this sort should fall on the syntactic or rather the semantic component of a grammar. Chomsky inclines toward the view that such descriptions fall under the syntactic component and that sentences like those given are assigned Phrase-markers only by virtue of a relaxation of certain syntactic conditions.

Some formal similarities between syntax and phonology. Information in terms of animate, count, abstract, and so on can be presented in explicit rules. However, the subcategorization, e.g. of nouns, is not branching but cross-classifying. This is similar to the situation in phonology. In phonology each formative is represented as a distinctive-feature matrix in which the columns stand for the segments of the formative and the rows for particular features; the specification of each feature with respect to a segment occurs at the intersection. Features can be positively specified, negatively specified, or unspecified. Thanks to this method of feature specification one can formulate a rule:

    [+continuant] → [+voiced] / __ [+voiced]

Such a rule converts [sm] into [zm] and [fd] into [vd], but does not affect [pd].

Similarly one can ascribe features like [human], [common] or [animate] to nouns. Then the subcategorization rules look as follows:

1. N → [+N, ±Common]
2. [+Common] → [±Count]
3. [+Count] → [±Animate]
4. [-Common] → [±Animate]
5. [+Animate] → [±Human]
6. [-Count] → [±Abstract]

Such rules are reducible to even simpler ones. Moreover, one can develop a complex system of lexical categories without admitting divisions like those proposed by traditional grammar.

General structure of the base component. In addition to the rewriting rules that apply to category symbols, there are rewriting rules that apply to lexical category symbols and that operate on complex symbols, i.e. sets of specified features. The base component contains a lexicon.
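The voicing-assimilation rule above can be sketched as an operation on feature matrices. In this sketch a segment is a dictionary of feature values, specified only for the two features the rule mentions (the segment inventory is an illustrative assumption):

```python
# Segments as partial distinctive-feature matrices.
s = {"continuant": True,  "voiced": False}   # [s]
f = {"continuant": True,  "voiced": False}   # [f]
p = {"continuant": False, "voiced": False}   # [p]
m = {"continuant": False, "voiced": True}    # [m]
d = {"continuant": False, "voiced": True}    # [d]

def apply_rule(segments):
    """[+continuant] -> [+voiced] / __ [+voiced]:
    a continuant becomes voiced immediately before a voiced segment."""
    out = [dict(seg) for seg in segments]    # copy; don't mutate the input
    for i in range(len(out) - 1):
        if out[i]["continuant"] and out[i + 1]["voiced"]:
            out[i]["voiced"] = True
    return out

print(apply_rule([s, m])[0]["voiced"])  # True:  [sm] -> [zm]
print(apply_rule([f, d])[0]["voiced"])  # True:  [fd] -> [vd]
print(apply_rule([p, d])[0]["voiced"])  # False: [pd] unchanged, [p] is [-continuant]
```

The rule mentions only the features [continuant] and [voiced], so it applies to any segment pair matching that description; this is the economy that feature notation buys over listing segment-by-segment changes.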
Lexicon := an unordered list of lexical entries.

Lexical entry := a pair <D, C>, where D is a phonological distinctive-feature matrix "spelling" a certain lexical formative and C is a collection of specified syntactic features.

Such a system of rewriting rules generates derivations of preterminal strings.

Preterminal string := string that consists of grammatical formatives and complex symbols.

Terminal string := string formed from a preterminal string by insertion of a lexical formative according to the rule: if Q is a complex symbol of a preterminal string and <D, C> is a lexical entry, where C is not distinct from Q, then Q can be replaced by D.

Predication (relating a string to a category): in the terminal string formed by replacing the complex symbol Q by the formative D of the lexical entry <D, C>, the formative D is an [αF] (is dominated by [αF]) if [αF] is part of the complex symbol Q or of C, where α is either + or - and F is a feature.

Context-sensitive subcategorization rules

2.2.4 An illustrative fragment of the base component

2.2.5 Types of Base Rules

2.3 Deep structures and grammatical transformations

The basis of a sentence is mapped into the sentence by the transformational rules, which automatically assign to the sentence a derived Phrase-marker and, ultimately, a surface structure.

Transformation-marker := a history of the transformations applied to the basis of a sentence. The Deep structure of a sentence is given completely by its Transformation-marker.

Katz and Postal: the only contribution of transformations to semantic interpretation is that they combine the semantic interpretations of already interpreted Phrase-markers in a fixed way.

Definition of Deep structure: a generalized Phrase-marker M_D is a Deep structure of a sentence S with a well-formed surface structure M_S iff the transformational rules generate M_S from M_D. The surface structure M_S of S is well formed iff S contains no symbols indicating the blocking of obligatory transformations.
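The lexical-insertion rule stated in 2.2.3 above turns on "not distinct": two complex symbols are distinct iff some feature is specified with opposite values in each. A minimal sketch (the feature inventory and the lexical entries are illustrative assumptions):

```python
def distinct(c, q):
    """Complex symbols are distinct iff some shared feature has opposite values."""
    return any(c[feat] != q[feat] for feat in c.keys() & q.keys())

def insert(preterminal, lexicon):
    """Replace each complex symbol by the formative D of some entry <D, C>
    whose C is not distinct from it; grammatical formatives pass through."""
    result = []
    for q in preterminal:
        if isinstance(q, str):               # grammatical formative: keep as-is
            result.append(q)
            continue
        matches = [d for d, c in lexicon if not distinct(c, q)]
        result.append(matches[0] if matches else q)   # pick any non-distinct entry
    return result

lexicon = [
    ("boy",  {"N": True, "Common": True, "Count": True, "Animate": True, "Human": True}),
    ("book", {"N": True, "Common": True, "Count": True, "Animate": False}),
    ("sincerity", {"N": True, "Common": True, "Count": False, "Abstract": True}),
]

# a preterminal string: a grammatical formative plus a complex symbol
# demanding a [+Animate] common noun
q = {"N": True, "Common": True, "Animate": True}
print(insert(["the", q], lexicon))   # ['the', 'boy']
```

Note that "book" is excluded because it clashes on [Animate], while unspecified features never clash: an entry silent on [Animate] would also count as non-distinct from Q, which is exactly what the "not distinct" (rather than "identical") formulation allows.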
p. 141: "the grammar does not, in itself, provide any sensible procedure for finding a Deep structure of a given sentence, just as it provides no sensible procedure for finding a paraphrase to a given sentence. It merely defines these tasks in a sensible way."

The form of the grammar: syntactic component, semantic component, phonological component. The semantic and phonological components are purely interpretive; they play no role in the recursive generation of sentence structures. The syntactic component consists of a base component and a transformational component; the base component consists of a categorial subcomponent and a lexicon. The base component generates Deep structures. A Deep structure enters the semantic component and receives a semantic interpretation; it is then mapped by the transformational rules into a Surface structure, which enters the phonological component and is given a phonological interpretation. Looking the other way round: the grammar assigns semantic interpretations to signals, mediated by the recursive rules of the syntactic component.

Categorial subcomponent: a sequence of context-free rewriting rules.

2.4 Some residual problems

2.4.1 The boundaries of syntax and semantics

Degree of grammaticalness can be described in terms of violation of (or obedience to) selectional rules.

Note, p. 162: "it is clear that the manner of combination provided by the surface structure is in general totally irrelevant to semantic interpretation, whereas the grammatical relations expressed in the abstract deep structure are, in many cases, just those that determine the meaning of a sentence." (Example: quantifiers, in the "Chomsky and deep structure" debates.)

2.4.2 The structure of the lexicon