# Chomsky, Noam (1928–)

## 1. The aims and principles of linguistic theory

There is an intimate relation between how a problem is conceived and the kinds of explanations one should offer. Chomsky proposes that we identify explanation in linguistics with a solution to the problem of how children can attain mastery of their native languages on the basis of a rather slender database. This is often referred to as ‘the logical problem of language acquisition’.

A natural language assigns meanings to an unbounded number of sentences. Humans typically come to master at least one such language in a surprisingly short time, without conscious effort, explicit instruction or apparent difficulty. How is this possible? There are significant constraints on any acceptable answer.

First, a human can acquire any language if placed in the appropriate speech community. Grow up in Boston and one grows up speaking English the way Bostonians do. However, the ‘primary linguistic data’ (PLD) available to the child are unable to guide the task unaided. There are four kinds of problems with the data that prevent them from shaping the outcome:

1. The set of sentences the child is exposed to is finite. However, the knowledge attained extends over an unbounded domain of sentences.

2. The child is exposed not to sentences but to utterances of sentences. These are imperfect vehicles for the transmission of sentential information as they can be defective in various ways. Slurred speech, half sentences, slips of the tongue and mispronunciations are only a few of the ways that utterances can obscure sentence structure.

3. Acquisition takes place without explicit guidance by the speech community. This is so for a variety of reasons. Children do not make many errors to begin with when one considers the range of logically possible mistakes. Moreover, adults do not engage in systematic corrections of errors that do occur and even when correction is offered children seem neither to notice nor to care. At any rate, children seem surprisingly immune to any form of adult linguistic intrusion (see Lightfoot 1982).

4. Last of all, and most importantly, of the linguistic evidence theoretically available to the child, it is likely that only simple sentences are absorbed. The gap between input and intake is attributable to various cognitive limitations such as short attention span and limited memory. This implies that the acquisition process is primarily guided by the information available in well-formed simple sentences. Negative data (the information available in unacceptable ill-formed sentences) and complex data (the information yielded by complex constructions) are not among the PLD that guide the process of grammar acquisition. The child constructing its native grammar is limited to an informationally restricted subset of the relevant data. In contrast to the evidence that the linguist exploits in theory construction, the information the child uses in building its grammar is severely restricted. This suggests that whenever the linguistic properties of complex clauses diverge from simple ones, the acquisition of this knowledge cannot be driven by data. Induction is insufficient as the relevant information is simply unavailable in the PLD.

The general picture that emerges from these considerations is that attaining linguistic competence involves the acquisition of a grammar, and that humans come equipped with a rich innate system that guides the process of grammar construction. This system is supple enough to allow for the acquisition of any natural language grammar, yet rigid enough to guide the process despite the degeneracy and deficiency of the PLD. Linguistic theorizing takes the above facts as boundary conditions and aims both at descriptive adequacy (that is, to characterize the knowledge that speakers have of their native grammars) and explanatory adequacy (that is, to adumbrate the fine structure of the innate capacity) (see Language, innateness of).

Issues of descriptive and explanatory adequacy have loomed large in Chomsky’s work since the beginning. Chomsky’s objection, for example, to ‘Markov models’ of human linguistic competence was that they were incapable of dealing with long distance dependencies exemplified by conditional constructions in English and hence could not be descriptively adequate. His argument in favour of a transformational approach to grammar rested on the claim that it allowed for the statement of crucial generalizations evident in the judgments of native speakers and so advanced the goal of descriptive adequacy (Chomsky 1957). Similarly, his influential critique (1959) of Skinner’s Verbal Behavior consisted in showing that the learning theory presented therein was explanatorily inadequate. It was either too vague to be of scientific value or clearly incorrect given even moderately precise notions of stimulus or reinforcement.
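The point about Markov models and long-distance dependencies can be illustrated with a deliberately small sketch. It assumes a first-order (bigram) model over a two-sentence toy corpus; the corpus, the function names and the boundary markers `<s>` and `</s>` are illustrative assumptions, not part of Chomsky's argument, which concerns finite-state devices in general. Because the model only checks adjacent word pairs, it cannot record that ‘if’ calls for ‘then’ once other words intervene.

```python
from typing import Set, Tuple

Bigram = Tuple[str, str]

def bigrams(sentence: str) -> Set[Bigram]:
    """Return the set of adjacent word pairs, with sentence-boundary markers."""
    words = sentence.split()
    return set(zip(["<s>"] + words, words + ["</s>"]))

# Toy corpus (an assumption for illustration): two well-formed dependencies.
CORPUS = ["if it rains then we stay", "either it rains or we stay"]

# The model's entire "knowledge": every word pair attested in the corpus.
ALLOWED: Set[Bigram] = set()
for s in CORPUS:
    ALLOWED |= bigrams(s)

def accepts(sentence: str) -> bool:
    """Accept a sentence if and only if all of its adjacent pairs are attested."""
    return bigrams(sentence) <= ALLOWED

print(accepts("if it rains then we stay"))  # well-formed
print(accepts("if it rains or we stay"))    # ill-formed, yet every pair is attested
```

Both calls return `True`: the mismatched ‘if … or’ string is accepted because each local transition occurs somewhere in the corpus. A higher-order model would block this particular string, but the material separating the dependent words can be made arbitrarily long, which is the core of the objection.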

The shift from the early Syntactic Structures (1957) theory to the one in Aspects of the Theory of Syntax (1965) was also motivated by concerns of explanatory adequacy. In the earlier model the recursive application of transformations allows for the generation of more and more complex sentences from the sentences produced by the ‘phrase structure’ component of the grammar. In the Aspects theory, recursion is incorporated into the phrase structure component itself, and removed from the transformational part of the theory (see Syntax §3). The impetus for this was the observation that greater explanatory adequacy could be attained by grammars that had a level of ‘Deep Structure’ incorporating a recursive base component. In particular, Fillmore (1963) observed that the various optional transformations in a Syntactic Structures theory always applied in a particular order in any given derivation. This order is unexplained in a Syntactic Structures theory; in Aspects it is deduced. Thus, the move to an Aspects-style grammar is motivated on grounds of greater explanatory adequacy: introducing Deep Structure and moving recursion to the base allows for a more restricted theory of Universal Grammar (UG). All things being equal, restricting UG is always desirable as it advances a central goal of grammatical theory; the more restricted the options innately available for grammar construction, the easier it is to explain how language acquisition is possible, despite the difficulties in the PLD noted above.
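What it means for recursion to sit in the phrase structure component can be sketched with a toy example. The two rewrite rules in the comments and the vocabulary are assumptions chosen for illustration; they are not rules taken from Aspects. The only point is that a rule whose right-hand side reintroduces S yields an unbounded set of sentences from a finite rule set, with no transformation involved.

```python
from typing import List

def sentence(depth: int) -> List[str]:
    """Expand S, embedding a further clause under the verb `depth` times.

    Toy rules (illustrative assumptions):
        S  -> NP VP
        VP -> V          (base case: 'Mary left')
        VP -> V S        (recursive case: 'John thinks' + S)
    """
    if depth == 0:
        return ["Mary", "left"]
    # The recursive option reintroduces S inside VP.
    return ["John", "thinks"] + sentence(depth - 1)

for d in range(3):
    print(" ".join(sentence(d)))
```

Each increase in `depth` adds one level of clausal embedding, so the same two rules cover sentences of any length.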

The same logic motivates various later additions to and shifts in grammatical theory. For example, a major move in the 1970s was radically to simplify transformational operations so as to make their acquisition easier. This involves eliminating any mention of construction-specific properties from transformational rules. For example, an Aspects rule for passive constructions looks like (1), the left-hand side being the Structural Description (SD) and the right-hand side being the Structural Change (SC):

##### (1)
$\begin{array}{l}\mathrm{X} - \mathrm{NP1} - \mathrm{V} - \mathrm{NP2} - \mathrm{Y} \to \\ \mathrm{X} - \mathrm{NP2} - \mathrm{be}+\mathrm{en}\;\mathrm{V} - \mathrm{by}+\mathrm{NP1} - \mathrm{Y}\end{array}$

This rule would explain, for instance, the grammaticality of ‘the ball is kicked by John’ given that of ‘John kicks the ball’. Observe that the SC involves the constants ‘$\mathrm{be}+\mathrm{en}$’ and ‘by’. The SD mentions three general expressions, ‘NP1’, ‘V’ and ‘NP2’, and treats these as part of the context for the application of the rule. In place of this, Chomsky proposed eliminating the passive rule and replacing it with a more general rule that moves NPs (Chomsky 1977, 1986). The passive rule in (1) involves two applications of the ‘Move NP’ rule, one moving the subject ‘NP1’ to the ‘by’ phrase, and another moving the object ‘NP2’ to the subject position. In effect, all the elements that make the passive rule in (1) specific to transitive constructions are deleted and a simpler rule (‘Move NP’) replaces it.
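The mapping from SD to SC in rule (1) can be sketched as a function over a string that has already been factored into the five terms of the SD. The `Analysis` type and the function name are hypothetical, introduced only for illustration, and the sketch leaves ‘be+en’ as an abstract constant rather than modelling the morphology that spells out ‘be+en kick’ as ‘is kicked’.

```python
from typing import List, NamedTuple

class Analysis(NamedTuple):
    """A string factored to fit the SD of rule (1): X - NP1 - V - NP2 - Y."""
    x: str
    np1: str
    v: str
    np2: str
    y: str

def apply_passive(a: Analysis) -> List[str]:
    """Apply the SC of rule (1): X - NP2 - be+en V - by+NP1 - Y."""
    factors = [a.x, a.np2, "be+en", a.v, "by", a.np1, a.y]
    # Drop empty context terms (X or Y may be null).
    return [f for f in factors if f]

active = Analysis(x="", np1="John", v="kick", np2="the ball", y="")
print(" ".join(apply_passive(active)))
```

The constants ‘be+en’ and ‘by’ are written into the function body, which is exactly the construction-specific content that the later ‘Move NP’ proposal removes from the rule.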

There is a potential empirical cost to simple rules, however. The simpler a transformation, the more it generates unacceptable outputs. Thus, while a grammar with (1) would not derive ‘was jumped by John’ from ‘John jumped’, a grammar eschewing (1) and opting for the simpler ‘Move NP’ rule is not similarly restricted. To prevent overgeneration, therefore, the structure of UG must be enriched with general grammatical conditions that function to rein in the undesired outputs (Chomsky 1973, 1977, 1986). Chomsky has repeatedly emphasized the tension inherent in developing theories with both wide empirical coverage and reasonable levels of explanatory adequacy.
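The overgeneration contrast can be made concrete in a rough sketch. Both functions below are hypothetical simplifications: `rule_1` stands in for a construction-specific passive rule whose SD demands an object, and `move_np_only` stands in for an unconstrained grammar that merely relocates whatever NPs are present. Neither is offered as Chomsky's formalization, and ‘be+en’ is again left as an unanalysed constant.

```python
from typing import List, Optional

def rule_1(np1: str, v: str, np2: Optional[str]) -> Optional[List[str]]:
    """Construction-specific passive: applies only if the SD finds an object NP2."""
    if np2 is None:
        return None  # SD not met, so the rule cannot apply
    return [np2, "be+en", v, "by", np1]

def move_np_only(np1: str, v: str, np2: Optional[str] = None) -> List[str]:
    """Unconstrained sketch: move the subject into a 'by' phrase and
    front the object if there happens to be one. No SD is checked."""
    fronted = [np2] if np2 is not None else []
    return fronted + ["be+en", v, "by", np1]

print(rule_1("John", "jump", None))    # the specific rule declines to apply
print(move_np_only("John", "jump"))    # the general rule overgenerates
```

With the intransitive input, `rule_1` produces nothing, while `move_np_only` yields the analogue of ‘was jumped by John’. Blocking that output without reinstating the construction-specific SD is the work that the general conditions added to UG are meant to do.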

A high point of this research agenda is Chomsky’s Lectures on Government and Binding (1981). Here the transformational component is reduced to the extremely simple rule ‘Move α’ – that is, move anything anywhere. To ensure that this transformational liberty does not result in generative chaos, various additions to the grammar are incorporated, many conditions on grammatical operations and outputs are proposed, and many earlier proposals (by both Chomsky and others) are refined. Among these are trace theory, the binding theory, bounding theory, case theory, theta theory and the Empty Category Principle. The picture of the grammar that Chomsky’s Lectures presents is that of a highly modular series of interacting subsystems which in concert restrict the operation of very general and very simple grammatical rules. In contrast to earlier traditional approaches to grammar, Lectures witnesses the virtual elimination of grammatical constructions as theoretical constructs. Thus, in Government-Binding (GB)-style theories there are no rules of Passive, Raising, Relativization or Question Formation as there were in earlier theories. Within GB, language variation is not a matter of different grammars having different rules. Rather, the phenomena attested in different languages are deduced by variously setting the parameters of Universal Grammar. Given the interaction of the grammatical modules, a few parametric changes can result in what appear on the surface to be very different linguistic configurations. In contrast to earlier approaches to language, variation consists not in employing different kinds of rules, but in having set the parameters of an otherwise fixed system in somewhat different ways (see Chomsky 1983).

The GB research programme has proven to be quite successful in both its descriptive range and its explanatory appeal. Despite this, Chomsky has urged a yet more ambitious avenue of research. He has embarked on the development of a rationalist approach to grammar that goes under the name of ‘Minimalism’ (Chomsky 1995). The theory is ‘rationalist’ both in that it is grounded on very simple and perspicuous first principles, and in that it makes use only of notions required by ‘virtual conceptual necessity’. Chomsky hopes to make do with concepts that no approach to grammar can conceivably do without and remain true to the most obvious features of linguistic competence. For example, every theory of grammar treats sentences as pairings of sounds and meanings. Thus, any theory will require that every sentence have a phonological and an interpretative structure. In GB theories, these sorts of information are encoded in the PF (Phonetic Form) and LF (Logical Form) phrase markers respectively. In addition, GB theories recognize two other distinctive grammatical levels: S-structure and D-structure. A minimal theory, Chomsky argues, should dispense with everything but LF and PF. It will be based on natural ‘economy’ principles and indispensable primitives. Chomsky has suggested reanalysing many of the restrictions that GB theories impose in terms of ‘least effort’ notions such as ‘shortest move’ and ‘last resort movement’. For example, he proposes that the unacceptability of sentences such as ‘John is expected will win’ is ultimately due to the fact that the moved NP ‘John’ need not have moved from the embedded subject position (between ‘expected’ and ‘will’) as it fulfils no grammatical requirement by so moving. This work is still in its infancy, but it has already prompted significant revisions of earlier conclusions. For example, with the elimination of D-structure, the recursive engine of the grammar has once again become the province of generalized transformations. Whatever its ultimate success, however, Minimalism continues the pursuit of the broad goals of descriptive and explanatory adequacy enunciated in Chomsky’s earliest work.