Taking a Broader View: Abstraction and Idealization

Martin Stokhof and Michiel van Lambalgen (S&vL, for short) address an important methodological issue concerning the way modern linguistics constructs its proper objects, the appropriate scientific criteria for characterizing the success or failure of this project, and the role of naturalism in modern linguistics. In the understanding of S&vL, the term ‘modern linguistics’ is quasi-synonymous with the generative tradition founded by Noam Chomsky. Unfortunately, this perspective is rather restricted, and I propose to take a somewhat broader view of the generative tradition, including recent variants of the generative paradigm such as Prince and Smolensky’s ‘optimality theory’ (Prince and Smolensky 1993/2004), Jackendoff’s architecture of the language faculty (Jackendoff 1997), and Pustejovsky’s ‘generative lexicon’ (Pustejovsky 1998), to name only a few.

I acknowledge the careful distinction between ‘abstraction’ and ‘idealization’ S&vL make. In the following, I will argue that taking a broader view of ‘modern linguistics’ we have to rethink the role of abstraction and idealization. Further, I will argue that abstraction and idealization are both used as methodological tools in physics. Both practices have their value and can lead to enormous scientific progress when used appropriately.

Though I do not like to give definitions for historically matured traditions such as the generative paradigm, I will propose five different aspects which are seen as essential for constituting the generativist approach:

The innateness hypothesis: Innateness is seen as a main factor explaining why languages do share the universal tendencies that they do. Hence, a close relationship between innateness and universal grammar is assumed.

Explicit inaccessible rule view: The idea is that our knowledge of language is stored explicitly as rules. However, we cannot describe these rules verbally because they are written in a special code that only the language processing system can understand (e.g. Pinker 1984 following Chomsky).

Grammar does not use a counting mechanism: Instead of using numerical values and numerical calculations, grammars use discrete means. They are based on categorical decisions and possibly employ preference mechanisms.

Competence-performance distinction: Competence is an idealized capacity (the speaker-hearer's knowledge of their language). It is distinguished from performance, which is the processing (production, understanding) of actual utterances.

Autonomy of syntax: The autonomy thesis states that the syntactic rules and principles of a language can be formulated without reference to meaning, discourse, or language use. In order to demonstrate the autonomy of syntax one must show that there exists an encapsulated system of purely formal generalizations orthogonal to generalizations governing meaning or discourse.

Of course, there are other properties that are connected to Chomskyan linguistics, such as the inviolability of basic rules and principles of grammar and the unidirectional formulation of the generative device. However, I think there is no independent motivation for these conditions; they are rooted in certain arbitrary logical or computational traditions. For example, consider the feature of unidirectionality/bidirectionality. In the computational linguistics literature (e.g. Appelt 1989) a grammar is called bidirectional if it can be used by processes of approximately equal computational complexity to parse and generate sentences of a language. In contrast to Chomsky’s unidirectional view [1], which sees grammar as a directed, generative device, many authors stress the view of a bidirectional grammar, which has to be represented declaratively and can be applied in different directions – from meaning to form and from form to meaning, respectively. Such a declarative grammar could be based on the (associative and commutative) unification of feature structures, as in the PATR II formalism (Shieber 1986), or on some more modern forms of constraint-based and inherently nondirectional grammars (Bresnan 2000; Jackendoff 2002). Presently, optimality theory (OT) is the dominant framework for realizing such bidirectional grammars (cf. Prince and Smolensky 1993/2004; Smolensky and Legendre 2006). Declarative grammars, though symbolic, have important similarities with neural networks, where certain subsymbolic constraints are formulated in a nondirectional, declarative way – examples are harmonic grammar (Legendre et al. 1990a, 1990b) and Hopfield networks (Hopfield 1982).

The alternatives to the mainstream generativist approach mentioned above differ in many respects. For instance, Jackendoff (1997, 2002) argues against the syntax-centered view of standard generative grammar, and he specifically treats phonology, syntax and semantics as three parallel generative processes which are coordinated through interface processes. Pustejovsky (1998), on the other hand, argues against the static view of word meaning where each word is characterized by a predetermined number of word senses, and he proposes that the lexicon becomes an active and central component in the linguistic description. However, neither Jackendoff’s nor Pustejovsky’s approach conflicts with the five basic aspects which are essential for the generative paradigm in the broader perspective.

Concerning optimality theory, it is sometimes argued that this approach basically conflicts with the generative paradigm (e.g. Antovic 2007). However, this is not correct, as can be seen by considering the five basic traits. First, optimality theory accepts the innateness hypothesis and it crucially relies on the competence/performance distinction. Further, optimality theory assumes ‘strict domination’; i.e. no number of violations of lower-ranked constraints can ever overpower a single violation of a higher-ranked constraint. A consequence of this assumption is that grammars do not need a counting mechanism (counting constraint violations). Next, optimality theory generally respects the autonomy of syntax. However, this is only accepted as a general tendency. Optimality theory has means of accounting for certain cases of autonomy breaking – as investigated for instance in connection with the interaction of stress and syllabification (e.g. Itô 1989). Concerning the explicit inaccessible rule view, optimality theory takes two perspectives: (a) the symbolic perspective using explicit rules and (b) their neural underpinning, demonstrating the (complementary) perspective of implicit rules (cf. Smolensky and Legendre 2006). By integrating these two perspectives, optimality theory accepts explicit rules as a proper way to describe aspects of a complex system. This sharply contrasts with eliminative connectionism (e.g. Churchland 1992).

In cognitive science, symbolic systems and neuronal network systems are normally seen as establishing incompatible architectures. The generativist linguist clearly stands on the symbolist’s side (Fodor and Pylyshyn 1988). S&vL describe this situation as suggestive of seeing competent language users as ‘disembodied’ individuals (p. 15). I accept this as a sound description of the opinion of some mainstream generativists (not including Chomsky). However, in optimality theory the situation is different. The paper launching the basic ideas of optimality theory has two subchapters entitled ‘Why Optimality Theory has nothing to do with connectionism’ (Prince and Smolensky 1993/2004, Section 10.2.1) and ‘Why Optimality Theory is deeply connected to connectionism’ (Prince and Smolensky 1993/2004, Section 10.2.2), obviously reflecting the differing opinions of Smolensky and Prince. Smolensky, but not Prince, sees optimality theory as representing a very specialized kind of neural network (Harmonic Grammar), with exponential weighting of the constraints. Hence, in Smolensky’s integrative architecture the symbolist and the subsymbolist aspects are seen as two sides of the same coin or as complementary aspects of an embodied integral whole. Further, optimality theory is recognized ‘as a regimentation and pushing to extremes of the basic notion of Harmonic Grammar’ (Prince and Smolensky 1993/2004: 219). The interested reader is referred to Smolensky and Legendre (2006), in which the relations between Harmonic Grammar, Optimality Theory, and principles of connectionist computation are subjected to detailed scrutiny.

In the present context, optimality theory is especially interesting since we find both research tools there – idealization and abstraction. One example is the abstraction mentioned in connection with the ‘strictness of domination’, which can be derived from exponential weightings in the limit of an infinite base. This turns harmonic grammar or other neural network accounts into a system where counting the violations of constraints is not required. Another example of abstraction concerns the transfer to a discrete, crisp notion of concepts. This transfer can be realized by replacing the sigmoid function of a threshold unit by its limiting case where the ‘temperature’ parameter T approaches absolute zero. Besides clear cases of abstraction we also find clear examples of idealizations in optimality theory. In part, these idealizations are similar to the idealizations made in Hopfield networks, e.g. symmetric connections, no self-connections. The aim of these idealizations is to make the theory mathematically tractable. Another example has already been mentioned and concerns the competence/performance distinction, which is essential for OT and many kinds of neural networks.
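The two limiting cases just mentioned can be illustrated with a small numerical sketch. It is only a toy illustration under stated assumptions, not the formulation used by Prince and Smolensky: the candidate names, the violation profiles, and the function names `harmony`, `hg_winner`, `ot_winner` and `sigmoid` are all hypothetical. The first part compares a weighted sum with exponentially spaced weights against a lexicographic comparison of violation profiles (strict domination); the second part evaluates a sigmoid unit at decreasing temperature T.

```python
import math


def harmony(violations, base):
    """Harmonic-Grammar-style score: negative weighted sum of violations.

    violations[0] belongs to the highest-ranked constraint. The weights are
    exponentially spaced: base**(n-1), base**(n-2), ..., base**0.
    """
    n = len(violations)
    return -sum(v * base ** (n - 1 - i) for i, v in enumerate(violations))


def hg_winner(candidates, base):
    """Return the candidate with maximal harmony (this involves counting)."""
    return max(candidates, key=lambda name: harmony(candidates[name], base))


def ot_winner(candidates):
    """Strict domination: compare violation profiles lexicographically."""
    return min(candidates, key=lambda name: candidates[name])


def sigmoid(x, T):
    """Logistic activation with temperature T > 0 (very small T may overflow)."""
    return 1.0 / (1.0 + math.exp(-x / T))


# Hypothetical violation profiles, chosen only for illustration:
candidates = {
    "A": (1, 0),  # one violation of the higher-ranked constraint
    "B": (0, 5),  # five violations of the lower-ranked constraint
}

for base in (2, 4, 10):
    print("base", base, "weighted sum:", hg_winner(candidates, base),
          "strict domination:", ot_winner(candidates))

for T in (1.0, 0.1, 0.01):
    print("T", T, "unit output at x=0.5:", sigmoid(0.5, T))
```

In this toy setting the weighted sum prefers candidate A for the bases 2 and 4, because five low-ranked violations outweigh one high-ranked violation. Once the base exceeds the largest violation count (here with base 10), the weighted sum selects B, in agreement with the lexicographic comparison, so counting no longer matters. Likewise, the sigmoid output at a fixed positive input moves towards 1 as T decreases, approximating a crisp threshold unit.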

Accepting optimality theory as one instance of the generative approach (in the broader sense), we have argued that both methodological tools can be found – abstraction and idealization. Interestingly, this situation is similar to the situation in physics, where we normally also find both processes. Note that this close analogy is valid since many ideas in neural modeling go back to ideas of theoretical physics, e.g. the proposal of Hopfield networks and Boltzmann machines. Hence, we can state that both idealization and abstraction are valuable and sound research tools when used with care. S&vL seem to suggest that physics makes exclusive use of abstraction and conclude from this observation that abstraction is the only useful research tool within a naturalist setting. I think this is not true. It is not difficult to find examples that suggest that idealization is an equally important research tool in physics and both tools can lead to enormous scientific progress within the field of naturalist sciences. Let us consider some examples.

The first example is the Bohr model of atoms. This model assumes that electrons are orbiting a nucleus. However, classical electrodynamics predicts that electrons moving on (elliptical) orbits will release electromagnetic radiation. Because the electrons would lose energy, they would gradually spiral inwards, collapsing into the nucleus. This is disastrous because it predicts that all atoms are unstable. In order to avoid this problem, Bohr stipulated that electrons can only travel in special orbits at a certain discrete set of distances from the nucleus with specific energies. Only when electrons jump from one orbit to a lower energy orbit can they emit electromagnetic radiation, with a frequency ν determined by the energy difference E of the levels according to the Planck relation E = hν. It is obvious that the assumptions made by Bohr are idealizations, not abstractions in the sense of S&vL. The Bohr model was very successful. For the first time, it was possible to precisely predict the spectra of hydrogen and of hydrogen-like ions such as singly ionized helium and doubly ionized lithium. Despite its success (honored with a Nobel Prize to Niels Bohr in 1922), it was, nevertheless, an incomplete and somewhat ambiguous theory. For example, it could not predict the spectra of more complex atoms, the binding behavior of atoms in molecules such as H2O, or the hexagonal symmetry of frozen water (as found so beautifully in snow crystals). Later, all the ambiguities and shortcomings of the Bohr model were overcome by the development of the Schrödinger/Heisenberg quantum theory.
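To make the predictive content of Bohr's stipulation concrete, the following minimal sketch computes the wavelength of a spectral line from the discrete energy levels and the Planck relation. The function names and the parameter `Z` (nuclear charge of a one-electron atom or ion) are introduced here for illustration only. The two constants are the standard values of the Rydberg energy and of hc, and the calculation ignores the finite mass of the nucleus.

```python
RYDBERG_EV = 13.605693   # binding energy of the hydrogen ground state in eV
HC_EV_NM = 1239.841984   # Planck constant times speed of light in eV * nm


def bohr_energy(n, Z=1):
    """Energy (in eV) of the n-th Bohr orbit of a one-electron atom or ion."""
    if n < 1:
        raise ValueError("the orbit number n must be a positive integer")
    return -RYDBERG_EV * Z ** 2 / n ** 2


def emission_wavelength_nm(n_upper, n_lower, Z=1):
    """Wavelength (in nm) emitted in a jump from orbit n_upper to n_lower.

    Uses the Planck relation: the energy difference equals h * nu,
    hence the wavelength is h * c divided by the energy difference.
    """
    if n_upper <= n_lower:
        raise ValueError("emission requires n_upper > n_lower")
    delta_e = bohr_energy(n_upper, Z) - bohr_energy(n_lower, Z)
    return HC_EV_NM / delta_e


# Jump from the third to the second orbit in hydrogen (the red Balmer line).
print(round(emission_wavelength_nm(3, 2), 1), "nm")
```

For the jump from the third to the second orbit in hydrogen the sketch yields a wavelength of about 656 nm, close to the observed red line of the Balmer series. For atoms with more than one electron no comparably simple formula is available in the model.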

Another example is classical mechanics when compared with quantum mechanics. Classical mechanics assumes that the act of measuring an observable does not disturb the state that is observed. According to S&vL this assumption is clearly an idealization: a. a qualitative feature is ignored (the observer-dependency of observables and the Heisenberg uncertainty principle of the micro-world); b. in classical theory including statistical mechanics the feature observer-dependency is missing; c. the motivation to save the assumptions of classical physics is primarily ideological. The latter point can be seen by considering hidden variable theories which were espoused by some physicists who argued that quantum mechanics is "incomplete". Einstein is the most famous proponent of hidden variables (cf. Einstein et al. 1935), and he famously insisted that, "I am convinced God does not play dice". For more details the reader is referred to Primas (1982, 2007) who convincingly argues that the relationship between classical mechanics and quantum mechanics is not one of abstraction.

The third example is the description of elementary particles in terms of the irreducible unitary representations of certain symmetry groups (including the SU(3) color symmetry of quarks). The idea of irreducible representations of certain Lie groups connected to principles of symmetry is a powerful tool for finding different kinds of idealizations in order to approach and to systematize the particle zoo.

I do not think that idealization and abstraction are the only research tools available in physics. A third methodological instrument is equally plausible: phenomenology. This tool is used when physics is concerned with calculating detailed predictions for experiments. In this case, theoretical decisions are often based on powerful analogies. For example, the liquid drop model of atomic nuclei assumes that nucleons interact strongly with each other, like the molecules in a drop of liquid. We can see this neither as an idealization nor as an abstraction. In fact, it is related to a kind of analogical reasoning. Another typical example is the fireball model in high energy physics. Here a certain kind of thermodynamic modeling is used for explaining high energy particle production.

A mix of different methods appears when we consider the most recent developments in high energy physics. Modern theoretical physics has deep conceptual problems. General relativity and quantum field theory are inconsistent with each other, so one or both must be incorrect. This arises from the fact that general relativity violates unitarity (satisfied by quantum theory), whereas relativistic quantum field theory breaks down completely at small scales and cannot be done in a dynamic curved metric. Unfortunately, it does not combine correctly with gravity. People are aware of this fact and play with different idealizations of this bizarre situation; one of them is the development of (super)string theory (for popular introductions, see Lindley 1993; Smolin 2002). Without going into any detail, this recent development really seems to create a mixture of phenomenology, idealization, and elements of abstraction.

At the end of the target paper, S&vL conclude that "a naturalistic approach that is not ideologically motivated may lead to interesting … results" and they suggest cognitive linguistics, stochastic linguistics and approaches using neuronal models as convincing alternatives to the orthodox generativist conception. Though these alternatives may convey interesting insights, I do not think that a real breakthrough in theoretical linguistics can be achieved following one of these separate lines. I think the situation in linguistics is in some sense similar to the situation of chemistry at the end of the 19th century where many phenomena and empirical generalizations were known but a big unifying, explanatory and empirically sound theory was still missing. As we know now the breakthrough came with quantum theory. With the help of this theory an exact and general formulation of the fundamental laws became suddenly possible. Heisenberg describes the situation as follows:

"The chemical laws could not be formulated exactly, and the question of the nature of the chemical forces could not be answered, as long as one confined oneself to chemistry proper, i.e. the qualitative transformations of weighable quantities of substance. Only when one advanced to the chemistry of the smallest quantities of matter (atoms and molecules) – into the border region in which chemical and mechanical processes can no longer be sharply distinguished – did one succeed in discovering and exactly formulating the laws of nature that encompass chemistry and mechanics at the same time" (Heisenberg 1942/1989: 108) [2]

Just as the chemical laws cannot be formulated exactly without integrating physics and chemistry, I think the idea is in the air that the deeper laws of linguistics cannot be formulated without integrating linguistics and neuroscience. Especially important is the area where symbolic and subsymbolic processing cannot be strictly separated from each other. It is this area where the complementary nature of the mental and the physical becomes visible, as stated by a recent Lotze prizer (Atmanspacher and beim Graben 2007; beim Graben 2004, 2011).

Acknowledgement

I am deeply indebted to Peter beim Graben for discussing abstraction, idealization and the phenomenological approach in physics and for providing convincing examples. Thanks go to Stefan Blutner for explaining the crucial traits of generative linguistics and for debating several variants of the generativist approach. Further, I am grateful to Paul Smolensky, Barbara Partee, Hans-Martin Gärtner, and James Pustejovsky for opening my eyes to certain advantages of the generativist approach. Needless to say, no one but myself can be held responsible for the remaining weaknesses and errors of these comments.