Philosophy of Language » Lecture 7
What is a syntactic structure?
The idea: a sentence isn’t merely a string of words. Rather, it is structured into phrases, which in turn have internal structure.
The syntactic rules – grammar – of a natural language give the permissible phrase structures of a well-formed sentence.
Here’s an example of a (too simple to be true) rule:
These data show that (i) some phrases ((will) ace the test) can be omitted with the sentence remaining grammatical, yet (ii) sub-expressions of those phrases cannot be omitted while preserving grammaticality.
A neat explanation: what can be elided is a syntactic constituent (like the VP ace the test); and not every sub-string is a constituent, but only those which are phrases (Elbourne 2011: 74–76).
A syntactic tree for Alice will ace the test:
(Diagram omitted in `html` version – apologies to those who use this format, but the syntax trees are just too involved to typeset.)

The structure is hierarchical: the sentence comprises the NP Alice and the VP will ace the test; that first VP consists of the auxiliary verb will and the VP ace the test, which consists of the verb ace and the Determiner Phrase the test, which consists of the Determiner the and the Noun test.
A syntactic tree for (2) – note the common structure with (1), which supports the judgment that they are synonymous.
(Diagram omitted in `html` version)

The permissible elided constituent is the verb phrase ace the test, coloured purple to mark its absence; this is thus called verb-phrase ellipsis (Johnson 2001).
Ambiguity isn’t only due to homonymy/polysemy:
[A] second kind of ambiguity is called structural ambiguity. All the words in a structurally ambiguous sentence can have just one meaning each. So we are not dealing with lexical ambiguity. Structural ambiguity is ambiguity that arises by the meanings of words or phrases combining with each other in different ways. (Elbourne 2011: 73)
Examples:
Old men and women are law-abiding. (Elbourne 2011: 73)
Tariq is a Persian carpet importer. (Hodges 2001: 10)
Dogs must be carried. (Hodges 2001: 12)
Diagnosis: The sentences (8)–(10) are ambiguous because the strings of words comprising them actually instantiate multiple syntactic structures.
Ambiguity provides crucial evidence for the existence of phrasal structure. For if sentences were just strings of words, with no internal structure apart from order, it is hard to see how there could be multiple meanings for a given sentence.
But if
a sentence is made from phrases, which are assigned a meaning prior to the sentence as a whole being assigned a meaning; and
a given string of words can be decomposed into phrases in more than one way;
then different meanings can be generated from the same string.
The idea is now that structurally ambiguous sentences, like Tariq is a Persian carpet importer, can correspond to two distinct trees:

(Diagram omitted in `html` version)

That is, he is a carpet importer who is himself Persian.

That is, he is a (possibly non-Persian) importer of carpets that are Persian.
This example arises because English permits noun phrases to modify other noun phrases: the NP dining room table can modify N leg to generate NP dining room table leg, etc.
English also allows determiner phrases like the pot-plant and that black mark, etc., to be modified by preposition phrases (PPs), like in the corner, on the radiator, etc.
This can lead to ambiguity if the PPs in question can also modify a verb like put or returned.
More preposition phrase ambiguities:
This could mean:
Exactly half of the boys were such that there was some girl they kissed. (True if, say, there are 4 boys and exactly 2 of them kissed a girl – each a different girl.)
Some girl is such that exactly half of the boys kissed her. (True if, say, out of 4 boys, 2 kissed Jane and another kissed Sally: exactly half kissed Jane, even though three kissed some girl.)
In the possible scenarios envisaged, these two disambiguations have different truth values. That is surely enough to show they are distinct in meaning, since they have different intensions, and neither reading entails the other.
What about this sentence?

(16) Every man admires some woman.

The intuition is that it is ambiguous too, between (17) and (18):

(17) For each man, there is some woman or other whom he admires.

(18) There is some specific woman whom every man admires.
One problem: the second reading entails the first (since every possible situation in which one woman is admired by every man is a fortiori a situation in which every man admires some woman or other).
So is (16) ambiguous? Or just not specific enough about the types of situations it refers to?
Think about the task of formalising this sentence into a logical language.
There is no ambiguity in a logical language, but – it is thought – the activity of formalising can reveal ambiguity in the original language if there are two equally good translations.
And there are two good translations of (16):
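In standard first-order notation – with Man, Woman, and Admires as the obvious (purely illustrative) predicates – they would presumably be:

\[(19)\quad \forall x\,\big(\text{Man}(x) \rightarrow \exists y\,(\text{Woman}(y) \wedge \text{Admires}(x,y))\big)\]

\[(20)\quad \exists y\,\big(\text{Woman}(y) \wedge \forall x\,(\text{Man}(x) \rightarrow \text{Admires}(x,y))\big)\]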
The existence of two distinct translations might be enough to generate ambiguity – though it’s still the case that (20) entails (19).
We have another problem though: how could (16) be ambiguous? There is just one syntactic tree:
(Diagram omitted in `html` version)

There seems no way to make another well-formed tree from these words. So is the ambiguity merely apparent?
There is an approach that lets us see two distinct syntactic structures in (16).
The approach is motivated by the fact that, in the tree above, the quantified determiner phrases every man and some woman are treated referentially: nothing about them is treated as quantificational. Yet, arguably, they should be treated quantificationally.
Compare:

(21) John offended every linguist.
This example has the determiner phrase every linguist as the object of the verb offend. But the kinds of things which can be offended are individuals – how can I offend ‘every linguist’ without offending each of them? So we really need, as the ingredient for our semantics, something which can appropriately take offence.
Note that (21) seems synonymous with

(22) Every linguist is such that John offended them.
The quantifier raising approach basically takes this synonymous sentence to give the real but hidden syntax of (21). This structure is revealed by taking the quantifier phrase every linguist and ‘raising’ it above the position where it seems to be found.
First, identify the overt structure of the sentence:
(Diagram omitted in `html` version)

Second, remove the determiner phrase and replace it by a trace – the trace will be something like a covert pronoun them.
(Diagram omitted in `html` version)

I.e., John offended them. (Now problematic, because them is indeterminate in reference.)
Finally, re-introduce the determiner phrase to bind the residual trace:
(Diagram omitted in `html` version)

I.e., Every linguist is such that John offended them.
This is rendered acceptable, since the covert free trace – equivalent to the overt pronoun them in (22) – is now bound by every linguist.
When there are two determiner phrases, as in (16), one has a choice about the order in which one applies quantifier raising, and this can produce structural ambiguities.
Reading (17) of (16) arises by first raising some woman, then every man: for each man there is some woman or other such that he admires her:
(Diagram omitted in `html` version)

Reading (18) of (16) arises from the opposite procedure, first raising every man and then raising some woman, yielding the claim that there is some specific woman \(x\) such that each man \(y\) is such that \(y\) admires \(x\):
(Diagram omitted in `html` version)

This syntactic exercise is motivated by some semantic phenomena: namely, that certain sentences seem to have two meanings.
This motivates the postulation of two or more syntactical structures associated with those surface strings only under a further assumption: that the meaning of a sentence is determined by
the meaning of its constituents; and
the way they are put together.
This is the thesis of compositionality, to which we now turn.
Note that if it is sentences which have meanings, then sentences are not strings of words!
Rather, they are words arranged in a given syntactical tree.
We should probably conclude that there are two homonymous sentences both expressed in the ambiguous English string every man admires some woman.
We just explored how ambiguous sentences have multiple syntactic structures.
But why should this make for a difference in meaning? It will do so only if the meaning of a sentence depends (in part) upon its syntax.
This idea seems sensible. The meaning of a sentence isn’t just an unstructured sum of the meanings of individual words.
But how can we make the proposal concrete? And how can we make it compatible with one fundamental assumption we’ve been making, namely, that the meaning of a sentence is a proposition?
To do this we need to show what the meanings of words are, such that given those meanings, and given a syntactic tree for some sentence, we can predict at least the truth-conditions of the complex sentence.
The basic principle here is compositionality:
the meaning of a sentence is calculated on the basis of the meaning of the words in it and their syntactic arrangement. (Elbourne 2011: 99)
It seems plausible that natural languages are compositional. But can we give an argument?
Two features of natural language have been thought to strongly hint that its semantics is compositional (Fodor 1998: 94–100):
Natural language is productive: finite speakers can understand and produce a potentially infinite number of meaningful sentences.
Natural language is systematic: speakers who can understand and produce a sentence with a given syntactic structure can understand and produce any sentence which is a systematic recombination of the elements of the first sentence.
If \(S\) and \(S'\) are sentences, then so is the conjunction \(S\) and \(S'\). This fact shows that there are infinitely many sentences of English. Let \(S=\) Alice is tall, \(S'=\) Bob is tall, etc.; then these are all sentences: Alice is tall and Bob is tall; Alice is tall and Bob is tall and Bob is tall; and so on without end.
That you can recognise and understand that all of these are grammatical and meaningful sentences, despite never having encountered them before, and that you can yourself construct new sentences you’ve never heard before, stands in need of explanation: compositionality.
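Here is a minimal Python sketch of that recursive mechanism (the base sentences and the single conjunction rule are merely illustrative assumptions):

```python
# Toy illustration of productivity: one recursive rule generates
# unboundedly many sentences from a finite starting stock.
base = ["Alice is tall", "Bob is tall"]

def conjoin(s1: str, s2: str) -> str:
    """If S and S' are sentences, then so is 'S and S''."""
    return f"{s1} and {s2}"

# Iterating the rule yields ever-longer sentences; nothing stops us
# from going on forever, even though the starting stock is finite.
sentence = base[0]
for _ in range(3):
    sentence = conjoin(sentence, base[1])
    print(sentence)
# Alice is tall and Bob is tall
# Alice is tall and Bob is tall and Bob is tall
# Alice is tall and Bob is tall and Bob is tall and Bob is tall
```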
Since people’s representational capacities are surely finite, this infinity of [expressions] must itself be finitely representable…. the demand for finite representation is met if (and … only if) all [expressions] are individuated by their syntax and their contents, and the syntax and contents of each complex [expression] is finitely reducible to the syntax and contents of its (primitive) constituents. (Fodor 1998: 95)
Since competent speakers can understand a complex expression \(e\) they never encountered before, it must be that they (perhaps tacitly) know something on the basis of which they can figure out … what \(e\) means. If this is so, something they already know must determine what \(e\) means. And this knowledge cannot plausibly be anything but knowledge of the structure of \(e\) and knowledge of the individual meanings of the simple constituents of \(e\). (Szabó 2022: §3.1)
However, even the case of idioms – apparent counterexamples to compositionality – is nuanced, as many idioms are somewhat compositional. Clearly pull strings (meaning gain an advantage by exploiting unofficial channels) is idiomatic. But as Nunberg, Sag, and Wasow (1994) point out, it can be internally modified in predictable ways:
Anyone who understands a complex expression \(e\) and \(e'\) built up through the syntactic operation \(F\) from constituents \(e_{1},…,e_{n}\) and \(e_{1}',…,e_{n}'\) respectively, can also understand any other meaningful complex expression \(e''\) built up through \(F\) from expressions among \(e_{1},…,e_{n}, e_{1}',…,e_{n}'\). So, it must be that anyone who knows what \(e\) and \(e'\) mean is in the position to figure out, without any additional information, what \(e''\) means. If this is so, the meaning of \(e\) and \(e'\) must jointly determine the meaning of \(e''\). But the only plausible way this could be true is if the meaning of \(e\) determines \(F\) and the meanings of \(e_{1},…,e_{n}\), the meaning of \(e'\) determines \(F\) and the meanings of \(e_{1}',…,e_{n}'\), and \(F\) and the meanings of \(e_{1},…,e_{n}, e_{1}',…,e_{n}'\) determine the meaning of \(e''\). …
[E.g.,] it seems reasonable that anyone who can understand ‘The dog is asleep’ and ‘The cat is awake’ can also understand ‘The dog is awake’ and ‘The cat is asleep’, and that anyone who can understand ‘black dog’ and ‘white cat’ can also understand ‘black cat’ and ‘white dog’. (Szabó 2022: §3.2)
Unlike productivity, which seems obvious, systematicity is a bold conjecture: namely, that understanding a sentence puts you in a position to understand arbitrary recombinations of its constituents within the same structure.
But do all who understand ‘within an hour’ and ‘without a watch’ also understand ‘within a watch’ and ‘without an hour’? (Szabó 2022: §3.2)
One argument for systematicity is the learnability of human languages, which are typically acquired through exposure to complex stimuli.
Children are thus apparently decomposing what they hear and recombining it to produce utterances of their own.
That they can do this is evidence of systematicity; they are justified in doing this only if their language is compositional (i.e., what they hear and say really has a structure and independently meaningful constituents).
It can be hard to even conceive of what a non-compositional language might look like.
One example might be propositional attitude ascriptions (Szabó 2022: §4.2.4). Consider these sentences:
If Millianism about proper names is correct (lecture 4), the embedded sentences in (28) and (29) express the same proposition, but (28) and (29) express different propositions.
Same ingredients, same structure, but different meanings (because (28) can be true while (29) is false) – a violation of compositionality.
Note that the previous examples are often used as an objection to Millianism, or to unstructured propositions. But they are only problematic for those views if compositionality is assumed; and indeed it is. The argument is thus something like this:
Compositionality is true;
If Millianism is correct, the semantics of belief ascriptions is non-compositional;
Conclusion: Millianism is false.
And what we see in response to this argument isn’t the denial of the first premise – compositionality – by Millians; it is rather compositional (or compositional-adjacent) accounts of the semantics of belief reports.
Thus compositionality is a methodological assumption common to (almost) all workers in the field.
If compositionality is a methodological assumption, a working hypothesis about how language works, then what we should really do is try to give a compositional semantics for natural language.
If we can, and the resulting theory gives plausible results and looks theoretically ‘nice’, that is some evidence that it actually does give the semantics for English, and thus that English is compositional.
Now, it could turn out that pursuing non-compositional semantics would also provide a good semantic theory of English (Szabó 2022: §3.3). So this argument is hardly conclusive. But since no one has really much idea what a non-compositional theory of English would look like, this doesn’t seem like a live concern.
The question now arises how the construction of the thought proceeds, and by what means the parts are put together so that the whole is something more than the isolated parts. In my essay ‘Negation’, I considered the case of a thought that appears to be composed of one part which is in need of completion or, as one might say, unsaturated, and whose linguistic correlate is the negative particle, and another part which is a thought. We cannot negate without negating something, and this something is a thought. Because this thought saturates the unsaturated part or, as one might say, completes what is in need of completion, the whole hangs together. And it is a natural conjecture that logical combination of parts into a whole is always a matter of saturating something unsaturated. (Frege 1923: 1–2; translation follows Heim and Kratzer 1998: 3)
Statements in general, just like equations or inequalities or expressions in Analysis, can be imagined to be split up into two parts; one complete in itself, and the other in need of supplementation, or unsaturated. Thus, e.g., we split up the sentence
Caesar conquered Gaul
into ‘Caesar’ and ‘conquered Gaul’. The second part is unsaturated – it contains an empty place; only when this place is filled up with a proper name, or with an expression that replaces a proper name, does a complete sense appear. Here too I give the name ‘function’ to the referent of this unsaturated part. In this case the argument is Caesar. (Frege 1891: 139)
A function is an entity that maps something – its argument – to exactly one other thing – its value.
Mathematically, we can model a function by an associated set: a collection of ordered pairs, where the first member of each pair is a potential argument, and the second member is the value the function assigns to that argument. \[\text{gestational parent of} = \left\{\langle \text{Laura Dern}, \text{Diane Ladd}\rangle, \langle \text{Maya Hawke}, \text{Uma Thurman}\rangle,\ldots\right\}.\]
If \(f\) is a function, its associated set must satisfy this: if \(\langle x,y\rangle\in f\) and \(\langle x,z\rangle\in f\) then \(y=z\).
When \(\langle x,y\rangle \in f\), we write ‘\(f(x)=y\)’ (read: the unique value associated by \(f\) with the argument \(x\) is \(y\)).
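As a sketch, this set-theoretic picture can be modelled directly in Python, with the pairs drawn from the example above:

```python
# A function modelled as its associated set of ordered pairs <argument, value>.
gestational_parent = {
    ("Laura Dern", "Diane Ladd"),
    ("Maya Hawke", "Uma Thurman"),
}

def is_function(pairs):
    """The functional condition: if <x,y> and <x,z> are both in f, then y = z."""
    return all(y == z
               for (x1, y) in pairs
               for (x2, z) in pairs
               if x1 == x2)

def apply_fn(pairs, arg):
    """f(x) = y: return the unique value paired with the argument x."""
    for (x, y) in pairs:
        if x == arg:
            return y
    raise ValueError(f"{arg!r} is not in the domain of this function")

assert is_function(gestational_parent)
assert apply_fn(gestational_parent, "Laura Dern") == "Diane Ladd"
```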
So Frege thinks that the meaning of a sentence is generated by applying a function to an argument. But what sort of function? What are its arguments? What are its values?
Let’s take the simplest sort of example: Sylvester walks.
(Diagram omitted in `html` version)

Let’s see, firstly, if we can figure out the extension of this sentence. This will be its actual truth value. We can use \(T\) and \(F\), or \(1\) and \(0\), to name these.
Let us use \(⟦\phi⟧^{w}\) to signify the extension of \(\phi\) in \(w\).
We know that \(⟦\text{\emph{Sylvester walks}}⟧^{w} = 1\), since the sentence is true in the actual world \(w\).
Let’s also keep assuming that the extension of a name in a world \(w\) is its referent, so that \(⟦\text{\emph{Sylvester}}⟧^{w} = \text{Sylvester}\).
So what is \(⟦\text{\emph{walks}}⟧^{w}\)? According to the Fregean hypothesis, it must be something unsaturated that is saturated by the extension of Sylvester, and when it is saturated, it evaluates to the extension of the sentence. That is: it is a function from individuals to truth-values: \[⟦\text{\emph{walks}}⟧^{w}(x) = \begin{cases} 1 \quad\text{ if $x$ walks in $w$} \\ 0 \quad\text{ otherwise}.\end{cases}\]
Applying this rule, \(⟦\text{\emph{walks}}⟧^{w} = \{\langle \text{Sylvester}, 1\rangle, \langle \text{Moby Dick}, 0\rangle,…\}\); hence \(⟦\text{\emph{walks}}⟧^{w}\left(⟦\text{\emph{Sylvester}}⟧^{w}\right)=1\), as needed.
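A minimal Python sketch of this extension, with the facts about who walks in \(w\) stipulated purely for illustration:

```python
# The extension of 'walks' at a world w: a characteristic function
# over individuals. Who walks in w is an assumption for illustration.
walkers_in_w = {"Sylvester"}

def walks_ext(x):
    """[[walks]]^w: maps x to 1 if x walks in w, and to 0 otherwise."""
    return 1 if x in walkers_in_w else 0

assert walks_ext("Sylvester") == 1   # [[walks]]^w([[Sylvester]]^w) = 1
assert walks_ext("Moby Dick") == 0
```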
Suppose we have some open formula \(\phi\) of our language in which \(x\) is a free variable: \((\ldots x \ldots )\).
Such open formulae correspond to properties. So, for instance, the open formula \(x\) loves Sylvester corresponds to the property loving Sylvester.
We are thinking of properties as Frege does: in effect, as characteristic functions – functions that yield the value \(1\) iff given as argument something with the property (and \(0\) otherwise).
We may introduce notation in order to pick out this property/function (Heim and Kratzer 1998: §2.5; Elbourne 2011: 102–4). We can use the open formula like this: \([\lambda x. x \text{ loves Sylvester}]\).
The notation \([\lambda x. \Phi(x)]\) can be read the property had by an \(x\) iff that \(x\) is \(\Phi\) – alternatively, the function that maps \(x\) to \(1\) iff \(x\) is \(\Phi\).
These lambda expressions combine with individuals to yield well-formed expressions: \[[\lambda x. x \text{ loves Sylvester}](\text{Antony}).\]
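Python’s lambda mirrors the notation almost word for word – a sketch, with the extension of loves stipulated for illustration:

```python
# [λx. x loves Sylvester]: a characteristic function built from an
# open condition. Who loves whom is an assumption for illustration.
loves = {("Antony", "Sylvester")}

loves_sylvester = lambda x: 1 if (x, "Sylvester") in loves else 0

assert loves_sylvester("Antony") == 1   # [λx. x loves Sylvester](Antony) = 1
```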
What about \([\lambda x.\lambda y. x^{y}]\)?
This is a function; it is a function that, when given a number – say, \(3\) – as argument, yields as value another function.
Which other function? The function that, when given a number \(y\) as argument, delivers \(3^{y}\).
So, for particular \(a, b\): \[\begin{aligned}{ }[\lambda x.\lambda y. x^{y}] (a)(b) &= [\lambda y. a^{y}](b) \\ &= a^{b}.\end{aligned}\]
Something interesting: we are treating this exponentiation function as two sequentially applied one-place functions, not as a binary function (Heim and Kratzer 1998: §2.4).
Compare \([\lambda y.\lambda x. x\text{ loves }y]\); applied to the argument Sylvester this will give \([\lambda x. x \text{ loves Sylvester}]\) from the previous slide.
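A sketch of both curried functions in Python, making the ‘one argument at a time’ structure explicit (the loves facts are again stipulated for illustration):

```python
# [λx.λy. x^y]: a one-place function whose value is itself a one-place function.
exp = lambda x: lambda y: x ** y

three_to_the = exp(3)          # [λy. 3^y]
assert three_to_the(2) == 9    # [λx.λy. x^y](3)(2) = 3^2 = 9

# The same trick for the transitive verb: [λy.λx. x loves y].
loves = {("Antony", "Sylvester")}
loves_fn = lambda y: lambda x: 1 if (x, y) in loves else 0

loves_sylvester = loves_fn("Sylvester")  # [λx. x loves Sylvester], as before
assert loves_sylvester("Antony") == 1
```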
Recall our account of the extension \(⟦\text{\emph{walks}}⟧^{w}\) as the function that has value \(1\) when the argument walks in \(w\), \(0\) otherwise.
We may now say, rather more concisely: \[⟦\text{\emph{walks}}⟧^{w} = [\lambda x. \text{$x$ walks in $w$}].\]
What about the intension of walks? We could have it deliver an extension at every world; but compositionality suggests we should have its meaning (intension) be a product of the meanings (intensions) of its parts. Rather than a function from worlds to extensions, it should map an individual to an unstructured proposition. Putting the pieces together: \(⟦\text{\emph{walks}}⟧ = [\lambda x.\lambda w.x\text{ walks in }w]\).
How does this yield an unstructured proposition? Assume Millianism: \[\begin{aligned} ⟦\text{\emph{Sylvester walks}}⟧ &= ⟦\text{\emph{walks}}⟧(⟦\text{\emph{Sylvester}}⟧) \\ &= [\lambda x.\lambda w.x\text{ walks in }w](\text{Sylvester}) \\ &= [\lambda w.\text{Sylvester walks in }w]. \end{aligned}\]
Unpacking the lambda notation, this last is a property of worlds – the property a world has iff Sylvester walks in it. But a property of worlds is (the characteristic function of) a set of worlds: that is, an unstructured proposition.
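A sketch of the full intension in Python, treating the world as simply a further argument (the toy worlds and their facts are assumptions for illustration):

```python
# [[walks]] = [λx.λw. x walks in w]: from individuals to unstructured
# propositions, a proposition being modelled as a function from worlds
# to truth values. Two toy worlds, stipulated for illustration:
facts = {
    "w1": {"Sylvester"},   # in w1, Sylvester walks
    "w2": set(),           # in w2, nobody walks
}

walks = lambda x: lambda w: 1 if x in facts[w] else 0

# [[Sylvester walks]] = [[walks]](Sylvester) = [λw. Sylvester walks in w]
sylvester_walks = walks("Sylvester")
assert sylvester_walks("w1") == 1
assert sylvester_walks("w2") == 0
```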
So what about Antony admires Lizzie?
(Diagram omitted in `html` version)

Again let’s assume Millianism about names, so that \(⟦\text{\emph{Antony}}⟧ = \text{Antony}\) and \(⟦\text{\emph{Lizzie}}⟧ = \text{Lizzie}\).
And we know that the whole thing needs to have a proposition as its intension.
So what’s crucial is \(⟦\text{\emph{admires Lizzie}}⟧.\) This should be the property of admiring Lizzie – a function from individuals to propositions: \([\lambda x.\lambda w.x\text{ admires Lizzie in }w]\).
So \(⟦\text{\emph{admires}}⟧\) has to be a function from individuals to properties – that is, from individuals to functions from individuals to propositions: \[\begin{aligned}⟦\text{\emph{admires}}⟧ &= [\lambda y.\lambda x. \lambda w. \text{$x$ admires $y$ in $w$}].\\[10pt] ⟦\text{\emph{Antony admires Lizzie}}⟧ &= ⟦\text{\emph{admires}}⟧(⟦\text{\emph{Lizzie}}⟧)(⟦\text{\emph{Antony}}⟧) \\ &= [\lambda y.\lambda x. \lambda w. \text{$x$ admires $y$ in $w$}](\text{Lizzie})(\text{Antony}) \\ &= [\lambda x. \lambda w. \text{$x$ admires Lizzie in $w$}](\text{Antony}) \\ &= [\lambda w. \text{Antony admires Lizzie in $w$}]. \end{aligned}\]
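Finally, a sketch of the doubly curried verb over the same kind of toy worlds (again, who admires whom in which world is stipulated purely for illustration):

```python
# [[admires]] = [λy.λx.λw. x admires y in w]: three one-place
# functions stacked. The facts are assumptions for illustration.
admirations = {
    "w1": {("Antony", "Lizzie")},
    "w2": set(),
}

admires = lambda y: lambda x: lambda w: 1 if (x, y) in admirations[w] else 0

# [[Antony admires Lizzie]] = [[admires]](Lizzie)(Antony)
#                           = [λw. Antony admires Lizzie in w]
prop = admires("Lizzie")("Antony")
assert prop("w1") == 1
assert prop("w2") == 0
```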