In everyday conversations, we often begin to speak before we have completely decided what we are going to say and how we are going to say it. This raises the question as to how speaking and thinking are coordinated temporally. How far do speakers think ahead? Scientists at the MPI for Psycholinguistics show how analyses of speakers' eye movements can be used to investigate this question. Their studies demonstrate how the temporal course of sentence preparation is shaped by the content and form of the utterances formulated by speakers. Their findings present new perspectives on the relationship between thinking and language.
"Think before you speak!" This well-meant piece of advice is typically given when someone has already put their foot in it or divulged a well-kept secret. This is hardly surprising: We have long known that speakers only seldom consider in advance exactly what they would like to say. Instead, they usually only plan the beginning of an utterance, start to speak and then continue with their planning while voicing the beginning of the sentence. This works because the planning of speech, that is the selection of the correct words and their order in a sentence, is a faster process than actually saying the words. For example, a speaker needs at least 1.5 seconds to say "The little girl …". This gives him or her sufficient time to plan the next part of the sentence, for example " … pushes the boy". If there is not sufficient time for planning when uttering part of a sentence, the speaker will take a short pause in the sentence or perhaps say "eh .." to buy some time. However, due to pressure of time or a lack of concentration, sometimes we also make mistakes, for example in the sentence "I am delighted that you went!" (instead of "you came"). In general, however, simultaneous thinking and speaking functions very well, and enables the rapid turn-taking in natural conversation.
How do thoughts form?
An important question in the psychology of language concerns the planning units speakers use when forming their utterances. Antje Meyer and her team at the Max Planck Institute for Psycholinguistics in Nijmegen are particularly interested in how the thoughts we intend to express are gradually formed, and whether this process unfolds in the same way for all speakers and in all situations, or whether systematic differences exist between speakers and contexts. They are also interested in how the mental preparation of a statement is coordinated with its utterance, in particular how far in advance speakers plan utterances before they start to speak.
Eye movements reveal the planning processes
To investigate this question, the scientists ask adult test subjects to describe scenes such as those shown in Figure 1 in their native language, in this case Dutch. They record the descriptions and, based on the speech signal, determine when the test subjects start to speak and when they utter each subsequent word. During the experiment, the test subjects wear an eye-movement camera (Figure 2), which can be used to identify, down to the millisecond, when and for how long they look at the "agent" (the person performing the action, i.e. the girl in Figure 1a) and the "patient" (the person who "undergoes" the action, i.e. the boy in Figure 1a). This approach is based on the general principle that we usually direct our gaze to where there is something "important" to be seen, for example the person performing an action about whom we would like to speak. The scientists can thus identify from the eye movements when the speaker directs his or her attention to an element of the image, presumably forms the related thoughts and, perhaps, also selects the corresponding words from memory. [5; 6; 7] The scientists can then relate this to the spoken utterances and thereby determine how far in advance speakers plan their utterances before they begin to speak.
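The idea of relating gaze to speech timing can be illustrated with a minimal sketch. It assumes that each trial has already been reduced to a list of fixations labelled with the region looked at, and that the speech onset has been measured from the recording. The data format, the names (Fixation, gaze_time_before_speech) and the example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_ms: int   # time from image onset at which the fixation begins
    end_ms: int     # time from image onset at which the fixation ends
    region: str     # hypothetical labels: "agent", "patient" or "other"


def gaze_time_before_speech(fixations, speech_onset_ms):
    """Sum the time spent looking at each region before the speaker starts talking."""
    totals = {}
    for fix in fixations:
        # Only the part of a fixation that precedes speech onset is counted.
        end = min(fix.end_ms, speech_onset_ms)
        duration = end - fix.start_ms
        if duration > 0:
            totals[fix.region] = totals.get(fix.region, 0) + duration
    return totals


# Invented example trial: the speaker starts talking 1900 ms after image onset.
trial = [
    Fixation(0, 300, "agent"),
    Fixation(300, 550, "patient"),
    Fixation(550, 1700, "agent"),
    Fixation(1700, 2400, "patient"),
]
print(gaze_time_before_speech(trial, speech_onset_ms=1900))
# {'agent': 1450, 'patient': 450}
```

In this invented trial, most of the time before speech onset is spent on the agent, with a brief early look at the patient.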
Possible planning strategies
So how do speakers plan the description of actions? Previous studies presented two hypotheses in this regard. According to the first, speakers are only able to define the first concept and the first word before the utterance. Accordingly, as soon as the image appears they look at one of the actors involved in the activity, for example the girl, and start to speak immediately. The subsequent words in the utterance would then be planned later. According to the second hypothesis, speakers are able to roughly determine what happens in the image, meaning who does what, before the utterance begins. In this case, they first look at both actors (the boy and the girl) and perhaps also at other elements of the image (for example, the sledge). In the first case, only one simple concept is defined, whereas in the second a more complex thought structure is already formed before the utterance begins. A third, previously unconsidered possibility is that speakers do not use either of these strategies consistently and that their speech planning depends on the difficulty of the task to be completed. Thus, the units of planning could decrease in size with increasing complexity. To test this hypothesis, the scientists used images in which the actions were either easy to recognise and describe (e.g. Fig. 1a) or more difficult (e.g. Fig. 1b, in which a bodyguard pulls a politician aside). The extent to which the persons in the images could be recognised was also varied, although this is not illustrated in the figure. The test subjects were not given any specific instructions regarding the nature and length of the descriptions.
Speakers are flexible
Antje Meyer and her group established that the gaze behaviour of the test subjects, and therefore also the temporal course of their thought and speech planning processes, does in fact depend on the difficulty of the description task. Part of the results (namely those for descriptions with easily recognised persons) are depicted in Figure 3. The graph shows what proportion of all gazes was directed at the agent (black) and the patient (grey) at each point in time from the onset of the image. The utterances began after around 1.8 to 2 seconds. In general, the test subjects tended to look at the agent first rather than the patient. However, when the action was easy to describe (Fig. 3a), the preference for the agent was not very marked. This is evidenced by the fact that the black line was initially (up to approximately 600 ms) only slightly above the grey one. Subsequently, the test subjects tended to direct their gaze at the agent, who was usually mentioned first, and then at the patient, who was named second. This pattern shows that the test subjects began by establishing an overview of the events (and in doing so often looked at both the agent and the patient) and formed a thought structure. They then returned to the two actors in sequence as they selected the individual words from their mental lexicon.
In contrast, when the action was more difficult to describe, the test subjects tended to limit themselves to looking at the person performing the action before the start of the utterance; the general overview phase was largely omitted. This is evident from Figure 3b, in which the black line is considerably above the grey one from the beginning; thus the active person (agent) was looked at far more frequently than the object of the action (patient).
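Curves like those in Figure 3 can be thought of as follows: at each point in time after image onset, count across trials how many gazes fall on the agent and how many on the patient, and divide by the number of gazes recorded at that moment. The sketch below is only an illustration of this idea. The trial format, the function name gaze_proportions, the sampling step and the example data are assumptions and do not reproduce the researchers' actual analysis.

```python
def gaze_proportions(trials, bin_ms=200, window_ms=2000):
    """Proportion of gazes on agent and patient at regular sample points.

    trials: list of trials; each trial is a list of
            (start_ms, end_ms, region) tuples (hypothetical format).
    """
    n_bins = window_ms // bin_ms
    curves = {"agent": [0.0] * n_bins, "patient": [0.0] * n_bins}
    for b in range(n_bins):
        t = b * bin_ms  # sample at the start of each time bin
        counts = {"agent": 0, "patient": 0}
        total = 0
        for trial in trials:
            for start, end, region in trial:
                if start <= t < end:
                    total += 1  # a gaze was recorded at time t in this trial
                    if region in counts:
                        counts[region] += 1
                    break  # at most one gaze per trial at any moment
        if total:
            for region in counts:
                curves[region][b] = counts[region] / total
    return curves


# Two invented trials for an easy-to-describe scene: early looks are split
# between agent and patient, later looks converge on the agent.
easy_trials = [
    [(0, 400, "agent"), (400, 800, "patient"), (800, 2000, "agent")],
    [(0, 400, "patient"), (400, 800, "agent"), (800, 2000, "agent")],
]
curves = gaze_proportions(easy_trials, bin_ms=400, window_ms=1200)
print(curves["agent"])    # [0.5, 0.5, 1.0]
print(curves["patient"])  # [0.5, 0.5, 0.0]
```

With data of this kind, an early overview phase would show up as agent and patient curves that lie close together at first, whereas an immediate focus on the agent would show up as an agent curve that is clearly higher from the start.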
These and additional analyses showed that the test subjects did not rigidly apply a particular planning strategy, but – depending on the situation – planned their speech in different ways. If the situation was easy to grasp, they formed a complex thought structure before beginning to speak. In the case of more complex or less clear situations, they focussed initially on one participant in the action and planned the other part of the utterance later.
Therefore, we can plan our utterances in different ways and think in advance to various extents. Of course, we can also select from a large vocabulary. Both elements – the flexibility in the planning strategies and the flexibility in the choice of what is said – help us to express ourselves quickly and appropriately.
An important general question in linguistics and psycholinguistics concerns the extent to which the structure of language influences thinking. [9; 10] As the experiments at the MPI for Psycholinguistics show that speakers of a language are flexible in the choice of their cognitive planning units, it may be expected that speakers of different languages differ considerably in their approach to planning units. In simple statements in German, Dutch and English, the agent is named first and then the action. This is why the speaker can wait to decide which verb to use until he or she is naming the agent. But what happens in languages in which the action must be expressed at the beginning of the sentence? Do speakers of these languages always determine first what is happening in a scene? Or do they also focus first on the agent in this case? The MPI researchers in Nijmegen are currently investigating these and similar questions together with scientists from the Language and Cognition department of their institute. To do this, they are carrying out experiments similar to those described above in languages in which the verb is placed at the beginning of the sentence, for example Tzeltal, which is spoken in Mexico, and Tagalog, which is spoken in the Philippines. With the help of such empirical studies comparing different languages, the psycholinguists can explore how language and thinking are related to each other.