Have you ever found yourself gesticulating – and felt a bit stupid for it – while talking on the phone? You're not alone: people very often accompany their speech with hand gestures, sometimes even when no one can see them. Why can't we keep still while speaking? "Because gestures and words very probably form a single 'communication system', which ultimately serves to enhance expression, in the sense of the ability to make oneself understood", explains Marina Nespor, a neuroscientist at the International School for Advanced Studies (SISSA) of Trieste. Nespor, together with Alan Langus, a SISSA research fellow, and Bahia Guellai from the Université Paris Ouest Nanterre La Défense, who conducted the investigation at SISSA, has just published a study in Frontiers in Psychology which demonstrates the role of gestures in speech "prosody".
Linguists define prosody as the intonation and rhythm of spoken language, features that help to highlight sentence structure and therefore make the message easier to understand. For example, without prosody, nothing would distinguish the declarative statement "this is an apple" from the surprised question "this is an apple?" (in this case the difference lies in the intonation).
According to Nespor and colleagues, hand gestures are also part of prosody: "the prosody that accompanies speech is not 'modality specific'", explains Langus. "Prosodic information, for the person receiving the message, is a combination of auditory and visual cues. The 'superior' aspects (at the cognitive processing level) of spoken language are mapped to the motor-programs responsible for the production of both speech sounds and accompanying hand gestures".
Nespor, Langus and Guellai had 20 Italian speakers listen to a series of "ambiguous" utterances, each of which could be said with different prosodies corresponding to two different meanings. One example was "come sicuramente hai visto la vecchia sbarra la porta", where, depending on the meaning, "vecchia" can be the subject of the main verb (sbarrare, to block) or an adjective qualifying the subject (sbarra, bar) ('As you for sure have seen the old lady blocks the door' versus 'As you for sure have seen the old bar carries it'). The utterances could be simply listened to ("audio only" modality) or be presented in a video, where the participants could both listen to the sentences and see the accompanying gestures. In the "video" stimuli, the condition could be "matched" (gestures corresponding to the meaning conveyed by speech prosody) or "mismatched" (gestures matching the alternative meaning).
"In the matched conditions there was no improvement ascribable to gestures: the participants' performance was very good both in the video and in the "audio only" sessions. It's in the mismatched condition that the effect of hand gestures became apparent", explains Langus. "With these stimuli the subjects were much more likely to make the wrong choice (that is, they'd choose the meaning indicated in the gestures rather than in the speech) compared to matched or audio-only conditions. This means that gestures affect how meaning is interpreted, and we believe this points to the existence of a common cognitive system for gestures, intonation and rhythm of spoken language".
"In human communication, voice is not sufficient: even the torso and in particular hand movements are involved, as are facial expressions", concludes Nespor.