Putting a voice and face together in early infancy determines later language development

Credit: Florida International University

Matching the sight and sound of speech—a face to a voice—in early infancy is an important foundation for later language development.

This ability, known as intersensory processing, is an essential pathway to learning new words. According to a recent study published in the journal Infancy, the degree of success at intersensory processing at only 6 months old can predict vocabulary and language outcomes at 18 months, 2 years and 3 years old.

"Adults are highly skilled at this, but infants must learn to relate what they see with what they hear. It's a tremendous job and they do it very early in their development," said lead author Elizabeth V. Edgar, who conducted the study as an FIU psychology doctoral student and is now a postdoctoral fellow at the Yale Child Study Center. "Our findings show that intersensory processing has its own independent contribution to language, over and above other established predictors, including parent language input and socioeconomic status."

Across three years, Edgar and a team at FIU psychology professor Lorraine E. Bahrick's Infant Development Lab tested intersensory processing speed and accuracy in 103 infants between the ages of 3 months and 3 years old, using the Intersensory Processing Efficiency Protocol (IPEP). This tool was created by Bahrick and co-investigator FIU Research Assistant Professor of Psychology James Torrence Todd and colleagues.

Designed to present distraction or simulate the "noisiness" of picking out a speaker from a crowd, the IPEP presents several short video trials. Each trial depicts six faces of women displayed in separate boxes on the screen at once. All the women appear to be speaking.

However, the soundtrack heard on each trial matches only one of the women speaking. With an eye tracker that follows pupil movement, the researchers could measure whether the babies made the match, as well as how long they watched the matching face and voice.

Then, the data was compared with language outcomes at different stages of development—such as how many unique and total words children used. Results revealed infants who looked longer at the correct speaker were later found to have better language outcomes at 18 months, 2 years and 3 years old.

The connection between intersensory processing and language becomes clearer when considering the nature of speech. It's a sound, of course. But it's also accompanied by lip movements, facial expressions and gestures. Speaking is both auditory and visual. Baby talk, in particular, is a true multisensory experience. A parent or caregiver gestures playfully, perhaps moving around a favorite toy while naming it. This sets the stage for learning which words correspond to specific objects in the world. That can only happen once a baby can be more selective with their attention, cutting through distractions to match a voice to a face or a sound to an object.

"Better selective attention to audiovisual speech in infancy may allow infants to take greater advantage of early word learning opportunities, such as object labeling, provided by caregivers during interactions," Bahrick said.

For parents or caretakers, Edgar pointed out this research serves as a reminder that babies rely on coordinating what they see with what they hear to learn language.

"That means it is helpful to gesture toward what you're talking about or move an object around while saying its name. It's the object-sound synchrony that helps show that this word belongs with this thing," Edgar explained. "As we're seeing in our studies, this is very important in and lays the groundwork for more complex skills later on."

More information: Elizabeth V. Edgar et al, Intersensory processing of faces and voices at 6 months predicts language outcomes at 18, 24, and 36 months of age, Infancy (2023). DOI: 10.1111/infa.12533

Citation: Putting a voice and face together in early infancy determines later language development (2023, June 20)
