Solving the 'cocktail party problem'

January 6, 2014 by Keith Hautala

Ever try to make out a quiet voice in a crowded room, where many conversations are happening all at once?

It's what Kevin Donohue calls the "Cocktail Party Problem."

Donohue, the Databeam Professor of Electrical and Computer Engineering at the University of Kentucky, is working on the technology that just might solve it. For more than 25 years he has researched signal-processing systems. This area deals with systems that mimic the brain's ability to extract meaning from audiovisual information. A good example is what's going on, behind your eyes and between your ears, when you watch the video above.

"Your brain is making sense out of the sound and images," Donohue says. "Your ears and eyes function as sensors, which send signals to your brain where they are processed to have meaning."

Electrical and computer engineers are not limited to signals that can be seen or heard naturally by humans. They can employ sensors that use ultrasound, x-rays and electromagnetics to tease out a meaningful signal.

For the past six years, the main focus of Donohue's work at UK's Center for Visualization and Virtual Environments (the Vis Center) has been distributed audio systems. This involves arranging systems of microphones in a room so that computers can identify sounds, in particular voices, and then isolate and track them.
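For readers curious how several microphones can favor one voice over another, one textbook technique is delay-and-sum beamforming: shift each microphone's signal so the sound of interest lines up across channels, then average. The Python sketch below is only an illustration of that general idea, not a description of the Vis Center's system. The signals, the four-microphone setup, the integer sample delays and all function names are assumptions made up for the example.

```python
import math

def delayed(signal, d):
    """Return `signal` arriving d samples late (zero-padded at the start)."""
    return [0.0] * d + signal[:len(signal) - d]

def delay_and_sum(channels, delays):
    """Advance each channel by its delay, then average across microphones.

    channels: equal-length sample lists, one per microphone.
    delays:   delays[i] is how many samples late the sound of interest
              reaches microphone i (assumed known here; a real system
              would have to estimate this).
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d in zip(channels, delays):
        for t in range(n - d):
            # Source sample t appears at index t + d on this microphone.
            out[t] += samples[t + d]
    return [v / len(channels) for v in out]

def rms_error(a, b, lo, hi):
    """Root-mean-square difference between a and b over indices [lo, hi)."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in range(lo, hi)) / (hi - lo))

# Toy scene (all values hypothetical): two pure tones stand in for two talkers.
n = 400
target = [math.sin(2 * math.pi * t / 20) for t in range(n)]
interferer = [math.sin(2 * math.pi * t / 8) for t in range(n)]

# Each talker reaches the four microphones with a different delay pattern.
# These delays are picked so the interferer cancels exactly; in a real room
# the unwanted voice would only be partly suppressed.
target_delays = [0, 1, 2, 3]
interferer_delays = [0, 3, 6, 9]

mics = [
    [a + b for a, b in zip(delayed(target, dt), delayed(interferer, di))]
    for dt, di in zip(target_delays, interferer_delays)
]

enhanced = delay_and_sum(mics, target_delays)

# Compare against the clean target, skipping the zero-padded edges.
single_mic_error = rms_error(mics[0], target, 10, n - 10)
enhanced_error = rms_error(enhanced, target, 10, n - 10)
print(single_mic_error, enhanced_error)
```

In this toy case the aligned target adds up coherently while the interferer, which arrives with a different delay pattern, averages away. Tracking a moving talker would mean re-estimating the delays over time.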

This technology has applications in surveillance—for example, enabling investigators to home in on a "person of interest" whispering into a cell phone at a noisy airport—as well as in "smart rooms" that "understand" what is happening in an environment and can respond in useful ways, such as taking minutes at meetings, documenting brainstorming sessions, and archiving information for efficient retrieval.

Donohue's work is featured in the above video, produced by the Vis Center as part of its "What's Next" series.

1 comment

RobertKarlStonjek
Jan 07, 2014
Humans generally can't achieve the cocktail party effect from recordings of cocktail parties unless they listen to the recordings many times. Why?

The answer is that in the real cocktail party environment humans turn their head and move in an effort to zoom in on the voice of interest. This is phase locking along with frequency and transient response profiling of the voice of interest. One can not achieve this from a recording...
