Researcher gives subjects their voice

February 21, 2013 by Angela Herring
Credit: Rupal Patel.

Stephen Hawking and a 9-year-old girl with a speech disorder most likely use the same synthetic voice. It's called Perfect Paul and it's easy to understand, especially in acoustically chaotic environments like classrooms full of children. While new, more natural-sounding voices are available, Perfect Paul remains the most oft-used synthetic voice in the community of disordered speakers.

But Perfect Paul conveys none of the personality inherent in vocal identity, explains Rupal Patel, an associate professor of computer science and speech language pathology and audiology.

"What we're trying to do is improve the quality," she said, "but also increase the personalization of those voices, by not just making it a little kid's voice, but making it that little kid's voice."

Backed by a grant from the National Science Foundation, Patel and her research team are developing ways to create personalized synthetic voices that resemble users' vocal identities while remaining as understandable as those of the healthy donors.

In the first iteration of the project, which Patel calls VocaliD (pronounced vocality, for Vocal Identity), her team computationally merged the acoustics of a sustained vowel sound from a child with a speech disorder with the acoustics of a full sentence spoken by a healthy speaker of the same demographic. The result is a clear, synthetic voice with the personality of the end user.

These voices have already elicited great responses from parents; one said, "If [my son] had been able to talk, this is what he would sound like." However, the early version of VocaliD used a difficult-to-scale approach that is not easily reproducible. Patel said, "We'd like to be able to allow users to create new voices as they mature in the same way a natural voice evolves."

With the support of another grant from the National Science Foundation, her team is currently adding physiological information on top of the acoustics. "When you hear speech, it's a combination of your source and your filter," Patel said. The source, she explained, derives from the voice box in the larynx whereas the filter is determined by the shape and length of the vocal tract.

Vocal characteristics such as pitch, breathiness, and loudness all emerge from the vocal folds in the larynx and give rise to vocal identity. Modulating those features by changing the shape of our mouths and moving our tongues gives rise to distinct vowel and consonant sounds, which, Patel said, are typically impaired in disordered speech.
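To make the source-filter idea concrete, here is a minimal textbook-style sketch in Python. It is not Patel's VocaliD method, which the article does not describe in detail. It assumes the source can be approximated by an impulse train at the speaker's pitch and the filter by a cascade of simple two-pole resonators, one per vocal-tract resonance (formant). The function names, the 16 kHz sample rate, and the formant values are all illustrative assumptions.

```python
import math


def glottal_source(f0_hz, duration_s, sample_rate=16000):
    """Crude 'source': one unit impulse per vocal-fold cycle at pitch f0_hz."""
    n = int(duration_s * sample_rate)
    period = sample_rate / f0_hz  # samples between pulses (may be fractional)
    signal = [0.0] * n
    t = 0.0
    while t < n:
        signal[int(t)] = 1.0
        t += period
    return signal


def formant_filter(signal, formant_hz, bandwidth_hz, sample_rate=16000):
    """Crude 'filter': a two-pole resonator standing in for one formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, < 1
    theta = 2 * math.pi * formant_hz / sample_rate       # pole angle
    a1 = 2 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0  # previous two output samples
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0_hz, formants, duration_s=0.5, sample_rate=16000):
    """Pass the source through each formant resonator, then peak-normalize."""
    signal = glottal_source(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = formant_filter(signal, freq_hz, bandwidth_hz, sample_rate)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]


# Illustrative values only: same "vocal tract" (formants), two different pitches.
ah_formants = [(700, 80), (1200, 90)]
lower_voice = synthesize_vowel(110, ah_formants)
higher_voice = synthesize_vowel(220, ah_formants)
```

In this toy model, changing `f0_hz` alters the source while leaving the vowel's resonances alone, and changing the formant list alters the vowel while leaving the pitch alone. That separation is what makes it plausible, in principle, to pair one speaker's source characteristics with another speaker's clearer articulation.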

Using data from a set of sensors placed on participants' tongues and mouths, the researchers will determine the most efficient way to approximate the physical aspects of the disordered speaker's vocal tract. They can then add this information into the voice-synthesis software to create voices that will grow and change as the users mature.

The academic community has long accepted the source-filter theory of speech, but more work needs to be done in order to understand it, according to Patel, especially as researchers develop more advanced speech technologies for security and other applications.

Patel's work in particular also aims to inform basic research questions such as, "How much do both the source and filter contribute to the identity of a speaker's output?"

Patel's software is compatible across assistive technology platforms, including mainstream touch-pad devices, a feature she hopes will increase its adoption within the community. Patel speculates that assistive communication devices will eventually appeal to healthy people as a new way of learning, communicating, and interacting.

"The iPad revolution is helping to break down barriers and increasing the emphasis on user interface issues," said Patel, who has been working to improve assistive communication technologies for more than 16 years. "Lots of kids, both healthy and impaired, are using screens to interact now."
