Researcher gives subjects their voice

by Angela Herring
Credit: Rupal Patel.

Stephen Hawking and a 9-year-old girl with a speech disorder most likely use the same synthetic voice. It's called Perfect Paul and it's easy to understand, especially in acoustically chaotic environments like classrooms full of children. While newer, more natural-sounding voices are available, Perfect Paul remains the most widely used synthetic voice in the community of disordered speakers.

But Perfect Paul conveys none of the personality inherent in vocal identity, explains Rupal Patel, an associate professor of computer science and speech language pathology and audiology.

"What we're trying to do is improve the quality," she said, "but also increase the personalization of those voices, by not just making it a little kid's voice, but making it that little kid's voice."

Backed by a grant from the National Science Foundation, Patel and her research team are developing ways to create personalized synthetic voices that resemble users' vocal identities while remaining as understandable as the voices of the healthy speakers who donate speech samples.

In the first iteration of the project, which Patel calls VocaliD (pronounced vocality, for Vocal Identity), her team computationally merged the acoustics of a sustained vowel sound from a child with a speech disorder with the acoustics of a full sentence spoken by a healthy speaker of the same demographic.

The result is a clear, synthetic voice with the personality of the end user.

These voices have already elicited great responses from parents; one said, "If [my son] had been able to talk, this is what he would sound like." However, the early version of VocaliD used a difficult-to-scale approach that is not easily reproducible. Patel said, "We'd like to be able to allow users to create new voices as they mature in the same way a natural voice evolves."

With the support of another grant from the National Science Foundation, her team is currently adding physiological information on top of the acoustics. "When you hear speech, it's a combination of your source and your filter," Patel said. The source, she explained, derives from the larynx, or voice box, whereas the filter is determined by the shape and length of the vocal tract.

Vocal characteristics such as pitch, breathiness, and loudness all emerge from the vocal folds in the larynx and give rise to vocal identity. Modulating those features by changing the shape of our mouths and moving our tongues gives rise to distinct vowel and consonant sounds, which, Patel said, are typically impaired in disordered speech.
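
For readers who want to see the source-filter idea in concrete terms, here is a minimal Python sketch. It is not VocaliD's method or Patel's software; it is a textbook-style illustration under simple assumptions. The "source" is a bare impulse train at a chosen pitch, the "filter" is a chain of two-pole resonators standing in for vocal-tract resonances, and every function name and formant value below is hypothetical and chosen only for illustration.

```python
import math

def glottal_source(f0_hz, duration_s, sample_rate=16000):
    """Impulse train with one pulse per pitch period (a crude stand-in for vocal-fold pulses)."""
    n = int(duration_s * sample_rate)
    period = round(sample_rate / f0_hz)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(signal, center_hz, bandwidth_hz, sample_rate=16000):
    """Two-pole resonator that emphasizes energy near center_hz (one vocal-tract resonance)."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, always below 1 (stable)
    a1 = 2.0 * r * math.cos(2.0 * math.pi * center_hz / sample_rate)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0  # the two previous output samples
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(f0_hz, formants, duration_s=0.05, sample_rate=16000):
    """Pass the source through each resonance in turn: output = filter(source)."""
    signal = glottal_source(f0_hz, duration_s, sample_rate)
    for center_hz, bandwidth_hz in formants:
        signal = resonator(signal, center_hz, bandwidth_hz, sample_rate)
    return signal

# Rough, illustrative (center Hz, bandwidth Hz) pairs; not measured data.
AH = [(700, 80), (1200, 90)]    # loosely "ah"-like
EE = [(300, 60), (2300, 100)]   # loosely "ee"-like

high_ah = synthesize(250, AH)   # higher-pitched source, "ah" filter
low_ah = synthesize(120, AH)    # lower-pitched source, same filter
high_ee = synthesize(250, EE)   # same source as high_ah, different filter
```

In this toy model, changing only the pitch changes who seems to be speaking, while changing only the resonances changes which vowel is heard. That separation is what makes it plausible, in principle, to combine one speaker's source with another speaker's filter.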

Using data from a set of sensors placed on participants' tongues and mouths, the researchers will determine the most efficient way to approximate the physical aspects of the disordered speaker's vocal tract. They can then add this information into the voice-synthesis software to create voices that will grow and change as the users mature.

The academic community has long accepted the source-filter theory of speech, but more work is needed to understand it, according to Patel, especially as researchers develop more advanced speech technologies for security and other applications.

Patel's work in particular also aims to inform basic research questions such as, "How much do both the source and filter contribute to the identity of a speaker's output?"

Patel's software is compatible across assistive technology platforms, including mainstream touch-pad devices, a feature she hopes will increase its adoption within the community. Patel speculates that assistive communication devices will eventually appeal to healthy people as a new way of learning, communicating, and interacting.

"The iPad revolution is helping to break down barriers and increasing the emphasis on user interface issues," said Patel, who has been working to improve assistive communication technologies for more than 16 years. "Lots of kids, both healthy and impaired, are using screens to interact now."
