January 7, 2019 report
Three studies show gains being made in using AI to create speech from brainwaves
Three teams working independently have uploaded papers to the bioRxiv preprint server describing their attempts to use neural network-based AI systems to translate brainwaves into decipherable speech. The first, by a team with members from Columbia University and Hofstra Northwell School of Medicine, focused on recordings made with electrodes implanted in the brains of epilepsy patients. The second team, with members from the University of Bremen, Maastricht University, Northwestern University and Virginia Commonwealth University, used data from brain probes implanted during brain tumor surgery. The third team, a pair of researchers from the University of California, also relied on data from electrodes implanted in the brains of epilepsy patients.
Currently, there is no known way to "listen" to human thought and convert it to decipherable speech. But scientists are getting closer, as these three new efforts demonstrate. The work involved collecting data from electrodes implanted into the brains of patients who were being treated for reasons other than speech research. Epilepsy patients, for example, have such probes placed on their brains prior to undergoing surgery, and brain tumor patients have them placed in their brains during surgery to help surgeons figure out where and where not to cut. This means that researchers attempting to use such data for other purposes have a small dataset, because the probes are in the brain for only a very short period of time. But the different ways the probes are used offer researchers some advantages. The second team, for example, was also able to gather voice data from the patients undergoing surgery as they responded to requests from the surgeons. That allowed the researchers to compare brain wave activity with the actual words spoken. An AI system was then able to translate similar data into actual words at about 40 percent accuracy.
The first team focused on numbers and found that their AI system was approximately 75 percent accurate in extracting numbers from similar data and then reciting them. And the third team used similar data with an AI system to build entire sentences based on matches with data recorded from epilepsy patients reading aloud while their brains were probed.
Notably, all three studies involved people who were able to speak, whereas the goal is to translate the brain waves of patients who cannot speak into speech. Currently, such patients use eye-controlled cursor boards or similar devices, which do not convey inflection, tone or other speech-related cues.
Miguel Angrick et al. Speech Synthesis from ECoG using Densely Connected 3D Convolutional Neural Networks, bioRxiv (2018). DOI: 10.1101/478644
Yulia Oganian et al. A speech envelope landmark for syllable encoding in human superior temporal gyrus, bioRxiv (2018). DOI: 10.1101/388280
© 2019 Science X Network