Pushing back the boundaries of machine translation for health

June 13, 2018, CORDIS

EU researchers have brought us a step closer to fully automated machine translation with a neural system capable of translating texts on public health from English into Czech, German, Polish and Romanian.

Online information is often available in only a few languages because organisations cannot afford to translate it into more. But researchers from the EU-funded Health in My Language (HimL) project have brought the prospect of fully automated translation a step closer by working with Scottish and international organisations to produce a system adapted for the public health domain.

"Immigrant communities may have limited command of the local language – they need information about local health services but it is not available in their language," says Barry Haddow, project co-ordinator and senior researcher in informatics at the University of Edinburgh. "Information about best practices in health care, resulting from recent research, is mainly disseminated in English but consumers would like to access new meta-analyses in their own language."

Deep learning

The HimL team researched quality improvements in machine translation and incorporated these into a new system able to translate from English into Czech, German, Polish and Romanian. It started out with a syntactic or phrase-based approach, but quickly moved to neural machine translation (NMT), an approach based on deep learning which emerged during the life of the project.

New versions were released each year for use by project partners NHS 24, the Scottish telehealth organisation, and Cochrane, an NGO that facilitates access to the latest research on health matters. The results were carefully evaluated using user surveys and application-focused testing.

The improvements were made in three main areas: domain adaptation, or tuning the translation to the specific terminology of public health; semantics, or ensuring the accuracy of the translation; and morphology, or making sure that morphological variants are produced correctly.

"English doesn't have a lot of morphology, but a lot of languages in Europe, such as Czech and Polish, do – they have different verb and noun forms according to use and, if you get it wrong, this can change the meaning of the text," says Dr. Haddow.

Users were asked to rank the results produced by HimL compared to a well-known online system. "Our systems were able to offer better results in all language pairs," says Dr. Haddow, "although the extremely high quality required by NHS 24 and Cochrane users means that we are not yet able to automate translation completely."

Less human intervention

The team also looked at how well the HimL systems performed when combined with post-editing, an approach that uses machine translation to produce a rough first version and then has a human translator edit the result. "Cochrane showed that post-editing using the HimL system in the MateCat tool was 30–40% faster than translation from scratch for all languages except for Polish," says Dr. Haddow. "We were able to reduce the amount of human effort by between 30% and 50% to produce as good a translation as we would have achieved with the fully human approach."

Other outputs include the UFAL medical corpus, a standard data set for training systems to deal with medical texts. It covers eight European language pairs, including the HimL ones.

Analysing the output of NMT showed that problems present in earlier systems have now been largely overcome, but that these systems are still prone to omitting important information or adding incorrect information. "To counter this we use a technique called 'reconstruction', where the source should be reconstructable from the output," says Dr. Haddow. "We have also shown how to improve NMT using high quality dictionaries and how to incorporate semantic and syntactic information from external tools."
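
The article does not say how HimL implemented reconstruction, so the following is only a toy sketch of the general idea: a candidate translation is rewarded when the source can be rebuilt from it, which penalises both omissions and additions. The glossary, the function names (`reconstruct`, `reconstruction_score`, `rerank`), the example sentence and the model scores are all hypothetical, and a word-for-word glossary stands in for the learned reconstructor a real NMT system would use.

```python
from collections import Counter

# Hypothetical German-to-English glossary. In a real system a learned model
# would play this role; the lookup table is an assumption for illustration.
BACK_GLOSSARY = {
    "nehmen": "take",
    "sie": "you",
    "heute": "today",
    "zwei": "two",
    "tabletten": "tablets",
}


def reconstruct(target_tokens):
    """Map target tokens back to source-language words, skipping unknown ones."""
    return [BACK_GLOSSARY[t] for t in target_tokens if t in BACK_GLOSSARY]


def reconstruction_score(source_tokens, target_tokens):
    """F1 overlap between the source and the source rebuilt from the target.

    Low recall signals omitted content; low precision signals added content.
    """
    rebuilt = reconstruct(target_tokens)
    if not rebuilt or not source_tokens:
        return 0.0
    # Multiset intersection counts each word at most as often as it occurs in both.
    overlap = sum((Counter(source_tokens) & Counter(rebuilt)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(rebuilt)
    recall = overlap / len(source_tokens)
    return 2 * precision * recall / (precision + recall)


def rerank(source_tokens, candidates, weight=1.0):
    """Pick the (target_tokens, model_score) pair with the best combined score."""
    return max(
        candidates,
        key=lambda c: c[1] + weight * reconstruction_score(source_tokens, c[0]),
    )


# Made-up example: the first candidate has the higher model score but drops "two".
source = ["take", "two", "tablets", "today"]
omission = ["nehmen", "sie", "heute", "tabletten"]
complete = ["nehmen", "sie", "heute", "zwei", "tabletten"]
candidates = [(omission, -0.95), (complete, -1.0)]

best_tokens, best_model_score = rerank(source, candidates)
print(" ".join(best_tokens))
```

In this made-up case the candidate that drops the dosage has the better model score on its own, but the reconstruction term tips the choice towards the complete translation.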
