February 5, 2021

Designing and evaluating medical deep learning systems

by Oslo University Hospital

Can better design of deep learning studies lead to the faster transformation of medical practices? According to the authors of "Designing deep learning studies in cancer diagnostics," published in Nature Reviews Cancer's latest issue, the answer is yes.

"We propose several protocol items that should be defined before evaluating the external cohort," says first author Andreas Kleppe of the Institute for Cancer Diagnostics and Informatics at Oslo University Hospital.

"In this way, the evaluation becomes rigorous and more reliable. Such evaluations would make it much clearer which systems are likely to work well in clinical practice, and these systems should be further assessed in phase III randomized clinical trials."

The slow implementation of deep learning systems in clinical practice is partly a natural consequence of the time needed to evaluate and adapt systems affecting patient treatment. However, many studies assessing well-functioning systems are at high risk of bias.

According to Kleppe, even among the seemingly best studies that evaluate external cohorts, few predefine the primary analysis. Adaptations of the deep learning system, patient selection or analysis methodology can make the results presented over-optimistic.

The frequent lack of stringent evaluation on external data is of particular concern. Some systems are developed or evaluated on data that are too narrow or inappropriate for the intended medical setting. The lack of a well-established sequence of evaluation steps for converting promising prototypes into properly evaluated medical systems limits the medical utilization of deep learning systems.

Millions of adjustable parameters

Deep learning facilitates utilization of large data sets through direct learning of correlations between raw input data and target output, providing systems that may use intricate structures in high-dimensional input data to model the association with the target output accurately. Whereas supervised machine learning techniques traditionally utilized carefully selected representations of the input data to predict the target output, modern deep learning techniques use highly flexible artificial neural networks to correlate input data directly to the target outputs.

The relations learnt by such direct correlation will often be true but may sometimes be spurious phenomena exclusive to the data utilized for learning. The millions of adjustable parameters make deep neural networks capable of performing correctly in training sets even when the target outputs are randomly generated and, therefore, utterly meaningless.
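
The memorization effect can be reproduced at toy scale. The sketch below is an illustration only, not the authors' experiment: it swaps the deep network for a plain linear perceptron that has more weights (300) than training examples (30), and every name, size and random seed in it is an assumption chosen for the demonstration. The inputs are pure noise and the labels are coin flips, yet the model fits its training set perfectly while staying near chance on fresh data.

```python
import random

def make_data(rng, n_samples, n_features):
    """Gaussian noise inputs with coin-flip labels: there is nothing real to learn."""
    xs = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    ys = [rng.choice((-1, 1)) for _ in range(n_samples)]
    return xs, ys

def predict(weights, x):
    """Sign of the weighted sum; a score of exactly zero counts as -1."""
    score = sum(w * v for w, v in zip(weights, x))
    return 1 if score > 0 else -1

def train_perceptron(xs, ys, max_epochs=1000):
    """Classic perceptron rule: nudge the weights on every misclassified example."""
    weights = [0.0] * len(xs[0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            if predict(weights, x) != y:
                weights = [w + y * v for w, v in zip(weights, x)]
                mistakes += 1
        if mistakes == 0:  # every training example is now classified correctly
            break
    return weights

def accuracy(weights, xs, ys):
    return sum(predict(weights, x) == y for x, y in zip(xs, ys)) / len(ys)

rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(rng, n_samples=30, n_features=300)
test_x, test_y = make_data(rng, n_samples=2000, n_features=300)

weights = train_perceptron(train_x, train_y)
train_acc = accuracy(weights, train_x, train_y)
test_acc = accuracy(weights, test_x, test_y)
print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {test_acc:.2f}")
```

A perfect training score here says nothing about what the model has learned, which is why performance has to be measured on data that played no part in development.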

Design and evaluation challenges

The high capacity of neural networks poses severe challenges for designing and developing deep learning systems and for validating their performance in the intended medical setting. Adequate clinical performance will only be possible if the system generalizes well to subjects not included in the training data.

The design challenges include selecting appropriate training data, for example data that are representative of the target population. They also include modeling questions, such as how the variation in the training data may be artificially increased without jeopardizing the relationship between input data and target outputs.
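
One common way to add artificial variation without disturbing the link between input and target is to apply transformations that cannot change the correct answer. The following is a minimal sketch under stated assumptions: it treats an image tile as a small grid of pixel values and assumes that orientation carries no diagnostic meaning, so rotations and mirror flips leave the label valid. The function names, the 3x3 tile and the label string are hypothetical, and the article does not say which augmentations the authors recommend.

```python
import random

def rotate90(image):
    """Rotate a 2D grid of pixel values 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def augment(image, label, rng):
    """Return a randomly rotated and possibly mirrored copy; the label is untouched."""
    out = [list(row) for row in image]
    for _ in range(rng.randrange(4)):  # 0, 1, 2 or 3 quarter turns
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out, label

rng = random.Random(0)
tile = [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]
augmented = [augment(tile, "tumour", rng) for _ in range(5)]
for image, label in augmented:
    print(label, image)
```

Every augmented tile contains exactly the same pixels as the original, only rearranged, so the pairing of input and target output is preserved while the training set gains variation.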

The validation challenge includes verifying that the system generalizes well. For example, does it perform satisfactorily when evaluated on relevant patient populations at new locations and when input data are obtained using differing laboratory procedures or alternative equipment? Moreover, deep learning systems are typically developed iteratively, with repeated testing and various selection processes that may bias results. Similar selection issues have been recognized as a general concern for the medical literature for many years.
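
Checking generalization across locations can be made concrete by reporting performance separately for each site, with the decision threshold fixed before any external data are examined. The sketch below assumes a system that outputs a score between 0 and 1 and a binary reference diagnosis. The records are invented toy values for illustration, not study data, and the names `metrics_by_site` and `THRESHOLD` are hypothetical.

```python
from collections import defaultdict

def metrics_by_site(records, threshold):
    """Compute sensitivity and specificity per site at one fixed threshold.

    Each record is (site, score, has_disease).
    """
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for site, score, has_disease in records:
        predicted_positive = score >= threshold
        if has_disease:
            key = "tp" if predicted_positive else "fn"
        else:
            key = "fp" if predicted_positive else "tn"
        counts[site][key] += 1

    results = {}
    for site, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        results[site] = {
            "sensitivity": c["tp"] / positives if positives else None,
            "specificity": c["tn"] / negatives if negatives else None,
        }
    return results

# Invented toy records: (site, system score, reference diagnosis).
records = [
    ("site_A", 0.9, True), ("site_A", 0.2, False),
    ("site_A", 0.7, True), ("site_A", 0.4, False),
    ("site_B", 0.6, True), ("site_B", 0.3, True),
    ("site_B", 0.8, False), ("site_B", 0.1, False),
]
THRESHOLD = 0.5  # fixed in advance, never tuned on the external sites

report = metrics_by_site(records, THRESHOLD)
for site, metrics in sorted(report.items()):
    print(site, metrics)
```

Pooling the two toy sites would hide that the system does far worse at the second one. Keeping the threshold fixed avoids the kind of after-the-fact adaptation that can make results look over-optimistic.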

Thus, when selecting design and validation processes for diagnostic deep learning systems, one should focus on the generalization challenges while also avoiding the more classical pitfalls in data analysis.

"To achieve good performance for new patients, it is crucial to use various training data. Natural variation is always essential, but so is introducing artificial variation. These types of variation complement each other and facilitate good generalisability," says Kleppe.

More information: Andreas Kleppe et al. Designing deep learning studies in cancer diagnostics, Nature Reviews Cancer (2021). DOI: 10.1038/s41568-020-00327-9

Journal information: Nature Reviews Cancer

Provided by Oslo University Hospital

Citation: Designing and evaluating medical deep learning systems (2021, February 5) retrieved 23 April 2024 from https://medicalxpress.com/news/2021-02-medical-deep.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
