Male infertility scoring using AI-assisted image classification requiring no programming
A research group led by Dr. Hideyuki Kobayashi at Toho University Omori Medical Center in Tokyo developed an AI-assisted image classifier that provides scores for histological testis images of patients with azoospermia. The objective of Dr. Kobayashi, a urologist, was to create an easy-to-use method of pathological examination for daily clinical practice. The classifier scored testis images with an average precision of 82.6%.
Infertility affects females and males equally. In male infertility, azoospermia (the absence of sperm in semen) is a major problem that prevents a couple from having a child. For the treatment of patients with azoospermia, testicular sperm extraction (TESE) is required to obtain mature sperm. When examined, histological specimens are typically ranked with the Johnsen score on a scale of 1 to 10 based on the histopathological features of the testis.
"The Johnsen score has been widely used in urology since it was first reported 50 years ago. However, histopathological evaluation of the testis is not an easy task and takes much time due to the complexity of testicular tissue arising from the multiple, highly specialized steps in spermatogenesis. Our goal was to simplify this time-consuming step of diagnosis by taking advantage of AI technology. To do this, we chose Google's automated machine learning (AutoML) Vision, which requires no programming, to create an AI model for individual patient data sets. With AutoML Vision, clinicians with no programming skills can use deep learning in building their own models without help from data scientists," said Dr. Hideyuki Kobayashi, Associate Professor in the Department of Urology at Toho University School of Medicine (Fig. 1).
"The model we created can classify histological images of the testis without help from pathologists. I hope that our approach will enable clinicians in any field of medicine to build AI-based models which can be used in their daily clinical practice," he said.
To simplify the use of Johnsen scores in clinical practice, Dr. Kobayashi defined four labels: Johnsen score 1–3, 4–5, 6–7, and 8–10 (Fig. 2). He and his co-researchers obtained a dataset of 7155 images at magnification X400. All images were uploaded to the Google Cloud AutoML Vision platform. For the X400 magnification image dataset, the average precision (positive predictive value) of the algorithm was 82.6%, precision was 80.31%, and recall was 60.96% (Fig. 3).
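The grouping and the reported metrics can be illustrated with a short Python sketch. This is not the group's method: the study used AutoML Vision, which requires no programming. The function names and the example counts below are hypothetical and only show how a Johnsen score maps to the four labels and how precision and recall are computed from prediction counts.

```python
def johnsen_label(score: int) -> str:
    """Map a Johnsen score (1 to 10) to one of the four labels used in the study."""
    if not 1 <= score <= 10:
        raise ValueError("Johnsen score must be between 1 and 10")
    if score <= 3:
        return "1-3"
    if score <= 5:
        return "4-5"
    if score <= 7:
        return "6-7"
    return "8-10"


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    print(johnsen_label(6))
    print(precision_recall(tp=8, fp=2, fn=8))
```

Precision is the share of predicted labels that are correct, while recall is the share of true cases the model finds, which is why a model can report a precision near 80% and a recall near 61% at the same time.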
AI has become popular and is being applied in all fields of medicine. However, its use by clinicians in hospitals is still hampered by the need for help from data scientists to apply it properly. "The cloud-based machine learning framework we used is for everyone. It can become such a powerful tool in medicine that, in the near future, doctors in hospitals will be using AI-based medical image classifiers with ease in the same way they use Microsoft PowerPoint or Excel now," Dr. Kobayashi said. He added, "The most difficult part was taking images of testis pathology and it was very time consuming. Two colleagues worked very hard to obtain all the images used in the study. I really appreciate their dedicated efforts."
Dr. Kobayashi's group has described the development of an AI-based algorithm that evaluates Johnsen scores from original images (X400) with an average precision of 82.6%. This is the first report of an algorithm that can predict Johnsen scores without relying on pathologists or data science experts.