Editors Selection IGR 20-4

Clinical Examination Methods: Artificial Intelligence applications II

Minguang He

Comment by Minguang He on:

82400 Development and Validation of a Deep Learning System to Detect Glaucomatous Optic Neuropathy Using Fundus Photographs, Liu H; Li L; Wormstone IM et al., JAMA Ophthalmology, 2019; 0:


Artificial intelligence has been evolving from a computer science innovation to clinical adoption in many image-driven clinical disciplines, including ophthalmology. This study is perhaps the best example of how to develop and validate a deep learning-based artificial intelligence classifier for fundus photographs. The research team collected 241K images from 68K patients for training. The system was validated internally on 28K images from the same source, and then externally on images collected at three hospitals in China, in one population-based sample (Handan Eye Study), and at a multiethnic clinic in the US (Hamilton Glaucoma Center). What impresses me the most is that the study involved 22 board-certified ophthalmologists to grade nearly 240K images. This is a tremendous amount of effort. The investigators chose to use ResNet, a neural network architecture developed in 2015 that allows more and deeper layers to achieve more accurate classification while avoiding the problems of information loss and optimization error that arise when networks become very deep. Its drawback is its demand on computational power and memory, which could compromise its feasibility for real-world adoption. Another innovation is their human-computer interaction loop, in which manually confirmed false-positive images are used for further fine-tuning of the network. One of the interesting findings is that, during external validation, the accuracy of the CNN, trained primarily on Chinese eyes, declined from the hospital-based images collected in China to the population-based Chinese sample, and deteriorated further on the images collected at the US-based UC San Diego Hamilton Glaucoma Center (less than 90% sensitivity and specificity). This highlights the challenge of generalizing CNN classification to images or features that were not represented in the training data.
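For readers less familiar with the two metrics quoted for the external validation sets, the following is a minimal sketch of how sensitivity and specificity could be computed separately for each set. The function name, the set names, and the labels are all hypothetical toy values chosen for illustration; none of them come from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Labels: 1 = referable glaucomatous optic neuropathy, 0 = not referable.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT study results): reference grades and
# model predictions for two made-up external validation sets.
toy_sets = {
    "hypothetical set A": ([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 1, 0, 0, 0, 1]),
    "hypothetical set B": ([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 1, 1]),
}

for name, (y_true, y_pred) in toy_sets.items():
    sens, spec = sensitivity_specificity(y_true, y_pred)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Reporting the two numbers per dataset, rather than pooled, is what makes a drop in performance on an unfamiliar population visible.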
