An Update on Artificial Intelligence for Detecting Diabetic Eye Disease: All Hype or the New Reality?

Diabetic retinopathy concept
Deep learning has shown substantial promise to date in automated image analysis towards the accurate diagnosis of diabetic retinopathy.

Diabetes is a growing epidemic, both domestically and internationally, that threatens to strain an already overburdened healthcare system. More than 1 in 5 healthcare dollars in the United States is spent on diabetes-related care, and the total annual cost attributable to the disease exceeds a quarter of a trillion dollars.1 Current estimates show that over 30 million Americans have diabetes and over 400 million people are affected worldwide.2,3 Both figures continue to rise at rates that outpace most predictive models.

The most common microvascular complication of the disease is diabetic retinopathy (DR), including diabetic macular edema (DME), which is the leading cause of blindness in the working-age population globally.4 To prevent potentially permanent vision loss, early recognition and timely referral for treatment of vision-threatening complications are paramount. Progression of retinopathy can be arrested, and in some cases reversed, by intensive glycemic management and optimization of associated systemic risk factors (eg, hypertension, hypercholesterolemia), coupled with dietary and lifestyle modifications. For many patients, though, these measures alone are not sufficient, and a retina specialist must intervene locally to save vision, using intravitreal injection pharmacotherapy (anti-vascular endothelial growth factor [VEGF] agents and/or corticosteroids), laser photocoagulation, and/or vitrectomy surgery.

Given the benefits of diagnosing DR at an earlier stage, the American Academy of Ophthalmology (AAO) and American Diabetes Association (ADA) both recommend routine screening for patients with diabetes. This often takes the form of a dilated funduscopic examination at least once a year, but may also include ancillary imaging, such as color fundus photography, fluorescein angiography, and optical coherence tomography, to look for more subtle signs of disease or to allow for more nuanced staging. Such imaging modalities lend themselves to automated analysis by a computer algorithm which, combined with human oversight, could enable the deployment of large-scale DR screening initiatives.

Recent advances in artificial intelligence have taken the healthcare sector by storm, as evidenced by the emergence of numerous commercial entities focused on specific subspecialty niches, such as improving the pathologic detection of cancer and the interpretation of radiology images.5,6 Within ophthalmology, artificial intelligence is already augmenting diagnostic imaging capabilities, which may soon lead to the deployment of cost-efficient telemedicine screening programs worldwide. Recent studies have shown that deep learning models can aid in the detection and diagnosis of diseases of the posterior segment of the eye, most notably DR, with high reported accuracy.7-15

Artificial Intelligence, Machine Learning, and Deep Learning Defined

Owing to their ever-increasing popularity and integration into popular culture, the terms artificial intelligence, machine learning, and deep learning are frequently used interchangeably; however, it is important to distinguish among the 3. Broadly speaking, they can be viewed as concentric circles: artificial intelligence is the largest circle, machine learning is a smaller circle within it, and deep learning is the smallest circle, within machine learning (Figure 1). More specifically, artificial intelligence is the ability of computer systems to perform complex tasks that normally require human-like intelligence, such as visual processing, speech recognition, or decision making. Machine learning refers to computer programs that improve their own decision making by learning from the data provided to them, without being given explicit rules.

Deep learning is an increasingly popular and powerful form of machine learning, composed of algorithms that use a cascade of multilayered artificial neural networks (“deep” referring to the number of layers) to independently perform feature extraction from data.16,17 Each successive layer in the network uses the output from the previous layer as input, with the final layer producing the diagnostic output. Deep learning can be regarded as an extension of conventional artificial neural networks that stacks many more layers. Learning in this format can be classified as either supervised (classification-based) or unsupervised (pattern analysis-based). The latter represents one of the more fascinating aspects of deep learning, where large datasets are analyzed to discover underlying patterns without the need for feature engineering.
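To make the layer-by-layer flow described above concrete, the following is a minimal sketch in Python of a forward pass through a tiny 2-layer network. The weights, the 3 input values, and the 2 class names are hypothetical and hand-picked for illustration; a real DR model has many more layers, operates on image pixels, and learns its weights from data.

```python
import math

def relu(values):
    """Hidden-layer activation: negative values are set to zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Convert raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of all inputs plus a bias per output."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(features, layers):
    """Feed each layer's output into the next; the final layer yields class probabilities."""
    activation = features
    for index, (weights, biases) in enumerate(layers):
        scores = dense(activation, weights, biases)
        is_last = index == len(layers) - 1
        activation = softmax(scores) if is_last else relu(scores)
    return activation

# Hypothetical weights (assumption, not taken from any published model).
LAYERS = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # 3 inputs -> 2 hidden units
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),            # 2 hidden units -> 2 classes
]
CLASSES = ["no referable DR", "referable DR"]

probabilities = forward([0.2, 0.7, 0.1], LAYERS)
prediction = CLASSES[probabilities.index(max(probabilities))]
print(prediction, [round(p, 3) for p in probabilities])
```

The only point of the sketch is the data flow: the output of `dense` for one layer becomes the input of the next, and only the last layer is read as a diagnosis.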

Figure 1. Artificial intelligence, machine learning, and deep learning depicted as concentric circles.

Clinically speaking, researchers do not hand-code instructions telling an algorithm what a microaneurysm, hemorrhage, or neovascular frond looks like on a diabetic fundus photograph. Instead, they input images labeled, for example, as “severe nonproliferative DR,” and with enough labeled data the computer eventually learns to recognize that grade. To train itself, a deep learning neural network needs a sufficiently large and varied dataset. In the context of ophthalmology, while it is possible that the algorithm independently appreciates the same classical features of DR, it is also possible that it has identified its own patterns of disease beyond the scope of human interpretation. This opacity is referred to as the “black box” of deep learning. Elucidating what exactly the algorithm interprets is the subject of ongoing research.
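The idea of learning from labeled examples rather than hand-coded rules can be illustrated with a deliberately simplified sketch. Here a one-feature logistic classifier stands in for a deep network, and the 6 labeled data points are hypothetical; none of this reproduces the training procedure of any system discussed in this article. No threshold is written into the code: the decision boundary emerges from the labels through gradient descent.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, learning_rate=0.5):
    """Fit a weight and bias by gradient descent on the labeled examples."""
    weight, bias = 0.0, 0.0
    count = len(examples)
    for _ in range(epochs):
        grad_weight = grad_bias = 0.0
        for feature, label in examples:
            error = sigmoid(weight * feature + bias) - label
            grad_weight += error * feature
            grad_bias += error
        # Move the parameters a small step against the average gradient.
        weight -= learning_rate * grad_weight / count
        bias -= learning_rate * grad_bias / count
    return weight, bias

def predict(feature, weight, bias):
    """Return 1 (disease) when the predicted probability is at least 0.5."""
    return 1 if sigmoid(weight * feature + bias) >= 0.5 else 0

# Hypothetical data: one standardized image feature per photo; label 1 = disease present.
LABELED = [(-1.5, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (1.5, 1)]

weight, bias = train(LABELED)
predictions = [predict(feature, weight, bias) for feature, _ in LABELED]
print(predictions)
```

A deep network differs in scale rather than in principle: it adjusts millions of weights across many layers, which is also why the features it ends up relying on are hard to inspect.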

Machine Learning Models for Detection of Diabetic Retinopathy

In recent years, numerous research groups have developed deep learning models capable of diagnosing DR (Table 1). In April 2018, the US Food and Drug Administration (FDA) granted marketing approval to IDx (Coralville, Iowa) for the first artificial intelligence-based medical device to detect referable DR from color fundus photographs obtained from a nonmydriatic fundus camera (NW400, Topcon Medical Systems, Oakland, New Jersey). The cloud-based software, IDx-DR, was granted Breakthrough Device designation by the FDA and is the first approved instrument to provide a screening decision without clinician input.12

IDx-DR’s approval was based on a clinical study that assessed the software’s performance on retinal images from 900 patients with diabetes at 10 different primary care sites.12 The platform’s sensitivity and specificity in detecting greater than mild DR (termed referable DR) were 87.2% and 90.7%, respectively. Notably, existing staff at the primary care sites received a one-time, standardized 4-hour training program on operating the system, after which they were able to successfully image patients and transfer information to the platform 96% of the time. This supports the company’s claim that the system is easy to use, with a reasonable learning curve for the clinical staff who acquire the images.
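Sensitivity and specificity, as reported above, come from comparing the algorithm's output against a reference standard for each patient: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), where TP, FN, TN, and FP are the counts of true positives, false negatives, true negatives, and false positives. The sketch below computes both in Python. The patient counts are hypothetical and chosen for round numbers; they are not the pivotal trial data.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) for paired binary labels (1 = referable DR)."""
    pairs = list(zip(reference, predicted))
    true_pos = sum(1 for ref, pred in pairs if ref == 1 and pred == 1)
    false_neg = sum(1 for ref, pred in pairs if ref == 1 and pred == 0)
    true_neg = sum(1 for ref, pred in pairs if ref == 0 and pred == 0)
    false_pos = sum(1 for ref, pred in pairs if ref == 0 and pred == 1)
    sensitivity = true_pos / (true_pos + false_neg)  # diseased eyes correctly flagged
    specificity = true_neg / (true_neg + false_pos)  # healthy eyes correctly cleared
    return sensitivity, specificity

# Hypothetical cohort: 20 patients with referable DR (18 flagged, 2 missed)
# and 80 without (72 cleared, 8 flagged in error).
reference = [1] * 20 + [0] * 80
predicted = [1] * 18 + [0] * 2 + [0] * 72 + [1] * 8

sens, spec = sensitivity_specificity(reference, predicted)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

In this made-up cohort both values work out to 90%. For a screening tool, sensitivity is the more safety-critical figure, since a false negative is a patient with referable disease who is not sent on for care.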

IDx is one of several commercial players in a rapidly growing space. Other automated retinal image analysis systems include iGradingM (Medalytix Group Ltd, Manchester, United Kingdom), Retmarker (Retmarker SA, Taveiro, Portugal), and EyeArt (Eyenuk, Woodland Hills, California). Because these platforms use different proprietary software and training sets, direct comparison between studies is difficult, and reported accuracy statistics vary among publications.

Table 1: Relevant Deep Learning Studies for Detection of Diabetic Retinopathy

| Date Published | Author | Title | Dataset Images for Training and Testing/Validation | Reported Outcome |
|---|---|---|---|---|
| 2016 | Abràmoff7 | Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning | Fundus photos; 25,000 training; 874 validation | Sensitivity: 96.8%; Specificity: 87.0%; AUC: 0.980 |
| 2016 | Gulshan8 | Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs | Fundus photos; 128,175 training; validation: 9963 (EyePACS), 1748 (Messidor) | EyePACS: Sensitivity: 97.5%; Specificity: 93.4%; AUC: 0.991. Messidor: Sensitivity: 96.1%; Specificity: 93.9%; AUC: 0.990 |
| 2017 | Gargeya9 | Automated identification of diabetic retinopathy using deep learning | Fundus photos; 75,137 training; 15,000 validation (mixed sources) | Sensitivity: 94%; Specificity: 98%; AUC: 0.94-0.97 |
| 2017 | Ting10 | Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes | Fundus photos; 76,370 training; validation: 71,896 images of 14,880 patients | Sensitivity: 90.5%; Specificity: 91.6%; AUC: 0.936 |
| 2018 | Ramachandran11 | Diabetic retinopathy screening using deep neural network | Fundus photos; >100,000 training; validation: 485 (Otago), 1200 (Messidor) | Otago: Sensitivity: 84.6%; Specificity: 79.7%; AUC: 0.901. Messidor: Sensitivity: 96.0%; Specificity: 90.0%; AUC: 0.980 |
| 2018 | Abràmoff12 | Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices | Fundus photos; 900 participants for validation | Sensitivity: 87.2%; Specificity: 90.7% |
| 2019 | Bhaskaranand13 | The value of automated diabetic retinopathy screening with the EyeArt system: a study of more than 100,000 consecutive encounters from people with diabetes | Fundus photos; 850,908 images from 101,710 consecutive patient visits | Sensitivity: 91.3%; Specificity: 91.1%; AUC: 0.965 |
| 2019 | Gulshan14 | Performance of a deep-learning algorithm vs manual grading for detecting diabetic retinopathy in India | Fundus photos; 103,634 training; 5762 images from 3049 patients at 2 tertiary sites for validation | Aravind Eye Hospital: Sensitivity: 88.9%; Specificity: 92.2%; AUC: 0.963. Sankara Nethralaya Hospital: Sensitivity: 92.1%; Specificity: 95.2%; AUC: 0.980 |

*AUC: area under the receiver operating characteristic curve
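For readers less familiar with the AUC, one way to interpret it is as the probability that a randomly chosen diseased eye receives a higher algorithm score than a randomly chosen healthy eye: 0.5 corresponds to chance and 1.0 to perfect ranking. The Python sketch below computes it by direct pairwise comparison. The 4 scores and labels are hypothetical, and the studies in Table 1 may have used other estimation methods.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # diseased eye ranked above healthy eye
            elif pos == neg:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical algorithm scores (higher = more suspicious) and reference labels.
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]

example_auc = auc(scores, labels)
print(example_auc)
```

Unlike sensitivity and specificity, the AUC does not depend on any single referral threshold, which is one reason it is commonly reported alongside them.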