In this study, Yoon et al. evaluated factors associated with faster retinal nerve fiber layer (RNFL) thinning on optical coherence tomography in patients with primary open-angle glaucoma (POAG), using electronic medical record data. They employed decision tree models, random forest models, and models based on permutation methods, interpreted with the Shapley additive explanations (SHAP) method. In addition to detecting previously known ophthalmic risk factors, they identified several systemic risk factors.
In the decision tree model, a higher lymphocyte ratio (> 34.65%) was the most important systemic variable discriminating faster from slower RNFL thinning, and higher mean corpuscular hemoglobin (> 32.05 pg) and alkaline phosphatase (> 88.0 IU/L) concentrations were the distinguishing factors in eyes with lymphocyte ratios > 34.65% and ≤ 34.65%, respectively. In the random forest model, a higher lymphocyte ratio and a higher platelet count were the strongest systemic factors associated with faster RNFL thinning. Previous studies have identified altered immunity as a risk factor for POAG. Song et al. recently reported that a genetic predisposition to higher lymphocyte count is associated with glaucoma.1 However, previous studies investigating the neutrophil-to-lymphocyte ratio in POAG have reported conflicting results,2-5 and a recently reported meta-analysis found no significant difference, while also suggesting that results differ among patients of different ethnicities.6 Further large-scale prospective longitudinal studies with diverse patients will be needed to confirm these findings.
A strength of machine learning models over conventional linear regression models is that they can capture potential non-linear relationships and interactions among features. However, one challenge is their limited interpretability. Recently, there have been numerous efforts toward more explainable artificial intelligence,7 such as the SHAP method used in this study.8,9 Overall, this study demonstrated how machine learning approaches applied to large-scale data can detect risk factors associated with multifactorial diseases such as POAG.
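
To make the idea behind Shapley-based attribution concrete, the following is a minimal sketch and not the pipeline used by Yoon et al. It computes exact Shapley values by brute force for a hypothetical three-feature toy model that contains a threshold effect and an interaction, the two kinds of structure a linear model would miss. The function names (`toy_model`, `shapley_values`), the model itself, and the choice to represent an "absent" feature by substituting a baseline value are all assumptions made for illustration. The SHAP library relies on more efficient estimators than this enumeration.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical risk score (illustration only, not a clinical model)."""
    a, b, c = x
    score = 0.0
    if a > 0.5:            # threshold (non-linear) effect of feature a
        score += 1.0
    score += 2.0 * b * c   # interaction between features b and c
    return score


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Assumption: a feature outside the coalition is replaced by its
    baseline value. The cost grows exponentially with the feature count.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


if __name__ == "__main__":
    x = [1.0, 1.0, 1.0]
    baseline = [0.0, 0.0, 0.0]
    print(shapley_values(toy_model, x, baseline))
```

In this toy case the attributions sum to the difference between the prediction and the baseline prediction, and the contribution of the interaction term is divided equally between the two interacting features. This additivity is what allows a non-linear model to be summarized feature by feature.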