A machine learning model developed to predict thyrotoxic atrial fibrillation (AF) in people with hyperthyroidism may be the first analytical tool to assess a patient’s risk of developing the condition. The discovery could aid in research and disease management, according to a study published in BMC Endocrine Disorders.

Researchers sought to build a machine learning prediction model for AF and to rank predictors in order of importance using machine learning techniques. They conducted a retrospective, observational study of 420 patients with overt hyperthyroidism, of whom 127 had thyrotoxic AF.

Participants underwent outpatient or inpatient hyperthyroidism treatment at 2 centers in St. Petersburg, Russia, between 2000 and 2019. Adult participants with a history of overt hyperthyroidism associated with Graves disease, toxic adenoma, or multinodular toxic goiter were eligible for study inclusion.


Thirty-six variables, classified into 6 categories, were considered in creating the prediction model: demographic data, hyperthyroidism characteristics, cardiovascular status before and during hyperthyroidism, metabolic parameters and blood tests, smoking status, and heart rate-reducing therapy. The researchers also evaluated thyroid status and other laboratory measurements at the time of hyperthyroidism diagnosis, before thyrostatic drugs were administered.

After comparing the variables through classical methods, the researchers trained several intermediate prediction models with 8 machine learning algorithms and used them to identify the most important variables for the final model. Ultimately, the 10 most important and clinically feasible features were selected: age, sex, hyperthyroidism duration, number of relapses, heart rate, presence of arterial hypertension, rhythm disturbances, premature atrial contraction (PAC), premature ventricular contraction (PVC), supraventricular tachycardia, nonsustained ventricular tachycardia, wandering atrial pacemaker, and heart rate-reducing therapy.

Data were processed and then randomly divided into 2 parts: 70% were used for estimation of the model (training), and 30% were used for validation (testing; n=294 and 126, respectively). A 5-fold cross-validation was performed to estimate each model’s performance. Three interpretability techniques were then applied to represent the prediction model graphically.
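The split and cross-validation described above can be illustrated with a minimal sketch. It assumes a simple random shuffle and interleaved folds; the function names `split_indices` and `k_fold` and the seed are hypothetical, and the study's actual procedure may differ in its details.

```python
import random

def split_indices(n, train_frac=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into training and test lists."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = round(n * train_frac)
    return indices[:n_train], indices[n_train:]

def k_fold(indices, k=5):
    """Yield (fit, held_out) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield fit, held_out

# 420 patients, as in the study; a 70/30 split gives 294 and 126.
train_idx, test_idx = split_indices(420)
folds = list(k_fold(train_idx))
print(len(train_idx), len(test_idx))
print([len(held_out) for _, held_out in folds])
```

In each of the 5 rounds, a model would be fit on 4 folds of the training data and scored on the remaining fold, leaving the 126 test patients untouched until the final evaluation.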

The total cohort was composed of 79.3% women; mean age at hyperthyroidism onset was 44.3 ± 12.1 years. Ninety-four percent of patients had Graves disease. In the majority of cases, thyroid-stimulating hormone was lower than the detection limit of 0.01 µIU/L. Results of a lipid panel assessment showed that total cholesterol, low-density lipoprotein, and mean triglyceride level were on target; mean high-density lipoprotein level for men and women was at the lower limit of the target range.

Compared with patients without thyrotoxic AF, those with thyrotoxic AF were more often men and smokers and more often had nonimmune thyrotoxicosis, a prolonged duration of subclinical hyperthyroidism, and multiple relapses. Arterial hypertension and congestive heart failure were also more common in this group.

No association of thyrotoxic AF frequency with heart rate was noted; median heart rate for patients with a thyrotoxic AF diagnosis was 96 bpm vs 92 bpm for patients without thyrotoxic AF, but this difference was not statistically significant.

Among the 8 machine learning methods evaluated, the XGB classifier achieved the highest accuracy, with the best model validated on the test set. Performance metrics were 84% accuracy, 82% precision, and 77% recall. The final XGB model achieved a high predictive capacity, with an area under the receiver operating characteristic curve of 0.93.
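For readers less familiar with these metrics, the sketch below shows how accuracy, precision, and recall are derived from the counts of a binary confusion matrix. The counts used here are hypothetical round numbers chosen for illustration only; they are not the study's data and do not reproduce its reported values.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # share of all patients classified correctly
        "accuracy": (tp + tn) / total,
        # share of predicted AF cases that truly had AF
        "precision": tp / (tp + fp),
        # share of true AF cases that the model detected
        "recall": tp / (tp + fn),
    }

# Hypothetical counts (NOT from the study): 30 true positives,
# 10 false positives, 10 false negatives, 50 true negatives.
metrics = classification_metrics(tp=30, fp=10, fn=10, tn=50)
print(metrics)
```

Precision and recall both focus on the positive class (here, patients with thyrotoxic AF), which matters when one class is smaller than the other, as in this cohort.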

Three interpretability techniques for the thyrotoxic AF prediction model were then applied: feature importance, SHapley Additive exPlanations (SHAP), and partial dependence plot.

In the feature importance method, features other than heart rhythm disorder during hyperthyroidism were the most important, followed by PAC and PVC during hyperthyroidism. Hyperthyroid relapse was the least significant feature in the SHAP method, where PAC during hyperthyroidism contributed most to thyrotoxic AF prediction. Advanced age and long hyperthyroidism duration also had the highest positive impact on thyrotoxic AF risk, while short hyperthyroidism duration, the absence of PAC, and low heart rate had the highest negative impact, reducing the predicted risk.

In the partial dependence plot method, the researchers examined how age and hyperthyroidism duration changed the predicted probability of thyrotoxic AF when other feature values were held fixed. For patients older than 33 years with a hyperthyroidism duration longer than 20 months, the predicted risk of developing thyrotoxic AF exceeded 0.5. The minimal risk value was 0.16 for patients younger than 20 years with a short period of hyperthyroidism, while the maximal risk value was 0.7 for patients older than 60 years with more than 40 months of hyperthyroidism.
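The idea behind a partial dependence plot can be sketched in a few lines: one feature is set to a chosen value for every patient, the model's predictions are averaged, and the process is repeated across a grid of values. The example below is a toy illustration under stated assumptions. The `toy_risk` function and its coefficients are invented stand-ins, not the study's XGB model, and the three patient rows are made up.

```python
import math

def toy_risk(row):
    """Hypothetical stand-in model: logistic risk from age and duration."""
    z = -3.0 + 0.04 * row["age"] + 0.03 * row["duration_months"]
    return 1.0 / (1.0 + math.exp(-z))

def partial_dependence(model, rows, feature, grid):
    """Average model output over all rows with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        # copy each row so the original data are left unchanged
        preds = [model({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve

# Made-up patients for illustration only.
rows = [
    {"age": 25, "duration_months": 6},
    {"age": 45, "duration_months": 24},
    {"age": 65, "duration_months": 48},
]
pd_age = partial_dependence(toy_risk, rows, "age", [20, 40, 60])
for value, avg_risk in pd_age:
    print(value, round(avg_risk, 3))
```

Plotting the averaged risk against the grid values gives the partial dependence curve for that feature; varying two features at once, such as age and hyperthyroidism duration, gives a two-dimensional version.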

The top 4 features in both the feature importance and SHAP methods were hyperthyroidism duration, PAC, PVC, and heart rate during hyperthyroidism. When creating a list of the 5 most important risk factors for thyrotoxic AF, the researchers considered all results from both methods; this list included the top 4 features with the addition of age.

Study limitations included the retrospective nature of the research, the small sample size for a machine learning study, the potential for model accuracy to change when tested across cohorts, and the need for future validation. In addition, 3 input variables (PAC, PVC, and rhythm disorders) require electrocardiographic results.

“Further studies have to confirm these new [thyrotoxic AF] risk factors, as well as validate the usefulness and appropriateness of our model in independent cohorts,” the researchers concluded. “[This] study could serve as a basis for further research focused on [thyrotoxic] AF prediction improvement and facilitation of thyrotoxic patients’ management. Our results could be considered in the development of [thyrotoxic AF] risk scales, introduction of which into clinical practice has a potential to reduce [thyrotoxic] AF incidence.”


Ponomartseva DA, Derevitskii IV, Kovalchuk SV, Babenko AY. Prediction model for thyrotoxic atrial fibrillation: a retrospective study. BMC Endocr Disord. 2021;21(1):150. doi: 10.1186/s12902-021-00809-3