A Systematic Comparison of Horizontal Federated Learning Algorithm Based on Random Forests in a Medical Setting

Andrew Cheng; Jingqing Zhang; Atri Sharma; Vibhor Gupta; Yike Guo

doi:10.1007/s11633-023-1489-6

Andrew Cheng, Jingqing Zhang, Atri Sharma, Vibhor Gupta, Yike Guo. A Systematic Comparison of Horizontal Federated Learning Algorithm Based on Random Forests in a Medical Setting[J]. Machine Intelligence Research. DOI: 10.1007/s11633-023-1489-6


Abstract

The medical industry generates vast amounts of data suitable for machine learning during patient-clinician interactions in hospitals. However, because of data protection regulations such as the General Data Protection Regulation (GDPR), patient data cannot be shared freely across institutions. In these cases, federated learning (FL) is a viable option in which a global model learns from multiple data sites without moving the data. In this paper, we focused on random forests (RFs) for their effectiveness in classification tasks and widespread use throughout the medical industry, and compared two popular federated random forest aggregation algorithms on horizontally partitioned data. We first provided necessary background information on federated learning, the advantages of random forests in a medical context, and the two aggregation algorithms. A series of extensive experiments using four public binary medical datasets (an excerpt of MIMIC-III, the Pima Indians diabetes dataset from Kaggle, and the diabetic retinopathy and heart failure datasets from the UCI Machine Learning Repository) was then performed to systematically compare the two algorithms on equal-sized, unequal-sized, and class-imbalanced clients. A follow-up investigation on the effects of more clients was also conducted. Finally, we empirically analyzed the advantages of federated learning and concluded that the weighted merge algorithm produces models with, on average, a 1.903% higher F1 score and a 1.406% higher AUCROC value.
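
The abstract names a weighted merge algorithm but does not spell out its steps. As a rough illustration only, one plausible reading is that each client contributes trees to the global forest in proportion to its share of the training samples, which matters most in the unequal-sized client setting. The sketch below shows that allocation step under this assumption. The function name `allocate_trees` and its parameters are hypothetical and are not taken from the paper, whose actual algorithm may differ.

```python
from typing import List


def allocate_trees(client_sizes: List[int], total_trees: int) -> List[int]:
    """Split a global tree budget across clients in proportion to sample counts.

    Hypothetical helper (not from the paper). Uses the largest-remainder
    method with integer arithmetic so the result sums exactly to total_trees.
    """
    if total_trees < 0:
        raise ValueError("total_trees must be non-negative")
    if not client_sizes or any(n < 0 for n in client_sizes):
        raise ValueError("client_sizes must be non-empty and non-negative")
    total_samples = sum(client_sizes)
    if total_samples == 0:
        raise ValueError("at least one client must hold data")

    # Whole-tree share for each client, rounded down.
    shares = [n * total_trees // total_samples for n in client_sizes]
    # Fractional part that was dropped, kept as an exact integer remainder.
    remainders = [n * total_trees % total_samples for n in client_sizes]

    # Hand the leftover trees to the clients with the largest remainders;
    # ties are broken by client index so the result is deterministic.
    leftover = total_trees - sum(shares)
    order = sorted(range(len(client_sizes)), key=lambda i: (-remainders[i], i))
    for i in order[:leftover]:
        shares[i] += 1
    return shares


if __name__ == "__main__":
    # Three unequal-sized clients sharing a 100-tree global forest.
    print(allocate_trees([500, 300, 200], 100))
```

In a full pipeline, each client would train a local forest on its own data and send only the allotted number of trees to the server, so that raw patient records never leave the site.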
