Article

AI-based detection and classification of distal radius fractures using low-effort data labeling: evaluation of applicability and effect of training set size

Journal

EUROPEAN RADIOLOGY
Volume 31, Issue 9, Pages 6816-6824

Publisher

SPRINGER
DOI: 10.1007/s00330-021-07811-2

Keywords

Radiography; Radius fractures; Deep learning

Funding

  1. Universität Basel (Universitätsbibliothek Basel)
  2. Gottfried und Julia Bangerter-Rhyner-Stiftung

The study evaluated the performance of a deep convolutional neural network in detecting and classifying distal radius fractures, metal, and casts on radiographs using labels based on radiology reports. The results showed that the models trained on a DCNN with report-based labels are suitable as a secondary reading tool for detecting distal radius fractures, while models for fracture classification are not yet ready for clinical use. Additionally, larger training sets led to better models in all categories except joint affection.
Objectives: To evaluate the performance of a deep convolutional neural network (DCNN) in detecting and classifying distal radius fractures, metal, and cast on radiographs using labels based on radiology reports. The secondary aim was to evaluate the effect of the training set size on the algorithm's performance.

Methods: A total of 15,775 frontal and lateral radiographs, corresponding radiology reports, and a ResNet18 DCNN were used. Fracture detection and classification models were developed per view and merged. Incrementally sized subsets served to evaluate effects of the training set size. Two musculoskeletal radiologists set the standard of reference on radiographs (test set A). A subset (B) was rated by three radiology residents. For a per-study-based comparison with the radiology residents, the results of the best models were merged. Statistics used were ROC and AUC, Youden's J statistic (J), and Spearman's correlation coefficient (rho).

Results: The models' AUC/J on (A) for metal and cast were 0.99/0.98 and 1.0/1.0. The models' and residents' AUC/J on (B) were similar on fracture (0.98/0.91; 0.98/0.92) and multiple fragments (0.85/0.58; 0.91/0.70). Training set size and AUC correlated on metal (rho = 0.740), cast (rho = 0.722), fracture (frontal rho = 0.947, lateral rho = 0.946), multiple fragments (frontal rho = 0.856), and fragment displacement (frontal rho = 0.595).

Conclusions: The models trained on a DCNN with report-based labels to detect distal radius fractures on radiographs are suitable to aid as a secondary reading tool; models for fracture classification are not ready for clinical use. Bigger training sets lead to better models in all categories except joint affection.
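The statistics named in the abstract (ROC/AUC, Youden's J, and Spearman's rho) can be illustrated with a minimal sketch. This is not the authors' code: the function names, labels, scores, training set sizes, and AUC values below are all hypothetical toy data, chosen only to show how each quantity is typically computed.

```python
from statistics import mean

def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score threshold, starting at (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def youden_j(points):
    """Youden's J: maximum of sensitivity + specificity - 1, i.e. TPR - FPR."""
    return max(tpr - fpr for fpr, tpr in points)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy data (NOT from the study): 1 = fracture, 0 = no fracture.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.1]
points = roc_points(labels, scores)
toy_auc = auc(points)
toy_j = youden_j(points)

# Hypothetical training set sizes and the AUC each might yield (NOT from the study).
train_sizes = [500, 1000, 2000, 4000, 8000]
aucs = [0.80, 0.86, 0.85, 0.91, 0.95]
toy_rho = spearman_rho(train_sizes, aucs)

print(f"AUC={toy_auc:.4f}  J={toy_j:.2f}  rho={toy_rho:.2f}")
```

On this toy input the sketch gives an AUC of 0.9375, a J of 0.75, and a rho of 0.9. A rho close to 1, as reported for fracture detection in the abstract, means that AUC rose almost monotonically with training set size.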
