A CNN model significantly improved orthopaedic surgeons' detection of pelvic, rib and spine fractures on CT scans and shortened their time to diagnosis
The use of a convolutional neural network (CNN) deep learning model significantly improved orthopaedic surgeons' diagnostic accuracy in detecting fractures and reduced the time taken to make the diagnosis, according to a study by Japanese researchers.
One of the main causes of diagnostic errors in the emergency department is the failure to correctly interpret radiographs, and the majority of diagnoses missed on radiographs are fractures. Moreover, in patients with multiple traumas, whole-body CT scans are generally the preferred imaging modality, but one study has suggested that, on a per-scan basis, the rate of missed injuries was 12.9%, of which 2.5% were clinically significant. The use of a CNN model for automatic detection of rib fractures could assist radiologists by improving diagnostic efficiency and reducing both diagnosis time and workload. Although there is some evidence that a CNN model can detect individual fractures, it remains uncertain whether such models could help to reduce missed fractures across a wide range of sites such as the spine, ribs and pelvis.
For the present study, the Japanese team examined whether a fracture detection algorithm could be of assistance to orthopaedic surgeons and compared their diagnostic accuracy with and without the CNN model. The team retrospectively reviewed patients who were diagnosed with a pelvic, rib or spine fracture and who had undergone a CT scan within 7 days of their trauma. The CNN model was trained and its performance assessed in terms of sensitivity, precision and the F1-score (the harmonic mean of sensitivity and precision, which gives a single summary measure of the model's accuracy).
CNN model and orthopaedic surgeons' diagnostic performance
The training and validation set included 181 patients with a mean age of 54.3 years (35.3% female). The testing set included 19 patients (mean age 51.2 years, 31.5% female) with 2,447 CT images and an average of 129 images per patient. The prevalence of pelvic, rib and spine fractures among the images was 5.8%, 3.6% and 5.5% respectively.
The CNN model itself had an overall mean sensitivity of 78.6%, a precision of 64.8% and an F1-score of 71.1%. Performance was also high for the individual fracture regions, e.g., pelvic (83.9%), rib (71.3%) and spine (78%).
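To illustrate how the F1-score relates to the other two metrics, the short sketch below computes the harmonic mean of the reported overall sensitivity and precision. The function name f1_score is illustrative and does not come from the study. Using the rounded figures quoted here gives a value of roughly 71.0%, marginally below the reported 71.1%; the small gap may reflect rounding or the way the study averaged its results, which is an assumption rather than something the article states.

```python
def f1_score(sensitivity: float, precision: float) -> float:
    """Return the harmonic mean of sensitivity (recall) and precision."""
    if sensitivity + precision == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * sensitivity * precision / (sensitivity + precision)


# Overall figures as quoted in the article (already rounded).
overall_f1 = f1_score(sensitivity=0.786, precision=0.648)
print(f"F1 from rounded overall figures: {overall_f1:.3f}")
```

Because the harmonic mean is pulled towards the lower of its two inputs, the F1-score sits closer to the precision of 64.8% than a simple average would.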
Three orthopaedic surgeons with 3, 3 and years of experience respectively, reviewed the CT scans both with and without assistance from the CNN model.
For the detection of fractures on a whole-body CT scan, the mean sensitivity for the first surgeon was 69.7% alone but increased to 82.9% with the CNN model (p < 0.0001), and there were similar and significant differences for the other two surgeons. The model also improved the diagnostic accuracy for each of the three scanned areas.
When the researchers examined the time to diagnose a fracture, i.e., the time spent reading and interpreting the scan, the first surgeon took 278.4 seconds alone, which fell to 162.3 seconds with assistance from the CNN model (p < 0.0001). Once again, diagnostic times were significantly shorter for the other two surgeons.
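The size of these changes for the first surgeon can be worked out directly from the figures quoted above. The sketch below is simple arithmetic on the article's numbers, with the helper name relative_change chosen for illustration only: sensitivity rose by about 13.2 percentage points, and reading time fell by about 116 seconds, or roughly 42%.

```python
def relative_change(before: float, after: float) -> float:
    """Return the fractional change from `before` to `after` (negative = decrease)."""
    return (after - before) / before


# First surgeon's figures as quoted in the article.
sens_gain_points = 82.9 - 69.7               # gain in percentage points
time_saved_s = 278.4 - 162.3                 # seconds saved per diagnosis
time_change = relative_change(278.4, 162.3)  # fractional change in reading time

print(f"Sensitivity gain: {sens_gain_points:.1f} percentage points")
print(f"Time saved: {time_saved_s:.1f} s ({time_change:.1%} change)")
```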
The authors concluded that their CNN model provided good sensitivity for the detection of pelvic, rib and spinal fractures and that, when it was used by orthopaedic surgeons, both their sensitivity and their time to make a fracture diagnosis improved.
Inoue T et al. Automated fracture screening using an object detection algorithm on whole-body trauma computed tomography. Sci Rep 2022.