The use of a deep-learning model has been shown to increase radiologists’ classification accuracy for chest X-ray findings.
The use of X-rays allows for the investigation of various chest pathologies such as infections, chest trauma and malignancy. Nevertheless, errors and discrepancies in radiology still occur, with an estimated day-to-day rate in the region of 3–5%. In addition, as much as 90% of missed diagnoses of lung cancer occur on chest radiographs. Consequently, the use of artificial intelligence, specifically deep-learning algorithms, to assist with the interpretation of X-rays has the potential to improve diagnostic accuracy. Some data have shown that artificial intelligence systems can provide interpretative accuracy similar to that of radiology residents, and machine learning is increasingly recognised as a powerful technique for recognising patterns on medical images. Furthermore, evidence suggests that deep-learning systems show great potential for the detection of lesions and pattern classification on chest radiographs.
Whether a deep-learning model could be trained to detect a wide range of clinically relevant findings was the subject of a retrospective study by a team from the Department of Radiology, Alfred Health, Melbourne, Australia. For the purposes of the study, the deep-learning system was first trained to recognise a range of images. Next, a group of radiologists interpreted a number of chest X-rays without access to the computerised tool. After a 3-month washout period, the same radiologists interpreted the same images, this time with the assistance of the tool. Finally, the team assessed the performance of the tool itself and compared these findings with those of the unassisted radiologists. The assessment of performance was based on a comparative analysis of the area under the curve (AUC) for both the radiologists and the deep-learning system.
Initially, a total of 4568 images from 2568 patients were labelled, included in the training dataset, and assessed by the deep-learning model and 20 radiologists with a median duration of clinical experience of 10.5 years. All of the images included information on the patient’s gender and age but no other data such as a radiological report. There were 127 chest X-ray findings identified prospectively by three clinical experts. Across these 127 clinical findings, unassisted radiologists had an average AUC of 0.713 (95% CI 0.65–0.79). The lowest AUC was 0.56, for peribronchial cuffing, whereas the highest was 0.979, for electronic cardiac devices. With the assistance of the deep-learning algorithm, the average AUC among radiologists increased to 0.808 (95% CI 0.76–0.83) across all 127 clinical findings. Overall, assisted radiologists’ diagnostic accuracy improved for 102 (80%) of the clinical findings and was statistically non-inferior for 19 (15%) of findings. The AUC for the deep-learning algorithm was statistically superior to unassisted radiologists for 117 (94%) of 124 clinical findings and non-inferior for the remaining three.
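The AUC figures reported above can be read as a probability: the chance that a randomly chosen case with the finding is scored higher than a randomly chosen case without it (ties counting half). A minimal sketch of this calculation, using hypothetical reader scores rather than any of the study’s data, is shown below:

```python
def auc(labels, scores):
    """AUC = probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count 0.5).
    labels: 1 if the finding is present, 0 if absent.
    scores: reader or model confidence that the finding is present."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: 3 cases with the finding, 4 without.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
print(round(auc(labels, scores), 3))  # → 0.917
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, which is why the rise in average AUC from 0.713 to 0.808 with algorithmic assistance represents a meaningful improvement.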
In their discussion, the authors suggested that the study showed the potential for deep-learning to improve chest X-ray interpretation across a wide range of clinical findings, and noted that further work is underway to confirm the applicability of the model as a diagnostic adjunct in clinical practice.
Seah JCY et al. Effect of a comprehensive deep-learning model on the accuracy of chest x-ray interpretation by radiologists: a retrospective, multireader multicase study. Lancet Digit Health 2021