A deep learning algorithm for lung ultrasound that can identify patients with COVID-19 at high risk of clinical worsening showed good agreement with the assessments of clinicians.
Although COVID-19 is diagnosed with a PCR test, diagnostic imaging using computed tomography (CT) has a reported sensitivity ranging between 61% and 99% and a reported specificity ranging between 25% and 33%. However, because CT imaging is not portable, other solutions are required. One such alternative and portable imaging modality is lung ultrasound (LUS). The technique provides real-time imaging and is both portable and widely available. Moreover, LUS can be used to identify changes in the physical state of superficial lung tissue and may be of potential value in the assessment of patients with COVID-19. However, LUS is generally restricted to visual inspection and interpretation of imaging artefacts and is thus qualitative and subjective, although quantitative scoring systems have been proposed. In recent years, deep learning (DL) algorithms using automatic scoring and semantic segmentation have been developed to classify each LUS frame.
Whether a deep learning algorithm could evaluate LUS videos and provide, for each frame, a score and semantic segmentation of prognostic value in patients with COVID-19 was the subject of a study by a team from the Diagnostic and Interventional Ultrasound Unit, Valle del Serchio Hospital, Lucca, Italy. The team are the first to report on the development of a standardised imaging protocol and scoring system that used a DL algorithm to evaluate LUS videos and provide, for each frame, a score as well as semantic segmentation. The team then sought to evaluate the prognostic value of this approach by comparing the level of agreement between the output from the DL algorithm and the interpretation of expert clinicians.
All patients were examined using LUS according to a standardised acquisition protocol that involved 14 scanning areas. All videos acquired during the scans were independently evaluated by two clinicians, who assigned each video a score ranging from 0 to 3. This scoring system has been described previously: a score of 0 indicates high reflectivity of the normal aerated lung surface, and a score of 3 indicates a pleural line that is highly irregular and cobbled. The acquired videos were also fed into the DL algorithm.
The team analysed data from 82 patients (43 male) with a mean age of 61.1 years, all of whom had a PCR-confirmed diagnosis of COVID-19. A total of 1488 LUS examinations were performed (some patients were scanned multiple times), which generated 314,879 frames. A comparison between the DL system and the clinical experts showed a percentage agreement of 85.96% in the stratification of patients into those at high risk of clinical worsening of COVID-19 and those at low risk. Despite this high level of agreement, there were instances where the DL algorithm misclassified scores. For example, in 14% of cases the DL algorithm misclassified a score of 3 as 2.
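For readers unfamiliar with the metric, percentage agreement is simply the share of cases in which two raters assign the same label. The following is a minimal Python sketch of that calculation, not the authors' code: the function names and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def percent_agreement(model_labels, expert_labels):
    """Return the percentage of cases where both raters gave the same label."""
    if len(model_labels) != len(expert_labels):
        raise ValueError("both raters must label the same number of cases")
    if not model_labels:
        raise ValueError("at least one case is required")
    matches = sum(m == e for m, e in zip(model_labels, expert_labels))
    return 100.0 * matches / len(model_labels)


def confusion_counts(model_scores, expert_scores):
    """Count (expert score, model score) pairs to show where the raters differ."""
    return Counter(zip(expert_scores, model_scores))


# Hypothetical toy data, for illustration only (not taken from the study).
expert_risk = ["high", "low", "high", "low", "high", "low", "low"]
model_risk = ["high", "low", "low", "low", "high", "low", "high"]
print(f"Agreement: {percent_agreement(model_risk, expert_risk):.2f}%")

expert_scores = [3, 3, 2, 0, 1, 3]
model_scores = [3, 2, 2, 0, 1, 3]
pairs = confusion_counts(model_scores, expert_scores)
print("Expert score 3 rated as 2 by the model:", pairs[(3, 2)])
```

Percentage agreement does not correct for agreement expected by chance, which is one reason the pattern of misclassifications is often reported alongside it.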
In discussing their findings, the authors stressed that for LUS to be a reliable means of patient evaluation, a standardised protocol is required. They concluded that the results were encouraging and demonstrated the potential value of using DL models for the automated scoring of LUS and for stratifying the risk of disease progression in those with COVID-19.
Mento F et al. Deep learning applied to lung ultrasound videos for scoring COVID-19 patients: A multicenter study. J Acoust Soc Am 2021;149(5):3628–34. https://asa.scitation.org/doi/10.1121/10.0004855