Using an AI-supported mammography screening tool results in a similar breast cancer detection rate compared with standard double reading but with a substantially lower screen-reading workload, according to the interim safety findings of a new randomised controlled trial.
Using AI-supported software, researchers from Lund University in Sweden have shown that mammography screening can avoid double reading of all mammograms without increasing false positives, while almost halving radiologists' screen-reading workload.
Although previous retrospective analyses have indicated that combining AI with a radiologist improves the accuracy of breast cancer detection and reduces radiologist workload, there have been no randomised trials evaluating this approach until now.
Commenting on the findings, lead author Dr Kristina Lång said: ‘These promising interim safety results should be used to inform new trials and programme-based evaluations to address the pronounced radiologist shortage in many countries. But they are not enough on their own to confirm that AI is ready to be implemented in mammography screening.
‘We still need to understand the implications on patients’ outcomes, especially whether combining radiologists’ expertise with AI can help detect interval cancers that are often missed by traditional screening, as well as the cost-effectiveness of the technology’.
AI vs standard double reading
Published in The Lancet Oncology, the Mammography Screening with Artificial Intelligence (MASAI) trial enrolled 80,033 Swedish women aged 40-80 years who were eligible for mammography screening. Participants were randomly allocated 1:1 to either AI-supported screening (the intervention group, n = 40,003) or standard double reading without AI (the control group, n = 40,030).
The primary outcome measure of the MASAI trial was the interval cancer rate. Secondary outcomes examined included early screening performance (cancer detection rate, recall rate, false positive rate) and screen-reading workload (number of screen readings and consensus meetings).
The AI-supported system provided an examination-based malignancy risk score on a scale from 1 to 10. These examinations were then categorised as low risk (risk scores 1 to 7), intermediate risk (risk scores 8 and 9) or high risk (risk score 10). In the intervention group, examinations with risk scores of 1 to 9 underwent single reading, whereas examinations with a risk score of 10 underwent double reading.
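To make the triage rule concrete, the routing described above can be sketched in a few lines of Python. The function name, category labels and input checks below are illustrative only, restating the trial's published thresholds rather than reproducing the vendor's actual software.

    def reading_route(risk_score: int) -> str:
        """Illustrative triage rule for the MASAI intervention arm.

        risk_score: examination-based malignancy risk score from 1 (lowest)
        to 10 (highest), as produced by the AI-supported system.
        Returns the risk category and the reading strategy applied.
        """
        if not 1 <= risk_score <= 10:
            raise ValueError("risk score must be between 1 and 10")
        if risk_score <= 7:
            category = "low risk"
        elif risk_score <= 9:
            category = "intermediate risk"
        else:
            category = "high risk"
        # Only the highest-risk examinations (score 10) were double read;
        # all others were single read in the intervention group.
        reading = "double reading" if risk_score == 10 else "single reading"
        return f"{category}: {reading}"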
The cancer detection rate (per 1,000 screened women) was broadly similar, at 6.1 in the AI group and 5.1 in the control group. Recall rates were also not significantly different (2.2% vs 2.0%), and neither were the false positive rates (1.5% in both arms).
The number of screen readings was considerably lower for the AI-supported group (46,345 vs 83,231), representing a 44.3% workload decrease for reading screening mammograms.
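This percentage follows directly from the reported reading counts: (83,231 − 46,345) / 83,231 ≈ 0.443, or a 44.3% reduction in screen readings.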