Artificial intelligence (AI) can enhance the readability of patient information leaflets (PILs) in interventional radiology, a UK study has revealed, although expert review remains essential to ensure factual accuracy and patient safety.
Data indicate that 40% of adults struggle to understand written information needed to make health-related decisions. Low health literacy is linked to a range of adverse outcomes, including increased hospitalisations, reduced medication adherence, lower participation in screening programmes and higher mortality.
With this in mind, researchers from NHS Tayside, North Bristol NHS Trust and Cardiff and Vale University Health Board sought to determine whether a large language model (LLM) could simplify written PILs in interventional radiology without compromising factual content and accuracy.
AI a ‘powerful tool’ for improving PIL readability
The study, published in BMJ Health Care Informatics, analysed PILs from the Cardiovascular and Interventional Radiology Society of Europe (CIRSE) to determine if AI-generated text was easier to understand than the original versions.
A total of 1,769 PILs were uploaded to OpenAI’s GPT-4 interface, where the LLM was prompted to simplify the language while maintaining structure and meaning. Automated readability scores and expert evaluation by three consultant interventional radiologists were used to assess performance.
The modified PILs had a significantly lower average reading grade than the originals (9.5 vs 11.1; p<0.01), indicating that the AI-generated texts used simpler language and shorter sentences.
Despite this improvement, the modified leaflets did not reach the recommended reading grade of six, corresponding to the reading level of an 11- to 12-year-old, which is considered ideal for patient-facing materials.
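To illustrate how an automated reading grade of this kind can be computed, the sketch below uses the Flesch-Kincaid grade level, a widely used formula based on average sentence length and average syllables per word. This is an assumption for illustration only: the article does not state which readability formulas the study used. The syllable counter is a rough heuristic, and the two sample sentences are hypothetical, not taken from the CIRSE leaflets.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count groups of consecutive vowels, discounting a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical sample texts, written for this example only.
original = (
    "Percutaneous angioplasty is a minimally invasive procedure utilised "
    "to restore adequate circulation in stenosed arteries."
)
simplified = "Angioplasty is a small procedure. It opens a narrow artery so blood can flow."

print(f"original grade:   {flesch_kincaid_grade(original):.1f}")
print(f"simplified grade: {flesch_kincaid_grade(simplified):.1f}")
```

Shorter sentences and shorter words both lower the score, which is why simplified wording tends to produce a lower grade. A score of this kind measures surface features of the text only and says nothing about factual accuracy.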
The consultant interventional radiologists also identified minor factual inaccuracies in the AI-generated content. Although these did not pose a risk to patients, the finding highlights the critical need for expert clinical review before implementation in practice.
Future research and conclusions
Study limitations included the focus on English-language texts rather than any of the other 19 languages in which CIRSE’s PILs are available, the exclusion of pictorial content that is often present in PILs, and the use of a single LLM, which limits generalisability to other models.
As the study did not test the AI-generated PILs directly with patients to assess usability and comprehension, the researchers noted that future studies should incorporate this patient-centred evaluation and explore different AI models and image-based materials.
The researchers concluded that LLMs can support clinicians in producing clearer, more accessible PILs, enabling patients with varying levels of literacy to better understand interventional procedures. Clinical oversight of AI-generated outputs remains essential to ensure accuracy and safeguard patient safety, they added.
Reference
Clackett W et al. Better understanding: can a large language model safely improve readability of patient information leaflets in interventional radiology? BMJ Health Care Inform 2025;32:e101512.