25th November 2022
Tumour-infiltrating lymphocyte (TIL) scoring based on a machine-learning model has superior classification accuracy for immune checkpoint inhibitor (ICI) response in patients with advanced non-small cell lung cancer (NSCLC), according to a retrospective analysis by an international research group.
Immunotherapy with immune checkpoint inhibitors (ICIs) has revolutionised the field of oncology for many patients. Nevertheless, not all patients with non-small cell lung cancer benefit from these agents: studies suggest that in advanced disease the 1-year overall survival rate with, for example, nivolumab was only 51%. A more favourable response to ICI therapy occurs in those with high programmed cell death ligand-1 (PD-L1) expression and a high tumour mutation burden (TMB). A further factor associated with an improved prognosis in NSCLC patients is a high level of tumour-infiltrating lymphocytes (TILs), which are conventionally assessed visually on routine haematoxylin and eosin-stained slides. However, with the increasing use of machine-learning algorithms in healthcare, preliminary data highlight the potential for automated assessment of such stained slide sections.
Given the prognostic value of TIL levels, researchers in the present study developed a machine-learning TIL scoring model and evaluated its association with clinical outcomes in patients with advanced NSCLC. They undertook a retrospective analysis of patients prescribed PD-(L)1 inhibitors, initially in a discovery cohort from a French hospital and subsequently in an independent validation cohort from hospitals in the UK and the Netherlands. The machine-learning model counted tumour, stromal and tumour-infiltrating lymphocyte cells, whereas values for TMB and PD-L1 expression were determined separately.
Tumour infiltrating lymphocyte cells and ICI response
A total of 685 patients with advanced-stage NSCLC treated with first- or second-line ICI monotherapy were included across the two independent cohorts. The median age in both groups was 66 years.
Among patients in the discovery cohort, those with a higher TIL cell count had a significantly longer median progression-free survival (hazard ratio, HR = 0.74, 95% CI 0.61–0.90, p = 0.003) and a significantly longer overall survival (HR = 0.76, p = 0.02). Similar associations between a higher TIL cell count and both progression-free and overall survival were observed in the validation cohort.
As individual biomarkers, PD-L1 expression gave an area under the curve (AUC) of 0.68, TIL cell levels only 0.55 and TMB 0.59. When biomarkers were combined, the PD-L1/TIL and TMB/PD-L1 pairings gave AUC values of 0.68 and 0.70 respectively, improving on TIL or TMB alone.
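As a rough illustration of why combining biomarkers can raise the AUC, recall that the AUC equals the Mann-Whitney rank statistic: the probability that a randomly chosen responder scores higher than a randomly chosen non-responder. The sketch below uses entirely hypothetical patient values (not data from the study) and a simple averaged combination score.

```python
# Illustrative sketch only: AUC as the Mann-Whitney rank statistic,
# showing how a combined biomarker score can discriminate better than
# either component alone. All values below are hypothetical.

def auc(scores, labels):
    """Probability that a random responder (label 1) scores higher
    than a random non-responder (label 0); ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical standardised biomarker values for 8 patients
pdl1 = [0.9, 0.4, 0.8, 0.3, 0.7, 0.2, 0.6, 0.1]
til  = [0.2, 0.8, 0.6, 0.1, 0.9, 0.3, 0.4, 0.5]
response = [1, 1, 1, 0, 1, 0, 0, 0]   # 1 = ICI responder

# Simple equal-weight combination of the two biomarkers
combined = [0.5 * a + 0.5 * b for a, b in zip(pdl1, til)]

# On this toy data the combined score discriminates best
print(auc(pdl1, response), auc(til, response), auc(combined, response))
```

In practice such a combination would be fitted (e.g., by logistic regression) rather than equally weighted, but the principle is the same: each biomarker contributes partly independent information about response.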
The authors concluded that TIL levels were robustly and independently associated with the response to ICI treatment, could be easily incorporated into the workflow of pathology laboratories at minimal additional cost, and might even enhance precision therapy.
Rakaee M et al. Association of Machine Learning-Based Assessment of Tumor-Infiltrating Lymphocytes on Standard Histologic Images With Outcomes of Immunotherapy in Patients With NSCLC. JAMA Oncol 2022
21st November 2022
An MRI-derived radiomics nomogram that also incorporates clinical patient characteristics helps to predict which patients with knee osteoarthritis are likely to see improvements in their pain over a 2-year period, according to a study by Chinese researchers.
Osteoarthritis (OA) is a common condition that affects 7% of the global population, amounting to more than 500 million people. Although OA has conventionally been assessed using X-rays, an alternative is magnetic resonance imaging (MRI). In fact, MRI has been suggested to be the best modality for imaging osteoarthritis, enabling visualisation of multiple individual tissue pathologies relating to pain and prediction of clinical outcome. When Chinese researchers undertook a randomised trial examining the effect of vitamin D on knee OA pain, there were no significant differences versus placebo in MRI-measured tibial cartilage volume or knee pain score over 2 years. However, in a post-hoc analysis, 64% of vitamin D participants and 57% of placebo participants achieved a 20% improvement in knee pain score over 2 years.
Given that a substantial proportion of participants had seen an improvement in their knee pain, researchers wondered how it might be possible to identify those patients likely to benefit from vitamin D. They decided to create a radiomics nomogram, based on MRI-derived features of subchondral bone together with clinical factors, that could be used to predict an improvement in knee OA pain over 2 years. The team used data from the VIDEO study of knee osteoarthritis, in which participants had an MRI scan at baseline and after 24 months. The primary outcome was knee pain, assessed using the WOMAC score. The MRI data were used to create a radiomics model, which was trained and then validated with separate cohorts from the VIDEO trial. The output of the model was then used to produce a nomogram for predicting improvement in osteoarthritis knee pain over two years. The model was assessed using the area under the receiver operating characteristic curve (AUC).
MRI derived radiomics model and prediction of knee pain
A total of 216 patients with a mean age of 68.3 years (47% female) were included, of whom 172 formed the training cohort (78 of these had no improvement in pain), with the remainder used as the validation cohort.
Only two variables, female gender and baseline total knee pain score, were significant predictors of an improvement in knee pain over 2 years and were used in the clinical model, together with vitamin D supplementation. The MRI-derived model included a radiomics signature together with the two significant clinical variables.
In the validation cohort, the nomogram had a higher AUC than the clinical model (0.83 vs 0.71) for the prediction of knee pain improvement, although this difference was not significant (p = 0.08). In addition, a decision curve analysis confirmed the clinical usefulness of the nomogram.
The authors concluded that their radiomics-based nomogram, comprising the MR radiomics signature and clinical variables, achieved favourable predictive efficacy and accuracy in identifying improvement in knee pain among OA patients.
Lin T et al. Prediction of knee pain improvement over two years for knee osteoarthritis using a dynamic nomogram based on MRI-derived radiomics: a proof-of-concept study. Osteoarthritis Cartilage 2022
An AI-read chest radiograph can be used to identify severe coronary artery disease in patients referred with suspected angina, according to the results of a study by German researchers.
Globally, cardiovascular diseases are the leading cause of death, resulting in an estimated 17.9 million lives lost every year. The first step in assessing a patient with suspected stable coronary artery disease (CAD) is to determine the pre-test probability, and several risk scores are available, including the Diamond-Forrester score, the CAD Consortium clinical score and the CONFIRM risk score. However, the development of artificial intelligence (AI) systems and their use with cardiovascular imaging is likely to better characterise disease and personalise therapy. This potential prompted the German researchers to reconsider how a simple first-line diagnostic tool such as a radiograph might benefit from the incorporation of an AI system. They set out to explore whether an AI-read chest radiograph might be of value for the detection of significant CAD.
The researchers retrospectively considered patients referred to hospital with suspected angina who underwent coronary angiography and had a chest radiograph. They included only patients with posteroanterior (patient standing) or anteroposterior (patient sitting) chest radiographs, because lateral radiographs were not obtained in all patients. The team used a deep convolutional neural network (DCNN), trained to detect significant CAD based on invasive coronary angiography reports. The DCNN performed binary classification, i.e., severe CAD was either absent or present, and the model was trained and validated on patients referred for angiography. The performance of the model was assessed using the area under the receiver operating characteristic curve (AUC) and the associated sensitivity and specificity.
AI read chest radiographs and coronary artery disease
A total of 7728 participants with a mean age of 74 years (70.3% male) were included in the retrospective analysis, of whom 53% had severe CAD confirmed by invasive angiography.
The AI-read chest radiograph model had an AUC of 0.73 (95% CI 0.69–0.76), with a sensitivity of 90% and a specificity of 31%. Adding the patient's angina status improved the predictive power of the model for the detection of severe CAD (AUC = 0.77). Addition of the Diamond-Forrester score improved the predictive power to a similar level (AUC = 0.76).
Using logistic regression, the DCNN prediction was the strongest independent determinant of severe CAD (odds ratio, OR = 1.04, 95% CI 1.03–1.04, p < 0.001).
The authors concluded that an AI-read chest radiograph could be used to determine the pre-test probability of significant CAD in patients referred with suspected angina, and called for future studies to externally validate the algorithm and develop a clinically applicable tool that could support CAD screening in broader settings.
D’Ancona G et al. Deep learning to detect significant coronary artery disease from plain chest radiographs (AI4CAD). Int J Cardiol 2022
24th August 2022
A machine-learning MRI model is better able to predict the recurrence of hepatocellular carcinoma (HCC) after a liver transplant than a model based on clinical and laboratory data, but is as effective as a model using a combination of clinical/laboratory and MRI-derived data, according to a study by US and German researchers.
Liver cancer, of which HCC accounts for about 90% of cases, remains a global health challenge, with an estimated incidence of over a million cases by 2025. Potentially curative treatment options for hepatocellular carcinoma include liver transplantation, liver resection and thermal ablation, with transplantation offering the lowest rate of cancer recurrence and the highest chance of long-term survival. Despite this, estimated post-transplantation recurrence rates are between 15% and 20%. Methods to estimate the risk of recurrence are therefore needed, and preoperative hepatobiliary magnetic resonance imaging (MRI) findings have been found to be associated with a higher tumour recurrence rate in transplanted patients. Machine-learning MRI models can extract information from unstructured medical imaging data and might be of predictive value for cancer recurrence, but whether this approach is of value for HCC is uncertain and was the purpose of the present study.
Researchers retrospectively analysed data from a cohort of patients with HCC treated by liver transplant, surgical resection or thermal ablation who had undergone pre- and post-treatment MRI scans. The US and German team trained a machine-learning MRI system to extract imaging features and developed three predictive models: the first used imaging-derived data only, the second used clinical and laboratory patient data, and the third combined the imaging and clinical/laboratory data. The risk of HCC recurrence was predicted over a 6-year period after a patient's first-line treatment. The predictive value of the different models was assessed using the area under the receiver operating characteristic curve (AUC).
Machine-learning MRI model and prediction of HCC recurrence
The study included 120 patients with a mean age of 60 years (26.7% male), of whom 36.7% experienced tumour recurrence during follow-up, with a mean time to recurrence of 26.8 months.
The highest AUC for each of the three models was achieved for the periods 4 and 6 years after treatment. At 6 years, the AUC was 0.69 (95% CI 0.54–0.84) for the clinical model, 0.85 (95% CI 0.75–0.95) for the imaging model and 0.86 (95% CI 0.76–0.96) for the combined model.
Over the 6-year period the mean AUC for the imaging model was 0.76 versus 0.68 for the clinical model, a statistically significant difference (p = 0.03), while the mean AUC for the combined model was the same as that of the imaging model (0.76).
Turning to the individual patient data, the clinical model correctly predicted 25% of recurrences, whereas the imaging and combined models both correctly predicted 87.5% of recurrences.
The authors concluded that a machine-learning MRI model could successfully predict recurrence of early-stage HCC, that this model was superior to the use of clinical data alone, and called for prospective cohort studies to externally validate these algorithms prior to clinical use.
Iske S et al. Machine-Learning Models for Prediction of Posttreatment Recurrence in Early-Stage Hepatocellular Carcinoma Using Pretreatment Clinical and MRI Features: A Proof-of-Concept Study. AJR Am J Roentgenol 2022
4th August 2021
The timely detection of COVID-19 infections through PCR testing is vital to contain the spread of the virus. However, while PCR testing has become the most widely used analytical technique to detect the virus, the result is highly dependent on the timing of sample collection, the type of specimen and the quality of the sample. An alternative means of identifying infected individuals is through a combination of symptoms, ensuring that only those with appropriate symptoms are tested. This approach was used in an Italian study of nearly 3000 subjects which, with the aid of a short diagnostic scale, was able to correctly identify the symptoms associated with infection. The same methodology is used in the COVID-19 Symptom Study App, a longitudinal, self-reported study of the symptom profile of patients with COVID-19. Through the use of machine-learning models, the study has been able to identify the main symptoms of infection and their correlation with outcomes. Nevertheless, current models are not conducive to the early detection of infection. This prompted the COVID-19 Symptom Study team to create a machine-learning model that captured self-reported symptoms for only the first three days and used this information to predict an individual's likelihood of being COVID-19 positive.
The team used three different machine-learning models to analyse self-reported symptoms. The first model was based on the NHS algorithm, which uses the presence of cough, fever or loss of smell between days 1 and 3 as potentially representative of COVID-19 infection. The second was a logistic regression model based on an algorithm incorporating loss of smell, persistent cough, fatigue and skipped meals, which has been previously validated and found to correlate well with COVID-19 infection. For the third, the team used 18 self-reported symptoms combined with co-morbidities and demographic data, referred to as a hierarchical Gaussian process model. All three models were compared in terms of sensitivity, specificity and area under the receiver operating characteristic curve (AUC), and were evaluated with a training set of patients self-reporting symptoms between April and October 2020 and a test set of self-reported symptoms between October and November 2020.
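To make the second approach concrete, a symptom-based logistic regression model maps a set of binary symptom indicators to a probability of infection via the sigmoid function. The sketch below follows that general form only; the coefficients are hypothetical placeholders for illustration, not the published values from the study.

```python
import math

# Illustrative sketch only: a logistic model of the general form used
# for symptom-based COVID-19 prediction. The intercept and weights
# below are HYPOTHETICAL placeholders, not the published coefficients.

def predict_covid_prob(loss_of_smell, persistent_cough, fatigue,
                       skipped_meals, coefs=(-1.5, 2.0, 0.9, 0.6, 0.8)):
    """Return P(COVID-19 positive) for binary (0/1) symptom indicators."""
    intercept, *weights = coefs
    features = (loss_of_smell, persistent_cough, fatigue, skipped_meals)
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1 / (1 + math.exp(-z))   # sigmoid (logistic) link

# Loss of smell carries the largest (hypothetical) weight, so adding it
# to a fatigue-only report raises the predicted probability.
print(predict_covid_prob(1, 0, 1, 0) > predict_covid_prob(0, 0, 1, 0))
```

In the actual study, such coefficients would be fitted on the training set of self-reported symptoms and then evaluated on the held-out test set.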
There were data from 182,991 participants in the training set and 15,049 in the test set, with a similar symptom distribution. The predictive power of the three models differed: the hierarchical Gaussian process model showed the highest predictive value (AUC = 0.80, 95% CI 0.80–0.81) using three days of symptoms, compared with the logistic regression model (AUC = 0.74) and the NHS model (AUC = 0.67). For the prediction of COVID-19 infection, the hierarchical Gaussian process model had a sensitivity of 73% and a specificity of 72%; its sensitivity was higher than that of either the logistic regression model (59% sensitivity, 76% specificity) or the NHS model (60% sensitivity, 75% specificity). Interestingly, the key symptoms predictive of early COVID-19 were loss of smell, chest pain, persistent cough, abdominal pain, blisters on the feet, eye soreness and unusual pain.
The authors concluded that the hierarchical Gaussian process model was successfully able to predict the early signs of infection and could be used to enable referral for testing and self-isolation when these symptoms were present.
Canas LS et al. Early detection of COVID-19 in the UK using self-reported symptoms: a large-scale, prospective, epidemiological surveillance study. Lancet Digit Health 2021
26th July 2021
Sepsis can be defined as a life-threatening organ dysfunction caused by a dysregulated host response to infection. It is responsible for around 11 million deaths each year, approximately 20% of all global deaths. It is therefore crucial that clinicians have a comprehensive understanding of the clinical factors that can help with the early identification of those patients for whom a poor outcome is likely. This is particularly important since early use of crystalloid therapy reduces mortality, as does prompt administration of antibiotics. Though several scoring systems for sepsis are available, these are based on the assessment of vital signs, which can sometimes be normal on admission to an emergency department. While machine learning has been shown to have some predictive power for mortality, none of the variables currently used in these models reflects the symptoms at first presentation. This led a team from the Department of Medical Sciences, Örebro University, Sweden, to use machine learning in an attempt to identify the variables predictive of 7- and 30-day mortality in sepsis patients, based on the clinical presentation at an emergency department. They employed a retrospective design and included patients aged 18 years and older admitted to hospital with suspected sepsis. The team input previously identified variables (e.g., abnormal temperature, acute altered mental status) into the machine-learning algorithm. The sensitivity and specificity of the predictive models generated by the machine-learning algorithm were derived from the receiver operating characteristic (ROC) curve.
A total of 445 patients with sepsis and a median age of 73 years (52.6% male) were included in the retrospective analysis. Overall, 234 (49.7%) had severe sepsis; 63 patients died within 7 days of admission and 98 within 30 days. The accuracy of the 7-day predictive model was maximal after the inclusion of only six variables: fever, abnormal verbal response, low oxygen saturation, arrival by emergency services, abnormal behaviour/level of consciousness and chills. Using these variables, the model had a sensitivity of 0.84 (95% CI 0.78–0.89) and a specificity of 0.67 (95% CI 0.64–0.70). For the prediction of 30-day mortality, again only six variables were significant: abnormal verbal response, fever, chills, arrival by emergency services, low oxygen saturation and breathing difficulties. This model gave a sensitivity of 0.87 (95% CI 0.81–0.93) and a specificity of 0.64 (95% CI 0.61–0.67).
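The sensitivity and specificity figures above come from a standard confusion-matrix calculation: sensitivity is the proportion of deaths the model flagged, specificity the proportion of survivors it correctly left unflagged. A minimal sketch, using entirely hypothetical predictions for ten patients rather than the study's data:

```python
# Illustrative sketch (hypothetical data): deriving sensitivity and
# specificity from a confusion matrix of binary predictions.

def sensitivity_specificity(predicted, actual):
    """predicted/actual are 0/1 lists; 1 = died within the window."""
    tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))
    fn = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))
    tn = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))
    fp = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical 7-day outcomes and model predictions for 10 patients
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(predicted, actual)
print(sens, spec)
```

As in the study, a model tuned for high sensitivity (missing few deaths) will typically accept a lower specificity, flagging some patients who go on to survive.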
In discussing their findings, the authors highlighted how their results revealed the importance of using a clinical symptom complex representative of what an emergency department clinician would be likely to encounter in practice. They also suggested that the 7-day model might be of more use in practice, since it would assist emergency care staff in anticipating the likely short-term outcome for patients. They concluded that, given how non-specific the clinical presentation of sepsis can often be, a machine-learning algorithm based on symptoms and observations would be most helpful to staff, and that future work should focus on validating the method in other cohorts.
Karlsson A et al. Predicting mortality among septic patients presenting to the emergency department– a cross sectional analysis using machine learning. BMC Emerg Med 2021
15th June 2021
Advances in machine-learning (ML) algorithms in medicine have demonstrated that such systems can be as accurate as humans. However, few systems have been used in routine clinical practice: ML systems are often tested in parallel with physicians, with the actions suggested by the system not acted upon in practice. Fully utilising ML systems in routine clinical care requires a shift from their current adjunctive support role to being considered the primary option. To assess the real-world value of an ML algorithm, a team from the Princess Margaret Cancer Centre, Ontario, Canada, explored the value of ML-generated curative-intent radiation therapy (RT) treatment planning for patients with prostate cancer. The team's overall aim was to evaluate the integration of the ML system as a standard of care, and they undertook a two-stage study comprising an initial validation phase followed by clinical deployment. For the validation phase, the team included data from 50 patients to assess the ML performance retrospectively; reviewers compared the ML-generated RT plans, in a blinded fashion, with the plans actually used for the patients. In the subsequent deployment phase, again with 50 patients, physician-generated and ML-generated plans were prospectively compared, with the treating physician blinded to the source of each plan.
The ML system proved much faster at generating plans than the equivalent human-driven process (median 47 vs 118 hours, p < 0.01). Overall, ML-generated plans were deemed clinically acceptable for treatment in 89% of cases across both phases (92% during the validation phase and 86% during the deployment phase). In only 10 cases was the ML-generated comparison deemed not applicable, because the plans required consultation with the treating physician, thus unblinding the review process. In addition, 72% of ML-generated RT plans were selected over human-generated RT plans in a head-to-head comparison. However, between the validation (simulation) phase and the deployment phase, the proportion of ML-generated plans used by the treating physician actually fell from 83% to 61% (p = 0.02).
The authors were unable to fully account for these differences, suggesting that retrospective or simulated studies cannot fully recapitulate the factors influencing clinical decision-making when patient care is at stake, and concluded that further prospective deployment studies are required to validate the impact of ML in real-world clinical settings and fully quantify the value of such methods.
McIntosh C et al. Clinical integration of machine learning for curative-intent radiation treatment of patients with prostate cancer. Nat Med 2021