
AI-Backed Response Criteria Improve Treatment Response Assessments in Pleural Mesothelioma

June 22, 2026 ASCO AI Staff 7 min read

Progression-free survival was a more accurate measure of treatment response in patients with pleural mesothelioma when it was based on AI-assisted volumetric response criteria, called ARTIMES, than when it was based on the standard international Response Evaluation Criteria in Solid Tumors (RECIST). Research on the development and validation of the AI-backed criteria was published in The Lancet Oncology.

The AI-backed criteria also outperformed humans in assessing pleural mesothelioma responses. “We are the first in the world to demonstrate that AI outperforms humans in this area, and that physicians can actually base their decisions on it,” said lead study author Kevin B. W. Groot Lipman, PhD, a technical physician in the Department of Thoracic Oncology at the Netherlands Cancer Institute.

“ARTIMES has potential to improve decision making in clinical trials, improve trial-level surrogacy, and support more efficient drug evaluation. AI-assisted volumetry represents a promising direction for response evaluation in mesothelioma and potentially other tumors,” the study authors noted in their paper.

Background and Study Methods

Most physicians use RECIST criteria to measure tumor growth across solid tumors because the criteria support repeatable measurements that can be compared across health systems, scans, and countries. RECIST criteria rely on a diameter-based measurement to indicate whether a tumor is growing or shrinking in response to treatment.

However, this method has limited applicability to pleural mesothelioma due to its irregular crescent growth pattern in the lining of the lungs. Additionally, many have found that these criteria cannot accurately predict patient survival.

A group of AI experts, radiologists, and pulmonologists from the Netherlands Cancer Institute developed the ARTIMES criteria to measure pleural mesothelioma tumor growth with a volume-based approach. The AI compares each scan with prior scans at the pixel level, a task that is too difficult and time-consuming for physicians to complete manually.

They conducted a retrospective, multicenter study to develop and validate ARTIMES, evaluating 10,926 computed tomography (CT) scans from 2,080 patients with pleural mesothelioma collected from 14 cohorts, including 10 clinical trial cohorts. CT scans with a soft reconstruction kernel were found to be preferable to lung kernel CT scans.

Model Methods

A deep-learning segmentation model based on nnU-Net, an auto-configuring deep-learning framework that adapts the model architecture to the dataset, was trained on a subset cohort of 1,176 CT scans plus 100 negative CT scans. The CT scans were annotated by 12 radiologists and a pulmonologist. Performance of the AI segmentation was evaluated after each of six stages.

In the first stage, four radiologists and a pulmonologist corrected the AI-based segmentations generated by a prior pleural plaque model. The model was then retrained on CT scans and segmentations in an active-learning loop until performance stabilized. In the second stage, the training set was expanded and the segmentations were manually corrected by four radiologists and then iteratively incorporated into the model. The updated AI model was deployed at the University Hospital of Leicester in the third stage, and an experienced radiologist corrected the AI segmentations. In the fourth stage, the researchers added tumor-free control images to the final dataset. Independent external testing was done in stages five and six with manual segmentation from a radiologist who did not interact with the AI model or guidelines.

The model was tested on 98 CT scans from independent international hospitals, and external testing was conducted on a cohort of 138 CT scans from three sources. Segmentation performance was evaluated with the Dice similarity coefficient as a measure of overlap and with the normalized surface distance, using 3 mm as the tolerated distance from the reference boundary. Thresholds for progressive disease were then established based on patients with multiple CT scans before and after treatment.
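To make the overlap metric concrete, the following is a minimal sketch of the Dice similarity coefficient for two binary masks, computed as twice the overlap divided by the total number of labeled voxels in both masks. The function name, the flat-list mask format, and the toy masks are illustrative assumptions and are not taken from the study's code.

```python
def dice_coefficient(pred_mask, ref_mask):
    """Dice similarity coefficient between two flat binary masks.

    Masks are assumed (for illustration) to be equal-length sequences
    of 0/1 values, one entry per voxel.
    """
    if len(pred_mask) != len(ref_mask):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as tumor in both the prediction and the reference.
    overlap = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    # Total labeled voxels across both masks.
    total = sum(1 for p in pred_mask if p) + sum(1 for r in ref_mask if r)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * overlap / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 shared.
pred = [1, 1, 0, 0, 1, 0]
ref = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, ref)
print(round(score, 3))  # 0.667
```

A score of 1 indicates identical masks and 0 indicates no overlap. The normalized surface distance is a separate, boundary-based metric and is not shown here.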

The ARTIMES model was validated on data from eight clinical trials, encompassing 4,674 CT scans from 943 patients, and compared with modified RECIST criteria as a measure of survival.

Key Findings

For the ARTIMES response criteria, the minimal detectable change was 35 mL and 12.4%; it was calculated based on segmented tumor volume and the upper 95% confidence interval (CI) of the absolute volumetric difference as well as the percentage change. A partial response was therefore set as a decrease of more than 35 mL and 15% from the maximum tumor volume from the start of treatment, or a 75% reduction in tumor volume independent of absolute change. Progressive disease was defined as an average tumor growth of more than 41.2% and greater than 67 mL over 2 months from baseline, or a new lesion detected outside of the pleura.
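The thresholds above can be read as a simple decision rule. The sketch below is a simplified illustration only, not the authors' implementation: it assumes a single reference volume for both rules, ignores the 2-month averaging window described for progressive disease, and uses a hypothetical "unclassified" label when neither rule is met. The function and parameter names are assumptions.

```python
def classify_response(reference_ml, current_ml, new_lesion_outside_pleura=False):
    """Apply the reported volume thresholds to one pair of measurements.

    Simplifying assumptions (not from the study): one reference volume is
    used for both rules, and growth is not averaged over a 2-month window.
    """
    if reference_ml <= 0:
        raise ValueError("reference volume must be positive")
    change_ml = current_ml - reference_ml
    change_pct = 100.0 * change_ml / reference_ml

    # Progressive disease: growth above both thresholds, or a new lesion.
    if new_lesion_outside_pleura or (change_pct > 41.2 and change_ml > 67):
        return "progressive disease"
    # Partial response: decrease above both thresholds, or a 75% reduction
    # regardless of the absolute change.
    if (change_ml < -35 and change_pct < -15) or change_pct <= -75:
        return "partial response"
    # Hypothetical label for cases that meet neither rule.
    return "unclassified"


print(classify_response(200, 150))  # partial response (-50 mL, -25%)
print(classify_response(200, 300))  # progressive disease (+100 mL, +50%)
print(classify_response(20, 4))     # partial response (-80%, small volume)
print(classify_response(100, 150))  # unclassified (+50%, but only +50 mL)
```

The last case shows why the criteria pair a percentage with an absolute volume: a large relative change in a small tumor can fall below the minimal detectable change in milliliters.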

ARTIMES criteria demonstrated superior prognostic performance (concordance index = 0.83; 95% CI = 0.79–0.87) compared with modified RECIST criteria (concordance index = 0.73; 95% CI = 0.66–0.80; P = .023). The AI model detected progressive disease a median of 38 days ahead of the RECIST criteria at 124 days (95% CI = 115–126) vs 162 days (95% CI = 138–167), respectively (P < .0001).
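For readers unfamiliar with the concordance index, the sketch below shows one common way to compute it (Harrell's estimator): among pairs of patients whose ordering of outcomes is known, it is the fraction in which the patient with the higher predicted risk had the earlier event. The article does not state which estimator the study used, so this choice, the function name, and the toy data are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    Pairs with tied times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risk))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported values of 0.83 and 0.73.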

In clinical trials, progression-free survival based on ARTIMES criteria was more strongly correlated with overall survival (coefficient of determination [R2] = 88%; 95% CI = 42%–100%) than progression-free survival based on modified RECIST criteria (R2 = 6%; 95% CI = 0%–97%).

Additionally, the model showed a surrogate threshold effect of 0.82 at the clinical trial level, meaning that a hazard ratio for progression-free survival under 0.82 would be expected to translate into a significant difference in overall survival. No such value was found for the modified RECIST criteria.

The AI-derived tumor volume measurements also outperformed standard T stage and World Health Organization performance status measurements.

“We can now assess tumor response to treatments much more accurately,” stated Sjaak Burgers, a pulmonologist at the Netherlands Cancer Institute. “We can discover the lack of response much sooner than before. This allows a patient to stop treatment earlier and, if possible, switch to a different treatment. This not only provides certainty but also spares our patients unnecessary side effects and reduces healthcare costs.”

Impact

“We obviously want patients worldwide to benefit from this,” Dr. Lipman said. “We are in the process of getting the model approved for use in other hospitals. We are also eagerly awaiting a proposal from the EU to simplify the approval process for this type of medical device.”

Additionally, the study authors shared their code so that researchers globally can begin to use the new model.

“I expect this model to come as a shock to physicians and researchers outside the mesothelioma field,” Dr. Lipman predicted.

“This is going to open up a whole new field of research. We expect that AI will also be able to help with many other types of tumors,” he added, as the institution is already exploring the use of AI models for lung cancer and brain metastases.

He also noted that the model could be used to help make measurements in clinical trials more reliable, pending validation. “This allows us to better assess the efficacy of new treatments in clinical trials,” he said.

DISCLOSURES: Funding for the study was provided by the Asbestos-Related Disease Section (SAGA) of the Dutch Society of Pulmonology and Tuberculosis (NVALT), the Dutch Cancer Society, and the Dutch Ministry of Health, Welfare and Sport. For full disclosures of the study authors, visit thelancet.com.

ASCO AI in Oncology is published by Conexiant under a license arrangement with the American Society of Clinical Oncology, Inc. (ASCO®). The ideas and opinions expressed in ASCO AI in Oncology do not necessarily reflect those of Conexiant or ASCO. For more information, see Policies.
