
Machine Learning Model Reduces Misinterpretation of Variants By Liquid Biopsies

A machine learning model demonstrated significant accuracy in distinguishing between variants from tumors vs variants resulting from clonal hematopoiesis that were picked up on liquid biopsies of solid tumors.

June 17, 2026 ASCO AI Staff

A multifeatured machine learning model has been developed to filter out biological noise in liquid biopsy samples, helping clinicians better match therapies to their patients’ tumor characteristics. Findings from the study, which describes the development of the plasmaCHORD model, were published in Clinical Cancer Research.

Background

Liquid biopsies are commonly used to identify mutations in a patient’s solid tumor, enabling clinicians to select mutation-targeted therapies. However, routine next-generation sequencing of plasma cell-free DNA can also pick up clonal hematopoiesis variants, potentially confounding the results. These white blood cell mutations are common in older individuals and in patients who have previously undergone chemotherapy or radiation.

“When you do a liquid biopsy, and you get the report back, and you see mutations, you do not know if the mutations are coming from the tumor or the white blood cells,” explained co-first author Jenna Canzoniero, MD, MS, Assistant Professor of Oncology at the Johns Hopkins University School of Medicine. “If you want to select a mutation-targeted drug to treat the cancer, you want to make sure you are targeting mutations in the cancer and not mutations in the white blood cells.”

Model Methods

Dr. Canzoniero and colleagues developed a machine learning model called plasma Clonal Hematopoiesis ORigin Detection (plasmaCHORD) that uses characteristics of the DNA fragments as well as variant and patient factors to estimate whether a variant found in a liquid biopsy originates from the tumor or white blood cells.

The team trained the model on a cohort of 426 variants from plasma-only next-generation sequencing samples from 225 patients with various stages of breast, colorectal, esophageal, ovarian, or non–small cell lung cancer. The model integrated all fragment, variant, and patient features to produce a score between 0 and 1 for each variant. That score was then binarized to classify the variant as being of clonal hematopoiesis or tumor origin.
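The binarization step can be illustrated with a minimal Python sketch. The article does not report the cutoff plasmaCHORD uses or which direction of the score corresponds to which origin, so the 0.5 threshold, the reading of the score as a probability of clonal hematopoiesis, the `VariantCall` type, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the article does not report the threshold plasmaCHORD uses.
ASSUMED_THRESHOLD = 0.5


@dataclass
class VariantCall:
    variant_id: str
    score: float  # model output in [0, 1]; assumed here to mean P(clonal hematopoiesis)


def binarize_origin(call: VariantCall, threshold: float = ASSUMED_THRESHOLD) -> str:
    """Map a continuous score to a binary origin label."""
    if not 0.0 <= call.score <= 1.0:
        raise ValueError(f"score out of range for {call.variant_id}: {call.score}")
    return "clonal_hematopoiesis" if call.score >= threshold else "tumor"


calls = [
    VariantCall("variant_A", 0.91),  # made-up illustrative values, not study data
    VariantCall("variant_B", 0.12),
]
labels = {c.variant_id: binarize_origin(c) for c in calls}
print(labels)
```

In practice, a cutoff like this would typically be chosen on the training data to balance the cost of mislabeling a tumor variant against the cost of mislabeling a clonal hematopoiesis variant.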

The model was built with XGBoost and the caret (Classification And REgression Training) package, which were used to streamline training of the predictive model and to extract important features. Repeated 10-fold cross-validation was used to optimize model performance. The researchers verified the model’s accuracy by using matched genetic sequencing of patients’ tumor cells and white blood cells to identify the true source of the mutations.
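For readers unfamiliar with repeated 10-fold cross-validation, the splitting scheme can be sketched in plain Python. This is not the study's code, which used caret in R. The number of repeats is not given in the article, so the three repeats here are an assumption, and `repeated_kfold` is a hypothetical helper that does not stratify by class.

```python
import random


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle for every repeat
        for f in range(k):
            test_idx = order[f::k]  # every k-th item, so fold sizes differ by at most 1
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield r, f, train_idx, test_idx


# 426 is the number of training variants reported in the article.
splits = list(repeated_kfold(426, k=10, repeats=3))
fold_sizes = sorted({len(test) for _, _, _, test in splits})
print(len(splits), fold_sizes)  # prints: 30 [42, 43]
```

Each variant is held out exactly once per repeat, and averaging performance across all 30 train/test splits gives a more stable estimate than a single 10-fold pass.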

The locked model was then applied to an independent validation cohort of 1,418 plasma variants from 114 patients with metastatic breast, prostate, or non–small cell lung cancer as well as from cell-free DNA next-generation sequencing samples from patients in a prospective precision oncology clinical trial.

Finally, they provided proof of concept that the information was clinically useful by showing that plasmaCHORD’s prediction of mutation origin helped clinicians avoid selecting likely ineffective therapies for patients evaluated at the Johns Hopkins Molecular Tumor Board.

Findings

In the training set, plasmaCHORD achieved an area under the curve of 0.94 for differentiating tumor variants from clonal hematopoiesis variants.

In the independent validation cohort, the area under the curve was 0.9, and accuracy of variant origin identification improved from about 50% to 83% for clinically significant genes.
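The two metrics reported here measure different things: area under the curve summarizes how well the scores rank clonal hematopoiesis variants above tumor variants across all cutoffs, while accuracy depends on one chosen cutoff. The sketch below computes both on made-up toy data. The labels, scores, and 0.5 threshold are assumptions for illustration and are not drawn from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of variants whose thresholded score matches the true label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


# Made-up toy data: 1 = clonal hematopoiesis, 0 = tumor (not study data).
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
toy_auc = roc_auc(y_true, y_score)
toy_acc = accuracy(y_true, y_score)
print(round(toy_auc, 3), round(toy_acc, 3))  # prints: 0.889 0.667
```

An area under the curve of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.94 and 0.9 should be read.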

In addition, when the model was applied to the precision oncology trial, it correctly determined the origins of different variants, thereby preventing matches to incorrectly targeted therapies for each patient.

“About one-third of mutations detected in tumor-naive liquid biopsies can originate from white blood cells, and our ability to match targeted therapies to each patient’s genomic profile depends on our ability to distinguish tumor mutation from biological noise,” said senior study author Valsamo Anagnostou, MD, PhD, the Alex Grass Professor of Oncology and leader of the Johns Hopkins Molecular Tumor Board at the Johns Hopkins University School of Medicine. “An AI model applied to standard liquid biopsy tests could be both clinically valuable and quickly scalable.”

“PlasmaCHORD can be used going forward for both research and potentially for clinical purposes to identify the origin of mutations in a liquid biopsy if you’re not sure,” Dr. Canzoniero concluded, noting that the team is looking to further improve the model’s performance.

DISCLOSURES: The work was supported in part by the National Cancer Institute and the Department of Defense; the Bloomberg~Kimmel Institute for Cancer Immunotherapy; the ECOG-ACRIN Thoracic Malignancies Integrated Translational Science Center; the Dr. Miriam and Sheldon G. Adelson Medical Research Foundation; the Stand Up to Cancer-Dutch Cancer Society International Translational Cancer Research Dream Team Grant; the Gray Foundation; the Cole Foundation; the Commonwealth Foundation; the Johns Hopkins Research Program in Quantitative Sciences; the Maryland Cigarette Restitution Fund Johns Hopkins Faculty Recruitment grant; the Pearl M. Stetler fellowship award; and the Breast Cancer Research Foundation Marion R. Wright award. Dr. Canzoniero reports grants from the NIH, Johns Hopkins Research Program in Quantitative Sciences, Maryland Cigarette Restitution Fund Faculty Recruitment, Pearl M. Stetler Fellowship, and Breast Cancer Research Foundation Marion R. Wright Award during the conduct of the study as well as nonfinancial support from Foundation Medicine and personal fees from AstraZeneca outside the submitted work. She also has a patent pending for the algorithm mentioned in the study. Dr. Anagnostou receives grants and personal fees from AstraZeneca and LabCorp/Personal Genome Diagnostics and personal fees from Neogenomics, Guardant Health, Roche, ThermoFisher, and Foundation Medicine outside the submitted work and has seven pending patents. For full disclosures of the other study authors as well as data availability, visit aacrjournals.org/clincancerres.

ASCO AI in Oncology is published by Conexiant under a license arrangement with the American Society of Clinical Oncology, Inc. (ASCO®). The ideas and opinions expressed in ASCO AI in Oncology do not necessarily reflect those of Conexiant or ASCO. For more information, see Policies.
