Paper published in Nature Communications
Large-scale drug sensitivity screens have made it possible to train drug response prediction models on cancer cell line omics profiles, with the aim of advancing personalized medicine. While model performances reported in the literature appear promising, successful translation to the clinic remains limited. In this work, we discuss key obstacles that lead to overly optimistic performance estimates of state-of-the-art models, making it challenging to track progress in the field. To address these obstacles, we present DrEval, a pipeline for unbiased, biologically meaningful evaluation of cancer drug response models. DrEval is designed as a living open-source benchmark that integrates baseline and literature models with standardized hyperparameter tuning, statistically rigorous evaluation, and cross-study benchmarks, and that supports ablation studies and publication-ready visualizations. Using DrEval, we show that deep learning models barely outperform a naive model that predicts only the mean drug and cell line effects, while no complex model outperforms properly tuned tree-based ensemble baselines in relevant settings.
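To make the "naive model" concrete, here is a minimal sketch of one way a mean-effects predictor could be written. It assumes training data arrive as (cell line, drug, response) tuples and that a prediction is the overall mean plus a per-cell-line offset and a per-drug offset. The function names and toy data are hypothetical, this is not the drevalpy API, and the paper's exact formulation may differ.

```python
from collections import defaultdict
from statistics import mean


def fit_mean_effects(train):
    """Fit overall mean plus per-cell-line and per-drug offsets.

    `train` is an iterable of (cell_line, drug, response) tuples
    (an assumed input layout, chosen for illustration).
    """
    train = list(train)
    overall = mean(r for _, _, r in train)

    by_cell, by_drug = defaultdict(list), defaultdict(list)
    for cell, drug, response in train:
        by_cell[cell].append(response)
        by_drug[drug].append(response)

    # Each effect is the group's mean response relative to the overall mean.
    cell_effect = {c: mean(v) - overall for c, v in by_cell.items()}
    drug_effect = {d: mean(v) - overall for d, v in by_drug.items()}
    return overall, cell_effect, drug_effect


def predict_mean_effects(model, cell, drug):
    """Predict a response; unseen cell lines or drugs contribute no offset."""
    overall, cell_effect, drug_effect = model
    return overall + cell_effect.get(cell, 0.0) + drug_effect.get(drug, 0.0)


# Toy data, invented purely for illustration.
train = [
    ("A", "d1", 1.0),
    ("A", "d2", 3.0),
    ("B", "d1", 2.0),
    ("B", "d2", 6.0),
]
model = fit_mean_effects(train)
print(predict_mean_effects(model, "A", "d1"))  # seen cell line, seen drug
print(predict_mean_effects(model, "C", "d1"))  # unseen cell line falls back to the drug effect
```

A predictor like this uses no omics features at all, which is why it serves as a useful reference point: a model that cannot clearly beat it has learned little beyond average drug potency and average cell line sensitivity.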
🔗 Links:
📄 Paper: doi.org/10.1038/s41467-026-72903-w
💻 Python package: https://github.com/daisybio/drevalpy
🍏 nf-core pipeline: https://nf-co.re/drugresponseeval/1.2.0/