Criteria for the translation of radiomics into clinically useful tests

ABSTRACT Computer-extracted tumour characteristics have been incorporated into medical imaging computer-aided diagnosis (CAD) algorithms for decades. With the advent of radiomics, an

extension of CAD involving high-throughput computer-extracted quantitative characterization of healthy or pathological structures and processes as captured by medical imaging, interest in

such computer-extracted measurements has increased substantially. However, despite the thousands of radiomic studies, the number of settings in which radiomics has been successfully

translated into a clinically useful tool or has obtained FDA clearance is comparatively small. This relative dearth might be attributable to factors such as the varying imaging and radiomic

feature extraction protocols used from study to study, the numerous potential pitfalls in the analysis of radiomic data, and the lack of studies showing that acting upon a radiomic-based

tool leads to a favourable benefit–risk balance for the patient. Several guidelines on specific aspects of radiomic data acquisition and analysis are already available, but a similar

roadmap for the overall process of translating radiomics into tools that can be used in clinical care is still needed. Herein, we provide 16 criteria for the effective execution of this process in

the hope that they will guide the development of more clinically useful radiomic tests in the future.

KEY POINTS

* Despite tens of thousands of radiomic studies, the number of settings in which radiomics is used to guide clinical decision-making is limited, in part owing to a lack of standardization of the radiomic measurement extraction processes and a lack of evidence demonstrating adequate clinical validity and utility.
* Processes to acquire and process source images and extract radiomic measurements should be established and harmonized.
* A radiomic model should be tested on external data not used for its development or, if no such dataset is available, tested using proper internal validation techniques.
* Model outputs should be shown to guide disease management decisions in a way that leads to a favourable benefit–risk balance for patients.
* Clinical performance should be assessed periodically in its intended clinical setting (task and population) after model lockdown.
* A list of 16 criteria for the optimal development of a radiomic test has been compiled herein to guide the implementation of future radiomic analyses.
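The key point on internal validation can be made concrete with a small sketch. The Python below, which uses only the standard library, shows k-fold cross-validation in which feature selection is repeated inside every fold, so that held-out cases never influence which features are chosen. The nearest-centroid classifier, the mean-difference feature ranking and all function names are illustrative assumptions, not methods prescribed by the article.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def select_features(X, y, n_keep):
    """Rank features by absolute difference in class means (hypothetical filter)."""
    scores = []
    for j in range(len(X[0])):
        m1 = mean(x[j] for x, lab in zip(X, y) if lab == 1)
        m0 = mean(x[j] for x, lab in zip(X, y) if lab == 0)
        scores.append((abs(m1 - m0), j))
    return [j for _, j in sorted(scores, reverse=True)[:n_keep]]


def fit_centroids(X, y, feats):
    """Nearest-centroid model: one mean vector per class over the chosen features."""
    return {
        lab: [mean(x[j] for x, l in zip(X, y) if l == lab) for j in feats]
        for lab in (0, 1)
    }


def predict(centroids, x, feats):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(feats, c))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def cross_validated_accuracy(X, y, k=5, n_keep=2, seed=0):
    """Estimate accuracy with feature selection redone on each training split."""
    correct = 0
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        # Selection and fitting see the training cases only.
        feats = select_features(X_tr, y_tr, n_keep)
        model = fit_centroids(X_tr, y_tr, feats)
        correct += sum(predict(model, X[i], feats) == y[i] for i in test_idx)
    return correct / len(X)
```

Selecting features once on the full dataset and cross-validating only the classifier would let the held-out cases influence the model, which tends to produce optimistically biased performance estimates. The sketch assumes binary labels coded 0 and 1 and that every training split contains both classes.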

REFERENCES

1. Gillies, R. J., Kinahan, P. E. & Hricak, H. Radiomics: images are more than pictures, they are data. _Radiology_ 278, 563–577 (2016).
2. Giger, M. L. Update on the potential of computer-aided diagnosis for breast cancer. _Fut. Oncol._ 6, 1–4 (2010).
3. Doi, K. Computer-aided diagnosis in medical imaging: historical review, current status, and future potential. _Comput. Med. Imaging Graph._ 31, 198–211 (2007).
4. Lambin, P. et al. Radiomics: extracting more information from medical images using advanced feature analysis. _Eur. J. Cancer_ 48, 441–446 (2012).
5. FDA-NIH Biomarker Working Group. _BEST (Biomarkers, EndpointS, and other Tools) Resource_ (Food and Drug Administration and National Institutes of Health, 2016).
6. FDA. _Artificial Intelligence and Machine Learning (AI/ML)-Enabled Devices_ https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices (2022).
7. Fornacon-Wood, I. M. et al. Reliability and prognostic value of radiomic features are highly dependent on choice of feature extraction platform. _Eur. Radiol._ 30, 6241–6250 (2020).
8. Radiomics. _Radiomics Quality Score – RQS 2.0_ https://www.radiomics.world/rqs2 (2022).
9. Zwanenburg, A. et al. The image biomarker standardization initiative: standardized quantitative radiomics for high throughput image-based phenotyping. _Radiology_ 295, 328–338 (2020).
10. Kumar, V. et al. Radiomics: the process and the challenges. _Magn. Reson. Imaging_ 30, 1234–1248 (2012).
11. Fournier, L. et al. Incorporating radiomics into clinical trials: expert consensus endorsed by the European Society of Radiology on considerations for data-driven compared to biologically driven quantitative biomarkers. _Eur. Radiol._ 31, 6001–6012 (2021).
12. McShane, L. M. et al. Criteria for the use of omics-based predictors in clinical trials: explanation and elaboration. _BMC Med._ 11, 220 (2013).
13. Jiang, Y., Edwards, A. V. & Newstead, G. M. Artificial intelligence applied to breast MRI for improved diagnosis. _Radiology_ 298, 39–46 (2021).
14. Data Science Institute, American College of Radiology. _FDA Cleared AI Algorithms_ https://www.acrdsi.org/DSI-Services/FDA-Cleared-AI-Algorithms (2022).
15. Clark, G. M. Prognostic factors versus predictive factors: examples from a clinical trial of erlotinib. _Mol. Oncol._ 1, 406–412 (2008).
16. Aerts, H. J. W. L. et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. _Nat. Commun._ 5, 4006 (2014).
17. Li, H. et al. Quantitative MRI radiomics in the prediction of molecular classifications of breast cancer subtypes in the TCGA/TCIA data set. _NPJ Breast Cancer_ 2, 16012 (2016).
18. Li, H. et al. MRI radiomics signatures for predicting the risk of breast cancer recurrence as given by research versions of gene assays of MammaPrint, Oncotype DX, and PAM50. _Radiology_ 281, 382–391 (2016).
19. Cha, K. H. et al. Bladder cancer treatment response assessment in CT using radiomics with deep learning. _Sci. Rep._ 7, 8738 (2017).
20. Drukker, K. et al. Most-enhancing tumor volume by MRI radiomics predicts recurrence-free survival “Early On” in neoadjuvant treatment of breast cancer. _Cancer Imaging_ 18, 12 (2018).
21. Huang, E. P., Lin, F. I. & Shankar, L. K. Beyond correlations, sensitivities, and specificities: a roadmap for demonstrating utility of advanced imaging in oncology treatment and clinical trial design. _Acad. Radiol._ 24, 1036–1049 (2017).
22. Subramanian, J. & Simon, R. What should physicians look for in evaluating prognostic gene-expression signatures? _Nat. Rev. Clin. Oncol._ 7, 327–334 (2010).
23. Shafiq-Ul-Hassan, M. et al. Intrinsic dependencies of CT radiomic features on voxel size and number of gray levels. _Med. Phys._ 44, 1050–1062 (2017).
24. Berenguer, R. et al. Radiomics of CT features may be nonreproducible and redundant: influence of CT acquisition parameters. _Radiology_ 288, 407–415 (2018).
25. American College of Radiology. _ACR Appropriateness Criteria_ https://www.acr.org/Clinical-Resources/ACR-Appropriateness-Criteria (2022).
26. Society of Nuclear Medicine and Molecular Imaging. _Procedure Standards_ https://www.snmmi.org/ClinicalPractice/content.aspx?ItemNumber=6414 (2022).
27. European Association of Nuclear Medicine. _Guidelines_ https://www.eanm.org/publications/guidelines/ (2022).
28. QIBA Wiki. _Profiles_ http://qibawiki.rsna.org/index.php/Profiles (2022).
29. Fass, L. Imaging and cancer: a review. _Mol. Oncol._ 2, 115–152 (2008).
30. Zhao, B. et al. Exploring intra- and inter-reader variability in unidimensional, bidimensional, and volumetric measurements of solid tumors on CT scans reconstructed at different slice intervals. _Eur. J. Radiol._ 82, 959–968 (2013).
31. O’Connor, J. P. B., Jackson, A., Parker, G. J. M., Roberts, C. & Jayson, G. C. Dynamic contrast-enhanced MRI in clinical trials of anti-vascular therapies. _Nat. Rev. Clin. Oncol._ 9, 167–177 (2012).
32. Tudorica, L. A. et al. QIN: a feasible high spatiotemporal resolution breast DCE-MRI protocol for clinical settings. _Magn. Reson. Imaging_ 30, 1257–1267 (2012).
33. Nardone, V. et al. Delta radiomics: a systematic review. _Radiol. Med._ 126, 1571–1583 (2021).
34. Pinker, K., Riedl, C. & Weber, W. A. Evaluating tumor response with FDG-PET: updates on PERCIST, comparison with EORTC criteria and clues to future development. _Eur. J. Nucl. Med. Mol. Imaging_ 44, 55–66 (2017).
35. Mackin, D. et al. Harmonizing the pixel size in retrospective computed tomography radiomics studies. _PLoS ONE_ 12, e0178524 (2017).
36. Madabhushi, A., Udupa, J. K. & Souza, A. Generalized scale: theory, algorithms, and application to image inhomogeneity correction. _Comput. Vis. Image Underst._ 101, 100–121 (2006).
37. Madabhushi, A. & Udupa, J. K. New methods of MR image intensity standardization via generalized scale. _Med. Phys._ 33, 3426–3434 (2006).
38. Whitney, H. M. et al. Harmonization of radiomic features of breast lesions across international DCE-MRI datasets. _J. Med. Imaging_ 7, 012707 (2020).
39. Duron, L. et al. Gray-level discretization impacts reproducible MRI radiomics texture features. _PLoS ONE_ 14, e0213459 (2019).
40. Larue, R. T. H. M. et al. Influence of gray level discretization on radiomic feature stability for different CT scanners, tube currents, and slice thicknesses: a comprehensive phantom study. _Acta Oncol._ 56, 1544–1553 (2017).
41. Leijenaar, R. T. et al. The effect of SUV discretization in quantitative FDG-PET radiomics: the need for standardized methodology in tumor texture analysis. _Sci. Rep._ 5, 11075 (2015).
42. Willemink, M. J. et al. Preparing medical imaging data for machine learning. _Radiology_ 295, 4–15 (2020).
43. Mali, S. A. et al. Making radiomics more reproducible across scanner and imaging protocol variations: a review of harmonization methods. _J. Pers. Med._ 11, 842 (2021).
44. Lin, Y. et al. Deep learning for fully automated tumor segmentation and extraction of magnetic resonance radiomics features in cervical cancer. _Eur. Radiol._ 30, 1297–1305 (2020).
45. Parmar, C., Grossman, P., Bussink, J., Lambin, P. & Aerts, H. J. W. L. Machine learning methods for quantitative radiomic biomarkers. _Sci. Rep._ 5, 13087 (2015).
46. Primakov, S. P. et al. Automated detection and segmentation of non-small cell lung cancer computed tomography images. _Nat. Commun._ 13, 3423 (2022).
47. Gilhuijs, K. G. A., Giger, M. L. & Bick, U. Automated analysis of breast lesions in three dimensions using dynamic magnetic resonance imaging. _Med. Phys._ 25, 1647–1654 (1998).
48. Chen, W., Giger, M. L., Lan, L. & Bick, U. Computerized interpretation of breast MRI: investigation of enhancement-variance dynamics. _Med. Phys._ 31, 1076–1082 (2004).

49. Chen, W., Giger, M. L., Bick, U. & Newstead, G. Automatic identification and classification of characteristic kinetic curves of breast lesions on DCE-MRI. _Med. Phys._ 33, 2878–2887 (2006).
50. Chen, W., Giger, M. L., Li, H., Bick, U. & Newstead, G. Volumetric texture analysis of breast lesions on contrast-enhanced magnetic resonance images. _Magn. Reson. Med._ 58, 562–571 (2007).
51. van Timmeren, J. E. et al. Test-retest data for radiomics feature stability analysis: generalizable or study-specific? _Tomography_ 2, 361–365 (2016).
52. Afshar, P., Mohammadi, A., Plataniotis, K. N., Oikonomou, A. & Benali, H. From hand-crafted to deep learning-based cancer radiomics: challenges and opportunities. _IEEE Signal Process. Mag._ 36, 132–160 (2019).
53. Sahiner, B. et al. Deep learning in medical imaging and radiation therapy. _Med. Phys._ 46, e1–e36 (2019).
54. Li, Z., Wang, Y., Yu, J., Guo, Y. & Cao, W. Deep learning based radiomics (DLR) and its usage in noninvasive IDH1 prediction for low grade glioma. _Sci. Rep._ 7, 1–11 (2017).
55. Antropova, N., Huynh, B. Q. & Giger, M. L. A deep feature fusion methodology for breast cancer diagnosis demonstrated on three imaging modality datasets. _Med. Phys._ 44, 5162–5171 (2017).
56. International Organization for Standardization. _Guidance for the Use of Repeatability, Reproducibility, and Trueness Estimates in Measurement Uncertainty Evaluation_ https://www.iso.org/obp/ui/#iso:std:iso:21748:ed-2:v1:en (2020).
57. Drukker, K., Pesce, L. & Giger, M. L. Repeatability in computer-aided diagnosis: application to breast cancer diagnosis on sonography. _Med. Phys._ 37, 2659–2669 (2010).
58. Kessler, L. G. et al. The emerging science of quantitative imaging biomarkers terminology and definitions for scientific studies and regulatory submissions. _Stat. Methods Med. Res._ 24, 9–26 (2015).
59. Raunig, D. L. et al. Quantitative imaging biomarkers: a review of statistical methods for technical performance assessment. _Stat. Methods Med. Res._ 24, 27–67 (2015).
60. Huang, E. P. et al. Multiparametric quantitative imaging in risk prediction: recommendations for data acquisition, technical performance assessment, and model development and validation. _Acad. Radiol._ https://doi.org/10.1016/j.acra.2022.09.018 (2022).
61. McHugh, D. J. et al. Image contrast, image preprocessing, and T1-mapping affect MRI radiomic feature repeatability in patients with colorectal cancer liver metastases. _Cancers_ 13, 240 (2021).
62. Jha, A. K. et al. Repeatability and reproducibility study of radiomic features on a phantom and human cohort. _Sci. Rep._ 11, 2055 (2021).
63. Bissoto, A., Perez, F., Valle, E. & Avila, S. Skin lesion synthesis with generative adversarial networks. _OR 2.0 Context-Aware Operating Theaters, Computer Assisted Robotic Endoscopy, Clinical Image-Based Procedures, and Skin Image Analysis. OR 2.0 First International Workshop, CARE Fifth International Workshop, CLIP Seventh International Workshop, ISIC Third International Workshop_. Springer Lecture Notes in Computer Science (Springer, 2019).
64. Sullivan, D. C. et al. Metrology standards for quantitative imaging biomarkers. _Radiology_ 277, 813–825 (2015).
65. Hackstadt, A. J. & Hess, A. M. Filtering for increased power for microarray data analysis. _BMC Bioinformatics_ 10, 11 (2009).
66. Luo, J. et al. A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data. _Pharmacogenomics J._ 10, 278–291 (2010).
67. Johnson, W. E., Li, C. & Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. _Biostatistics_ 8, 118–127 (2007).
68. Orlhac, F. et al. A post-reconstruction harmonization method for multicenter radiomic studies in PET. _J. Nucl. Med._ 59, 1321–1328 (2018).
69. Parker, H. S. & Leek, J. T. The practical effect of batch on genomic prediction. _Stat. Appl. Genet. Mol. Biol._ 11, 10 (2012).
70. Robinson, K., Li, H., Lan, L., Schacht, D. & Giger, M. Radiomics robustness assessment and classification evaluation: a two-stage method demonstrated on multivendor FFDM. _Med. Phys._ 46, 2145–2156 (2019).
71. _The Cancer Imaging Archive_ http://cancerimagingarchive.net (2020).
72. Clark, K. et al. The Cancer Imaging Archive (TCIA): maintaining and operating a public information repository. _J. Digital Imaging_ 26, 1045–1057 (2013).
73. Zhu, Y. et al. Deciphering genomic underpinnings of quantitative MRI-based radiomic phenotypes of invasive breast carcinoma. _Sci. Rep._ 5, 17787 (2015).
74. Riley, R. D. et al. Minimum sample size for developing a multivariable prediction model: part II – binary and time-to-event outcomes. _Stat. Med._ 38, 1276–1296 (2018).
75. Riley, R. D. et al. Minimum sample size for external validation of a clinical prediction model with a binary outcome. _Stat. Med._ 40, 4230–4251 (2021).
76. Riley, R. D. et al. Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome. _Stat. Med._ 41, 1280–1295 (2022).
77. Cho, J., Lee, K., Shin, E., Choy, G. & Do, S. How much data is needed to train a medical image deep learning system to achieve necessary high accuracy? Preprint at https://doi.org/10.48550/arXiv.1511.06348 (2015).
78. Whitney, H., Li, H., Ji, Y., Liu, P. & Giger, M. L. Comparison of breast MRI tumor classification using human-engineered radiomics, transfer learning from deep convolutional neural networks, and fusion methods. _Proc. IEEE_ 108, 163–177 (2020).
79. Hastie, T., Tibshirani, R. & Friedman, J. _The Elements of Statistical Learning: Data Mining, Inference and Prediction_ 2nd edn (Springer, 2009).
80. Deist, T. M. et al. Machine learning algorithms for outcome prediction in (chemo)radiotherapy: an empirical comparison of classifiers. _Med. Phys._ 45, 3449–3459 (2018).
81. Haykin, S. _Neural Networks: A Comprehensive Foundation_ (Prentice Hall, 1994).
82. Ben-Dor, A. et al. Tissue classification with gene expression profiles. _J. Comput. Biol._ 7, 559–583 (2000).
83. Dudoit, S., Fridlyand, J. & Speed, T. P. Comparison of discrimination methods for the classification of tumors using gene expression data. _J. Am. Stat. Assoc._ 97, 77–87 (2002).
84. Heinze, G., Wallisch, C. & Dunkler, D. Variable selection – a review and recommendations for the practicing statistician. _Biom. J._ 60, 431–449 (2018).
85. Tibshirani, R. Regression shrinkage and selection via the LASSO. _J. R. Stat. Soc. Ser. B_ 58, 267–288 (1996).
86. Hanley, J. A. & McNeil, B. J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. _Radiology_ 143, 29–36 (1982).
87. Harrell, F. E. Jr., Califf, R. M., Pryor, D. B., Lee, K. L. & Rosati, R. A. Evaluating the yield of medical tests. _J. Am. Med. Assoc._ 247, 2543–2546 (1982).
88. Hosmer, D. W. & Lemeshow, S. Goodness of fit tests for the multiple logistic regression model. _Commun. Stat. Theory Methods_ 9, 1043–1069 (1980).
89. Lemeshow, S. & Hosmer, D. A review of goodness of fit statistics for use in the development of logistic regression model. _Am. J. Epidemiol._ 115, 92–106 (1982).
90. van Calster, B. & Steyerberg, E. W. _Wiley StatsRef: Statistics Reference Online_ (John Wiley and Sons, Ltd., 2018).
91. Bröcker, J. & Smith, L. A. Increasing the reliability of reliability diagrams. _Weather Forecast._ 22, 651–661 (2007).
92. McLachlan, G. J. _Discriminant Analysis and Statistical Pattern Recognition_ (John Wiley and Sons, 2002).
93. Stone, M. Cross-validatory choice and assessment of statistical predictions. _J. R. Stat. Soc. Ser. B_ 36, 111–147 (1974).
94. Breiman, L. Bagging predictors. _Mach. Learn._ 24, 123–140 (1996).
95. Molinaro, A. M., Simon, R. & Pfeffer, R. M. Prediction error estimation: a comparison of resampling methods. _Bioinformatics_ 21, 3301–3307 (2005).
96. Dobbin, K. K. & Simon, R. M. Optimally splitting cases for training and testing high-dimensional classifiers. _BMC Med. Genomics_ 4, 31 (2011).
97. Sachs, M. C. & McShane, L. M. Issues in developing multivariable molecular signatures for guiding clinical care decisions. _J. Biopharm. Stat._ 26, 1098–1110 (2016).
98. Varma, S. & Simon, R. Bias in error estimation when using cross-validation for model selection. _BMC Bioinformatics_ 7, 91 (2006).
99. Salahuddin, Z., Woodruff, H. C., Chatterjee, A. & Lambin, P. Transparency of deep neural networks for medical image analysis: a review of interpretability methods. _Comput. Biol. Med._ 140, 105111 (2022).
100. Hilsenbeck, S. G., Clark, G. M. & McGuire, W. L. Why do so many prognostic factors fail to pan out? _Breast Cancer Res. Treat._ 22, 197–206 (1992).
101. Vickers, A. J. & Elkin, E. B. Decision curve analysis: a novel method for evaluating prediction models. _Med. Decis. Mak._ 26, 565–574 (2006).
102. Wu, G. et al. Preoperative CT-based radiomics combined with intraoperative frozen section is predictive of invasive adenocarcinoma in pulmonary nodules: a multicenter study. _Eur. Radiol._ 30, 2680–2691 (2020).
103. Hayes, D. F. Defining clinical utility of tumor biomarker tests: a clinician’s viewpoint. _J. Clin. Oncol._ 39, 238–249 (2021).
104. Saha, A., Hosseinzadeh, M. & Huisman, H. End-to-end prostate cancer detection in bpMRI via 3D CNNs: effects of attention mechanisms, clinical priori and decoupled false positive reduction. _Med. Image Anal._ 73, 102155 (2021).
105. Hosseinzadeh, M. et al. Deep learning-assisted prostate cancer detection on bi-parametric MRI: minimum training data size requirements and effect of prior knowledge. _Eur. Radiol._ 32, 2224–2234 (2022).
106. Baughan, N. et al. _Sequestration of Imaging Studies in MIDRC: A Multi-institutional Data Commons._ _Medical Imaging 2022: Image Perception, Observer Performance, and Technology Assessment_, vol. 12035 (SPIE, 2022).
107. Simon, R. M., Paik, S. & Hayes, D. F. Use of archived specimens in evaluation of prognostic and predictive biomarkers. _J. Natl Cancer Inst._ 101, 1446–1452 (2009).
108. Pappalardo, F., Gusso, G., Tshinanu, F. M. & Viceconti, M. In silico clinical trials: concepts and early adoptions. _Brief. Bioinforma._ 20, 1699–1708 (2019).
109. Committee on the Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials, Board on Health Care Services, Board on Health Sciences Policy, Institute of Medicine. _Evolution of Translational Omics: Lessons Learned and the Path Forward_ (The National Academies Press, 2012).
110. Altman, D. G., McShane, L. M., Sauerbrei, W. & Taube, S. E. Reporting recommendations for tumor marker prognostic studies (REMARK): explanation and elaboration. _PLoS Med._ 9, e1001216 (2012).
111. Equator Network. _Enhancing the Quality and Transparency of Health Research_ (EQUATOR) https://www.equator-network.org/ (2022).

ACKNOWLEDGEMENTS

P.L. acknowledges support for the

publication of this work from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement CHAIMELEON No. 952172, EuCanImage No. 952103, IMI-OPTIMA No. 101034347

and ERC advanced grant (ERC-ADG-2015 No. 694812 – Hypoximmuno). P.K. acknowledges support for the publication of this work from NCI grant P50 CA228944.

AUTHOR INFORMATION

AUTHORS AND AFFILIATIONS

* Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Rockville, MD, USA: Erich P. Huang, Lisa M. McShane & Lalitha K. Shankar
* Division of Radiotherapy and Imaging, Institute of Cancer Research, London, UK: James P. B. O’Connor
* Department of Radiology, University of Chicago, Chicago, IL, USA: Maryellen L. Giger
* Department of Precision Medicine, Maastricht University, Maastricht, Netherlands: Philippe Lambin
* Department of Radiology, University of Washington, Seattle, WA, USA: Paul E. Kinahan
* Department of Diagnostic Radiology, University of Maryland, Baltimore, MD, USA: Eliot L. Siegel

CORRESPONDING AUTHOR

Correspondence to Erich P. Huang.

ETHICS DECLARATIONS

COMPETING INTERESTS

M.G. has acted as a scientific adviser of Quantitative Insights (now Qlarity Imaging),

is the contact Principal Investigator for MIDRC (funded by NIBIB COVID-19 Contract 75N92020D00021), receives royalties from Hologic, GE Medical Systems, MEDIAN Technologies, Riverain

Medical, Mitsubishi and Toshiba, holds stocks in R2/Hologic, is a shareholder in Qview, and is a co-founder of and equity holder in Quantitative Insights (now Qlarity Imaging). P.L. is a

co-founder, minority shareholder and member of the advisory board of Oncoradiomics, and is listed as a co-inventor on several licensed patents in radiomics. E.P.H., J.P.B.O.-C., L.M.M.,

P.E.K., E.L.S. and L.K.S. declare no competing interests. PEER REVIEW INFORMATION _Nature Reviews Clinical Oncology_ thanks K. Bera, J.-E. Bibault, J. Tian and the other,

anonymous, reviewer(s) for their contribution to the peer review of this work. ADDITIONAL INFORMATION PUBLISHER’S NOTE Springer Nature remains neutral with regard to jurisdictional claims in

published maps and institutional affiliations.

GLOSSARY

* Biomarker: A characteristic indicating non-pathological or pathological biological processes and/or an increased likelihood of a response to an exposure or intervention (ref. 5).
* Clinical utility: The degree to which acting upon the results of the radiomic test leads to a favourable benefit–risk balance for the patient.
* Clinical validity: The adequacy of the clinical performance of the radiomic test for its intended purpose.
* Deep learning: A class of machine learning based on neural networks.
* Model: A computational algorithm applied to extracted image features or voxel-level image data themselves.
* Model outputs: The result of a computational algorithm applied to the extracted image features or voxel-level data themselves; a quantity to be used in guiding clinical management.
* Model validation: Establishment of the ability of a model to predict an outcome of interest when applied to new data.
* Neural network: A type of computational algorithm based on the operation of biological neural systems in animals that feeds the input (in this context, feature measurements or voxel-level data) through a series of nodes that perform mathematical operations on the outputs of preceding nodes to produce an output. In a convolutional neural network, these mathematical operations involve applying convolutional kernels to the outputs of preceding nodes.
* Normalization: A process for adjusting the voxel intensity values of an image for differences resulting from variability in image acquisition and processing parameters.
* Omics: The study of related sets of biological molecules in a comprehensive fashion, with examples including genomics, transcriptomics, proteomics, metabolomics and epigenomics (ref. 109). Radiomics naturally extends this definition to include quantification of radiological imaging features for the purposes of characterization and measurement of structure, function and interaction between biological molecules in a comprehensive and high-throughput manner.
* Overfitting: The process of fitting an overly complex model to noise in the data, thus producing a model that is only poorly predictive when applied to completely new data.
* Performance metric: A quantity indicating the ability of a model to predict an outcome of interest.
* Phantoms: Objects that are imaged to measure the technical performance of an imaging device.
* Radiomic features: Quantities computed from voxel-level image data.
* Radiomic test: A system comprising materials, methods and procedures for image acquisition, processing and feature extraction, and methods or criteria for interpretation of the image data for use in guiding clinical management.
* Technical artefacts: The effects of factors, such as imaging centre, device, operator or device-calibration settings, on the distribution of the feature measurements.
* Technical validity: The quality of the feature measurements in terms of their accuracy in assaying an underlying characteristic of interest or their variability when the feature extraction process is applied repeatedly to the same patient.
* Test lockdown: Full specification of all image acquisition, processing and feature extraction procedures, all aspects of the underlying model, and interpretations of the output.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Huang, E.P., O’Connor, J.P.B., McShane, L.M. _et al._ Criteria for the translation of radiomics into clinically useful tests. _Nat Rev Clin Oncol_ 20, 69–82 (2023).

https://doi.org/10.1038/s41571-022-00707-0

* Accepted: 02 November 2022
* Published: 28 November 2022
* Issue Date: February 2023
* DOI: https://doi.org/10.1038/s41571-022-00707-0