Deep Learning Assessment of Small Renal Masses at Contrast-enhanced Multiphase CT
Chenchen Dai, Ying Xiong, Pingyi Zhu, Linpeng Yao, Jinglai Lin, Jiaxi Yao, Xue Zhang, Risheng Huang, Run Wang, Jun Hou, Kang Wang, Zhang Shi, Feng Chen, Jianming Guo, Mengsu Zeng, Jianjun Zhou†, Shuo Wang†
Radiology (IF = 19.7)
Abstract
Background: Accurate characterization of suspicious small renal masses is crucial for optimized management. Deep learning (DL) algorithms may assist with this effort.
Purpose: To develop and validate a DL algorithm for identifying benign small renal masses at contrast-enhanced multiphase CT.
Materials and Methods: Surgically resected renal masses measuring 3 cm or less in diameter at contrast-enhanced CT were included. The DL algorithm was developed by using retrospective data from one hospital between 2009 and 2021, with patients randomly allocated to training and internal test sets in an 8:2 ratio. External testing was performed on data collected between 2013 and 2021 from five independent hospitals. A prospective test set was obtained between 2021 and 2022 from one hospital. Algorithm performance was evaluated by using the area under the receiver operating characteristic curve (AUC) and compared with the results of seven clinicians by using the DeLong test.
Results: A total of 1703 patients (mean age, 56 years ± 12 [SD]; 619 female) with a single renal mass per patient were evaluated. The retrospective data set included 1063 lesions (874 in the training set, 189 in the internal test set); the multicenter external test set included 537 lesions (66 benign, 12.3%), of which 89 (16.6%) were subcentimeter (≤1 cm) lesions; and the prospective test set included 103 lesions (14 benign, 13.6%), of which 20 (19.4%) were subcentimeter lesions. The DL algorithm performance was comparable with that of urological radiologists: for the external test set, AUC was 0.80 (95% CI: 0.75, 0.85) versus 0.84 (95% CI: 0.78, 0.88) (P = .61); for the prospective test set, AUC was 0.87 (95% CI: 0.79, 0.93) versus 0.92 (95% CI: 0.86, 0.96) (P = .70). For subcentimeter lesions in the external test set, the algorithm and urological radiologists had similar AUCs of 0.74 (95% CI: 0.63, 0.83) and 0.81 (95% CI: 0.68, 0.92) (P = .78), respectively.
Conclusion: The multiphase CT-based DL algorithm showed comparable performance with that of radiologists for identifying benign small renal masses, including lesions of 1 cm or less.
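As an illustration of the evaluation metric named in the abstract, the sketch below computes an AUC and a 95% confidence interval for a binary "benign versus not benign" score. It is not the authors' code: the function names `auc` and `bootstrap_ci`, the toy labels and scores, and the choice of a percentile bootstrap for the interval are all assumptions made for this example. The abstract does not state how its intervals were derived, and the DeLong test it uses for comparing readers is a different procedure that is not implemented here.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    labels: 0/1 integers (1 = benign, by assumption); ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed stand-in method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    toy_labels = [1, 1, 0, 0, 0, 1, 0, 0, 0, 1]
    toy_scores = [0.9, 0.4, 0.6, 0.2, 0.1, 0.7, 0.3, 0.5, 0.2, 0.8]
    print("AUC:", auc(toy_labels, toy_scores))
    print("95% CI:", bootstrap_ci(toy_labels, toy_scores))
```

The pairwise definition is quadratic in the number of cases, which is acceptable for test sets of a few hundred lesions; a rank-based formulation would be preferable for larger data.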