Artificial intelligence assisted identification of newborn auricular deformities via smartphone application
Liu-Jie Ren*, Rui-Jie Yang*, Li-Li Chen*, Shu-Yue Wang, Chen-Long Li, Yuan Huang, Tian-Yu Zhang☨, Yao-Yao Fu☨, Shuo Wang☨
eClinicalMedicine (IF=9.6)
Abstract
Background Auricular deformities are common in newborns and require early diagnosis and timely intervention. Several factors highlight the need for a machine learning-based diagnostic solution: the high prevalence of these conditions, the narrow time window for effective non-surgical treatment, limited medical resources, and the importance of both physical and mental well-being. This study presents a novel artificial intelligence (AI) model that identifies and classifies common sub-types of auricular deformities from photos taken with mobile devices.
Methods The dataset combined the open-source BabyEar4k dataset, which contains 3852 auricle images with diagnosis data, and a private dataset of 104 microtia ears from the ENT Hospital of Fudan University. All training photos were pre-processed to 800 × 800 RGB images with the auricle centred. The dataset was divided into two parts: 3835 samples for training/validation and 120 (20 per class) for testing, i.e., the internal test dataset. During training, 15% of the training data were held out for validation. External validation was conducted on data from three centres across China (Xinjiang N = 252, Guizhou N = 186, and Fujian N = 252). Model performance was also evaluated by comparative analyses with human volunteers. A prospective test set was collected in Shanghai (Obstetrics & Gynecology Hospital of Fudan University, from 2023/10/17 to 2023/12/29; N = 272). Because the distribution of sub-types varied significantly, accuracy and weighted F1-score were chosen as the primary evaluation metrics.
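To illustrate why weighted F1-score is reported alongside accuracy when sub-types are imbalanced, the following is a minimal sketch of both metrics in plain Python. It is not the study's evaluation code: the function names and the toy labels are assumptions made for illustration, and weighted F1 is taken in its usual sense of per-class F1 averaged with weights proportional to each class's number of true samples.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    support = Counter(y_true)  # number of true samples per class
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1  # weight each class's F1 by its support
    return total / len(y_true)


# Toy, hypothetical labels (not study data): an imbalanced two-class case.
y_true = ["normal"] * 6 + ["microtia"] * 2
y_pred = ["normal"] * 5 + ["microtia"] + ["microtia", "normal"]

print("accuracy:", accuracy(y_true, y_pred))
print("weighted F1:", weighted_f1(y_true, y_pred))
```

Because each class's F1 is weighted by its support, the resulting score reflects the class proportions of the dataset while still accounting for per-class precision and recall.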
Findings Four backbone architectures were evaluated: ResNet50, DenseNet121, EfficientNet, and RegNet. On the internal test set, the model achieved an accuracy of 0.83–0.85 for six-class classification and 0.94–0.98 for binary classification. The ResNet50 backbone showed the most consistent performance. Multi-centre real-world data validation demonstrated satisfactory accuracy, with a range of 0.74–0.82 for six-class classification and 0.79–0.86 for normal/abnormal classification, indicating strong generalizability. In comparative analyses with volunteers on the six-class classification task, the professionals achieved an accuracy of 0.7–0.8, the related fellows 0.45–0.65, and the laypeople 0.45–0.55.
Interpretation The developed system offers an efficient and cost-effective solution for clinical applications, including early diagnosis of newborn auricular deformities, monitoring treatment progress, and educational purposes.