Abstract:

Objective To evaluate the diagnostic performance and inter-reviewer agreement of O-RADS and two other well-established ultrasound (US) classification systems for predicting malignancy in adnexal masses (AMs).

Methods A total of 299 patients with 324 AMs who underwent surgical treatment at Changzhou Second People's Hospital from January 2017 to December 2020 were enrolled in this retrospective analysis. Three experienced ultrasonographers independently categorized each AM according to the Ovarian-Adnexal Imaging Reporting and Data System (O-RADS), the Gynecologic Imaging-Reporting and Data System (GI-RADS), and the Assessment of Different NEoplasias in the adneXa (ADNEX) model. Receiver operating characteristic (ROC) curves were plotted to determine the optimal cut-off values, and the diagnostic efficacy of the three models was evaluated with pathological findings as the gold standard. Kappa statistics were used to assess inter-reviewer agreement (IRA).

Results When O-RADS 3 was set as the cut-off value, the area under the ROC curve (AUC) of the O-RADS model was 0.981, which was significantly greater than that of the GI-RADS (0.934) and ADNEX (0.907) models (P < 0.05). There was no significant difference between the AUCs of the GI-RADS and ADNEX models (P > 0.05). The specificity of the GI-RADS model for diagnosing malignant AMs was 88.4%, which was significantly lower than that of the O-RADS (92.9%) and ADNEX (93.3%) models (P < 0.05). There was no significant difference in specificity between the O-RADS and ADNEX models (P > 0.05). IRA was high for all three models, with kappa values ranging from 0.857 to 0.937.

Conclusions The O-RADS, GI-RADS, and ADNEX models all have high value for diagnosing malignancy in AMs, with high IRA. Nevertheless, the diagnostic performance of the O-RADS model is superior to that of the GI-RADS and ADNEX models.
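The statistics named in the Methods (AUC, an ROC-derived cut-off, and kappa) can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: the optimal cut-off is taken as the one maximizing Youden's index, and agreement is computed as Cohen's kappa between one pair of reviewers. The function names and the toy data are hypothetical and invented purely for illustration; they are not the study's data or the authors' code.

```python
from collections import Counter


def auc(scores, labels):
    """AUC via the Mann-Whitney statistic: the fraction of (malignant, benign)
    pairs in which the malignant mass scores higher (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index
    J = sensitivity + specificity - 1, calling a mass malignant when
    score >= cutoff. Assumption: the abstract does not name its criterion."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]


def cohen_kappa(a, b):
    """Cohen's kappa for two reviewers' categorical ratings.
    Assumption: pairwise kappa; the abstract does not name the variant."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    p_expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data, invented for illustration only (not from the study).
reviewer_1 = [2, 2, 3, 3, 4, 4, 5, 5, 3, 4]   # assigned risk categories
reviewer_2 = [2, 2, 3, 3, 4, 4, 5, 5, 3, 3]   # second reviewer, one disagreement
malignant = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0]    # pathology: 1 = malignant

toy_auc = auc(reviewer_1, malignant)
toy_cutoff, toy_sens, toy_spec = youden_cutoff(reviewer_1, malignant)
toy_kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"AUC={toy_auc:.3f}, cutoff>={toy_cutoff}, "
      f"sens={toy_sens:.3f}, spec={toy_spec:.3f}, kappa={toy_kappa:.3f}")
```

With three reviewers, as in this study, one would either report such a kappa for each reviewer pair or use a multi-rater statistic such as Fleiss' kappa; the abstract does not say which was used.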