Background

Angiomyolipoma (AML) and renal cell carcinoma (RCC) are the most prevalent benign and malignant kidney tumors, respectively [1, 2]. Approximately 5% of AMLs lack visible fat in CT images, making their differentiation from RCC challenging [2, 3]. The likelihood of aggressive RCC increases significantly when the tumor diameter exceeds 3 cm [4, 5, 6]; therefore, the American Urological Association guidelines recommend active surveillance for small renal masses with a diameter < 3 cm [7]. Radiofrequency ablation, microwave ablation, and cryoablation are also options for patients with small (≤ 3 cm) cortical tumors [8]. Accordingly, the noninvasive diagnosis of small (< 3 cm) fat-poor AML before surgery might facilitate subsequent decision-making for active surveillance or needle biopsy prior to interventional therapy.

Compared with those of RCC, precontrast CT images of fat-poor AML typically exhibit greater attenuation values, whereas postcontrast CT images show prolonged and homogeneous enhancement [9, 10]. Additionally, the morphological characteristics observed in cross-sectional images serve as important indicators for the diagnosis of fat-poor AML [11, 12]. RCCs usually display a round shape, whereas fat-poor AMLs appear more irregular [13, 14, 15]. Previous studies have identified several fat-poor AML-associated shape factors, such as the angular interface sign (AIS) and the overflowing beer sign (OBS) [12, 16, 17]. However, the use of these parameters is limited by their qualitative nature and by inconsistencies in assessment criteria across studies [11, 17]. The circularity index proposed by Kang et al. [18] enables the quantitative evaluation of tumor shape and demonstrates superior diagnostic efficacy compared with qualitative radiologic signs in distinguishing small (< 4 cm) fat-poor AML from RCC. However, measuring the circularity index requires dedicated computer-assisted analysis because it involves precise contour delineation that is not routinely supported by standard PACS workstations, which limits its utility in routine clinical diagnosis. Furthermore, although a prior study assessed the circularity index for differentiating fat-poor AMLs from RCCs in tumors < 4 cm, its value for smaller (< 3 cm) lesions remains unclear.

Circular components are a necessary part of most industrial products, especially in the manufacturing industry [19, 20]. The minimum circumscribed circle (MCC) and maximum inscribed circle (MIC) are two common reference circles that can be used to assess how close an engineering component is to a perfect circle (roundness or circularity) [19, 20]. The closer the diameter difference between the MCC and MIC is to zero, the closer the shape of the component is to a perfect circle. Therefore, in the present study, it was hypothesized that the differences between the MCCs and MICs of tumors in cross-sectional CT images could be used to evaluate tumor growth morphology and reflect the tumor’s circularity, thereby distinguishing fat-poor AML from RCC.

This study aimed to evaluate the previously proposed morphological parameters and develop a simple, clinically applicable quantitative morphological factor to better distinguish small (< 3 cm) fat-poor AML from RCC.

Methods

Patient cohort

Because this was a retrospective study, the institutional review boards of both centers approved it and waived the requirement for informed consent. The datasets used in this study included consecutive patients with AML or RCC (clear cell, chromophobe, or papillary) who underwent CT scanning from June 2018 to December 2022 at Centers A and B. The inclusion criteria were as follows: (1) a single sporadic tumor; (2) complete CT images, including precontrast, corticomedullary, and nephrographic phases, acquired within one month before treatment; and (3) pathologically confirmed RCC or AML after partial or radical nephrectomy. The exclusion criteria were as follows: (1) maximum tumor diameter ≥ 3 cm; (2) slice thickness > 3 mm in the axial or coronal reconstructed CT images; (3) visible fat attenuation within the tumor in precontrast CT images; or (4) inadequate image quality. The initial cohort comprised patients from Center A, and the validation cohort comprised patients from Center B (Fig. 1).

Fig. 1 Flowchart of the study enrollment. AML, angiomyolipoma; RCC, renal cell carcinoma

Data collection

The demographic characteristics of the patients, such as sex and age, were recorded. CT examinations were performed on various CT scanners with 64–256 channels. All the patients underwent preoperative CT scans that included precontrast, corticomedullary, and nephrographic phases. Owing to the heterogeneity of the CT equipment, and because only tumor morphological features were evaluated, the present study applied only the image-quality inclusion and exclusion criteria. Pathological data were extracted retrospectively from the original pathology reports by a genitourinary pathologist; histological slides were not re-examined.

Analysis of CT features

All the morphological parameters except the circularity index were evaluated via a diagnostic workstation. A radiologist with 11 years of experience in abdominal imaging (Radiologist A), blinded to patients’ data, subjectively selected the optimal contrast phase (either corticomedullary or nephrographic) that best demonstrated tumor-normal kidney tissue demarcation for CT feature analysis in the initial cohort. This radiologist (Radiologist A) evaluated all cases in the initial cohort for statistical analysis.

Qualitative analysis

The qualitative parameters (AIS and OBS) were assessed in the axial and coronal images. On the basis of the definitions of the AIS and OBS in previous studies [12, 17], a 4-point AIS score and a 3-point OBS score were developed. On the basis of the tumor’s growth morphology within the renal parenchyma, an AIS score ranging from 1 to 4 was defined as follows: 4, pyramidal interface with an angle ≤ 90°; 3, pyramidal interface with an angle > 90°; 2, the tumor within the renal parenchyma was compressed without a definable apex; and 1, uniform rounded interface (Fig. 2A-D). On the basis of the length of contact between the bulging-out portion of the tumor and the surface of the kidney, an OBS score ranging from 1 to 3 was defined as follows: 3, contact length ≥ 3 mm; 2, contact length < 3 mm; and 1, no contact (Fig. 2D-F). Tumors with 100% exophytic or endophytic growth were assessed as 1 point in both AIS and OBS scores.
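The scoring rules above can be written out as a small sketch. The function and parameter names (`ais_score`, `obs_score`, `has_apex`, `compressed`, `fully_exo_or_endophytic`) are hypothetical and introduced only for illustration; the apex angle, the presence of compression, and the contact length remain the reader's visual assessments, which the code merely maps to points.

```python
from typing import Optional


def ais_score(has_apex: bool,
              apex_angle_deg: Optional[float] = None,
              compressed: bool = False,
              fully_exo_or_endophytic: bool = False) -> int:
    """Map the reader's findings to the 4-point AIS score described in the text."""
    if fully_exo_or_endophytic:
        return 1  # 100% exophytic or endophytic tumors are scored 1
    if has_apex:
        if apex_angle_deg is None:
            raise ValueError("apex_angle_deg is required when has_apex is True")
        # Pyramidal interface: 4 if the angle is <= 90 degrees, otherwise 3
        return 4 if apex_angle_deg <= 90 else 3
    # No definable apex: 2 if the intraparenchymal part is compressed, else 1
    return 2 if compressed else 1


def obs_score(contact_length_mm: Optional[float],
              fully_exo_or_endophytic: bool = False) -> int:
    """Map the contact length (mm) to the 3-point OBS score; None means no contact."""
    if fully_exo_or_endophytic:
        return 1
    if contact_length_mm is None or contact_length_mm <= 0:
        return 1  # no contact
    return 3 if contact_length_mm >= 3 else 2
```

Keeping the two scores as separate functions mirrors the text, where each sign is assessed independently before any combination.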

Fig. 2 Schematic diagram of qualitative parameters. a. AIS score of 4 (angle). b. AIS score of 3 (angle). c. AIS score of 2 (arrow). d. AIS and OBS scores of 1 (arrows). e. OBS score of 3 (arrow). f. OBS score of 2 (arrow). AIS, angular interface sign; OBS, overflowing beer sign

Quantitative analysis

The quantitative parameters, including the tumor diameter, circularity index, and internal-to-external circle area ratio (IECR), were measured in the axial images. The tumor diameter was obtained by measuring the maximum diameter of the tumor. The MIC of the tumor was defined as the largest circle lying entirely within the tumor without crossing its boundary, whereas the MCC of the tumor was defined as the smallest circle that completely enclosed the tumor without crossing its boundary [19, 20]. At the image level with the maximum tumor diameter, the radiologist placed the largest circular region of interest (ROI) within the tumor and the smallest circular ROI enclosing the tumor as the MIC and MCC, respectively, and recorded the areas of the two circles (Fig. 3). Each parameter was measured twice, and the average was taken. To ensure that each ROI was a perfect circle, the ROI shape was first adjusted visually, and the ROI diameter was then measured several times with the two-point method. An ROI was accepted as a perfect circle when these diameter measurements differed by no more than 10%. The IECR was calculated via Eq. (1).

Fig. 3 Measurement method for the IECR. Figures a and b present schematic diagrams of the MIC and MCC, respectively. IECR, internal-to-external circle area ratio; MIC, maximum inscribed circle; MCC, minimum circumscribed circle

$$\mathrm{IECR} = \left( \frac{\mathrm{MIC\ area}}{\mathrm{MCC\ area}} \right) \times 100\% $$
(1)
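As a minimal sketch of Eq. (1), the snippet below computes the IECR from the two recorded ROI areas and adds a helper for the 10% roundness check on repeated diameter measurements. The function names are hypothetical, and the choice of the largest diameter as the denominator of the 10% check is an assumption, because the text does not state how the difference was normalized.

```python
from typing import Sequence


def iecr_percent(mic_area: float, mcc_area: float) -> float:
    """IECR per Eq. (1): (MIC area / MCC area) x 100%."""
    if mic_area <= 0 or mcc_area <= 0:
        raise ValueError("areas must be positive")
    if mic_area > mcc_area:
        raise ValueError("the inscribed circle cannot be larger than the circumscribed circle")
    return mic_area / mcc_area * 100.0


def is_acceptably_round(diameters: Sequence[float], tolerance: float = 0.10) -> bool:
    """Check that repeated two-point diameters of one ROI differ by no more than 10%.

    Assumption: the difference is expressed relative to the largest diameter.
    """
    largest, smallest = max(diameters), min(diameters)
    return (largest - smallest) / largest <= tolerance


# Areas reported for the example patient in Fig. 6 (MIC 1.47 cm2, MCC 3.53 cm2)
print(round(iecr_percent(1.47, 3.53), 2))  # 41.64, matching the IECR reported in Fig. 6
```

Working from areas rather than diameters follows the text, which notes that the ROI area is the quantity most readily displayed by the workstation.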

Additionally, the selected axial CT images were transferred to ImageJ 1.51 software to measure the circularity index of the tumor. As described by Kang et al. [18], circularity was measured in the whole plane of the tumor, and the circularity index value of the tumor was obtained via Eq. (2).

$$\mathrm{Circularity\ index} = \mathrm{Medial\ circularity} \times 100\% $$
(2)
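For orientation, the sketch below evaluates the standard circularity shape descriptor, 4π × area / perimeter², on a polygonal contour. This is the formula ImageJ uses for its "circularity" measurement; whether it corresponds exactly to the value entering Eq. (2) is an assumption, since the text does not spell out the computation. The contour representation (a list of vertex coordinates) and the function names are likewise hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area_and_perimeter(points: List[Point]) -> Tuple[float, float]:
    """Area (shoelace formula) and perimeter of a closed polygon."""
    area2 = 0.0
    perimeter = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]  # wrap around to close the contour
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter


def circularity_percent(points: List[Point]) -> float:
    """4 * pi * area / perimeter^2, expressed as a percentage (100% for a circle)."""
    area, perimeter = polygon_area_and_perimeter(points)
    return 4.0 * math.pi * area / perimeter ** 2 * 100.0


# A unit square gives pi/4, i.e. about 78.54%
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(round(circularity_percent(square), 2))  # 78.54
```

Unlike the IECR, this descriptor needs the full delineated contour, which is why it depends on dedicated software rather than two circular ROIs.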

To determine interreader agreement, another radiologist with 7 years of experience in abdominal imaging (Radiologist B) independently re-evaluated a randomly selected subset of 50 cases in the initial cohort using identical evaluation criteria. In addition, to ensure independent validation, a third radiologist with 10 years of experience in abdominal imaging (Radiologist C) evaluated all cases in the validation cohort to prevent potential bias from prior exposure to the initial cohort’s data and confirm reproducibility across observers and institutions. Prior to the CT feature analysis, all radiologists were provided with a slide presentation to explain the above parameters.

Statistical analyses

SPSS for Windows software (ver. 26.0; IBM, Inc.) was used for all the statistical analyses. The normality of the data was assessed via the Kolmogorov‒Smirnov test. The normally distributed, nonnormally distributed, and categorical variables were compared via t tests, Mann‒Whitney U tests, and chi‒square tests or Fisher’s exact tests, respectively. A P value of < 0.05 was considered to indicate statistical significance. The diagnostic performance of the statistically significant parameters was assessed via the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The optimal cut-off values were derived via the Youden method. The AUC values of different parameters were compared via the DeLong test. The κ value and intraclass correlation coefficient (ICC) were used to determine the interreader agreement for the qualitative and quantitative parameters, respectively. Multivariate logistic regression analyses were performed using the parameters with P < 0.05. First, the demographic and morphological parameters with P < 0.05 other than the circularity index and IECR (sex, AIS score, OBS score, any sign for AML, and tumor diameter) were screened, and a prediction model omitting the circularity index and IECR (Model 1) was developed. Then, the circularity index or the IECR was added to the candidate parameters, the key factors were screened again, and a prediction model incorporating the circularity index (Model 2) and a prediction model incorporating the IECR (Model 3) were developed. Each model’s predictive ability was assessed through its accuracy, sensitivity, specificity, PPV, NPV, and AUC.
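The performance measures named above can be illustrated with a short, dependency-free sketch. It is not the SPSS procedure used in the study: the AUC is computed through its Mann‒Whitney interpretation, candidate cut-offs are assumed to be midpoints between adjacent observed values, and the function names and the toy scores are hypothetical. Higher scores are assumed to indicate the positive class (fat-poor AML); for a parameter such as the IECR, where lower values suggest AML, the values would be negated first.

```python
from typing import Dict, Sequence


def auc_mann_whitney(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the probability that a positive case scores higher than a negative one."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(pos: Sequence[float], neg: Sequence[float],
                       cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV, and accuracy when score > cutoff is positive."""
    tp = sum(s > cutoff for s in pos)
    fn = len(pos) - tp
    fp = sum(s > cutoff for s in neg)
    tn = len(neg) - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
        "accuracy": (tp + tn) / (len(pos) + len(neg)),
    }


def youden_cutoff(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Cut-off maximizing J = sensitivity + specificity - 1.

    Requires at least two distinct observed values.
    """
    values = sorted(set(pos) | set(neg))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def youden_j(cutoff: float) -> float:
        m = diagnostic_metrics(pos, neg, cutoff)
        return m["sensitivity"] + m["specificity"] - 1.0

    return max(candidates, key=youden_j)


# Hypothetical 4-point scores, for illustration only (not study data)
aml_scores = [4, 4, 3, 3, 1]
rcc_scores = [1, 1, 2, 2, 1, 3]
best = youden_cutoff(aml_scores, rcc_scores)
print(best, diagnostic_metrics(aml_scores, rcc_scores, best))
```

Using midpoints as candidates explains why cut-offs for integer scores take half-integer values, as with the thresholds reported for the AIS and OBS scores.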

Results

Demographic characteristics

The initial cohort included 212 tumors from 212 patients, including 177 RCCs (146 clear cell RCCs, 14 papillary RCCs, and 17 chromophobe RCCs) and 35 fat-poor AMLs. There were 130 males (61.32%) and 82 females (38.68%), and the median (interquartile range) age was 59 (51–67) years. Fat-poor AMLs were more common in females than were RCCs (P < 0.001) (Table 1).

Table 1 Comparative analysis of the demographic and morphological characteristics between fat-poor AML and RCC

Qualitative analysis

The results revealed significant differences in both the AIS score (P = 0.003) and the OBS score (P < 0.001) between RCCs and fat-poor AMLs (Table 1). The diagnostic performance of the AIS score and OBS score at different thresholds is shown in Table 2. To distinguish fat-poor AML from RCC, the optimal cut-off values of the AIS score and OBS score were 2.5 and 1.5, respectively (Table 3). On the basis of these results, an AIS score of ≥ 3 and/or an OBS score of ≥ 2 was defined as the presence of “any sign for AML”, which achieved an AUC value of 0.675 for distinguishing fat-poor AML from RCC (Table 3). The interreader agreement for the AIS score, OBS score, and any sign for AML was substantial (κ values of 0.68, 0.71, and 0.74, respectively).
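The composite sign follows directly from the two cut-offs (2.5 for the AIS score and 1.5 for the OBS score, i.e., AIS ≥ 3 and/or OBS ≥ 2). A minimal sketch, with a hypothetical function name and basic range checks added for illustration:

```python
def any_sign_for_aml(ais: int, obs: int) -> bool:
    """True when AIS score >= 3 and/or OBS score >= 2."""
    if ais not in (1, 2, 3, 4):
        raise ValueError("AIS score must be between 1 and 4")
    if obs not in (1, 2, 3):
        raise ValueError("OBS score must be between 1 and 3")
    return ais >= 3 or obs >= 2


# The patient shown in Fig. 6 had AIS and OBS scores of 1, so the sign is absent
print(any_sign_for_aml(1, 1))  # False
```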

Table 2 Diagnostic performance of the AIS score and OBS score at different thresholds for fat-poor AML

Quantitative analysis

There were significant differences in tumor diameter (P = 0.008), circularity index (P < 0.001), and IECR (P < 0.001) between RCCs and fat-poor AMLs (Table 1; Fig. 4). Among all the qualitative and quantitative parameters, the IECR achieved the highest AUC value (0.899), followed by the circularity index (0.824) (Table 3). The AUC value of the IECR was significantly greater than those of sex (Z = 2.245, P = 0.025), AIS score (Z = 4.280, P < 0.001), OBS score (Z = 5.413, P < 0.001), any sign for AML (Z = 3.893, P < 0.001), tumor diameter (Z = 8.582, P < 0.001), and circularity index (Z = 2.128, P = 0.033). The interreader agreement for the tumor diameter, circularity index, and IECR, expressed as the ICC, was 0.95, 0.85, and 0.78, respectively.
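The relation between a reported Z statistic and its P value can be checked with a few lines. The sketch below does not implement the DeLong test itself (which requires the case-level data); it only converts a Z value to a P value, assuming a two-sided test against the standard normal distribution. The function name is hypothetical.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided P value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Z values reported for the IECR versus sex and versus the circularity index
print(round(two_sided_p(2.245), 3))  # 0.025
print(round(two_sided_p(2.128), 3))  # 0.033
```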

Fig. 4 Box-and-whisker plot of the relationship between the IECR and tumor pathology in the initial cohort. IECR, internal-to-external circle area ratio; AML, angiomyolipoma; RCC, renal cell carcinoma

Table 3 Diagnostic performance of statistically significant demographic and morphological characteristics for fat-poor AML

We stratified tumors into five size categories (< 1.3 cm, 1.3–1.5 cm, 1.5–2.0 cm, 2.0–2.5 cm, and 2.5–3.0 cm) to evaluate how diagnostic performance of the IECR varies with tumor size. In the initial cohort, IECR’s diagnostic performance was markedly reduced for tumors < 1.3 cm (AUC, 0.400), whereas it recovered to acceptable levels (AUC, 0.850) in the 1.3–1.5 cm subgroup (Table 4).

Table 4 Diagnostic performance of the IECR stratified by tumor size

Multivariate analysis and prediction models

The results of the multivariate logistic regression analysis are presented in Tables 5 and 6. Table 7 summarizes the diagnostic performance of the three prediction models that distinguish fat-poor AML from RCC. Both Model 2 (Z = 1.994, P = 0.046) and Model 3 (Z = 2.076, P = 0.038) significantly outperformed Model 1. The AUC value of Model 3 was greater than that of Model 2, but the difference was not statistically significant (Z = 1.298, P = 0.194) (Fig. 5A).

Fig. 5 Receiver operating characteristic curves of the prediction models. (a) Initial cohort. (b) Validation cohort. Model−1, Model−2, and Model−3 denote the prediction model omitting the circularity index and IECR, the prediction model incorporating the circularity index, and the prediction model incorporating the IECR, respectively

Validation cohort

The validation cohort included 118 tumors from 118 patients, including 99 RCCs (91 clear cell RCCs, 3 papillary RCCs, and 5 chromophobe RCCs) and 19 fat-poor AMLs. There were 78 males (66.10%) and 40 females (33.90%), and the mean age ± standard deviation was 54.91 ± 12.20 years. Demographic comparisons revealed a balanced sex distribution (P = 0.389) but significantly different ages between the initial and validation cohorts (59 [51.25–67] vs. 54 [46.75–64], P = 0.016). However, Spearman correlations between age and the models’ prediction scores were negligible (r < 0.135, all P > 0.05), suggesting that age did not systematically influence the models’ discriminative ability.

Table 5 Multivariate logistic regression analyses for the statistically significant demographic and morphological parameters

The diagnostic performance in the validation cohort, obtained with the optimal thresholds for sex and morphology derived from the initial cohort, is presented in Table 3. In the validation cohort, the AUC value of the IECR was again significantly greater than those of sex (Z = 3.045, P = 0.002), AIS score (Z = 3.741, P < 0.001), OBS score (Z = 4.337, P < 0.001), any sign for AML (Z = 4.480, P < 0.001), tumor diameter (Z = 2.384, P = 0.017), and circularity index (Z = 2.809, P = 0.005). To distinguish fat-poor AML from RCC, the three prediction models exhibited similar diagnostic performance in the validation cohort as in the initial cohort (Table 7; Fig. 5B). Model 3 still demonstrated the best diagnostic performance, significantly surpassing Model 1 (AUC, 0.933 vs. 0.873; Z = 1.998, P = 0.047) and numerically, although not significantly, outperforming Model 2 (AUC, 0.933 vs. 0.891; Z = 1.732, P = 0.073) (Fig. 6).

Table 6 Regression coefficients of the prediction models

Table 7 Diagnostic performance of the prediction models for the differentiation of fat-poor AML and RCC

Fig. 6 A patient with fat-poor AML. Main morphological parameters: AIS score, 1; OBS score, 1; any sign for AML, absence; tumor diameter, 2.1 cm (a); circularity index, 90.01% (b); MIC area, 1.47 cm² (c); MCC area, 3.53 cm² (c); IECR, 41.64%. The probability of diagnosing fat-poor AML via Model 1 (prediction model omitting circularity index and IECR) was 22.32%, and this case was incorrectly diagnosed as RCC. However, the probabilities of diagnosing fat-poor AML via Model 2 (prediction model incorporating circularity index) and Model 3 (prediction model incorporating IECR) were 60.65% and 74.89%, respectively. AML, angiomyolipoma; RCC, renal cell carcinoma; AIS, angular interface sign; OBS, overflowing beer sign; MIC, maximum inscribed circle; MCC, minimum circumscribed circle; IECR, internal-to-external circle area ratio

Discussion

This study demonstrated that morphological parameters could help distinguish small (< 3 cm) fat-poor AML from RCC. Among all morphological parameters, the IECR showed the best diagnostic efficiency. To distinguish fat-poor AML from RCC, the prediction model that used the IECR outperformed the models without the IECR.

AIS is a widely recognized shape feature for distinguishing fat-poor AML from RCC; however, different studies have proposed different definitions for AIS [11, 12, 17, 21, 22]. For example, Lim et al. [11] and Verma et al. [12] defined AIS as “a tapering pyramidal interface with a definable apex within the renal parenchyma”, whereas Kim et al. [17] and Zhou et al. [21] limited the angle of the mass within the renal parenchyma to 90° or less. However, to date, there is no evidence to show which definition has better diagnostic performance. Therefore, this study developed a 4-point AIS score to evaluate the diagnostic performance of different AIS definitions, which might provide more comprehensive diagnostic information. For example, a threshold of AIS score ≥ 3 had greater sensitivity (42.86% vs. 25.71%) and lower specificity (84.18% vs. 92.09%) than a threshold of AIS score of 4. Moreover, the optimal threshold of 2.5 suggested that AIS should be defined without limiting the angle of the mass within the renal parenchyma. OBS has been defined as “the contact length of ≥ 3 mm between the tumor’s bulging-out portion and the kidney surface” [11, 17]. We believe that this definition, which rests on a single quantitative cut-off, might be too simple and might limit its wider application. Therefore, a 3-point OBS score was developed. Compared with a threshold of OBS score of 3, a threshold of OBS score ≥ 2 increased the sensitivity (31.43% vs. 20.00%) but decreased the specificity (91.53% vs. 97.18%) for distinguishing fat-poor AML from RCC. Moreover, the optimal threshold of 1.5 for the OBS score suggested that a contact length of < 3 mm was also a sign of fat-poor AML. In addition, for distinguishing fat-poor AML from RCC, any sign for AML, formed by combining an AIS score of ≥ 3 and/or an OBS score of ≥ 2, increased the sensitivity to 54.29%, which was similar to the findings of previous studies [23].

The circularity index also showed significant value for distinguishing smaller (< 3 cm) fat-poor AML from RCC. However, the AUC value in the present study was lower than that reported by Kang et al. [18] (0.824 vs. 0.924). We hypothesize that this difference might be due to the restriction of lesion size to < 3 cm and the use of thinner CT sections. Furthermore, the prediction model incorporating the circularity index (Model 2) was statistically better than that without the circularity index (Model 1) (AUC, 0.921 vs. 0.873; Z = 1.994, P = 0.046). These data demonstrated the value of the circularity index in quantifying tumor morphology. However, the inability to obtain the circularity index on a diagnostic workstation limits its wider application.

Since the ROI area is easier to obtain than its diameter on the diagnostic workstation, the area ratio of the MIC to the MCC, rather than the diameter difference, was used to evaluate the circularity of the tumor. In theory, the more irregular the tumor, the smaller the IECR, and the rounder the tumor, the larger the IECR; the IECR of a perfectly round tumor is 1 (100%) [19, 20]. The MCC reflects the maximum tumor diameter, whereas the MIC captures the morphological irregularity of the tumor. The present study showed that the IECR better reflected tumor morphology and had the best diagnostic performance among the morphological parameters. Importantly, our analysis revealed substantial variation in IECR performance according to tumor size. Although tumors < 1.5 cm showed moderate discrimination (AUC, 0.629), stratified analysis identified 1.3 cm as a critical threshold: the AUC improved to 0.850–0.970 for tumors ≥ 1.3 cm but dropped to 0.400 for lesions < 1.3 cm. These findings suggest that 1.3 cm may represent the lower limit for reliable IECR application. However, interpretation requires caution because of (1) the limited number of tumors < 1.3 cm (n = 8) and (2) the absence of tumors < 1.3 cm in the validation cohort. Future studies should validate this threshold in larger cohorts. In addition, measuring the IECR in a single plane might introduce selection bias. Nevertheless, we chose to measure the IECR in a single plane for the following reasons. First, the morphology of the tumor in the maximum diameter plane is often the most representative of the overall tumor shape. Second, tumor shape generally does not change dramatically between the largest diameter plane and adjacent planes, which may also reduce interreader differences. Third, single-plane measurements are more convenient to perform in clinical practice than whole-tumor measurements. Finally, the interreader agreement for the IECR was excellent (ICC, 0.78).
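The link between the area ratio and the diameters of the two reference circles can be made explicit with a short worked example. The shapes below are idealized geometric figures chosen for illustration, not measured tumors. Because the MIC and MCC are both circles, the IECR depends only on the ratio of their diameters and reaches 100% exactly when the diameter difference is zero:

$$\mathrm{IECR} = \frac{\pi \left( d_{\mathrm{MIC}}/2 \right)^{2}}{\pi \left( d_{\mathrm{MCC}}/2 \right)^{2}} \times 100\% = \left( \frac{d_{\mathrm{MIC}}}{d_{\mathrm{MCC}}} \right)^{2} \times 100\% $$

For an ellipse with semi-axes $a \ge b$, the MIC has radius $b$ and the MCC has radius $a$, so

$$\mathrm{IECR}_{\mathrm{ellipse}} = \left( \frac{b}{a} \right)^{2} \times 100\% , \qquad \text{e.g., } a = 1.0\ \mathrm{cm},\ b = 0.8\ \mathrm{cm} \Rightarrow \mathrm{IECR} = 64\% $$

For a square of side $s$, the MIC has radius $s/2$ and the MCC has radius $s/\sqrt{2}$, so

$$\mathrm{IECR}_{\mathrm{square}} = \frac{\left( s/2 \right)^{2}}{\left( s/\sqrt{2} \right)^{2}} \times 100\% = 50\% $$

Because the ratio is squared, a modest mismatch between the two diameters produces a comparatively large drop in the IECR.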

Among the three prediction models, Model 3 achieved the best diagnostic performance (AUC, 0.953). In Model 3, the IECR and sex were identified as the key factors for distinguishing fat-poor AML from RCC, whereas the other morphological parameters were excluded, suggesting that the information they carry might already be captured by the IECR. Because the IECR is easier to obtain than the circularity index, Model 3 is more applicable to routine diagnosis than Model 2. Moreover, Model 3 used fewer variables than Model 1 and Model 2, which might also improve reproducibility. In addition, the three prediction models showed similar diagnostic performance in the validation cohort as in the initial cohort, with Model 3 still achieving the best diagnostic performance (AUC, 0.933).

Recent MRI studies have demonstrated high accuracy in differentiating fat-poor AMLs from RCCs using various quantitative and qualitative parameters [24, 25, 26, 27]. For example, Schieda et al. [24] reported AUC values of 0.66–0.97 using T2-weighted signal-intensity ratios, chemical-shift signal-intensity index, and contrast-enhanced curve analysis. Chen et al. [25] achieved an AUC of 0.913 by combining pseudocapsule formation, wedge-shaped sign, and apparent diffusion coefficient to identify fat-poor AMLs among T2-hypointense clear cell RCCs. Compared with these MRI-based approaches [24, 25, 26, 27], our CT-derived morphological factor offers two distinct clinical advantages: (1) it achieves comparable diagnostic performance (Model 3 AUC: 0.953 in the initial cohort and 0.933 in the validation cohort) using only patient sex and tumor morphology; and (2) the identified morphological characteristic (IECR), although validated on CT, could theoretically be assessed using MRI or ultrasound, suggesting potential cross-modality applicability.

This study had several limitations. First, this was a retrospective study; thus, it was susceptible to selection bias. Second, although this was a two-center study, the sample size of fat-poor AML patients was relatively small because the lesion size was limited to less than 3 cm. However, our cohort of 54 fat-poor AMLs surpasses the sample sizes reported in most comparable studies [3, 16, 18], with only a limited number of studies including > 50 cases [22]. Third, the number of fat-poor AMLs was relatively small compared with the number of RCCs in our cohort. However, this observed ratio of approximately 1:5 (54 vs. 276) reflects the clinical prevalence of small (< 3 cm) tumors. Our study design intentionally prioritized real-world representativeness over artificial sample balancing, so that the findings remain clinically translatable despite this numerical disparity. Fourth, although our strict 3 cm cut-off ensures high clinical relevance for small renal masses (as it reflects guideline-recommended treatment thresholds [7, 8] and the well-documented aggressiveness transition beyond this size [4, 5, 6]), it necessarily limits extrapolation to tumors exceeding this threshold. Further validation is required to determine whether these parameters retain predictive value for larger tumors (e.g., 3–4 cm). Finally, owing to space limitations, this study investigated only morphological characteristics and did not include other CT parameters, such as enhancement features.

Conclusions

In conclusion, the IECR is a simple and practical quantitative morphological factor for distinguishing fat-poor AML from RCC. Adding IECR can improve the diagnostic performance of prediction models based on morphological characteristics.