Use of ultrasound imaging omics in predicting molecular typing and assessing the risk of postoperative recurrence in breast cancer

Abstract

Background

The aim of this study is to assess the efficacy of a multiparametric ultrasound imaging omics model in predicting the risk of postoperative recurrence and molecular typing of breast cancer.

Methods

A retrospective analysis was conducted on 534 female patients diagnosed with breast cancer through preoperative ultrasonography and pathology from January 2018 to June 2023 at the Affiliated Cancer Hospital of Xinjiang Medical University. Univariate analysis and multifactorial logistic regression modeling were used to identify independent risk factors among the clinical characteristics. Regions of interest were delineated on selected ultrasound images, and radiomic features were extracted with the PyRadiomics package. Radiomic scores were then established through Least Absolute Shrinkage and Selection Operator (LASSO) regression and Support Vector Machine (SVM) methods. The predictive performance of the model was assessed using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). Calibration and clinical practicability were evaluated with calibration curves and decision curves.

Results

The AUC of the postoperative recurrence risk prediction model was 0.9489 in the training set and 0.8491 in the validation set. For the molecular typing prediction model, the AUC values in the training set and validation set were, respectively, 0.93 and 0.92 for the HER-2 overexpression phenotype, 0.94 and 0.74 for the TNBC phenotype, 1.00 and 0.97 for the luminal A phenotype, and 1.00 and 0.89 for the luminal B phenotype. Calibration and decision curve analysis indicated that the model exhibits strong predictive performance and clinical practicability.

Conclusion

The use of multiparametric ultrasound imaging omics proves to be of significant value in predicting both the risk of postoperative recurrence and molecular typing in breast cancer. This non-invasive approach offers crucial guidance for the diagnosis and treatment of the condition.

Background

In recent years, the incidence of breast cancer (BC) has risen steadily, and it has surpassed lung cancer as the most common malignant tumor affecting women globally [1]. It is also the leading cause of cancer-related mortality among women worldwide. Statistical data indicate that the prevalence of breast cancer among young women in China exceeds that of other nations, presenting a substantial threat to the physical and mental well-being of Chinese women [2]. A molecular typing-based classification system was introduced at the St. Gallen conference in 2013, categorizing breast cancer into four subtypes: luminal A, luminal B, Human Epidermal Growth Factor Receptor 2 (HER-2) overexpression, and Triple Negative Breast Cancer (TNBC). Currently, the primary modalities used in breast cancer treatment are surgery, targeted therapy, endocrine therapy, chemotherapy, and radiotherapy [3, 4]. For patients who are positive for estrogen receptor (ER) or progesterone receptor (PR), supplementary endocrine therapy is recommended to manage tumor progression and improve prognosis [5]. Moreover, patients with HER-2 overexpression may undergo additional targeted therapy [6]. TNBC, characterized by the lack of ER, PR, and HER-2 expression [7], responds poorly to endocrine and targeted therapies, so standardized chemotherapy is the primary therapeutic approach alongside surgical intervention [8, 9].

Improving the prognosis of patients diagnosed with breast cancer depends on early diagnosis and timely intervention. Molecular typing of breast cancer and the assessment of postoperative recurrence risk are crucial factors that enable clinicians to formulate personalized treatment strategies and evaluate patient prognoses [10,11,12,13]. Guidelines established by the Chinese Society of Clinical Oncology (CSCO) offer appropriate regimens based on factors such as the number of lymph node metastases, molecular typing, histological grading, and tumor size. Treatment regimens incorporating anthracyclines, paclitaxel, cyclophosphamide, and platinum are further supplemented with targeted or endocrine therapies according to the assessed risk of recurrence, thereby providing patients with individualized and precise treatment plans. The China Anti-Cancer Association (CACA) guidelines categorize postoperative recurrence risk as high, intermediate, or low; this study focuses on intermediate- and high-risk patient groups, given the scarcity of low-risk cases in clinical practice. Current preoperative diagnostic techniques for breast cancer, which are crucial to improving patient prognosis and quality of life, are predominantly mammography, ultrasound, and magnetic resonance imaging (MRI) [10, 14]. However, the high proportion of dense mammary gland tissue among Chinese women with breast cancer contributes to a notably high false-positive rate in X-ray-based screening, ranging from 65 to 90% [15, 16]. MRI is accurate but cost-prohibitive and time-consuming. Ultrasonography, a painless, non-invasive, cost-effective, and rapid method, surpasses mammography and MRI in detection rate, accuracy, and cost-benefit ratio among Chinese women and has become the primary screening modality for breast diseases [17].

Currently, breast cancer molecular typing and histopathology results are typically derived from immunohistochemistry of preoperative core needle biopsy specimens or postoperative surgical specimens. However, clinical observations reveal differences between the immunohistochemistry of core needle biopsies and that of surgical specimens, potentially leading to increased risks of recurrence, metastasis, and mortality [18]. This discrepancy may stem from variations in immunohistochemistry results across different locations within the same cancer focus. Studies indicate that receptor status may change after neoadjuvant treatment, with inconsistencies of approximately 3–5% in hormone receptor (HR) status and 10% in HER-2 status in breast cancers treated with current neoadjuvant regimens [19]. Studies also emphasize the prognostic implications of post-treatment changes in immunohistochemistry and recommend retesting biomarkers following neoadjuvant treatment or upon the development of drug resistance [20, 21]. Such reevaluation aims to tailor treatment regimens, mitigate the risk of postoperative recurrence, and improve patient prognosis. At present, immunohistochemistry relies on clinical specimens; its results vary with site selection and sectioning level, which can make them somewhat inaccurate, and the wait time is considerable. Rapid and accurate prediction of molecular typing and postoperative recurrence risk during the disease course could prompt clinicians to update immunohistochemistry results when necessary, potentially extending patient survival and enhancing overall quality of life.

With technological advancements, there is a growing inclination toward multimodal imaging. Multimodal radiomics extracts large numbers of image features from existing medical images in a high-throughput manner. Automated data characterization algorithms are then applied to transform the image data from the region of interest (ROI) into high-dimensional feature data, which can be mined to construct clinical prediction models and to provide more comprehensive, complementary information for the diagnosis and treatment of diseases [22, 23]. The role of multimodal radiomics in the auxiliary diagnosis and treatment of diseases has been widely studied across CT, MRI, and ultrasound images. Clinical prediction models based on multimodal radiomics have shown great potential in the diagnosis of diseases [24]. Different radiomics techniques suit different diseases. For example, CT radiomics and deep learning-based models perform well in staging lymph node metastasis in pancreatic cancer [25]. For neurological diseases, MRI radiomics and deep learning models have greater potential; in one study, their combined accuracy in distinguishing between neuromyelitis optica spectrum disorders and multiple sclerosis was 82% [26]. Deep learning-based ultrasound radiomics is more suitable for the differentiation and diagnosis of breast tumors. Notably, one study demonstrated that a deep learning model for breast cancer diagnosis achieved a classification accuracy of 97.18% in distinguishing malignant, benign, and normal ultrasound images [27]. Another study highlighted the efficacy of multiparametric ultrasound imaging omics in predicting molecular subtypes of breast cancer, with an area under the curve (AUC) of 0.970 for distinguishing triple-negative from non-triple-negative breast cancers [28].

Research data have shown that a deep learning model diagnosed malignant tumours in BI-RADS 4a patients with an accuracy of 92.86%, which theoretically reduces unnecessary biopsies by 67.86% [29], improving diagnostic performance while significantly reducing invasive procedures for patients. In another study, which used a deep learning model of ultrasound images to discriminate breast fibroadenomas from lobular breast tumors, the AUC reached 0.91 [30]. Therefore, the combined application of multimodal ultrasound technology has broad prospects for the diagnosis and prognosis of breast cancer.

However, there is still no in-depth research on multimodal ultrasound technology for determining the risk of postoperative recurrence of breast cancer together with the four molecular subtypes. The present study therefore extracts the characteristics of ultrasound images from patients with different molecular subtypes and different risks of postoperative recurrence and uses them to build a model that predicts both outcomes in patients with breast cancer. The aim is to provide effective, non-invasive guidance for the diagnosis and treatment of breast cancer.

Materials and methods

Study participants

We conducted a retrospective study of 534 female patients whose breast cancer was confirmed by surgical pathology at the Affiliated Cancer Hospital of Xinjiang Medical University between January 2018 and June 2023. The inclusion criteria were: (1) surgical pathological diagnosis in our hospital; (2) breast and axillary ultrasound examination performed in our hospital 15 days before surgery with clear and recognizable lesions; (3) complete clinical, pathological, and ultrasound data; (4) absence of preoperative endocrine, radiotherapy, or chemotherapy treatment; (5) no history of breast cancer in the patients and their relatives; (6) signed informed consent. The exclusion criteria were: (1) male breast cancer, because of the small number of cases and the difference in hormone levels compared with females; (2) preoperative neoadjuvant therapy, which changes receptor expression and ultrasound image characteristics; (3) a history of breast cancer or other malignancies, because prior treatment or changes in the body’s immune microenvironment may affect breast cancer pathology and ultrasound image characteristics; (4) missing clinical data, pathological data, or ultrasound images, to minimize bias; (5) a low risk of postoperative recurrence, because such patients were few; to avoid imbalance in the data, only patients with an intermediate or high risk of postoperative recurrence were studied.

Clinical data collection

Data on the clinical features of patients with breast cancer were retrospectively collected from the follow-up and medical record systems of our hospital. This information comprised age, gender, ethnicity, pathological features (lesion size, histological grading, vascular tumor embolus, ER expression, PR expression, HER-2 expression, nerve invasion, and axillary lymph node metastasis), ultrasonographic features (aspect ratio, morphology, margins, posterior echogenicity, intralesional blood flow, internal echoes, presence or absence of calcification, and lymph node morphology), tumor TNM (Tumor Node Metastasis) clinical staging, molecular typing, and the risk of postoperative recurrence. Patients were grouped according to the latest CSCO guidelines for clinical molecular typing of breast cancer: luminal A group, luminal B group, HER-2 overexpression group, and TNBC group [31]. Patients were also classified into intermediate-risk and high-risk groups for postoperative recurrence using the latest criteria from the CACA guidelines [32].

Instruments and methods

Ultrasound image acquisition

Breast ultrasound images were acquired by an experienced radiologist who was blinded to the pathological results. A GE LOGIQ E9 color Doppler ultrasound machine equipped with a linear-array probe was used. The patient lay supine with the arms abducted by 90° to fully expose the mammary glands and axilla. Radial scanning centered on the nipple was performed clockwise, starting from the upper outer quadrant, with adjacent areas scanned in overlap. Ultrasound characteristics of the breast mass and axillary lymph node metastasis were collected from transverse, longitudinal, and radial scanning views, and only clear, interpretable two-dimensional views were considered eligible.

Radiomics feature extraction and analysis

To improve the efficiency and precision of ROI outlining, manual outlining and artificial intelligence (AI) outlining were used together. Manual outlining was performed in a double-blind manner by a senior radiologist with 10 to 15 years of experience, who outlined the ROIs and labeled them for storage. The radiologist always used the same ultrasound machine for image acquisition and avoided compressing the tumor as much as possible. The maximum transverse diameter and the maximum longitudinal diameter of the tumor were captured separately, and at least two clear images were saved. The acquired images avoided blood vessels, nerves, and ribs as far as possible to minimize interference and maximize image quality. Unet software was used for AI outlining. To test its accuracy, we randomly selected 100 ultrasound images, numbered 1 to 100, and made two copies of each; the ROI was outlined with the Unet software on one copy and manually on the other. The ROIs were cropped, and the overlap between the two versions of each image was compared using the Unet software, giving an Intersection over Union (IoU) of 0.973, which suggests that the Unet method is accurate. After outlining, the images were input into the PyRadiomics package (github.com/Radiomics/pyradiomics) for feature extraction. A total of 744 features were extracted, comprising shape parameters, first-order parameters, gray-level co-occurrence matrix (GLCM) parameters, gray-level run-length matrix (GLRLM) parameters, gray-level size zone matrix (GLSZM) parameters, and gray-level dependence matrix (GLDM) parameters. To address errors arising from unequal sample sizes across classes, the Synthetic Minority Oversampling Technique (SMOTE) was used.
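As a minimal sketch of the overlap check, Intersection over Union can be computed for one pair of ROI masks as follows. It assumes that each ROI is stored as a binary mask of equal size; the function name and the toy masks are illustrative and are not the study's data or software.

```python
def intersection_over_union(mask_a, mask_b):
    """IoU of two equally sized binary masks given as nested lists of 0/1."""
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0  # pixels inside both ROIs
    union = 0         # pixels inside at least one ROI
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1

    # Two empty masks agree perfectly by convention.
    return 1.0 if union == 0 else intersection / union


# Toy 3x4 masks standing in for a manual and an automatic outline.
manual_mask = [[0, 1, 1, 0],
               [0, 1, 1, 0],
               [0, 0, 1, 0]]
auto_mask = [[0, 1, 1, 0],
             [0, 1, 1, 1],
             [0, 0, 0, 0]]

iou = intersection_over_union(manual_mask, auto_mask)
print(f"IoU = {iou:.3f}")
```

An IoU of 1 means the two outlines coincide exactly, and 0 means they do not overlap at all; a summary value over many images would typically be the mean of the per-image IoUs, although the text does not state how the reported value was aggregated.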
SMOTE is a classic method for handling imbalanced datasets. It balances the dataset by synthesizing new minority-class samples, created by interpolation between existing minority-class samples, in order to improve model performance. The core idea builds on the K-nearest neighbor algorithm: for each minority-class sample, SMOTE finds its K nearest minority-class neighbors and generates a new sample on the line segment between that sample and a randomly selected neighbor. The image features were divided into a training set and a validation set in a 7:3 ratio, and maximum absolute normalization was applied to scale all features to between −1 and 1. The Intra-class Correlation Coefficient (ICC) was calculated, and features with an ICC > 0.75 were retained. LASSO regression was then applied for dimensionality reduction. Finally, features with significant predictive value for the molecular typing of breast cancer and for the risk of postoperative recurrence were identified.
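The two preprocessing ideas above can be illustrated with a short sketch. It follows the standard SMOTE formulation, in which a synthetic point is placed on the segment between a minority-class sample and one of its K nearest minority-class neighbors, and it implements maximum absolute scaling column by column. The function names and toy values are assumptions made for illustration; in practice a library implementation would normally be used.

```python
import random


def max_abs_scale(rows):
    """Divide each column by its largest absolute value so entries lie in [-1, 1]."""
    n_cols = len(rows[0])
    # An all-zero column is left unchanged (scale factor 1.0).
    scales = [max(abs(row[j]) for row in rows) or 1.0 for j in range(n_cols)]
    return [[row[j] / scales[j] for j in range(n_cols)] for row in rows]


def smote_sample(minority, k, rng):
    """Create one synthetic minority-class point by SMOTE-style interpolation."""
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to the base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Toy feature rows (hypothetical values, two features per sample).
scaled = max_abs_scale([[2.0, -4.0], [1.0, 2.0]])
print(scaled)

minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic_point = smote_sample(minority_class, k=2, rng=random.Random(0))
print(synthetic_point)
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies within the range already spanned by the minority class.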

Model construction

The SVM algorithm was used to construct predictive models from the features identified through the LASSO method. For the SVM, test_size was set to 0.3, the kernel function to rbf, and gamma to scale. For the LASSO regression analysis, the parameters set at runtime were: test_size 0.3, random_state 15, n_estimators 200, random_state_rf 20, criterion entropy, class_weight balanced, Lasso alpha parameters -4, 1, 50, Lasso max_iter 100,000, and tenfold cross-validation. Subsequently, the receiver operating characteristic (ROC) curve for the radiomics model was generated. A calibration curve was used to evaluate the agreement of the predictive model with the ideal model, and the clinical practicability of the model was assessed using the decision curve.
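One plausible way to wire these settings together with scikit-learn is sketched below. Several points are assumptions rather than details confirmed by the text: the alpha values "-4, 1, 50" are read as `np.logspace(-4, 1, 50)`, LASSO is fitted to the 0/1 labels as a regression target, the scaler is fitted on the training split only, and synthetic data from `make_classification` stand in for the radiomic features. The helper name `fit_lasso_svm` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MaxAbsScaler
from sklearn.svm import SVC


def fit_lasso_svm(X, y, seed=15):
    """LASSO feature selection followed by an RBF-kernel SVM (illustrative only)."""
    # 7:3 split, mirroring test_size = 0.3 and random_state = 15.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    # Maximum absolute scaling, fitted on the training split (a choice of this sketch).
    scaler = MaxAbsScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

    # ASSUMPTION: the alpha grid "-4, 1, 50" means np.logspace(-4, 1, 50).
    lasso = LassoCV(
        alphas=np.logspace(-4, 1, 50), cv=10, max_iter=100_000, random_state=seed
    )
    lasso.fit(X_train, y_train)

    # Keep features whose coefficient is not zero at the selected alpha.
    selected = lasso.coef_ != 0
    if not selected.any():
        selected[:] = True  # fall back to all features if LASSO zeroes everything

    svm = SVC(kernel="rbf", gamma="scale", class_weight="balanced")
    svm.fit(X_train[:, selected], y_train)
    auc = roc_auc_score(y_test, svm.decision_function(X_test[:, selected]))
    return selected, auc, X_train.shape[0]


# Synthetic stand-in data; NOT the study's radiomic features.
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=5, random_state=0
)
selected, auc, n_train = fit_lasso_svm(X, y)
print(f"{int(selected.sum())} features kept, validation AUC = {auc:.3f}")
```

The n_estimators, random_state_rf, and criterion settings listed in the text are not SVM or LASSO parameters and are therefore left out of this sketch.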

Statistical analysis

The data were analyzed using SPSS 26.0 software, and the SMOTE algorithm was used to address sample size imbalances within each subgroup. For measurement data, normality of distribution was first assessed with the Kolmogorov–Smirnov test. Normally distributed data are presented as mean ± standard deviation (x̅±s) and were compared with the independent samples t-test. Non-normally distributed data are expressed as median (upper quartile, lower quartile) and were analyzed using the Mann–Whitney U test. Count data are presented as frequencies and were compared between groups using the chi-squared test and Fisher’s exact test. A multifactor logistic regression model was constructed to identify factors influencing the molecular typing of breast cancer and the risk of postoperative recurrence. Python 3.6 and Matplotlib were used to generate the ROC curve, calibration curve, and decision curve. The AUC, sensitivity, specificity, and accuracy served as evaluation indicators of model performance. A p-value of less than 0.05 was considered statistically significant.
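For reference, the evaluation indicators named above can be computed directly from true labels and model scores. The sketch below uses the rank interpretation of the AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half) and a fixed cut-off of 0.5 for the threshold-based metrics; the cut-off, function names, and toy values are illustrative assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def threshold_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a fixed score cut-off."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }


# Toy example: 1 = high risk, 0 = intermediate risk (hypothetical scores).
y_true = [0, 0, 1, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6]

auc_value = roc_auc(y_true, y_score)
metrics = threshold_metrics(y_true, y_score)
print(round(auc_value, 4), metrics)
```

For a four-class problem such as molecular typing, the same functions can be applied one class at a time by treating that class as positive and the remaining three as negative.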

Results

Comparison of baseline data of clinical information

In this study, 534 cases were ultimately enrolled, comprising 311 cases with an intermediate risk of postoperative recurrence and 223 cases with a high risk of recurrence. Between the two recurrence risk groups, significant differences (P < 0.05) were found in the following indicators: the number of lymph node metastases, lesion size, histological grading, vascular tumor embolus, nerve invasion, ER expression, PR expression, HER-2 expression, proliferation marker (Ki-67) expression, molecular typing, clinical staging, and ultrasound image characteristics (blood flow, mass morphology, mass margins, lymph node morphology, internal calcification) (refer to Table 1). Among the enrolled cases, there were 87 cases of luminal A, 234 cases of luminal B, 84 cases of HER-2 overexpression, and 129 cases of triple-negative breast cancer. Among the four molecular typing groups, statistically significant differences (P < 0.05) were observed in the following indicators: ethnicity, number of lymph node metastases, lesion size, histologic grading, expression of Ki-67, risk of postoperative recurrence, clinical stage, and features of ultrasound images (mass morphology, internal echogenicity, abnormal lymph node morphology, and internal calcification) (refer to Table 2).

Table 1 Comparison of baseline data of different postoperative recurrence risk groups
Table 2 Comparison of baseline data of different molecular typing groups

Analysis of clinical features

The 22 clinical features underwent statistical analysis, resulting in the identification of 16 risk factors associated with the risk of postoperative recurrence through univariate analysis. Subsequently, these factors underwent multifactorial logistic regression analysis, ultimately revealing 6 independent risk factors: the number of lymph node metastases, ER expression, HER-2 expression, molecular typing, clinical staging, and ultrasonographic blood flow grading (refer to Table 3).

Table 3 Multiple logistic regression model analysis of the risk factors of postoperative recurrence

Using pathology as the gold standard, univariate analysis identified 11 risk factors associated with the molecular typing of breast cancer. Through multifactor logistic regression analysis of these 11 risk factors in the training set, 6 independent risk factors were discerned: Ki-67 expression, number of lymph node metastases, histological grade, postoperative recurrence risk, clinical staging, and lymph node morphology (refer to Table 4).

Table 4 Molecular typing model analysis using multivariate logistic regression

Selection of radiomics features and model construction

Results of screening radiomics features

Features were screened using the independent samples t-test and LASSO regression. For the risk of postoperative recurrence, intermediate risk was coded as 0 and high risk as 1. For molecular typing, the HER-2 overexpression type was coded as 0, TNBC as 1, luminal A as 2, and luminal B as 3. A total of 733 features were extracted from the ultrasound images of the patients, and features with an ICC greater than 0.75 were retained and weighted with the LASSO coefficient (Figs. 1 and 2A-C). Nineteen optimal features for the molecular typing of breast cancer were ultimately identified (refer to Table 5; Fig. 2D), and 44 optimal features for the risk of postoperative recurrence were identified (refer to Table 6; Fig. 2E). The radiomics models were subsequently constructed.
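A radiomic score is commonly formed as a linear combination of the retained features, each weighted by its LASSO coefficient. The sketch below illustrates that weighting under this assumption; the `rad_score` helper, the feature names, and the coefficient values are hypothetical and are not taken from Tables 5 and 6.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Weighted sum of the selected features plus an intercept."""
    missing = [name for name in coefficients if name not in features]
    if missing:
        raise KeyError(f"features missing for: {missing}")
    # Only features with a nonzero LASSO coefficient contribute to the score.
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


# Hypothetical normalized feature values for one lesion.
lesion_features = {
    "glcm_contrast": 0.5,
    "firstorder_mean": -0.2,
    "shape_elongation": 0.8,  # dropped by LASSO in this toy example
}
# Hypothetical nonzero coefficients retained after LASSO.
lasso_coefficients = {"glcm_contrast": 0.4, "firstorder_mean": -1.0}

score = rad_score(lesion_features, lasso_coefficients, intercept=0.1)
print(f"Radscore = {score:.2f}")
```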

Fig. 1

Box plots of Radscores derived from ultrasound image features for breast cancer postoperative recurrence risk (A) and molecular typing (B). Intermediate risk of postoperative recurrence was coded as 0 and high risk as 1 (A). The HER-2 overexpression type was coded as 0, TNBC as 1, luminal A as 2, and luminal B as 3 (B). A total of 733 features were extracted from the ultrasound images of the patients, and Radscores were obtained after normalizing the extracted features. Each box spans the two quartiles of the Radscores, the line inside the box marks the median, and the whiskers connect the box to the upper and lower edges. The yellow dots represent the extracted features, and the blue diamonds represent outliers. The median lies in the middle of each box, indicating that the data are normally distributed

Fig. 2

The independent samples t-test and LASSO regression were used to screen the significant features for molecular typing (A) and the risk of postoperative recurrence (B). In the LASSO process, each colored line traces the coefficient of one feature as λ changes; the dashed line marks the optimal λ, and features whose coefficient is not 0 at that value are retained (C). Nineteen of the 733 features extracted from patient ultrasound images were associated with molecular typing (D, P < 0.05); the numbers correspond to the names of the optimal features in Table 5. Forty-four of the 733 features extracted from patient ultrasound images were associated with the risk of postoperative recurrence (E, P < 0.05); the numbers correspond to the names of the optimal features in Table 6. The bar plots show the p value of each radiomic feature used in the RadScore model in descending order of importance

Table 5 Optimal characteristics for molecular typing
Table 6 Optimal characteristics for the risk of postoperative recurrence

Postoperative recurrence risk prediction model

The AUC values for the postoperative recurrence risk prediction model constructed using ultrasound imaging omics features were 0.9489 and 0.8491 in the training set and the validation set, respectively (refer to Table 7; Fig. 3A and B). The calibration curve showed good agreement between the predictions of the ultrasound imaging omics model and the ideal model (refer to Fig. 3C, P = 0.30). Analysis of the decision curves demonstrated that the ultrasound imaging omics model offered good clinical applicability and diagnostic performance in both the training and validation sets (refer to Fig. 3D).
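For reference, decision curve analysis is usually based on net benefit, which at a threshold probability p_t is defined as TP/n - (FP/n) × p_t/(1 - p_t) and is compared against the strategies of treating all patients and treating none (net benefit 0). The sketch below computes these quantities; the labels, predicted probabilities, and function names are illustrative assumptions, not study data.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of acting on predictions at or above the threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(
        1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1
    )
    false_pos = sum(
        1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0
    )
    # False positives are penalized by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(labels, threshold):
    """Net benefit when every patient is classified as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example: 1 = event, 0 = no event (hypothetical predicted probabilities).
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1, 0.4, 0.7]

model_nb = net_benefit(outcomes, predicted, threshold=0.5)
all_nb = treat_all_net_benefit(outcomes, threshold=0.5)
print(model_nb, all_nb)
```

A decision curve repeats this calculation over a range of thresholds; a model is considered clinically useful where its curve lies above both reference strategies.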

Table 7 Performance evaluation of the postoperative recurrence risk models
Fig. 3

Predictive model for postoperative recurrence risk of breast cancer. A, the receiver operating characteristic (ROC) curves in the training set. B, the ROC curves in the validation set. C, calibration curve analysis of the predictive model. The diagonal dotted line indicates perfect prediction, while the orange solid line indicates the model’s performance. A closer fit to the diagonal dotted line indicates better performance. As shown in the figure, the model performs well (P = 0.30). D, decision curve analysis of the predictive model. The red line represents the assumption that all patients have postoperative recurrence. The dotted line indicates the hypothesis that no patients have postoperative recurrence. The red shaded area represents the predictive effectiveness of the model

Molecular typing prediction model

The AUC values for the molecular typing prediction model in the training set and validation set were, respectively, 0.93 and 0.92 for the HER-2 overexpression phenotype, 0.94 and 0.74 for the TNBC phenotype, 1.00 and 0.97 for the luminal A phenotype, and 1.00 and 0.89 for the luminal B phenotype (refer to Table 8; Fig. 4A and B). The calibration curve showed good agreement between the predictions of the ultrasound imaging omics model and the ideal model (refer to Fig. 4C, P = 0.09). Analysis of the decision curves demonstrated that the ultrasound imaging omics model offered good clinical applicability and diagnostic performance in both the training and validation sets (refer to Fig. 4D).

Table 8 Performance evaluation of the molecular typing models
Fig. 4

Predictive model for molecular subtyping of breast cancer. A, the receiver operating characteristic (ROC) curves in the training set. B, the ROC curves in the validation set. C, calibration curve analysis of the predictive model. The diagonal dotted line indicates perfect prediction, while the orange solid line indicates the model’s performance. A closer fit to the diagonal dotted line indicates better performance. As shown in the figure, the model performs well (P = 0.09). D, decision curve analysis of the predictive model. The red line indicates the hypothesis that all patients had different molecular types of breast cancer. The dotted line represents the hypothesis that none of the patients had different molecular types of breast cancer. The red shaded area indicates the predicted effect of the model

Discussion

Recent studies indicate an increasing incidence of breast cancer, particularly affecting young adults. Assessing the risk of postoperative recurrence and molecular typing is crucial for making personalized treatment decisions and assessing prognosis in patients diagnosed with breast cancer. Currently, postoperative pathology and immunohistochemistry are common methods for assessing these risks. However, the challenge lies in rapidly performing these assessments through non-invasive means. High-frequency ultrasound is adept at clearly displaying the morphological characteristics of breast masses, and its non-invasive, rapid, and convenient nature has made it widely accepted as the preferred examination for breast cancer screening and assessing diagnostic and therapeutic efficacy in China [33]. In this study, we delved into the clinical characteristics of molecular typing and postoperative recurrence risk. Through univariate and logistic regression models, we discovered that predicting molecular typing and postoperative recurrence risk based solely on clinical characteristics proved to be ineffective. Consequently, we further explored the value of ultrasonography in predicting the molecular typing of breast cancer and the risk of postoperative recurrence. This exploration aims to provide evidence supporting the diagnosis and treatment of patients with breast cancer, facilitate timely adjustments in therapeutic direction, and assist in the clinical development of personalized treatment plans.

Relationship between radiomics and clinical and imaging features with the risk of postoperative recurrence

Based on the postoperative recurrence risk assessment table in the CACA guidelines, patients were categorized into intermediate-risk and high-risk groups. Through univariate analysis and multifactorial logistic regression model analysis of the included clinical features, the number of lymph node metastases, ER expression, HER-2 expression, molecular typing, clinical staging, and ultrasonographic blood flow grading were identified as independent factors influencing the risk of postoperative recurrence. A total of 44 radiomic features were extracted and modeled, yielding AUC values of 0.9489 and 0.8491 for the postoperative recurrence risk prediction model in the training and validation sets, respectively. Notably, the radiomics model demonstrated superior predictive efficacy. This finding aligns with previous research, such as by Wang et al., who reported that a radiomics model assessing the risk of recurrence in patients with nasopharyngeal malignancies exhibited better predictive power than clinical, Ki-67-based, and TNM models [34]. Similarly, Qian et al. constructed a radiomics combined clinical model based on multiphase CT images and clinical risk factors, achieving AUCs of 0.813 and 0.838 in the training and validation sets, respectively [35]. This consistency supports the conclusion that radiomics outperforms clinical features in predicting the risk of cancer recurrence.

The relationship between the radiomics and clinical and imaging features with molecular typing

Through univariate analysis and multifactorial logistic regression model analysis of the included clinical features, 6 independent risk factors were identified: Ki-67 expression, number of lymph node metastases, histological grading, risk of postoperative recurrence, clinical staging, and lymph node morphology. Additionally, 19 radiomic features were extracted and modeled. The AUC values of the molecular typing prediction model in the training set and validation set were, respectively, 0.93 and 0.92 for the HER-2 overexpression phenotype, 0.94 and 0.74 for the TNBC phenotype, 1.00 and 0.97 for the luminal A phenotype, and 1.00 and 0.89 for the luminal B phenotype. These results indicate that the ultrasound imaging omics model exhibited a strong predictive ability. Both the luminal A and luminal B types reached an AUC of 1 in the training set, suggesting that the included radiomic features were relatively accurate and that there was no significant difference after feature extraction through multiple trainings. Consequently, the validation set performed well in this model. Notably, the ultrasound imaging omics model in this study outperformed previous research: Wu et al. achieved an overall accuracy of 74.1% in predicting 4 molecular types using X-ray, MRI, and clinical features, while Chen et al. attained a model AUC of 0.834 in distinguishing triple-negative breast cancer from non-triple-negative breast cancer using ultrasound imaging omics [36, 37]. The ultrasound imaging omics model in the current study demonstrated excellent performance, significantly enhancing the accuracy and robustness of predictions.

Limitations

This study has certain limitations: (1) It was confined to a single center with a modest sample size; subsequent research should include a larger sample and apply a wider range of classification methods; (2) The delineated ROIs were exclusively two-dimensional (2D), making them susceptible to the volume effect. Future investigations will address this limitation by delineating three-dimensional (3D) images; (3) The retrospective design, together with the subjective nature of ultrasound examinations and the static quality of the analyzed images, may have led to the inadvertent omission of specific feature information; (4) Certain clinical features were evaluated semi-qualitatively, introducing a degree of subjectivity.

Conclusion

In conclusion, the model developed using ultrasound imaging omics features for breast cancer demonstrates robust diagnostic performance, effectively assessing the risk of postoperative recurrence and exhibiting high accuracy and sensitivity in predicting the molecular typing of breast cancer. This offers clinicians more precise information for both diagnosis and treatment decisions. However, radiomics is still in its early developmental stages, and its integration into the medical field will continue to evolve with further advances in data sharing and machine learning.

Data availability

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Abbreviations

BC:

Breast cancer

ROI:

Region of interest

LASSO:

Least absolute shrinkage and selection operator

SVM:

Support vector machine

ROC:

Receiver operating characteristic

AUC:

Area under the curve

HER-2:

Human epidermal growth factor receptor 2

TNBC:

Triple negative breast cancer

ER:

Estrogen receptor

PR:

Progesterone receptor

CSCO:

Chinese society of clinical oncology

CACA:

China anti-cancer association

MRI:

Magnetic resonance imaging

NACT:

Neoadjuvant chemotherapy

HR:

Hormone receptors

GLCM:

Gray-level co-occurrence matrix

GLRLM:

Gray-level run-length matrix

GLSZM:

Gray-level size zone matrix

GLDM:

Gray level dependence matrix

ICC:

Intra-class correlation coefficient

SMOTE:

Synthetic minority oversampling technique

BI-RADS:

Breast imaging reporting and data system

CT:

Computed tomography

CI:

Confidence interval

2D:

Two-dimensional

3D:

Three-dimensional

CDFI:

Color Doppler flow imaging

SWE:

Shear wave elastography

References

  1. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global Cancer statistics 2020: GLOBOCAN estimates of incidence and Mortality Worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–49. https://doi.org/10.3322/caac.21660.

  2. Tao X, Li T, Gandomkar Z, Brennan PC, Reed WM. Incidence, mortality, survival, and disease burden of breast cancer in China compared to other developed countries. Asia Pac J Clin Oncol. 2023;19(6):645–54. https://doi.org/10.1111/ajco.13958.

  3. Trayes KP, Cokenakes SEH. Breast Cancer Treatment. Am Fam Physician. 2021;104(2):171–8. PMID: 34383430.

  4. Burstein HJ, Curigliano G, Thürlimann B, Weber WP, Poortmans P, Regan MM, et al. Customizing local and systemic therapies for women with early breast cancer: the St. Gallen International Consensus guidelines for treatment of early breast cancer 2021. Ann Oncol. 2021;32(10):1216–35. https://doi.org/10.1016/j.annonc.2021.06.023.

  5. Ma HF, Shen J, Xu B, Shen JG. Neoadjuvant chemotherapy combined with endocrine therapy for hormone receptor-positive breast cancer: a systematic review and meta-analysis. Med (Baltim). 2023;102(46):e35928. https://doi.org/10.1097/MD.0000000000035928.

  6. Lin B, Fan J, Liu F, Wen Y, Li J, Gao F, et al. Efficacy and safety of dual Anti-HER2 blockade and Docetaxel with or without Carboplatin as Neoadjuvant Regimen for treatment of HER2-Positive breast Cancer. Technol Cancer Res Treat. 2023;22:15330338231218152. https://doi.org/10.1177/15330338231218152.

  7. So JY, Ohm J, Lipkowitz S, Yang L. Triple negative breast cancer (TNBC): non-genetic tumor heterogeneity and immune microenvironment: emerging treatment options. Pharmacol Ther. 2022;237:108253. https://doi.org/10.1016/j.pharmthera.2022.108253.

  8. Lee J. Current Treatment Landscape for early triple-negative breast Cancer (TNBC). J Clin Med. 2023;12(4):1524. https://doi.org/10.3390/jcm12041524.

  9. Anurag M, Jaehnig EJ, Krug K, Lei JT, Bergstrom EJ, Kim BJ, et al. Proteogenomic Markers of Chemotherapy Resistance and Response in Triple-negative breast Cancer. Cancer Discov. 2022;12(11):2586–605. https://doi.org/10.1158/2159-8290.CD-22-0200.

  10. Barba D, León-Sosa A, Lugo P, Suquillo D, Torres F, Surre F, et al. Breast cancer, screening and diagnostic tools: all you need to know. Crit Rev Oncol Hematol. 2021;157:103174. https://doi.org/10.1016/j.critrevonc.2020.103174.

  11. Andre F, Ismaila N, Henry NL, Somerfield MR, Bast RC, Barlow W, et al. Use of biomarkers to Guide decisions on Adjuvant systemic therapy for women with early-stage invasive breast Cancer: ASCO Clinical Practice Guideline Update-Integration of results from TAILORx. J Clin Oncol. 2019;37(22):1956–64. https://doi.org/10.1200/JCO.19.00945.

  12. Rodin D, Sutradhar R, Jerzak KJ, Hahn E, Nguyen L, Castelo M, et al. Impact of non-adherence to endocrine therapy on recurrence risk in older women with stage I breast cancer after breast-conserving surgery. Breast Cancer Res Treat. 2023;201(1):77–87. https://doi.org/10.1007/s10549-023-06989-x.

  13. Waks AG, Winer EP. Breast Cancer Treatment: a review. JAMA. 2019;321(3):288–300. https://doi.org/10.1001/jama.2018.19323.

  14. Rahman WT, Helvie MA. Breast cancer screening in average and high-risk women. Best Pract Res Clin Obstet Gynaecol. 2022;83:3–14. https://doi.org/10.1016/j.bpobgyn.2021.11.007.

  15. Li T, Li J, Heard R, Gandomkar Z, Ren J, Dai M, et al. Understanding mammographic breast density profile in China: a sino-australian comparative study of breast density using real-world data from cancer screening programs. Asia Pac J Clin Oncol. 2022;18(6):696–705. https://doi.org/10.1111/ajco.13763.

  16. Vourtsis A, Berg WA. Breast density implications and supplemental screening. Eur Radiol. 2019;29(4):1762–77. https://doi.org/10.1007/s00330-018-5668-8.

  17. Wang Y, Li Y, Song Y, Chen C, Wang Z, Li L, et al. Comparison of ultrasound and mammography for early diagnosis of breast cancer among Chinese women with suspected breast lesions: a prospective trial. Thorac Cancer. 2022;13(22):3145–51. https://doi.org/10.1111/1759-7714.14666.

  18. Slostad JA, Yun NK, Schad AE, Warrior S, Fogg LF, Rao R. Concordance of breast cancer biomarker testing in core needle biopsy and surgical specimens: a single institution experience. Cancer Med. 2022;11(24):4954–65. https://doi.org/10.1002/cam4.4843.

  19. Yilmaz C, Cavdar DK. Biomarker discordances and alterations observed in breast Cancer treated with Neoadjuvant Chemotherapy: causes, frequencies, and Clinical Significances. Curr Oncol. 2022;29(12):9695–710. https://doi.org/10.3390/curroncol29120761.

  20. Coiro S, Gasparini E, Falco G, Santandrea G, Foroni M, Besutti G, et al. Biomarkers changes after neoadjuvant chemotherapy in breast Cancer: a seven-year single Institution experience. Diagnostics (Basel). 2021;11(12):2249. https://doi.org/10.3390/diagnostics11122249.

  21. Zhao W, Sun L, Dong G, Wang X, Jia Y, Tong Z. Receptor conversion impacts outcomes of different molecular subtypes of primary breast cancer. Ther Adv Med Oncol. 2021;13:17588359211012982. https://doi.org/10.1177/17588359211012982.

  22. Mayerhoefer ME, Materka A, Langs G, Häggström I, Szczypiński P, Gibbs P, et al. Introduction to Radiomics. J Nucl Med. 2020;61(4):488–95. https://doi.org/10.2967/jnumed.118.222893.

  23. El Haji H, Souadka A, Patel BN, Sbihi N, Ramasamy G, Patel BK, et al. Evolution of breast Cancer recurrence risk prediction: a systematic review of Statistical and Machine Learning-based models. JCO Clin Cancer Inf. 2023;7:e2300049. https://doi.org/10.1200/CCI.23.00049.

  24. Yaghoobpoor S, Fathi M, Ghorani H, Valizadeh P, Jannatdoust P, Tavasol A, et al. Machine learning approaches in the prediction of positive axillary lymph nodes post neoadjuvant chemotherapy using MRI, CT, or ultrasound: a systematic review. Eur J Radiol Open. 2024;12:100561. https://doi.org/10.1016/j.ejro.2024.100561.

  25. Castellana R, Fanni SC, Roncella C, Romei C, Natrella M, Neri E. Radiomics and deep learning models for CT pre-operative lymph node staging in pancreatic ductal adenocarcinoma: a systematic review and meta-analysis. Eur J Radiol. 2024;176:111510. https://doi.org/10.1016/j.ejrad.2024.111510.

  26. Etemadifar M, Norouzi M, Alaei SA, Karimi R, Salari M. The diagnostic performance of AI-based algorithms to discriminate between NMOSD and MS using MRI features: a systematic review and meta-analysis. Mult Scler Relat Disord. 2024;87:105682. https://doi.org/10.1016/j.msard.2024.105682.

  27. Liu H, Cui G, Luo Y, Guo Y, Zhao L, Wang Y, et al. Artificial Intelligence-based breast Cancer diagnosis using Ultrasound images and Grid-based deep feature generator. Int J Gen Med. 2022;15:2271–82. https://doi.org/10.2147/IJGM.S347491.

  28. Zhou BY, Wang LF, Yin HH, Wu TF, Ren TT, Peng C, et al. Decoding the molecular subtypes of breast cancer seen on multimodal ultrasound images using an assembled convolutional neural network model: a prospective and multicentre study. EBioMedicine. 2021;74:103684. https://doi.org/10.1016/j.ebiom.2021.103684.

  29. Hayashida T, Odani E, Kikuchi M, Nagayama A, Seki T, Takahashi M, et al. Establishment of a deep-learning system to diagnose BI-RADS4a or higher using breast ultrasound for clinical application. Cancer Sci. 2022;113(10):3528–34. https://doi.org/10.1111/cas.15511.

  30. Shi Z, Ma Y, Ma X, Jin A, Zhou J, Li N, et al. Differentiation between Phyllodes Tumors and fibroadenomas through breast ultrasound: Deep-Learning Model outperforms Ultrasound Physicians. Sens (Basel). 2023;23(11):5099. https://doi.org/10.3390/s23115099.

  31. Li J, Jiang Z. Chinese Society of Clinical Oncology Breast Cancer (CSCO BC) guidelines in 2022: stratification and classification. Cancer Biol Med. 2022;19(6):769–73. https://doi.org/10.20892/j.issn.2095-3941.2022.0277.

  32. Wu J, Fan D, Shao Z, Xu B, Ren G, Jiang Z, et al. CACA guidelines for holistic integrative management of breast Cancer. Holist Integr Oncol. 2022;1(1):7. https://doi.org/10.1007/s44178-022-00007-8.

  33. Wang K, Zou Z, Shen H, Huang G, Yang S. Calcification, posterior acoustic, and Blood Flow: Ultrasonic characteristics of triple-negative breast Cancer. J Healthc Eng. 2022;2022:9336185. https://doi.org/10.1155/2022/9336185.

  34. Wang T, Hao J, Gao A, Zhang P, Wang H, Nie P, et al. An MRI-Based Radiomics Nomogram to assess recurrence risk in Sinonasal Malignant tumors. J Magn Reson Imaging. 2023;58(2):520–31. https://doi.org/10.1002/jmri.28548.

  35. Qian J, Yang L, Hu S, Gu S, Ye J, Li Z, et al. Feasibility study on Predicting recurrence risk of bladder Cancer based on Radiomics features of multiphase CT images. Front Oncol. 2022;12:899897. https://doi.org/10.3389/fonc.2022.899897.

  36. Wu M, Zhong X, Peng Q, Xu M, Huang S, Yuan J, et al. Prediction of molecular subtypes of breast cancer using BI-RADS features based on a white box machine learning approach in a multi-modal imaging setting. Eur J Radiol. 2019;114:175–84. https://doi.org/10.1016/j.ejrad.2019.03.015.

  37. Chen Q, Xia J, Zhang J. Identify the triple-negative and non-triple-negative breast cancer by using texture features of medicale ultrasonic image: a STROBE-compliant study. Med (Baltim). 2021;100(22):e25878. https://doi.org/10.1097/MD.0000000000025878.

Acknowledgements

We would like to acknowledge the hard and dedicated work of all the staff that implemented the intervention and evaluation components of the study.

Funding

Department of Science and Technology of Xinjiang Uygur Autonomous Region (2022TSYCTD0001).

Author information

Contributions

Conception and design of the research: Wen Liu, Chao Dong, Binlin Ma, Xinyu Song, Haoyi Xu. Acquisition of data: Xiaoli Wang, Yanyan Chen, Xiaoling Leng. Analysis and interpretation of the data: Xinyu Song, Haoyi Xu, Yanyan Chen, Xiaoling Leng, Wen Liu. Statistical analysis: Xinyu Song, Yue Hu, Zhimin Luo. Obtaining financing: Chao Dong. Writing of the manuscript: Xinyu Song, Haoyi Xu, Yue Hu. Critical revision of the manuscript for intellectual content: Binlin Ma, Xinyu Song, Haoyi Xu, Xiaoli Wang, Zhimin Luo. All authors read and approved the final draft.

Corresponding authors

Correspondence to Chao Dong or Binlin Ma.

Ethics declarations

Ethics approval and consent to participate

This study was conducted with approval from the Ethics Committee of Tumor Hospital Affiliated to Xinjiang Medical University. This study was conducted in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Song, X., Xu, H., Wang, X. et al. Use of ultrasound imaging Omics in predicting molecular typing and assessing the risk of postoperative recurrence in breast cancer. BMC Women's Health 24, 380 (2024). https://doi.org/10.1186/s12905-024-03231-8

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s12905-024-03231-8

Keywords