
Predicting postoperative gastric cancer prognosis based on inflammatory factors and machine learning technology



There is a strong association between gastric cancer and inflammatory factors. Many studies have shown that machine learning can predict cancer patients’ prognosis. However, there has been no study on predicting gastric cancer death based on machine learning using related inflammatory factor variables.


Six machine learning algorithms were applied to predict overall (all-cause) death after gastric cancer surgery.


Feature weights from the Gradient Boosting Machine (GBM) algorithm showed that the three most important prognostic factors were neutrophil-lymphocyte ratio (NLR), platelet-lymphocyte ratio (PLR) and age. For the overall postoperative death model in the test group, LR had the highest accuracy (0.759), followed by GBM (0.733). Ranked by AUC from high to low, the six algorithms were LR, GBM, GBDT, forest, Tr and Xgbc. LR had the highest precision (0.736), followed by GBM (0.660), and GBM had the highest recall (0.667).


Postoperative mortality from gastric cancer can be predicted based on machine learning.



Gastric cancer (GC) is a malignancy that originates from gastric epithelial cells, and its histological type is mostly adenocarcinoma. Based on stage at diagnosis, it is classified as either early gastric cancer or advanced gastric carcinoma. Early gastric cancer refers to cases where cancer cell invasion is confined to the gastric mucosa or submucosa. Regardless of the surface area of the lesion and lymph node metastasis, once the invasion depth exceeds the submucosa, the disease is considered advanced gastric carcinoma. Gastric cancer is the fifth most common malignant tumor in the world, and more than 70% of cases occur in developing countries. About half occur in East Asia, mostly in China. Gastric cancer is the third leading cause of cancer death among men and women worldwide (behind lung and liver cancer), and its mortality rate is highest in East Asia (14.0 per 100,000 in men and 9.80 per 100,000 in women) [1]. Studies have shown that most patients treated in the early stages by endoscopic mucosal resection or endoscopic submucosal dissection have a high 5-year survival rate [2]. In the advanced stages, however, the 5-year survival rate is low, even after surgery and chemotherapy [3]. Prognosis prediction and treatment selection based on traditional TNM (Classification of Malignant Tumors) staging and histological typing are currently the most widely used methods in clinical practice. However, patients with the same TNM stage and treatment plan can still have very different prognoses. Preoperative identification of patients at high risk of death would allow clinicians to intervene early and to select the patients most likely to benefit from new therapies such as immunization for individualized treatment, thereby improving prognosis. Therefore, there is an urgent need to improve existing prediction models and to establish new models that can accurately judge prognosis and guide treatment selection.

Inflammation is a key process in the occurrence and progression of malignant tumors. In recent years, more has been learned about the relationship between inflammation and cancer [4, 5]. Many cancers are induced by chronic inflammation [6, 7]. When chronic inflammatory reactions persist, pathogens cannot be eliminated by innate or adaptive immunity, and the resulting reaction induces tumor cell production. Tumor growth then destroys the tissue structure and produces inflammatory factors. These recruit an array of inflammatory cells that hinder the resolution of inflammation, thus promoting tumor growth and spread [13]. Many studies have shown that the neutrophil-lymphocyte ratio (NLR) and platelet-lymphocyte ratio (PLR) reflect patients’ inflammation and immune status, and that they are prognostic factors in multiple tumors (including rectal, prostate, lung, and breast cancer) [8,9,10].

Artificial intelligence (AI) is an emerging technical science that simulates, extends and expands the theory, methodology, technology and application systems of human intelligence. In the contemporary medical device field, most artificial intelligence systems collect data through a machine, then optimize and analyze the data, ultimately producing either a qualitative or a quantitative result. For instance, machine learning has been used to predict complication-related death after radical cystectomy for bladder cancer [11]. Machine learning based on serial cancer antigen 125 levels can also predict abdominopelvic cancer recurrence on CT [12]. Furthermore, machine learning has been used to predict early biochemical recurrence after robot-assisted prostatectomy [13]. However, there have been no related studies on predicting mortality after surgery for gastric cancer. This study addresses that gap in the scientific literature.

Materials and Methods

Study population

The present study is a secondary analysis. The data are available in the BioStudies database. The study involved 1,056 GC patients who had undergone gastrectomy.

All patients had been diagnosed with Stage I-III gastric carcinoma on the basis of postoperative histological specimens. Tumor staging was performed using the seventh edition of the American Joint Committee on Cancer (AJCC) TNM staging system [14]. The inclusion and exclusion criteria were: (1) there had been no neoadjuvant chemotherapy or radiotherapy; (2) there was complete clinical pathology and follow-up data on potential prognostic factors; (3) there was no recurrence of gastric cancer, residual gastric cancer or other synchronous malignant tumors; (4) there was no acute infection or other inflammation within two weeks before surgery.

The following data were collected: age, sex, preoperative routine laboratory examination, postoperative tumor characteristics and survival time. Blood samples were collected a week before surgery. Papillary and moderately differentiated GC were grouped as highly differentiated. Signet-ring cell, mucinous and undifferentiated GC were grouped as poorly differentiated [15].

Biomarker calculation

NLR and PLR were defined as the absolute neutrophil count and the absolute platelet count, respectively, divided by the absolute lymphocyte count [16]. According to previous studies, COP-NLR was calculated as follows: patients with both an elevated platelet count (> 300 × 10^9/L) and an elevated neutrophil-lymphocyte ratio (> 3) were scored as 2. Those with only one or neither of these abnormal values were scored as 1 or 0, respectively [17].
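The biomarker definitions above can be illustrated with a minimal Python sketch. The function name, argument names and the example patient are hypothetical and are not taken from the study; the scoring assumes one point for each abnormal value (platelets > 300 × 10^9/L, NLR > 3), which is how the combined score in [17] is usually read.

```python
def inflammatory_markers(neutrophils, lymphocytes, platelets):
    """Return (NLR, PLR, COP-NLR) from absolute counts given in 10^9 cells/L."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    nlr = neutrophils / lymphocytes
    plr = platelets / lymphocytes
    # Assumed scoring: one point per abnormal value (platelets > 300, NLR > 3).
    cop_nlr = int(platelets > 300) + int(nlr > 3)
    return nlr, plr, cop_nlr


# Hypothetical patient, invented for illustration only.
nlr, plr, score = inflammatory_markers(neutrophils=6.0, lymphocytes=1.5, platelets=320.0)
print(nlr, round(plr, 1), score)
```

Both cut-offs are exceeded for this invented patient, so the combined score is 2.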

Main outcome: Follow-up was conducted every 3 months for the first 2 years after surgery, and every 6 months thereafter. Overall death was defined as death from any cause after surgery.

Machine learning algorithm

Logistic Regression (LR) is a supervised classification algorithm. In classification, the target variable (or output) y can take only discrete values for a given set of features (or inputs) x. Logistic regression builds a regression model to predict the probability that a given input belongs to the category labeled “1”. Whereas linear regression assumes that the data follow a linear function, logistic regression models the data with a sigmoid function.
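As a rough illustration of how the sigmoid turns a linear score into a class probability, consider the sketch below. The coefficients, bias and feature values are hypothetical and were not estimated from the study data.

```python
import math


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of class "1" under a logistic regression model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for two features (say, NLR and age in decades).
p = predict_proba(weights=[0.4, 0.2], bias=-3.0, features=[4.0, 6.5])
label = 1 if p >= 0.5 else 0  # threshold the probability to get a class
print(round(p, 3), label)
```

A probability threshold of 0.5 is a common default; a different cut-off trades precision against recall.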

Decision tree algorithms (Tr) are a type of supervised learning that can solve both regression and classification problems. A decision tree uses a tree representation in which each leaf node corresponds to a class label and the attributes are tested at the internal nodes of the tree.
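A single internal node of such a tree can be sketched as a search for the threshold that best separates the classes. The sketch below assumes a CART-style Gini impurity criterion, which the text does not specify, and uses invented toy values.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Threshold on one feature that minimizes the weighted Gini impurity."""
    best = (None, float("inf"))
    # The largest value is skipped so that the right branch is never empty.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Toy NLR values and outcomes (1 = died), invented for illustration.
threshold, impurity = best_threshold([1.5, 2.0, 2.8, 3.5, 4.2, 5.0], [0, 0, 0, 1, 1, 1])
print(threshold, impurity)
```

A full tree repeats this search recursively on each branch until a stopping rule is met.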

Random forest (forest) builds on bagging (bootstrap aggregating) of decision trees and additionally introduces random attribute selection into each tree’s training process. For each node of a base decision tree, a subset of k attributes is randomly selected from the node’s attribute set, and the optimal attribute for splitting is then chosen from that subset.
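The two sources of randomness described above, bootstrap sampling of records and random selection of k attributes at a node, can be sketched as follows. The helper names, the seed and the value k = 2 are assumptions made for illustration.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]


def candidate_attributes(attributes, k, rng):
    """Randomly pick k distinct attributes to consider at one tree node."""
    return rng.sample(attributes, k)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
attributes = ["age", "sex", "stage", "NLR", "PLR", "COP-NLR"]
rows = list(range(10))  # stand-ins for patient records

sample = bootstrap_sample(rows, rng)              # training set for one tree
subset = candidate_attributes(attributes, 2, rng)  # attributes tried at one node
print(sample, subset)
```

Each tree in the forest would be grown on its own bootstrap sample, drawing a fresh attribute subset at every node.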

Gradient Boosting Decision Tree (GBDT) is a boosting algorithm (boosting is a class of algorithms that combine weak learners into a strong learner). It can also be considered an improvement on the basic boosting algorithm. Each iteration reduces the residual left by the previous one: to do so, a new model is fitted in the gradient direction along which the residual decreases.
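The residual-fitting idea can be sketched with one-split regression stumps and squared-error loss, for which the negative gradient is simply the residual. This is a toy illustration under those assumptions, not the GBDT implementation used in the study; the data, number of rounds and learning rate are invented.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump for the current residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Return the squared-error history; each round fits a stump to the residuals."""
    preds = [sum(ys) / len(ys)] * len(xs)  # start from the mean outcome
    history = []
    for _ in range(rounds):
        # For squared loss the negative gradient equals the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        history.append(sum(r * r for r in residuals))
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return history


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
history = boost(xs, ys)
print(history[:3])
```

The recorded error shrinks from round to round, which is the behavior the paragraph describes.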

Gradient Boosting Machine (GBM) is a fast, distributed, high-performance gradient boosting framework based on the decision tree algorithm. It can be used for ranking, classification, regression and many other machine learning tasks.

Extreme Gradient Boosting (Xgbc) is a boosting algorithm built on boosted tree models; it integrates many tree models into a single ensemble.

Data processing method: Data analysis was conducted in R. Preoperative inflammatory indexes for the two groups, and quantitative data such as NLR and PLR, were expressed as mean ± standard deviation and compared using independent t-tests. Differences were considered statistically significant at P < 0.05. The correlation analysis and the machine learning analyses were conducted in Python, and the prognosis weights were derived with the LightGBM algorithm. The data were randomly divided into a training group and a testing group at a 7:3 ratio. Five-fold cross-validation was used.
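The 7:3 split and the cross-validation scheme can be sketched with the standard library alone. The sketch assumes that cross-validation used five folds and was run within the training group, which the text does not state explicitly; the seed and helper names are likewise assumptions.

```python
import random


def train_test_split(indices, test_fraction=0.3, seed=42):
    """Shuffle the indices and split them into training and testing parts."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# 1,056 patients, as reported for the study population.
train_idx, test_idx = train_test_split(range(1056))
folds = list(k_folds(train_idx, k=5))
print(len(train_idx), len(test_idx), len(folds))
```

Each model would be fitted on the four training folds and scored on the held-out fold, with the test group kept aside for the final evaluation.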


Comparison of basic indexes between the two groups: There was no statistically significant difference between the two groups (P = 0.862). PLR and NLR in the poor prognosis group were higher than those in the control group, and the differences were statistically significant (P < 0.05) (see Table 1).

Table 1 Descriptive statistics on patients

The correlation analysis showed that age, tumor stage, NLR and PLR were positively correlated with postoperative death among gastric cancer patients (see Fig. 1). In addition, the feature weights from the GBM algorithm identified NLR, PLR and age as the three most important prognostic factors (see Fig. 2).

Fig. 1

Correlation between variables

Fig. 2

Variable importance of features included in the machine learning algorithm for predicting postoperative death outcomes for gastric cancer

Note: GBM: LightGBM

Performance of the overall postoperative death model in the training group: Among the six algorithms, forest was the most accurate (0.884), followed by Xgbc (0.868). Ranked by AUC from high to low, the six algorithms were forest, Xgbc, GBDT, GBM, Tr and LR. Forest had the highest precision and recall (precision = 0.876 and recall = 0.823), followed by Xgbc (precision = 0.859 and recall = 0.797) (Fig. 3 and Table 2).

Fig. 3

Machine learning algorithms predict gastric cancer postoperative death outcomes in the training group

Table 2 Forecast Results for Training Group
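For readers less familiar with the accuracy, precision and recall reported for the six algorithms, the sketch below shows how they follow from counts of true and false predictions. The labels and predictions are invented and do not reproduce any value in the tables.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = death)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Precision: share of predicted deaths that were real deaths.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: share of real deaths that the model caught.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Invented labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a mortality model, recall is often the more clinically relevant of the two, since a missed high-risk patient receives no early intervention.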

Performance of the overall postoperative death model in the test group: LR had the highest accuracy (0.759), followed by GBM (0.733). Ranked by AUC from high to low, the six algorithms were LR, GBM, GBDT, forest, Tr and Xgbc. LR had the highest precision (0.736), followed by GBM (0.660), and GBM had the highest recall (0.667) (Fig. 4 and Table 3).

Fig. 4

Machine learning algorithms predict gastric cancer postoperative death outcomes in the test group

Table 3 Forecast Results for Testing Group
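The AUC used to rank the algorithms can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. A minimal pairwise sketch of that definition, with invented scores, is given below; it is not the routine used in the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A tie between a positive and a negative score counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented outcomes and predicted risks, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)
```

Because it depends only on the ranking of the scores, AUC does not change with the probability threshold chosen for classification.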


The tumor inflammation factors NLR and PLR have predictive value for prognosis in a variety of tumors. An elevated NLR often indicates poor prognosis, higher tumor stage, poor treatment response, and shorter disease-free and overall survival in patients with malignant tumors. However, the underlying mechanism is not yet clear [18, 19]. The results of the present study show that machine learning algorithms can predict the prognosis of gastric cancer. Among the factors examined, the three most important, in order, are NLR, PLR and age.

Jiang et al. [20] studied the relationship between NLR and gastric cancer, and found that NLR levels in the gastric cancer group were significantly higher than those in either the gastric polyposis group or the benign gastric tumor group. Lian et al. [21] studied the effect of NLR and PLR on the prognosis of gastric cancer and found that preoperative PLR and NLR levels in patients with gastric cancer were significantly higher than those in healthy subjects. They also found that patients with lower preoperative PLR and NLR levels had better clinicopathological features, shallower invasion depth and less lymph node metastasis. Pietrzyk et al. [22] studied 61 patients with gastric cancer and 61 healthy subjects, and found that the MPV, RDW, NLR and PLR levels in patients with gastric cancer were significantly higher than those in the control group. The present study produced similar results.

NLR predicts poor prognosis among patients with gastric cancer and is correlated with the efficacy of antitumor therapy. Studies have shown that the curative effect of chemotherapy is significantly lower in patients with high NLR than in patients with low NLR [23]. Studies have also shown that high NLR is correlated with an increase in PD-1+ T cells [24].

The PD-L1/PD-1 pathway promotes tumor immune tolerance by promoting T-cell apoptosis and by inhibiting T-cell activation and the antitumor immune response. Antitumor therapy targeting PD-L1/PD-1 works by blocking tumor immune privilege, which enhances the effect of antitumor immune cells [25]. In this study, inflammation-related factors predicted poor prognosis for gastric cancer after surgery. It was also found that NLR and PLR are positively correlated with tumor size. This raises several questions: can immunotherapy counteract the adverse effect of inflammation-related factors on tumors? Is NLR an effective screening index for patients receiving PD-1/PD-L1 therapy?

Studies have shown that NLR and PLR are related to various clinical and pathological indexes of GC, and that they may be markers of GC progression. Chen et al. [26] proposed that preoperative PLR levels are an independent predictor of peritoneal metastasis in patients with GC. When combined with pathological features of gastric cancer (including depth of invasion, lymphoid invasion and pathological stage), the prediction results are more reliable. Kim et al. [27] divided 1,986 GC patients who had undergone therapeutic surgery into high and low PLR groups and into high and low NLR groups. Comparison of the groups’ clinical characteristics showed that high NLR and high PLR were correlated with poor prognosis, and that NLR was a better predictor of overall survival than PLR. These results are similar to the results of this study.

We are currently in the midst of a boom in health and physical therapy data, so further advances can be expected in the accuracy of artificial intelligence-assisted diagnosis and treatment of gastric cancer. Data on gastric cancer patients are being generated at accelerating speed and volume, but the existing PLR and NLR models cannot be updated as new data arrive, so their performance deteriorates as diagnosis and treatment practices change. By contrast, artificial intelligence algorithms can learn dynamically from newly collected gastric cancer data and thereby gradually improve diagnosis and prognosis prediction. Moreover, when a patient is discharged from the hospital, their diagnosis and treatment condition can be fed back to the doctor through an intelligent app, so that their recovery can be evaluated, with data stored and viewed in real time.

Machine learning algorithms are prone to over-fitting and under-fitting when prediction models are constructed. Under-fitting means that a model performs poorly on the training set as well as on the validation and/or test set. Over-fitting means that a model performs well on the training set but poorly at the validation and/or testing stage; that is, the model generalizes poorly. In this study, the GBM algorithm was the most stable across the training group and the test group. Therefore, it is the most reliable for determining the weight of the various risk factors. However, multicenter cohort studies must be incorporated into model training in order to improve its performance.

This study has several limitations. First, it is a retrospective study of postoperative patients with gastric cancer. A larger source of bias is that East Asian patients likely have a specific phenotype that differs from other demographic groups, which could make our ML model less useful when applied broadly. The model would still be valuable among East Asian patients, who likely have the most to gain from it given their higher incidence of GC. Furthermore, we have neither analyzed nor given predictions for gastric cancer patients with other malignant tumors. Finally, the variation trends of NLR and PLR during the occurrence and progression of gastric cancer could be refined and discussed further in the future. Because NLR and PLR testing is convenient and rapid, future research, including large-sample studies, is needed to verify their clinical value. Moreover, the mechanism of the interaction between inflammation and gastric cancer needs further study. This could provide a new target for molecular targeted therapy for gastric cancer.


Inflammation, as a feature of gastric cancer, provides a new direction for further study of invasion and metastasis in gastric cancer. The results of this study show that machine learning can improve the prediction of gastric cancer prognosis after surgery.

Data Availability

Data are available at the BioStudies database.


  1. Ferlay J, Soerjomataram I, Dikshit R, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012[J]. Int J Cancer. 2015;136(5):E359–86.


  2. Nashimoto A, Akazawa K, Isobe Y et al. Gastric cancer treated in 2002 in Japan: 2009 annual report of the JGCA nationwide registry[J]. Gastric Cancer, 2013, 16(1):1–27.

  3. Ajani JA, Bentrem DJ, Besh S, et al. Gastric Cancer, Version 2.2013 featured updates to the NCCN Guidelines[J]. J Natl Compr Cancer Network: JNCCN. 2013;11(5):531–46.


  4. Marx J. Inflammation and Cancer: the Link grows Stronger[J]. Science. 2004;306(5698):966–8.


  5. Sethi G. TNF: A master switch for inflammation to cancer[J]. Front Biosci. 2008;13:5094.

  6. Bhatelia K, Singh K, Singh R. TLRs: linking inflammation and breast cancer[J]. Cell Signal. 2014;26(11):2350–7.


  7. Munn LL. Cancer and inflammation[J]. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. 2017;9(2):e1370.


  8. Kumano Y, Hasegawa Y, Kawahara T, et al. Pretreatment Neutrophil to Lymphocyte Ratio (NLR) Predicts Prognosis for Castration Resistant Prostate Cancer Patients Underwent Enzalutamide[J]. Biomed Res Int. 2019;2019:9450838.

  9. Mirili Cem,Guney Isa Burak,Paydas Semra. Prognostic significance of neutrophil/lymphocyte ratio (NLR) and correlation with PET-CT metabolic parameters in small cell lung cancer (SCLC).[J].Int. J Clin Oncol. 2019;24:168–78.


  10. Azab B, Bhatt VR, Phookan J, et al. Usefulness of the neutrophil-to-lymphocyte ratio in Predicting Short- and long-term mortality in breast Cancer Patients[J]. Ann Surg Oncol. 2012;19(1):217–24.


  11. Klén R, Salminen Antti P, Mahmoudian Mehrad et al. Prediction of complication related death after radical cystectomy for bladder cancer with machine learning methodology.[J].Scand J Urol, 2019, undefined: 1–7.

  12. Shinagare Atul B, Ip Balthazar Patricia K, et al. High-Grade Serous Ovarian Cancer: Use of Machine Learning to Predict Abdominopelvic recurrence on CT on the basis of serial Cancer Antigen 125 levels.[J]. J Am Coll Radiol. 2018;15:1133–8.


  13. Wong Nathan C, Lisa, et al. Use of machine learning to predict early biochemical recurrence after robot-assisted prostatectomy[J]. BJU Int. 2019;123:51–7.


  14. Washington K. 7th Edition of the AJCC Cancer staging Manual: Stomach[J]. Ann Surg Oncol. 2010;17(12):3077–9.


  15. Ahn HS, Lee HJ, Hahn S, Kim WH, Lee KU, Sano T, et al. Evaluation of the seventh American Joint Committee on Cancer/International Union Against Cancer classification of gastric adenocarcinoma in comparison with the sixth classification. Cancer. 2010;116(24):5592–8.


  16. Yodying H, Matsuda A, Miyashita M, et al. Prognostic significance of Neutrophil-to-lymphocyte ratio and platelet-to-lymphocyte ratio in oncologic outcomes of Esophageal Cancer: a systematic review and Meta-analysis[J]. Ann Surg Oncol. 2016;23(2):646–54.


  17. Ishizuka M, Oyama Y, Abe A, et al. Combination of platelet count and neutrophil to lymphocyte ratio is a useful predictor of postoperative survival in patients undergoing surgery for gastric cancer.[J]. J Surg Oncol. 2015;110(8):935–41.


  18. Cho IR, Park JC, Park CH, et al. Pre-treatment neutrophil to lymphocyte ratio as a prognostic marker to predict chemotherapeutic response and survival outcomes in metastatic advanced gastric cancer[J]. Gastric Cancer Official Journal of the International Gastric Cancer Association & the Japanese Gastric Cancer Association. 2014;17(4):703–10.


  19. Huang Z, Liu Y, Yang C, et al. Combined neutrophil/platelet/lymphocyte/differentiation score predicts chemosensitivity in advanced gastric cancer[J]. BMC Cancer. 2018;18(1):515.


  20. Jiang Y, Xu H, Jiang H, et al. Pretreatment neutrophil-lymphocyte count ratio may associate with gastric cancer presence[J]. Cancer Biomarkers. 2016;16(4):523–8.


  21. Lian L, Xia YY, Zhou C, et al. Application of platelet/lymphocyte and neutrophil/lymphocyte ratios in early diagnosis and prognostic prediction in patients with resectable gastric cancer[J]. Cancer Biomarkers. 2015;15(6):899–907.


  22. Pietrzyk L, Plewa Z, Denisow-Pietrzyk M, et al. Diagnostic Power of blood parameters as screening markers in gastric Cancer Patients[J]. Asian Pac J Cancer Prev Apjcp. 2016;17(9):4433.


  23. Sato H, Tsubosa Y, Kawano T. Correlation between the pretherapeutic neutrophil to lymphocyte ratio and the pathologic response to neoadjuvant chemotherapy in patients with advanced esophageal cancer [J]. World J Surg. 2012;36(3):617–22.


  24. Lin Guohe,Liu Yongcheng,Li Shuhong. Elevated neutrophil-to-lymphocyte ratio is an independent poor prognostic factor in patients with intrahepatic cholangiocarcinoma.[J].Oncotarget, 2016, 7:50963–50971.

  25. Malaspina TSDS, Thaís H, Gasparoto, Costa M, R S N, et al. Enhanced programmed death 1 (PD-1) and PD-1 ligand (PD-L1) expression in patients with actinic cheilitis and oral squamous cell carcinoma[J]. Cancer Immunol Immunotherapy. 2011;60(7):965–74.


  26. Chen XD, Mao CC, Wu RS, et al. Use of the combination of the preoperative platelet-to-lymphocyte ratio and tumor characteristics to predict peritoneal metastasis in patients with gastric cancer.[J]. PLoS ONE. 2017;12(4):e0175074.


  27. Kim EY, Lee JW, Yoo HM, et al. The platelet-to-lymphocyte ratio Versus Neutrophil-to-lymphocyte ratio: which is better as a prognostic factor in gastric Cancer?[J]. Ann Surg Oncol. 2015;22(13):4363–70.


  28. Liu X, Chen S, Liu J. Impact of systemic inflammation on gastric cancer outcomes[J]. PLoS ONE. 2017;12:e0174085.




We are also grateful to the BioStudies (public) database for including and providing the original data [28].


The authors declare that they have no competing interests.

Author information

Authors and Affiliations



C.M.Z., Y.Z., Y.W. and J.J.Y. wrote the main manuscript text. C.M.Z., Y.Z., Y.W. and J.J.Y. prepared Figs. 1, 2, 3 and 4. C.M.Z., Y.Z., Y.W. and J.J.Y. reviewed the manuscript.

Corresponding authors

Correspondence to Cheng-Mao Zhou, Jian-Jun Yang or Yu Zhu.

Ethics declarations

Ethics approval and consent to participate

The research design was reviewed and approved by the Ethics Committee at the First Affiliated Hospital of Zhengzhou University (2020-KY-378). This study was conducted according to relevant guidelines and regulations. The Ethics Committee at the First Affiliated Hospital of Zhengzhou University exempted the requirement for informed consent.

Consent for publication

Not applicable.

Competing interests


Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Zhou, CM., Wang, Y., Yang, JJ. et al. Predicting postoperative gastric cancer prognosis based on inflammatory factors and machine learning technology. BMC Med Inform Decis Mak 23, 53 (2023).
