
Interpretable machine-learning model for Predicting the Convalescent COVID-19 patients with pulmonary diffusing capacity impairment



Convalescent COVID-19 patients often have noticeable pulmonary diffusing capacity impairment (PDCI). Pulmonary diffusing capacity is a frequently used indicator of pulmonary function prognosis in COVID-19 survivors, but studies on predicting it in this population are limited. The aim of this study was to develop and validate a machine learning (ML) model for predicting PDCI in COVID-19 patients using routinely available clinical data, thus assisting clinical diagnosis.


Data were collected in a follow-up study, conducted from August to September 2021, of 221 hospitalized COVID-19 survivors in Wuhan 18 months after discharge, and included demographic characteristics and clinical examination results. The data were randomly split into a training set (80%) and a validation set (20%). Six popular machine learning models were developed to predict the pulmonary diffusing capacity of patients in the recovery stage of COVID-19. Model performance was assessed by area under the curve (AUC), Accuracy, Recall, Precision, Positive Predictive Value (PPV), Negative Predictive Value (NPV) and F1. The best-performing model was defined as the optimal model and was further used in the interpretability analysis. The MAHAKIL method was used to balance the sample distribution, and the RFECV method was used to select the feature combination most favorable to machine learning.


A total of 221 COVID-19 survivors were recruited in this study after discharge from hospitals in Wuhan. Of these participants, 117 (52.94%) were female, with a median age of 58.2 years (standard deviation (SD) = 12). After feature selection, 31 of the 37 clinical factors were selected for model construction. Among the six tested ML models, XGBoost performed best, with an AUC of 0.755 and an accuracy of 78.01% after experimental verification. The SHapley Additive exPlanations (SHAP) summary analysis showed that hemoglobin (Hb), maximal voluntary ventilation (MVV), severity of illness, platelet count (PLT), uric acid (UA) and blood urea nitrogen (BUN) were the six most important factors affecting the XGBoost model’s decision-making.


The XGBoost model reported here showed good prognostic prediction ability for PDCI in COVID-19 survivors during the recovery period. In the interpretation based on SHAP value importance, Hb and MVV contributed the most to the prediction of PDCI outcomes in these survivors.



As of November 28, 2022, the global pandemic of coronavirus disease 2019 (COVID-19) had infected more than 641 million people and claimed 6.63 million lives. Many COVID-19 survivors have suffered serious effects on multiple organs and systems [1], and the lung is the organ most susceptible to severe damage from COVID-19 [2]. Convalescent COVID-19 patients have shown particularly pronounced PDCI. Our previous study found that the incidence of impaired diffusing capacity of the lungs for carbon monoxide (DLCO) in COVID-19 patients reached 57.92% at 18 months after discharge [3]. Studies show that the pulmonary diffusing capacity of COVID-19 patients is also significantly impaired during the 1–24 month recovery phase. Studies also suggest impaired gas-blood exchange in patients discharged after admission for COVID-19 [1, 4,5,6,7], and low DLCO may result from interstitial or pulmonary vascular abnormalities caused by COVID-19 [8,9,10,11]. Therefore, there is an urgent need for a prognostic assessment and early warning system for COVID-19, and especially for a model to predict PDCI in convalescent patients. An early warning model that estimates patients’ DLCO is one possible solution. Current prediction models for COVID-19 are mainly used to identify high-risk groups in the general population [12], diagnose COVID-19 [13], and predict the progression of disease severity and mortality [14, 15]. However, prediction models for PDCI in COVID-19 patients are still lacking.

Machine learning analysis applies data mining algorithms to data of different types and formats to characterize data features more systematically and gain better insight into data trends and recognized values [16]. Machine learning is now widely used in many domains, including the analysis and prediction of energy consumption in sewage treatment and the prediction of the properties of building materials and composites [17,18,19,20]. ML is also widely used in healthcare data analysis [21]. However, the “black box” problem makes it difficult to integrate AI with clinical practice. A “black box” model does not allow clinicians to review the quality of training labels or data, which is contrary to the principles of evidence-based medicine. Therefore, the ability to correctly interpret the output of a predictive model is extremely important for generating appropriate user trust, providing insights into how to improve the model, and supporting an understanding of the modeling process, so that AI can be combined with human intelligence to fully exploit its potential and productivity advantages.

In interpretable machine learning, SHAP is a post hoc interpretation method. Its central idea is to calculate the marginal contribution of each feature to the model output and thereby explain the “black box” model at both global and local levels. The interpretability provided by SHAP is essential for enhancing the trust of healthcare professionals, because it shows the reasons behind predictions and how parameters contribute to the model [22]. Interpretation based on SHAP feature importance can help medical researchers understand the decision-making criteria of ML models. The ability to make use of large data sets and predictive models enables clinicians to diagnose, treat and predict outcomes for their patients more confidently [23].
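The idea of a marginal contribution can be made concrete with a small brute-force sketch. It computes exact Shapley values for one sample by averaging, over all feature coalitions, the change in model output when a feature is added. The function `shapley_values`, the `toy_model` and the all-zero baseline are illustrative assumptions, not the model or data of this study; TreeSHAP obtains the same quantities for tree ensembles far more efficiently than this exponential enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, enumerating every feature coalition.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                # Marginal contribution of feature i when added to this coalition.
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(v):
    # Hypothetical score with an interaction term; not the paper's XGBoost model.
    return 2 * v[0] + v[0] * v[1] - v[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that lets SHAP values explain a single prediction (local level) and, averaged over patients, rank features (global level).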

However, most ML studies have sought to improve performance by increasing model complexity, which creates uncertainty about how ML operates and makes decisions [24,25,26]. To improve the interpretability of the ML models, this study adopted feature importance estimation, the most popular approach in explainability research [27, 28]. We ranked the features according to their importance and used the TreeSHAP method proposed by Lundberg et al. to analyze the clinical features [29].

Therefore, the aim of this investigation was to develop and validate an interpretable ML model based on clinical variables to evaluate the risk of PDCI in COVID-19 patients in recovery. This study was reviewed and approved by the Ethics Committee of Hubei Provincial Hospital of Integrated Traditional Chinese and Western Medicine (2020009). All participants provided written or verbal informed consent prior to the study.


Study design and data set

From August to September 2021, a follow-up study was conducted on COVID-19 survivors 18 months after discharge from Hubei Provincial Hospital of Integrated Traditional Chinese and Western Medicine. A total of 221 survivors were contacted according to their time of discharge. Clinical data, including demographic characteristics (age, sex and body mass index (BMI)) and clinical examination indicators (lung function, chest HRCT, antibody titers and various biochemical indicators), were collected by well-trained physicians.

The investigation followed three primary steps. First, six popular machine learning models were used to predict the pulmonary diffusing capacity of patients in the convalescent stage of COVID-19. Second, the performance of the six ML models was tested with indicators including AUC, Accuracy, Recall, Precision, PPV, NPV and F1, and the best-performing model was defined as the optimal model. Finally, we used the MAHAKIL method to balance the sample distribution and the RFECV method to select the combination of features most conducive to machine learning. The overall workflow of this study is shown in Fig. 1.

Fig. 1 Flow diagram of model design

Patients and outcomes

The inclusion criteria for survivors in this study followed the COVID-19 management protocols of the World Health Organization (WHO) and the National Health Commission of the People’s Republic of China [30, 31].

The severity of COVID-19 is measured as follows:

  1. Mild cases: without symptoms and signs of severe and critical infection;

  2. Moderate cases: with fever, respiratory symptoms and pneumonia suggested by chest radiology;

  3. Severe cases:

    a. Breathing difficulties, respiratory rate ≥ 30 bpm;

    b. SpO2 ≤ 93% at rest;

    c. PaO2/FiO2 ratio ≤ 300 mmHg.

  4. Critical cases:

    a. Respiratory failure requiring mechanical ventilation;

    b.

    c. Multi-organ failure requiring intensive care.
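The three thresholds listed for severe cases can be written as a simple check. This is a minimal sketch that assumes meeting any one criterion is sufficient (the list above does not state whether the criteria are alternatives); the function name and argument names are hypothetical.

```python
def meets_severe_criteria(resp_rate_bpm, spo2_percent, pao2_fio2_mmhg):
    """Return True if at least one listed 'severe case' threshold is met.

    Assumption: the three criteria are alternatives (any one suffices).
    """
    return (
        resp_rate_bpm >= 30          # respiratory rate >= 30 bpm
        or spo2_percent <= 93        # SpO2 <= 93% at rest
        or pao2_fio2_mmhg <= 300     # PaO2/FiO2 ratio <= 300 mmHg
    )
```

For example, `meets_severe_criteria(32, 97, 400)` is true because of the respiratory rate alone, while `meets_severe_criteria(20, 97, 400)` is false.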

The primary endpoint of this study was the area under the receiver operating characteristic curve (AUROC) of the model’s prediction, while the secondary endpoints were Accuracy, Recall, Precision, PPV, NPV and F1 score of the model’s prediction.
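As an illustration of how these endpoints relate to one another, the sketch below computes them from binary labels and predicted scores. The labels, scores and the 0.5 threshold are hypothetical, and the function names are our own. Note that, for a binary classifier, PPV is the same quantity as Precision, so the two always coincide.

```python
def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class or prediction group is empty.
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = PDCI, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "recall": recall,
        "precision": precision,
        "ppv": precision,  # PPV is the same quantity as precision
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def roc_auc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct ranking.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

AUROC is computed from the scores and does not depend on the classification threshold, whereas the other six metrics change with the threshold used to turn scores into predicted classes.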

Data collection

Clinical data on the COVID-19 survivors included demographics, medical history, laboratory tests, and illness severity scores and outcomes. The demographic characteristics extracted were gender, age, height and body weight. Data were also collected on comorbidities, including heart failure, anemia and chronic obstructive pulmonary disease (COPD). The laboratory and examination variables extracted included white blood cells (WBC), Hb, PLT, N%, L%, LY#, IgM, IgG, proBNP, alanine transaminase (ALT), aspartate aminotransferase (AST), Alb, BUN, Cr, UA, HbA1c, Normal unilateral & bilateral score, forced vital capacity (FVC), forced expiratory volume in one second (FEV1), FEV1/FVC, MVV, DLCO, tubercle, ground-glass opacity (GGO), fibrosis, etc. The severity of illness was scaled from 1 to 4.

Feature selection and data preprocessing

High-dimensional data analysis and modeling poses a challenge for researchers in the field of data mining. Feature selection provides an effective solution to this problem by removing irrelevant and redundant data, which reduces computation time, improves learning accuracy and makes the learned model easier to understand [22]. Cai et al. experimentally compared several state-of-the-art feature selection methods on two high-dimensional gene-expression data sets [27] and found that recursive feature elimination (RFE) achieved higher accuracy than the other methods. We therefore chose RFECV, a cross-validated version of RFE. Cross-validation is added to select the best number of features, which in studies using plain RFE often has to be found by manual trial and error. In our study, the RFECV method was used to iteratively remove medical features that were detrimental to the model’s ability to predict pulmonary diffusing capacity, until the remaining feature set enabled the model to perform optimally. After feature selection, 31 of the 37 clinical factors were selected for model construction.
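The logic of cross-validated recursive elimination can be sketched with the standard library alone. This toy version is not the study's implementation: it swaps in a nearest-centroid classifier for the base model, uses the gap between class centroids as the feature importance, scores by accuracy rather than ROC_AUC, and runs on synthetic data. All function names are hypothetical. In practice, scikit-learn's `sklearn.feature_selection.RFECV` provides this procedure for estimators that expose feature importances.

```python
import random
from statistics import mean


def fit_centroids(X, y, feats):
    """Per-class mean of each kept feature (a nearest-centroid classifier)."""
    return {
        label: {f: mean(row[f] for row, t in zip(X, y) if t == label) for f in feats}
        for label in (0, 1)
    }


def accuracy(centroids, X, y, feats):
    def predict(row):
        dist = {lab: sum((row[f] - c[f]) ** 2 for f in feats)
                for lab, c in centroids.items()}
        return min(dist, key=dist.get)
    return mean(1.0 if predict(row) == t else 0.0 for row, t in zip(X, y))


def elimination_path(X, y, n_features):
    """Drop the feature with the smallest class-centroid gap, one at a time."""
    feats = list(range(n_features))
    path = [list(feats)]
    while len(feats) > 1:
        cents = fit_centroids(X, y, feats)
        feats.remove(min(feats, key=lambda f: abs(cents[1][f] - cents[0][f])))
        path.append(list(feats))
    return path  # path[k] keeps n_features - k features


def rfecv(X, y, n_folds=5, seed=0):
    n_features = len(X[0])
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    scores = {k: [] for k in range(1, n_features + 1)}
    for fold in range(n_folds):
        test = set(idx[fold::n_folds])
        X_tr = [X[i] for i in idx if i not in test]
        y_tr = [y[i] for i in idx if i not in test]
        X_te = [X[i] for i in test]
        y_te = [y[i] for i in test]
        # Score every feature-set size on the held-out fold.
        for feats in elimination_path(X_tr, y_tr, n_features):
            cents = fit_centroids(X_tr, y_tr, feats)
            scores[len(feats)].append(accuracy(cents, X_te, y_te, feats))
    # Best mean CV score; ties go to the smaller feature set.
    best_k = max(scores, key=lambda k: (mean(scores[k]), -k))
    return best_k, elimination_path(X, y, n_features)[n_features - best_k]


# Synthetic data: feature 0 separates the classes, features 1 and 2 are noise.
rng = random.Random(42)
y = [i % 2 for i in range(40)]
X = [[rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
     for label in y]
best_k, selected = rfecv(X, y)
```

The number of features is chosen by the cross-validated score rather than fixed in advance, which is the step that plain RFE leaves to manual trial and error.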

Model development and validation

The data were randomly split into a training set (80%) and a validation set (20%). First, up-sampling by the MAHAKIL method was used to balance the number of samples of the different classes in the training set. Second, the RFECV method was used to choose the optimal combination of features. Then, the selected features from the balanced training set were fed into the machine learning model for training, and the grid-search method was used to find a valid combination of parameters during training. Finally, the trained ML model was used to predict and evaluate the held-out test data, restricted to the same optimal combination of features. In addition, we pooled the overall data, ranked the importance of features using XGBoost as the base model with the TreeExplainer method, and used the calculation principle of SHAP interaction values to further explain why these features were considered significant.
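The order of operations matters here: the split comes first, and balancing is applied to the training portion only, so that the held-out data keep their natural class mix. The sketch below illustrates that order under stated assumptions. It uses plain random duplication of minority samples as a stand-in for MAHAKIL (which synthesizes new samples rather than duplicating existing ones), and the label vector is hypothetical, constructed from the reported cohort size of 221 and PDCI incidence of 57.92% (about 128 patients).

```python
import random


def split_then_balance(labels, train_fraction=0.8, seed=0):
    """Random split, then oversample the minority class in the training set only.

    Random duplication is a simple stand-in for MAHAKIL, not the method itself.
    """
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    cut = round(train_fraction * len(idx))
    train, valid = idx[:cut], idx[cut:]

    # Group training indices by class label.
    by_class = {}
    for i in train:
        by_class.setdefault(labels[i], []).append(i)

    # Top up every class to the size of the largest one.
    target = max(len(members) for members in by_class.values())
    balanced = list(train)
    for members in by_class.values():
        balanced += [rng.choice(members) for _ in range(target - len(members))]
    return balanced, valid


# Hypothetical labels: 128 impaired (1) and 93 normal (0) out of 221.
labels = [1] * 128 + [0] * 93
balanced_train, valid = split_then_balance(labels)
```

Because only training indices are duplicated, no validation patient can leak into the balanced training set.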

The change in ROC_AUC of the XGBoost model with the number of features is shown in Fig. 2. After feature selection via the RFECV method, 31 features were selected as the optimal combination.

Fig. 2 Change of ROC_AUC of the XGBoost model and number of features

Model interpretability

Opening the black box of ML is extremely important for improving the compliance and transparency of the ML decision-making process for healthcare workers [32]. Therefore, we took XGBoost, which performed best on the AUC evaluation index, as the base model, used the optimal combined features after feature selection together with the labels as input, and used the TreeExplainer method to rank the features by SHAP value. The SHAP value summary diagram of the medical characteristics is shown in Fig. 4.
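One common convention for turning per-patient SHAP values into a global ranking is to sort features by their mean absolute SHAP value. The sketch below assumes that convention; the article does not state the exact ordering rule, and the SHAP values shown are hypothetical numbers for three features and three patients.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean absolute SHAP value across patients (descending)."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per patient, one column per feature.
feature_names = ["Hb", "MVV", "PLT"]
shap_rows = [
    [-0.30, 0.10, 0.05],
    [0.25, -0.20, 0.02],
    [-0.35, 0.15, -0.01],
]
ranking = rank_by_mean_abs_shap(shap_rows, feature_names)
```

Taking the absolute value first means a feature that pushes some patients strongly toward PDCI and others strongly away still ranks as important, rather than averaging out to zero.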

Statistical analysis

Count data were described as the number of cases (%), and the Pearson chi-square test was used for inter-group comparison. Measurement data conforming to the normal distribution were expressed as the median and interquartile range (M (P25, P75)) and compared by t-test or ANOVA, while the Mann-Whitney U test was used between groups. After feature selection and data preprocessing, six popular machine learning models were developed to predict PDCI in survivors recovering from COVID-19. The performance of each model was assessed by AUC, Accuracy, Precision, Recall, PPV, NPV and F1. Finally, the model was explained by the TreeExplainer method.

SPSS 25.0 (IBM, Armonk, New York, USA) was applied for statistical analysis. All statistical tests were two-sided, and P < 0.05 was considered statistically significant.


Clinical characteristics

A total of 221 COVID-19 survivors were included in this study (mild cases, n = 93; moderate cases, n = 58; severe cases, n = 54; critical cases, n = 16). The median age of the subjects was 58.2 years [standard deviation (SD) = 12]. Among them, 104 survivors (47.06%) were male and 117 (52.94%) were female, with an average BMI of 24.62 [SD = 3.5]. The incidence of PDCI in COVID-19 survivors was 57.92%, as shown in Table 1 (from J Infect. 2022 Feb; 84(2):e16-e18, Table 1, PMID: 34963637).

Table 1 Clinical characteristics of survivors with impaired and normal DLCO

Model development and validation

After feature selection, we used the 31 selected factors for model construction and tested six ML models. Compared with GBDT (AUC 0.67), KNN (AUC 0.63), RandomForest (AUC 0.70), SVC (AUC 0.70) and MLP (AUC 0.69), XGBoost (AUC 0.75) had better DLCO predictive ability for COVID-19 survivors. Table 2 shows that XGBoost performed best in AUC, Accuracy, Recall, Precision, PPV, NPV and F1. After experimental verification, the model had an AUC of 0.755 and an Accuracy of 78.01%. The SHAP summary analysis showed that Hb, MVV, severity of illness, PLT, UA and BUN were the six key factors affecting the decision-making of the XGBoost model.

Table 2 Experimental results of different classifiers
Fig. 3 Cosine similarity of XGBoost model

The cosine similarity of the XGBoost model is shown in Fig. 3. The green dots are samples correctly predicted as 0 (0 represents healthy people), and the red dots are samples incorrectly predicted as 0 (patients classified as healthy). The misjudged samples have features quite similar to those of the true class 0 samples (the vast majority of similarities are greater than 0.8). In other words, one important reason for misjudgment is that these patients’ symptoms or some of their measured indicators closely resemble those of normal people.
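For reference, cosine similarity between two feature vectors can be computed as below. This is a minimal sketch: the article does not specify which vectors were compared or whether features were standardized first, so the two vectors here (a misclassified patient and a correctly classified class 0 sample) are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


# Hypothetical standardized feature vectors, for illustration only.
misjudged_patient = [0.4, -0.2, 1.1, 0.3]
true_normal_sample = [0.5, -0.1, 0.9, 0.2]
similarity = cosine_similarity(misjudged_patient, true_normal_sample)
```

A value close to 1 means the two feature profiles point in nearly the same direction. Because features with large raw magnitudes dominate the dot product, standardizing features before the comparison is usually advisable.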

Model interpretability

Figure 4 shows the SHAP summary diagram, which ranks the factors according to their importance to the predicted incidence in the validation cohort. The SHAP summary analysis revealed that Hb, MVV, severity of illness, PLT, UA and BUN were the six most pivotal factors affecting the XGBoost model decision. Figure 5 shows the correlation between these six factors and the prediction of PDCI in COVID-19 survivors. SHAP values above zero for these six characteristics indicate an increased risk of PDCI. Hb and MVV were negatively correlated with the predicted risk of PDCI, while severity of illness, PLT, UA and BUN were positively correlated.

Fig. 4 SHAP value summary diagram of medical characteristics

The SHAP values in the validation set were used to evaluate the feature importance of the XGBoost classifier. Each dot represents one patient, and dots are stacked vertically to show density. Color represents the value of each feature, with dark colors representing higher values and light colors lower values. The X-axis of the diagram represents the SHAP value. A positive SHAP value indicates a positive contribution to the model’s prediction and a higher probability of PDCI, and vice versa.

Fig. 5 Top 6 clinical features in SHAP values of XGBoost. A Hb, B MVV, C severity of illness, D PLT, E UA, F BUN. Values are plotted with a scatter plot. The model’s predicted probability of PDCI increases when the SHAP value of a feature is > 0, and its predicted probability of no PDCI increases when the SHAP value is < 0. MVV: maximal voluntary ventilation; Hb: hemoglobin; PLT: platelets; BUN: blood urea nitrogen; UA: uric acid

Finally, we plotted the XGBoost decision-making process against the SHAP values, as shown in Fig. 6. The gray vertical line in the middle of the decision graph marks the model’s base value, and each colored line is a prediction, showing whether each feature moves the output above or below the average prediction. The feature values next to the prediction line can be taken as a reference. Starting at the bottom of the graph, the prediction line shows how the SHAP values accumulate from the base value to the model’s final score at the top of the graph. The blue broken line is the decision process for a sample predicted as normal, and the red broken line is the decision process for a sample predicted as abnormal.
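The path traced by one prediction line is simply a running sum of that patient's SHAP values, starting from the base value. The sketch below reproduces that arithmetic for a single hypothetical patient. The base value and contributions are made-up numbers, and ordering by this patient's own absolute SHAP values is a simplifying assumption (plotting tools typically order features by importance across all plotted patients).

```python
from itertools import accumulate


def decision_path(base_value, shap_by_feature):
    """Cumulative model output, adding features from least to most influential."""
    ordered = sorted(shap_by_feature.items(), key=lambda item: abs(item[1]))
    names = [name for name, _ in ordered]
    # totals[0] is the base value; totals[-1] is the final model output.
    totals = list(accumulate((value for _, value in ordered), initial=base_value))
    return names, totals


# Hypothetical base value and per-feature SHAP values for one patient.
base_value = 0.2
contributions = {"Hb": -0.30, "MVV": 0.15, "PLT": 0.05}
names, totals = decision_path(base_value, contributions)
```

Here the line would start at 0.2 at the bottom of the graph, move right for PLT and MVV, and then move left for Hb, ending at the final score. For tree models explained in their default raw-output space, this score is typically on the log-odds scale rather than a probability.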

Fig. 6 Output decision chain of the XGBoost model


The aim of this ML-based modeling study was to develop a valid, stable and interpretable model for predicting the incidence of PDCI in COVID-19 survivors in the recovery phase. The findings showed that the XGBoost model was the most reliable and accurate among all the tested models, with an AUC of 0.755 and an Accuracy of 78.01%. We also found that Hb, MVV, severity of illness, PLT, UA and BUN were the six key factors influencing the XGBoost model’s decision-making. Overall, our study demonstrated that it is possible to predict the incidence of PDCI in COVID-19 survivors using routinely collected clinical data.

As the global COVID-19 pandemic continues and the number of patients recovering from the disease increases, studies have found that shortness of breath and dyspnea, attributable to PDCI, are the most common sequelae among those who have survived hospitalization with COVID-19 [7]. Therefore, DLCO-based pulmonary function testing can be regarded as a useful means of identifying those at risk of pulmonary sequelae [33]. However, previous studies on COVID-19 have mainly focused on the analysis of risk factors and the prediction of mortality in mild-to-moderate cases [34, 35], and none has provided a prediction model for PDCI in COVID-19 survivors. Consequently, it is essential to develop and validate risk prediction models to evaluate the pulmonary function status of COVID-19 survivors.

Apart from using a machine learning model to predict the pulmonary diffusing capacity of patients recovering from COVID-19, our investigation also carried out an interpretability analysis of the model. Because the internal logic and operating mechanisms of ML models are concealed from users, healthcare workers face challenges in applying machine learning systems in practice. In this study, we used interpretation based on SHAP feature importance to help medical researchers understand the decision-making criteria of ML models [36,37,38], enhance medical professionals’ trust in ML, and reconcile contradictions or inconsistencies between machine-derived knowledge and human prior knowledge. We adopted the TreeSHAP method [29], an effective way of evaluating feature importance in tree models based on the Shapley value from classical game theory. The SHAP summary analysis showed the six most important factors of the XGBoost model. Among them, MVV is the most important indicator of lung reserve function and is closely related to activity endurance. The most serious sequelae in COVID-19 patients are shortness of breath and dyspnea after activity, together with significantly decreased pulmonary function reserve [39]. This study confirmed that MVV was positively correlated with the pulmonary diffusing function of COVID-19 survivors. The MVV of COVID-19 patients with normal diffusing function was significantly higher than that of patients with impaired pulmonary diffusing capacity, which underlines the importance of strengthening pulmonary rehabilitation exercises and increasing pulmonary function reserve in COVID-19 patients during rehabilitation.

Studies have found that Hb, a parameter closely related to organ perfusion and the alveolar ventilation-to-blood flow ratio, contributes greatly to the prediction of pulmonary function outcomes in patients after recovery from COVID-19. In this study, after correction for Hb, the Hb of COVID-19 patients with decreased pulmonary diffusing function was in the low-normal range or indicated anemia, suggesting a long-term imbalance of the pulmonary perfusion-to-ventilation ratio in COVID-19 patients, which deserves due attention.

In addition, PLT and severity of illness were negatively correlated with pulmonary diffusing function. The more severe the disease, the higher the PLT value within the normal range, and vice versa. Studies have confirmed that PLT activation is involved in the formation of inflammatory microvascular thrombosis in patients with COVID-19 and is closely related to respiratory failure in these patients [10, 40,41,42,43,44]. Our study found that, one and a half years later, PLT was still closely related to PDCI in COVID-19 survivors. Consistent with our previous study [45], these observations suggest that clinically obtained MVV, PLT, Hb and severity of illness are the key factors for using the XGBoost model to predict pulmonary function status in COVID-19 survivors. Besides these indicators that directly affect pulmonary function, the SHAP summary analysis showed that increased UA and BUN may be correlated with a growing risk of declining pulmonary diffusing capacity in COVID-19 patients.

Currently, the combined use of high-frequency biological data streams and artificial intelligence (AI) shows promise for predicting the diffusing capacity of the lungs, which makes early identification of pulmonary capacity recovery in COVID-19 patients possible [46,47,48]. However, this study has some limitations. First, no causal connection between the variables and pulmonary diffusing capacity can be drawn from the modeling and retrospective design of this study. Second, the predictive performance of the current models may differ in populations whose racial and ethnic characteristics differ from those of the study subjects. Moreover, it is difficult to obtain more relevant data because of the privacy of COVID-19 patients, so our prediction model lacks proper external validation, which affects the credibility of the XGBoost model. Finally, although the findings showed that the model had learned the medical rules in the data, more data are urgently needed in the future to improve the model’s performance.


This article analyzes the pulmonary capacity and other clinical indicators of COVID-19 survivors. Six popular machine learning models were used to predict the pulmonary diffusing capacity of COVID-19 patients at the recovery stage, among which the XGBoost model showed favorable predictive ability. XGBoost uses a second-order Taylor expansion of the loss function, which allows it to better fit complex nonlinear data sets by using second-order information. In addition, XGBoost explicitly adds regularization terms to the objective function to reduce variance and prevent overfitting. In the interpretability analysis, Hb and MVV contributed most to the prediction of PDCI outcomes in COVID-19 survivors in the convalescence period. Interpretation methods based on SHAP value importance can help doctors become familiar with the basic concepts and indicators of machine learning and their potential applications in clinical practice, and more readily accept the growing integration of AI and machine learning with modern medicine.
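For readers unfamiliar with the second-order expansion and regularization mentioned above, the objective minimized when the t-th tree is added can be written as follows. The notation is the one commonly used to describe XGBoost and is an assumption here, since the article itself defines no symbols: l is the loss, f_t the new tree, T its number of leaves, w its vector of leaf weights, and γ and λ the regularization strengths.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \right] + \Omega(f_t),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\, \lambda\, \lVert w \rVert^{2},

g_i = \frac{\partial\, l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^{2} l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \left(\hat{y}_i^{(t-1)}\right)^{2}}.
```

The gradient g_i and curvature h_i are evaluated at the prediction from the previous round, and the penalty Ω discourages trees with many leaves or large leaf weights, which is how the regularization limits overfitting.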

Availability of data and materials

The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.


  1. Huang L, Li X, Gu X, Zhang H, Ren L, Guo L, Liu M, Wang Y, Cui D, Wang Y, et al. Health outcomes in people 2 years after surviving hospitalisation with COVID-19: a longitudinal cohort study. Lancet Respir Med. 2022;10(9):863–76.


  2. Korompoki E, Gavriatopoulou M, Hicklen RS, Ntanasis-Stathopoulos I, Kastritis E, Fotiou D, Stamatelopoulos K, Terpos E, Kotanidou A, Hagberg CA, et al. Epidemiology and organ specific sequelae of post-acute COVID19: a narrative review. J Infect. 2021;83(1):1–16.


  3. Xu B, Ma FQ, He C, Wu ZQ, Fan CY, Mao HR, Zhang JX, Yang M, Hu ZW. Incidence and affecting factors of pulmonary diffusing capacity impairment with COVID-19 survivors 18 months after discharge in Wuhan, China. J Infect. 2022;84(2):e16–8.

  4. Huang L, Yao Q, Gu X, Wang Q, Ren L, Wang Y, Hu P, Guo L, Liu M, Xu J, et al. 1-year outcomes in hospital survivors with COVID-19: a longitudinal cohort study. Lancet. 2021;398(10302):747–58.


  5. Shah AS, Wong AW, Hague CJ, Murphy DT, Johnston JC, Ryerson CJ, Carlsten C. A prospective study of 12-week respiratory outcomes in COVID-19-related hospitalisations. Thorax. 2021;76(4):402–4.


  6. Huang Y, Tan C, Wu J, Chen M, Wang Z, Luo L, Zhou X, Liu X, Huang X, Yuan S, et al. Impact of coronavirus disease 2019 on pulmonary function in early convalescence phase. Respir Res. 2020;21(1):163.


  7. Wu X, Liu X, Zhou Y, Yu H, Li R, Zhan Q, Ni F, Fang S, Lu Y, Ding X, et al. 3-month, 6-month, 9-month, and 12-month respiratory outcomes in patients following COVID-19-related hospitalisation: a prospective study. Lancet Respir Med. 2021;9(7):747–54.


  8. Huang C, Huang L, Wang Y, Li X, Ren L, Gu X, Kang L, Guo L, Liu M, Zhou X, et al. 6-month consequences of COVID-19 in patients discharged from hospital: a cohort study. Lancet. 2021;397(10270):220–32.


  9. Zhao YM, Shang YM, Song WB, Li QQ, Xie H, Xu QF, Jia JL, Li LM, Mao HL, Zhou XM, et al. Follow-up study of the pulmonary function and related physiological characteristics of COVID-19 survivors three months after recovery. EClinicalMedicine. 2020;25:100463.


  10. Lang M, Som A, Mendoza DP, Flores EJ, Reid N, Carey D, Li MD, Witkin A, Rodriguez-Lopez JM, Shepard JO, et al. Hypoxaemia related to COVID-19: vascular and perfusion abnormalities on dual-energy CT. Lancet Infect Dis. 2020;20(12):1365–6.


  11. Hanidziar D, Robson SC. Hyperoxia and modulation of pulmonary vascular and immune responses in COVID-19. Am J Physiol Lung Cell Mol Physiol. 2021;320(1):L12–6.


  12. Carr E, Bendayan R, Bean D, Stammers M, Wang W, Zhang H, Searle T, Kraljevic Z, Shek A, Phan HTT, et al. Evaluation and improvement of the National Early warning score (NEWS2) for COVID-19: a multi-hospital study. BMC Med. 2021;19(1):23.


  13. Jin C, Chen W, Cao Y, Xu Z, Tan Z, Zhang X, Deng L, Zheng C, Zhou J, Shi H, et al. Development and evaluation of an artificial intelligence system for COVID-19 diagnosis. Nat Commun. 2020;11(1):5088.


  14. Abdulaal A, Patel A, Charani E, Denny S, Mughal N, Moore L. Prognostic modeling of COVID-19 using artificial intelligence in the United Kingdom: model development and validation. J Med Internet Res. 2020;22(8).

  15. Pan P, Li YC, Xiao YJ, Han BC, Su LX, Su ML, Li YS, Zhang SQ, Jiang DP, Chen X, et al. Prognostic assessment of COVID-19 in the intensive care unit by machine learning methods: model development and validation. J Med Internet Res. 2020;22(11).

  16. Zampieri FG, Salluh JIF, Azevedo LCP, Kahn JM, Damiani LP, Borges LP, Viana WN, Costa R, Correa TD, Araya DES, et al. ICU staffing feature phenotypes and their relationship with patients’ outcomes: an unsupervised machine learning analysis. Intensive Care Med. 2019;45(11):1599–607.


  17. Bagherzadeh F, Mehrani M-J, Basirifard M, Roostaei J. Comparative study on total nitrogen prediction in wastewater treatment plant and effect of various feature selection methods on machine learning algorithms performance. J Water Process Eng. 2021;41.

  18. Bagherzadeh F, Shafighfard T. Ensemble machine learning approach for evaluating the material characterization of carbon nanotube-reinforced cementitious composites. Case Stud Constr Mater. 2022;17.

  19. Shafighfard T, Bagherzadeh F, Rizi RA, Yoo D-Y. Data-driven compressive strength prediction of steel fiber reinforced concrete (SFRC) subjected to elevated temperatures using stacked machine learning algorithms. J Mater Res Technol. 2022;21:3777–94.


  20. Bagherzadeh F, Nouri AS, Mehrani M-J, Thennadil S. Prediction of energy consumption and evaluation of affecting factors in a full-scale WWTP using a machine learning approach. Process Saf Environ Prot. 2021;154:458–66.


  21. Wu Y, Rao K, Liu J, Han C, Gong L, Chong Y, Liu Z, Xu X. Machine learning algorithms for the prediction of Central Lymph Node Metastasis in patients with papillary thyroid Cancer. Front Endocrinol (Lausanne). 2020;11:577537.


  22. Ploug T, Holm S. The four dimensions of contestable AI diagnostics - A patient-centric approach to explainable AI. Artif Intell Med 2020, 107.

  23. Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, Asadi H. eDoctor: machine learning and the future of medicine. J Intern Med. 2018;284(6):603–19.

  24. Roscher R, Bohn B, Duarte MF, Garcke J. Explainable machine learning for scientific insights and discoveries. IEEE Access. 2020;8:42200–16.

  25. Reddy S. Explainability and artificial intelligence in medicine. Lancet Digit Health 2022, 4(4).

  26. McCoy LG, Brenna CTA, Chen SS, Vold K, Das S. Believing in black boxes: machine learning for healthcare does not need explainability to be evidence-based. J Clin Epidemiol. 2022;142:252–7.

  27. Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: a new perspective. Neurocomputing. 2018;300:70–9.

  28. Lundberg SM, Lee S-I. A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems 2017.

  29. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, Katz R, Himmelfarb J, Bansal N, Lee SI. From local explanations to Global understanding with explainable AI for trees. Nat Mach Intell. 2020;2(1):56–67.

  30. Liu X, Zhou H, Zhou Y, Wu X, Zhao Y, Lu Y, Tan W, Yuan M, Ding X, Zou J, et al. Temporal radiographic changes in COVID-19 patients: relationship to disease severity and viral clearance. Sci Rep. 2020;10(1):10263.

  31. Liu X, Zhou H, Zhou Y, Wu X, Zhao Y, Lu Y, Tan W, Yuan M, Ding X, Zou J, et al. Risk factors associated with disease severity and length of hospital stay in COVID-19 patients. J Infect. 2020;81(1):e95–7.

  32. Hu C, Li L, Li Y, Wang F, Hu B, Peng Z. Explainable machine-learning model for prediction of In-Hospital mortality in septic patients requiring Intensive Care Unit Readmission. Infect Dis Ther. 2022;11(4):1695–713.

  33. Blanco JR, Cobos-Ceballos MJ, Navarro F, Sanjoaquin I, Arnaiz de Las Revillas F, Bernal E, Buzon-Martin L, Viribay M, Romero L, Espejo-Perez S, et al. Pulmonary long-term consequences of COVID-19 infections after hospital discharge. Clin Microbiol Infect. 2021;27(6):892–6.

  34. Cen Y, Chen X, Shen Y, Zhang XH, Lei Y, Xu C, Jiang WR, Xu HT, Chen Y, Zhu J, et al. Risk factors for disease progression in patients with mild to moderate coronavirus disease 2019-a multi-centre observational study. Clin Microbiol Infect. 2020;26(9):1242–7.

  35. Zhou F, Yu T, Du R, Fan G, Liu Y, Liu Z, Xiang J, Wang Y, Song B, Gu X, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. Lancet. 2020;395(10229):1054–62.

  36. Price WN. Big data and black-box medical algorithms. Sci Transl Med 2018, 10(471).

  37. The Lancet Respiratory Medicine. Opening the black box of machine learning. Lancet Respir Med. 2018;6(11):801.

  38. Musolf AM, Holzinger ER, Malley JD, Bailey-Wilson JE. What makes a good prediction? Feature importance and beginning to open the black box of machine learning in genetics. Hum Genet. 2022;141(9):1515–28.

  39. Arnold DT, Hamilton FW, Milne A, Morley AJ, Viner J, Attwood M, Noel A, Gunning S, Hatrick J, Hamilton S, et al. Patient outcomes after hospitalisation with COVID-19 and implications for follow-up: results from a prospective UK cohort. Thorax. 2021;76(4):399–401.

  40. Patel BV, Arachchillage DJ, Ridge CA, Bianchi P, Doyle JF, Garfield B, Ledot S, Morgan C, Passariello M, Price S, et al. Pulmonary angiopathy in severe COVID-19: physiologic, imaging, and hematologic observations. Am J Respir Crit Care Med. 2020;202(5):690–9.

  41. Taus F, Salvagno G, Cane S, Fava C, Mazzaferri F, Carrara E, Petrova V, Barouni RM, Dima F, Dalbeni A, et al. Platelets promote Thromboinflammation in SARS-CoV-2 Pneumonia. Arterioscler Thromb Vasc Biol. 2020;40(12):2975–89.

  42. Chao Y, Rebetz J, Blackberg A, Hovold G, Sunnerhagen T, Rasmussen M, Semple JW, Shannon O. Distinct phenotypes of platelet, monocyte, and neutrophil activation occur during the acute and convalescent phase of COVID-19. Platelets. 2021;32(8):1092–102.

  43. Nicolai L, Leunig A, Brambs S, Kaiser R, Weinberger T, Weigand M, Muenchhoff M, Hellmuth JC, Ledderose S, Schulz H, et al. Immunothrombotic Dysregulation in COVID-19 pneumonia is Associated with respiratory failure and Coagulopathy. Circulation. 2020;142(12):1176–89.

  44. Manne BK, Denorme F, Middleton EA, Portier I, Rowley JW, Stubben C, Petrey AC, Tolley ND, Guo L, Cody M, et al. Platelet gene expression and function in patients with COVID-19. Blood. 2020;136(11):1317–29.

  45. Xu B, Ma FQ, He C, Wu ZQ, Fan CY, Mao HR, Zhang JX, Yang M, Hu ZW. Incidence and affecting factors of pulmonary diffusing capacity impairment with COVID-19 survivors 18 months after discharge in Wuhan, China. J Infect 2022, 84(2).

  46. Topalovic M, Das N, Janssens W. Artificial intelligence for pulmonary function test interpretation reply. Eur Respir J 2019, 53(6).

  47. Topalovic M, Das N, Burgel PR, Daenen M, Derom E, Haenebalcke C, Janssen R, Kerstjens HAM, Liistro G, Louis R et al. Artificial intelligence outperforms pulmonologists in the interpretation of pulmonary function tests. Eur Respir J 2019, 53(4).

  48. Mekov E, Miravitlles M, Petkov R. Artificial intelligence and machine learning in respiratory medicine. Expert Rev Resp Med. 2020;14(6):559–64.



The authors thank all members of our team for their valuable discussion.


This study was funded by the project "Clinical Study on Prevention and Treatment of COVID-19 by Integrated Traditional Chinese and Western Medicine" (2020YFC0841600) under the National Science and Technology Emergency Project. The funding body played no role in the design of the study, in the collection, analysis, or interpretation of data, or in the writing of the manuscript.

Author information


BX and JXZ designed the study and had full access to all the data used in the study; CYF and CH were responsible for data collection; FQM and HRY conducted the data analysis; ZWH, HRM and YQ drafted the manuscript. All authors revised the manuscript and agreed to its publication before submission. FQM, CH, and HRY contributed equally to this work.

Corresponding authors

Correspondence to Ji-xian Zhang or Bo Xu.

Ethics declarations

Competing interests


Ethical approval and consent to participate

This study was reviewed and approved by the Ethics Committee of Hubei Provincial Hospital of Integrated Traditional Chinese and Western Medicine (2020009). All participants provided written or verbal informed consent prior to the study. All methods were carried out in accordance with relevant guidelines and regulations.

Consent for publication

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Ma, Fq., He, C., Yang, Hr. et al. Interpretable machine-learning model for Predicting the Convalescent COVID-19 patients with pulmonary diffusing capacity impairment. BMC Med Inform Decis Mak 23, 169 (2023).


  • Interpretable artificial intelligence
  • Machine learning
  • COVID-19
  • Pulmonary diffusing capacity impairment
  • Maximal voluntary ventilation