Automated segmentation and diagnosis of pneumothorax on chest X-rays with fully convolutional multi-scale ScSE-DenseNet: a retrospective study

Background Pneumothorax (PTX) may cause a life-threatening medical emergency with cardio-respiratory collapse that requires immediate intervention and rapid treatment. The screening and diagnosis of pneumothorax usually rely on chest radiographs. However, pneumothoraces in chest X-rays may be very subtle, highly variable in shape and overlapped with the ribs or clavicles, and are therefore often difficult to identify. Our objective was to create a large chest X-ray dataset for pneumothorax with pixel-level annotation and to train an automatic segmentation and diagnosis framework to assist radiologists in identifying pneumothorax accurately and promptly. Methods In this study, an end-to-end deep learning framework is proposed for the segmentation and diagnosis of pneumothorax on chest X-rays, which incorporates a fully convolutional DenseNet (FC-DenseNet) with a multi-scale module and spatial and channel squeeze and excitation (scSE) modules. To further improve the precision of boundary segmentation, we propose a spatially weighted cross-entropy loss function that penalizes the target, background and contour pixels with different weights. Results This retrospective study was conducted on a total of 11,051 eligible front-view chest X-ray images (5566 cases of PTX and 5485 cases of Non-PTX).
The experimental results show that the proposed algorithm outperforms five state-of-the-art segmentation algorithms in terms of mean pixel-wise accuracy (MPA) with 0.93 ± 0.13 and dice similarity coefficient (DSC) with 0.92 ± 0.14, and achieves competitive diagnostic performance with an accuracy of 93.45% and an F_1-score of 92.97%. Conclusion This framework provides substantial improvements for the automatic segmentation and diagnosis of pneumothorax and is expected to become a clinical tool that helps radiologists identify pneumothorax on chest X-rays.


Background
Pneumothorax (PTX) is an acute pulmonary disease with respiratory disorder caused by the abnormal accumulation of air in the pleural space between the chest wall and the lung [1,2]. According to a previous study in the United States, PTX can occur in a variety of clinical settings and in individuals of any age, with a 35% recurrence rate in men [3]. PTX can cause pleuritic chest discomfort and dyspnea, and in severe cases may precipitate a life-threatening medical emergency with cardio-respiratory collapse, requiring immediate intervention and subsequent prevention [4].
The screening and diagnosis of pneumothorax usually rely on chest radiographs, which are formed by the differences in the absorption of X-ray ionizing radiation by different tissues in the chest [5]. Since chest radiographs project all three-dimensional anatomical clues of the chest onto a two-dimensional plane, the pneumothoraces in chest X-rays may be very subtle and overlapped with the ribs or clavicles. The identification of pneumothorax in chest X-rays is difficult and depends largely on the experience of the radiologist. The failure of radiologists to detect PTX at early examination is one of the leading causes of PTX death [2]. There is therefore a strong demand for an automatic algorithm that reduces missed diagnoses and helps radiologists identify PTX accurately and promptly.
Conventional PTX detection methods mainly consider local and global texture cues [6], features from the phase stretch transform (PST) [2], and local binary patterns (LBP), and then employ a support vector machine (SVM) to classify the presence and absence of pneumothorax [7]. These conventional algorithms rely on handcrafted features and require prior knowledge for feature engineering; they work well for targets whose shape and appearance can be modeled consistently and are thus suited to the detection of regular organs and lesions. However, their modeling capability is very limited when the shape and size of PTX vary greatly and its characteristics are not obvious.
Recently, deep learning-based technologies, especially convolutional neural networks (CNNs), have shown great potential in medical image analysis [8,9]. Several deep CNN algorithms have been proposed for the identification of PTX with image-level annotation. Wang et al. [10] released a large-scale chest X-ray dataset with image-level annotation and proposed a deep CNN for the classification of 14 abnormalities (including PTX) on chest X-rays; this study is a milestone of PTX detection in the era of deep learning. Later, the studies [11-14] proposed more accurate classification networks for the 14 kinds of chest diseases, and the studies [4,15] proposed methods that detect only PTX. Although these deep learning-based methods have demonstrated effectiveness in PTX identification with image-level annotation, image-level annotation makes the localization of pneumothorax on chest X-rays insufficiently precise. Since segmentation of the PTX region can help determine large PTX for an automatic triaging scheme [16], accurate segmentation of PTX with pixel-level annotation is crucial for the accurate localization of pneumothorax. However, due to the difficulty of obtaining pixel-level annotations of PTX, there are few studies on PTX segmentation.
Lesion segmentation in medical images is a fundamental tool for supporting lesion analysis and treatment planning. An automatic and accurate segmentation tool can help radiologists with quantitative image analysis and support precise diagnosis. In this study, we create a large chest X-ray dataset for pneumothorax with pixel-level annotation by radiologists and explore an automatic segmentation algorithm for PTX identification using fully convolutional networks (FCNs) [17]. FCNs were introduced as a natural extension of CNNs that formulates semantic segmentation as a pixel classification problem. FCNs and their extensions, such as U-Net [18], have achieved remarkable performance on tasks such as the segmentation of the lungs, clavicles and heart in chest radiographs [19], brain tumor segmentation [20], and estimation of the cardiothoracic ratio [21]. However, the PTX areas in chest X-rays may be very subtle and varied in shape, overlapping with the ribs or clavicles, and the PTX segmentation task therefore suffers from pixel imbalance and multi-scale problems.
In this study, we propose a fully convolutional multi-scale scSE-DenseNet framework for PTX segmentation and diagnosis with pixel-level annotation on chest X-rays. The framework consists of three modules: (1) a fully convolutional DenseNet (FC-DenseNet), which is parameter efficient and serves as the backbone of the framework; (2) a multi-scale module that captures the variability of viewpoint-related objects and learns the relationships across image structures at multiple scales; (3) a scSE module, which is incorporated into each convolution layer in the dense blocks of FC-DenseNet and can adaptively recalibrate the feature maps to elucidate useful features while suppressing non-useful features without adding many parameters. To tackle the pixel imbalance problem [22], we also introduce a spatially weighted cross-entropy loss (SW-CEL) function to penalize the target, background and boundary pixels with different weights. The proposed method can not only reduce the impact of class imbalance, but also better delineate the boundary areas to segment and diagnose pneumothorax accurately. This study extends our preliminary work [23] by redesigning the automatic segmentation and diagnosis framework for PTX, adding extensive experiments to evaluate the automatic segmentation and diagnosis of PTX, and discussing the effects of different growth rates and loss functions on PTX segmentation.

Methods
In this section, an end-to-end deep learning framework is proposed for PTX segmentation, using FC-DenseNet as a backbone with embedded multi-scale and scSE modules; a simple threshold-based classifier is then added to diagnose PTX from the predicted segmentation maps, as shown in Fig. 1.

Fully convolutional DenseNet for PTX segmentation
A typical deep learning-based segmentation architecture is composed of two parts: a down-sampling path (contraction) and an up-sampling path (expansion), where the down-sampling path is responsible for feature learning and the up-sampling path aims to restore the spatial information and image resolution. Additionally, skip connections can help the up-sampling path recover spatial detail from the down-sampling path by reusing feature maps. In this study, we employ FC-DenseNet [24,25] as the network backbone for its parameter reduction, computational efficiency and better resistance to over-fitting.
The down-sampling path of FC-DenseNet consists of multiple blocks, each containing a dense block followed by a transition down block. Each dense block iteratively concatenates all feature maps in a feed-forward paradigm. A dense block contains multiple layers, each consisting of batch normalization, a non-linear activation function, a convolution operation, and a dropout connection (see Fig. 1c). Each layer l in the dense block takes all feature maps of the preceding layers that match its spatial resolution as input, outputs k feature maps and passes them to the subsequent layers (see Fig. 1b), where k is known as the growth rate. Hence, the number of feature maps in a dense block grows linearly with the depth of the down-sampling path of FC-DenseNet, and the output of the lth layer can be defined as:

x_l = H_l(x_{l−1} ⊕ x_{l−2} ⊕ ⋯ ⊕ x_0)

where x_l denotes the feature maps at the lth layer, and ⊕ denotes the channel-wise concatenation of the feature maps from layer l − 1 down to layer 0. H is a composition of batch normalization, exponential linear unit and a convolutional layer with dropout rate 0.2 (see Fig. 1c), and H_l represents the composite function of the lth layer.
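The channel bookkeeping described above can be sketched in a few lines: each layer sees the concatenation of all preceding feature maps and appends k new ones, so the channel count grows linearly with depth. The function name and initial channel count are illustrative, not from the paper.

```python
# Sketch of dense-block channel growth: layer l receives c0 + l*k input
# channels and emits k feature maps, which are concatenated onto its input.

def dense_block_channels(c0, k, num_layers):
    """Return the input-channel count seen by each layer and the final width."""
    per_layer_inputs = []
    channels = c0
    for _ in range(num_layers):
        per_layer_inputs.append(channels)  # layer l sees c0 + l*k channels
        channels += k                      # the layer appends k feature maps
    return per_layer_inputs, channels

inputs, total = dense_block_channels(c0=48, k=12, num_layers=4)
# inputs == [48, 60, 72, 84]; total == 96 == 48 + 4*12
```

With the paper's final growth rate k = 12, a four-layer dense block starting from 48 channels would end at 96 channels, which is why transition down blocks are needed to keep the representation compact.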
To reduce the spatial dimensionality of the feature maps, a transition down block follows each dense block (see Fig. 1d). The transition down block consists of batch normalization, an exponential linear unit and a depth-preserving 1 × 1 convolution with dropout rate 0.2, followed by a 2 × 2 max pooling operation. In particular, the last block of the down-sampling path is called the bottleneck and is connected to the up-sampling path.
Through the up-sampling path, the spatial resolution of the input is recovered by transition up blocks, dense blocks, and skip connections from the corresponding blocks of the down-sampling path. The transition up block is a 3 × 3 transposed convolution (see Fig. 1e) that up-samples the previous feature maps. The up-sampled feature maps are then channel-wisely concatenated with the feature maps from the corresponding skip connection in the down-sampling path and fed to the dense block in the up-sampling path. At the end of the up-sampling path, the output feature maps are convolved with a 1 × 1 convolution layer, followed by a softmax layer and average max-pooling operation to generate the final segmentation map. This connection pattern strongly encourages feature reuse and allows all layers of the architecture to receive direct supervision signals.

Multi-scale convolution module
To learn relations across lesion features at multiple scales, multiple convolution kernels with different receptive fields are incorporated in parallel into the first convolution layer of FC-DenseNet to capture the variability of viewpoint-related objects. This module, which processes chest X-ray images with convolution kernels of varying size, is called the multi-scale convolution module. GoogLeNet [26] introduced multi-scale convolution kernels into a parallel sub-network as an inception module, allowing abstract convolution features at different scales to be passed to the subsequent layer simultaneously. The inception module of GoogLeNet contains convolution filters of different sizes, such as 1 × 1, 3 × 3 and 5 × 5 convolution kernels, together with a 3 × 3 max-pooling operation.
In the semantic segmentation task, a small convolution kernel helps detect small target regions, whereas a larger convolution kernel can not only detect larger target regions, but also eliminate false-positive regions. Therefore, we add a larger convolution kernel (7 × 7) to expand the receptive field for the segmentation of PTX. To avoid the reduction of segmentation accuracy caused by dimension reduction, we also removed the 1 × 1 convolution kernel and the 3 × 3 max-pooling, making the multi-scale convolution module more efficient in the PTX segmentation architecture. After these different convolution operations, all feature maps are channel-wisely concatenated for the subsequent dense block (see Fig. 2).

Fig. 1 The automatic segmentation and diagnosis framework for pneumothorax on chest X-rays. a The proposed segmentation network architecture; the differences between our segmentation network and the original FC-DenseNet are marked in red. b An example of a dense block embedded with scSE modules. c A layer in the scSE-embedded dense block, consisting of batch normalization, exponential linear unit, 3 × 3 convolution and dropout rate ρ = 0.2. d A transition down block, composed of batch normalization, exponential linear unit, 1 × 1 convolution, dropout (ρ = 0.2) and 2 × 2 max pooling. e A transition up block, composed of a 3 × 3 transposed convolution
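The shape arithmetic of the multi-scale module can be sketched as follows: three parallel 'same'-padded branches (3 × 3, 5 × 5 and 7 × 7) each emit k feature maps at the input resolution, which are concatenated along the channel axis. The convolutions are stubbed with random projections here; the kernel sizes follow the text, while k = 12 and all other values are illustrative assumptions.

```python
import numpy as np

# Shape-level sketch of the multi-scale convolution module: parallel branches
# preserve H x W and are fused by channel-wise concatenation.

rng = np.random.default_rng(0)
H, W, k = 32, 32, 12
x = rng.standard_normal((H, W, 1))           # a single-channel chest X-ray patch

def fake_conv(x, kernel_size, out_channels):
    # Stand-in for a 'same'-padded convolution: preserves H x W, sets channels.
    return rng.standard_normal((x.shape[0], x.shape[1], out_channels))

branches = [fake_conv(x, s, k) for s in (3, 5, 7)]
fused = np.concatenate(branches, axis=-1)    # channel-wise concatenation
assert fused.shape == (H, W, 3 * k)          # 36 channels enter the backbone
```

The design choice is that every branch keeps the full spatial resolution, so the concatenation is purely along channels and the subsequent dense block sees all scales at once.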

Spatial and channel squeezes and excitation (scSE) module
Most fully convolutional network (FCN)-based segmentation methods mainly focus on joint space and channel encoding. For example, FC-DenseNet simultaneously transmits the spatial and channel information of the current filters to the subsequent convolution layers to improve feature utilization. However, independent spatial- and channel-wise encodings are less well exploited.
Recently, Hu et al. [27] proposed a framework embedded with squeeze and excitation (SE) blocks to model the interdependencies between feature channels, and achieved state-of-the-art results in image classification. Roy et al. [28] introduced three variants of the SE block, the channel SE (cSE) module, the spatial SE (sSE) module, and the concurrent spatial and channel squeeze and excitation (scSE) module, migrating SE blocks from image classification to image segmentation with promising performance. The purpose of the SE and cSE modules is to adaptively recalibrate the feature maps along the channel dimension, highlighting useful channels while suppressing less useful ones. The cSE module can only reweight channels and the sSE module can only reweight spatial locations, while the scSE module recalibrates the feature maps along channels and spatial locations separately and then merges the recalibrated maps into the output.
In this study, we embedded the scSE module into each dense block and applied the resulting scSE dense block to pneumothorax segmentation (see Fig. 1a, b). We denote the input feature maps of a dense block as U, U ∈ R^{H×W×C}, where H, W, and C denote the spatial height, width, and number of channels, respectively. As illustrated in Fig. 3, the input feature maps U are recalibrated to the output feature maps U_scSE, U_scSE ∈ R^{H×W×C}, through the two branches U_sSE and U_cSE:

U_scSE = U_sSE + U_cSE

where U_sSE and U_cSE are recalibrated from U in the spatial and channel dimensions, respectively. U_sSE highlights relevant spatial locations by suppressing irrelevant ones, and U_cSE is adaptively tuned to ignore less important channels and emphasize more important ones.
Specifically, U_sSE can be obtained from U through a 1 × 1 convolution kernel and a sigmoid function. The weight of the convolution kernel, denoted as W_s, W_s ∈ R^{1×1×C×1}, is used to learn a projection tensor Q, where Q ∈ R^{H×W}. Then the sigmoid function σ(·) is applied to rescale the activations of Q into [0, 1]. Hence, U_sSE can be defined as:

U_sSE = σ(Q) ⊙ U, with Q = W_s ∗ U

where ⊙ denotes element-wise multiplication broadcast over the channels. For the cSE module, a global average pooling operation g(·) is first performed on the input feature maps U to generate a vector z embedded with global spatial information, where z = g(U), z ∈ R^{1×1×C}. Then two consecutive fully connected layers, with weights W_1 and W_2, are used to convert the vector z into a new vector ẑ, where δ(·) denotes the ReLU operation. Afterwards, a sigmoid function σ(·) normalizes the activations into [0, 1]. Therefore, the cSE module can be defined as:

U_cSE = σ(ẑ) ⊙ U, with ẑ = W_1 δ(W_2 z)

Fig. 3 The concurrent spatial and channel squeeze and excitation (scSE) module. The input feature maps of a dense block U are recalibrated to the output feature maps U_scSE through the two branches U_sSE and U_cSE. The top branch is the spatial recalibration (U_sSE) and the bottom branch is the channel-wise recalibration (U_cSE); U_sSE and U_cSE are then merged into the output

In summary, the scSE module combines the advantages of the sSE and cSE modules, enabling better adaptive recalibration of the feature maps, so that the scSE dense block can elucidate more useful information while suppressing less useful features in pneumothorax segmentation.
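A minimal numpy sketch of the two recalibration branches may make the shapes concrete. The 1 × 1 convolution of the sSE branch reduces to a dot product over channels, and the cSE branch is a global average pool followed by two small fully connected layers. Merging the branches by addition is one common scSE variant and is an assumption here, as are all weight shapes.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def scse(U, W_s, W1, W2):
    """Concurrent spatial and channel SE on U of shape (H, W, C).

    W_s: (C,) spatial-squeeze 1x1-conv weights; W2, W1: the two FC layers of
    the channel branch (squeeze then excite). Illustrative sketch only.
    """
    # sSE: project channels to a single map with a 1x1 conv, gate each location.
    q = sigmoid(U @ W_s)                           # (H, W) spatial gates
    U_sse = U * q[..., None]
    # cSE: global average pool -> FC -> ReLU -> FC -> per-channel gates.
    z = U.mean(axis=(0, 1))                        # (C,) channel descriptor
    z_hat = sigmoid(W1 @ np.maximum(W2 @ z, 0.0))  # (C,) channel gates
    U_cse = U * z_hat[None, None, :]
    return U_sse + U_cse                           # merge the two branches
```

Because both branches only rescale U by gates in [0, 1], the module adds very few parameters (C weights for sSE plus the two small FC matrices for cSE), which matches the paper's claim of recalibration "without adding many parameters".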

Spatially weighted cross-entropy loss
A serious pixel class imbalance between the regions of interest (ROIs) and the surrounding background generally exists in medical image segmentation: the number of pixels with pathology is much smaller than the number without. This tends to cause the learning model to fall into a local minimum. The typical cross-entropy loss (CEL), which measures the quantization error of all pixels by calculating the pixel-level probabilistic error between the predicted class and the target class, is susceptible to this class imbalance. The weighted cross-entropy loss (W-CEL) was therefore introduced to mitigate the effect of class imbalance by assigning different weights to target and background pixels.
Meanwhile, dice loss [29] has also been proposed to optimize the dice overlap coefficient between the predicted segmentation map and the ground-truth map. However, due to the narrow boundary of the pneumothorax class, it is still difficult to distinguish the target class from the background pixels with W-CEL and dice loss; the boundary class must be considered along with the target and background classes. Pneumothorax segmentation is generally formulated as a binary classification task of object (pneumothorax) versus background, where '0' represents background pixels and '1' represents pneumothorax pixels. In this study, if any of the eight neighbours of a pixel with value '1' has value '0', we define that pixel as a boundary contour pixel. To locate the boundary contour pixels of the pneumothorax, an edge detector is used to determine whether a pixel is a boundary pixel, and the boundary range is then expanded by morphological dilation. A spatially weighted cross-entropy loss (SW-CEL) is therefore proposed that assigns different weights to the target, background and boundary pixels [30]. As shown in Fig. 4, spatial weight maps generated from the ground-truth images are used to weight the loss of each pixel in the cross-entropy loss. The SW-CEL can be formulated as:

L(X; W) = − Σ_{x_i ∈ X} w_map(x_i) log p(t_i | x_i; W)

where X denotes the training samples, W denotes the set of learnable weights, W = (w_1, w_2, ..., w_l), and w_l denotes the weight matrix of the lth layer. p(t_i | x_i; W) represents the probability prediction for a pixel x_i, and t_i is the target label of the pixel x_i (x_i ∈ X). w_map(x_i) is the estimated weight for each pixel x_i, which can be defined as:

w_map(x_i) = Σ_{c ∈ C} [ (|N| / |T_c|) F_T(x_i) + (|N| / |B_c|) F_B(x_i) ]

where C denotes the set of all ground-truth classes, i.e., the pneumothorax class and the background class.
For each chest X-ray image, N denotes the set of all pixels, T_c denotes the set of pixels of each class c, c ∈ C, and B_c denotes the set of boundary contour pixels, B_c ⊂ T_c ⊂ N. F_T(x_i) and F_B(x_i) denote the indicator functions defined on the subsets T_c and B_c, respectively.

Fig. 4 The process of generating the spatial weight map. The ground-truth image b is delineated by a radiologist according to the chest X-ray image a. Through edge detection and morphological dilation of the boundary contour pixels of the target class, the spatial weight map c is generated from the ground-truth image b. The colors in the spatial weight map represent the weight distribution according to relative class frequency
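The weight-map construction can be sketched directly from the eight-neighbourhood rule above: foreground pixels with at least one zero neighbour become boundary pixels, and the three pixel sets then receive different weights. The concrete weight values (1, 2, 5) are illustrative placeholders; the paper derives its weights from relative class frequencies.

```python
import numpy as np

# Sketch of the spatial weight map: background, target, and boundary pixels
# of a binary mask get different loss weights (values here are illustrative).

def boundary_pixels(mask):
    """A '1' pixel is a boundary pixel if any of its 8 neighbours is '0'."""
    padded = np.pad(mask, 1, constant_values=0)
    out = np.zeros_like(mask, dtype=bool)
    H, W = mask.shape
    for i in range(H):
        for j in range(W):
            if mask[i, j] == 1:
                nb = padded[i:i + 3, j:j + 3]   # 3x3 neighbourhood (incl. self)
                out[i, j] = (nb == 0).any()
    return out

def weight_map(mask, w_bg=1.0, w_fg=2.0, w_bd=5.0):
    wm = np.where(mask == 1, w_fg, w_bg)        # target vs background weight
    wm[boundary_pixels(mask)] = w_bd            # boundary pixels weigh most
    return wm
```

In training, this map would multiply the per-pixel cross-entropy terms, so thin pneumothorax boundaries contribute more to the gradient than the abundant background.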

Automated classification for pneumothorax diagnosis
Most previous studies of pneumothorax on chest X-rays mainly focus on binary PTX/Non-PTX diagnosis with image-level annotation. Learning pneumothorax diagnosis from image-level annotation is a typical weakly supervised approach, which often leads to inaccurate localization because the locations of the pneumothorax lesions are not marked. Pneumothorax segmentation can provide accurate pixel-level lesion locations and better assist radiologists in pneumothorax diagnosis. In this study, we propose a pixel-wise supervised network for the automatic segmentation and diagnosis of PTX (see Fig. 1). Since the predicted segmentation maps are the result of a binary pixel-wise classification network, a simple classifier is added that thresholds the predicted segmentation maps: if the predicted pneumothorax region is larger than the threshold, the image is classified as pneumothorax; otherwise, it is non-pneumothorax. If the threshold is too small, segmentation noise may be mistaken for pneumothorax; if it is too large, small pneumothoraces may be missed. Therefore, the threshold is empirically set to 50 pixels for pneumothorax diagnosis from the predicted segmentation maps.
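The diagnosis rule above amounts to a one-line area test on the predicted binary mask. The 50-pixel threshold is the paper's empirical setting; the function name and example masks are illustrative.

```python
import numpy as np

# Sketch of the threshold classifier: a scan is labelled PTX when the
# predicted binary segmentation map contains more than 50 foreground pixels.

def diagnose(pred_mask, min_area=50):
    return "PTX" if int(pred_mask.sum()) > min_area else "Non-PTX"

noise = np.zeros((256, 256), int)
noise[0:5, 0:5] = 1                      # 25 stray pixels: treated as noise
lesion = np.zeros((256, 256), int)
lesion[100:120, 100:120] = 1             # 400-pixel region: flagged as PTX
assert diagnose(noise) == "Non-PTX" and diagnose(lesion) == "PTX"
```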

Dataset
The study data were collected in a three-stage procedure. In the first stage, the keyword "pneumothorax" was searched in the picture archiving and communication system (PACS) of our institution to obtain all relevant chest radiographs and radiology reports. In the second stage, the keyword "pneumothorax" was identified in each radiology report; images whose reports did not mention pneumothorax were assigned to the non-pneumothorax (Non-PTX) group, while those with pneumothorax were assigned to the pneumothorax (PTX) group. Third, all images in the PTX group were pixel-wisely annotated by three medical students and then revised by an experienced radiologist. Our eligible sample included a total of 11,051 front-view chest X-ray images (5566 cases of PTX and 5485 cases of Non-PTX). We named this dataset "PX-ray". As shown in Table 1, the PX-ray dataset was randomly divided into training, validation and test sets by a stratified sampling strategy, so as to ensure that the ratio of the PTX and Non-PTX groups was the same in each set.
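A stratified split of the kind described above can be sketched as follows: each class is shuffled and divided with the same ratios, so the PTX/Non-PTX balance is preserved in every subset. The 70/10/20 ratios and the seed are illustrative assumptions, not the paper's exact split (which is given in Table 1).

```python
import random

# Sketch of stratified sampling: split each class separately with identical
# ratios so that class balance is preserved in train/val/test.

def stratified_split(ptx_ids, non_ptx_ids, ratios=(0.7, 0.1, 0.2), seed=42):
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for ids in (list(ptx_ids), list(non_ptx_ids)):
        rng.shuffle(ids)
        n = len(ids)
        n_train = int(ratios[0] * n)
        n_val = int(ratios[1] * n)
        splits["train"] += ids[:n_train]
        splits["val"] += ids[n_train:n_train + n_val]
        splits["test"] += ids[n_train + n_val:]
    return splits

s = stratified_split(range(5566), range(10000, 15485))
assert len(s["train"]) + len(s["val"]) + len(s["test"]) == 11051
```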

Evaluation
To evaluate the performance of the PTX segmentation network, we used three quantitative metrics: mean pixel-wise accuracy (MPA), dice similarity coefficient (DSC) and Hausdorff distance (HD). Statistical tests were also used to determine whether there are significant differences between the results of different segmentation algorithms; a p value less than 0.05 indicates a significant difference between two results.
MPA is the average ratio of correctly classified pixels over the PTX and non-PTX classes, defined as:

MPA = (1/N) Σ_{n=1}^{N} (1/C) Σ_{c=1}^{C} p_c / P_c

where N denotes the number of samples, C denotes the number of classes, p_c denotes the number of correctly classified pixels of class c, and P_c denotes all pixels of class c in the ground truths. More importantly, we define the pixel-wise accuracy (PA) of the PTX group class as PA_1.
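For a single image, the per-class averaging in the MPA definition can be sketched as follows; classes absent from the ground truth are skipped to avoid division by zero, which is an implementation assumption.

```python
import numpy as np

# Sketch of mean pixel-wise accuracy for one binary prediction: per-class
# pixel accuracy averaged over the PTX and background classes.

def mean_pixel_accuracy(pred, gt, num_classes=2):
    accs = []
    for c in range(num_classes):
        gt_c = (gt == c)
        if gt_c.sum() == 0:
            continue                      # class absent from this ground truth
        accs.append(((pred == c) & gt_c).sum() / gt_c.sum())
    return float(np.mean(accs))

gt = np.zeros((4, 4), int)
gt[1:3, 1:3] = 1
pred = gt.copy()
pred[1, 1] = 0                            # one PTX pixel missed
# background: 12/12 correct; PTX: 3/4 correct -> MPA = (1.0 + 0.75) / 2
assert abs(mean_pixel_accuracy(pred, gt) - 0.875) < 1e-9
```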
DSC is a standard measure for segmentation evaluation that calculates the overlap between the ground-truth map and the predicted segmentation map:

DSC = 2 |A_c ∩ B_c| / (|A_c| + |B_c|)

where A_c denotes all pixels of class c in the predicted segmentation map and B_c denotes all pixels of class c in the ground-truth map. More importantly, we define the DSC of the PTX group class as DSC_1.
The Hausdorff distance (HD) is also used to measure the contour distance between the ground-truth map and the predicted segmentation map, which can be defined as:

HD(P, G) = max(h(P, G), h(G, P)), with h(P, G) = max_{p ∈ P} min_{g ∈ G} ‖p − g‖

where P and G are the pixel sets of the predicted segmentation contours and the ground-truth contours, respectively. The smaller the Hausdorff distance, the better the two contours match.
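Both overlap metrics can be sketched in a few lines of numpy, matching the formulas above; points are compared with Euclidean distance, and the brute-force pairwise computation is for illustration only.

```python
import numpy as np

# Minimal sketches of DSC and the undirected Hausdorff distance between two
# binary masks.

def dsc(a, b):
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def hausdorff(a, b):
    P = np.argwhere(a)                    # foreground coordinates of prediction
    G = np.argwhere(b)                    # foreground coordinates of ground truth
    d = np.sqrt(((P[:, None, :] - G[None, :, :]) ** 2).sum(-1))
    h_pg = d.min(axis=1).max()            # directed h(P, G)
    h_gp = d.min(axis=0).max()            # directed h(G, P)
    return max(h_pg, h_gp)

a = np.zeros((8, 8), int); a[2:5, 2:5] = 1
b = np.zeros((8, 8), int); b[2:5, 2:6] = 1   # predicted mask one column wider
assert abs(dsc(a, b) - 2 * 9 / (9 + 12)) < 1e-9
assert hausdorff(a, b) == 1.0
```

The two metrics are complementary: DSC rewards area overlap, while HD penalizes the single worst contour deviation, which is why the paper reports both.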

Detailed settings
All experiments in this study were conducted on an Nvidia Tesla V100 GPU server. The weights of the PTX segmentation network were initialized with HeUniform [31]. We used the Adam optimizer (β1 = 0.9, β2 = 0.999) with a learning rate of 1e-4 and a weight decay of 1e-4 to train the segmentation network for 200 epochs. During training, data augmentation was performed by random horizontal flips, and the validation set was used for early stopping: we monitored the dice similarity coefficient (DSC) score of the pneumothorax group with a patience of 20 epochs.
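The early-stopping rule described above can be sketched as a small monitor: training halts once the validation DSC of the pneumothorax group has not improved for `patience` consecutive epochs. The paper uses a patience of 20; the shorter patience and the DSC history below are illustrative.

```python
# Sketch of patience-based early stopping on the validation DSC score.

class EarlyStopper:
    def __init__(self, patience=20):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_dsc):
        """Record one epoch's validation DSC; return True when training should stop."""
        if val_dsc > self.best:
            self.best = val_dsc
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopper(patience=3)
history = [0.70, 0.75, 0.74, 0.74, 0.73]   # DSC plateaus after epoch 2
stops = [stopper.step(d) for d in history]
assert stops == [False, False, False, False, True]
```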

Results
Qualitative and quantitative evaluation experiments were carried out to show the effectiveness of our proposed PTX segmentation and diagnosis framework. We first compare the performance of our network with that of U-Net [18], SegNet [32], DeepLab v3+ [33], DenseASPP [34] and the original FC-DenseNet [25]. To verify the efficacy of the embedded modules in our segmentation and diagnosis network, we also embed the multi-scale and scSE modules into U-Net and develop a new architecture, named "MS_scSE_U-Net", for comparison. Note that all of the above segmentation and diagnosis networks share the same hyper-parameters and the SW-CEL loss function during training. Table 2 shows that our PTX segmentation network, i.e., MS_scSE_FC-DenseNet, outperforms U-Net, SegNet, DeepLab v3+, DenseASPP and the original FC-DenseNet in terms of MPA with 0.93 ± 0.13, PA_1 with 0.86 ± 0.27, DSC with 0.92 ± 0.14 and DSC_1 with 0.84 ± 0.27. Meanwhile, our MS_scSE_FC-DenseNet performs better than the original FC-DenseNet, and MS_scSE_U-Net performs better than the original U-Net, which shows that the networks embedded with the multi-scale and scSE modules outperform those without them. This indicates that the proposed multi-scale and scSE modules play an important role in improving the performance of the segmentation networks for PTX. In addition, compared with the original FC-DenseNet, the number of parameters of the proposed network increased by 10.59%, but is still much lower than that of the other segmentation networks. Our method also has a low computational cost in terms of giga floating-point operations (GFLOPs). Figure 5 shows representative cases of large, moderate and small pneumothorax segmented by the different algorithms. For each case, we present the original chest X-ray image, the ground-truth image, and the segmentation results of the comparison methods and our proposed MS_scSE_FC-DenseNet, together with the corresponding DSC_1 and HD scores.
It can be found that our method performs better, with larger DSC_1 and smaller HD scores, and can thus more accurately help radiologists find the pneumothorax area. In addition, as shown in the bottom row of Fig. 5, our algorithm can segment the small bilateral thoracic regions that are very difficult for radiologists to label manually, indicating the potential of our method for clinical computer-assisted diagnosis. Figure 6 shows a qualitative evaluation of our proposed PTX segmentation network and the five comparison frameworks on the PTX segmentation task. The X-axis represents the intervals of the DSC score, and the Y-axis represents the number of samples falling into each interval. Compared with the other frameworks, our segmentation network has the largest number of samples in the range [0.9, 1.0] and the smallest number in the range [0, 0.6].

Performance of pneumothorax diagnosis
The quantitative performances of pneumothorax diagnosis with different models are shown in Table 3.

Table 2 Result comparisons of different segmentation models
The number with * represents a significant difference between the corresponding method and ours, according to Student's t-test for two independent samples (p < 0.05). Our network shows the best results in terms of accuracy, sensitivity, negative predictive value (NPV) and F_1-score, which indicates great potential for pixel-wise supervised networks. The pixel-level supervised network provides not only image-level information but also the location and size of the pneumothorax, which is of great help to network learning.

Discussion
In this section, we discuss the effects of different growth rates and loss functions on pneumothorax segmentation performance. Table 4 shows the performance of our pneumothorax segmentation network with different growth-rate (k) settings. Note that, according to Student's t-test for two independent samples, a number with * in the table represents a significant difference (p < 0.05) between the model with k = 12 and the other models. We can see that, under the same framework, the results grow steadily as the value of k increases. The segmentation network with k = 12 shows the best performance, so we use the k = 12 model as our final network for pneumothorax segmentation. Table 5 compares the segmentation performance of three loss functions: CEL, W-CEL and SW-CEL. To further evaluate the loss functions, we carry out experiments on our proposed network and the previous state-of-the-art networks, including U-Net [18], SegNet [32], DeepLab v3+ [33], DenseASPP [34] and FC-DenseNet [25]. The statistical t-tests on the test set indicate that the differences in DSC scores between the loss functions were not statistically significant, while most models trained with SW-CEL showed the best performance in terms of Hausdorff distance scores. This indicates that the weight penalty for contour pixels helps to learn the boundary contour accurately.

Conclusion
In this study, we proposed a fully convolutional multi-scale scSE-DenseNet framework for automatic pneumothorax segmentation and diagnosis, which incorporates the feature-reuse advantage of DenseNet and greatly reduces the number of parameters. We used the multi-scale module to capture the variability of viewpoint-related objects, and the scSE modules to adaptively recalibrate the feature maps and boost meaningful features for better performance. To tackle the pixel imbalance problem, SW-CEL was also introduced to better extract the pneumothorax boundaries on chest X-rays. The experiments conducted on the PX-ray dataset demonstrate that our proposed framework is superior to five state-of-the-art segmentation architectures in terms of MPA and DSC scores. This framework provides substantial improvements for the automatic segmentation and diagnosis of pneumothorax and is expected to become a clinical tool for pneumothorax segmentation and diagnosis.