Comparison of wavelet transformations to enhance convolutional neural network performance in brain tumor segmentation

Background: Because segmentation of MRI images is important in identifying brain tumors, various methods, including deep learning, have been introduced for automatic brain tumor segmentation. Combining methods can improve their performance; one such combination is the use of the wavelet transform as an auxiliary element in deep networks. This study analyzes the requirements of such combinations.
Method: In this developmental study, different wavelet functions were used to compress brain MRI images, and their outputs were then used as an auxiliary element to improve the performance of a convolutional neural network (CNN) in brain tumor segmentation.
Results: Based on the tests performed, the Daubechies1 function was the most effective in enhancing network performance in segmenting MRI images and balanced performance against computational overhead.
Conclusion: The wavelet function used to optimize the performance of a convolutional neural network should be chosen according to the requirements of the problem, taking into account considerations such as computational load, processing time, and the performance of the wavelet function in optimizing the CNN output for the intended task.


Introduction
Brain tumors are dangerous diseases caused by abnormal division of cells in the brain [1][2][3]. Unlike most other tumors, both benign and malignant brain tumors are dangerous, given the importance of the brain in the body [4,5]. Among the methods used to diagnose brain tumors, MRI is an important tool, and image segmentation is one of the main processes in MRI image analysis; it is used to extract tumor size and for visualization and display [6][7][8][9][10][11][12]. Accurate brain tumor segmentation is also useful for brain modeling and for generating pathological atlases [3,13].
Brain MRI segmentation is a challenging task because differences in the size, shape, texture, and brightness of tumors in images increase the probability of error [14]. In addition, the large number of brain scans increases the time needed for MRI analysis [15]. These factors make brain MRI segmentation a complex and time-consuming process that can result in wrong or delayed decisions [16,17]. Various approaches to automating brain tumor segmentation have been reported in the literature; they apply a variety of numerical methods and machine-learning techniques to the processing of MRI images [4, 18-26]. The authors of [19,27,28] introduced a new algorithm for automatic brain segmentation based on the intensity inhomogeneity and noise in brain MR images; they used a nonlocal means technique to obtain a spatially regularized segmentation method. Increasingly popular deep learning methods have also been applied to tumor detection [20-24, 26, 29-32]. The authors of [33] used a differential feature neural network and the symmetry axes of brain MRI images, via a 2D CNN, to identify the tumor section in MRI slices.
Other deep learning methods, such as convolutional neural networks (CNNs), have also been introduced in this area [31][32][33][34][35][36]. Unlike older methods, whose success depends heavily on feature engineering, deep learning methods acquire hierarchical features directly from raw data [37,38], which leads to a high-level data representation built on a hierarchy of intermediate features [39,40].
Meanwhile, the wavelet transform is one of the most common transformations used in signal and image processing. Its main capability is decomposing the input signal into details and approximations at different scales. It is therefore known as a multiresolution analysis tool, capable of extracting information from signals at different levels [41][42][43][44][45]. Using wavelet transforms to enhance CNN performance requires attention to choices such as the type of kernel function and the number of wavelet decomposition levels, which are the subject of the present study. This work continues our previous study [11] and presents a comparative analysis of wavelet transforms for improving CNN performance in brain tumor segmentation.

Materials and methods
This developmental study was approved and supported by the Tehran University of Medical Sciences with the project code: 96-04-85-36667 and ethical approval No: IR.TUMS.VCR.REC.1396.4506.
The BRATS dataset was used to compare the wavelet transformations, which is the aim of this study. It includes a total of 274 brain tumor MRI images, consisting of 224 cases of high-grade glioma and 50 cases of low-grade glioma. Details of accessing and using BRATS 2015, along with further descriptions, are given in our previous study [11].
The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS), the dataset used in this study, has been validated and aggregated by NIH radiologists in order to compare state-of-the-art methods for automated brain tumor segmentation and to highlight their performance. It has been organized since 2012 as the BRATS challenge, in conjunction with the international conference on Medical Image Computing and Computer Assisted Interventions (MICCAI). For this purpose, a unique dataset of MR scans of low- and high-grade glioma patients has been made available by several human experts, together with realistically generated synthetic brain tumor datasets for which the ground truth segmentation is known. Any automatic brain tumor segmentation algorithm can report its performance on this dataset, which serves its intended purpose as a benchmark for comparison with other algorithms [46].
As described in [11], to compare how different wavelet transforms enhance the CNN's performance in brain tumor segmentation, the output of each transform was injected into the CNN's structure.

Multiple wavelet injection
A Fully Convolutional Network (FCN) model in U-Net form was used for segmentation in this study. Besides the fact that many studies have used this method in segmentation applications, there were two compelling reasons for using an FCN. First, because of the way it compresses and decompresses feature maps, the FCN U-Net is a good choice for segmentation. Second, the FCN U-Net is capable of accepting additional feature maps (wavelet transformations) at multiple layers.
FCN structures involve three types of layers: convolution, inverse convolution, and pooling (sampling). The network works by compressing the input images into progressively smaller feature maps up to the middle layers, then reconstructing the output (segmented) image in the tail layers by up-sampling and inverse convolution of the small feature maps [47].
Wavelets are mathematical transformations that break down the input signal into different frequencies with appropriate levels of detail and approximation. Their advantage over Fourier transforms is that they localize a signal in frequency and time simultaneously [48], whereas Fourier transforms provide only frequency information and cannot locate events [49]. Figure 1 shows an example in which changing the location of an event does not affect the Fourier transform output.
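The contrast between the two transforms can be illustrated with a small sketch. The example below is not taken from the study; it assumes a toy 32-sample signal containing a short rectangular event, a naive discrete Fourier transform, and a one-level Haar (db1) detail computation written in plain Python. The helper names `dft_magnitudes`, `haar_details`, and `make_signal` are hypothetical, and the sign convention of the detail coefficients is a choice made for this sketch.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def haar_details(signal):
    """One level of Haar (db1) detail coefficients for an even-length signal."""
    return [
        (signal[i] - signal[i + 1]) / math.sqrt(2)
        for i in range(0, len(signal), 2)
    ]


def make_signal(event_start, length=32, width=3):
    """Zero signal with a short rectangular event (toy data, an assumption)."""
    return [
        1.0 if event_start <= t < event_start + width else 0.0
        for t in range(length)
    ]


early = make_signal(event_start=4)
late = make_signal(event_start=20)

# The Fourier magnitudes are the same wherever the event occurs.
same_spectrum = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(dft_magnitudes(early), dft_magnitudes(late))
)

# The non-zero Haar detail coefficients follow the position of the event.
early_hits = [i for i, d in enumerate(haar_details(early)) if abs(d) > 1e-12]
late_hits = [i for i, d in enumerate(haar_details(late)) if abs(d) > 1e-12]

print(same_spectrum, early_hits, late_hits)
```

In this toy case the Fourier magnitudes cannot tell the two signals apart, while the position of the non-zero detail coefficient reveals where the event ends.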
Parsing the same signals with a wavelet transform gives quite different results, as shown in Fig. 2. Using wavelet transformations in signal analysis requires the considerations mentioned earlier. By definition, a wavelet is a function with several important properties: oscillation, zero mean, and short length. To be applicable, it must be concentrated on a limited range (-k, k) [50,51]. Even so, the number of wavelets that satisfy this definition is infinite [52]. Some well-known types of wavelets are listed in Fig. 3 [53].
In this study, a variety of wavelet transforms were used to improve CNN performance, and their role in CNN segmentation performance was investigated. For the comparison, the PyWavelets (pywt) library was used to implement the wavelet transformations and the TensorFlow library was used to implement the CNN structure.

Training WFCN network
Because the purpose of the study was to distinguish tumor areas from the other parts of the image, a two-class network with binary decisions (soft_max) was used. To speed up the calculations and strengthen the network performance, the initial network weights were adopted from imagenet-vgg-verydeep-19. The convolutional neural network was trained with the backpropagation algorithm for 100 epochs. This number of epochs was chosen by trial and error, in view of the relative instability of the error in epochs above 80. Training was performed on hardware equipped with an Nvidia GTX 980Ti, and each run took about 18 h.

Performance evaluation
The algorithms were evaluated in brain tumor segmentation using the Dice metric, which measures the overlap between the segmented area and the actual tumor area:
$$\mathrm{Dice}(S, G) = \frac{2\,|S \cap G|}{|S| + |G|}$$
where S is the set of tumor pixels segmented by the algorithm and G is the set of pixels of the actual tumor.
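As an illustration of the Dice metric, the following minimal sketch computes it for two small binary masks, using the standard definition: twice the number of pixels shared by S and G, divided by the total number of pixels in S and G. The function name `dice_score`, the toy masks, and the treatment of two empty masks are assumptions made for this example and are not taken from the study.

```python
def dice_score(segmented, ground_truth):
    """Dice overlap between two binary masks given as nested lists of 0/1."""
    s = {(r, c) for r, row in enumerate(segmented)
         for c, v in enumerate(row) if v}
    g = {(r, c) for r, row in enumerate(ground_truth)
         for c, v in enumerate(row) if v}
    if not s and not g:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * len(s & g) / (len(s) + len(g))


# Toy masks: 4 predicted pixels, 4 actual pixels, 2 of them shared.
predicted = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
actual = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]

score = dice_score(predicted, actual)
print(score)  # 2 * 2 / (4 + 4) = 0.5
```

A score of 1 means the segmented and actual tumor areas coincide exactly, and a score of 0 means they do not overlap at all.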

Results
Given the definition of wavelet transforms, there were plenty of options for wavelet injection, such as Daubechies, Morlet, Symmlet, and more [54]. A comprehensive review was performed to select an appropriate wavelet transform for compressing the input images. The wavelet function should be selected so that it is suitable for tumor identification. Because one of the main features of tumors in MRI images is their difference in brightness from the background, edge detection is a useful clue for tumor segmentation. Also, given the high processing requirements of deep learning methods, it was necessary to select a wavelet kernel that is as simple as possible in terms of computational burden. Table 1 compares the candidate wavelet transforms, and the average time consumed for compressing the input images is presented in Fig. 4 (non-Daubechies kernels) and Fig. 5 (Daubechies kernels).
A comparison of the wavelet transform outputs shows that db1 performs well in identifying edges (similar to the derivative operator) and has a lower compression time than the other transforms. It thus strikes a suitable balance between edge detection performance and computation time. Given the size of the tumor relative to the input image (250 × 250 pixels), wavelet transform outputs smaller than 15 × 15 did not actually contain signs of the tumor, so no further compression was required. For this reason, sizes 120, 60, 30, and 15 were selected as the target layer sizes for injecting the wavelet output. Four paths with wavelet injection capability were therefore considered (Fig. 6), and their performance is reported in Table 2. The WFCN1 architecture, which uses first-level wavelet injection (size 120 × 120) in the CNN structure, was chosen as the top model among the WFCNs. A detailed evaluation of the WFCN1 architecture is presented in Table 3.
A comparison with related studies [55][56][57][58][59] shows what our method achieves in terms of Dice accuracy. Figure 7 compares the performance of the best methods from the surveyed studies (yellow columns) with that of our method (blue column).
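The way repeated db1 decomposition halves the image size and responds to edges can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a toy 16 × 16 image with a bright square standing in for a tumor, a plain-Python Haar (db1) step rather than the library used in the study, and it computes only two of the four sub-bands of each level (the approximation and the detail taken across columns). The helper names are hypothetical.

```python
import math


def haar_step_1d(values):
    """Split an even-length sequence into Haar approximation and detail."""
    pairs = range(0, len(values), 2)
    approx = [(values[i] + values[i + 1]) / math.sqrt(2) for i in pairs]
    detail = [(values[i] - values[i + 1]) / math.sqrt(2) for i in pairs]
    return approx, detail


def lowpass_columns(matrix):
    """Apply the Haar approximation filter down every column."""
    columns = [haar_step_1d(list(col))[0] for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


def haar_level_2d(image):
    """One 2D Haar level: (approximation, detail taken across columns)."""
    row_approx, row_detail = [], []
    for row in image:
        a, d = haar_step_1d(row)
        row_approx.append(a)
        row_detail.append(d)
    return lowpass_columns(row_approx), lowpass_columns(row_detail)


# Toy image (an assumption): dark background with a bright 6 x 6 square.
size = 16
image = [
    [1.0 if 5 <= r <= 10 and 5 <= c <= 10 else 0.0 for c in range(size)]
    for r in range(size)
]

sizes = []
edge_columns = None
approx = image
for level in range(4):
    approx, detail = haar_level_2d(approx)
    sizes.append((len(approx), len(approx[0])))
    if level == 0:
        # Columns of the first-level detail band that contain a response.
        edge_columns = sorted(
            {c for row in detail for c, v in enumerate(row) if abs(v) > 1e-12}
        )

print(sizes)
print(edge_columns)
```

Each level halves both dimensions, which is the same pattern as the target sizes 120, 60, 30, and 15 above, and the first-level detail band is non-zero only in the columns where the left and right borders of the bright square fall, much as a difference (derivative-like) operator would respond.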

Discussion and conclusion
As mentioned earlier, a wide variety of functions can be used as a wavelet kernel. Db1 is useful for identifying the edges of the brain tumor, so it is a good choice for brain tumor segmentation, and it achieves a good balance between computational burden and edge detection. Other studies that combine CNNs with wavelet transforms have employed different functions as wavelet kernels, such as Haar [60], daubechies3 and symmlet4 [61], Gabor [62,63], Contourlet [40], and Curvelet [64]. This variation stems mainly from the different nature of the applications and from differences in the input data.
The first point about the wavelet kernel selected in this study is its simplicity: less computation is required to use this function as the wavelet kernel, which speeds up the application of wavelet transforms to the images. Applying this function as a wavelet kernel therefore does not add much time overhead to model training. The time overhead associated with applying wavelet transforms must be considered in terms of time complexity. One way to reduce computation at this stage is to apply the wavelet transforms in the first iteration of learning, save the results, and reuse them in subsequent iterations. However, memory management needs attention in such situations.
The second point concerns the shape of db1. Because the shape of db1 is similar to the derivative operator, and edges are a key feature for distinguishing tumor areas from the non-tumor background in brain scan images, the ability to mimic the derivative operator and discover the edges of the image is a major advantage of db1. The findings of this study can help researchers design better CNNs for brain tumor segmentation. Because of the difference in brightness levels between tumors and healthy areas of the brain in MRI images, db1 can be used as an auxiliary part of CNNs. Owing to its low computational cost, it can be useful in designing rapid tumor segmentation methods, and it can also be used on devices with lower computing capacity and respond in real time.
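The idea of computing the wavelet transforms once, in the first iteration of learning, and reusing them afterwards can be sketched as a simple in-memory cache. This is a hypothetical illustration and not the implementation used in the study: the class `WaveletCache`, the stand-in transform `haar_row_details`, and the two toy slices are all assumptions.

```python
import math


class WaveletCache:
    """Computes each image's transform once and reuses it in later epochs."""

    def __init__(self, transform):
        self._transform = transform
        self._store = {}
        self.computations = 0  # how many times the transform really ran

    def get(self, image_id, image):
        if image_id not in self._store:
            self._store[image_id] = self._transform(image)
            self.computations += 1
        return self._store[image_id]


def haar_row_details(image):
    """Stand-in transform: Haar (db1) detail coefficients of every row."""
    return [
        [(row[i] - row[i + 1]) / math.sqrt(2) for i in range(0, len(row), 2)]
        for row in image
    ]


# Toy dataset (an assumption): two tiny slices keyed by a made-up identifier.
dataset = {
    "slice_a": [[0.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 0.0]],
    "slice_b": [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]],
}

cache = WaveletCache(haar_row_details)
for epoch in range(3):
    for image_id, image in dataset.items():
        features = cache.get(image_id, image)
        # The cached features would be injected into the network here.

print(cache.computations)  # the transform ran once per slice, not per epoch
```

Because every cached result stays in memory, the cache grows with the size of the dataset, which is the memory management concern noted above; writing the results to disk instead would be one possible alternative.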
Finally, it should be noted that additional paths can improve CNN performance in a desired task, and it is better not to compromise the core network structure when adding them. In that case, the extension can be expected to improve performance. Depending on the nature of the CNN and its ability to select among its inputs, an additional path that enhances performance is retained and takes part in the model training process; otherwise, it is removed from the training process. Consequently, by maintaining the basic structure of the CNN model, choosing an appropriate wavelet function, and managing the computation time and the number of decomposition levels, the network performance can be improved.

Data availability statement
The Brain Tumor Segmentation dataset (BRATS) provided by NIH was used to test the research idea. Each patient has around 155 slices in .mha format as a collection of images: T1, T2, T1 with contrast (T1C), Flair, and a ground truth segmented image. BRATS 2015, used in this study, contains 224 high-grade glioma and 50 low-grade glioma MRI images.

Limitations
Processing deep learning models requires high computing power, and the researchers had access only to mid-range hardware (GPU: GTX 980Ti). This limited the study in terms of time and the ability to use heavier models, and it had to be taken into account in designing and testing the model. With ongoing advances in this field, however, the cost of access to more powerful hardware is gradually decreasing, and services are now available that could be used in the design and implementation of similar studies.