Fig. 3 | BMC Medical Informatics and Decision Making

Fig. 3

From: Swin Unet3D: a three-dimensional medical image segmentation network combining vision transformer and convolution

Schematic of (a) W-MSA-3D and (b) SW-MSA-3D with a window size of 2. In (a), tokens of the same color belong to the same window, and self-attention is computed only within each window. To let information flow between adjacent windows, a cyclic shift regroups some tokens from neighboring windows into the same window, and only these tokens are allowed to attend to one another within the shifted window. Tokens that do not meet this condition are masked out of each other's attention, even if the cyclic shift places them in the same window, as shown in (b).
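
The windowing and masking described in the caption can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes the region-labeling convention of the original 2D Swin Transformer extended to three axes, a shift of half the window size, and hypothetical function names (`window_partition_3d`, `shifted_window_mask_3d`). Tokens are allowed to attend to each other only if they share a window after the cyclic shift and carry the same region label.

```python
import numpy as np

def window_partition_3d(x, ws):
    """Split a (D, H, W) array into non-overlapping ws x ws x ws windows.

    Returns an array of shape (num_windows, ws**3).
    Assumes D, H and W are all divisible by ws.
    """
    D, H, W = x.shape
    x = x.reshape(D // ws, ws, H // ws, ws, W // ws, ws)
    x = x.transpose(0, 2, 4, 1, 3, 5)  # window indices first, then in-window offsets
    return x.reshape(-1, ws ** 3)

def shifted_window_mask_3d(shape, ws, shift):
    """Boolean attention mask for shifted windows (True = attention allowed).

    Labels are assigned on the already-shifted volume: along each axis the
    last `ws` positions are split into two slabs, because the cyclic shift
    wraps tokens from the opposite border into that final window.
    Returns an array of shape (num_windows, ws**3, ws**3).
    """
    region = np.zeros(shape, dtype=np.int64)
    slabs = (slice(0, -ws), slice(-ws, -shift), slice(-shift, None))
    label = 0
    for d in slabs:
        for h in slabs:
            for w in slabs:
                region[d, h, w] = label
                label += 1
    win = window_partition_3d(region, ws)
    # Two tokens may attend to each other only if their region labels match.
    return win[:, :, None] == win[:, None, :]

# Toy volume of 4 x 4 x 4 token ids, window size 2, shift 1 (assumed values).
ids = np.arange(64).reshape(4, 4, 4)
regular = window_partition_3d(ids, ws=2)                       # W-MSA-3D grouping
shifted = np.roll(ids, shift=(-1, -1, -1), axis=(0, 1, 2))     # cyclic shift
shifted_windows = window_partition_3d(shifted, ws=2)           # SW-MSA-3D grouping
mask = shifted_window_mask_3d(ids.shape, ws=2, shift=1)
```

In this toy setting the first shifted window gathers tokens from all eight original windows that were spatial neighbors, so its mask is fully open, while the last shifted window contains only wrapped-around corner tokens, so each token there attends only to itself. In an attention layer the mask would typically be applied by adding a large negative value to the disallowed logits before the softmax.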
