Concurrent channel and spatial attention in Fully Convolutional Network for individual pig image segmentation

Zhiwei Hu, Hua Yang, Tiantian Lou, Hongwen Yan

Abstract


Separating individual pigs from pigpen scenes is crucial for precision farming, and convolutional neural networks offer a low-cost, non-contact, non-invasive way to segment pig images. However, two factors limit progress in this field. First, individual pigs tend to adhere to one another, and occlusion by obstacles such as pigpens can easily cause the model to misjudge. Second, manually labeling group-raised pig data is time-consuming, labor-intensive, and prone to labeling errors. There is therefore an urgent need for an individual pig image segmentation model that performs well in individual scenarios and can be easily transferred to group-raised environments. To address these problems, an individual pig image segmentation dataset containing 2066 images was constructed with individual pigs as the research objects, and a series of algorithms based on fully convolutional networks was proposed for pig image segmentation. To capture long-range dependencies and suppress background information such as pigpens while enhancing the information of the individual parts of the pig, channel and spatial attention blocks were introduced into the best-performing decoders, UNet and LinkNet. Experiments show that, with ResNext50 as the encoder and UNet as the decoder in the basic model, adding both attention blocks simultaneously achieves 98.30% and 96.71% on the F1 and IOU metrics, respectively. Compared with the model that adds the channel attention block alone, the two metrics improve by 0.13% and 0.22%, respectively. Experiments that introduce channel and spatial attention separately show that spatial attention is more effective than channel attention. Taking VGG16-LinkNet as an example, spatial attention improves the F1 and IOU metrics by 0.16% and 0.30%, respectively, compared with channel attention.
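The F1 and IOU metrics reported above can be illustrated with a minimal sketch. It assumes both are computed pixel-wise on binary masks (pig = 1, background = 0); the abstract does not state the exact evaluation protocol, and the function name `f1_and_iou` and the convention of returning 1.0 for two empty masks are illustrative choices, not the authors' code.

```python
def f1_and_iou(pred, target):
    """Pixel-wise F1 and IoU for two binary masks given as nested lists."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1          # pig pixel correctly predicted
            elif p and not t:
                fp += 1          # background predicted as pig
            elif t and not p:
                fn += 1          # pig pixel missed
    f1_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    f1 = 2 * tp / f1_denominator if f1_denominator else 1.0
    iou = tp / iou_denominator if iou_denominator else 1.0
    return f1, iou


# Toy 2x3 masks: 2 true positives, 1 false positive, 1 false negative.
predicted = [[1, 1, 0],
             [0, 1, 0]]
ground_truth = [[1, 0, 0],
                [0, 1, 1]]
f1, iou = f1_and_iou(predicted, ground_truth)
print(round(f1, 4), round(iou, 4))  # 0.6667 0.5
```

For a single pair of masks the two metrics are tied by F1 = 2·IOU / (1 + IOU), so F1 is never lower than IOU. Values averaged over a dataset need not satisfy this identity exactly.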
Furthermore, heatmaps of the features from different decoder layers after adding different attention information show that the boundary of the segmented pig becomes clearer as the layer depth increases. To verify the effectiveness of the individual pig image segmentation model in group-raised scenes, its transfer performance was tested in three scenarios: high separation, deep adhesion, and pigpen occlusion. The experiments show that adding attention information, especially fusing the channel and spatial attention blocks simultaneously, yields more refined and complete segmentation results. The attention-based individual pig image segmentation model can be effectively transferred to group-raised pigs and can serve as a reference for their pre-segmentation.
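The idea of applying channel and spatial attention concurrently can be sketched in a few lines of plain Python. This is only an illustration in the style of squeeze-and-excitation blocks, not the authors' implementation: the abstract does not specify the block design, so the bottleneck weights `w1` and `w2`, the 1x1 convolution weights `w_sp`, the function names, and the merge by element-wise addition are all assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, w1, w2):
    """Rescale each channel of a C x H x W feature map (nested lists).

    w1 (R x C) and w2 (C x R) are hypothetical bottleneck weights.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    # Excite: ReLU bottleneck, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gates, feat)]


def spatial_attention(feat, w_sp):
    """Rescale each pixel; w_sp holds C hypothetical 1x1 convolution weights."""
    height, width = len(feat[0]), len(feat[0][0])
    # A 1x1 convolution across channels, then a sigmoid gate per pixel.
    gates = [[sigmoid(sum(w * ch[i][j] for w, ch in zip(w_sp, feat)))
              for j in range(width)] for i in range(height)]
    return [[[gates[i][j] * ch[i][j] for j in range(width)]
             for i in range(height)] for ch in feat]


def concurrent_attention(feat, w1, w2, w_sp):
    """Apply both branches to the same input and merge the results."""
    by_channel = channel_attention(feat, w1, w2)
    by_pixel = spatial_attention(feat, w_sp)
    # Assumption: the two recalibrated maps are merged by addition.
    return [[[a + b for a, b in zip(row_a, row_b)]
             for row_a, row_b in zip(ch_a, ch_b)]
            for ch_a, ch_b in zip(by_channel, by_pixel)]


# Toy input: 2 channels, 2x2 pixels, bottleneck width R = 1.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[-1.0, 0.5], [0.0, 2.0]]]
output = concurrent_attention(features, w1=[[0.3, -0.2]], w2=[[0.5], [-0.5]],
                              w_sp=[0.4, 0.1])
```

The channel branch produces one gate per channel and the spatial branch one gate per pixel, so the merged output can damp background locations and uninformative channels at the same time. With all weights set to zero every gate equals 0.5 and the block returns its input unchanged.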
Keywords: pig, image segmentation, Fully Convolutional Network (FCN), attention mechanism, channel and spatial attention
DOI: 10.25165/j.ijabe.20231601.6528

Citation: Hu Z W, Yang H, Lou T T, Yan H W. Concurrent channel and spatial attention in Fully Convolutional Network for individual pig image segmentation. Int J Agric & Biol Eng, 2023; 16(1): 232–242.

Copyright (c) 2023 International Journal of Agricultural and Biological Engineering

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
