Spatial-channel transformer network based on mask-RCNN for efficient mushroom instance segmentation
Abstract
Key words: edible mushrooms; picking; instance segmentation; deep learning; algorithm
DOI: 10.25165/j.ijabe.20241704.8987
Citation: Wang J L, Song W D, Zheng W G, Feng Q C, Wang M F, Zhao C J. Spatial-channel transformer network based on mask-RCNN for efficient mushroom instance segmentation. Int J Agric & Biol Eng, 2024; 17(4): 227–235.
References
Wang M, Zhao R. A review on nutritional advantages of edible mushrooms and its industrialization development situation in protein meat analogues. Journal of Future Foods, 2023; 3(1): 1–7.
Li C, Xu S. Edible mushroom industry in China: Current state and perspectives. Applied Microbiology and Biotechnology, 2022; 106(11): 3949–3955.
Retsinas G, Efthymiou N, Anagnostopoulou D, Maragos P. Mushroom detection and three dimensional pose estimation from multi-view point clouds. Sensors, 2023; 23(7): 3576.
Hua X, Li H, Zeng J, Han C, Chen T, Tang L, et al. A review of target recognition technology for fruit picking robots: from digital image processing to deep learning. Applied Sciences, 2023; 13(7): 4160.
Qi X, Dong J, Lan Y, Zhu H. Method for identifying litchi picking position based on YOLOv5 and PSPNet. Remote Sensing, 2022; 14(9): 2004.
Dean Z, Liu X Y, Chen Y, Jin J, Jia W K, Hu C L. Image recognition at night for apple picking robot. Transactions of the CSAM, 2015; 46(3): 15–22.
Xu C, Lu Y, Jiang H, Liu S, Ma Y, Zhao T. Counting crowded soybean pods based on deformable attention recursive feature pyramid. Agronomy, 2023; 13(6): 1507.
Yang C H, Xiong L Y, Wang Z, Wang Y, Shi G, Kuremoto T, et al. Integrated detection of citrus fruits and branches using a convolutional neural network. Computers and Electronics in Agriculture, 2020; 174: 105469.
Chen W, Lu S, Liu B, Li G, Qian T. Detecting citrus in orchard environment by using improved YOLOv4. Scientific Programming, 2020; 2020: 1–3.
Chen P, Li W, Yao S, Ma C, Zhang J, Wang B, et al. Recognition and counting of wheat mites in wheat fields by a three-step deep learning method. Neurocomputing, 2021; 437: 21–30.
Li R, Wang R J, Zhang J, Xie C J, Liu L, Wang F Y, et al. An effective data augmentation strategy for CNN-based pest localization and recognition in the field. IEEE Access, 2019; 7: 160274–160283.
Liu T, Chen W, Wu W, Sun C M, Guo W S, Zhu X K. Detection of aphids in wheat fields using a computer vision technique. Biosystems Engineering, 2016; 141: 82–93.
He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, 2017; pp.2961–2969.
Huang Z J, Huang L C, Gong Y C, Huang C, Wang X G. Mask scoring R-CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; pp.6409–6418.
Sun C Z, Hu X M, Yu T. Structural design of agaricus bisporus picking robot based on cartesian coordinate system. Electrical Engineering and Computer Science (EECS), 2019; 2: 103–106.
Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, Xiao T, Whitehead S, Berg A C, Lo W Y, Dollár P. Segment anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023; pp.4015–4026.
Cai Z Y, Jian Y, Zhang Z Y, Jin C Q, Da F P. SST-ReversibleNet: Reversible-prior-based spectral-spatial transformer for efficient hyperspectral image reconstruction. arXiv preprint, 2023; arXiv: 2305.04054.
Cai Z Y, Li C Y, Yu Y, Jin C Q, Da F P. Momentum accelerated unfolding network with spectral-spatial prior for computational spectral imaging. Applied Soft Computing, 2024; Feb 21: 111420.
Chen K, Pang J M, Wang J Q, Xiong Y, Li X X, Sun S Y, et al. Hybrid task cascade for instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; pp.4974–4983.
Yang S Z, Huang J, Yu X Y, Yu T. Research on a segmentation and location algorithm based on mask RCNN for agaricus bisporus. In 2022 2nd International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI), IEEE, 2022; pp.717–721.
Cong P C, Feng H, Lv K F, Zhou J C, Li S D. MYOLO: a lightweight fresh shiitake mushroom detection model based on YOLOv3. Agriculture, 2023; 13(2): 392.
Hafiz A M, Bhat G M. A survey on instance segmentation: state of the art. International Journal of Multimedia Information Retrieval, 2020; 9(3): 171–189.
Romera-Paredes B, Torr P H. Recurrent instance segmentation. In Proceedings of 14th European Conference on Computer Vision–ECCV 2016, Amsterdam, The Netherlands, 2016; pp.312–329.
Arnab A, Torr P H. Pixelwise instance segmentation with a dynamically instantiated network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017; pp.441–450.
Lee Y, Park J. Centermask: Real-time anchor-free instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020; pp.13906–13915.
Cai Z W, Vasconcelos N. Cascade R-CNN: High quality object detection and instance segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019; 43(5): 1483–1498.
Bolya D, Zhou C, Xiao F, Lee Y J. Yolact: Real-time instance segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019; pp.9157–9166.
Chen H, Sun K Y, Tian Z, Shen C H, Huang Y M, Yan Y L. Blendmask: Top-down meets bottom-up for instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020; pp.8573–8581.
Ying H, Huang Z, Liu S, Shao T J, Zhou K. Embedmask: Embedding coupling for one-stage instance segmentation. arXiv preprint, 2019; arXiv: 1912.01954.
Wang X L, Zhang R F, Kong T, Li L, Shen C H. Solov2: Dynamic and fast instance segmentation. Advances in Neural Information Processing Systems, 2020; 33: 17721–17732.
Shojaiee F, Baleghi Y. EFASPP U-Net for semantic segmentation of night traffic scenes using fusion of visible and thermal images. Engineering Applications of Artificial Intelligence, 2023; 117: 105627.
Kaur A, Goyal P, Rajhans R, Agarwal L, Goyal N. Fusion of multivariate time series meteorological and static soil data for multistage crop yield prediction using multi-head self-attention network. Expert Systems with Applications, 2023; 226: 120098.
Yang Q L, Ye Y, Gu L C, Wu Y T. MSFCA-net: A multi-scale feature convolutional attention network for segmenting crops and weeds in the field. Agriculture, 2023; 13(6): 1176.
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, et al. Attention is all you need. Advances in Neural Information Processing Systems, 2017; 30: 1–11.
Gillioz A, Casas J, Mugellini E, Abou Khaled O. Overview of the Transformer-based Models for NLP Tasks. In 15th Conference on Computer Science and Information Systems (FedCSIS), IEEE, 2020; pp.179–183.
Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint, 2020; arXiv: 2010.11929.
Liu Z, Lin Y T, Cao Y, Hu H, Wei Y X, Zhang Z, et al. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021; pp.10012–10022.
Bao W X, Xie W J, Hu G S, Yang X J, Su B B. Wheat ear counting method in UAV images based on TPH-YOLO. Transactions of the CSAE, 2023; 39(1): 155–161. (in Chinese)
Xu Y L, Kong S L, Chen Q Y, Gao Z Y, Li C X. Model for identifying strong generalization apple leaf disease using transformer. Transactions of the CSAE, 2022; 38(16): 198–206. (in Chinese)
Wang C, Wu X H, Zhang Y Q, Wang W J. Recognizing weeds in maize fields using shifted window Transformer network. Transactions of the CSAE, 2022; 38(15): 133–142. (in Chinese)
Fu L L, Huang H, Wang H, Huang S C, Chen D. Classification of maize growth stages using the Swin transformer model. Transactions of the CSAE, 2022; 38(14): 191–200.
Zhu D L, Yu M S, Liang M F. Real-time instance segmentation of maize ears using SwinT-YOLACT. Transactions of the CSAE, 2023; 39(14): 164–172. (in Chinese)
Liu X, Yi S, Li L, Cheng X H, Wang C. Semantic segmentation of terrace image regions based on lightweight CNN-transformer hybrid networks. Transactions of the CSAE, 2023; 39(13): 171–181. (in Chinese)
Fang Y X, Yang S S, Wang X G, Li Y, Fang C, Shan Y, et al. Instances as queries. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021; pp.6910–6919.
Kirillov A, Wu Y, He K, Girshick R. Pointrend: Image segmentation as rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020; pp.9799–9808.
Cai Z Y, Jin C, Da F. DMDC: Dynamic-mask-based dual camera design for snapshot Hyperspectral Imaging. arXiv preprint, 2023; arXiv: 2308.01541.
Copyright (c) 2024 International Journal of Agricultural and Biological Engineering
This work is licensed under a Creative Commons Attribution 4.0 International License.