Ge-Peng Ji, Guobao Xiao, Yu-Cheng Chou, Deng-Ping Fan, Kai Zhao, Geng Chen, Luc Van Gool. Video Polyp Segmentation: A Deep Learning Perspective. Machine Intelligence Research, vol. 19, no. 6, pp.531-549, 2022. https://doi.org/10.1007/s11633-022-1371-y

Video Polyp Segmentation: A Deep Learning Perspective

doi: 10.1007/s11633-022-1371-y
  • Author Bio:

    Ge-Peng Ji received the M. Sc. degree in communication and information systems from Wuhan University, China in 2021. He is currently a Ph. D. student at the Australian National University, supervised by Professor Nick Barnes, majoring in Engineering and Computer Science. He has published about 10 peer-reviewed journal and conference papers. In 2021, he received the Student Travel Award from the Medical Image Computing and Computer-Assisted Intervention Society. His research interests lie in computer vision, especially in a variety of dense prediction tasks, such as video analysis, medical image segmentation, camouflaged object segmentation, and saliency detection. E-mail: gepengai.ji@gmail.com ORCID iD: 0000-0001-7092-2877

    Guobao Xiao received the Ph. D. degree in computer science and technology from Xiamen University, China in 2016. From 2016 to 2018, he was a postdoctoral fellow at the School of Aerospace Engineering, Xiamen University, China. He is currently a professor at Minjiang University, China. He has published over 50 papers in international journals and conferences, including TPAMI/TIP/TITS/TIE/TMM, IJCV, PR, ICCV, ECCV, etc. He has received the best Ph. D. thesis award in Fujian Province and the best Ph. D. thesis award from the China Society of Image and Graphics (a total of ten winners in China). He has also served on the program committee (PC) of CVPR, ICCV, ECCV, etc. His research interests include machine learning, computer vision and pattern recognition. E-mail: x-gb@163.com ORCID iD: 0000-0003-2928-8100

    Yu-Cheng Chou received the B. Sc. degree in software engineering from the School of Computer Science, Wuhan University, China in 2022. He is currently a visiting student at Johns Hopkins University, supervised by Zongwei Zhou and Prof. Alan Yuille. His research interests include medical imaging, causality, and computer vision, especially developing novel methodologies to detect lesions accurately and exploring explainability through causality for computer-aided diagnosis and surgery. E-mail: johnson111788@gmail.com ORCID iD: 0000-0002-9334-2899

    Deng-Ping Fan received the Ph. D. degree from Nankai University, China in 2019. He joined the Inception Institute of Artificial Intelligence (IIAI), UAE in 2019. He is a postdoctoral researcher, working with Prof. Luc Van Gool in the Computer Vision Laboratory, ETH Zürich, Switzerland. He has published about 50 papers in top journals and conferences such as TPAMI, IJCV, TIP, TNNLS, TMI, CVPR, ICCV, ECCV, IJCAI, etc. He won the Best Paper Finalist Award at IEEE CVPR 2019, and the Best Paper Award Nominee at IEEE CVPR 2020. He was recognized as the CVPR 2019 outstanding reviewer with a special mention award, the CVPR 2020 outstanding reviewer, the ECCV 2020 high-quality reviewer, and the CVPR 2021 outstanding reviewer. He has served as a program committee board (PCB) member of IJCAI 2022–2024, a senior program committee (SPC) member of IJCAI 2021, a program committee (PC) member of CAD&CG 2021, a committee member of the China Society of Image and Graphics (CSIG), an area chair for the NeurIPS 2021 Datasets and Benchmarks Track, and an area chair for a MICCAI 2020 workshop. His research interests include computer vision, deep learning, and saliency detection. E-mail: dengpfan@gmail.com ORCID iD: 0000-0002-5245-7518 (Corresponding author)

    Kai Zhao received the B. Sc. and M. Sc. degrees from Shanghai University, China in 2014 and 2017, respectively, and the Ph. D. degree from the College of Computer Science, Nankai University, China in 2020. He is currently a postdoctoral researcher at the University of California, USA. He has over 10 peer-reviewed publications in computer vision and machine learning-related areas, including TPAMI, TIP, NeurIPS, ICCV, CVPR, ECCV and IJCAI. His research interests include computer vision and machine intelligence. E-mail: kz@kaizhao.net ORCID iD: 0000-0002-2496-0829

    Geng Chen received the Ph. D. degree from Northwestern Polytechnical University, China in 2016. He was a research scientist at the Inception Institute of Artificial Intelligence, UAE from 2019 to 2021, and a postdoctoral research associate at the University of North Carolina at Chapel Hill, USA from 2016 to 2019. He is a professor at Northwestern Polytechnical University, China. He has published over 60 papers in peer-reviewed international conference proceedings and journals. His research interests include medical image analysis and computer vision. E-mail: geng.chen.cs@gmail.com ORCID iD: 0000-0001-8350-6581

    Luc Van Gool received the B. Eng. degree in electromechanical engineering from the Katholieke Universiteit Leuven in 1981. Currently, he is a professor at the Katholieke Universiteit Leuven in Belgium and ETH Zürich, Switzerland. He leads computer vision research at both places and also teaches at both. He has been a program committee member of several major computer vision conferences. He received several Best Paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science committee. He is a co-founder of 10 spin-off companies. His research interests include 3D reconstruction and modeling, object recognition, tracking, gesture analysis, and a combination of those. E-mail: vangool@vision.ee.ethz.ch ORCID iD: 0000-0002-3445-5711

  • Corresponding author: Deng-Ping Fan (E-mail: dengpfan@gmail.com)
  • Received Date: 2022-07-03
  • Accepted Date: 2022-08-24
  • Publish Online: 2022-11-03
  • Publish Date: 2022-11-22
  • We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era. Progress in VPS has been slow over the years because of the lack of a large-scale dataset with fine-grained segmentation annotations. To address this issue, we first introduce a high-quality, frame-by-frame annotated VPS dataset, named SUN-SEG, which contains 158,690 colonoscopy video frames from the well-known SUN-database. We provide additional annotations of diverse types, i.e., attribute, object mask, boundary, scribble, and polygon. Second, we design a simple but efficient baseline, named PNS+, which consists of a global encoder, a local encoder, and normalized self-attention (NS) blocks. The global and local encoders receive an anchor frame and multiple successive frames, respectively, to extract long-term and short-term spatial-temporal representations, which are then progressively refined by two NS blocks. Extensive experiments show that PNS+ achieves the best performance and real-time inference speed (170 fps), making it a promising solution for the VPS task. Third, we extensively evaluate 13 representative polyp/object segmentation models on our SUN-SEG dataset and provide attribute-based comparisons. Finally, we discuss several open issues and suggest possible research directions for the VPS community. Our project and dataset are publicly available at https://github.com/GewelsJI/VPS.

     

  • 1 These statistics come from http://amed8k.sundatabase.org/ and differ from those reported in the original paper[3]. In addition, the SUN-database is available only for non-commercial research or educational use and can be accessed free of charge with permission from its authors.
    2 Complete descriptions of the annotations are available at https://github.com/GewelsJI/VPS/blob/main/docs/DATA_DESCRIPTION.md.
    3 "Seen" denotes that the samples in the testing dataset come from the same cases as the training set, whereas "unseen" indicates that the scenario does not exist in the training set.
    † These authors contributed equally to this work.
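
The seen/unseen distinction in footnote 3 can be illustrated with a minimal sketch. It assumes each test clip carries a case identifier; the field names (`clip`, `case_id`) and the example identifiers are hypothetical and are not taken from the SUN-SEG release.

```python
def split_seen_unseen(test_clips, train_case_ids):
    """Partition test clips by whether their case also appears in training.

    `test_clips` is a list of dicts with a hypothetical "case_id" key;
    `train_case_ids` is any iterable of case identifiers used for training.
    """
    train_cases = set(train_case_ids)
    seen = [c for c in test_clips if c["case_id"] in train_cases]
    unseen = [c for c in test_clips if c["case_id"] not in train_cases]
    return seen, unseen


# Hypothetical identifiers, for illustration only.
train_case_ids = ["case1", "case2"]
test_clips = [
    {"clip": "clip_a", "case_id": "case1"},  # case also used for training
    {"clip": "clip_b", "case_id": "case7"},  # case absent from training
]

seen, unseen = split_seen_unseen(test_clips, train_case_ids)
print([c["clip"] for c in seen], [c["clip"] for c in unseen])
```

Splitting by case rather than by frame keeps frames of one patient case from being counted as unseen when other frames of that case were used for training.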
  • [1]
    J. Bernal, J. Sánchez, F. Vilariño. Towards automatic polyp detection with a polyp appearance model. Pattern Recognition, vol. 45, no. 9, pp. 3166–3182, 2012. DOI: 10.1016/j.patcog.2012.03.002.
    [2]
    J. G. B. Puyal, K. K. Bhatia, P. Brandao, O. F. Ahmad, D. Toth, R. Kader, L. Lovat, P. Mountney, D. Stoyanov. Endoscopic polyp segmentation using a hybrid 2D/3D CNN. In Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Lima, Peru, pp. 295–305, 2020. DOI: 10.1007/978-3-030-59725-2_29.
    [3]
    M. Misawa, S. E. Kudo, Y. Mori, K. Hotta, K. Ohtsuka, T. Matsuda, S. Saito, T. Kudo, T. Baba, F. Ishida, H. Itoh, M. Oda, K. Mori. Development of a computer-aided detection system for colonoscopy and a publicly accessible large colonoscopy video database (with video). Gastrointestinal Endoscopy, vol. 93, no. 4, pp. 960–967, 2021. DOI: 10.1016/j.gie.2020.07.060.
    [4]
    G. P. Ji, Y. C. Chou, D. P. Fan, G. Chen, H. Z. Fu, D. Jha, L. Shao. Progressively normalized self-attention network for video polyp segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 142–152, 2021. DOI: 10.1007/978-3-030-87193-2_14.
    [5]
    J. Silva, A. Histace, O. Romain, X. Dray, B. Granado. Toward embedded detection of polyps in WCE images for early diagnosis of colorectal cancer. International Journal of Computer Assisted Radiology and Surgery, vol. 9, no. 2, pp. 283–293, 2014. DOI: 10.1007/s11548-013-0926-3.
    [6]
    J. Bernal, F. J. Sánchez, G. Fernández-Esparrach, D. Gil, C. Rodríguez, F. Vilariño. WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Computerized Medical Imaging and Graphics, vol. 43, pp. 99–111, 2015. DOI: 10.1016/j.compmedimag.2015.02.007.
    [7]
    P. Mesejo, D. Pizarro, A. Abergel, O. Rouquette, S. Beorchia, L. Poincloux, A. Bartoli. Computer-aided classification of gastrointestinal lesions in regular colonoscopy. IEEE Transactions on Medical Imaging, vol. 35, no. 9, pp. 2051–2063, 2016. DOI: 10.1109/TMI.2016.2547947.
    [8]
    N. Tajbakhsh, S. R. Gurudu, J. M. Liang. Automated polyp detection in colonoscopy videos using shape and context information. IEEE Transactions on Medical Imaging, vol. 35, no. 2, pp. 630–644, 2016. DOI: 10.1109/TMI.2015.2487997.
    [9]
    Gastrointestinal Image ANAlysis (GIANA) Challenge. [Online], Available: https://endovissub2017-giana.grand-challenge.org/.
    [10]
    D. Vázquez, J. Bernal, F. J. Sánchez, G. Fernández-Esparrach, A. M. López, A. Romero, M. Drozdzal, A. Courville. A benchmark for endoluminal scene segmentation of colonoscopy images. Journal of Healthcare Engineering, vol. 2017, Article number 4037190, 2017. DOI: 10.1155/2017/4037190.
    [11]
    A. Koulaouzidis, D. K. Iakovidis, D. E. Yung, E. Rondonotti, U. Kopylov, J. N. Plevris, E. Toth, A. Eliakim, G. W. Johansson, W. Marlicz, G. Mavrogenis, A. Nemeth, H. Thorlacius, G. E. Tontini. Kid project: An internet-based digital video atlas of capsule endoscopy for research purposes. Endoscopy International Open, vol. 5, no. 6, pp. E477–E483, 2017. DOI: 10.1055/s-0043-105488.
    [12]
    D. K. Iakovidis, S. V. Georgakopoulos, M. Vasilakakis, A. Koulaouzidis, V. P. Plagianakos. Detecting and locating gastrointestinal anomalies using deep learning and iterative cluster unification. IEEE Transactions on Medical Imaging, vol. 37, no. 10, pp. 2196–2210, 2018. DOI: 10.1109/TMI.2018.2837002.
    [13]
    K. Pogorelov, K. R. Randel, C. Griwodz, S. L. Eskeland, T. De Lange, D. Johansen, C. Spampinato, D. T. Dang-Nguyen, M. Lux, P. T. Schmidt, M. Riegler, P. Halvorsen. KVASIR: A multi-class image dataset for computer aided gastrointestinal disease detection. In Proceedings of the 8th ACM on Multimedia Systems Conference, Taipei, China, pp. 164–169, 2017. DOI: 10.1145/3083187.3083212.
    [14]
    S. Ali, N. Ghatwary, B. Braden, D. Lamarque, A. Bailey, S. Realdon, R. Cannizzaro, J. Rittscher, C. Daul, J. East. Endoscopy disease detection challenge 2020. [Online], Available: https://arxiv.org/abs/2003.03376, 2020.
    [15]
    H. Borgli, V. Thambawita, P. H. Smedsrud, S. Hicks, D. Jha, S. L. Eskeland, K. R. Randel, K. Pogorelov, M. Lux, D. T. D. Nguyen, D. Johansen, C. Griwodz, H. K. Stensland, E. Garcia-Ceja, P. T. Schmidt, H. L. Hammer, M. A. Riegler, P. Halvorsen, T. De Lange. HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Scientific Data, vol. 7, no. 1, Article number 283, 2020. DOI: 10.1038/s41597-020-00622-y.
    [16]
    D. Jha, P. H. Smedsrud, M. A. Riegler, P. Halvorsen, T. De Lange, D. Johansen, H. D. Johansen. Kvasir-SEG: A segmented polyp dataset. In Proceedings of the 26th International Conference on Multimedia Modeling, Springer, Daejeon, Korea, pp. 451–462, 2020. DOI: 10.1007/978-3-030-37734-2_37.
    [17]
    L. F. Sánchez-Peralta, J. B. Pagador, A. Picón, Á. J. Calderón, F. Polo, N. Andraka, R. Bilbao, B. Glover, C. L. Saratxaga, F. M. Sánchez-Margallo. PICCOLO white-light and narrow-band imaging colonoscopic dataset: A performance comparative of models and datasets. Applied Sciences, vol. 10, no. 23, Article number 8501, 2020. DOI: 10.3390/app10238501.
    [18]
    P. H. Smedsrud, V. Thambawita, S. A. Hicks, H. Gjestang, O. O. Nedrejord, E. Næss, H. Borgli, D. Jha, T. J. D. Berstad, S. L. Eskeland, M. Lux, H. Espeland, A. Petlund, D. T. D. Nguyen, E. Garcia-Ceja, D. Johansen, P. T. Schmidt, E. Toth, H. L. Hammer, T. De Lange, M. A. Riegler, P. Halvorsen. Kvasir-capsule, a video capsule endoscopy dataset. Scientific Data, vol. 8, no. 1, Article number 142, 2021. DOI: 10.1038/s41597-021-00920-z.
    [19]
    W. Wang, J. G. Tian, C. W. Zhang, Y. H. Luo, X. Wang, J. Li. An improved deep learning approach and its applications on colonic polyp images detection. BMC Medical Imaging, vol. 20, no. 1, Article number 83, 2020. DOI: 10.1186/s12880-020-00482-3.
    [20]
    Y. T. Ma, X. J. Chen, K. Cheng, Y. Li, B. Sun. LDPolypVideo benchmark: A large-scale colonoscopy video dataset of diverse polyps. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 387–396, 2021. DOI: 10.1007/978-3-030-87240-3_37.
    [21]
    K. D. Li, M. I. Fathan, K. Patel, T. X. Zhang, C. C. Zhong, A. Bansal, A. Rastogi, J. S. Wang, G. H. Wang. Colonoscopy polyp detection and classification: Dataset creation and comparative evaluations. PLoS One, vol. 16, no. 8, Article number e0255809, 2021. DOI: 10.1371/journal.pone.0255809.
    [22]
    S. Ali, D. Jha, N. Ghatwary, S. Realdon, R. Cannizzaro, O. E. Salem, D. Lamarque, C. Daul, M. A. Riegler, K. V. Anonsen, A. Petlund, P. Halvorsen, J. Rittscher, T. De Lange, J. E. East. Polypgen: A multi-center polyp detection and segmentation dataset for generalisability assessment. [Online], Available: https://arxiv.org/abs/2106.04463, 2021.
    [23]
    B. V. Dhandra, R. Hegadi, M. Hangarge, V. S. Malemath. Analysis of abnormality in endoscopic images using combined hsi color space and watershed segmentation. In Proceedings of the 18th International Conference on Pattern Recognition, IEEE, Hong Kong, China, pp. 695–698, 2006. DOI: 10.1109/ICPR.2006.268.
    [24]
    A. V. Mamonov, I. N. Figueiredo, P. N. Figueiredo, Y. H. R. Tsai. Automated polyp detection in colon capsule endoscopy. IEEE Transactions on Medical Imaging, vol. 33, no. 7, pp. 1488–1502, 2014. DOI: 10.1109/TMI.2014.2314959.
    [25]
    O. H. Maghsoudi. Superpixel based segmentation and classification of polyps in wireless capsule endoscopy. In Proceedings of the Signal Processing in Medicine and Biology Symposium, IEEE, Philadelphia, USA, 2017. DOI: 10.1109/SPMB.2017.8257027.
    [26]
    L. Q. Yu, H. Chen, Q. Dou, J. Qin, P. A. Heng. Integrating online and offline three-dimensional deep learning for automated polyp detection in colonoscopy videos. IEEE Journal of Biomedical and Health Informatics, vol. 21, no. 1, pp. 65–75, 2017. DOI: 10.1109/JBHI.2016.2637004.
    [27]
    W. Tavanapong, J. Oh, M. A. Riegler, M. Khaleel, B. Mittal, P. C. De Groen. Artificial intelligence for colonoscopy: Past, present, and future. IEEE Journal of Biomedical and Health Informatics, vol. 26, no. 8, pp. 3950–3965, 2022. DOI: 10.1109/JBHI.2022.3160098.
    [28]
    H. Gammulle, S. Denman, S. Sridharan, C. Fookes. Two-stream deep feature modelling for automated video endoscopy data analysis. In Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Lima, Peru, pp. 742–751, 2020. DOI: 10.1007/978-3-030-59716-0_71.
    [29]
    G. Carneiro, L. Z. C. T. Pu, R. Singh, A. Burt. Deep learning uncertainty and confidence calibration for the five-class polyp classification from colonoscopy. Medical Image Analysis, vol. 62, Article number 101653, 2020. DOI: 10.1016/j.media.2020.101653.
    [30]
    R. K. Zhang, Y. L. Zheng, C. C. Y. Poon, D. G. Shen, J. Y. W. Lau. Polyp detection during colonoscopy using a regression-based convolutional neural network with a tracker. Pattern Recognition, vol. 83, pp. 209–219, 2018. DOI: 10.1016/j.patcog.2018.05.026.
    [31]
    L. Y. Wu, Z. Q. Hu, Y. F. Ji, P. Luo, S. T. Zhang. Multi-frame collaboration for effective endoscopic video polyp detection via spatial-temporal feature transformation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 302–312, 2021. DOI: 10.1007/978-3-030-87240-3_29.
    [32]
    P. Brandao, E. Mazomenos, G. Ciuti, R. Caliò, F. Bianchi, A. Menciassi, P. Dario, A. Koulaouzidis, A. Arezzo, D. Stoyanov. Fully convolutional neural networks for polyp segmentation in colonoscopy. In Proceedings of the SPIE 10134, Medical Imaging 2017: Computer-aided Diagnosis, Orlando, USA, pp. 101–107, 2017. DOI: 10.1117/12.2254361.
    [33]
    M. Akbari, M. Mohrekesh, E. Nasr-Esfahani, S. M. R. Soroushmehr, N. Karimi, S. Samavi, K. Najarian. Polyp segmentation in colonoscopy images using fully convolutional network. In Proceedings of the 40th Annual International Conference of the Engineering in Medicine and Biology Society, Honolulu, USA, pp. 69–72, 2018. DOI: 10.1109/EMBC.2018.8512197.
    [34]
    O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention, Springer, Munich, Germany, pp. 234–241, 2015. DOI: 10.1007/978-3-319-24574-4_28.
    [35]
    Z. W. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, J. M. Liang. UNet++: Redesigning skip connections to exploit multiscale features in image segmentation. IEEE Transactions on Medical Imaging, vol. 39, no. 6, pp. 1856–1867, 2020. DOI: 10.1109/TMI.2019.2959609.
    [36]
    D. Jha, P. H. Smedsrud, M. A. Riegler, D. Johansen, T. De Lange, P. Halvorsen, H. D. Johansen. ResUNet++: An advanced architecture for medical image segmentation. In Proceedings of the International Symposium on Multimedia, IEEE, San Diego, USA, pp. 225–2255, 2019. DOI: 10.1109/ISM46123.2019.00049.
    [37]
    J. F. Zhong, W. Wang, H. S. Wu, Z. K. Wen, J. Qin. PolypSeg: An efficient context-aware network for polyp segmentation from colonoscopy videos. In Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Lima, Peru, pp. 285–294, 2020. DOI: 10.1007/978-3-030-59725-2_28.
    [38]
    R. F. Zhang, G. B. Li, Z. Li, S. G. Cui, D. H. Qian, Y. Z. Yu. Adaptive context selection for polyp segmentation. In Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Lima, Peru, pp. 253–262, 2020. DOI: 10.1007/978-3-030-59725-2_25.
    [39]
    D. Jha, S. Ali, N. K. Tomar, H. D. Johansen, D. Johansen, J. Rittscher, M. A. Riegler, P. Halvorsen. Real-time polyp detection, localization and segmentation in colonoscopy using deep learning. IEEE Access, vol. 9, pp. 40496–40510, 2021. DOI: 10.1109/ACCESS.2021.3063716.
    [40]
    H. S. Wu, J. F. Zhong, W. Wang, Z. K. Wen, J. Qin. Precise yet efficient semantic calibration and refinement in convnets for real-time polyp segmentation from colonoscopy videos. In Proceedings of AAAI Conference on Artificial Intelligence, Palo Alto, USA, pp. 2916–2924, 2021.
    [41]
    J. Wei, Y. W. Hu, R. M. Zhang, Z. Li, S. K. Zhou, S. G. Cui. Shallow attention network for polyp segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 699–708, 2021. DOI: 10.1007/978-3-030-87193-2_66.
    [42]
    X. Q. Zhao, L. H. Zhang, H. C. Lu. Automatic polyp segmentation via multi-scale subtraction network. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 120–130, 2021. DOI: 10.1007/978-3-030-87193-2_12.
    [43]
    B. Murugesan, K. Sarveswaran, S. M. Shankaranarayana, K. Ram, J. Joseph, M. Sivaprakasam. PSI-Net: Shape and boundary aware joint multi-task deep network for medical image segmentation. In Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Berlin, Germany, pp. 7223–7226, 2019. DOI: 10.1109/EMBC.2019.8857339.
    [44]
    R. X. Wang, S. Y. Chen, C. J. Ji, J. P. Fan, Y. Li. Boundary-aware context neural network for medical image segmentation. Medical Image Analysis, vol. 78, Article number 102395, 2022. DOI: 10.1016/j.media.2022.102395.
    [45]
    Y. Q. Fang, C. Chen, Y. X. Yuan, K. Y. Tong. Selective feature aggregation network with area-boundary constraints for polyp segmentation. In Proceedings of the 22nd International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Shenzhen, China, pp. 302–310, 2019. DOI: 10.1007/978-3-030-32239-7_34.
    [46]
    Y. T. Shen, X. Jia, M. Q. H. Meng. HRENet: A hard region enhancement network for polyp segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 559–568, 2021. DOI: 10.1007/978-3-030-87193-2_53.
    [47]
    G. P. Ji, L. Zhu, M. C. Zhuge, K. R. Fu. Fast camouflaged object detection via edge-based reversible re-calibration network. Pattern Recognition, vol. 123, Article number 108414, 2022. DOI: 10.1016/j.patcog.2021.108414.
    [48]
    D. P. Fan, G. P. Ji, T. Zhou, G. Chen, H. Z. Fu, J. B. Shen, L. Shao. PraNet: Parallel reverse attention network for polyp segmentation. In Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Lima, Peru, pp. 263–273, 2020. DOI: 10.1007/978-3-030-59725-2_26.
    [49]
    T. C. Nguyen, T. P. Nguyen, G. H. Diep, A. H. Tran-Dinh, T. V. Nguyen, M. T. Tran. CCBANet: Cascading context and balancing attention for polyp segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 633–643, 2021. DOI: 10.1007/978-3-030-87193-2_60.
    [50]
    M. J. Cheng, Z. S. Kong, G. L. Song, Y. H. Tian, Y. S. Liang, J. Chen. Learnable oriented-derivative network for polyp segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 720–730, 2021. DOI: 10.1007/978-3-030-87193-2_68.
    [51]
    T. Kim, H. Lee, D. Kim. UACANet: Uncertainty augmented context attention for polyp segmentation. In Proceedings of the 29th ACM International Conference on Multimedia, Chengdu, China, pp. 2167–2175, 2021. DOI: 10.1145/3474085.3475375.
    [52]
    F. Shamshad, S. Khan, S. W. Zamir, M. H. Khan, M. Hayat, F. S. Khan, H. Z. Fu. Transformers in medical imaging: A survey. [Online], Available: https://arxiv.org/abs/2201.09873, 2022.
    [53]
    Y. D. Zhang, H. Y. Liu, Q. Hu. Transfuse: Fusing transformers and CNNs for medical image segmentation. In Proceedings of the 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Springer, Strasbourg, France, pp. 14–24, 2021. DOI: 10.1007/978-3-030-87193-2_2.
    [54]
    S. H. Li, X. C. Sui, X. D. Luo, X. X. Xu, Y. Liu, R. Goh. Medical image segmentation using squeeze-and-expansion transformers. In Proceedings of the 30th International Joint Conference on Artificial Intelligence, Montreal, Canada, pp. 807–815, 2021. DOI: 10.24963/ijcai.2021/112.
    [55]
    W. H. Wang, E. Z. Xie, X. Li, D. P. Fan, K. T. Song, D. Liang, T. Lu, P. Luo, L. Shao. PVT v2: Improved baselines with pyramid vision transformer. Computational Visual Media, vol. 8, no. 3, pp. 415–424, 2022. DOI: 10.1007/s41095-022-0274-8.
    [56]
    B. Dong, W. H. Wang, D. P. Fan, J. P. Li, H. Z. Fu, L. Shao. Polyp-PVT: Polyp segmentation with pyramid vision transformers. [Online], Available: https://arxiv.org/abs/2108.06932, 2021.
    [57]
    D. P. Fan, G. P. Ji, G. L. Sun, M. M. Cheng, J. B. Shen, L. Shao. Camouflaged object detection. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 2777–2787, 2020. DOI: 10.1109/CVPR42600.2020.00285.
    [58]
    U. Ramer. An iterative procedure for the polygonal approximation of plane curves. Computer Graphics and Image Processing, vol. 1, no. 3, pp. 244–256, 1972. DOI: 10.1016/S0146-664X(72)80017-0.
    [59]
    D. P. Fan, J. Zhang, G. Xu, M. M. Cheng, L. Shao. Salient objects in clutter. IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: 10.1109/TPAMI.2022.3166451.
    [60]
    D. P. Fan, Z. Lin, Z. Zhang, M. L. Zhu, M. M. Cheng. Rethinking RGB-D salient object detection: Models, data sets, and large-scale benchmarks. IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 5, pp. 2075–2089, 2021. DOI: 10.1109/TNNLS.2020.2996406.
    [61]
    X. L. Wang, R. Girshick, A. Gupta, K. M. He. Non-local neural networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7794–7803, 2018. DOI: 10.1109/CVPR.2018.00813.
    [62]
    Y. C. Gu, L. J. Wang, Z. Q. Wang, Y. Liu, M. M. Cheng, S. P. Lu. Pyramid constrained self-attention network for fast video salient object detection. In Proceedings of AAAI Conference on Artificial Intelligence, Palo Alto, USA, vol. 34, pp. 10869–10876, 2020. DOI: 10.1609/aaai.v34i07.6718.
    [63]
    L. T. Guo, J. Liu, X. X. Zhu, P. Yao, S. C. Lu, H. Q. Lu. Normalized and geometry-aware self-attention network for image captioning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 10327–10336, 2020. DOI: 10.1109/CVPR42600.2020.01034.
    [64]
    J. L. Ba, J. R. Kiros, G. E. Hinton. Layer normalization. [Online], Available: https://arxiv.org/abs/1607.06450, 2016.
    [65]
    S. H. Gao, M. M. Cheng, K. Zhao, X. Y. Zhang, M. H. Yang, P. Torr. Res2Net: A new multi-scale backbone architecture. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 2, pp. 652–662, 2021. DOI: 10.1109/TPAMI.2019.2938758.
    [66]
    S. T. Liu, D. Huang, Y. H. Wang. Receptive field block net for accurate and fast object detection. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 404–419, 2018. DOI: 10.1007/978-3-030-01252-6_24.
    [67]
    K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, pp. 770–778, 2016. DOI: 10.1109/CVPR.2016.90.
    [68]
    P. Krähenbühl, V. Koltun. Efficient inference in fully connected CRFs with gaussian edge potentials. In Proceedings of the 24th International Conference on Neural Information Processing Systems, ACM, Granada, Spain, pp. 109–117, 2011. DOI: 10.5555/2986459.2986472.
    [69]
    X. K. Lu, W. G. Wang, C. Ma, J. B. Shen, L. Shao, F. Porikli. See more, know more: Unsupervised video object segmentation with co-attention Siamese networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 3618–3627, 2019. DOI: 10.1109/CVPR.2019.00374.
    [70]
    T. F. Zhou, J. W. Li, S. Z. Wang, R. Tao, J. B. Shen. MATNet: Motion-attentive transition network for zero-shot video object segmentation. IEEE Transactions on Image Processing, vol. 29, pp. 8326–8338, 2020. DOI: 10.1109/TIP.2020.3013162.
    [71]
    R. T. Liu, Z. R. Wu, S. X. Yu, S. Lin. The emergence of objectness: Learning zero-shot segmentation from videos. In Proceedings of the Advances in Neural Information Processing Systems, online, pp. 13137–13152, 2021.
    [72]
    M. Zhang, J. Liu, Y. F. Wang, Y. Piao, S. Y. Yao, W. Ji, J. Li, H. C. Lu, Z. X. Luo. Dynamic context-sensitive filtering network for video salient object detection. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 1533–1543, 2021. DOI: 10.1109/ICCV48922.2021.00158.
    [73]
    G. P. Ji, K. R. Fu, Z. Wu, D. P. Fan, J. B. Shen, L. Shao. Full-duplex strategy for video object segmentation. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 4902–4913, 2021. DOI: 10.1109/ICCV48922.2021.00488.
    [74]
    R. Achanta, S. Hemami, F. Estrada, S. Susstrunk. Frequency-tuned salient region detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Miami, USA, pp. 1597–1604, 2009. DOI: 10.1109/CVPR.2009.5206596.
    [75]
    D. P. Fan, G. P. Ji, X. B. Qin, M. M. Cheng. Cognitive vision inspired object segmentation metric and loss function. SCIENTIA SINICA Informationis, vol. 51, no. 9, pp. 1475–1489, 2021. DOI: 10.1360/SSI-2020-0370. (in Chinese)
    [76]
    M. M. Cheng, D. P. Fan. Structure-measure: A new way to evaluate foreground maps. International Journal of Computer Vision, vol. 129, no. 9, pp. 2622–2638, 2021. DOI: 10.1007/s11263-021-01490-8.
    [77]
    R. Margolin, L. Zelnik-Manor, A. Tal. How to evaluate foreground maps? In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Columbus, USA, pp. 248–255, 2014. DOI: 10.1109/CVPR.2014.39.
    [78]
    A. Borji, M. M. Cheng, H. Z. Jiang, J. Li. Salient object detection: A benchmark. IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5706–5722, 2015. DOI: 10.1109/TIP.2015.2487833.
    [79]
    D. P. Fan, M. M. Cheng, Y. Liu, T. Li, A. Borji. Structure-measure: A new way to evaluate foreground maps. In Proceedings of IEEE International Conference on Computer Vision, Venice, Italy, pp. 4558–4567, 2017. DOI: 10.1109/ICCV.2017.487.
    [80]
    D. P. Fan, C. Gong, Y. Cao, B. Ren, M. M. Cheng, A. Borji. Enhanced-alignment measure for binary foreground map evaluation. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 698–704, 2018. DOI: 10.24963/ijcai.2018/97.
    [81]
    D. P. Fan, G. P. Ji, M. M. Cheng, L. Shao. Concealed object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: 10.1109/TPAMI.2021.3085766.
    [82]
    G. P. Ji, D. P. Fan, Y. C. Chou, D. Dai, A. Liniger, L. Van Gool. Deep gradient learning for efficient camouflaged object detection. Machine Intelligence Research, to be published. DOI: 10.1007/s11633-022-1365-9.
    [83]
    X. Q. Guo, J. Liu, Y. X. Yuan. Semantic-oriented labeled-to-unlabeled distribution translation for image segmentation. IEEE Transactions on Medical Imaging, vol. 41, no. 2, pp. 434–445, 2022. DOI: 10.1109/TMI.2021.3114329.
    [84]
    I. B. Senkyire, Z. Liu. Supervised and semi-supervised methods for abdominal organ segmentation: A review. International Journal of Automation and Computing, vol. 18, no. 6, pp. 887–914, 2021. DOI: 10.1007/s11633-021-1313-0.
    [85]
    K. Zou, X. D. Yuan, X. J. Shen, M. Wang, H. Z. Fu. TbraTS: Trusted brain tumor segmentation. [Online], Available: https://arxiv.org/abs/2206.09309, 2022.