Rui Jiang, Ruixiang Zhu, Hu Su, Yinlin Li, Yuan Xie, Wei Zou. Deep Learning-based Moving Object Segmentation: Recent Progress and Research Prospects. Machine Intelligence Research, vol. 20, no. 3, pp. 335–369, 2023. https://doi.org/10.1007/s11633-022-1378-4

Deep Learning-based Moving Object Segmentation: Recent Progress and Research Prospects

doi: 10.1007/s11633-022-1378-4
More Information
  • Author Bio:

    Rui Jiang received the B. Sc. degree in computational mathematics from Xi'an University of Technology (XUT), China in 2006, the M. Sc. degree in applied mathematics from Xi'an Jiaotong University (XJTU), China in 2010, and the Ph. D. degree in pattern recognition and intelligent systems from Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2016. She is now an assistant professor with College of Information Engineering, Shanghai Maritime University (SMU), China, and a visiting scholar with School of Computer Science and Technology, East China Normal University, China. She is a member of IEEE. Her research interests include machine learning, pattern recognition and computer vision.
    E-mail: rjiang@shmtu.edu.cn
    ORCID iD: 0000-0001-7195-7887

    Ruixiang Zhu received the B. Eng. degree in software engineering from Shandong University of Technology (SDUT), China in 2020. He is currently a master's student in computer technology with College of Information Engineering, Shanghai Maritime University (SMU), China. His research interests include deep learning and computer vision.
    E-mail: 202030310266@stu.shmtu.edu.cn

    Hu Su received the B. Sc. and M. Sc. degrees in information and computation science from Shandong University (SDU), China in 2007 and 2010, respectively, and the Ph. D. degree in control science and engineering from State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2013. Currently, he is an associate researcher with Research Center of Precision Sensing and Control, CASIA, China. His research interests include intelligent control and optimization, and computer vision.
    E-mail: hu.su@ia.ac.cn (Corresponding author)
    ORCID iD: 0000-0002-0551-3193

    Yinlin Li received the B. Sc. degree in measurement and control technology and instrumentation from Xidian University, China in 2011, and the Ph. D. degree in pattern recognition and intelligent system from Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2016. She is currently an associate professor with State Key Laboratory for Management and Control of Complex Systems, CASIA, China. Her research interests include robotic vision, biologically inspired visual algorithms and robotic manipulation.
    E-mail: yinlin.li@ia.ac.cn

    Yuan Xie received the Ph. D. degree in pattern recognition and intelligent systems from Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2013. He is currently a full professor with School of Computer Science and Technology, East China Normal University, China. He has published around 80 papers in major international journals and conferences, including IJCV, TPAMI, TIP, CVPR, ECCV, ICCV, ICML, NeurIPS, AAAI and IJCAI. He received the Hong Kong Scholar Award from the Society of Hong Kong Scholars and the China National Postdoctoral Council in 2014. His research interests include image processing, computer vision, machine learning and pattern recognition.
    E-mail: yxie@cs.ecnu.edu.cn

    Wei Zou received the B. Sc. degree in control theory and control engineering from Inner Mongolia University of Science and Technology, China in 1997, the M. Sc. degree in control theory and control engineering from Shandong University, China in 2000, and the Ph. D. degree in control theory and control engineering from Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2003. Since 2012, he has been a researcher with Research Center of Precision Sensing and Control, CASIA, China. His research interests include visual control and intelligent robots.
    E-mail: wei.zou@ia.ac.cn

  • Received Date: 2022-06-12
  • Accepted Date: 2022-09-27
  • Publish Online: 2023-04-20
  • Publish Date: 2023-06-01
  • Moving object segmentation (MOS), which aims to segment moving objects from video frames, is an important and challenging task in computer vision with a wide range of applications. With the development of deep learning (DL), MOS has entered the era of deep models built around spatiotemporal feature learning. This paper provides an up-to-date review of DL-based MOS methods proposed during the past three years. Specifically, we present a categorization based on model characteristics, then compare and discuss each category from the perspectives of feature learning (FL), model training, and evaluation. For FL, the reviewed methods are divided into three types: spatial FL, temporal FL, and spatiotemporal FL, and are analyzed with respect to their inputs and model architectures; three input types and four typical preprocessing subnetworks are summarized. For training, we discuss ideas for enhancing model transferability. For evaluation, building on the existing distinction between scene-dependent and scene-independent evaluation, combined with whether the videos used are recorded with static or moving cameras, we derive four subdivided evaluation setups and identify which setup each reviewed method adopts. We also report performance comparisons of representative MOS methods and analyze the technical advantages and disadvantages of the reviewed approaches. Finally, based on these comparisons and discussions, we present research prospects and future directions.
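    To make the task interface concrete, the following minimal Python sketch runs a classical (non-deep) Gaussian-mixture background subtractor from OpenCV over a video and post-processes the resulting foreground mask. It is only an illustrative baseline under assumed settings, not any method reviewed in this survey; the DL-based methods discussed here keep the same frames-in, masks-out contract but replace the hand-crafted background model with learned spatiotemporal features. The video path and parameter values are placeholders.

        import cv2

        # Classical MOS baseline: MOG2 models each pixel as a mixture of Gaussians
        # and flags pixels that do not fit the background model as moving.
        cap = cv2.VideoCapture("input_video.avi")  # placeholder path
        subtractor = cv2.createBackgroundSubtractorMOG2(
            history=500, varThreshold=16, detectShadows=True)

        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # 0 = background, 255 = foreground, 127 = shadow
            mask = subtractor.apply(frame)
            # Drop shadow pixels and remove speckle noise with a morphological opening.
            _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
            kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
            mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
            # `mask` is now a binary moving-object segmentation for this frame.

        cap.release()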

     
