Yang Wu, Ding-Heng Wang, Xiao-Tong Lu, Fan Yang, Man Yao, Wei-Sheng Dong, Jian-Bo Shi, Guo-Qi Li. Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies. Machine Intelligence Research. https://doi.org/10.1007/s11633-022-1340-5

Efficient Visual Recognition: A Survey on Recent Advances and Brain-inspired Methodologies

doi: 10.1007/s11633-022-1340-5
More Information
  • Author Bio:

    Yang Wu received the B. Sc. degree in information and the Ph. D. degree in control science and engineering from Xi′an Jiaotong University, China in 2004 and 2010, respectively. He is currently a principal researcher with Applied Research Center (ARC) Laboratory, Tencent Platform and Content Group (PCG), China. From July 2019 to May 2021, he was a program-specific senior lecturer with Department of Intelligence Science and Technology, Kyoto University, Japan. He was an assistant professor of the Nara Institute of Science and Technology (NAIST) International Collaborative Laboratory for Robotics Vision, NAIST, from December 2014 to June 2019. From 2011 to 2014, he was a program-specific researcher with the Academic Center for Computing and Media Studies, Kyoto University, Japan. His research interests include computer vision, pattern recognition, as well as multimedia content analysis, enhancement and generation. E-mail: dylanywu@tencent.com (Corresponding author) ORCID iD: 0000-0001-8010-6857

    Ding-Heng Wang received the B. Eng. degree in mechanical engineering and automation and the M. Eng. degree in software engineering from Xi′an Jiaotong University, China in 2010 and 2014, respectively, and the Ph. D. degree in control science and engineering from Xi′an Jiaotong University, China in 2022. From 2014 to 2017, he was a software engineer at China Aerospace Science and Industry Corporation Limited, China. His research interests include tensor decomposition, neural network compression, and efficient machine learning models. E-mail: wangdai11@stu.xjtu.edu.cn

    Xiao-Tong Lu received the B. Sc. degree in electronic engineering from Xidian University, China in 2016, where he is currently a Ph. D. degree candidate in intelligent information processing. His research interests include deep learning, compressive sensing, image restoration and deep neural network compression. E-mail: dmptcode@163.com

    Fan Yang received the B. Sc. degree in geographical information systems from Nanjing University, China in 2012, and the M. Sc. degree in information science from Nara Institute of Science and Technology, Japan in 2018. He is currently a Ph. D. degree candidate in information science at Nara Institute of Science and Technology, Japan. His research interest is video processing. E-mail: yang.fan.xv6@is.naist.jp

    Man Yao received the M. Eng. degree in electronic and communication engineering from Xi′an Jiaotong University, China in 2018. He is currently a Ph. D. degree candidate in control science and engineering at Xi′an Jiaotong University, China. Since May 2021, he has been an intern at Peng Cheng Laboratory, China. His research interests include spiking neural networks and dynamic neural networks. E-mail: manyao@stu.xjtu.edu.cn

    Wei-Sheng Dong received the B. Sc. degree in electronic engineering from Huazhong University of Science and Technology, China in 2004, and the Ph. D. degree in circuits and systems from Xidian University, China in 2010. He was a visiting student with Microsoft Research Asia, China in 2006. From 2009 to 2010, he was a research assistant with Department of Computing, Hong Kong Polytechnic University, China. In 2010, he joined School of Electronic Engineering, Xidian University, China as a lecturer, where he has been a professor since 2016. He was a recipient of the Best Paper Award at the SPIE Visual Communication and Image Processing (VCIP) in 2010. He has served as an Associate Editor of IEEE Transactions on Image Processing and is currently an Associate Editor of SIAM Journal on Imaging Sciences. His research interests include inverse problems in image processing, deep learning, and sparse representation. E-mail: wsdong@mail.xidian.edu.cn

    Jian-Bo Shi received the B. A. degree in computer science and mathematics from Cornell University, USA in 1994, and the Ph. D. degree in computer science from the University of California at Berkeley, USA in 1998. He joined The Robotics Institute at Carnegie Mellon University, USA in 1999 as a research faculty member, and in 2003 moved to the University of Pennsylvania, where he is currently a professor of Computer and Information Science. In 2007, he was awarded the Longuet-Higgins Prize for his work on Normalized Cuts. His research focuses on first person vision, human behavior analysis and image recognition-segmentation. His other research interests include image/video retrieval, 3D vision, and vision based desktop computing. His long-term interests center around a broader area of machine intelligence: he wishes to develop a “visual thinking” module that allows computers not only to understand the environment around us, but also to achieve cognitive abilities such as machine memory and learning. E-mail: jshi@seas.upenn.edu

    Guo-Qi Li received the B. Eng. degree in automation from the Xi′an University of Technology, China in 2004, the M. Eng. degree in control engineering from Xi′an Jiaotong University, China in 2007, and the Ph. D. degree in electrical and electronic engineering from Nanyang Technological University, Singapore in 2011. From 2011 to 2014, he was a scientist with the Data Storage Institute and the Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore. From 2014 to 2022, he was an assistant professor and then associate professor at Tsinghua University, China. Since 2022, he has been with the Institute of Automation, Chinese Academy of Sciences and the University of Chinese Academy of Sciences, where he is currently a full professor. He has authored or co-authored more than 150 journal and conference papers. He has been actively involved in professional services, serving as a Tutorial Chair, an International Technical Program Committee Member, a PC Member, a Publication Chair and a Track Chair for several international conferences. He is an Editorial Board Member for Control and Decision, and has served as an Associate Editor for Journal of Control and Decision and Frontiers in Neuroscience: Neuromorphic Engineering. He is a reviewer for Mathematical Reviews published by the American Mathematical Society and serves as a reviewer for a number of prestigious international journals and top AI conferences including ICLR, NeurIPS, ICML, AAAI, etc. He was the recipient of the First Class Prize in Science and Technology of the Chinese Institute of Command and Control in 2018 and the Second Prize of the Fujian Provincial Science and Technology Progress Award in 2020. He received the Outstanding Young Talent Award of the Beijing Natural Science Foundation in 2021. His research interests include brain-inspired intelligence, neuromorphic computing and spiking neural networks. E-mail: guoqi.li@ia.ac.cn (Corresponding author) ORCID iD: 0000-0002-8994-431X

  • Received Date: 2022-04-07
  • Accepted Date: 2022-05-26
  • Rev Recd Date: 2022-05-26
  • Publish Online: 2022-08-18
  • Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial needs. Modern deep neural networks (DNNs) and some brain-inspired methodologies have largely boosted the recognition performance on many concrete tasks, with the help of large amounts of training data and new powerful computation resources. Although recognition accuracy is usually the first concern for new progress, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community. While general surveys on the efficiency issue have been conducted from various perspectives, as far as we are aware, scarcely any of them has focused systematically on visual recognition, and thus it is unclear which progresses are applicable to it and what else should be concerned. In this survey, we review recent advances and offer suggestions on possible new directions towards improving the efficiency of DNN-related and brain-inspired visual recognition approaches, including efficient network compression and dynamic brain-inspired networks. We investigate not only from the model but also from the data point of view (which is not the case in existing surveys), and focus on four typical data types (images, video, points, and events). This survey attempts to provide a systematic summary that can serve as a valuable reference and inspire both researchers and practitioners working on visual recognition problems.

  • [1]
    Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. DOI: 10.1109/5.726791.
    [2]
    G. E. Hinton, R. R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, vol. 313, no. 5786, pp. 504–507, 2006. DOI: 10.1126/science.1127647.
    [3]
    A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, pp. 1106–1114, 2012.
    [4]
    T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: 10.1007/978-3-319-10602-1_48.
    [5]
    J. K. Song, Y. Y. Guo, L. L. Gao, X. L. Li, A. Hanjalic, H. T. Shen. From deterministic to generative: Multimodal stochastic RNNs for video captioning. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3047–3058, 2019. DOI: 10.1109/TNNLS.2018.2851077.
    [6]
    L. L. Gao, X. P. Li, J. K. Song, H. T. Shen. Hierarchical LSTMs with adaptive attention for visual captioning. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 5, pp. 1112–1131, 2020. DOI: 10.1109/TPAMI.2019.2894139.
    [7]
    S. E. Wei, V. Ramakrishna, T. Kanade, Y. Sheikh. Convolutional pose machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 4724–4732, 2016. DOI: 10.1109/CVPR.2016.511.
    [8]
    W. Maass. Networks of spiking neurons: The third generation of neural network models. Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997. DOI: 10.1016/S0893-6080(97)00011-7.
    [9]
    E. Ahmed, A. Saint, A. E. R. Shabayek, K. Cherenkova, R. Das, G. Gusev, D. Aouada, B. Ottersten. A survey on deep learning advances on different 3D data representations. [Online], Available: https://arxiv.org/abs/1808.01462, 2019.
    [10]
    L. Liu, J. Chen, P. Fieguth, G. Y. Zhao, R. Chellappa, M. Pietikäinen. From bow to CNN: Two decades of texture representation for texture classification. International Journal of Computer Vision, vol. 127, no. 1, pp. 74–109, 2019. DOI: 10.1007/s11263-018-1125-z.
    [11]
    L. Liu, W. L. Ouyang, X. G. Wang, P. Fieguth, J. Chen, X. W. Liu, M. Pietikäinen. Deep learning for generic object detection: A survey. International Journal of Computer Vision, vol. 128, no. 2, pp. 261–318, 2020. DOI: 10.1007/s11263-019-01247-4.
    [12]
    G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, D. Scaramuzza. Event-based vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 1, pp. 154–180, 2022. DOI: 10.1109/TPAMI.2020.3008413.
    [13]
    Q. R. Zhang, M. Zhang, T. H. Chen, Z. F. Sun, Y. Z. Ma, B. Yu. Recent advances in convolutional neural network acceleration. Neurocomputing, vol. 323, pp. 37–51, 2019. DOI: 10.1016/j.neucom.2018.09.038.
    [14]
    L. Deng, G. Q. Li, S. Han, L. P. Shi, Y. Xie. Model compression and hardware acceleration for neural networks: A comprehensive survey. Proceedings of the IEEE, vol. 108, no. 4, pp. 485–532, 2020. DOI: 10.1109/JPROC.2020.2976475.
    [15]
    Y. Cheng, D. Wang, P. Zhou, T. Zhang. Model compression and acceleration for deep neural networks: The principles, progress, and challenges. IEEE Signal Processing Magazine, vol. 35, no. 1, pp. 126–136, 2018. DOI: 10.1109/MSP.2017.2765695.
    [16]
    V. Lebedev, V. Lempitsky. Speeding-up convolutional neural networks: A survey. Bulletin of the Polish Academy of Sciences: Technical Sciences, vol. 66, no. 6, pp. 799–810, 2018. DOI: 10.24425/bpas.2018.125927.
    [17]
    T. Elsken, J. H. Metzen, F. Hutter. Neural architecture search: A survey. The Journal of Machine Learning Research, vol. 20, no. 1, pp. 1997–2017, 2019. DOI: 10.5555/3322706.3361996.
    [18]
    Y. Z. Han, G. Huang, S. J. Song, L. Yang, H. H. Wang, Y. L. Wang. Dynamic neural networks: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: 10.1109/TPAMI.2021.3117837.
    [19]
    P. Lichtsteiner, C. Posch, T. Delbruck. A 128×128 120 dB 15 μs latency asynchronous temporal contrast vision sensor. IEEE Journal of Solid-state Circuits, vol. 43, no. 2, pp. 566–576, 2008. DOI: 10.1109/JSSC.2007.914337.
    [20]
    C. Posch, D. Matolin, R. Wohlgenannt. A QVGA 143 dB dynamic range frame-free PWM image sensor with lossless pixel-level video compression and time-domain CDS. IEEE Journal of Solid-state Circuits, vol. 46, no. 1, pp. 259–275, 2011. DOI: 10.1109/JSSC.2010.2085952.
    [21]
    A. Krizhevsky. Learning Multiple Layers of Features from Tiny Images, Master dissertation, University of Toronto, Canada, 2009.
    [22]
    J. Deng, W. Dong, R. Socher, L. J. Li, K. Li, L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Miami, USA, pp. 248–255, 2009. DOI: 10.1109/CVPR.2009.5206848.
    [23]
    Y. Xiang, W. Kim, W. Chen, J. W. Ji, C. Choy, H. Su, R. Mottaghi, L. Guibas, S. Savarese. ObjectNet3D: A large scale database for 3D object recognition. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 160–176, 2016. DOI: 10.1007/978-3-319-46484-8_10.
    [24]
    A. R. Zamir, A. Sax, W. Shen, L. Guibas, J. Malik, S. Savarese. Taskonomy: Disentangling task transfer learning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 3712–3722, 2018. DOI: 10.1109/CVPR.2018.00391.
    [25]
    H. Jhuang, J. Gall, S. Zuffi, C. Schmid, M. J. Black. Towards understanding action recognition. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Sydney, Australia, pp. 3192–3199, 2013. DOI: 10.1109/ICCV.2013.396.
    [26]
    A. Shahroudy, J. Liu, T. T. Ng, G. Wang. NTU RGB+D: A large scale dataset for 3D human activity analysis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1010–1019, 2016. DOI: 10.1109/CVPR.2016.115.
    [27]
    C. H. Liu, Y. Y. Hu, Y. H. Li, S. J. Song, J. Y. Liu. PKU-MMD: A large scale benchmark for continuous multi-modal human action understanding. [Online], Available: https://arxiv.org/abs/1703.07475, 2017.
    [28]
    Y. S. Tang, Y. Tian, J. W. Lu, P. Y. Li, J. Zhou. Deep progressive reinforcement learning for skeleton-based action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 5323–5332, 2018. DOI: 10.1109/CVPR.2018.00558.
    [29]
    J. X. Hou, G. J. Wang, X. H. Chen, J. H. Xue, R. Zhu, H. Z. Yang. Spatial-temporal attention res-TCN for skeleton-based dynamic hand gesture recognition. In Proceedings of Computer Vision, Springer, Munich, Germany, pp. 273–286, 2018. DOI: 10.1007/978-3-030-11024-6_18.
    [30]
    A. X. Chang, T. Funkhouser, L. Guibas, P. Hanrahan, Q. X. Huang, Z. M. Li, S. Savarese, M. Savva, S. R. Song, H. Su, J. X. Xiao, L. Yi, F. Yu. ShapeNet: An information-rich 3D model repository. [Online], Available: https://arxiv.org/abs/1512.03012, 2015.
    [31]
    H. Rebecq, R. Ranftl, V. Koltun, D. Scaramuzza. High speed and high dynamic range video with an event camera. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 6, pp. 1964–1980, 2021. DOI: 10.1109/TPAMI.2019.2963386.
    [32]
    W. S. Cheng, H. Luo, W. Yang, L. Yu, S. S. Chen, W. Li. DET: A high-resolution DVS dataset for lane extraction. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Long Beach, USA, pp. 1666–1675, 2019. DOI: 10.1109/CVPRW.2019.00210.
    [33]
    T. Delbruck, M. Lang. Robotic goalie with 3 ms reaction time at 4% CPU load using event-based dynamic vision sensor. Frontiers in Neuroscience, vol. 7, Article number 223, 2013. DOI: 10.3389/fnins.2013.00223.
    [34]
    A. Amir, B. Taba, D. Berg, T. Melano, J. McKinstry, C. Di Nolfo, T. Nayak, A. Andreopoulos, G. Garreau, M. Mendoza, J. Kusnitz, M. Debole, S. Esser, T. Delbruck, M. Flickner, D. Modha. A low power, fully event-based gesture recognition system. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 7388–7397, 2017. DOI: 10.1109/CVPR.2017.781.
    [35]
    Z. Wu, Z. Xu, R. N. Zhang, S. M. Li. SIFT feature extraction algorithm for image in DCT domain. Applied Mechanics and Materials, vol. 347–350, pp. 2963–2967, 2013. DOI: 10.4028/www.scientific.net/AMM.347-350.2963.
    [36]
    L. Gueguen, A. Sergeev, B. Kadlec, R. Liu, J. Yosinski. Faster neural networks straight from JPEG. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 3937–3948, 2018. DOI: 10.5555/3327144.3327308.
    [37]
    A. Paul, T. Z. Khan, P. Podder, R. Ahmed, M. M. Rahman, M. H. Khan. Iris image compression using wavelets transform coding. In Proceedings of the 2nd International Conference on Signal Processing and Integrated Networks, IEEE, Noida, India, pp. 544–548, 2015. DOI: 10.1109/SPIN.2015.7095407.
    [38]
    O. Rippel, L. Bourdev. Real-time adaptive image compression. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 2922–2930, 2017. DOI: 10.5555/3305890.3305983.
    [39]
    J. Ballé, D. Minnen, S. Singh, S. J. Hwang, N. Johnston. Variational image compression with a scale hyperprior. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, pp. 1–49, 2018.
    [40]
    D. Minnen, G. Toderici, S. Singh, S. J. Hwang, M. Covell. Image-dependent local entropy models for learned image compression. In Proceedings of the 25th IEEE International Conference on Image Processing, IEEE, Athens, Greece, pp. 430–434, 2018. DOI: 10.1109/ICIP.2018.8451502.
    [41]
    D. Minnen, J. Ballé, G. D. Toderici. Joint autoregressive and hierarchical priors for learned image compression. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 10794–10803, 2018. DOI: 10.5555/3327546.3327736.
    [42]
    G. J. Sullivan, J. R. Ohm, W. J. Han, T. Wiegand. Overview of the high efficiency video coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649–1668, 2012. DOI: 10.1109/TCSVT.2012.2221191.
    [43]
    T. Wiegand, G. J. Sullivan, G. Bjontegaard, A. Luthra. Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560–576, 2003. DOI: 10.1109/TCSVT.2003.815165.
    [44]
    T. Chen, H. J. Liu, Q. Shen, T. Yue, X. Cao, Z. Ma. Deepcoder: A deep neural network based video compression. In Proceedings of IEEE Visual Communications and Image Processing, IEEE, St. Petersburg, USA, pp. 1–4, 2017. DOI: 10.1109/VCIP.2017.8305033.
    [45]
    G. Lu, W. L. Ouyang, D. Xu, X. Y. Zhang, Z. Y. Gao, M. T. Sun. Deep Kalman filtering network for video compression artifact reduction. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 591–608, 2018. DOI: 10.1007/978-3-030-01264-9_35.
    [46]
    C. Y. Wu, N. Singhal, P. Krähenbühl. Video compression through image interpolation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 425–440, 2018. DOI: 10.1007/978-3-030-01237-3_26.
    [47]
    X. Z. Zhu, Y. W. Xiong, J. F. Dai, L. Yuan, Y. C. Wei. Deep feature flow for video recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4141–4150, 2017. DOI: 10.1109/CVPR.2017.441.
    [48]
    C. Y. Wu, M. Zaheer, H. X. Hu, R. Manmatha, A. J. Smola, P. Krähenbühl. Compressed video action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6026–6035, 2018. DOI: 10.1109/CVPR.2018.00631.
    [49]
    W. Yan, Y. T. Shao, S. Liu, T. H. Li, Z. Li, G. Li. Deep AutoEncoder-based lossy geometry compression for point clouds. [Online], Available: https://arxiv.org/abs/1905.03691, 2019.
    [50]
    J. Q. Wang, H. Zhu, Z. Ma, T. Chen, H. J. Liu, Q. Shen. Learned point cloud geometry compression. [Online], Available: https://arxiv.org/abs/1909.12037, 2019.
    [51]
    Y. Q. Yang, C. Feng, Y. R. Shen, D. Tian. FoldingNet: Point cloud auto-encoder via deep grid deformation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 206–215, 2018. DOI: 10.1109/CVPR.2018.00029.
    [52]
    M. Yao, H. H. Gao, G. S. Zhao, D. S. Wang, Y. H. Lin, Z. X. Yang, G. Q. Li. Temporal-wise attention spiking neural networks for event streams classification. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montréal, Canada, pp. 10201–10210, 2021. DOI: 10.1109/ICCV48922.2021.01006.
    [53]
    Y. X. Wang, B. W. Du, Y. R. Shen, K. Wu, G. R. Zhao, J. G. Sun, H. K. Wen. EV-Gait: Event-based robust gait recognition using dynamic vision sensors. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 6351–6360, 2019. DOI: 10.1109/CVPR.2019.00652.
    [54]
    Y. Sekikawa, K. Hara, H. Saito. EventNet: Asynchronous recursive event processing. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 3882–3891, 2019. DOI: 10.1109/CVPR.2019.00401.
    [55]
    K. Chitta, J. M. Alvarez, E. Haussmann, C. Farabet. Training data subset search with ensemble active learning. [Online], Available: https://arxiv.org/abs/1905.12737, 2020.
    [56]
    O. Sener, S. Savarese. Active learning for convolutional neural networks: A core-set approach. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [57]
    K. Vodrahalli, K. Li, J. Malik. Are all training examples created equal? An empirical study. [Online], Available: https://arxiv.org/abs/1811.12569, 2018.
    [58]
    V. Birodkar, H. Mobahi, S. Bengio. Semantic redundancies in image-classification datasets: The 10% you don′t need. [Online], Available: https://arxiv.org/abs/1901.11409, 2019.
    [59]
    J. Y. Gao, Z. H. Yang, C. Sun, K. Chen, R. Nevatia. TURN TAP: Temporal unit regression network for temporal action proposals. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 3648–3656, 2017. DOI: 10.1109/ICCV.2017.392.
    [60]
    J. Carreira, A. Zisserman. Quo vadis, action recognition? A new model and the kinetics dataset. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4724–4733, 2017. DOI: 10.1109/CVPR.2017.502.
    [61]
    S. N. Xie, C. Sun, J. Huang, Z. W. Tu, K. Murphy. Rethinking spatiotemporal feature learning: Speed-accuracy trade-offs in video classification. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 318–335, 2018. DOI: 10.1007/978-3-030-01267-0_19.
    [62]
    M. Zolfaghari, K. Singh, T. Brox. ECO: Efficient convolutional network for online video understanding. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 713–730, 2018. DOI: 10.1007/978-3-030-01216-8_43.
    [63]
    S. Yeung, O. Russakovsky, G. Mori, L. Fei-Fei. End-to-end learning of action detection from frame glimpses in videos. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2678–2687, 2016. DOI: 10.1109/CVPR.2016.293.
    [64]
    J. J. Huang, N. N. Li, T. Zhang, G. Li, T. J. Huang, W. Gao. SAP: Self-adaptive proposal model for temporal action detection based on reinforcement learning. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, pp. 6951–6958, 2018. DOI: 10.1609/aaai.v32i1.12229.
    [65]
    S. Y. Lan, R. Panda, Q. Zhu, A. K. Roy-Chowdhury. FFNet: Video fast-forwarding via reinforcement learning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6771–6780, 2018. DOI: 10.1109/CVPR.2018.00708.
    [66]
    H. H. Fan, Z. W. Xu, L. C. Zhu, C. G. Yan, J. J. Ge, Y. Yang. Watching a small portion could be as good as watching all: Towards efficient video classification. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 705–711, 2018. DOI: 10.5555/3304415.3304516.
    [67]
    A. Kar, N. Rai, K. Sikka, G. Sharma. AdaScan: Adaptive scan pooling in deep convolutional neural networks for human action recognition in videos. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5699–5708, 2017. DOI: 10.1109/CVPR.2017.604.
    [68]
    Z. X. Wu, C. M. Xiong, C. Y. Ma, R. Socher, L. S. Davis. AdaFrame: Adaptive frame selection for fast video recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1278–1287, 2019. DOI: 10.1109/CVPR.2019.00137.
    [69]
    J. C. Yang, Q. Zhang, B. B. Ni, L. G. Li, J. X. Liu, M. D. Zhou, Q. Tian. Modeling point clouds with self-attention and gumbel subset sampling. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 3318–3327, 2019. DOI: 10.1109/CVPR.2019.00344.
    [70]
    A. Paigwar, O. Erkent, C. Wolf, C. Laugier. Attentional pointNet for 3D-object detection in point clouds. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Long Beach, USA, pp. 1297–1306, 2019. DOI: 10.1109/CVPRW.2019.00169.
    [71]
    C. Kingkan, J. Owoyemi, K. Hashimoto. Point attention network for gesture recognition using point cloud data. In Proceedings of the 29th British Machine Vision Conference, Newcastle, UK, pp. 1–13, 2018. [Online], Available: https://bmvc2018.org/contents/papers/0427.pdf.
    [72]
    A. Khodamoradi, R. Kastner. O(N)-space spatiotemporal filter for reducing noise in neuromorphic vision sensors. IEEE Transactions on Emerging Topics in Computing, vol. 9, no. 1, pp. 15–23, 2021. DOI: 10.1109/TETC.2017.2788865.
    [73]
    H. J. Liu, C. Brandli, C. H. Li, S. C. Liu, T. Delbruck. Design of a spatiotemporal correlation filter for event-based sensors. In Proceedings of IEEE International Symposium on Circuits and Systems, IEEE, Lisbon, Portugal, pp. 722–725, 2015. DOI: 10.1109/ISCAS.2015.7168735.
    [74]
    V. Padala, A. Basu, G. Orchard. A noise filtering algorithm for event-based asynchronous change detection image sensors on TrueNorth and its implementation on TrueNorth. Frontiers in Neuroscience, vol. 12, pp. 1–14, 2018. DOI: 10.3389/fnins.2018.00118.
    [75]
    N. Tajbakhsh, J. Y. Shin, S. R. Gurudu, R. T. Hurst, C. B. Kendall, M. B. Gotway, J. M. Liang. Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1299–1312, 2016. DOI: 10.1109/TMI.2016.2535302.
    [76]
    U. K. Lopes, J. F. Valiati. Pre-trained convolutional neural networks as feature extractors for tuberculosis detection. Computers in Biology and Medicine, vol. 89, pp. 135–143, 2017. DOI: 10.1016/j.compbiomed.2017.08.001.
    [77]
    O. J. Hénaff, A. Srinivas, J. De Fauw, A. Razavi, C. Doersch, S. M. A. Eslami, A. van den Oord. Data-efficient image recognition with contrastive predictive coding. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, vol. 119, pp. 4182–4192, 2020. DOI: 10.5555/3524938.3525329.
    [78]
    A. S. Razavian, H. Azizpour, J. Sullivan, S. Carlsson. CNN features off-the-shelf: An astounding baseline for recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Columbus, USA, pp. 512–519, 2014. DOI: 10.1109/CVPRW.2014.131.
    [79]
    Y. Wu, J. Qiu, J. Takamatsu, T. Ogasawara. Temporal-enhanced convolutional network for person re-identification. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, New Orleans, USA, pp. 7412–7419, 2018.
    [80]
    H. Bilen, B. Fernando, E. Gavves, A. Vedaldi, S. Gould. Dynamic image networks for action recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3034–3042, 2016. DOI: 10.1109/CVPR.2016.331.
    [81]
    H. Bilen, B. Fernando, E. Gavves, A. Vedaldi. Action recognition with dynamic image networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2799–2813, 2018. DOI: 10.1109/TPAMI.2017.2769085.
    [82]
    F. Yang, Y. Wu, S. Sakti, S. Nakamura. Make skeleton-based action recognition model smaller, faster and better. In Proceedings of ACM Multimedia Asia, ACM, Beijing, China, Article number 31, 2019. DOI: 10.1145/3338533.3366569.
    [83]
    C. Li, Q. Y. Zhong, D. Xie, S. L. Pu. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 786–792, 2018. DOI: 10.5555/3304415.3304527.
    [84]
    C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna. Rethinking the inception architecture for computer vision. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2818–2826, 2016. DOI: 10.1109/CVPR.2016.308.
    [85]
    P. Q. Wang, P. F. Chen, Y. Yuan, D. Liu, Z. H. Huang, X. D. Hou, G. Cottrell. Understanding convolution for semantic segmentation. In Proceedings of IEEE Winter Conference on Applications of Computer Vision, IEEE, Lake Tahoe, USA, pp. 1451–1460, 2018. DOI: 10.1109/WACV.2018.00163.
    [86]
    F. N. Iandola, S. Han, M. W. Moskewicz, K. Ashraf, W. J. Dally, K. Keutzer. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size. In Proceedings of the 5th International Conference on Learning Representations, [Online], Available: https://arxiv.org/abs/1602.07360, 2016.
    [87]
    X. Y. Zhang, X. Y. Zhou, M. X. Lin, J. Sun. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6848–6856, 2018. DOI: 10.1109/CVPR.2018.00716.
    [88]
    F. Juefei-Xu, V. N. Boddeti, M. Savvides. Perturbative neural networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 3310–3318, 2018. DOI: 10.1109/CVPR.2018.00349.
    [89]
    F. Juefei-Xu, V. N. Boddeti, M. Savvides. Local binary convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4284–4293, 2017. DOI: 10.1109/CVPR.2017.456.
    [90]
    Z. Z. Wu, S. M. King. Investigating gated recurrent neural networks for speech synthesis. [Online], Available: https://arxiv.org/abs/1601.02539, 2016.
    [91]
    J. van der Westhuizen, J. Lasenby. The unreasonable effectiveness of the forget gate. [Online], Available: https://arxiv.org/abs/1804.04849, 2018.
    [92]
    H. Sak, A. W. Senior, F. Beaufays. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. In Proceedings of the 15th Annual Conference of the International Speech Communication Association, Singapore, pp. 338–342, 2014.
    [93]
    Y. H. Wu, M. Schuster, Z. F. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, J. Klingner, A. Shah, M. Johnson, X. B. Liu, Ł. Kaiser, S. Gouws, Y. Kato, T. Kudo, H. Kazawa, K. Stevens, G. Kurian, N. Patil, W. Wang, C. Young, J. Smith, J. Riesa, A. Rudnick, O. Vinyals, G. Corrado, M. Hughes, J. Dean. Google′s neural machine translation system: Bridging the gap between human and machine translation. [Online], Available: https://arxiv.org/abs/1609.08144, 2016.
    [94]
    B. Zoph, Q. V. Le. Neural architecture search with reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [95]
    E. Real, A. Aggarwal, Y. P. Huang, Q. V. Le. Regularized evolution for image classifier architecture search. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, USA, pp. 4780–4789, 2019. DOI: 10.1609/aaai.v33i01.33014780.
    [96]
    K. Kandasamy, W. Neiswanger, J. Schneider, B. Póczos, E. P. Xing. Neural architecture search with Bayesian optimisation and optimal transport. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 2020–2029, 2018. DOI: 10.5555/3326943.3327130.
    [97]
    H. Cai, L. G. Zhu, S. Han. ProxylessNAS: Direct neural architecture search on target task and hardware. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [98]
    M. Astrid, S. I. Lee. CP-decomposition with tensor power method for convolutional neural networks compression. In Proceedings of IEEE International Conference on Big Data and Smart Computing, IEEE, Jeju, Korea, pp. 115–118, 2017. DOI: 10.1109/BIGCOMP.2017.7881725.
    [99]
    J. T. Chien, Y. T. Bao. Tensor-factorized neural networks. IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 5, pp. 1998–2011, 2018. DOI: 10.1109/TNNLS.2017.2690379.
    [100]
    J. M. Ye, L. N. Wang, G. X. Li, D. Chen, S. D. Zhe, X. Q. Chu, Z. L. Xu. Learning compact recurrent neural networks with block-term tensor decomposition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9378–9387, 2018. DOI: 10.1109/CVPR.2018.00977.
    [101]
    A. Novikov, D. Podoprikhin, A. Osokin, D. P. Vetrov. Tensorizing neural networks. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 442–450, 2015. DOI: 10.5555/2969239.2969289.
    [102]
    T. Garipov, D. Podoprikhin, A. Novikov, D. Vetrov. Ultimate tensorization: Compressing convolutional and FC layers alike. [Online], Available: https://arxiv.org/abs/1611.03214, 2016.
    [103]
    D. H. Wang, G. S. Zhao, G. Q. Li, L. Deng, Y. Wu. Compressing 3DCNNs based on tensor train decomposition. Neural Networks, vol. 131, pp. 215–230, 2020. DOI: 10.1016/j.neunet.2020.07.028.
    [104]
    A. Tjandra, S. Sakti, S. Nakamura. Compressing recurrent neural network with tensor train. In Proceedings of International Joint Conference on Neural Networks, IEEE, Anchorage, USA, pp. 4451–4458, 2017. DOI: 10.1109/IJCNN.2017.7966420.
    [105]
    Y. C. Yang, D. Krompass, V. Tresp. Tensor-train recurrent neural networks for video classification. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 3891–3900, 2017. DOI: 10.5555/3305890.3306083.
    [106]
    Y. Pan, J. Xu, M. L. Wang, J. M. Ye, F. Wang, K. Bai, Z. L. Xu. Compressing recurrent neural networks with tensor ring for action recognition. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence, the 31st Innovative Applications of Artificial Intelligence Conference, and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, USA, pp. 4683–4690, 2019. DOI: 10.1609/aaai.v33i01.33014683.
    [107]
    B. J. Wu, D. H. Wang, G. S. Zhao, L. Deng, G. Q. Li. Hybrid tensor decomposition in neural network compression. Neural Networks, vol. 132, pp. 309–320, 2020. DOI: 10.1016/j.neunet.2020.09.006.
    [108]
    M. Yin, S. Y. Liao, X. Y. Liu, X. D. Wang, B. Yuan. Compressing recurrent neural networks using hierarchical Tucker tensor decomposition. [Online], Available: https://arxiv.org/abs/2005.04366, 2020.
    [109]
    S. Wu, G. Q. Li, F. Chen, L. P. Shi. Training and inference with integers in deep neural networks. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [110]
    Y. K. Yang, L. Deng, S. Wu, T. Y. Yan, Y. Xie, G. Q. Li. Training high-performance and large-scale deep neural networks with full 8-bit integers. Neural Networks, vol. 125, pp. 70–82, 2020. DOI: 10.1016/j.neunet.2019.12.027.
    [111]
    M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 525–542, 2016. DOI: 10.1007/978-3-319-46493-0_32.
    [112]
    Q. Lou, F. Guo, M. Kim, L. T. Liu, L. Jiang. AutoQ: Automated kernel-wise neural network quantization. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [113]
    Y. Y. Lin, C. Sakr, Y. Kim, N. Shanbhag. PredictiveNet: An energy-efficient convolutional neural network via zero prediction. In Proceedings of IEEE International Symposium on Circuits and Systems, IEEE, Baltimore, USA, 2017. DOI: 10.1109/ISCAS.2017.8050797.
    [114]
    M. C. Song, J. C. Zhao, Y. Hu, J. Q. Zhang, T. Li. Prediction based execution on deep neural networks. In Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, IEEE, Los Angeles, USA, pp. 752–763, 2018. DOI: 10.1109/ISCA.2018.00068.
    [115]
    V. Akhlaghi, A. Yazdanbakhsh, K. Samadi, R. K. Gupta, H. Esmaeilzadeh. SnaPEA: Predictive early activation for reducing computation in deep convolutional neural networks. In Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, IEEE, Los Angeles, USA, pp. 662–673, 2018. DOI: 10.1109/ISCA.2018.00061.
    [116]
    W. Wen, C. P. Wu, Y. D. Wang, Y. R. Chen, H. Li. Learning structured sparsity in deep neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 2082–2090, 2016. DOI: 10.5555/3157096.3157329.
    [117]
    J. H. Luo, J. X. Wu, W. Y. Lin. ThiNet: A filter level pruning method for deep neural network compression. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 5068–5076, 2017. DOI: 10.1109/ICCV.2017.541.
    [118]
    S. H. Lin, R. R. Ji, Y. C. Li, C. Deng, X. L. Li. Toward compact ConvNets via structure-sparsity regularized filter pruning. IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 2, pp. 574–588, 2020. DOI: 10.1109/TNNLS.2019.2906563.
    [119]
    B. Y. Liu, M. Wang, H. Foroosh, M. Tappen, M. Penksy. Sparse convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 806–814, 2015. DOI: 10.1109/CVPR.2015.7298681.
    [120]
    W. Wen, C. Xu, C. P. Wu, Y. D. Wang, Y. R. Chen, H. Li. Coordinating filters for faster deep neural networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 658–666, 2017. DOI: 10.1109/ICCV.2017.78.
    [121]
    S. Han, H. Z. Mao, W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding. [Online], Available: https://arxiv.org/abs/1510.00149, 2015.
    [122]
    Y. Choi, M. El-Khamy, J. Lee. Compression of deep convolutional neural networks under joint sparsity constraints. [Online], Available: https://arxiv.org/abs/1805.08303, 2018.
    [123]
    B. W. Pan, W. W. Lin, X. L. Fang, C. Q. Huang, B. L. Zhou, C. W. Lu. Recurrent residual module for fast inference in videos. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 1536–1545, 2018. DOI: 10.1109/CVPR.2018.00166.
    [124]
    S. Han, X. Y. Liu, H. Z. Mao, J. Pu, A. Pedram, M. A. Horowitz, W. J. Dally. EIE: Efficient inference engine on compressed deep neural network. In Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, IEEE, Seoul, Korea, pp. 243–254, 2016. DOI: 10.1109/ISCA.2016.30.
    [125]
    K. Chen, J. Q. Wang, S. Yang, X. C. Zhang, Y. J. Xiong, C. C. Loy, D. H. Lin. Optimizing video object detection via a scale-time lattice. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7814–7823, 2018. DOI: 10.1109/CVPR.2018.00815.
    [126]
    S. Lee, S. Chang, N. Kwak. URNet: User-resizable residual networks with conditional gating module. In Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, USA, pp. 4569–4576, 2020. DOI: 10.1609/aaai.v34i04.5886.
    [127]
    B. Y. Fang, X. Zeng, M. Zhang. NestDNN: Resource-aware multi-tenant on-device deep learning for continuous mobile vision. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, ACM, New Delhi, India, pp. 115–127, 2018. DOI: 10.1145/3241539.3241559.
    [128]
    N. Shazeer, K. Fatahalian, W. R. Mark, R. T. Mullapudi. HydraNets: Specialized dynamic architectures for efficient inference. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8080–8089, 2018. DOI: 10.1109/CVPR.2018.00843.
    [129]
    G. Huang, D. L. Chen, T. H. Li, F. Wu, L. van der Maaten, K. Q. Weinberger. Multi-scale dense networks for resource efficient image classification. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [130]
    Q. S. Guo, Z. P. Yu, Y. C. Wu, D. Liang, H. Y. Qin, J. J. Yan. Dynamic recursive neural network. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 5142–5151, 2019. DOI: 10.1109/CVPR.2019.00529.
    [131]
    G. Huang, S. C. Liu, L. van der Maaten, K. Q. Weinberger. CondenseNet: An efficient DenseNet using learned group convolutions. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 2752–2761, 2018. DOI: 10.1109/CVPR.2018.00291.
    [132]
    B. Yang, G. Bender, Q. V. Le, J. Ngiam. CondConv: Conditionally parameterized convolutions for efficient inference. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, pp. 1307–1318, 2019. DOI: 10.5555/3454287.3454404.
    [133]
    Y. P. Chen, X. Y. Dai, M. C. Liu, D. D. Chen, L. Yuan, Z. C. Liu. Dynamic convolution: Attention over convolution kernels. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 11027–11036, 2020. DOI: 10.1109/CVPR42600.2020.01104.
    [134]
    A. W. Harley, K. G. Derpanis, I. Kokkinos. Segmentation-aware convolutional networks using local attention masks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 5048–5057, 2017. DOI: 10.1109/ICCV.2017.539.
    [135]
    H. Su, V. Jampani, D. Q. Sun, O. Gallo, E. Learned-Miller, J. Kautz. Pixel-adaptive convolutional neural networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 11158–11167, 2019. DOI: 10.1109/CVPR.2019.01142.
    [136]
    K. Roy, A. Jaiswal, P. Panda. Towards spike-based machine intelligence with neuromorphic computing. Nature, vol. 575, no. 7784, pp. 607–617, 2019. DOI: 10.1038/s41586-019-1677-2.
    [137]
    M. Ehrlich, L. Davis. Deep residual learning in the JPEG transform domain. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 3483–3492, 2019. DOI: 10.1109/ICCV.2019.00358.
    [138]
    Z. H. Liu, T. Liu, W. J. Wen, L. Jiang, J. Xu, Y. Z. Wang, G. Quan. DeepN-JPEG: A deep neural network favorable JPEG-based image compression framework. In Proceedings of the 55th ACM/ESDA/IEEE Design Automation Conference, IEEE, San Francisco, USA, pp. 1–6, 2018. DOI: 10.1109/DAC.2018.8465809.
    [139]
    M. Javed, P. Nagabhushan, B. B. Chaudhuri. A review on document image analysis techniques directly in the compressed domain. Artificial Intelligence Review, vol. 50, no. 4, pp. 539–568, 2018. DOI: 10.1007/s10462-017-9551-9.
    [140]
    E. Oyallon, E. Belilovsky, S. Zagoruyko, M. Valko. Compressing the input for CNNs with the first-order scattering transform. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 305–320, 2018. DOI: 10.1007/978-3-030-01240-3_19.
    [141]
    R. Torfason, F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte, L. van Gool. Towards image understanding from deep compression without decoding. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [142]
    T. Chang, B. Tolooshams, D. Ba. RandNet: Deep learning with compressed measurements of images. In Proceedings of the 29th IEEE International Workshop on Machine Learning for Signal Processing, IEEE, Pittsburgh, USA, pp. 1–6, 2019. DOI: 10.1109/MLSP.2019.8918878.
    [143]
    L. D. Chamain, Z. Ding. Faster and accurate classification for JPEG2000 compressed images in networked applications. [Online], Available: https://arxiv.org/abs/1909.05638, 2019.
    [144]
    C. X. Ding, D. C. Tao. Trunk-branch ensemble convolutional neural networks for video-based face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 1002–1014, 2018. DOI: 10.1109/TPAMI.2017.2700390.
    [145]
    L. Pigou, A. van den Oord, S. Dieleman, M. van Herreweghe, J. Dambre. Beyond temporal pooling: Recurrence and temporal convolutions for gesture recognition in video. International Journal of Computer Vision, vol. 126, no. 2–4, pp. 430–439, 2018. DOI: 10.1007/s11263-016-0957-7.
    [146]
    A. Ullah, J. Ahmad, K. Muhammad, M. Sajjad, S. W. Baik. Action recognition in video sequences using deep bi-directional LSTM with CNN features. IEEE Access, vol. 6, pp. 1155–1166, 2018. DOI: 10.1109/ACCESS.2017.2778011.
    [147]
    S. Tulyakov, M. Y. Liu, X. D. Yang, J. Kautz. MoCoGAN: Decomposing motion and content for video generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 1526–1535, 2018. DOI: 10.1109/CVPR.2018.00165.
    [148]
    S. Y. Sun, Z. H. Kuang, L. Sheng, W. L. Ouyang, W. Zhang. Optical flow guided feature: A fast and robust motion representation for video action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 1390–1399, 2018. DOI: 10.1109/CVPR.2018.00151.
    [149]
    G. Lu, W. L. Ouyang, D. Xu, X. Y. Zhang, C. L. Cai, Z. Y. Gao. DVC: An end-to-end deep video compression framework. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 10998–11007, 2019. DOI: 10.1109/CVPR.2019.01126.
    [150]
    A. Habibian, T. van Rozendaal, J. Tomczak, T. Cohen. Video compression with rate-distortion autoencoders. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 7032–7041, 2019. DOI: 10.1109/ICCV.2019.00713.
    [151]
    M. Quach, G. Valenzise, F. Dufaux. Learning convolutional transforms for lossy point cloud geometry compression. In Proceedings of IEEE International Conference on Image Processing, IEEE, Taipei, China, pp. 4320–4324, 2019. DOI: 10.1109/ICIP.2019.8803413.
    [152]
    C. Moenning, N. A. Dodgson. Fast marching farthest point sampling. In Proceedings of the 24th Annual Conference of the European Association for Computer Graphics, Eurographics Association, Granada, Spain, pp. 39–42, 2003.
    [153]
    O. Dovrat, I. Lang, S. Avidan. Learning to sample. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2755–2764, 2019. DOI: 10.1109/CVPR.2019.00287.
    [154]
    R. Q. Charles, H. Su, M. Kaichun, L. J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 77–85, 2017. DOI: 10.1109/CVPR.2017.16.
    [155]
    Y. Zhao, Y. J. Xiong, D. H. Lin. Trajectory convolution for action recognition. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 2208–2219, 2018. DOI: 10.5555/3327144.3327148.
    [156]
    S. Mukherjee, L. Anvitha, T. M. Lahari. Human activity recognition in RGB-D videos by dynamic images. Multimedia Tools and Applications, vol. 79, no. 27, pp. 19797–19801, 2020. DOI: 10.1007/s11042-020-08747-3.
    [157]
    Y. Xiao, J. Chen, Y. C. Wang, Z. G. Cao, J. T. Zhou, X. Bai. Action recognition for depth video using multi-view dynamic images. Information Sciences, vol. 480, pp. 287–304, 2019. DOI: 10.1016/j.ins.2018.12.050.
    [158]
    H. Liu, J. H. Tu, M. Y. Liu. Two-stream 3D convolutional neural network for skeleton-based action recognition. [Online], Available: https://arxiv.org/abs/1705.08106, 2017.
    [159]
    D. Maturana, S. Scherer. VoxNet: A 3D convolutional neural network for real-time object recognition. In Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, Hamburg, Germany, pp. 922–928, 2015. DOI: 10.1109/IROS.2015.7353481.
    [160]
    J. Y. Chang, G. Moon, K. M. Lee. V2V-PoseNet: Voxel-to-voxel prediction network for accurate 3D hand and human pose estimation from a single depth map. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 5079–5088, 2018. DOI: 10.1109/CVPR.2018.00533.
    [161]
    Q. Y. Wang, Y. X. Zhang, J. S. Yuan, Y. L. Lu. Space-time event clouds for gesture recognition: From RGB cameras to event cameras. In Proceedings of IEEE Winter Conference on Applications of Computer Vision, IEEE, Waikoloa, USA, pp. 1826–1835, 2019. DOI: 10.1109/WACV.2019.00199.
    [162]
    M. Denil, B. Shakibi, L. Dinh, M. Ranzato, N. de Freitas. Predicting parameters in deep learning. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, USA, pp. 2148–2156, 2013. DOI: 10.5555/2999792.2999852.
    [163]
    D. H. Wang, B. J. Wu, G. S. Zhao, M. Yao, H. N. Chen, L. Deng, T. Y. Yan, G. Q. Li. Kronecker CP decomposition with fast multiplication for compressing RNNs. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: 10.1109/TNNLS.2021.3105961.
    [164]
    L. Deng, Y. J. Wu, Y. F. Hu, L. Liang, G. Q. Li, X. Hu, Y. F. Ding, P. Li, Y. Xie. Comprehensive SNN compression using ADMM optimization and activity regularization. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: 10.1109/TNNLS.2021.3109064.
    [165]
    A. G. Howard, M. L. Zhu, B. Chen, D. Kalenichenko, W. J. Wang, T. Weyand, M. Andreetto, H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. [Online], Available: https://arxiv.org/abs/1704.04861, 2017.
    [166]
    B. C. Wu, A. Wan, X. Y. Yue, P. Jin, S. C. Zhao, N. Golmant, A. Gholaminejad, J. Gonzalez, K. Keutzer. Shift: A zero FLOP, zero parameter alternative to spatial convolutions. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9127–9135, 2018. DOI: 10.1109/CVPR.2018.00951.
    [167]
    W. J. Luo, Y. J. Li, R. Urtasun, R. Zemel. Understanding the effective receptive field in deep convolutional neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 4905–4913, 2016. DOI: 10.5555/3157382.3157645.
    [168]
    K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [169]
    A. Paszke, A. Chaurasia, S. Kim, E. Culurciello. ENet: A deep neural network architecture for real-time semantic segmentation. [Online], Available: https://arxiv.org/abs/1606.02147, 2016.
    [170]
    M. Holschneider, R. Kronland-Martinet, J. Morlet, P. Tchamitchian. A real-time algorithm for signal analysis with the help of the wavelet transform. In Wavelets: Time-Frequency Methods and Phase Space, J. M. Combes, A. Grossmann, P. Tchamitchian, Eds. Berlin, Germany: Springer, pp. 286–297, 1989. DOI: 10.1007/978-3-642-97177-8_28.
    [171]
    F. Yu, V. Koltun. Multi-scale context aggregation by dilated convolutions. In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016.
    [172]
    J. F. Dai, H. Z. Qi, Y. W. Xiong, Y. Li, G. D. Zhang, H. Hu, Y. C. Wei. Deformable convolutional networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 764–773, 2017. DOI: 10.1109/ICCV.2017.89.
    [173]
    M. Lin, Q. Chen, S. C. Yan. Network in network. [Online], Available: https://arxiv.org/abs/1312.4400, 2013.
    [174]
    C. Szegedy, W. Liu, Y. Q. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going deeper with convolutions. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 1–9, 2015. DOI: 10.1109/CVPR.2015.7298594.
    [175]
    K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 770–778, 2016. DOI: 10.1109/CVPR.2016.90.
    [176]
    K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Identity mappings in deep residual networks. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 630–645, 2016. DOI: 10.1007/978-3-319-46493-0_38.
    [177]
    F. Chollet. Xception: Deep learning with depthwise separable convolutions. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 1800–1807, 2017. DOI: 10.1109/CVPR.2017.195.
    [178]
    M. Sandler, A. Howard, M. L. Zhu, A. Zhmoginov, L. C. Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 4510–4520, 2018. DOI: 10.1109/CVPR.2018.00474.
    [179]
    S. Chen, Y. Liu, X. Gao, Z. Han. MobileFaceNets: Efficient CNNs for accurate real-time face verification on mobile devices. In Proceedings of the 13th Chinese Conference on Biometric Recognition, Springer, Urumqi, China, pp. 428–438, 2018. DOI: 10.1007/978-3-319-97909-0_46.
    [180]
    S. N. Xie, R. Girshick, P. Dollár, Z. W. Tu, K. M. He. Aggregated residual transformations for deep neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5987–5995, 2017. DOI: 10.1109/CVPR.2017.634.
    [181]
    S. Hochreiter, J. Schmidhuber. Long short-term memory. Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997. DOI: 10.1162/neco.1997.9.8.1735.
    [182]
    F. A. Gers, J. Schmidhuber, F. Cummins. Learning to forget: Continual prediction with LSTM. In Proceedings of the 9th International Conference on Artificial Neural Networks, Edinburgh, UK, pp. 850–855, 1999. DOI: 10.1049/cp:19991218.
    [183]
    K. Cho, B. van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, Doha, Qatar, pp. 1724–1734, 2014. DOI: 10.3115/v1/D14-1179.
    [184]
    G. B. Zhou, J. X. Wu, C. L. Zhang, Z. H. Zhou. Minimal gated unit for recurrent neural networks. International Journal of Automation and Computing, vol. 13, no. 3, pp. 226–234, 2016. DOI: 10.1007/s11633-016-1006-2.
    [185]
    A. Kusupati, M. Singh, K. Bhatia, A. Kumar, P. Jain, M. Varma. FastGRNN: A fast, accurate, stable and tiny kilobyte sized gated recurrent neural network. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 9031–9042, 2018. DOI: 10.5555/3327546.3327577.
    [186]
    J. Bradbury, S. Merity, C. M. Xiong, R. Socher. Quasi-recurrent neural networks. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [187]
    S. Z. Zhang, Y. H. Wu, T. Che, Z. H. Lin, R. Memisevic, R. Salakhutdinov, Y. Bengio. Architectural complexity measures of recurrent neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 1822–1830, 2016. DOI: 10.5555/3157096.3157301.
    [188]
    N. Kalchbrenner, I. Danihelka, A. Graves. Grid long short-term memory. [Online], Available: https://arxiv.org/abs/1507.01526, 2015.
    [189]
    M. Fraccaro, S. K. Sønderby, U. Paquet, O. Winther. Sequential neural models with stochastic layers. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, pp. 2207–2215, 2016. DOI: 10.5555/3157096.3157343.
    [190]
    G. Hinton, O. Vinyals, J. Dean. Distilling the knowledge in a neural network. [Online], Available: https://arxiv.org/abs/1503.02531, 2015.
    [191]
    K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, J. Schmidhuber. LSTM: A search space odyssey. IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 10, pp. 2222–2232, 2017. DOI: 10.1109/TNNLS.2016.2582924.
    [192]
    B. Zoph, V. Vasudevan, J. Shlens, Q. V. Le. Learning transferable architectures for scalable image recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8697–8710, 2018. DOI: 10.1109/CVPR.2018.00907.
    [193]
    H. X. Liu, K. Simonyan, Y. M. Yang. DARTS: Differentiable architecture search. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [194]
    A. Rawal, R. Miikkulainen. From nodes to networks: Evolving recurrent neural networks. [Online], Available: https://arxiv.org/abs/1803.04439, 2018.
    [195]
    Z. Zhong, J. J. Yan, W. Wu, J. Shao, C. L. Liu. Practical block-wise neural network architecture generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 2423–2432, 2018. DOI: 10.1109/CVPR.2018.00257.
    [196]
    C. X. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L. J. Li, L. Fei-Fei, A. Yuille, J. Huang, K. Murphy. Progressive neural architecture search. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 19–35, 2018. DOI: 10.1007/978-3-030-01246-5_2.
    [197]
    H. X. Liu, K. Simonyan, O. Vinyals, C. Fernando, K. Kavukcuoglu. Hierarchical representations for efficient architecture search. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [198]
    B. Baker, O. Gupta, N. Naik, R. Raskar. Designing neural network architectures using reinforcement learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [199]
    Z. Zhong, J. J. Yan, W. Wu, J. Shao, C. L. Liu. Practical block-wise neural network architecture generation. [Online], Available: https://arxiv.org/abs/1708.05552, 2017.
    [200]
    H. Cai, J. C. Yang, W. N. Zhang, S. Han, Y. Yu. Path-level network transformation for efficient architecture search. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, pp. 678–687, 2018.
    [201]
    H. Cai, T. Y. Chen, W. N. Zhang, Y. Yu, J. Wang. Efficient architecture search by network transformation. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, the 30th Innovative Applications of Artificial Intelligence Conference, and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, USA, pp. 2787–2794, 2018. DOI: 10.5555/3504035.3504375.
    [202]
    L. X. Xie, A. L. Yuille. Genetic CNN. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1388–1397, 2017. DOI: 10.1109/ICCV.2017.154.
    [203]
    A. Klein, E. Christiansen, K. Murphy, F. Hutter. Towards reproducible neural architecture and hyperparameter search. In Proceedings of the 2nd Reproducibility in Machine Learning Workshop, Stockholm, Sweden, 2018.
    [204]
    M. X. Tan, Q. V. Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 6105–6114, 2019.
    [205]
    T. Elsken, J. Metzen, F. Hutter. Efficient multi-objective neural architecture search via Lamarckian evolution. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [206]
    A. Klein, S. Falkner, S. Bartels, P. Hennig, F. Hutter. Fast Bayesian optimization of machine learning hyperparameters on large datasets. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, pp. 528–536, 2017.
    [207]
    H. Cai, C. Gan, T. Z. Wang, Z. K. Zhang, S. Han. Once-for-all: Train one network and specialize it for efficient deployment. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [208]
    A. Klein, S. Falkner, J. T. Springenberg, F. Hutter. Learning curve prediction with Bayesian neural networks. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [209]
    T. Wei, C. H. Wang, Y. Rui, C. W. Chen. Network morphism. In Proceedings of the 33rd International Conference on Machine Learning, New York, USA, pp. 564–572, 2016.
    [210]
    M. Masana, J. van de Weijer, L. Herranz, A. D. Bagdanov, J. M. Álvarez. Domain-adaptive deep network compression. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 4299–4307, 2017. DOI: 10.1109/ICCV.2017.460.
    [211]
    T. Kumamoto, M. Suzuki, H. Matsueda. Singular-value-decomposition analysis of associative memory in a neural network. Journal of the Physical Society of Japan, vol. 86, no. 2, Article number 24005, 2017. DOI: 10.7566/JPSJ.86.024005.
    [212]
    T. Deb, A. K. Ghosh, A. Mukherjee. Singular value decomposition applied to associative memory of Hopfield neural network. Materials Today: Proceedings, vol. 5, no. 1, pp. 2222–2228, 2018. DOI: 10.1016/j.matpr.2017.09.222.
    [213]
    Z. X. Zou, Z. W. Shi. Ship detection in spaceborne optical image with SVD networks. IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 10, pp. 5832–5845, 2016. DOI: 10.1109/TGRS.2016.2572736.
    [214]
    X. Y. Zhang, J. H. Zou, X. Ming, K. M. He, J. Sun. Efficient and accurate approximations of nonlinear convolutional networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 1984–1992, 2015. DOI: 10.1109/CVPR.2015.7298809.
    [215]
    X. Y. Zhang, J. H. Zou, K. M. He, J. Sun. Accelerating very deep convolutional networks for classification and detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 1943–1955, 2016. DOI: 10.1109/TPAMI.2015.2502579.
    [216]
    Y. Ioannou, D. Robertson, R. Cipolla, A. Criminisi. Deep roots: Improving CNN efficiency with hierarchical filter groups. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5977–5986, 2017. DOI: 10.1109/CVPR.2017.633.
    [217]
    B. Peng, W. M. Tan, Z. Y. Li, S. Zhang, D. Xie, S. L. Pu. Extreme network compression via filter group approximation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 307–323, 2018. DOI: 10.1007/978-3-030-01237-3_19.
    [218]
    G. S. Hu, Y. Hua, Y. Yuan, Z. H. Zhang, Z. Lu, S. S. Mukherjee, T. M. Hospedales, N. M. Robertson, Y. X. Yang. Attribute-enhanced face recognition with neural tensor fusion networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 3764–3773, 2017. DOI: 10.1109/ICCV.2017.404.
    [219]
    J. D. Carroll, J. J. Chang. Analysis of individual differences in multidimensional scaling via an n-way generalization of “Eckart-Young” decomposition. Psychometrika, vol. 35, no. 3, pp. 283–319, 1970. DOI: 10.1007/BF02310791.
    [220]
    L. De Lathauwer, B. De Moor, J. Vandewalle. A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1253–1278, 2000. DOI: 10.1137/S0895479896305696.
    [221]
    L. De Lathauwer, B. De Moor, J. Vandewalle. On the best rank-1 and rank-(R1, R2, ···, RN) approximation of higher-order tensors. SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1324–1342, 2000. DOI: 10.1137/S0895479898346995.
    [222]
    L. R. Tucker. Some mathematical notes on three-mode factor analysis. Psychometrika, vol. 31, no. 3, pp. 279–311, 1966. DOI: 10.1007/BF02289464.
    [223]
    T. G. Kolda, B. W. Bader. Tensor decompositions and applications. SIAM Review, vol. 51, no. 3, pp. 455–500, 2009. DOI: 10.1137/07070111X.
    [224]
    G. Huang, Z. Liu, L. van der Maaten, K. Q. Weinberger. Densely connected convolutional networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 2261–2269, 2017. DOI: 10.1109/CVPR.2017.243.
    [225]
    X. C. Zhang, Z. Z. Li, C. C. Loy, D. H. Lin. PolyNet: A pursuit of structural diversity in very deep networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 3900–3908, 2017. DOI: 10.1109/CVPR.2017.415.
    [226]
    Y. P. Chen, J. N. Li, H. X. Xiao, X. J. Jin, S. C. Yan, J. S. Feng. Dual path networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 4470–4478, 2017. DOI: 10.5555/3294996.3295200.
    [227]
    Y. D. Kim, E. Park, S. Yoo, T. Choi, L. Yang, D. Shin. Compression of deep convolutional neural networks for fast and low power mobile applications. In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016.
    [228]
    J. Kossaifi, A. Khanna, Z. Lipton, T. Furlanello, A. Anandkumar. Tensor contraction layers for parsimonious deep nets. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Honolulu, USA, pp. 1940–1946, 2017. DOI: 10.1109/CVPRW.2017.243.
    [229]
    J. Kossaifi, Z. C. Lipton, A. Kolbeinsson, A. Khanna, T. Furlanello, A. Anandkumar. Tensor regression networks. Journal of Machine Learning Research, vol. 21, no. 123, pp. 1–21, 2020.
    [230]
    M. Janzamin, H. Sedghi, A. Anandkumar. Beating the perils of non-convexity: Guaranteed training of neural networks using tensor methods. [Online], Available: https://arxiv.org/abs/1506.08473, 2016.
    [231]
    V. Lebedev, Y. Ganin, M. Rakhuba, I. V. Oseledets, V. S. Lempitsky. Speeding-up convolutional neural networks using fine-tuned CP-decomposition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [232]
    D. T. Tran, A. Iosifidis, M. Gabbouj. Improving efficiency in convolutional neural networks with multilinear filters. Neural Networks, vol. 105, pp. 328–339, 2018. DOI: 10.1016/j.neunet.2018.05.017.
    [233]
    K. T. Schütt, F. Arbabzadah, S. Chmiela, K. R. Müller, A. Tkatchenko. Quantum-chemical insights from deep tensor neural networks. Nature Communications, vol. 8, no. 1, pp. 1–8, 2017. DOI: 10.1038/ncomms13890.
    [234]
    M. Y. Zhou, Y. P. Liu, Z. Long, L. X. Chen, C. Zhu. Tensor rank learning in CP decomposition via convolutional neural network. Signal Processing: Image Communication, vol. 73, pp. 12–21, 2019. DOI: 10.1016/j.image.2018.03.017.
    [235]
    S. Oymak, M. Soltanolkotabi. End-to-end learning of a convolutional neural network via deep tensor decomposition. [Online], Available: https://arxiv.org/abs/1805.06523, 2018.
    [236]
    L. Grasedyck, D. Kressner, C. Tobler. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen, vol. 36, no. 1, pp. 53–78, 2013. DOI: 10.1002/gamm.201310004.
    [237]
    A. Cichocki, D. Mandic, L. De Lathauwer, G. X. Zhou, Q. B. Zhao, C. Caiafa, H. A. Phan. Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 145–163, 2015. DOI: 10.1109/MSP.2013.2297439.
    [238]
    L. De Lathauwer. Decompositions of a higher-order tensor in block terms – Part II: Definitions and uniqueness. SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 3, pp. 1033–1066, 2008. DOI: 10.1137/070690729.
    [239]
    A. H. Phan, A. Cichocki, P. Tichavský, R. Zdunek, S. Lehky. From basis components to complex structural patterns. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Vancouver, Canada, pp. 3228–3232, 2013. DOI: 10.1109/ICASSP.2013.6638254.
    [240]
    A. H. Phan, A. Cichocki, I. Oseledets, G. G. Calvi, S. Ahmadi-Asl, D. P. Mandic. Tensor networks for latent variable analysis: Higher order canonical polyadic decomposition. IEEE Transactions on Neural Networks and Learning Systems, vol. 31, no. 6, pp. 2174–2188, 2020. DOI: 10.1109/TNNLS.2019.2929063.
    [241]
    W. H. He, Y. J. Wu, L. Deng, G. Q. Li, H. Y. Wang, Y. Tian, W. Ding, W. H. Wang, Y. Xie. Comparing SNNs and RNNs on neuromorphic vision datasets: Similarities and differences. Neural Networks, vol. 132, pp. 108–120, 2020. DOI: 10.1016/j.neunet.2020.08.001.
    [242]
    L. Deng, Y. J. Wu, X. Hu, L. Liang, Y. F. Ding, G. Q. Li, G. S. Zhao, P. Li, Y. Xie. Rethinking the performance comparison between SNNs and ANNs. Neural Networks, vol. 121, pp. 294–307, 2020. DOI: 10.1016/j.neunet.2019.09.005.
    [243]
    A. Cichocki. Tensor networks for dimensionality reduction, big data and deep learning. In Advances in Data Analysis with Computational Intelligence Methods, A. E. Gawęda, J. Kacprzyk, L. Rutkowski, G. G. Yen, Eds., Cham, Switzerland: Springer, pp. 3–49, 2018. DOI: 10.1007/978-3-319-67946-4_1.
    [244]
    A. Pellionisz, R. Llinás. Tensor network theory of the metaorganization of functional geometries in the central nervous system. Neuroscience, vol. 16, no. 2, pp. 245–273, 1985. DOI: 10.1016/0306-4522(85)90001-6.
    [245]
    I. V. Oseledets, E. E. Tyrtyshnikov. Breaking the curse of dimensionality, or how to use SVD in many dimensions. SIAM Journal on Scientific Computing, vol. 31, no. 5, pp. 3744–3759, 2009. DOI: 10.1137/090748330.
    [246]
    I. V. Oseledets. Tensor-train decomposition. SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295–2317, 2011. DOI: 10.1137/090752286.
    [247]
    B. N. Khoromskij. O(d log N)-quantics approximation of N-d tensors in high-dimensional numerical modeling. Constructive Approximation, vol. 34, no. 2, pp. 257–280, 2011. DOI: 10.1007/s00365-011-9131-1.
    [248]
    M. Espig, K. K. Naraparaju, J. Schneider. A note on tensor chain approximation. Computing and Visualization in Science, vol. 15, no. 6, pp. 331–344, 2012. DOI: 10.1007/s00791-014-0218-7.
    [249]
    Q. B. Zhao, M. Sugiyama, L. H. Yuan, A. Cichocki. Learning efficient tensor representations with ring-structured networks. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, Brighton, UK, pp. 8608–8612, 2019. DOI: 10.1109/ICASSP.2019.8682231.
    [250]
    Q. B. Zhao, G. X. Zhou, S. L. Xie, L. Q. Zhang, A. Cichocki. Tensor ring decomposition. [Online], Available: https://arxiv.org/abs/1606.05535, 2016.
    [251]
    W. Hackbusch, S. Kühn. A new scheme for the tensor representation. Journal of Fourier Analysis and Applications, vol. 15, no. 5, pp. 706–722, 2009. DOI: 10.1007/s00041-009-9094-9.
    [252]
    L. Grasedyck. Hierarchical singular value decomposition of tensors. SIAM Journal on Matrix Analysis and Applications, vol. 31, no. 4, pp. 2029–2054, 2010. DOI: 10.1137/090764189.
    [253]
    N. Lee, A. Cichocki. Regularized computation of approximate pseudoinverse of large matrices using low-rank tensor train decompositions. SIAM Journal on Matrix Analysis and Applications, vol. 37, no. 2, pp. 598–623, 2016. DOI: 10.1137/15M1028479.
    [254]
    N. Lee, A. Cichocki. Fundamental tensor operations for large-scale data analysis using tensor network formats. Multidimensional Systems and Signal Processing, vol. 29, no. 3, pp. 921–960, 2018. DOI: 10.1007/s11045-017-0481-0.
    [255]
    N. Cohen, O. Sharir, A. Shashua. On the expressive power of deep learning: A tensor analysis. In Proceedings of the 29th Annual Conference on Learning Theory, New York, USA, pp. 698–728, 2016.
    [256]
    M. Zhu, S. Gupta. To prune, or not to prune: Exploring the efficacy of pruning for model compression. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [257]
    H. T. Huang, L. B. Ni, K. W. Wang, Y. G. Wang, H. Yu. A highly parallel and energy efficient three-dimensional multilayer CMOS-RRAM accelerator for tensorized neural network. IEEE Transactions on Nanotechnology, vol. 17, no. 4, pp. 645–656, 2018. DOI: 10.1109/TNANO.2017.2732698.
    [258]
    J. H. Su, J. L. Li, B. Bhattacharjee, F. R. Huang. Tensorial neural networks: Generalization of neural networks and application to model compression. [Online], Available: https://arxiv.org/abs/1805.10352, 2018.
    [259]
    D. H. Wang, G. S. Zhao, H. N. Chen, Z. X. Liu, L. Deng, G. Q. Li. Nonlinear tensor train format for deep neural network compression. Neural Networks, vol. 144, pp. 320–333, 2021. DOI: 10.1016/j.neunet.2021.08.028.
    [260]
    J. Achterhold, J. M. Köhler, A. Schmeink, T. Genewein. Variational network quantization. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [261]
    C. Leng, H. Li, S. H. Zhu, R. Jin. Extremely low bit neural network: Squeeze the last bit out with ADMM. [Online], Available: https://arxiv.org/abs/1707.09870, 2017.
    [262]
    A. J. Zhou, A. B. Yao, Y. W. Guo, L. Xu, Y. R. Chen. Incremental network quantization: Towards lossless CNNs with low-precision weights. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [263]
    S. Jung, C. Son, S. Lee, J. Son, J. J. Han, Y. Kwak, S. J. Hwang, C. Choi. Learning to quantize deep networks by optimizing quantization intervals with task loss. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4345–4354, 2019. DOI: 10.1109/CVPR.2019.00448.
    [264]
    S. C. Zhou, Y. Z. Wang, H. Wen, Q. Y. He, Y. H. Zou. Balanced quantization: An effective and efficient approach to quantized neural networks. Journal of Computer Science and Technology, vol. 32, no. 4, pp. 667–682, 2017. DOI: 10.1007/s11390-017-1750-y.
    [265]
    Y. Choi, M. El-Khamy, J. Lee. Learning sparse low-precision neural networks with learnable regularization. [Online], Available: https://arxiv.org/abs/1809.00095, 2018.
    [266]
    K. Wang, Z. J. Liu, Y. J. Lin, J. Lin, S. Han. HAQ: Hardware-aware automated quantization with mixed precision. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 8604–8612, 2019. DOI: 10.1109/CVPR.2019.00881.
    [267]
    L. Deng, P. Jiao, J. Pei, Z. Z. Wu, G. Q. Li. GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework. Neural Networks, vol. 100, pp. 49–58, 2018. DOI: 10.1016/j.neunet.2018.01.010.
    [268]
    R. Banner, I. Hubara, E. Hoffer, D. Soudry. Scalable methods for 8-bit training of neural networks. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 5151–5159, 2018. DOI: 10.5555/3327345.3327421.
    [269]
    C. Sakr, N. R. Shanbhag. Per-tensor fixed-point quantization of the back-propagation algorithm. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [270]
    N. G. Wang, J. Choi, D. Brand, C. Y. Chen, K. Gopalakrishnan. Training deep neural networks with 8-bit floating point numbers. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 7686–7695, 2018. DOI: 10.5555/3327757.3327866.
    [271]
    R. Zhao, Y. W. Hu, J. Dotzel, C. De Sa, Z. R. Zhang. Improving neural network quantization without retraining using outlier channel splitting. In Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA, pp. 7543–7552, 2019.
    [272]
    Z. C. Liu, Z. Q. Shen, M. Savvides, K. T. Cheng. ReActNet: Towards precise binary neural network with generalized activation functions. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 143–159, 2020. DOI: 10.1007/978-3-030-58568-6_9.
    [273]
    G. Tej Pratap, R. Kumar, N. S. Pradeep. Hybrid and non-uniform quantization methods using retro synthesis data for efficient inference. In Proceedings of International Joint Conference on Neural Networks, IEEE, Shenzhen, China, 2021. DOI: 10.1109/IJCNN52387.2021.9533724.
    [274]
    C. Gong, Y. Chen, Y. Lu, T. Li, C. Hao, D. M. Chen. VecQ: Minimal loss DNN model compression with vectorized weight quantization. IEEE Transactions on Computers, vol. 70, no. 5, pp. 696–710, 2021. DOI: 10.1109/TC.2020.2995593.
    [275]
    C. Z. Zhu, S. Han, H. Z. Mao, W. J. Dally. Trained ternary quantization. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [276]
    R. P. K. Poudel, U. Bonde, S. Liwicki, C. Zach. ContextNet: Exploring context and detail for semantic segmentation in real-time. [Online], Available: https://arxiv.org/abs/1805.04554, 2018.
    [277]
    R. P. K. Poudel, S. Liwicki, R. Cipolla. Fast-SCNN: Fast semantic segmentation network. In Proceedings of the 30th British Machine Vision Conference, Cardiff, UK, 2019.
    [278]
    M. Courbariaux, Y. Bengio, J. P. David. BinaryConnect: Training deep neural networks with binary weights during propagations. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 3123–3131, 2015.
    [279]
    I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, Y. Bengio. Quantized neural networks: Training neural networks with low precision weights and activations. The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6869–6898, 2017. DOI: 10.5555/3122009.3242044.
    [280]
    S. C. Zhou, Y. X. Wu, Z. K. Ni, X. Y. Zhou, H. Wen, Y. H. Zou. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. [Online], Available: https://arxiv.org/abs/1606.06160, 2016.
    [281]
    M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, Y. Bengio. Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1. [Online], Available: https://arxiv.org/abs/1602.02830, 2016.
    [282]
    K. Weinberger, A. Dasgupta, J. Langford, A. Smola, J. Attenberg. Feature hashing for large scale multitask learning. In Proceedings of the 26th Annual International Conference on Machine Learning, ACM, Montréal, Canada, pp. 1113–1120, 2009. DOI: 10.1145/1553374.1553516.
    [283]
    W. L. Chen, J. T. Wilson, S. Tyree, K. Q. Weinberger, Y. X. Chen. Compressing neural networks with the hashing trick. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, pp. 2285–2294, 2015.
    [284]
    R. Spring, A. Shrivastava. Scalable and sustainable deep learning via randomized hashing. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, Halifax, Canada, pp. 445–454, 2017. DOI: 10.1145/3097983.3098035.
    [285]
    Y. J. Lin, S. Han, H. Z. Mao, Y. Wang, B. Dally. Deep gradient compression: Reducing the communication bandwidth for distributed training. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [286]
    T. W. Chin, C. Zhang, D. Marculescu. Layer-compensated pruning for resource-constrained convolutional neural networks. [Online], Available: https://arxiv.org/abs/1810.00518, 2018.
    [287]
    Y. H. He, J. Lin, Z. J. Liu, H. R. Wang, L. J. Li, S. Han. AMC: AutoML for model compression and acceleration on mobile devices. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 815–832, 2018. DOI: 10.1007/978-3-030-01234-2_48.
    [288]
    X. F. Xu, M. S. Park, C. Brick. Hybrid pruning: Thinner sparse networks for fast inference on edge devices. [Online], Available: https://arxiv.org/abs/1811.00482, 2018.
    [289]
    J. B. Ye, X. Lu, Z. Lin, J. Z. Wang. Rethinking the smaller-norm-less-informative assumption in channel pruning of convolution layers. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [290]
    J. H. Luo, J. X. Wu. AutoPruner: An end-to-end trainable filter pruning method for efficient deep model inference. Pattern Recognition, vol. 107, Article number 107461, 2020. DOI: 10.1016/j.patcog.2020.107461.
    [291]
    X. L. Dai, H. X. Yin, N. K. Jha. NeST: A neural network synthesis tool based on a grow-and-prune paradigm. IEEE Transactions on Computers, vol. 68, no. 10, pp. 1487–1497, 2019. DOI: 10.1109/TC.2019.2914438.
    [292]
    Z. Liu, J. G. Li, Z. Q. Shen, G. Huang, S. M. Yan, C. S. Zhang. Learning efficient convolutional networks through network slimming. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2755–2763, 2017. DOI: 10.1109/ICCV.2017.298.
    [293]
    P. Molchanov, A. Mallya, S. Tyree, I. Frosio, J. Kautz. Importance estimation for neural network pruning. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 11256–11264, 2019. DOI: 10.1109/CVPR.2019.01152.
    [294]
    A. Renda, J. Frankle, M. Carbin. Comparing rewinding and fine-tuning in neural network pruning. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [295]
    G. G. Ding, S. Zhang, Z. Z. Jia, J. Zhong, J. G. Han. Where to prune: Using LSTM to guide data-dependent soft pruning. IEEE Transactions on Image Processing, vol. 30, pp. 293–304, 2021. DOI: 10.1109/TIP.2020.3035028.
    [296]
    M. B. Lin, L. J. Cao, S. J. Li, Q. X. Ye, Y. H. Tian, J. Z. Liu, Q. Tian, R. R. Ji. Filter sketch for network pruning. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: 10.1109/TNNLS.2021.3084206.
    [297]
    M. B. Lin, R. R. Ji, S. J. Li, Y. Wang, Y. J. Wu, F. Y. Huang, Q. X. Ye. Network pruning using adaptive exemplar filters. IEEE Transactions on Neural Networks and Learning Systems, to be published. DOI: 10.1109/TNNLS.2021.3084856.
    [298]
    S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, E. Shelhamer. cuDNN: Efficient primitives for deep learning. [Online], Available: https://arxiv.org/abs/1410.0759, 2014.
    [299]
    X. L. Dai, H. X. Yin, N. K. Jha. Grow and prune compact, fast, and accurate LSTMs. IEEE Transactions on Computers, vol. 69, no. 3, pp. 441–452, 2020. DOI: 10.1109/TC.2019.2954495.
    [300]
    M. H. Zhu, J. Clemons, J. Pool, M. Rhu, S. W. Keckler, Y. Xie. Structurally sparsified backward propagation for faster long short-term memory training. [Online], Available: https://arxiv.org/abs/1806.00512, 2018.
    [301]
    F. Alibart, E. Zamanidoost, D. B. Strukov. Pattern classification by memristive crossbar circuits using ex situ and in situ training. Nature Communications, vol. 4, no. 1, Article number 2072, 2013. DOI: 10.1038/ncomms3072.
    [302]
    Z. Liu, M. J. Sun, T. H. Zhou, G. Huang, T. Darrell. Rethinking the value of network pruning. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [303]
    J. Frankle, M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [304]
    N. Cohen, A. Shashua. Convolutional rectifier networks as generalized tensor decompositions. In Proceedings of the 33rd International Conference on Machine Learning, New York City, USA, pp. 955–963, 2016.
    [305]
    Y. P. Chen, X. J. Jin, B. Y. Kang, J. S. Feng, S. C. Yan. Sharing residual units through collective tensor factorization in deep neural networks. [Online], Available: https://arxiv.org/abs/1703.02180v2, 2017.
    [306]
    S. H. Li, L. Wang. Neural network renormalization group. Physical Review Letters, vol. 121, no. 26, Article number 260601, 2018. DOI: 10.1103/PhysRevLett.121.260601.
    [307]
    G. Evenbly, G. Vidal. Algorithms for entanglement renormalization. Physical Review B, vol. 79, no. 14, Article number 144108, 2009. DOI: 10.1103/PhysRevB.79.144108.
    [308]
    A. S. Morcos, H. N. Yu, M. Paganini, Y. D. Tian. One ticket to win them all: Generalizing lottery ticket initializations across datasets and optimizers. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, Canada, Article number 444, 2019. DOI: 10.5555/3454287.3454731.
    [309]
    H. N. Yu, S. Edunov, Y. D. Tian, A. S. Morcos. Playing the lottery with rewards and multiple languages: Lottery tickets in RL and NLP. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, pp. 1–12, 2020.
    [310]
    E. Malach, G. Yehudai, S. Shalev-Shwartz, O. Shamir. Proving the lottery ticket hypothesis: Pruning is all you need. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, pp. 6682–6691, 2020.
    [311]
    L. Orseau, M. Hutter, O. Rivasplata. Logarithmic pruning is all you need. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, Canada, Article number 246, 2020. DOI: 10.5555/3495724.3495970.
    [312]
    S. K. Ye, T. Y. Zhang, K. Q. Zhang, J. Y. Li, K. D. Xu, Y. F. Yang, F. X. Yu, J. Tang, M. Fardad, S. J. Liu, X. Chen, X. Lin, Y. Z. Wang. Progressive weight pruning of deep neural networks using ADMM. [Online], Available: https://arxiv.org/abs/1810.07378, 2018.
    [313]
    A. Polino, R. Pascanu, D. Alistarh. Model compression via distillation and quantization. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [314]
    P. Jiang, G. Agrawal. A linear speedup analysis of distributed deep learning with sparse and quantized communication. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, Montréal, Canada, pp. 2530–2541, 2018. DOI: 10.5555/3327144.3327178.
    [315]
    G. Tzelepis, A. Asif, S. Baci, S. Cavdar, E. E. Aksoy. Deep neural network compression for image classification and object detection. In Proceedings of the 18th IEEE International Conference on Machine Learning and Applications, IEEE, Boca Raton, USA, pp. 1621–1628, 2019. DOI: 10.1109/ICMLA.2019.00266.
    [316]
    D. Lee, D. H. Wang, Y. K. Yang, L. Deng, G. S. Zhao, G. Q. Li. QTTnet: Quantized tensor train neural networks for 3D object and video recognition. Neural Networks, vol. 144, pp. 420–432, 2021. DOI: 10.1016/j.neunet.2021.05.034.
    [317]
    X. Z. Zhu, J. F. Dai, L. Yuan, Y. C. Wei. Towards high performance video object detection. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7210–7218, 2018. DOI: 10.1109/CVPR.2018.00753.
    [318]
    J. Lin, Y. M. Rao, J. W. Lu, J. Zhou. Runtime neural pruning. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 2178–2188, 2017. DOI: 10.5555/3294771.3294979.
    [319]
    Y. M. Rao, J. W. Lu, J. Lin, J. Zhou. Runtime network routing for efficient image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 10, pp. 2291–2304, 2019. DOI: 10.1109/TPAMI.2018.2878258.
    [320]
    X. T. Gao, Y. R. Zhao, L. Dudziak, R. D. Mullins, C. Z. Xu. Dynamic channel pruning: Feature boosting and suppression. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [321]
    J. H. Yu, L. J. Yang, N. Xu, J. C. Yang, T. S. Huang. Slimmable neural networks. In Proceedings of the 7th International Conference on Learning Representations, New Orleans, USA, 2019.
    [322]
    Z. D. Zhang, C. Jung. Recurrent convolution for compact and cost-adjustable neural networks: An empirical study. [Online], Available: https://arxiv.org/abs/1902.09809, 2019.
    [323]
    S. C. Liu, Y. Y. Lin, Z. M. Zhou, K. M. Nan, H. Liu, J. Z. Du. On-demand deep model compression for mobile devices: A usage-driven model selection framework. In Proceedings of the 16th Annual International Conference on Mobile Systems, Applications, and Services, ACM, Munich, Germany, pp. 389–400, 2018. DOI: 10.1145/3210240.3210337.
    [324]
    T. Bolukbasi, J. Wang, O. Dekel, V. Saligrama. Adaptive neural networks for efficient inference. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 527–536, 2017.
    [325]
    X. Wang, F. Yu, Z. Y. Dou, T. Darrell, J. E. Gonzalez. SkipNet: Learning dynamic routing in convolutional networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 420–436, 2018. DOI: 10.1007/978-3-030-01261-8_25.
    [326]
    A. Ehteshami Bejnordi, R. Krestel. Dynamic channel and layer gating in convolutional neural networks. In Proceedings of the 43rd German Conference on Artificial Intelligence, Springer, Bamberg, Germany, pp. 33–45, 2020. DOI: 10.1007/978-3-030-58285-2_3.
    [327]
    J. Q. Guan, Y. Liu, Q. Liu, J. Peng. Energy-efficient amortized inference with cascaded deep classifiers. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI.org, Stockholm, Sweden, pp. 2184–2190, 2018. DOI: 10.24963/ijcai.2018/302.
    [328]
    H. X. Li, Z. Lin, X. H. Shen, J. Brandt, G. Hua. A convolutional neural network cascade for face detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 5325–5334, 2015. DOI: 10.1109/CVPR.2015.7299170.
    [329]
    R. A. Jacobs, M. I. Jordan, S. J. Nowlan, G. E. Hinton. Adaptive mixtures of local experts. Neural Computation, vol. 3, no. 1, pp. 79–87, 1991. DOI: 10.1162/neco.1991.3.1.79.
    [330]
    A. Veit, S. Belongie. Convolutional networks with adaptive inference graphs. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 3–18, 2018. DOI: 10.1007/978-3-030-01246-5_1.
    [331]
    H. Y. Wang, Z. Q. Qin, S. Y. Li, X. Li. CoDiNet: Path distribution modeling with consistency and diversity for dynamic routing. IEEE Transactions on Pattern Analysis and Machine Intelligence, to be published. DOI: 10.1109/TPAMI.2021.3084680.
    [332]
    J. Hu, L. Shen, G. Sun. Squeeze-and-excitation networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 7132–7141, 2018. DOI: 10.1109/CVPR.2018.00745.
    [333]
    F. Wang, M. Q. Jiang, C. Qian, S. Yang, C. Li, H. G. Zhang, X. G. Wang, X. O. Tang. Residual attention network for image classification. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 6450–6458, 2017. DOI: 10.1109/CVPR.2017.683.
    [334]
    M. Y. Ren, A. Pokrovsky, B. Yang, R. Urtasun. SBNet: Sparse blocks network for fast inference. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8711–8720, 2018. DOI: 10.1109/CVPR.2018.00908.
    [335]
    A. Recasens, P. Kellnhofer, S. Stent, W. Matusik, A. Torralba. Learning to zoom: A saliency-based sampling layer for neural networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 52–67, 2018. DOI: 10.1007/978-3-030-01240-3_4.
    [336]
    Z. R. Yang, Y. H. Xu, W. R. Dai, H. K. Xiong. Dynamic-stride-net: Deep convolutional neural network with dynamic stride. In Proceedings of SPIE 11187, Optoelectronic Imaging and Multimedia Technology VI, SPIE, Hangzhou, China, Article number 1118707, 2019. DOI: 10.1117/12.2537799.
    [337]
    W. H. Wu, D. L. He, X. Tan, S. F. Chen, Y. Yang, S. L. Wen. Dynamic inference: A new approach toward efficient video action recognition. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Seattle, USA, pp. 2890–2898, 2020. DOI: 10.1109/CVPRW50498.2020.00346.
    [338]
    B. L. Zhou, A. Khosla, A. Lapedriza, A. Oliva, A. Torralba. Learning deep features for discriminative localization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2921–2929, 2016. DOI: 10.1109/CVPR.2016.319.
    [339]
    A. H. Phan, A. Cichocki, P. Tichavský, D. P. Mandic, K. Matsuoka. On revealing replicating structures in multiway data: A novel tensor decomposition approach. In Proceedings of the 10th International Conference on Latent Variable Analysis and Signal Separation, Springer, Tel Aviv, Israel, pp. 297–305, 2012. DOI: 10.1007/978-3-642-28551-6_37.
    [340]
    J. Pei, L. Deng, S. Song, M. G. Zhao, Y. H. Zhang, S. Wu, G. R. Wang, Z. Zou, Z. H. Wu, W. He, F. Chen, N. Deng, S. Wu, Y. Wang, Y. J. Wu, Z. Y. Yang, C. Ma, G. Q. Li, W. T. Han, H. L. Li, H. Q. Wu, R. Zhao, Y. Xie, L. P. Shi. Towards artificial general intelligence with hybrid Tianjic chip architecture. Nature, vol. 572, no. 7767, pp. 106–111, 2019. DOI: 10.1038/s41586-019-1424-8.
    [341]
    P. A. Merolla, J. V. Arthur, R. Alvarez-Icaza, A. S. Cassidy, J. Sawada, F. Akopyan, B. L. Jackson, N. Imam, C. Guo, Y. Nakamura, B. Brezzo, I. Vo, S. K. Esser, R. Appuswamy, B. Taba, A. Amir, M. D. Flickner, W. P. Risk, R. Manohar, D. S. Modha. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science, vol. 345, no. 6197, pp. 668–673, 2014. DOI: 10.1126/science.1254642.
    [342]
    N. Schuch, I. Cirac, D. Pérez-García. PEPS as ground states: Degeneracy and topology. Annals of Physics, vol. 325, no. 10, pp. 2153–2192, 2010. DOI: 10.1016/j.aop.2010.05.008.
    [343]
    A. Hallam, E. Grant, V. Stojevic, S. Severini, A. G. Green. Compact neural networks based on the multiscale entanglement renormalization ansatz. In Proceedings of British Machine Vision Conference, Newcastle, UK, 2018.
    Figures (22) / Tables (7)