Deng-Ping Fan, Ziling Huang, Peng Zheng, Hong Liu, Xuebin Qin, Luc Van Gool. Facial-sketch Synthesis: A New Challenge. Machine Intelligence Research, vol. 19, no. 4, pp. 257–287, 2022. https://doi.org/10.1007/s11633-022-1349-9

Facial-sketch Synthesis: A New Challenge

doi: 10.1007/s11633-022-1349-9
More Information
  • Author Bio:

    Deng-Ping Fan received the Ph. D. degree from Nankai University, China in 2019. He joined the Inception Institute of Artificial Intelligence (IIAI), UAE in 2019. He is a postdoctoral researcher working with Prof. Luc Van Gool in the Computer Vision Laboratory, ETH Zürich, Switzerland. He has published approximately 50 top journal and conference papers in venues such as TPAMI, CVPR, ICCV, and ECCV. He won the Best Paper Finalist Award at IEEE CVPR 2019 and was a Best Paper Award Nominee at IEEE CVPR 2020. He was recognized as a CVPR 2019 outstanding reviewer with a special mention award, a CVPR 2020 outstanding reviewer, an ECCV 2020 high-quality reviewer, and a CVPR 2021 outstanding reviewer. He served as a program committee board (PCB) member of IJCAI 2022–2024, a senior program committee (SPC) member of IJCAI 2021, a committee member of the China Society of Image and Graphics (CSIG), an area chair of the NeurIPS 2021 Datasets and Benchmarks Track, an area chair of the MICCAI 2020 Workshop (OMIA7), and an editorial board member of Computer Vision & AI. His research interests include computer vision, deep learning, and visual attention, especially human vision in co-salient object detection, RGB salient object detection, RGB-D salient object detection, and video salient object detection. E-mail: dengpfan@gmail.com. ORCID iD: 0000-0002-5245-7518

    Ziling Huang received the B. Sc. degree in electrical engineering from North China Electric Power University, China in 2015, and the M. Sc. degree in electrical engineering from Taiwan Tsing Hua University, Taiwan, China in 2020. She is currently a Ph. D. candidate at the Department of Information and Communication Engineering, Graduate School of Information Science and Technology, University of Tokyo, Japan. She was an intern student at the National Institute of Informatics, Japan in 2019, and at ByteDance, China from 2019 to 2020. Her research interests include computer vision and machine learning. E-mail: huangziling@nii.ac.jp. ORCID iD: 0000-0003-3241-7911

    Peng Zheng is a master's student in the visual computing and communication program at Aalto University, Finland and the University of Trento, Italy. He was a research intern at the Inception Institute of Artificial Intelligence (IIAI), UAE from March 2021 to October 2021. He has been a research assistant at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), UAE since January 2022. He serves as a reviewer for IEEE TPAMI. His research interests include computer vision and machine learning, especially common information mining and person search. E-mail: zhengpeng0108@gmail.com. ORCID iD: 0000-0002-4087-5237

    Hong Liu received the Ph. D. degree from Xiamen University, China in 2020. He is now a Japan Society for the Promotion of Science (JSPS) fellowship researcher at the National Institute of Informatics, Japan. He has published more than 20 papers in top journals and conferences such as TPAMI, IJCV, TIP, CVPR, ICCV, ECCV, and ICLR. He was awarded the Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics, the JSPS International Fellowship, and Top-100 Chinese New Stars in Artificial Intelligence by Baidu Scholar. His research interests include large-scale image retrieval, Riemannian-based machine learning, and adversarial learning. E-mail: hliu@nii.ac.jp (Corresponding author). ORCID iD: 0000-0001-5318-6388

    Xuebin Qin received the Ph. D. degree from University of Alberta, Canada in 2020. Since March 2020, he has been a research fellow at the Department of Computer Vision, MBZUAI, UAE. He has published about 10 papers in vision and robotics conferences such as CVPR, ECCV, BMVC, ICPR, WACV, and IROS. His research interests include highly accurate image segmentation, salient object detection, image labeling, detection, and visual tracking. E-mail: xuebin@ualberta.ca (Corresponding author). ORCID iD: 0000-0002-9042-7192

    Luc Van Gool received the Ph. D. degree in electromechanical engineering from Katholieke Universiteit Leuven, Belgium in 1981. Currently, he is a professor at Katholieke Universiteit Leuven, Belgium and ETH Zürich, Switzerland. He leads computer vision research at both places and also teaches at both. He has been a program committee member of several major computer vision conferences. He received several best paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science committee. He is a co-founder of 10 spin-off companies. His interests include 3D reconstruction and modelling, object recognition, tracking, gesture analysis, and the combination of those. E-mail: vangool@vision.ee.ethz.ch. ORCID iD: 0000-0002-3445-5711

  • Received Date: 2022-03-30
  • Accepted Date: 2022-06-14
  • Publish Date: 2022-08-01
  • This paper aims to conduct a comprehensive study on facial-sketch synthesis (FSS). However, due to the high cost of obtaining hand-drawn sketch datasets, there is a lack of a complete benchmark for assessing the development of FSS algorithms over the last decade. We first introduce a high-quality dataset for FSS, named FS2K, which consists of 2,104 image-sketch pairs spanning three types of sketch styles, image backgrounds, lighting conditions, skin colors, and facial attributes. FS2K differs from previous FSS datasets in difficulty, diversity, and scalability, and should thus facilitate the progress of FSS research. Second, we present the largest-scale FSS investigation to date by reviewing 89 classic methods, including 25 handcrafted feature-based facial-sketch synthesis approaches, 29 general translation methods, and 35 image-to-sketch approaches. In addition, we conduct comprehensive experiments on 19 existing cutting-edge models. Third, we present a simple baseline for FSS, named FSGAN. With only two straightforward components, i.e., facial-aware masking and style-vector expansion, our FSGAN surpasses the performance of all previous state-of-the-art models on the proposed FS2K dataset by a large margin. Finally, we conclude with lessons learned over the past years and point out several unsolved challenges. Our code is available at https://github.com/DengPingFan/FSGAN.
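    Since the abstract only names FSGAN's two components, the following is a minimal, hypothetical PyTorch sketch of how facial-aware masking and style-vector expansion might be wired into a generator's feature path. All module names, shapes, and fusion rules below are illustrative assumptions, not the authors' design; the actual implementation is in the repository linked above.

      # Hypothetical sketch only; see https://github.com/DengPingFan/FSGAN for the real FSGAN.
      import torch
      import torch.nn as nn

      class FacialAwareMasking(nn.Module):
          # Assumed form: reweight generator features with a facial-region mask
          # (e.g., from an off-the-shelf face parser), damping non-face regions.
          def forward(self, feats: torch.Tensor, face_mask: torch.Tensor) -> torch.Tensor:
              # feats: (B, C, H, W); face_mask: (B, 1, H, W) with values in [0, 1].
              return feats * face_mask + 0.1 * feats * (1.0 - face_mask)

      class StyleVectorExpansion(nn.Module):
          # Assumed form: project a per-sketch style code and broadcast it over
          # the spatial grid so every location is conditioned on the style.
          def __init__(self, style_dim: int = 8, channels: int = 64):
              super().__init__()
              self.proj = nn.Linear(style_dim, channels)

          def forward(self, feats: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
              s = self.proj(style).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
              return feats + s  # broadcast over the H x W grid

      feats = torch.randn(2, 64, 32, 32)
      mask = torch.rand(2, 1, 32, 32)
      style = torch.randn(2, 8)
      out = StyleVectorExpansion()(FacialAwareMasking()(feats, mask), style)
      print(out.shape)  # torch.Size([2, 64, 32, 32])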


  • 1 Because they want to learn a different style of sketches.
    2 Establishing an FSS dataset drawn by professional artists is more challenging than building other face datasets, e.g., face attribute datasets[31], which is why CUFSF[22], the largest FSS dataset of the past 13 years, contains only ~1K images. Although FS2K is only ~2 times larger than CUFSF, it still took us one year to create such a high-quality dataset.
    3 https://vectorportal.com/
    4 Note that some related works, such as CartoonGAN[102] and pSp[105], belong to general GAN-based models that can be used for either neural style transfer or image-to-image translation. Since we do not specifically review generalized GAN models, we classify a few of them under the neural style transfer task as a quick overview of these methods.
    5 This dataset is for scholarly communication only.
    6 http://www.imdb.com
    7 http://www.unsplash.com
    8 http://www.pexels.com/
    9 http://pngimg.com/
    10 https://www.scfai.edu.cn/english/ is one of the four most prominent art academies in China. The three senior artists are all from its Design Academy.
    11 Fig. 4(a) presents the copy table, which has an LCD backlight. It requires a 100–240 V input and a 0.6 A working current. Its size is A4 (i.e., $300\times 200\times 3.5$ mm), as shown in Fig. 4(b), and the luminous flux is 300–350 lm. Therefore, it has become the most popular copy table product, after the aluminum alloy copy table, for animators (see Fig. 4(c)).
    12 $X_{ske} = F(X_{img}, X_{style})$, where $X_{style}$ denotes the sketch style of the input.
    13 $X_{img} = F(X_{ske})$.
    14 Because the S2I task needs to restore more detailed information from the RGB images, more training epochs are needed.
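    To make the task formulations in footnotes 12–14 concrete, here is a toy PyTorch illustration of the two translation directions; the stub function bodies are placeholders of my own, not the models benchmarked in this survey.

      import torch

      def image_to_sketch(x_img: torch.Tensor, x_style: torch.Tensor) -> torch.Tensor:
          # I2S (footnote 12): X_ske = F(X_img, X_style). A real F is a conditional
          # generator; this stub merely grayscales the photo and shifts it by the style code.
          return x_img.mean(dim=1, keepdim=True) + x_style.view(-1, 1, 1, 1)

      def sketch_to_image(x_ske: torch.Tensor) -> torch.Tensor:
          # S2I (footnote 13): X_img = F(X_ske). Per footnote 14, a real F must restore
          # fine RGB detail (hence longer training); this stub only tiles the channel.
          return x_ske.repeat(1, 3, 1, 1)

      x_img = torch.rand(2, 3, 64, 64)          # batch of RGB face photos
      x_style = torch.zeros(2)                  # one style code per sample (footnote 12)
      x_ske = image_to_sketch(x_img, x_style)   # -> (2, 1, 64, 64)
      x_rec = sketch_to_image(x_ske)            # -> (2, 3, 64, 64)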
    These authors contributed equally to this work.
  • [1]
    X. G. Wang, X. O. Tang. Face photo-sketch synthesis and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 11, pp. 1955–1967, 2009. DOI: 10.1109/TPAMI.2008.222.
    [2]
    R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. APDrawingGAN: Generating artistic portrait drawings from face photos with hierarchical GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 10735–10744, 2019. DOI: 10.1109/CVPR.2019.01100.
    [3]
    H. Koshimizu, M. Tominaga, T. Fujiwara, K. Murakami. On KANSEI facial image processing for computerized facial caricaturing system PICASSO. In Proceedings of IEEE International Conference on Systems, Man, and Cybernetics, IEEE, Tokyo, Japan, pp. 294–299, 1999. DOI: 10.1109/ICSMC.1999.816567.
    [4]
    N. Kumar, A. C. Berg, P. N. Belhumeur, S. K. Nayar. Attribute and simile classifiers for face verification. In Proceedings of IEEE 12th International Conference on Computer Vision, IEEE, Kyoto, Japan, pp. 365–372, 2009. DOI: 10.1109/ICCV.2009.5459250.
    [5]
    H. S. Du, Q. P. Hu, D. F. Qiao, I. Pitas. Robust face recognition via low-rank sparse representation-based classification. International Journal of Automation and Computing, vol. 12, no. 6, pp. 579–587, 2015. DOI: 10.1007/s11633-015-0901-2.
    [6]
    Y. Z. Lu. A novel face recognition algorithm for distinguishing faces with various angles. International Journal of Automation and Computing, vol. 5, no. 2, pp. 193–197, 2008. DOI: 10.1007/s11633-008-0193-x.
    [7]
    V. Jain, E. Learned-Miller. FDDB: A Benchmark for Face Detection in Unconstrained Settings, Technical Report UM-CS-2010-009, Department of Computer Science, University of Massachusetts Amherst, USA, 2010.
    [8]
    Z. P. Zhang, P. Luo, C. C. Loy, X. O. Tang. Facial landmark detection by deep multi-task learning. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 94–108, 2014. DOI: 10.1007/978-3-319-10599-4_7.
    [9]
    A. Bulat, G. Tzimiropoulos. How far are we from solving the 2D & 3D face alignment problem? (and a dataset of 230,000 3D facial landmarks). In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1021–1030, 2017. DOI: 10.1109/ICCV.2017.116.
    [10]
    J. X. Sun, Q. Li, W. N. Wang, J. Zhao, Z. N. Sun. Multi-caption text-to-face synthesis: Dataset and algorithm. In Proceedings of the 29th ACM International Conference on Multimedia, ACM, Chengdu, China, pp. 2290–2298, 2021. DOI: 10.1145/3474085.3475391.
    [11]
    R. Yi, M. F. Xia, Y. J. Liu, Y. K. Lai, P. L. Rosin. Line drawings for face portraits from photos using global and local structure based GANs. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 43, no. 10, pp. 3462–3475, 2021. DOI: 10.1109/TPAMI.2020.2987931.
    [12]
    Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli. Image quality assessment: From error visibility to structural similarity. IEEE Transactions on image Processing, vol. 13, no. 4, pp. 600–612, 2004. DOI: 10.1109/TIP.2003.819861.
    [13]
    J. Y. Zhu, T. Park, P. Isola, A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2242–2251, 2017. DOI: 10.1109/ICCV.2017.244.
    [14]
    M. Y. Liu, T. Breuel, J. Kautz. Unsupervised image-to-image translation networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 700–708, 2017.
    [15]
    T. C. Wang, M. Y. Liu, J. Y. Zhu, A. Tao, J. Kautz, B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional GANs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8798–8807, 2018. DOI: 10.1109/CVPR.2018.00917.
    [16]
    T. Park, M. Y. Liu, T. C. Wang, J. Y. Zhu. Semantic image synthesis with spatially-adaptive normalization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 2332–2341, 2019. DOI: 10.1109/CVPR.2019.00244.
    [17]
    H. Y. Chang, Z. X. Wang, Y. Y. Chuang. Domain-specific mappings for generative adversarial style transfer. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 573–589, 2020. DOI: 10.1007/978-3-030-58598-3_34.
    [18]
    R. F. Chen, W. B. Huang, B. H. Huang, F. C. Sun, B. Fang. Reusing discriminators for encoding: Towards unsupervised image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8165–8174, 2020. DOI: 10.1109/CVPR42600.2020.00819.
    [19]
    H. Y. Lee, H. Y. Tseng, Q. Mao, J. B. Huang, Y. D. Lu, M. Singh, M. H. Yang. DRIT++: Diverse image-to-image translation via disentangled representations. International Journal of Computer Vision, vol. 128, no. 10, pp. 2402–2417, 2020. DOI: 10.1007/s11263-019-01284-z.
    [20]
    D. P. Fan, S. C. Zhang, Y. H. Wu, Y. Liu, M. M. Cheng, B. Ren, P. Rosin, R. R. Ji. Scoot: A perceptual metric for facial sketches. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 5611–5621, 2019. DOI: 10.1109/ICCV.2019.00571.
    [21]
    H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. On matching sketches with digital face images. In Proceedings of the 4th IEEE International Conference on Biometrics: Theory, Applications and Systems, IEEE, Washington DC, USA, 2010. DOI: 10.1109/BTAS.2010.5634507.
    [22]
    W. Zhang, X. G. Wang, X. O. Tang. Coupled information-theoretic encoding for face photo-sketch recognition. In Proceedings of Conference on Computer Vision and Pattern Recognition, IEEE, Colorado Springs, USA, pp. 513–520, 2011. DOI: 10.1109/CVPR.2011.5995324.
    [23]
    X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li. Face sketch-photo synthesis and retrieval using sparse representation. IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 8, pp. 1213–1226, 2012. DOI: 10.1109/TCSVT.2012.2198090.
    [24]
    I. Berger, A. Shamir, M. Mahler, E. Carter, J. Hodgins. Style and abstraction in portrait sketching. ACM Transactions on Graphics, vol. 32, no. 4, Article number 55, 2013. DOI: 10.1145/2461912.2461964.
    [25]
    R. Yi, Y. J. Liu, Y. K. Lai, P. L. Rosin. Unpaired portrait drawing generation via asymmetric cycle mapping. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8214–8222, 2020. DOI: 10.1109/CVPR42600.2020.00824.
    [26]
    C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Face recognition from multiple stylistic sketches: Scenarios, datasets, and evaluation. Pattern Recognition, vol. 84, pp. 262–272, 2018. DOI: 10.1016/j.patcog.2018.07.014.
    [27]
    A. M. Martinez, R. Benavente. The AR Face Database, CVC Technical Report 24, CVC, Spain, 1998.
    [28]
    N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis under multi-dictionary sparse representation framework. In Proceedings of 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 82–87, 2011. DOI: 10.1109/ICIG.2011.112.
    [29]
    S. C. Zhang, R. R. Ji, J. Hu, X. Q. Lu, X. L. Li. Face sketch synthesis by multidomain adversarial learning. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 5, pp. 1419–1428, 2019. DOI: 10.1109/TNNLS.2018.2869574.
    [30]
    M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. Knowledge distillation for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 2, pp. 893–906, 2022. DOI: 10.1109/TNNLS.2020.3030536.
    [31]
    Z. W. Liu, P. Luo, X. G. Wang, X. O. Tang. Deep learning face attributes in the wild. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Santiago, Chile, pp. 3730–3738, 2015. DOI: 10.1109/ICCV.2015.425.
    [32]
    J. Kim, M. Kim, H. Kang, K. Lee. U-GAT-IT: Unsupervised generative attentional networks with adaptive layer-instance normalization for image-to-image translation. In Proceedings of the 8th International Conference on Learning Representations, Addis Ababa, Ethiopia, 2020.
    [33]
    P. Isola, J. Y. Zhu, T. H. Zhou, A. A. Efros. Image-to-image translation with conditional adversarial networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5967–5976, 2017. DOI: 10.1109/CVPR.2017.632.
    [34]
    K. Messer, J. Matas, J. Kittler, K. Jonsson, J. Luettin, G. Maitre. XM2VTSDB: The extended M2VTS database. In Proceedings of the 2nd International Conference on Audio and Video-based Biometric Person Authentication, Springer, Washington DC, USA, pp. 965–966, 1999.
    [35]
    P. J. Phillips, H. Moon, S. A. Rizvi, P. J. Rauss. The FERET evaluation methodology for face-recognition algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000. DOI: 10.1109/34.879790.
    [36]
    Á. Serrano, I. M. De Diego, C. Conde, E. Cabello, L. L. Shen, L. Bai. Influence of wavelet frequency and orientation in an SVM-based parallel Gabor PCA face verification system. In Proceedings of the 8th International Conference on Intelligent Data Engineering and Automated Learning, Springer, Birmingham, UK, pp. 219–228, 2007. DOI: 10.1007/978-3-540-77226-2_23.
    [37]
    H. S. Bhatt, S. Bharadwaj, R. Singh, M. Vatsa. Memetically optimized MCWLD for matching sketches with digital face images. IEEE Transactions on Information Forensics and Security, vol. 7, no. 5, pp. 1522–1535, 2012. DOI: 10.1109/TIFS.2012.2204252.
    [38]
    M. Minear, D. C. Park. A lifespan database of adult facial stimuli. Behavior Research Methods, Instruments, & Computers, vol. 36, no. 4, pp. 630–633, 2004. DOI: 10.3758/BF03206543.
    [39]
    J. Nishino, T. Kamyama, H. Shira, T. Odaka, H. Ogura. Linguistic knowledge acquisition system on facial caricature drawing system. In Proceedings of IEEE International Fuzzy Systems Conference, IEEE, Seoul, Korea, pp. 1591–1596, 1999. DOI: 10.1109/FUZZY.1999.790142.
    [40]
    S. Iwashita, Y. Takeda, T. Onisawa. Expressive facial caricature drawing. In Proceedings of IEEE International Fuzzy Systems Conference, IEEE, Seoul, Korea, pp. 1597–1602, 1999. DOI: 10.1109/FUZZY.1999.790143.
    [41]
    Y. Z. Li, H. Kobatake. Extraction of facial sketch image based on morphological processing. In Proceedings of International Conference on Image Processing, IEEE, Santa Barbara, USA, pp. 316–319, 1997. DOI: 10.1109/ICIP.1997.632104.
    [42]
    M. Tominaga, S. Fukuoka, K. Murakami, H. Koshimizu. Facial caricaturing with motion caricaturing in PICASSO system. In Proceedings of IEEE/ASME International Conference on Advanced Intelligent Mechatronics, IEEE, Tokyo, Japan, pp. 30, 1997. DOI: 10.1109/AIM.1997.652888.
    [43]
    S. E. Brennan. Caricature Generator, Ph. D. dissertation, Massachusetts Institute of Technology, USA, 1982.
    [44]
    N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. A comprehensive survey to face hallucination. International Journal of Computer Vision, vol. 106, no. 1, pp. 9–30, 2014. DOI: 10.1007/s11263-013-0645-9.
    [45]
    H. Chen, Y. Q. Xu, H. Y. Shum, S. C. Zhu, N. N. Zheng. Example-based facial sketch generation with non-parametric sampling. In Proceedings of the 8th IEEE International Conference on Computer Vision, IEEE, Vancouver, Canada, pp. 433–438, 2001. DOI: 10.1109/ICCV.2001.937657.
    [46]
    A. V. Nefian, M. H. Hayes III. Face recognition using an embedded HMM. In Proceedings of IEEE Conference on Audio and Video-based Biometric Person Authentication, IEEE, Washington DC, USA, 1999.
    [47]
    X. B. Gao, J. J. Zhong, J. Li, C. N. Tian. Face sketch synthesis algorithm based on E-HMM and selective ensemble. IEEE Transactions on Circuits and Systems for Video Technology, vol. 18, no. 4, pp. 487–496, 2008. DOI: 10.1109/TCSVT.2008.918770.
    [48]
    M. Eitz, J. Hays, M. Alexa. How do humans sketch objects? ACM Transactions on Graphics, vol. 31, no. 4, Article number 44, 2012. DOI: 10.1145/2185520.2185540.
    [49]
    T. Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 740–755, 2014. DOI: 10.1007/978-3-319-10602-1_48.
    [50]
    O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. H. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, F. F. Li. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252, 2015. DOI: 10.1007/s11263-015-0816-y.
    [51]
    M. Cimpoi, S. Maji, I. Kokkinos, S. Mohamed, A. Vedaldi. Describing textures in the wild. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 3606–3613, 2014. DOI: 10.1109/CVPR.2014.461.
    [52]
    S. Y. Duck. Painter by Numbers, wikiart.org, [Online], Available: https://www.kaggle.com/c/painter-by-numbers, 2016.
    [53]
    M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, B. Schiele. The cityscapes dataset for semantic urban scene understanding. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3213–3223, 2016. DOI: 10.1109/CVPR.2016.350.
    [54]
    R. Tyleček, R. Šára. Spatial pattern templates for recognition of objects with regular structure. In Proceedings of the 35th German Conference on Pattern Recognition, Springer, Saarbrücken, Germany, pp. 364–374, 2013. DOI: 10.1007/978-3-642-40602-7_39.
    [55]
    J. Y. Zhu, P. Krähenbühl, E. Shechtman, A. A. Efros. Generative visual manipulation on the natural image manifold. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 597–613, 2016. DOI: 10.1007/978-3-319-46454-1_36.
    [56]
    A. Yu, K. Grauman. Fine-grained visual comparisons with local learning. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 192–199, 2014. DOI: 10.1109/CVPR.2014.32.
    [57]
    P. Y. Laffont, Z. Ren, X. F. Tao, C. Qian, J. Hays. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics, vol. 33, no. 4, Article number 149, 2014. DOI: 10.1145/2601097.2601101.
    [58]
    Y. Lecun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proceedings of IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. DOI: 10.1109/5.726791.
    [59]
    C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie. The Caltech-UCSD Birds-200-2011 dataset, 2011. [Online], Available: https://authors.library.caltech.edu/27452/1/CUB_200_2011.pdf.
    [60]
    T. Karras, T. Aila, S. Laine, J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [61]
    N. Silberman, D. Hoiem, P. Kohli, R. Fergus. Indoor segmentation and support inference from RGBD images. In Proceedings of the 12th European Conference on Computer Vision, Springer, Florence, Italy, pp. 746–760, 2012. DOI: 10.1007/978-3-642-33715-4_54.
    [62]
    B. L. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso, A. Torralba. Scene parsing through ADE20K dataset. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 5122–5130, 2017. DOI: 10.1109/CVPR.2017.544.
    [63]
    Q. Yu, Y. Z. Song, T. Xiang, T. M. Hospedales. SketchX! Shoe/chair fine-grained SBIR dataset, 2017. [Online], Available: https://sketchx.eecs.qmul.ac.uk/downloads/.
    [64]
    D. Ha, D. Eck. A neural representation of sketch drawings. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, Canada, 2018.
    [65]
    Y. H. Jin, J. K. Zhang, M. J. Li, Y. T. Tian, H. C. Zhu, Z. H. Fang. Towards the automatic anime characters creation with generative adversarial networks. [Online], Available: https://arxiv.org/pdf/1708.05509, 2017.
    [66]
    H. Z. Xu, Y. Gao, F. Yu, T. Darrell. End-to-end learning of driving models from large-scale video datasets. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 3530–3538, 2017. DOI: 10.1109/CVPR.2017.376.
    [67]
    G. Ros, L. Sellart, J. Materzynska, D. Vazquez, A. M. Lopez. The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 3234–3243, 2016. DOI: 10.1109/CVPR.2016.352.
    [68]
    Z. W. Liu, P. Luo, S. Qiu, X. G. Wang, X. O. Tang. DeepFashion: Powering robust clothes recognition and retrieval with rich annotations. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 1096–1104, 2016. DOI: 10.1109/CVPR.2016.124.
    [69]
    T. Karras, S. Laine, T. Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 4396–4405, 2019. DOI: 10.1109/CVPR.2019.00453.
    [70]
    E. Agustsson, R. Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, Honolulu, USA, pp. 1122–1131, 2017. DOI: 10.1109/CVPRW.2017.150.
    [71]
    B. Yao, X. Yang, S. C. Zhu. Introduction to a large-scale general purpose ground truth database: Methodology, annotation tool and benchmarks. In Proceedings of the 6th International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, Springer, Ezhou, China, pp. 169–183, 2007, DOI: 10.1007/978-3-540-74198-5_14.
    [72]
    J. Krause, M. Stark, J. Deng, F. F. Li. 3D object representations for fine-grained categorization. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Sydney, Australia, pp. 554–561, 2013. DOI: 10.1109/ICCVW.2013.77.
    [73]
    F. Yu, A. Seff, Y. D. Zhang, S. R. Song, T. Funkhouser, J. X. Xiao. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. [Online], Available: https://arxiv.org/abs/1506.03365, 2015.
    [74]
    Q. S. Liu, X. O. Tang, H. L. Jin, H. Q. Lu, S. D. Ma. A nonlinear approach for face sketch synthesis and recognition. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, IEEE, San Diego, USA, pp. 1005–1010, 2005. DOI: 10.1109/CVPR.2005.39.
    [75]
    Z. J. Xu, H. Chen, S. C. Zhu, J. B. Luo. A hierarchical compositional model for face representation and sketching. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 6, pp. 955–969, 2008. DOI: 10.1109/TPAMI.2008.50.
    [76]
    W. Zhang, X. G. Wang, X. O. Tang. Lighting and pose robust face sketch synthesis. In Proceedings of the 11th European Conference on Computer Vision, Springer, Heraklion, Greece, pp. 420–433, 2010. DOI: 10.1007/978-3-642-15567-3_31.
    [77]
    N. Y. Ji, X. J. Chai, S. G. Shan, X. L. Chen. Local regression model for automatic face sketch generation. In Proceedings of the 6th International Conference on Image and Graphics, IEEE, Hefei, China, pp. 412–417, 2011. DOI: 10.1109/ICIG.2011.84.
    [78]
    L. Chang, M. Q. Zhou, X. M. Deng, Z. K. Wu, Y. J. Han. Face sketch synthesis via multivariate output regression. In Proceedings of the 14th International Conference on Human-computer Interaction, Springer, Orlando, USA, pp. 555–561, 2011. DOI: 10.1007/978-3-642-21602-2_60.
    [79]
    J. W. Zhang, N. N. Wang, X. B. Gao, D. C. Tao, X. L. Li. Face sketch-photo synthesis based on support vector regression. In Proceedings of the 18th IEEE International Conference on Image Processing, IEEE, Brussels, Belgium, pp. 1125–1128, 2011. DOI: 10.1109/ICIP.2011.6115625.
    [80]
    S. L. Wang, L. Zhang, Y. Liang, Q. Pan. Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 2216–2223, 2012. DOI: 10.1109/CVPR.2012.6247930.
    [81]
    H. Zhou, Z. H. Kuang, K. Y. K. Wong. Markov weight fields for face sketch synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Providence, USA, pp. 1091–1097, 2012. DOI: 10.1109/CVPR.2012.6247788.
    [82]
    T. H. Wang, J. Collomosse, A. Hunter, D. Greig. Learnable stroke models for example-based portrait painting. In Proceedings of British Machine Vision Conference, Bristol, UK, 2013.
    [83]
    N. N. Wang, D. C. Tao, X. B. Gao, X. L. Li, J. Li. Transductive face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 9, pp. 1364–1376, 2013. DOI: 10.1109/TNNLS.2013.2258174.
    [84]
    D. A. Huang, Y. C. F. Wang. Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Sydney, Australia, pp. 2496–2503, 2013. DOI: 10.1109/ICCV.2013.310.
    [85]
    Y. B. Song, L. C. Bao, Q. X. Yang, M. H. Yang. Real-time exemplar-based face sketch synthesis. In Proceedings of the 13th European Conference on Computer Vision, Springer, Zurich, Switzerland, pp. 800–813, 2014. DOI: 10.1007/978-3-319-10599-4_51.
    [86]
    S. C. Zhang, X. B. Gao, N. N. Wang, J. Li. Robust face sketch style synthesis. IEEE Transactions on Image Processing, vol. 25, no. 1, pp. 220–232, 2016. DOI: 10.1109/TIP.2015.2501755.
    [87]
    C. L. Peng, X. B. Gao, N. N. Wang, J. Li. Superpixel-based face sketch-photo synthesis. IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 2, pp. 288–299, 2017. DOI: 10.1109/TCSVT.2015.2502861.
    [88]
    C. L. Peng, X. B. Gao, N. N. Wang, D. C. Tao, X. L. Li, J. Li. Multiple representations-based face sketch-photo synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 27, no. 11, pp. 2201–2215, 2016. DOI: 10.1109/TNNLS.2015.2464681.
    [89]
    Y. Li, Y. Z. Song, T. M. Hospedales, S. G. Gong. Free-hand sketch synthesis with deformable stroke models. International Journal of Computer Vision, vol. 122, no. 1, pp. 169–190, 2017. DOI: 10.1007/s11263-016-0963-9.
    [90]
    J. Li, X. Y. Yu, C. L. Peng, N. N. Wang. Adaptive representation-based face sketch-photo synthesis. Neurocomputing, vol. 269, pp. 152–159, 2017. DOI: 10.1016/j.neucom.2016.10.095.
    [91]
    N. N. Wang, X. B. Gao, J. Li. Random sampling for fast face sketch synthesis. Pattern Recognition, vol. 76, pp. 215–227, 2018. DOI: 10.1016/j.patcog.2017.11.008.
    [92]
    Y. F. Men, Z. H. Lian, Y. M. Tang, J. G. Xiao. A common framework for interactive texture transfer. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 6353–6362, 2018. DOI: 10.1109/CVPR.2018.00665.
    [93]
    L. A. Gatys, A. S. Ecker, M. Bethge. A neural algorithm of artistic style. [Online], Available: https://arxiv.org/abs/1508.06576, 2015.
    [94]
    L. A. Gatys, A. S. Ecker, M. Bethge. Image style transfer using convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 2414–2423, 2016. DOI: 10.1109/CVPR.2016.265.
    [95]
    J. Johnson, A. Alahi, F. F. Li. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the 14th European Conference on Computer Vision, Springer, Amsterdam, The Netherlands, pp. 694–711, 2016. DOI: 10.1007/978-3-319-46475-6_43.
    [96]
    D. Ulyanov, V. Lebedev, A. Vedaldi, V. S. Lempitsky. Texture networks: Feed-forward synthesis of textures and stylized images. In Proceedings of the 33rd International Conference on International Conference on Machine Learning, New York, USA, pp. 1349–1357, 2016.
    [97]
    T. Q. Chen, M. Schmidt. Fast patch-based style transfer of arbitrary style. [Online], Available: https://arxiv.org/pdf/1612.04337, 2016.
    [98]
    V. Dumoulin, J. Shlens, M. Kudlur. A learned representation for artistic style. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [99]
    D. Ulyanov, A. Vedaldi, V. Lempitsky. Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 4105–4113, 2017. DOI: 10.1109/CVPR.2017.437.
    [100]
    X. Huang, S. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1510–1519, 2017. DOI: 10.1109/ICCV.2017.167.
    [101]
    Y. J. Li, C. Fang, J. M. Yang, Z. W. Wang, X. Lu, M. H. Yang. Universal style transfer via feature transforms. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 385–395, 2017.
    [102]
    Y. Chen, Y. K. Lai, Y. J. Liu. CartoonGAN: Generative adversarial networks for photo cartoonization. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 9465–9474, 2018. DOI: 10.1109/CVPR.2018.00986.
    [103]
    R. Abdal, Y. P. Qin, P. Wonka. Image2StyleGAN: How to embed images into the StyleGAN latent space? In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 4431–4440, 2019. DOI: 10.1109/ICCV.2019.00453.
    [104]
    D. Kotovenko, M. Wright, A. Heimbrecht, B. Ommer. Rethinking style transfer: From pixels to parameterized brushstrokes. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 12191–12200, 2021. DOI: 10.1109/CVPR46437.2021.01202.
    [105]
    E. Richardson, Y. Alaluf, O. Patashnik, Y. Nitzan, Y. Azar, S. Shapiro, D. Cohen-Or. Encoding in style: A StyleGAN encoder for image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 2287–2296, 2021. DOI: 10.1109/CVPR46437.2021.00232.
    [106]
    Z. L. Yi, H. Zhang, P. Tan, M. L. Gong. DualGAN: Unsupervised dual learning for image-to-image translation. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 2868–2876, 2017. DOI: 10.1109/ICCV.2017.310.
    [107]
    T. Kim, M. Cha, H. Kim, J. K. Lee, J. Kim. Learning to discover cross-domain relations with generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, pp. 1857–1865, 2017.
    [108]
    J. Y. Zhu, R. Zhang, D. Pathak, T. Darrell, A. A. Efros, O. Wang, E. Shechtman. Toward multimodal image-to-image translation. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 465–476, 2017.
    [109]
    X. Huang, M. Y. Liu, S. Belongie, J. Kautz. Multimodal unsupervised image-to-image translation. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 179–196, 2018. DOI: 10.1007/978-3-030-01219-9_11.
    [110]
    P. Zhang, B. Zhang, D. Chen, L. Yuan, F. Wen. Cross-domain correspondence learning for exemplar-based image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5142–5152, 2020. DOI: 10.1109/CVPR42600.2020.00519.
    [111]
    L. M. Jiang, C. X. Zhang, M. Y. Huang, C. X. Liu, J. P. Shi, C. C. Loy. TSIT: A simple and versatile framework for image-to-image translation. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 206–222, 2020. DOI: 10.1007/978-3-030-58580-8_13.
    [112]
    Y. H. Zhao, R. H. Wu, H. Dong. Unpaired image-to-image translation using adversarial consistency loss. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 800–815, 2020. DOI: 10.1007/978-3-030-58545-7_46.
    [113]
    X. R. Zhou, B. Zhang, T. Zhang, P. Zhang, J. M. Bao, D. Chen, Z. F. Zhang, F. Wen. CoCosNet v2: Full-resolution correspondence learning for image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Nashville, USA, pp. 11460–11470, 2021. DOI: 10.1109/CVPR46437.2021.01130.
    [114]
    A. P. Chen, R. Y. Liu, L. Xie, Z. Chen, H. Su, J. Y. Yu. SofGAN: A portrait image generator with dynamic styling. ACM Transactions on Graphics, vol. 41, no. 1, Article number 1, 2022. DOI: 10.1145/3470848.
    [115]
    L. L. Zhang, L. Lin, X. Wu, S. Y. Ding, L. Zhang. End-to-end photo-sketch generation via fully convolutional representation learning. In Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ACM, Shanghai, China, pp. 627–634, 2015. DOI: 10.1145/2671188.2749321.
    [116]
    M. R. Zhu, N. N. Wang, X. B. Gao, J. Li. Deep graphical feature learning for face sketch synthesis. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, pp. 3574–3580, 2017.
    [117]
    P. Sangkloy, J. W. Lu, C. Fang, F. Yu, J. Hays. Scribbler: Controlling deep image synthesis with sketch and color. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 6836–6845, 2017. DOI: 10.1109/CVPR.2017.723.
    [118]
    M. J. Zhang, N. N. Wang, Y. S. Li, R. X. Wang, X. B. Gao. Face sketch synthesis from coarse to fine. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence, California, USA, pp. 7558–7565, 2018. DOI: 10.1609/aaai.v32i1.12224.
    [119]
    W. Q. Xian, P. Sangkloy, V. Agrawal, A. Raj, J. W. Lu, C. Fang, F. Yu, J. Hays. TextureGAN: Controlling deep image synthesis with texture patches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8456–8465, 2018. DOI: 10.1109/CVPR.2018.00882.
    [120]
    J. F. Song, K. Y. Pang, Y. Z. Song, T. Xiang, T. M. Hospedales. Learning to sketch with shortcut cycle consistency. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 801–810, 2018. DOI: 10.1109/CVPR.2018.00090.
    [121]
    Y. Y. Lu, S. Z. Wu, Y. W. Tai, C. K. Tang. Image generation from sketch constraint using contextual GAN. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 213–228, 2018. DOI: 10.1007/978-3-030-01270-0_13.
    [122]
    S. C. Zhang, R. R. Ji, J. Hu, Y. Gao, C. W. Lin. Robust face sketch synthesis via generative adversarial fusion of priors and parametric sigmoid. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 1163–1169, 2018.
    [123]
    M. J. Zhang, N. Wang, Y. Li, X. Gao. Markov random neural fields for face sketch synthesis. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 2018.
    [124]
    L. D. Wang, V. Sindagi, V. Patel. High-quality facial photo-sketch synthesis using multi-adversarial networks. In Proceedings of the 13th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Xi'an, China, pp. 83–90, 2018. DOI: 10.1109/FG.2018.00022.
    [125]
    M. J. Zhang, R. X. Wang, X. B. Gao, J. Li, D. C. Tao. Dual-transfer face sketch-photo synthesis. IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 642–657, 2019. DOI: 10.1109/TIP.2018.2869688.
    [126]
    H. Kazemi, M. Iranmanesh, A. Dabouei, S. Soleymani, N. M. Nasrabadi. Facial attributes guided deep sketch-to-photo synthesis. In Proceedings of IEEE Winter Applications of Computer Vision Workshops, IEEE, Lake Tahoe, USA, 2018. DOI: 10.1109/WACVW.2018.00006.
    [127]
    H. Kazemi, F. Taherkhani, N. M. Nasrabadi. Unsupervised facial geometry learning for sketch to photo synthesis. In Proceedings of International Conference of the Biometrics Special Interest Group, IEEE, Darmstadt, Germany, 2018.
    [128]
    S. You, N. You, M. X. Pan. PI-REC: Progressive image reconstruction network with edge and color domain. [Online], Available: https://arxiv.org/abs/1903.10146, 2019.
    [129]
    M. J. Zhang, N. N. Wang, Y. S. Li, X. B. Gao. Deep latent low-rank representation for face sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3109–3123, 2019. DOI: 10.1109/TNNLS.2018.2890017.
    [130]
    M. R. Zhu, J. Li, N. N. Wang, X. B. Gao. A deep collaborative framework for face photo-sketch synthesis. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 10, pp. 3096–3108, 2019. DOI: 10.1109/TNNLS.2018.2890018.
    [131]
    M. J. Zhang, Y. S. Li, N. N. Wang, Y. Chi, X. B. Gao. Cascaded face sketch synthesis under various illuminations. IEEE Transactions on Image Processing, vol. 29, pp. 1507–1521, 2019. DOI: 10.1109/TIP.2019.2942514.
    [132]
    M. R. Zhu, N. N. Wang, X. B. Gao, J. Li, Z. F. Li. Face photo-sketch synthesis via knowledge transfer. In Proceedings of the 28th International Joint Conference on Artficial Intelligence, Macao, China, pp. 1048–1054, 2019.
    [133]
    Y. J. Li, C. Fang, A. Hertzmann, E. Shechtman, M. H. Yang. Im2Pencil: Controllable pencil illustration from photographs. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Long Beach, USA, pp. 1525–1534, 2019. DOI: 10.1109/CVPR.2019.00162.
    [134]
    A. Ghosh, R. Zhang, P. Dokania, O. Wang, A. Efros, P. Torr, E. Shechtman. Interactive sketch & fill: Multiclass sketch-to-image translation. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Seoul, Korea, pp. 1171–1180, 2019. DOI: 10.1109/ICCV.2019.00126.
    [135]
    X. R. Wang, J. Z. Yu. Learning to cartoonize using white-box cartoon representations. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 8087–8096, 2020. DOI: 10.1109/CVPR42600.2020.00811.
    [136]
    C. Y. Gao, Q. Liu, Q. Xu, L. M. Wang, J. Z. Liu, C. Q. Zou. SketchyCOCO: Image generation from freehand scene sketches. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 5173–5182, 2020. DOI: 10.1109/CVPR42600.2020.00522.
    [137]
    S. Yang, Z. Y. Wang, J. Y. Liu, Z. M. Guo. Deep plastic surgery: Robust and controllable image editing with human-drawn sketches. In Proceedings of the 16th European Conference on Computer Vision, Springer, Glasgow, UK, pp. 601–617, 2020. DOI: 10.1007/978-3-030-58555-6_36.
    [138]
    S. Y. Chen, W. C. Su, L. Gao, S. H. Xia, H. B. Fu. DeepFaceDrawing: Deep generation of face images from sketches. ACM Transactions on Graphics, vol. 39, no. 4, Article number 72, 2020. DOI: 10.1145/3386569.3392386.
    [139]
    J. Yu, X. X. Xu, F. Gao, S. J. Shi, M. Wang, D. C. Tao, Q. M. Huang. Toward realistic face photo-sketch synthesis via composition-aided GANs. IEEE Transactions on Cybernetics, vol. 51, no. 9, pp. 4350–4362, 2021. DOI: 10.1109/TCYB.2020.2972944.
    [140]
    Y. K. Fang, W. H. Deng, J. P. Du, J. N. Hu. Identity-aware CycleGAN for face photo-sketch synthesis and recognition. Pattern Recognition, vol. 102, Article number 107249, 2020. DOI: 10.1016/j.patcog.2020.107249.
    [141]
    Y. Lin, S. G. Ling, K. R. Fu, P. Cheng. An identity-preserved model for face sketch-photo synthesis. IEEE Signal Processing Letters, vol. 27, pp. 1095–1099, 2020. DOI: 10.1109/LSP.2020.3005039.
    [142]
    C. L. Peng, N. N. Wang, J. Li, X. B. Gao. Universal face photo-sketch style transfer via multiview domain translation. IEEE Transactions on Image Processing, vol. 29, pp. 8519–8534, 2020. DOI: 10.1109/TIP.2020.3016502.
    [143]
    K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [144]
    S. C. Duan, Z. X. Chen, Q. M. J. Wu, L. Cai, D. Lu. Multi-scale gradients self-attention residual learning for face photo-sketch transformation. IEEE Transactions on Information Forensics and Security, vol. 16, pp. 1218–1230, 2020. DOI: 10.1109/TIFS.2020.3031386.
    [145]
    S. Y. Wang, D. Bau, J. Y. Zhu. Sketch your own GAN. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 14030–14040, 2021. DOI: 10.1109/ICCV48922.2021.01379.
    [146]
    A. K. Bhunia, S. Khan, H. Cholakkal, R. M. Anwer, F. S. Khan, J. Laaksonen, M. Felsberg. DoodleFormer: Creative sketch drawing with transformers. [Online], Available: https://arxiv.org/abs/2112.03258, 2021.
    [147]
    H. Abdi, L. J. Williams. Principal component analysis. WIREs Computational Statistics, vol. 2, no. 4, pp. 433–459, 2010. DOI: 10.1002/wics.101.
    [148]
    X. O. Tang, X. G. Wang. Face photo recognition using sketch. In Proceedings of International Conference on Image Processing, IEEE, Rochester, USA, pp. I-257–I-260, 2002. DOI: 10.1109/ICIP.2002.1038008.
    [149]
    X. O. Tang, X. G. Wang. Face sketch synthesis and recognition. In Proceedings of the 9th IEEE International Conference on Computer Vision, IEEE, Nice, France, pp. 687–694, 2003. DOI: 10.1109/ICCV.2003.1238414.
    [150]
    X. O. Tang, X. G. Wang. Face sketch recognition. IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 50–57, 2004. DOI: 10.1109/TCSVT.2003.818353.
    [151]
    S. T. Roweis, L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, vol. 290, no. 5500, pp. 2323–2326, 2000. DOI: 10.1126/science.290.5500.2323.
    [152]
    S. Saxena, M. N. Teli. Comparison and analysis of image-to-image generative adversarial networks: A survey. [Online], Available: https://arxiv.org/abs/2112.12625, 2021.
    [153]
    I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada, pp. 2672–2680, 2014.
    [154]
    M. Mirza, S. Osindero. Conditional generative adversarial nets. [Online], Available: https://arxiv.org/abs/1411.1784, 2014.
    [155]
    O. Ronneberger, P. Fischer, T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention, Springer, Munich, Germany, pp. 234–241, 2015. DOI: 10.1007/978-3-319-24574-4_28.
    [156]
    Y. C. Jing, Y. Z. Yang, Z. L. Feng, J. W. Ye, Y. Z. Yu, M. L. Song. Neural style transfer: A review. IEEE Transactions on Visualization and Computer Graphics, vol. 26, no. 11, pp. 3365–3385, 2020. DOI: 10.1109/TVCG.2019.2921336.
    [157]
    Y. H. Song, C. Yang, Y. J. Shen, P. Wang, Q. Huang, C. C. J. Kuo. SPG-Net: Segmentation prediction and guidance network for image inpainting. In Proceedings of British Machine Vision Conference, Newcastle, UK, 2018.
    [158]
    D. Yi, Z. Lei, S. C. Liao, S. Z. Li. Learning face representation from scratch. [Online], Available: https://arxiv.org/abs/1411.7923, 2014.
    [159]
    L. Wang, R. F. Li, K. Wang, J. Chen. Feature representation for facial expression recognition based on FACS and LBP. International Journal of Automation and Computing, vol. 11, no. 5, pp. 459–468, 2014. DOI: 10.1007/s11633-014-0835-0.
    [160]
    X. Zheng, Y. Q. Guo, H. B. Huang, Y. Li, R. He. A survey of deep facial attribute analysis. International Journal of Computer Vision, vol. 128, no. 8, pp. 2002–2034, 2020. DOI: 10.1007/s11263-020-01308-z.
    [161]
    G. B. Huang, M. Mattar, T. Berg, E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. In Proceedings of Workshop on Faces In “Real-Life” Images: Detection, Alignment, and Recognition, Marseille, France, Article number inria-321923, 2008.
    [162]
    R. Ranjan, V. M. Patel, R. Chellappa. Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 121–135, 2019. DOI: 10.1109/TPAMI.2017.2781233.
    [163]
    E. M. Hand, R. Chellappa. Attributes for improved attributes: A multi-task network utilizing implicit and explicit relationships for facial attribute classification. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA, pp. 4068–4074, 2017.
    [164]
    H. Han, A. K. Jain, F. Wang, S. G. Shan, X. L. Chen. Heterogeneous face attribute estimation: A deep multi-task learning approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 11, pp. 2597–2609, 2018. DOI: 10.1109/TPAMI.2017.2738004.
    [165]
    Y. Jang, H. Gunes, I. Patras. SmileNet: Registration-free smiling face detection in the wild. In Proceedings of IEEE International Conference on Computer Vision Workshops, IEEE, Venice, Italy, pp. 1581–1589, 2017. DOI: 10.1109/ICCVW.2017.186.
    [166]
    R. Ranjan, S. Sankaranarayanan, C. D. Castillo, R. Chellappa. An all-in-one convolutional neural network for face analysis. In Proceedings of the 12th IEEE International Conference on Automatic Face & Gesture Recognition, IEEE, Washington DC, USA, pp. 17–24, 2017. DOI: 10.1109/FG.2017.137.
    [167]
    S. Li, W. H. Deng. Deep facial expression recognition: A survey. IEEE Transactions on Affective Computing, 2020, to be published. DOI: 10.1109/TAFFC.2020.2981446.
    [168]
    N. Zhang, M. Paluri, M. Ranzato, T. Darrell, L. Bourdev. PANDA: Pose aligned networks for deep attribute modeling. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1637–1644, 2014. DOI: 10.1109/CVPR.2014.212.
    [169]
    M. N. Kan, S. G. Shan, H. Chang, X. L. Chen. Stacked progressive auto-encoders (SPAE) for face recognition across poses. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Columbus, USA, pp. 1883–1890, 2014. DOI: 10.1109/CVPR.2014.243.
    [170]
    Y. Wu, Z. G. Wang, Q. Ji. Facial feature tracking under varying facial expressions and face poses based on restricted Boltzmann machines. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Portland, USA, pp. 3452–3459, 2013. DOI: 10.1109/CVPR.2013.443.
    [171]
    L. Tran, X. Yin, X. M. Liu. Disentangled representation learning GAN for pose-invariant face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Honolulu, USA, pp. 1283–1292, 2017. DOI: 10.1109/CVPR.2017.141.
    [172]
    U. Toseeb, D. R. T. Keeble, E. J. Bryant. The significance of hair for face recognition. PLoS One, vol. 7, no. 3, Article number e34144, 2012. DOI: 10.1371/journal.pone.0034144.
    [173]
    S. J. Bartel, K. Toews, L. Gronhovd, S. L. Prime. “Do I Know You?” altering hairstyle affects facial recognition. Visual Cognition, vol. 26, no. 3, pp. 149–155, 2018. DOI: 10.1080/13506285.2017.1394412.
    [174]
    N. Kumar, P. Belhumeur, S. Nayar. FaceTracer: A search engine for large collections of images with faces. In Proceedings of the 10th European Conference on Computer Vision, Springer, Marseille, France, pp. 340–353, 2008. DOI: 10.1007/978-3-540-88693-8_25.
    [175]
    H. Y. Li, W. M. Dong, B. G. Hu. Facial image attributes transformation via conditional recycle generative adversarial networks. Journal of Computer Science and Technology, vol. 33, no. 3, pp. 511–521, 2018. DOI: 10.1007/s11390-018-1835-2.
    [176]
    J. S. Pierrard, T. Vetter. Skin detail analysis for face recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Minneapolis, USA, 2007. DOI: 10.1109/CVPR.2007.383264.
    [177]
    S. Z. Li. Encyclopedia of Biometrics: I-Z, New York, USA: Springer, 2009.
    [178]
    K. P. Zhang, Z. P. Zhang, Z. F. Li, Y. Qiao. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503, 2016. DOI: 10.1109/LSP.2016.2603342.
    [179]
    K. M. He, X. Y. Zhang, S. Q. Ren, J. Sun. Deep residual learning for image recognition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 770–778, 2016. DOI: 10.1109/CVPR.2016.90.
    [180]
    Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim, J. Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Salt Lake City, USA, pp. 8789–8797, 2018. DOI: 10.1109/CVPR.2018.00916.
    [181]
    B. Zhao, B. Chang, Z. Q. Jie, L. Sigal. Modular generative adversarial networks. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 157–173, 2018. DOI: 10.1007/978-3-030-01264-9_10.
    [182]
    A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. M. Lin, A. Desmaison, L. Antiga, A. Lerer. Automatic differentiation in PyTorch. In Proceedings of the 31st Conference on Neural Information Processing Systems, Long Beach, USA, 2017.
    [183]
    D. P. Kingma, J. Ba. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015.
    [184]
    Q. Yu, F. Liu, Y. Z. Song, T. Xiang, T. M. Hospedales, C. C. Loy. Sketch me that shoe. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Las Vegas, USA, pp. 799–807, 2016. DOI: 10.1109/CVPR.2016.93.
    [185]
    C. Shorten, T. M. Khoshgoftaar. A survey on image data augmentation for deep learning. Journal of Big Data, vol. 6, no. 1, Article number 60, 2019. DOI: 10.1186/s40537-019-0197-0.
    [186]
    Y. X. Wang, C. C. Wu, L. Herranz, J. Van De Weijer, A. Gonzalez-Garcia, B. Raducanu. Transferring GANs: Generating images from limited data. In Proceedings of the 15th European Conference on Computer Vision, Springer, Munich, Germany, pp. 220–236, 2018. DOI: 10.1007/978-3-030-01231-1_14.
    [187]
    Y. X. Wang, L. Yu, J. Van De Weijer. DeepI2I: Enabling deep hierarchical image-to-image translation by transferring from GANs. In Proceedings of the 34th Conference on Neural Information Processing Systems, 2020.
    [188]
    A. Shocher, Y. Gandelsman, I. Mosseri, M. Yarom, M. Irani, W. T. Freeman, T. Dekel. Semantic pyramid for image generation. In Proceedings of IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, USA, pp. 7455–7464, 2020. DOI: 10.1109/CVPR42600.2020.00748.
    [189]
    S. Ravi, H. Larochelle. Optimization as a model for few-shot learning. In Proceedings of the 5th International Conference on Learning Representations, Toulon, France, 2017.
    [190]
    O. Chapelle, B. Scholkopf, A. Zien. Semi-supervised learning. IEEE Transactions on Neural Networks, vol. 20, no. 3, Article number 542, 2009. DOI: 10.1109/TNN.2009.2015974.
    [191]
    M. Oquab, L. Bottou, I. Laptev, J. Sivic. Is object localization for free? – Weakly-supervised learning with convolutional neural networks. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE, Boston, USA, pp. 685–694, 2015. DOI: 10.1109/CVPR.2015.7298668.
    [192]
    X. L. Wang, K. M. He, A. Gupta. Transitive invariance for self-supervised visual representation learning. In Proceedings of IEEE International Conference on Computer Vision, IEEE, Venice, Italy, pp. 1338–1347, 2017. DOI: 10.1109/ICCV.2017.149.
    [193]
    R. Pinto, T. Mettler, M. Taisch. Managing supplier delivery reliability risk under limited information: Foundations for a human-in-the-loop DSS. Decision Support Systems, vol. 54, no. 2, pp. 1076–1084, 2013. DOI: 10.1016/j.dss.2012.10.033.
    [194]
    Y. LeCun. Generalization and network design strategies. Connectionism in Perspective, pp. 143–155, 1989.
    [195]
    I. O. Tolstikhin, N. Houlsby, A. Kolesnikov, L. Beyer, X. H. Zhai, T. Unterthiner, J. Yung, A. Steiner, D. Keysers, J. Uszkoreit, M. Lucic, A. Dosovitskiy. MLP-mixer: An all-MLP architecture for vision. In Proceedings of the 35th Conference on Neural Information Processing Systems, pp. 24261–24272, 2021.
    [196]
    A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, USA, pp. 6000–6010, 2017.
    [197]
    K. Lee, H. W. Chang, L. Jiang, H. Zhang, Z. W. Tu, C. Liu. ViTGAN: Training GANs with vision transformers. [Online], Available: https://arxiv.org/abs/2107.04589, 2022.
    [198]
    L. Zhang, L. Zhang, X. Q. Mou, D. Zhang. FSIM: A feature similarity index for image quality assessment. IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2378–2386, 2011. DOI: 10.1109/TIP.2011.2109730.
    [199]
    S. Avidan, A. Shamir. Seam carving for content-aware image resizing. ACM Transactions on Graphics, vol. 26, no. 3, pp. 10-1–10-9, 2007. DOI: 10.1145/1276377.1276390.
    [200]
    C. Dong, C. C. Loy, K. M. He, X. O. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 2, pp. 295–307, 2016. DOI: 10.1109/TPAMI.2015.2439281.
    [201]
    Y. Y. Hu, S. Yang, W. H. Yang, L. Y. Duan, J. Y. Liu. Towards coding for human and machine vision: A scalable image coding approach. In Proceedings of IEEE International Conference on Multimedia and Expo, IEEE, London, UK, 2020. DOI: 10.1109/ICME46284.2020.9102750.
    [202]
    E. Wood, T. Baltrušaitis, C. Hewitt, S. Dziadzio, T. J. Cashman, J. Shotton. Fake it till you make it: Face analysis in the wild using synthetic data alone. In Proceedings of IEEE/CVF International Conference on Computer Vision, IEEE, Montreal, Canada, pp. 3661–3671, 2021. DOI: 10.1109/ICCV48922.2021.00366.