Depth-aware neural style transfer

From de_evolutionary_art_org
Version of 9 December 2017, 22:40 by Gubachelier (Diskussion | Beiträge) (page created)


Reference

Xiao-Chang Liu, Ming-Ming Cheng, Yu-Kun Lai, Paul L. Rosin: Depth-aware neural style transfer.

DOI

https://doi.org/10.1145/3092919.3092924

Abstract

Neural style transfer has recently received significant attention and demonstrated amazing results. An efficient solution proposed by Johnson et al. trains feed-forward convolutional neural networks by defining and optimizing perceptual loss functions. Such methods are typically based on high-level features extracted from pre-trained neural networks, where the loss functions contain two components: style loss and content loss. However, such pre-trained networks are originally designed for object recognition, and hence the high-level features often focus on the primary target and neglect other details. As a result, when input images contain multiple objects potentially at different depths, the resulting images are often unsatisfactory because the image layout is destroyed and the boundary between the foreground and background, as well as between different objects, becomes obscured. We observe that the depth map effectively reflects the spatial distribution in an image, and preserving the depth map of the content image after stylization helps produce an image that preserves its semantic content. In this paper, we introduce a novel approach for neural style transfer that integrates depth preservation as an additional loss, preserving the overall image layout while performing style transfer.
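The abstract's core idea, adding a depth-preservation term to the usual content and style perceptual losses, can be sketched in NumPy. This is an illustrative sketch, not the authors' implementation: in practice the feature maps come from a pre-trained recognition network (e.g. VGG) and the depth maps from a single-image depth estimation network, and the weights `alpha`, `beta`, `gamma` are hypothetical placeholders for the paper's tuned loss weights.

```python
import numpy as np

def gram_matrix(feat):
    """Gram matrix of a (C, H, W) feature map, normalized by its size."""
    c, h, w = feat.shape
    f = feat.reshape(c, h * w)
    return f @ f.T / (c * h * w)

def content_loss(feat, content_feat):
    """Mean squared error between stylized and content feature maps."""
    return np.mean((feat - content_feat) ** 2)

def style_loss(feat, style_feat):
    """Mean squared error between Gram matrices (feature correlations)."""
    return np.mean((gram_matrix(feat) - gram_matrix(style_feat)) ** 2)

def depth_loss(depth, content_depth):
    """Depth-preservation term: keep the stylized image's predicted
    depth map close to the content image's depth map."""
    return np.mean((depth - content_depth) ** 2)

def total_loss(feat, content_feat, style_feat, depth, content_depth,
               alpha=1.0, beta=10.0, gamma=1.0):
    """Combined perceptual loss with the additional depth term."""
    return (alpha * content_loss(feat, content_feat)
            + beta * style_loss(feat, style_feat)
            + gamma * depth_loss(depth, content_depth))
```

When the stylized image reproduces the content features and depth exactly, only the style term can remain; the `gamma` weight trades off layout preservation against stylization strength.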

Extended Abstract

Bibtex

@inproceedings{Liu:2017:DNS:3092919.3092924,
author = {Liu, Xiao-Chang and Cheng, Ming-Ming and Lai, Yu-Kun and Rosin, Paul L.},
title = {Depth-aware Neural Style Transfer},
booktitle = {Proceedings of the Symposium on Non-Photorealistic Animation and Rendering},
series = {NPAR '17},
year = {2017},
isbn = {978-1-4503-5081-5},
location = {Los Angeles, California},
pages = {4:1--4:10},
articleno = {4},
numpages = {10},
url = {http://doi.acm.org/10.1145/3092919.3092924},
doi = {10.1145/3092919.3092924},
acmid = {3092924},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {deep learning, depth, non-photorealistic rendering},
} 

Used References

1 Weifeng Chen, Zhao Fu, Dawei Yang, and Jia Deng. 2016. Single-image depth perception in the wild. In NIPS. 730--738.

2 Ming-Ming Cheng, Ziming Zhang, Wen-Yan Lin, and Philip Torr. 2014. BING: Binarized Normed Gradients for Objectness Estimation at 300fps. In CVPR.

3 Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, and Evan Shelhamer. 2014. cuDNN: Efficient primitives for deep learning. arXiv preprint arXiv:1410.0759 (2014).

4 Ronan Collobert, Koray Kavukcuoglu, and Clément Farabet. 2011. Torch7: A Matlab-like environment for machine learning. In BigLearn, NIPS Workshop.

5 Alexei A Efros and William T Freeman. 2001. Image quilting for texture synthesis and transfer. In ACM SIGGRAPH. 341--346.

6 Alexei A Efros and Thomas K Leung. 1999. Texture synthesis by non-parametric sampling. In ICCV, Vol. 2. 1033--1038.

7 David Eigen and Rob Fergus. 2015. Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-scale Convolutional Architecture. In ICCV. 2650--2658. http://dx.doi.org/10.1109/ICCV.2015.304

8 Leon A Gatys, Alexander S Ecker, and Matthias Bethge. 2016. Image style transfer using convolutional neural networks. In CVPR. 2414--2423.

9 Leon A Gatys, Alexander S Ecker, Matthias Bethge, Aaron Hertzmann, and Eli Shechtman. 2017. Controlling Perceptual Factors in Neural Style Transfer. In CVPR.

10 A. Geiger, P. Lenz, C. Stiller, and R. Urtasun. 2013. Vision meets robotics: The KITTI dataset. International Journal of Robotics Research 32, 11 (2013), 1231--1237. http://dx.doi.org/10.1177/0278364913491297

11 Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. 2014. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In CVPR. 580--587. http://dx.doi.org/10.1109/CVPR.2014.81

12 Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR. 770--778. http://dx.doi.org/10.1109/CVPR.2016.90

13 Aaron Hertzmann. 2010. Non-Photorealistic Rendering and the science of art. In NPAR. http://doi.acm.org/10.1145/1809939.1809957

14 Aaron Hertzmann, Charles E Jacobs, Nuria Oliver, Brian Curless, and David H Salesin. 2001. Image analogies. In ACM SIGGRAPH. 327--340.

15 Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015).

16 Tobias Isenberg. 2013. Evaluating and Validating Non-photorealistic and Illustrative Rendering. In Image and Video-Based Artistic Stylisation, Paul L. Rosin and John P. Collomosse (Eds.). Springer, 311--331. http://dx.doi.org/10.1007/978-1-4471-4519-6_15

17 Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual losses for real-time style transfer and super-resolution. In ECCV. Springer, 694--711. http://dx.doi.org/10.1007/978-3-319-46475-6_43

18 Diederik Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).

19 Congyan Lang, Tam V Nguyen, Harish Katti, Karthik Yadati, Mohan Kankanhalli, and Shuicheng Yan. 2012. Depth matters: Influence of depth cues on visual saliency. In ECCV. 101--115.

20 Bo Li, Chunhua Shen, Yuchao Dai, Anton van den Hengel, and Mingyi He. 2015. Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs. In CVPR. 1119--1127.

21 Chuan Li and Michael Wand. 2016. Combining Markov Random Fields and convolutional neural networks for image synthesis. In CVPR. 2479--2486. http://dx.doi.org/10.1109/CVPR.2016.272

22 Yanghao Li, Naiyan Wang, Jiaying Liu, and Xiaodi Hou. 2017. Demystifying Neural Style Transfer. CoRR abs/1701.01036 (2017).

23 Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. 2014. Microsoft COCO: Common objects in context. In ECCV. 740--755.

24 Beyang Liu, Stephen Gould, and Daphne Koller. 2010. Single image depth estimation from predicted semantic labels. In CVPR. 1253--1260. http://dx.doi.org/10.1109/CVPR.2010.5539823

25 Fayao Liu, Chunhua Shen, and Guosheng Lin. 2015. Deep convolutional neural fields for depth estimation from a single image. In CVPR. 5162--5170. http://dx.doi.org/10.1109/CVPR.2015.7299152

26 Yun Liu, Ming-Ming Cheng, Xiaowei Hu, Kai Wang, and Xiang Bai. 2017. Richer Convolutional Features for Edge Detection. In CVPR.

27 Jonathan Long, Evan Shelhamer, and Trevor Darrell. 2015. Fully convolutional networks for semantic segmentation. In CVPR. 3431--3440. http://dx.doi.org/10.1109/CVPR.2015.7298965

28 Aravindh Mahendran and Andrea Vedaldi. 2015. Understanding deep image representations by inverting them. In CVPR. 5188--5196. http://dx.doi.org/10.1109/CVPR.2015.7299155

29 Graeme McCaig, Steve DiPaola, and Liane Gabora. 2016. Deep Convolutional Networks as Models of Generalization and Blending Within Visual Creativity. CoRR abs/1610.02478 (2016).

30 Alec Radford, Luke Metz, and Soumith Chintala. 2015. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434 (2015).

31 Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. 2015. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In NIPS.

32 Paul L. Rosin and J. Collomosse. 2013. Image and Video-based Artistic Stylisation. Springer. http://dx.doi.org/10.1007/978-1-4471-4519-6

33 Manuel Ruder, Alexey Dosovitskiy, and Thomas Brox. 2016. Artistic style transfer for videos. In GCPR. 26--36. http://dx.doi.org/10.1007/978-3-319-45886-1_3

34 Ashutosh Saxena, Sung H Chung, and Andrew Y Ng. 2005. Learning depth from single monocular images. In NIPS, Vol. 18. 1--8.

35 Ahmed Selim, Mohamed Elgharib, and Linda Doyle. 2016. Painting style transfer for head portraits using convolutional neural networks. ACM TOG 35, 4 (2016), 129.

36 Nathan Silberman, Derek Hoiem, Pushmeet Kohli, and Rob Fergus. 2012. Indoor segmentation and support inference from RGBD images. In ECCV. http://dx.doi.org/10.1007/978-3-642-33715-4_54

37 Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. 2013. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv:1312.6034 (2013).

38 Thomas Strothotte and Stefan Schlechtweg. 2002. Non-photorealistic computer graphics: modeling, rendering, and animation. Morgan Kaufmann.

39 James T Todd and J Farley Norman. 2003. The visual perception of 3-D shape from multiple cues: Are observers capable of perceiving metric structure? Perception & Psychophysics 65, 1 (2003), 31--47. http://dx.doi.org/10.3758/BF03194781

40 Dmitry Ulyanov, Vadim Lebedev, Andrea Vedaldi, and Victor Lempitsky. 2016. Texture networks: Feed-forward synthesis of textures and stylized images. In ICML.

41 Jingdong Wang, Huaizu Jiang, Zejian Yuan, Ming-Ming Cheng, Xiaowei Hu, and Nanning Zheng. 2017. Salient Object Detection: A Discriminative Regional Feature Integration Approach. International Journal of Computer Vision 123, 2 (2017), 251--268. http://dx.doi.org/10.1007/s11263-016-0977-3

42 Peng Wang, Xiaohui Shen, Zhe Lin, Scott Cohen, Brian Price, and Alan L Yuille. 2015. Towards unified depth and semantic prediction from a single image. In CVPR. 2800--2809.

43 Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, and Shuicheng Yan. 2017. Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach. In CVPR.

44 Yunchao Wei, Xiaodan Liang, Yunpeng Chen, Xiaohui Shen, Ming-Ming Cheng, Yao Zhao, and Shuicheng Yan. 2016. STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation. (2016). http://dx.doi.org/10.1109/TPAMI.2016.2636150

45 Ziyu Zhang, Alexander G Schwing, Sanja Fidler, and Raquel Urtasun. 2015. Monocular object instance segmentation and depth ordering with CNNs. In ICCV. 2614--2622.

46 Daniel Zoran, Phillip Isola, Dilip Krishnan, and William T Freeman. 2015. Learning ordinal relationships for mid-level vision. In ICCV. 388--396.


Links

Full Text

https://dl.acm.org/authorize?N32487



Other Links