Rating Image Aesthetics Using Deep Learning


Current revision as of 20 June 2016, 17:16

Reference

Lu, X.; Lin, Z.; Jin, H.; Yang, J.; Wang, J.Z.: Rating Image Aesthetics Using Deep Learning. IEEE Transactions on Multimedia, 2015, Volume: 17, Issue: 11, 2021 - 2034.

DOI

http://dx.doi.org/10.1109/TMM.2015.2477040

Abstract

This paper investigates unified feature learning and classifier training approaches for image aesthetics assessment. Existing methods built upon handcrafted or generic image features and developed machine learning and statistical modeling techniques utilizing training examples. We adopt a novel deep neural network approach to allow unified feature learning and classifier training to estimate image aesthetics. In particular, we develop a double-column deep convolutional neural network to support heterogeneous inputs, i.e., global and local views, in order to capture both global and local characteristics of images. In addition, we employ the style and semantic attributes of images to further boost the aesthetics categorization performance. Experimental results show that our approach produces significantly better results than the earlier reported results on the AVA dataset for both the generic image aesthetics and content-based image aesthetics. Moreover, we introduce a 1.5-million image dataset (IAD) for image aesthetics assessment and we further boost the performance on the AVA test set by training the proposed deep neural networks on the IAD dataset.
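
The double-column idea in the abstract (one column fed a global view, one fed a local view, with the two feature vectors joined before classification) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 64x64 view size, nearest-neighbour warping for the global view, a random crop for the local view, and a single linear layer with ReLU standing in for each convolutional column. The function and parameter names are hypothetical.

```python
import numpy as np


def global_view(image, size=64):
    """Warp the whole image to size x size (nearest-neighbour; an assumed choice)."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows[:, None], cols]


def local_view(image, size=64, rng=None):
    """Take a random size x size crop at the original resolution (assumed choice)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top = int(rng.integers(0, h - size + 1))
    left = int(rng.integers(0, w - size + 1))
    return image[top:top + size, left:left + size]


def column_features(view, weights):
    """Stand-in for one convolutional column: flatten, linear map, ReLU."""
    x = view.astype(np.float64).reshape(-1) / 255.0
    return np.maximum(x @ weights, 0.0)


def init_params(size=64, channels=3, hidden=16, seed=0):
    """Random, untrained parameters; hypothetical shapes for illustration only."""
    rng = np.random.default_rng(seed)
    d = size * size * channels
    return {
        "size": size,
        "w_global": rng.normal(0.0, 0.01, (d, hidden)),
        "w_local": rng.normal(0.0, 0.01, (d, hidden)),
        "w_out": rng.normal(0.0, 0.1, 2 * hidden),
        "b_out": 0.0,
    }


def double_column_score(image, params, rng=None):
    """Join the two column outputs and map them to a score in [0, 1]."""
    g = column_features(global_view(image, params["size"]), params["w_global"])
    l = column_features(local_view(image, params["size"], rng), params["w_local"])
    joint = np.concatenate([g, l])
    logit = joint @ params["w_out"] + params["b_out"]
    return float(1.0 / (1.0 + np.exp(-logit)))


if __name__ == "__main__":
    img = np.random.default_rng(1).integers(0, 256, size=(120, 200, 3), dtype=np.uint8)
    print(double_column_score(img, init_params(), rng=np.random.default_rng(2)))
```

With untrained random weights the score is meaningless; the sketch only shows how two heterogeneous views of the same image can feed separate columns whose outputs are concatenated before a shared output layer.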

Extended Abstract

Bibtex

@ARTICLE{7243357,
author={Lu, X. and Lin, Z. and Jin, H. and Yang, J. and Wang, J.Z.},
journal={Multimedia, IEEE Transactions on},
title={Rating Image Aesthetics Using Deep Learning},
year={2015},
volume={17},
number={11},
pages={2021-2034},
keywords={Computer architecture;Image color analysis;Machine learning;Neural networks;Semantics;Training;Visualization;Automatic feature learning;deep neural networks;image aesthetics},
doi={10.1109/TMM.2015.2477040},
url={http://dx.doi.org/10.1109/TMM.2015.2477040, http://de.evo-art.org/index.php?title=Rating_Image_Aesthetics_Using_Deep_Learning },
ISSN={1520-9210},
month={Nov},
}

Used References

[1] R. Datta, D. Joshi, J. Li, and J. Wang, “Studying aesthetics in photographic images using a computational approach,” in European Conference on Computer Vision (ECCV), pp. 288–301, 2006.

[2] Y. Ke, X. Tang, and F. Jing, “The design of high-level features for photo quality assessment,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, pp. 419–426, 2006.

[3] Y. Luo and X. Tang, “Photo and video quality evaluation: Focusing on the subject,” in European Conference on Computer Vision (ECCV), pp. 386–399, 2008. http://dx.doi.org/10.1007/978-3-540-88690-7_29

[4] S. Bhattacharya, R. Sukthankar, and M. Shah, “A framework for photo-quality assessment and enhancement based on visual aesthetics,” in ACM International Conference on Multimedia (MM), pp. 271–280, 2010. http://dx.doi.org/10.1145/1873951.1873990

[5] W. Luo, X. Wang, and X. Tang, “Content-based photo quality assessment,” in IEEE International Conference on Computer Vision (ICCV), pp. 2206–2213, 2011.

[6] S. Dhar, V. Ordonez, and T. Berg, “High level describable attributes for predicting aesthetics and interestingness,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1657–1664, 2011. http://dx.doi.org/10.1109/CVPR.2011.5995467

[7] M. Nishiyama, T. Okabe, I. Sato, and Y. Sato, “Aesthetic quality classification of photographs based on color harmony,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 33–40, 2011. http://dx.doi.org/10.1109/CVPR.2011.5995539

[8] P. O’Donovan, A. Agarwala, and A. Hertzmann, “Color compatibility from large datasets,” ACM Transactions on Graphics (TOG), vol. 30, no. 4, pp. 63:1–12, 2011. http://dx.doi.org/10.1145/1964921.1964958

[9] L. Marchesotti, F. Perronnin, D. Larlus, and G. Csurka, “Assessing the aesthetic quality of photographs using generic image descriptors,” in IEEE International Conference on Computer Vision (ICCV), pp. 1784–1791, 2011. http://dx.doi.org/10.1109/ICCV.2011.6126444

[10] N. Murray, L. Marchesotti, and F. Perronnin, “AVA: A large-scale database for aesthetic visual analysis,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2408–2415, 2012. http://dx.doi.org/10.1109/CVPR.2012.6247954

[11] L. Marchesotti and F. Perronnin, “Learning beautiful (and ugly) attributes,” in British Machine Vision Conference (BMVC), 2013.

[12] D. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision (IJCV), vol. 60, no. 2, pp. 91–110, 2004. http://dx.doi.org/10.1023/B:VISI.0000029664.99615.94

[13] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems (NIPS), pp. 1106–1114, 2012.

[14] H.-H. Su, T.-W. Chen, C.-C. Kao, W. Hsu, and S.-Y. Chien, “Scenic photo quality assessment with bag of aesthetics-preserving features,” in ACM International Conference on Multimedia (MM), pp. 1213–1216, 2011. http://dx.doi.org/10.1145/2072298.2071977

[15] A. Oliva and A. Torralba, “Modeling the shape of the scene: A holistic representation of the spatial envelope,” International Journal of Computer Vision (IJCV), vol. 42, no. 3, pp. 145–175, 2001. http://dx.doi.org/10.1023/A:1011139631724

[16] D. Ciresan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3642–3649, 2012. http://dx.doi.org/10.1109/CVPR.2012.6248110

[17] Y. Sun, X. Wang, and X. Tang, “Hybrid deep learning for face verification,” in The IEEE International Conference on Computer Vision (ICCV), 2013. http://dx.doi.org/10.1109/ICCV.2013.188

[18] P. Sermanet, K. Kavukcuoglu, S. Chintala, and Y. LeCun, “Pedestrian detection with unsupervised multi-stage features learning,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3626–3633, 2013. http://dx.doi.org/10.1109/CVPR.2013.465

[19] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998. http://dx.doi.org/10.1109/5.726791

[20] G. E. Hinton, S. Osindero, and Y.-W. Teh, “A fast learning algorithm for deep belief nets,” Neural Computation, vol. 18, no. 7, pp. 1527–1554, 2006. http://dx.doi.org/10.1162/neco.2006.18.7.1527

[21] G. Hinton, “Training products of experts by minimizing contrastive divergence,” Neural Computation, vol. 14, no. 8, pp. 1771–1800, 2002. http://dx.doi.org/10.1162/089976602760128018

[22] S. Karayev, A. Hertzmann, H. Winnermoller, A. Agarwala, and T. Darrell, “Recognizing image style,” in British Machine Vision Conference (BMVC), 2014. http://dx.doi.org/10.5244/C.28.122

[23] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, “DeCAF: A deep convolutional activation feature for generic visual recognition,” Technical report, arXiv:1310.1531v1, 2013.

[24] F. Agostinelli, M. Anderson, and H. Lee, “Adaptive multi-column deep neural networks with application to robust image denoising,” in Advances in Neural Information Processing Systems (NIPS), pp. 1493–1501, 2013.

[25] A. Khosla, A. Das Sarma, and R. Hamid, “What makes an image popular?” in International World Wide Web Conference (WWW), pp. 867–876, 2014. http://dx.doi.org/10.1145/2566486.2567996

[26] O. Litzel, On Photographic Composition. New York: Amphoto Books, 1974.

[27] W. Niekamp, “An exploratory investigation into factors affecting visual balance,” Educational Communication and Technology: A Journal of Theory, Research, and Development, vol. 29, no. 1, pp. 37–48, 1981.

[28] R. Arnheim, Art and Visual Perception: A Psychology of the Creative Eye. Los Angeles, CA: University of California Press, 1974.

[29] D. Joshi, R. Datta, E. Fedorovskaya, Q. T. Luong, J. Z. Wang, J. Li, and J. B. Luo, “Aesthetics and emotions in images,” IEEE Signal Processing Magazine, vol. 28, no. 5, pp. 94–115, 2011. http://dx.doi.org/10.1109/MSP.2011.941851

[30] J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 22, no. 10, pp. 1345–1359, 2010. http://dx.doi.org/10.1109/TKDE.2009.191

[31] R. Collobert and J. Weston, “A unified architecture for natural language processing: Deep neural networks with multitask learning,” in International Conference on Machine Learning (ICML), pp. 160–167, 2008. http://dx.doi.org/10.1145/1390156.1390177

[32] X. Lu, P. Suryanarayan, R. B. Adams Jr, J. Li, M. G. Newman, and J. Z. Wang, “On shape and the computability of emotions,” in ACM International Conference on Multimedia (MM), pp. 229–238, 2012. http://dx.doi.org/10.1145/2393347.2393384

Links

Full Text

http://infolab.stanford.edu/~wangz/project/imsearch/Aesthetics/TMM15/lu.pdf

Internal file

Other links