Photo Quality Assessment with DCNN that Understands Image Well
Reference
Dong, Z., Shen, X., Li, H., Tian, X.: Photo Quality Assessment with DCNN that Understands Image Well. In: MultiMedia Modeling, Springer (2015) 524-535
DOI
http://dx.doi.org/10.1007/978-3-319-14442-9_57
Abstract
Photo quality assessment from the viewpoint of human aesthetics, which tries to classify images into good and bad categories, has drawn a lot of attention in the computer vision field. Many methods have been proposed for this problem, most of them based on hand-crafted features. However, because human aesthetic judgment is complex and subjective, it is difficult to describe and model all the factors that affect photo aesthetic quality, so those methods have achieved only limited success. Deep convolutional neural networks, on the other hand, have proved effective in many computer vision problems and do not require human effort in feature design. In this paper, we adopt a deep convolutional neural network that “understands” images well to assess photo aesthetic quality. First, we implement a deep convolutional neural network with eight layers and millions of parameters. Then, to “teach” this network enough knowledge about images, we train it on ImageNet, one of the largest available image databases. Next, for each given image, we take the activations of the last layer of the network as its aesthetic feature. Experimental results on two large and reliable image aesthetic quality assessment datasets demonstrate the effectiveness of our method.
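The pipeline described in the abstract (a fixed, pretrained network used as a feature extractor, followed by a good/bad classifier) can be illustrated with a small sketch. This is not the authors' implementation. Several things here are assumptions made purely for illustration: the toy two-layer fully connected network with random weights stands in for the eight-layer ImageNet-trained DCNN, the logistic-regression classifier is a placeholder because the abstract does not name the classifier, and all function names (`make_frozen_network`, `extract_feature`, `train_logistic`, `predict`) as well as the toy data are hypothetical.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(weights, biases, inputs):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def make_frozen_network(sizes, seed=0):
    """ASSUMPTION: random weights stand in for ImageNet-pretrained ones."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def extract_feature(layers, pixels):
    """Run the frozen network and return the last layer's activations."""
    activations = pixels
    for weights, biases in layers:
        activations = relu(dense(weights, biases, activations))
    return activations


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, epochs=100, lr=0.1):
    """ASSUMPTION: logistic regression as the good/bad classifier."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            grad = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def predict(model, feature):
    """Return 1 ("good") or 0 ("bad") for one feature vector."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, feature)) + b > 0 else 0


# Toy usage: 16-"pixel" images, a two-layer frozen network, 8-dim features.
network = make_frozen_network([16, 12, 8], seed=0)
rng = random.Random(1)
images = [[rng.random() for _ in range(16)] for _ in range(20)]
labels = [1 if sum(img) / 16 > 0.5 else 0 for img in images]  # made-up labels
features = [extract_feature(network, img) for img in images]
model = train_logistic(features, labels)
print([predict(model, f) for f in features[:5]])
```

The point of the design is that the network weights are never updated on the aesthetic data: only the small classifier on top of the extracted features is trained.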
Extended Abstract
Bibtex
@Inbook{Dong2015,
  author="Dong, Zhe and Shen, Xu and Li, Houqiang and Tian, Xinmei",
  editor="He, Xiangjian and Luo, Suhuai and Tao, Dacheng and Xu, Changsheng and Yang, Jie and Hasan, Muhammad Abul",
  title="Photo Quality Assessment with DCNN that Understands Image Well",
  bookTitle="MultiMedia Modeling: 21st International Conference, MMM 2015, Sydney, NSW, Australia, January 5-7, 2015, Proceedings, Part II",
  year="2015",
  publisher="Springer International Publishing",
  address="Cham",
  pages="524--535",
  isbn="978-3-319-14442-9",
  doi="10.1007/978-3-319-14442-9_57",
  url="http://dx.doi.org/10.1007/978-3-319-14442-9_57"
}
Used References
1. Geng, B., Yang, L., Xu, C., Hua, X.-S., Li, S.: The role of attractiveness in web image search. In: ACM MM, pp. 63–72 (2011)
2. Li, C., Loui, A.C., Chen, T.: Towards aesthetics: A photo quality assessment and photo selection system. In: ACM MM, pp. 827–830 (2010)
3. Bhattacharya, S., Sukthankar, R., Shah, M.: A Framework for Photo-Quality Assessment and Enhancement based on Visual Aesthetics. In: ACM MM, pp. 271–280 (2010)
4. Zhang, F.-L., Wang, M., Hu, S.-M.: Aesthetic image enhancement by dependence-aware object recomposition. IEEE Trans. on Multimedia, 1480–1490 (2013)
5. Datta, R., Joshi, D., Li, J., Wang, J.Z.: Studying aesthetics in photographic images using a computational approach. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3953, pp. 288–301. Springer, Heidelberg (2006) http://dx.doi.org/10.1007/11744078_23
6. Luo, Y., Tang, X.: Photo and video quality evaluation: Focusing on the subject. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part III. LNCS, vol. 5304, pp. 386–399. Springer, Heidelberg (2008) http://dx.doi.org/10.1007/978-3-540-88690-7_29
7. Lo, K.Y., Liu, K.H., Chen, C.S.: Assessment of photo aesthetics with efficiency. In: ICPR, pp. 2186–2189 (2012)
8. Ke, Y., Tang, X., Jing, F.: The design of high-level features for photo quality assessment. In: CVPR, pp. 419–426 (2006)
9. Tong, H., Li, M., Zhang, H.-J., He, J., Zhang, C.: Classification of digital photos taken by photographers or home users. In: Aizawa, K., Nakamura, Y., Satoh, S. (eds.) PCM 2004. LNCS, vol. 3331, pp. 198–205. Springer, Heidelberg (2004) http://dx.doi.org/10.1007/978-3-540-30541-5_25
10. Marchesotti, L., Perronnin, F., Larlus, D., Csurka, G.: Assessing the aesthetic quality of photographs using generic image descriptors. In: ICCV, pp. 1784–1791 (2011)
11. Dance, C., Willamowski, J., Fan, L., Bray, C., Csurka, G.: Visual categorization with bags of keypoints. In: ECCV International Workshop on Statistical Learning in Computer Vision, Prague (2004)
12. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: CVPR (2006)
13. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
14. Wong, L.K., Low, K.L.: Saliency-enhanced image aesthetics class prediction. In: ICPR, pp. 997–1000 (2009)
15. Yin, W., Mei, T., Chen, C.W.: Assessing photo quality with geo-context and crowdsourced photos. In: Visual Communications and Image Processing, pp. 1–6 (2012)
16. Chu, W.T., Chen, Y.K., Chen, K.T.: Size does matter: how image size affects aesthetic perception? In: ACM MM, pp. 53–62 (2013)
17. LeCun, Y., Boser, B., Denker, J.S., Henderson, D., Howard, R.E., et al.: Hand-written digit recognition with a back-propagation network. In: Advances in Neural Information Processing Systems (1990)
18. LeCun, Y., Bengio, Y.: Convolutional networks for images, speech, and time series. In: The Handbook of Brain Theory and Neural Networks. MIT Press (1995)
19. Hubel, D.H., Wiesel, T.: Receptive fields, binocular interaction, and functional architecture in the cat’s visual cortex. Journal of Physiology (London) 160, 106–154 (1962) http://dx.doi.org/10.1113/jphysiol.1962.sp006837
20. Lawrence, S., Giles, C.L., Tsoi, A.C., Back, A.D.: Face recognition: A convolutional neural network approach. IEEE Trans. Neural Networks 8, 98–113 (1997) http://dx.doi.org/10.1109/72.554195
21. Le Callet, P., Viard-Gaudin, C., Barba, D.: A convolutional neural network approach for objective video quality assessment. IEEE Trans. Neural Netw. 17(5), 1316–1327 (2006) http://dx.doi.org/10.1109/TNN.2006.879766
22. Szarvas, M., Yoshizawa, A., Yamamoto, M., Ogata, J.: Pedestrian detection with convolutional neural networks. In: Proc. IEEE Intell. Veh. Symp., pp. 224–229 (2005)
23. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proc. 27th International Conference on Machine Learning (2010)
24. Luo, W., Wang, X., Tang, X.: Content-based photo quality assessment. In: ICCV, pp. 2206–2213 (2011)
25. Murray, N., Marchesotti, L., Perronnin, F.: AVA: A large-scale database for aesthetic visual analysis. In: CVPR, pp. 2408–2415 (2012)
26. Deng, J., Berg, A., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
27. Jia, Y.: Caffe: An open source convolutional architecture for fast feature embedding (2013), http://caffe.berkeleyvision.org/
28. Liu, W.F., Tao, D.: Multiview hessian regularization for image annotation. IEEE Trans. on Image Processing, 2676–2687 (2013)