What's interesting?
Reference
Jürgen Schmidhuber (1997) What's interesting? Technical Report TR-35-97, IDSIA, Lugano, Switzerland.
DOI
Abstract
Interestingness depends on the observer's current knowledge and computational abilities. Things are boring if either too much or too little is known about them: if they appear either trivial or random. Interesting are unexpected regularities that seem easy to figure out. I attempt to implement these ideas in a "curious", "creative" explorer with two co-evolving "brains". It executes a lifelong sequence of instructions whose modifiable probabilities are conditioned on both brains: both must agree on each instruction. There are special instructions for comparing computational results. The brains can predict outcomes of such comparisons. If their opinions differ, then the winner will get rewarded, the loser punished. Hence each brain wants to lure the other into agreeing upon instruction subsequences involving comparisons that surprise it. The surprised brain adapts. In turn, the other loses a source of reward, which is an incentive to shift the focus of interest. Both brains deal with the complex credit assignment problem using the recent Incremental Self-Improvement paradigm. Extensive simulations include an example where curiosity helps to speed up external reward.
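As a rough illustration only, the following minimal Python sketch mimics the comparison-and-surprise idea from the abstract: two "brains" each predict the outcome of a comparison instruction, and when their predictions differ the correct one is rewarded, the wrong one is punished and adapts. It is not the paper's actual algorithm (which conditions instruction probabilities on both brains and uses the Incremental Self-Improvement paradigm for credit assignment); all names (Brain, run_comparison) and the toy environment are assumptions made for this sketch.

import random

# Toy illustration of the two-brain "surprise" reward scheme described in
# the abstract. Everything below is an assumption for illustration only.

class Brain:
    """Keeps a table of predicted comparison outcomes; adapts when surprised."""
    def __init__(self, name):
        self.name = name
        self.reward = 0.0
        self.predictions = {}  # comparison id -> predicted outcome (bool)

    def predict(self, comparison_id):
        # First time a comparison is seen, make a random guess and remember it.
        return self.predictions.setdefault(comparison_id, random.random() < 0.5)

    def adapt(self, comparison_id, actual):
        # The surprised brain simply stores the observed outcome.
        self.predictions[comparison_id] = actual


def run_comparison(comparison_id, actual_outcome, left, right):
    """One 'comparison instruction': both brains predict the outcome; if they
    disagree, the correct one is rewarded, the wrong one punished and adapted."""
    p_left, p_right = left.predict(comparison_id), right.predict(comparison_id)
    if p_left == p_right:
        return  # no disagreement, no reward transfer
    winner, loser = (left, right) if p_left == actual_outcome else (right, left)
    winner.reward += 1.0
    loser.reward -= 1.0
    loser.adapt(comparison_id, actual_outcome)  # the surprised brain adapts


if __name__ == "__main__":
    random.seed(0)
    a, b = Brain("left"), Brain("right")
    # Toy environment: comparison i asks whether i is even.
    for _ in range(20):
        i = random.randrange(5)
        run_comparison(i, actual_outcome=(i % 2 == 0), left=a, right=b)
    print(a.name, a.reward, b.name, b.reward)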
Extended Abstract
Bibtex
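A minimal BibTeX entry assembled from the reference data above; the citation key and entry type are assumptions.

% Citation key and entry type are assumptions; bibliographic data is taken from the reference above.
@techreport{Schmidhuber97interesting,
  author      = {J{\"u}rgen Schmidhuber},
  title       = {What's interesting?},
  number      = {TR-35-97},
  institution = {IDSIA},
  address     = {Lugano, Switzerland},
  year        = {1997}
}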
Used References
[1] E. B. Baum. Neural nets that learn in polynomial time from examples and queries. IEEE Transactions on Neural Networks, 2(1):5-19, 1991.
[2] G. J. Chaitin. On the length of programs for computing finite binary sequences: statistical considerations. Journal of the ACM, 16:145-159, 1969.
[3] A. C. Clarke. The Ghost from the Grand Banks. Orbit Books, 1991.
[4] D. A. Cohn. Neural network exploration using optimal experiment design. In J. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems 6, pages 679-686. San Mateo, CA: Morgan Kaufmann, 1994.
[5] P. Dayan and T. J. Sejnowski. Exploration bonuses and dual control. Machine Learning, 25:5-22, 1996.
[6] V. V. Fedorov. Theory of Optimal Experiments. Academic Press, 1972.
[7] S. Geman, E. Bienenstock, and R. Doursat. Neural networks and the bias/variance dilemma. Neural Computation, 4:1-58, 1992.
[8] D. Hillis. Co-evolving parasites improve simulated evolution as an optimization procedure. In C. G. Langton, C. Taylor, J. D. Farmer, and S. Rasmussen, editors, Artificial Life II, pages 313-324. Addison Wesley Publishing Company, 1992.
[9] J. Hwang, J. Choi, S. Oh, and R. J. Marks II. Query-based learning applied to partially trained multilayer perceptrons. IEEE Transactions on Neural Networks, 2(1):131-136, 1991.
[10] A. N. Kolmogorov. Three approaches to the quantitative definition of information. Problems of Information Transmission, 1:1-11, 1965.
[11] M. Li and P. M. B. Vitanyi. An Introduction to Kolmogorov Complexity and its Applications. Springer, 1993.
[12] L. J. Lin. Reinforcement Learning for Robots Using Neural Networks. PhD thesis, Carnegie Mellon University, Pittsburgh, January 1993.
[13] D. J. C. MacKay. Information-based objective functions for active data selection. Neural Computation, 4(2):550-604, 1992.
[14] F. Nake. Ästhetik als Informationsverarbeitung. Springer, 1974.
[15] M. Plutowski, G. Cottrell, and H. White. Learning Mackey-Glass from 25 examples, plus or minus 2. In J. Cowan, G. Tesauro, and J. Alspector, editors, Advances in Neural Information Processing Systems 6, pages 1135-1142. San Mateo, CA: Morgan Kaufmann, 1994.
[16] J. B. Pollack and A. D. Blair. Why did TD-Gammon work? In M. C. Mozer, M. I. Jordan, and S. Petsche, editors, Advances in Neural Information Processing Systems 9, pages 10-16. MIT Press, Cambridge, 1997.
[17] A. L. Samuel. Some studies in machine learning using the game of checkers. IBM Journal on Research and Development, 3:210-229, 1959.
[18] J. Schmidhuber. Curious model-building control systems. In Proc. International Joint Conference on Neural Networks, Singapore, volume 2, pages 1458-1463. IEEE, 1991.
[19] J. Schmidhuber. A possibility for implementing curiosity and boredom in model-building neural controllers. In J. A. Meyer and S. W. Wilson, editors, Proc. of the International Conference on Simulation of Adaptive Behavior: From Animals to Animats, pages 222-227. MIT Press/Bradford Books, 1991.
[20] J. Schmidhuber. Learning factorial codes by predictability minimization. Neural Computation, 4(6):863-879, 1992.
[21] J. Schmidhuber. Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 1997. In press.
[22] J. Schmidhuber. Low-complexity art. Leonardo, Journal of the International Society for the Arts, Sciences, and Technology, 30(2):97-103, 1997.
[23] J. Schmidhuber, M. Eldracher, and B. Foltin. Semilinear predictability minimization produces well-known feature detectors. Neural Computation, 8(4):773-786, 1996.
[24] J. Schmidhuber and D. Prelinger. Discovering predictable classifications. Neural Computation, 5(4):625-635, 1993.
[25] J. Schmidhuber, J. Zhao, and N. Schraudolph. Reinforcement learning with self-modifying policies. In S. Thrun and L. Pratt, editors, Learning to Learn. Kluwer, 1997. To appear.
[26] J. Schmidhuber, J. Zhao, and M. Wiering. Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28:105-130, 1997. In press.
[27] N. Schraudolph and T. J. Sejnowski. Unsupervised discrimination of clustered data via optimization of binary information gain. In S. J. Hanson, J. D. Cowan, and C. L. Giles, editors, Advances in Neural Information Processing Systems 5, pages 499-506. Morgan Kaufmann, San Mateo, 1993.
[28] C. E. Shannon. A mathematical theory of communication (parts I and II). Bell System Technical Journal, XXVII:379-423, 1948.
[29] R. J. Solomonoff. A formal theory of inductive inference. Part I. Information and Control, 7:1-22, 1964.
[30] J. Storck, S. Hochreiter, and J. Schmidhuber. Reinforcement driven information acquisition in non-deterministic environments. In Proceedings of the International Conference on Artificial Neural Networks, Paris, volume 2, pages 159-164. EC2 & Cie, Paris, 1995.
[31] R. S. Sutton. Integrated architectures for learning, planning and reacting based on dynamic programming. In Machine Learning: Proceedings of the Seventh International Workshop, 1990.
Links
Full Text
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.6362&rep=rep1&type=pdf
ftp://ftp.idsia.ch/pub/juergen/interest.ps.gz
Other Links
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.6362