The Effects of Randomly Sampled Training Data on Program Evolution

Aus de_evolutionary_art_org
Wechseln zu: Navigation, Suche


Brian J. Ross: The Effects of Randomly Sampled Training Data on Program Evolution. GECCO 2000, ed. D. Whitley et al., Morgan Kaufmann, 2000, pp. 443-450.



The e ects of randomly sampled training data on genetic programming performance is empirically investigated. Often the most natural, if not only, means of characterizing the target behaviour for a problem is to ran- domly sample training cases inherent to that problem. A natural question to raise about this strategy is, how deleterious is the ran- domly sampling of training data to evolution performance? Will sampling reduce the evo- lutionary search to hill climbing? Can re- sampling during the run be advantageous? We address these questions by undertaking a suite of di erent GP experiments. Pa- rameters include various sampling strategies (single, re-sampling, ideal samples), genera- tional and steady{state evolution, and non{ evolutionary strategies such as hill climbing and random search. The experiments con rm that random sampling e ectively character- izes stochastic domains during genetic pro- gramming, provided that a su ciently rep- resentative sample is used. An unexpected result is that genetic programming may per- form worse than random search when the sampled training sets are exceptionally poor. We conjecture that poor training sets cause evolution to prematurely converge to unde- sirable optima, which irrevocably handicaps the population's diversity and viability.

Extended Abstract


Used References

Dupont, P. (1994). Regular Grammatical Inference from Positive and Negative Samples by Ge- netic Search: the GIG method. In: 2nd Intl. Coll. on Grammatical Inference and Applications. Springer-Verlag. pp. 236{245.

Garg, V.K., R. Kumar and S.I Marcus (1999). Proba- bilistic Language Formalism for Stochastic Dis- crete Event Systems. IEEE Trans. Automatic Control 44, 280{293.

Giguere, P. and D.E. Goldberg (1998). Population Siz- ing for Optimum Sampling with Genetic Algo- rithms: A Case Study of the Onemax Problem. In: Proc. Genetic Programming 1998 (J.R. Koza et al, Ed.). Morgan Kaufmann. pp. 496{503.

Kammeyer, T.E. and R.K. Belew (1997). Stochastic Context-free Grammar Induction with a Genetic Algorithm Using Local Search. In: Foundations of Genetic Algorithms IV (R.K. Belew and M. Vode, Eds.). Morgan-Kaufmann.

Longshaw, T. (1997). Evolutionary learning of large grammars. In: Proc. Genetic Programming 1997 (J.R. Koza et al, Ed.). Morgan Kaufmann. Stan- ford University, CA, USA. pp. 406{409.

Miller, B.L. and D.E. Goldberg (1996). Optimum Sam- pling for Genetic Algorithms. In: Arti cial Neu- ral Networks in Engineering (ANNIE '96). ASME Press. pp. 291{298.

Ross, B.J. (1999). Logic-based Genetic Program- ming with De nite Clause Translation Gram- mars. Technical Report CS-99-02. Brock Univer- sity, Dept. of Computer Science.

Ross, B.J. (2000). Probabilistic Pattern Matching and the Evolution of Stochastic Regular Expressions. Applied Intelligence. Accepted for publication.

Ross, B.J., F. Fueten and D.Y. Yashkir (2000). Edge Detection of Petrographic Images Using Genetic Programming. In: Proc. GECCO 2000 (D. Whit- ley et al., Ed.). Morgan Kaufmann.

Sipser, M. (1996). Introduction to the Theory of Com- putation. PWS Pub. Co.


Full Text

intern file

Sonstige Links