Using genetic algorithm to improve information retrieval systems



Ahmed A. A. Radwan, Bahgat A. Abdel Latef, Abdel Mgeid A. Ali, and Osman A. Sadek (2006): Using genetic algorithm to improve information retrieval systems. World Academy of Science and Engineering Technology, Vol. 17, No. 2, pp. 6–13.



This study investigates the use of genetic algorithms in information retrieval. The method is shown to be applicable to three well-known document collections, where the genetic modification presents more relevant documents to users. The paper also presents a new fitness function for approximate information retrieval that is faster and more flexible than the cosine similarity fitness function.
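As a rough illustration of the baseline the abstract refers to, the following sketch uses cosine similarity as the fitness function of a small genetic algorithm that evolves query term weights. It is not the authors' implementation, and the paper's new fitness function is not reproduced here because this page does not define it. The names `fitness` and `evolve_query`, the population settings, and the choice of truncation selection, one-point crossover, and single-weight mutation are all assumptions made for the example.

```python
import math
import random


def cosine_similarity(query, doc):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(q * d for q, d in zip(query, doc))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_d = math.sqrt(sum(d * d for d in doc))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_q * norm_d)


def fitness(chromosome, relevant_docs):
    """Assumed fitness: mean cosine similarity to known relevant documents."""
    total = sum(cosine_similarity(chromosome, d) for d in relevant_docs)
    return total / len(relevant_docs)


def evolve_query(relevant_docs, n_terms, pop_size=20, generations=30, seed=0):
    """Evolve a query weight vector; requires n_terms >= 2 and pop_size >= 4."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_terms)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        population.sort(key=lambda c: fitness(c, relevant_docs), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_terms)      # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_terms)] = rng.random()  # mutate one weight
            children.append(child)
        population = parents + children
    return max(population, key=lambda c: fitness(c, relevant_docs))


if __name__ == "__main__":
    # Toy relevant documents over a three-term vocabulary (made-up weights).
    docs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
    best = evolve_query(docs, n_terms=3)
    print(best, fitness(best, docs))
```

Because the fitter half of the population is carried over unchanged, the best fitness never decreases from one generation to the next. A real system would derive the document vectors from an indexed collection and would evaluate retrieval quality with precision and recall rather than the fitness value alone.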

Extended Abstract


author = {Ahmed A. A. Radwan, Bahgat A. Abdel Latef, Abdel Mgeid A. Ali, and Osman A. Sadek},
title = {Using genetic algorithm to improve information retrieval systems},
journal = {World Academy of Science and Engineering Technology},
volume = {17},
number = {2},
pages = {6-13},
year = {2006},
keywords={Cosine similarity, Fitness function, Genetic Algorithm, Information Retrieval, Query learning}

Used References

[1] J. H. Holland, “Adaptation in Natural and Artificial Systems”, University of Michigan Press, Ann Arbor, 1975.

[2] K. A. DeJong, An Analysis of the Behavior of a Class of Genetic Adaptive Systems, Ph.D. Thesis, University of Michigan, 1975.

[3] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, Reading, MA., 1989.

[4] H. Chen, “Machine learning for information retrieval: neural networks, symbolic learning, and genetic algorithms”. Journal of the American Society for Information Science, 46(3), 1995, pp. 194–216.

[5] J. Savoy and D. Vrajitoru, “Evaluation of learning schemes used in information retrieval (CR-I-95-02)”. Universite de Neuchatel, Faculte de droit et des Sciences Economiques, 1996.

[6] M. Gordon, “Probabilistic and genetic algorithms in document retrieval”. Communications of the ACM, 31(10), 1988, pp. 1208–1218.

[7] J. Yang, R. Korfhage and E. Rasmussen. “Query improvement in information retrieval using genetic algorithms––a report on the experiments of the TREC project”. In Proceedings of the 1st text retrieval conference (TREC-1), 1992, pp. 31–58.

[8] J. Morgan and A. Kilgour. “Personalising on-line information retrieval support with a genetic algorithm”. In A. Moscardini, & P. Smith (Eds.), PolyModel 16: Applications of artificial intelligence, 1996, pp. 142–149.

[9] M. Boughanem, C. Chrisment, and L. Tamine. “On using genetic algorithms for multimodal relevance optimization in information retrieval”. Journal of the American Society for Information Science and Technology, 53(11), 2002, pp. 934–942.

[10] J. T. Horng and C. C. Yeh. “Applying genetic algorithms to query optimization in document retrieval”. Information Processing & Management, 36(5), 2000, pp. 737–759.

[11] D. Vrajitoru. “Crossover improvement for the genetic algorithm in information retrieval”. Information Processing & Management, 34(4), 1998, pp. 405–415.

[12] D. Vrajitoru. “Large population or many generations for genetic algorithms? Implications in information retrieval”. In F. Crestani and G. Pasi (Eds.), Soft computing in information retrieval. Techniques and applications, Physica-Verlag, 2000, pp. 199–222.

[13] D. Harman. “Overview of the first TREC conference”. In Proceedings of the 16th ACM SIGIR conference on information retrieval, 1993, pp. 36–47.

[14] B. T. Bartell, G. W. Cottrell and R. K. Belew. “Automatic combination of multiple ranked retrieval systems”. In Proceedings of the 17th ACM SIGIR conference on information retrieval, 1994, pp. 173–181.

[15] P. Pathak, M. Gordon and W. Fan. “Effective information retrieval using genetic algorithms based matching functions adaptation”, in: Proc. 33rd Hawaii International Conference on System Sciences (HICSS), Hawaii, USA, 2000.

[16] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval, Addison-Wesley, 1999.

[17] G. Salton and M.H. McGill. Introduction to Modern Information Retrieval, McGraw-Hill, 1983.

[18] C.J. Van Rijsbergen. Information Retrieval, second ed., Butterworth, 1979.

[19] A. Bookstein. “Outline of a general probabilistic retrieval model”, Journal of Documentation 39 (2), 1983, pp. 63–72.

[20] N. Fuhr. “Probabilistic models in information retrieval”, Computer Journal 35 (3), 1992, pp. 243–255.

[21] C. H. Chang and C. C. Hsu. The design of an information system for hypertext retrieval and automatic discovery on WWW. Ph.D. thesis, Department of CSIE, National Taiwan University, 1999.

[22] K. L. Kwok. “Comparing representations in Chinese information retrieval”. ACM SIGIR'97, Philadelphia, PA, USA, 1997, pp. 34–41.

[23] T. Mitchell. Machine Learning, McGraw-Hill, 1997.

[24] H. Chen et al., “A machine learning approach to inductive query by examples: an experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing”, Journal of the American Society for Information Science 49 (8), 1998, pp. 693–705.

[25] W. Fan, M.D. Gordon and P. Pathak. “Personalization of search engine services for effective retrieval and knowledge management”, in: Proc. 2000 International Conference on Information Systems (ICIS), Brisbane, Australia, 2000.

[26] A.M. Robertson and P. Willett. “Generation of equifrequent groups of words using a genetic algorithm”, Journal of Documentation 50 (3), 1994, pp. 213–232.

[27] M. Gordon. “Probabilistic and genetic algorithms for document retrieval”, Communications of the ACM 31 (10), 1988, pp. 1208–1218.

[28] W. Fan, M.D. Gordon and P. Pathak. “Discovery of context-specific ranking functions for effective information retrieval using genetic programming”, IEEE Transactions on Knowledge and Data Engineering, in press.

[29] M.P. Smith, M. Smith. “The use of genetic programming to build Boolean queries for text retrieval through relevance feedback”, Journal of Information Science 23 (6), 1997, pp. 423–431.

[30] J. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection, The MIT Press, 1992.

[31] J. Yang and R. Korfhage. “Query modifications using genetic algorithms in vector space models”, International Journal of Expert Systems 7 (2), 1994, pp. 165–191.

[32] H. Kucera and N. Francis. “Computational analysis of present-day American English”. Providence, RI: Brown University Press, 1967.

[33] M. F. Porter. “An algorithm for suffix stripping”. Program, 14(3), 1980, pp. 130–137.

[34] G. Salton and C. Buckley. “Improving retrieval performance by relevance feedback”. Journal of the American Society for Information Science, 41(4), 1990, pp. 288–297.

[35] T. Noreault, M. McGill and M. B. Koll. “A performance evaluation of similarity measures, document term weighting schemes and representation in a Boolean environment”. Information retrieval research. London: Butterworths, 1981.


Full Text

internal file

Other Links