A review on the application of evolutionary computation to information retrieval



O. Cordón, E. Herrera-Viedma, C. López-Pujalte, M. Luque, C. Zarco: A review on the application of evolutionary computation to information retrieval. International Journal of Approximate Reasoning 34 (July 2003) 241–264.




In this contribution, different proposals found in the specialized literature for the application of evolutionary computation to the field of information retrieval will be reviewed. To do so, different kinds of IR problems that have been solved by evolutionary algorithms are analyzed. Some of the specific existing approaches will be specifically described for some of these problems and the obtained results will be critically evaluated in order to give a clear view of the topic to the reader.

Extended Abstract


title = "A review on the application of evolutionary computation to information retrieval",
journal = "International Journal of Approximate Reasoning",
volume = "34",
number = "2",
pages = "241--264",
year = "2003",
note = "Soft Computing Applications to Intelligent Information Retrieval on the Internet",
issn = "0888-613X",
doi = "http://dx.doi.org/10.1016/j.ijar.2003.07.010",
url = "http://www.sciencedirect.com/science/article/pii/S0888613X0300094X http://de.evo-art.org/index.php?title=A_review_on_the_application_of_evolutionary_computation_to_information_retrieval",
author = "O. Cordón and E. Herrera-Viedma and C. López-Pujalte and M. Luque and C. Zarco",
keywords = "Information retrieval, Evolutionary algorithms, Automatic indexing, Document clustering, Query definition, User profiles, Internet search agents, Image retrieval, Similarity functions, Web pages"

Used References

[1] T. Bäck, D.B. Fogel, Z. Michalewicz: Handbook of Evolutionary Computation. IOP Publishing and Oxford University Press (1997)

[2] R. Baeza-Yates, B. Ribeiro-Neto: Modern Information Retrieval. Addison-Wesley (1999)

[3] A. Bergström, P. Jaksetic, P. Nordin: Enhancing information retrieval by automatic acquisition of textual relations using genetic programming. in: Proc. 2000 International Conference on Intelligent User Interfaces, New Orleans, USA, 2000, pp. 29–32 http://dx.doi.org/10.1145/325737.325763 http://www.media.mit.edu/~lieber/IUI/Bergstrom/Bergstrom.pdf

[4] A. Bookstein: Outline of a general probabilistic retrieval model. Journal of Documentation, 39 (2) (1983), pp. 63–72

[5] G. Bordogna, P. Carrara, G. Pasi: Fuzzy approaches to extend Boolean information retrieval. P. Bosc, J. Kacprzyk (Eds.), Fuzziness in Database Management Systems (1995), pp. 231–274

[6] M. Boughanem, C. Chrisment, L. Tamine: Genetic approach to query space exploration. Information Retrieval, 1 (1999), pp. 175–192 http://dx.doi.org/10.1023/A:1009931404333

[7] M. Boughanem, C. Chrisment, L. Tamine: On using genetic algorithms for multimodal relevance optimization in information retrieval. Journal of the American Society for Information Science and Technology, 53 (11) (2002), pp. 934–942 http://dx.doi.org/10.1002/asi.10119 https://hal.archives-ouvertes.fr/hal-00359529/document

[8] M. Boughanem, C. Chrisment, L. Tamine: Multiple query evaluation based on an enhanced genetic algorithm. Information Processing and Management, 39 (2003), pp. 215–231 http://dx.doi.org/10.1016/S0306-4573(02)00048-1

[9] C.H. Chang, C.C. Hsun: The Design of an Information System for Hypertext Retrieval and Automatic Discovery on WWW. PhD thesis, Department of CSIE, National Taiwan University, 1999

[10] Y.K. Chang, C. Cirillo: Evaluation of feedback retrieval using modified freezing, residual collection, and test and control groups. G. Salton (Ed.), The Smart Retrieval System: Experiments in Automatic Document Processing, Prentice Hall (1971), pp. 335–370

[11] H. Chen, et al.: A machine learning approach to inductive query by examples: an experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing. Journal of the American Society for Information Science, 49 (8) (1998), pp. 693–705 http://www.dtic.mil/dtic/tr/fulltext/u2/a573988.pdf

[12] H. Chen, Y.-M. Chung, M. Ramsey, C. Yang: An intelligent personal spider (agent) for dynamic Internet/Intranet searching. Decision Support Systems, 23 (1998), pp. 41–58

[13] Y. Chen, C. Shahabi: Automatically improving the accuracy of user profile with genetic algorithm. in: Proc. International Conference on Artificial Intelligence and Soft Computing, Cancun, Mexico, 2001 http://infolab.usc.edu/DocsDemos/asc01.pdf

[14] S. Cho, J. Lee: A human-oriented image retrieval system using interactive genetic algorithm. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 32 (3) (2002), pp. 452–458

[15] O. Cordón, E. Herrera-Viedma, M. Luque: Evolutionary learning of Boolean queries by multiobjective genetic programming. in: Proc. PPSN-VII, Granada, Spain, 2002, pp. 710–719, LNCS 2439 http://dx.doi.org/10.1007/3-540-45712-7_68 http://delta.cs.cinvestav.mx/~ccoello/EMOO/cordon02.pdf.gz

[16] O. Cordón, E. Herrera-Viedma, M. Luque, F. Moya, C. Zarco: Analyzing the performance of a multiobjective GA-P algorithm for learning fuzzy queries in a machine learning environment. in: International Fuzzy Systems Association World Congress, 2003, Istanbul, Turkey, LNAI 2715 http://dx.doi.org/10.1007/3-540-44967-1_73 http://sci2s.ugr.es/sites/default/files/ficherosPublicaciones/0319_ifsa03-cordon.pdf

[17] O. Cordón, F. Moya, C. Zarco: A GA-P algorithm to automatically formulate extended Boolean queries for a fuzzy information retrieval system. Mathware & Soft Computing, 7 (2–3) (2000), pp. 309–322 http://sci2s.ugr.es/sites/default/files/ficherosPublicaciones/0450_MATHWARE_2000_07_02-03_18.pdf

[18] O. Cordón, F. Moya, C. Zarco: A new evolutionary algorithm combining simulated annealing and genetic programming for relevance feedback in fuzzy information retrieval systems. Soft Computing, 6 (5) (2002), pp. 308–319 http://scimago1.ugr.es/publications/softc-02.pdf

[19] O. Cordón, F. Moya, C. Zarco: Automatic learning of multiple extended Boolean queries by multiobjective GA-P algorithms. V. Loia, M. Nikravesh, L.A. Zadeh (Eds.), Fuzzy Logic and the Internet, Springer (2003) http://dx.doi.org/10.1007/978-3-540-39988-9_3 http://sci2s.ugr.es/sites/default/files/ficherosPublicaciones/0327_cordon-libro-loia.pdf

[20] W. Fan, M. Gordon, P. Pathak: Automatic generation of a matching function by genetic programming for effective information retrieval. in: Americas Conference on Information Systems (AMCIS), Milwaukee, USA, August 1999 http://aisel.aisnet.org/cgi/viewcontent.cgi?article=1374&context=amcis1999

[21] W. Fan, M.D. Gordon, P. Pathak: Personalization of search engine services for effective retrieval and knowledge management. in: Proc. 2000 International Conference on Information Systems (ICIS), Brisbane, Australia, 2000

[22] W. Fan, M.D. Gordon, P. Pathak: Discovery of context-specific ranking functions for effective information retrieval using genetic programming. IEEE Transactions on Knowledge and Data Engineering, 16 (4) (2004), pp. 523–527 http://dx.doi.org/10.1109/TKDE.2004.1269663

[23] J.L. Fernández-Villacañas, M. Shackleton: Investigation of the importance of the genotype–phenotype mapping in information retrieval. Future Generation Computer Systems, 19 (2003), pp. 55–68 http://dx.doi.org/10.1016/S0167-739X(02)00108-5

[24] D.B. Fogel: System Identification through Simulated Evolution: A Machine Learning Approach. Ginn Press, USA (1991)

[25] C.M. Fonseca, P.J. Fleming: Genetic algorithms for multiobjective optimization: formulation, discussion and generalization. in: Proc. Fifth International Conference on Genetic Algorithms, 1993, pp. 416–423

[26] N. Fuhr: Probabilistic models in information retrieval. Computer Journal, 35 (3) (1992), pp. 243–255

[27] H.-P. Schwefel: Evolution and Optimum Seeking. Sixth Generation Computer Technology Series, John Wiley and Sons (1995)

[28] M. Gordon: Probabilistic and genetic algorithms for document retrieval. Communications of the ACM, 31 (10) (1988), pp. 1208–1218 http://dx.doi.org/10.1145/63039.63044

[29] M. Gordon: User-based document clustering by redescribing subject description with a genetic algorithm. Journal of the American Society for Information Science, 42 (5) (1991), pp. 311–322 http://dx.doi.org/10.1002/(SICI)1097-4571(199106)42:5<311::AID-ASI1>3.0.CO;2-J

[30] J. Horng, C. Yeh: Applying genetic algorithms to query optimization in document retrieval. Information Processing and Management, 36 (2000), pp. 737–759 http://dx.doi.org/10.1016/S0306-4573(00)00008-X

[31] S. Kato, S. Iisaku: An image retrieval method based on a genetic algorithm. in: Proc. Twelfth International Conference on Information Networking (ICOIN-12), 1998, pp. 333–336 http://dx.doi.org/10.1109/ICOIN.1998.648404

[32] J. Koza: Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT Press (1992)

[33] D.H. Kraft, F.E. Petry, B.P. Buckles, T. Sadasivan: Genetic algorithm for query optimization in information retrieval: relevance feedback. E. Sanchez, T. Shibata, L.A. Zadeh (Eds.), Genetic Algorithms and Fuzzy Logic Systems (1997), pp. 155–173

[34] K.L. Kwok: Comparing representations in Chinese information retrieval. in: ACM SIGIR’97, Philadelphia, USA, 1997, pp. 34–41

[35] H. Larsen, N. Marín, M.J. Martín-Bautista, M.A. Vila: Using genetic feature selection for optimizing user profile. Mathware & Soft Computing, 7 (2000), pp. 275–286

[36] V. Loia, P. Luengo: An Evolutionary Approach to Automatic Web Page Categorization and Updating. in: First Asia–Pacific Conference, Maebashi City, Japan, 2001, pp. 292–302 http://dx.doi.org/10.1007/3-540-45490-X_35

[37] C. López-Pujalte, V. Guerrero, F. Moya: A test of genetic algorithms in relevance feedback. Information Processing & Management, 38 (2002), pp. 793–805 http://dx.doi.org/10.1016/S0306-4573(01)00061-9

[38] C. López-Pujalte, V. Guerrero, F. Moya: Genetic algorithms in relevance feedback: a second test and new contributions. Information Processing & Management, 39 (5) (2003), pp. 669–687 http://dx.doi.org/10.1016/S0306-4573(02)00044-4

[39] C. López-Pujalte, V. Guerrero, F. Moya: Order-based fitness functions for genetic algorithms applied to relevance feedback. Journal of the American Society for Information Science and Technology, 54 (2) (2003), pp. 152–160 http://dx.doi.org/10.1002/asi.10179

[40] M.J. Martín-Bautista, H. Larsen, M.A. Vila: A fuzzy genetic algorithm approach to an adaptive information retrieval agent. Journal of the American Society for Information Science, 50 (9) (1999), pp. 760–771 http://hera.ugr.es/doi/15000266.pdf

[41] Z. Michalewicz: Genetic Algorithms+Data Structures=Evolution Programs. Springer-Verlag (1996)

[42] T. Mitchell: Machine Learning. McGraw-Hill (1997)

[43] P. Pathak, M. Gordon, W. Fan: Effective information retrieval using genetic algorithms based matching functions adaptation. in: Proc. 33rd Hawaii International Conference on System Sciences (HICSS), Hawaii, USA, 2000 http://dx.doi.org/10.1109/HICSS.2000.926653 https://www.computer.org/csdl/proceedings/hicss/2000/0493/02/04932011.pdf

[44] A. Robertson, P. Willett: An upperbound to the performance for ranked-output searching: optimal weighting of query terms using a genetic algorithm. Journal of Documentation, 52 (4) (1996), pp. 405–420 http://dx.doi.org/10.1108/eb026973

[45] A.M. Robertson, P. Willett: Generation of equifrequent groups of words using a genetic algorithm. Journal of Documentation, 50 (3) (1994), pp. 213–232

[46] S.E. Robertson, K. Sparck Jones: Relevance weighting of search terms. Journal of the American Society for Information Science, 27 (1976), pp. 129–145

[47] G. Salton, M.H. McGill: Introduction to Modern Information Retrieval. McGraw-Hill (1983)

[48] E. Sanchez, H. Miyano, J. Brachet: Optimization of fuzzy queries with genetic algorithms. Applications to a data base of patents in biomedical engineering. in: Proc. VI IFSA Congress, Sao-Paulo, Brazil, 1995, pp. 293–296

[49] L. Sánchez: A niching scheme for steady state GA-P and its application to fuzzy rule based classifiers induction. Mathware & Soft Computing, 7 (2–3) (2000), pp. 337–350

[50] T.K. Shih: Mobile agent evolution computing. Information Sciences, 137 (2001), pp. 53–73 http://www.sciencedirect.com/science/article/pii/S0020025501001098/pdf?md5=e884830e72fec6e4b720efec51673eeb&pid=1-s2.0-S0020025501001098-main.pdf

[51] M.P. Smith, M. Smith: The use of genetic programming to build Boolean queries for text retrieval through relevance feedback. Journal of Information Science, 23 (6) (1997), pp. 423–431

[52] Z. Stejic, Y. Takama, K. Hirota: Genetic algorithm-based relevance feedback for image retrieval using local similarity patterns. Information Processing & Management, 39 (2003), pp. 1–23 http://www.sciencedirect.com/science/article/pii/S0306457302000249/pdf?md5=12918acbc9dbe47063abbf2a65ee9cfd&pid=1-s2.0-S0306457302000249-main.pdf

[53] C.J. Van Rijsbergen: Information Retrieval. (second ed.) Butterworth (1979)

[54] D. Vrajitoru: Crossover improvement for the genetic algorithm in information retrieval. Information Processing and Management, 34 (4) (1998), pp. 405–415 http://dx.doi.org/10.1016/S0306-4573(98)00015-6 http://lcvmwww.epfl.ch/publications/data/articles/39/IPM98.ps.gz

[55] R.L. Walker: Assessment of the web using genetic programming. in: Proc. Genetic and Evolutionary Computation Conference, San Francisco, 1999, pp. 1750–1755

[56] R.L. Walker: Search engine case study: searching the web using genetic programming and MPI. Parallel Computing, 27 (2001), pp. 71–89 http://dx.doi.org/10.1016/S0167-8191(00)00089-2

[57] J.-J. Yang, R.R. Korfhage: Query modifications using genetic algorithms in vector space models. International Journal of Expert Systems, 7 (2) (1994), pp. 165–191


Full Text


internal file

Other Links