Order-based fitness functions for genetic algorithms applied to relevance feedback
Inhaltsverzeichnis
Referenz
C. López-Pujalte, V. Guerrero, F. Moya: Order-based fitness functions for genetic algorithms applied to relevance feedback. Journal of the American Society for Information Science and Technology, 54 (2) (2003), pp. 152–160
DOI
http://dx.doi.org/10.1002/asi.10179
Abstract
Recently there have been appearing new applications of genetic algorithms to information retrieval, most of them specifically to relevance feedback. The evolution of the possible solutions are guided by fitness functions that are designed as measures of the goodness of the solutions. These functions are naturally the key to achieving a reasonable improvement, and which function is chosen most distinguishes one experiment from another. In previous work, we found that, among the functions implemented in the literature, the ones that yield the best results are those that take into account not only when documents are retrieved, but also the order in which they are retrieved. Here, we therefore evaluate the efficacy of a genetic algorithm with various order-based fitness functions for relevance feedback (some of them of our own design), and compare the results with the Ide dec-hi method, one of the best traditional methods.
Extended Abstract
Bibtex
@article{Lopez-Pujalte:2003:OFF:767375.767381, author = {L\'{o}pez-Pujalte, Cristina and Guerrero-Bote, Vicente P. and de Moya-Aneg\'{o}n, F{\'e}lix}, title = {Order-based Fitness Functions for Genetic Algorithms Applied to Relevance Feedback}, journal = {J. Am. Soc. Inf. Sci. Technol.}, issue_date = {January 2003}, volume = {54}, number = {2}, month = jan, year = {2003}, issn = {1532-2882}, pages = {152--160}, numpages = {9}, url = {http://dx.doi.org/10.1002/asi.10179}, doi = {10.1002/asi.10179}, acmid = {767381}, publisher = {John Wiley \& Sons, Inc.}, address = {New York, NY, USA}, }
Used References
1 Ricardo A. Baeza-Yates , Berthier Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1999 http://dl.acm.org/citation.cfm?id=553876&CFID=558819604&CFTOKEN=68186175
2 R. K. Belew, Adaptive information retrieval: using a connectionist representation to retrieve and learn about documents, Proceedings of the 12th annual international ACM SIGIR conference on Research and development in information retrieval, p.11-20, June 25-28, 1989, Cambridge, Massachusetts, USA http://doi.acm.org/10.1145/75334.75337
3 Chang, C.-H., & Hsu, C.-C. (1999). The design of an information system for hypertext retrieval and automatic discovery on WWW. Ph.D. Thesis,. Department of CSIE. National Taiwan University.
4 Chang, Y.K., Cirillo, C., & Razon, J. (1971). Evaluation of feedback retrieval using modified freezing, residual collection, and test and control groups. In G. Salton (Ed.), The smart retrieval system--Experiments in automatic document processing (pp. 355-370). Englewood Cliffs, NJ: Prentice Hall Inc.
5 Hsinchun Chen, Machine learning for information retrieval: neural networks, symbolic learning, and genetic algorithms, Journal of the American Society for Information Science, v.46 n.3, p.194-216, April 1995 http://dx.doi.org/10.1002/(SICI)1097-4571(199504)46:3%3C194::AID-ASI4%3E3.0.CO;2-S
6 Hsinchun Chen , Ganesan Shankaranarayanan , Linlin She , Anand Iyer, A machine learning approach to inductive query by examples: an experiment using relevance feedback, ID3, genetic algorithms, and simulated annealing, Journal of the American Society for Information Science, v.49 n.8, p.639-705, June 1998 http://dl.acm.org/citation.cfm?id=279413&CFID=558819604&CFTOKEN=68186175
7 Hsinchun Chen , Yi-Ming Chung , Marshall Ramsey , Christopher C. Yang, A smart itsy bitsy spider for the web, Journal of the American Society for Information Science, v.49 n.7, p.604-618, May 15, 1998 http://dx.doi.org/10.1002/(SICI)1097-4571(1998)49:7%3C604::AID-ASI3%3E3.3.CO;2-X
8 Cordón, O., Moya, F., & Zarco, M.C. (1999). Breve estudio sobre la aplicacióón de los algoritmos genéticos a la recuperación de la información. IV Congreso ISKO (Granada) (pp. 179-186).
9 Cordón, O., Moya, F., & Zarco, M.C. (2000). A GA-P algorithm to automatically formulate extended boolean queries for a fuzzy information retrieval system by means of GA-P techniques. Mathware & Soft Computing, 7(2-3), 309-322.
10 Cordón. O., Moya, F., & Zarco, M.C. (2002). A new evolutionary algorithm combining simulated annealing and genetic programming for relevance feedback in fuzzy information retrieval systems. Soft Computing, 6(5), 308-319.
11 Davis, L. (Ed.). (1991). Handbook of genetic algorithms. New York: Van Nostrand Reinhold.
12 Kenneth Alan De Jong, An analysis of the behavior of a class of genetic adaptive systems., University of Michigan, Ann Arbor, MI, 1975 http://dl.acm.org/citation.cfm?id=907087&CFID=558819604&CFTOKEN=68186175
13 W. B. Frakes, Stemming algorithms, Information retrieval: data structures and algorithms, Prentice-Hall, Inc., Upper Saddle River, NJ, 1992 http://dl.acm.org/citation.cfm?id=129695&CFID=558819604&CFTOKEN=68186175
14 Glover, D.E. (1987). Solving a complex keyboard configuration problem through a generalized adaptive search. In: Davis, L. (Ed.) Genetic algorithms and simulated annealing (pp. 12-31). San Mateo, CA: Morgan Kaufmann.
15 David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, 1989 http://dl.acm.org/citation.cfm?id=534133&CFID=558819604&CFTOKEN=68186175
16 M. Gordon, Probabilistic and genetic algorithms in document retrieval, Communications of the ACM, v.31 n.10, p.1208-1218, Oct. 1988 http://doi.acm.org/10.1145/63039.63044
17 Michael D. Gordon, The necessity for adaptation in modified Boolean document retrieval systems, Information Processing and Management: an International Journal, v.24 n.3, p.339-347, May 1988 http://dx.doi.org/10.1016/0306-4573(88)90100-8
18 Gordon, M.D. (1991). User-based document clustering by redescribing subject descriptors with a genetic algorithms. Journal of the American Society for Information Science, 42(5), 311-322.
19 J Grefenstette, Optimization of control parameters for genetic algorithms, IEEE Transactions on Systems, Man and Cybernetics, v.16 n.1, p.122-128, Jan./Feb. 1986 http://dx.doi.org/10.1109/TSMC.1986.289288
20 Grefenstette, J.J. (1987). Incorporating problem specific knowledge into genetic algorithms. In L. Davis (Ed.), Genetic algorithms and simulated annealing (pp. 42-60). Los Altos: Morgan Kaufmann Publishers.
21 Vicente P. Guerrero , Félix de Moya Anegón, Reduction of the dimension of a document space using the fuzzified output of a Kohonen network, Journal of the American Society for Information Science and Technology, v.52 n.14, p.1234-1241, December 2001 http://dx.doi.org/10.1002/asi.1189
22 Vicente P. Guerrero Bote , Félix de Moya Anegón , Victor Herrero Solana, Document organization using Kohonen's algorithm, Information Processing and Management: an International Journal, v.38 n.1, p.79-89, January 2002 http://dx.doi.org/10.1016/S0306-4573(00)00066-2
23 Donna Harman, Relevance feedback and other query modification techniques, Information retrieval: data structures and algorithms, Prentice-Hall, Inc., Upper Saddle River, NJ, 1992 http://dl.acm.org/citation.cfm?id=129698&CFID=558819604&CFTOKEN=68186175
24 M. R. Hilliard , G. E. Liepins , Mark Palmer , Michael Morrow , Jon Richardson, A classifier based system for discovering scheduling heuristics, Proceedings of the Second International Conference on Genetic Algorithms on Genetic algorithms and their application, p.231-235, October 1987, Cambridge, Massachusetts, USA http://dl.acm.org/citation.cfm?id=42543&CFID=558819604&CFTOKEN=68186175
25 John H. Holland, Adaptation in natural and artificial systems, MIT Press, Cambridge, MA, 1992 http://dl.acm.org/citation.cfm?id=129194&CFID=558819604&CFTOKEN=68186175
26 Jorng-Tzong Horng , Ching-Chang Yeh, Applying genetic algorithms to query optimization in document retrieval, Information Processing and Management: an International Journal, v.36 n.5, p.737-759, Sept. 2000 http://dx.doi.org/10.1016/S0306-4573(00)00008-X
27 Ide, E. (1971). New experiments in relevance feedback. In G. Salton (Ed.), The SMART retrieval system (pp. 337-354). Englewood Cliffs, NJ: Prentice-Hall.
28 Kraft, D.H., Petry, F.E., Buckles, B.P., & Sadasivan, T. (1994). The use of genetic programming to build queries for information retrieval. Proceedings of IEEE Symposium on Evolutionary Computation. Orlando, FL.
29 Kraft, D.H., Petry, F.E., Buckles, B.P., & Sadasivan, T. (1995). Applying genetic algorithms to information retrieval systems via relevance feedback. In P. Bosc & J. Kacprzyk (Eds.), Fuzzy sets and possibility theory in database management systems, studies in fuzziness series (pp. 330-346). Heidelberg, Germany: Physica-Verlag.
30 Kraft, D.H., Petry, F.E., Buckles, B.P., & Sadasivan, T. (1997). Genetic algorithms for query optimization in information retrieval: Relevance feedback. In E. Sanchez, T. Shibata, & L.A. Zadeh (Eds.), Genetic algorithms and fuzzy logic systems. Soft computing perspectives (pp. 155-173). Singapore: World Scientific.
31 Kucera, H., & Francis, N. (1967). Computational analysis of present-day American English. Providence, RD: Brown University Press.
32 K. L. Kwok, Comparing representations in Chinese information retrieval, ACM SIGIR Forum, v.31 n.SI, p.34-41 http://doi.acm.org/10.1145/278459.258531
33 López-Pujalte, C. (2000). Algoritmos genéticos aplicados a la retroalimentación por relevancia. PhD Thesis, Library and Information Science Faculty, University of Granada.
34 Cristina Lopez-Pujalte , Vicente P. Guerrero Bote , Félix de Moya Anegón, A test of genetic algorithms in relevance feedback, Information Processing and Management: an International Journal, v.38 n.6, p.793-805, November 2002 http://dx.doi.org/10.1016/S0306-4573(01)00061-9
35 Martín-Bautista, M.J. (2000). Modelos de computación flexible para la recuperación de información. PhD Thesis, Computer Science Faculty, University of Granada.
36 Maria J. Martin-Bautista , Maria-Amparo Vila , Henrik Legind Larsen, A fuzzy genetic algorithm approach to an adaptive information retrieval agent, Journal of the American Society for Information Science, v.50 n.9, p.760-771, July 1999 http://dx.doi.org/10.1002/(SICI)1097-4571(1999)50:9%3C760::AID-ASI4%3E3.3.CO;2-F
37 Zbigniew Michalewicz, Genetic algorithms + data structures = evolution programs (3rd ed.), Springer-Verlag, London, UK, 1996 http://dl.acm.org/citation.cfm?id=229930&CFID=558819604&CFTOKEN=68186175
38 Terry Noreault , Michael McGill , Matthew B. Koll, A performance evaluation of similarity measures, document term weighting schemes and representations in a Boolean environment, Proceedings of the 3rd annual ACM conference on Research and development in information retrieval, p.57-76, June 23-27, 1980, Cambridge, England http://dl.acm.org/citation.cfm?id=636674&CFID=558819604&CFTOKEN=68186175
39 Porter M.F. (1980). An algorithm for suffix stripping. Program 14(3), 130-137.
40 Vijay Raghavan , Brijesh Agarwal, Optimal determination of user-oriented clusters: an application for the reproductive plan, Proceedings of the Second International Conference on Genetic Algorithms on Genetic algorithms and their application, p.241-246, October 1987, Cambridge, Massachusetts, USA http://dl.acm.org/citation.cfm?id=42545&CFID=558819604&CFTOKEN=68186175
41 Vijay V. Raghavan , Kim Birchard, A clustering strategy based on a formalism of the reproductive process in natural systems, Proceedings of the 2nd annual international ACM SIGIR conference on Information storage and retrieval: information implications into the eighties, p.10-22, September 27-28, 1979, Dallas, Texas http://doi.acm.org/10.1145/511706.511709
42 Robertson, A.M., & Willett, P. (1994). Generation of equifrequent groups of words using a genetic algorithms. Journal of Documentation, 50(3), 213-232.
43 Robertson, A.M., & Willett, P. (1995). Use of genetic algorithms in information retrieval. British Library. Research and Development Department. BLRD Report 6201.
44 Robertson, A.M., & Willett, P. (1996). An upperbound to the performance of ranked-output searching: Optimal weighting of query terms using a genetic algorithm. Journal of Documentation, 52(4), 405-420.45
45 Robertson, G. (1987). Parallel implementation of genetic algorithms in classification systems. In L. Davis (Ed.), Genetic algorithms and simulated annealing (pp. 129-149). San Mateo, CA: Morgan Kaufmann.
46 Salton, G., & Buckley, C. (1990). Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41(4), 288-297.
47 Gerard Salton , Michael J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, Inc., New York, NY, 1986 http://dl.acm.org/citation.cfm?id=576628&CFID=558819604&CFTOKEN=68186175
48 Sanchez, E. (1994). Fuzzy logic and genetic algorithms in information retrieval. Proceedings of the third international conference on fuzzy logic, neural networks and soft computing (pp. 29-35).
49 Sanchez, E., & Pierre, P. (1994). Fuzzy logic and genetic algorithms in information retrieval. Third international conference on fuzzy logic, neural networks and soft computing, Izuaka, Japan (pp. 29-35).
50 Sanchez, E., Miyano, H., & Brachet, J. (1995). Optimization of fuzzy queries with genetic algorithms. Application to a data base of patents in Biomedical Engineering. VI IFSA Congress, San Paulo, Brazil (vol. II, pp. 293-296).
51 Smith, M.P., & Smith, M. (1997). The use of genetic programming to build boolean queries for text retrieval through relevance feedback. Journal of Information Science, 23(6), 423-431.
52 Vrajitoru, D. (1997). Apprentissage en recherche d'informations. Doctoral thesis University of Neuch,tel, Faculty of Science.
53 Dana Vrajitoru, Crossover improvement for the genetic algorithm in information retrieval, Information Processing and Management: an International Journal, v.34 n.4, p.405-415, July 1, 1998 http://dx.doi.org/10.1016/S0306-4573(98)00015-6
54 Yang, J.J., & Korfhage, R.R. (1992). Adaptive information retrieval systems in vector model. Proceedings of the symposium on document analysis and information retrieval (pp. 134-150), Las Vegas, NV.
55 Jing-Jye Yang , Robert Korfhage, Query Optimization in Information Retrieval Using Genetic Algorithms, Proceedings of the 5th International Conference on Genetic Algorithms, p.603-613, June 01, 1993 http://dl.acm.org/citation.cfm?id=657582&CFID=558819604&CFTOKEN=68186175
56 Jing-Jye Yang , Robert R. Korfhage, Query modification using genetic algorithms in vector space models, International Journal of Expert Systems, v.7 n.2, p.165-191, 1994 http://dl.acm.org/citation.cfm?id=185123&CFID=558819604&CFTOKEN=68186175
Links
Full Text