Effective information retrieval using genetic algorithms based matching functions adaptation


Reference

P. Pathak, M. Gordon, W. Fan: Effective information retrieval using genetic algorithms based matching functions adaptation. In: Proc. 33rd Hawaii International Conference on System Sciences (HICSS), Hawaii, USA, 2000

DOI

http://dx.doi.org/10.1109/HICSS.2000.926653

Abstract

Knowledge-intensive organizations have a vast array of information contained in large document repositories. With the advent of e-commerce and corporate intranets/extranets, these repositories are expected to grow at a fast pace. This explosive growth has led to huge, fragmented, and unstructured document collections. Although it has become easier to collect and store information in document collections, it has become increasingly difficult to retrieve relevant information from them. This paper addresses the issue of improving retrieval performance (in terms of precision and recall) for retrieval from document collections. There are three important paradigms of research in the area of information retrieval (IR): probabilistic IR, knowledge-based IR, and artificial-intelligence-based techniques such as neural networks and symbolic learning. Very few researchers have tried to use evolutionary algorithms such as genetic algorithms (GAs). Previous attempts at using GAs have concentrated on modifying document representations or query representations. This work looks at the possibility of applying GAs to adapt various matching functions. It is hoped that such an adaptation of the matching functions will lead to better retrieval performance than that obtained by using a single matching function. An overall matching function is treated as a weighted combination of the scores produced by individual matching functions. This overall score is used to rank and retrieve documents. The weights associated with the individual functions are searched using genetic algorithms. The idea is tested on a real document collection called the Cranfield collection. The results look very encouraging.
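
The idea in the abstract (an overall score formed as a weighted combination of individual matching functions, with the weights searched by a GA) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the choice of cosine, Dice, and Jaccard as the individual matching functions, average precision as the fitness, the GA settings, and the toy vectors (which are not the Cranfield collection).

```python
import math
import random

# Hypothetical toy term-weight vectors and relevance judgments (not Cranfield data).
DOCS = {
    "d1": [1.0, 0.0, 2.0, 0.0],
    "d2": [0.0, 1.0, 0.0, 1.0],
    "d3": [1.0, 1.0, 1.0, 0.0],
    "d4": [0.0, 0.0, 1.0, 3.0],
}
QUERY = [1.0, 0.0, 1.0, 0.0]
RELEVANT = {"d1", "d3"}


def dot(q, d):
    return sum(a * b for a, b in zip(q, d))


def cosine(q, d):
    norm = math.sqrt(dot(q, q)) * math.sqrt(dot(d, d))
    return dot(q, d) / norm if norm else 0.0


def dice(q, d):
    denom = dot(q, q) + dot(d, d)
    return 2.0 * dot(q, d) / denom if denom else 0.0


def jaccard(q, d):
    denom = dot(q, q) + dot(d, d) - dot(q, d)
    return dot(q, d) / denom if denom else 0.0


# Assumed set of individual matching functions; the abstract does not name them.
MATCHERS = [cosine, dice, jaccard]


def combined_score(weights, q, d):
    """Overall score: weighted sum of the individual matching scores."""
    return sum(w * f(q, d) for w, f in zip(weights, MATCHERS))


def average_precision(weights):
    """Fitness (an assumption): average precision of the induced ranking."""
    ranked = sorted(
        DOCS,
        key=lambda name: combined_score(weights, QUERY, DOCS[name]),
        reverse=True,
    )
    hits, total = 0, 0.0
    for rank, name in enumerate(ranked, start=1):
        if name in RELEVANT:
            hits += 1
            total += hits / rank
    return total / len(RELEVANT)


def evolve_weights(generations=20, pop_size=12, mutation_rate=0.2, seed=0):
    """Simple GA: elitism, one-point crossover, Gaussian mutation clamped to [0, 1]."""
    rng = random.Random(seed)
    n = len(MATCHERS)
    population = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=average_precision, reverse=True)
        next_gen = population[:2]  # keep the two fittest weight vectors unchanged
        while len(next_gen) < pop_size:
            p1, p2 = rng.sample(population[: pop_size // 2], 2)
            cut = rng.randrange(1, n)  # one-point crossover position
            child = p1[:cut] + p2[cut:]
            child = [
                min(1.0, max(0.0, w + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate
                else w
                for w in child
            ]
            next_gen.append(child)
        population = next_gen
    return max(population, key=average_precision)
```

Calling `evolve_weights()` returns one weight per matching function; `combined_score(weights, QUERY, DOCS["d1"])` then gives the overall score used for ranking. On real data the fitness would presumably be averaged over many queries with known relevance judgments rather than computed for a single query as it is here.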

Extended Abstract

BibTeX

@INPROCEEDINGS{926653,
author={P. Pathak and M. Gordon and Weiguo Fan},
booktitle={System Sciences, 2000. Proceedings of the 33rd Annual Hawaii International Conference on},
title={Effective information retrieval using genetic algorithms based matching functions adaptation},
year={2000},
pages={8 pp. vol.1-},
keywords={genetic algorithms;information retrieval;document repositories;genetic algorithms;information retrieval;large document collections;matching functions adaptation;neural networks;retrieval performance;symbolic learning;Adaptive arrays;Artificial intelligence;Artificial neural networks;Costs;Electronic switching systems;Extranets;Genetic algorithms;Information retrieval;Read only memory;Testing},
doi={10.1109/HICSS.2000.926653},
url={http://dx.doi.org/10.1109/HICSS.2000.926653},
month={Jan},
}

Used References

B. T. Bartell, G. W. Cottrell and R. K. Belew, "Optimizing similarity using multi-query relevance feedback", Journal of the American Society for Information Science, vol. 49, no. 8, pp. 742-761, 1998 http://dx.doi.org/10.1002/(SICI)1097-4571(199806)49:8<742::AID-ASI8>3.0.CO;2-H

G. Bordogna and G. Pasi, "A fuzzy linguistic approach generalizing Boolean information retrieval: a model and its evaluation", Journal of the American Society for Information Science, vol. 44, no. 2, pp. 70-82, 1993 http://dx.doi.org/10.1002/(SICI)1097-4571(199303)44:2<70::AID-ASI2>3.0.CO;2-I

R. Belew, "Adaptive information retrieval", Proceedings of the Twelfth Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pp. 11-20, 1989

D. C. Blair and M. E. Maron, "An evaluation of retrieval effectiveness for a full text document-retrieval system", Communications of the ACM, vol. 28, no. 3, pp. 289-299, 1985

A. Bookstein, "Probability and fuzzy-set applications to information retrieval", Annual Review of Information Science and Technology, vol. 20, pp. 117-151, 1985

H. Chen and V. Dhar, "Cognitive process as a basis for intelligent retrieval systems design", Information Processing and Management, vol. 27, pp. 405-432, 1991 http://dx.doi.org/10.1016/0306-4573(91)90060-Y

H. Chen, "Machine Learning for Information Retrieval: Neural Networks, Symbolic Learning, and Genetic Algorithms", Journal of the American Society for Information Science, vol. 46, no. 3, pp. 194-216, 1995 http://dx.doi.org/10.1002/(SICI)1097-4571(199504)46:3<194::AID-ASI4>3.0.CO;2-S

T. Doszkocs, J. Reggia and X. Lin, "Connectionist models and information retrieval", Annual Review of Information Science and Technology, vol. 25, pp. 209-260, 1990

D. Evans, "TREC experiments of the CLARIT project", The First Text Retrieval Conference (TREC1), 1993

N. Fuhr and C. Buckley, "A Probabilistic Learning Approach for Document Indexing", ACM Transactions on Information Systems, vol. 9, pp. 223-248, 1991 http://dx.doi.org/10.1145/125187.125189

D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, 1989, Addison-Wesley

M. D. Gordon, "Probabilistic and genetic algorithms for document retrieval", Communications of the ACM, vol. 31, no. 10, pp. 1208-1218, 1988 http://dx.doi.org/10.1145/63039.63044

M. D. Gordon, "User-based document clustering by redescribing subject descriptions with a genetic algorithm", Journal of the American Society for Information Science, vol. 42, pp. 311-322, 1991 http://dx.doi.org/10.1002/(SICI)1097-4571(199106)42:5<311::AID-ASI1>3.0.CO;2-J

D. Harman, "An experimental study of factors important in document ranking", Proceedings of the ACM SIGIR, pp. 186-193, 1986 http://dx.doi.org/10.1145/253168.253206

J. H. Holland, Adaptation in Natural and Artificial Systems, 1975, The University of Michigan Press

W. P. Jones and G. W. Furnas, "Pictures of relevance: A geometric analysis of similarity measures", Journal of the American Society for Information Science, vol. 38, pp. 420-442, 1987 http://dx.doi.org/10.1002/(SICI)1097-4571(198711)38:6<420::AID-ASI3>3.0.CO;2-S

M. Maron and J. Kuhns, "On relevance, probabilistic indexing and information retrieval", Journal of the ACM, vol. 7, pp. 216-243, 1960 http://dx.doi.org/10.1145/321033.321035

S. H. Myaeng, "Using conceptual graphs for information retrieval: a framework for adequate representation and flexible inferencing", Proceedings of the Symposium on Document Analysis and Information Retrieval, pp. 102-116, 1992

P. Pathak, "A simulation model of document information retrieval system with relevance feedback", Proceedings of the Americas Conference of the Association for Information Systems, pp. 194-196, 1998

P. Pathak, "Relevance Feedback in Information Retrieval Using Genetic Algorithms: A Test on Simulated Documents", Proceedings of the Eighth Annual Workshop on Information Technologies and Systems, WITS'98, pp. 65-74, 1998

J. Quinlan, "Induction of decision trees", Machine Learning, vol. 1, pp. 81-106, 1986 http://dx.doi.org/10.1007/BF00116251

V. Raghavan and B. Agarwal, "Optimal determination of user-oriented clusters: An application for the reproductive plan", Proceedings of the Second International Conference on Genetic Algorithms and their Applications, pp. 241-246, 1987

S. Robertson and K. Sparck Jones, "Relevance weighting of search terms", Journal of the American Society for Information Science, vol. 27, pp. 129-146, 1976 http://dx.doi.org/10.1002/asi.4630270302

S. E. Robertson, "The probabilistic character of relevance", Information Processing & Management, vol. 13, pp. 247-251, 1977 http://dx.doi.org/10.1016/0306-4573(77)90005-X

G. Salton, The SMART Retrieval System: experiments in automatic document processing, 1971, Prentice Hall

G. Salton and M. McGill, Introduction to Modern Information Retrieval, 1983, McGraw-Hill

G. Salton and C. Buckley, "Improving retrieval performance by relevance feedback", Journal of the American Society for Information Science, vol. 41, no. 4, pp. 288-297, 1990 http://dx.doi.org/10.1002/(SICI)1097-4571(199006)41:4<288::AID-ASI8>3.0.CO;2-H

H. Turtle and W. Croft, "Inference networks for document retrieval", Proceedings of the 13th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, pp. 1-24, 1990 http://dx.doi.org/10.1145/96749.98006

C. J. van Rijsbergen, Information Retrieval, 1979, Butterworth

J. Yang and R. R. Korfhage, "Query Optimization in Information Retrieval Using Genetic Algorithms", Proceedings of the Fifth International Conference on Genetic Algorithms, pp. 603-613, 1993

Links

Full Text

https://www.computer.org/csdl/proceedings/hicss/2000/0493/02/04932011.pdf



Other Links