An upperbound to the performance for ranked-output searching: optimal weighting of query terms using a genetic algorithm

Current version as of 27 June 2016, 13:57


Reference

A. Robertson, P. Willett: An upperbound to the performance for ranked-output searching: optimal weighting of query terms using a genetic algorithm. Journal of Documentation, 52 (4) (1996), pp. 405–420.

DOI

http://dx.doi.org/10.1108/eb026973

Abstract

This paper describes the development of a genetic algorithm (GA) for the assignment of weights to query terms in a ranked-output document retrieval system. The GA involves a fitness function that is based on full relevance information, and the rankings resulting from the use of these weights are compared with the Robertson-Sparck Jones F4 retrospective relevance weight. Extended experiments with seven document test collections show that the GA can often find weights that are slightly superior to those produced by the deterministic weighting scheme. That said, there are many cases where the two approaches give the same results, and a few cases where the F4 weights are superior to the GA weights. Since the GA has been designed to identify weights yielding the best possible level of retrospective performance, these results indicate that the F4 weights provide an excellent and practicable alternative. Evidence is presented to suggest that negative weights may play an important role in retrospective relevance weighting.
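The comparison described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the F4 formula below is the standard Robertson-Sparck Jones form with the 0.5 correction, while the fitness function (average precision), the GA operators (truncation selection, one-point crossover, Gaussian mutation), the parameter values and the toy collection are all assumptions chosen for illustration.

```python
import math
import random

def f4_weight(r, n, R, N):
    """Robertson-Sparck Jones F4 weight with the usual 0.5 correction.
    r: relevant docs containing the term, n: docs containing the term,
    R: relevant docs for the query, N: docs in the collection."""
    return math.log(((r + 0.5) / (R - r + 0.5)) /
                    ((n - r + 0.5) / (N - n - R + r + 0.5)))

def rank(docs, weights):
    """Rank doc ids by the summed weights of the query terms they contain."""
    scores = {d: sum(weights[t] for t in terms) for d, terms in docs.items()}
    return sorted(docs, key=lambda d: (-scores[d], d))

def fitness(docs, relevant, weights):
    """Average precision of the ranking (an assumed fitness function)."""
    hits, total = 0, 0.0
    for i, d in enumerate(rank(docs, weights), start=1):
        if d in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant)

def evolve(docs, relevant, n_terms, generations=50, pop_size=20, seed=0):
    """Toy GA over real-valued weight vectors; negative weights are allowed."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_terms)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(docs, relevant, w), reverse=True)
        parents = pop[:pop_size // 2]          # keep the better half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_terms)    # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_terms)] += rng.gauss(0, 0.3)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(docs, relevant, w))

# Hypothetical toy collection: doc id -> indices of the query terms it contains.
docs = {"d1": [0, 1], "d2": [0], "d3": [1, 2], "d4": [2]}
relevant = {"d1", "d2"}
N, R = len(docs), len(relevant)

f4 = []
for t in range(3):
    n = sum(t in terms for terms in docs.values())
    r = sum(t in docs[d] for d in relevant)
    f4.append(f4_weight(r, n, R, N))

ga = evolve(docs, relevant, n_terms=3)
print("F4 fitness:", fitness(docs, relevant, f4))
print("GA fitness:", fitness(docs, relevant, ga))
```

In this toy collection the term that occurs only in non-relevant documents receives a negative F4 weight, which is the kind of weight the abstract suggests may matter in retrospective weighting.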

Extended Abstract

Bibtex

@article{robertson1996upperbound,
author = {A. Robertson and P. Willett},
title = {An upperbound to the performance for ranked-output searching: optimal weighting of query terms using a genetic algorithm},
journal = {Journal of Documentation},
volume = {52},
number = {4},
pages = {405--420},
year = {1996},
keywords = {},
doi = {10.1108/eb026973},
url = {http://dx.doi.org/10.1108/eb026973},
note = {http://de.evo-art.org/index.php?title=An_upperbound_to_the_performance_for_ranked-output_searching:_optimal_weighting_of_query_terms_using_a_genetic_algorithm},
}

Used References

1. SALTON, G., ed. The SMART retrieval system: experiments in automatic document processing. Englewood Cliffs, NJ: Prentice-Hall, 1971.

2. FRAKES, W.B. and BAEZA-YATES, R. Information retrieval: data structures and algorithms. Englewood Cliffs, NJ: Prentice Hall, 1992.

3. PRITCHARD-SCHOCH, T. Natural language comes of age. Online, 17(3), 1993, 33-43.

4. TENOPIR, C. and CAHN, P. TARGET and Freestyle. DIALOG and Mead join the relevance ranks. Online, 18(3), 1994, 31-47.

5. SPARCK JONES, K. A statistical interpretation of term specificity and its application in information retrieval. Journal of Documentation, 28, 1972, 11-21.

6. SPARCK JONES, K. Index term weighting. Information Storage and Retrieval, 9, 1973, 619-633.

7. CROFT, W.B. and HARPER, D.J. Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35, 1979, 285-295. http://www.sciencedirect.com/science/article/pii/0020027173900430

8. CROFT, W.B. Experiments with representation in a document retrieval system. Information Technology: Research and Development, 2, 1983, 1-21.

9. ROBERTSON, S.E. On relevance weight estimation and query expansion. Journal of Documentation, 42, 1986, 182-188.

10. SALTON, G. and BUCKLEY, C. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24, 1988, 513-523. http://www.sciencedirect.com/science/article/pii/0306457388900210

11. ROBERTSON, S.E. and WALKER, S. Some simple approximations to the 2-Poisson model for probabilistic weighted retrieval. In: CROFT, W.B. and VAN RIJSBERGEN, C.J., eds. SIGIR '94: proceedings of the seventeenth international conference on research and development in information retrieval. London: Springer-Verlag, 1994, 232-241.

12. ROBERTSON, S.E. and SPARCK JONES, K. Relevance weighting of search terms. Journal of the American Society for Information Science, 27, 1976, 129-145. http://onlinelibrary.wiley.com/doi/10.1002/asi.4630270302/abstract

13. PORTER, M. and GALPIN, V. Relevance feedback in a public access catalogue for a research library: Muscat at the Scott Polar Research Institute. Program, 22, 1988, 1-20.

14. SALTON, G. and BUCKLEY, C. Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41, 1990, 288-297. http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1097-4571(199006)41:4%3C288::AID-ASI8%3E3.0.CO;2-H/abstract

15. KEEN, E.M. The use of term position devices in ranked output experiments. Journal of Documentation, 47, 1991, 1-22.

16. WILKINSON, R. Effective retrieval of structured documents. In: CROFT, W.B. and VAN RIJSBERGEN, C.J., eds. SIGIR '94: proceedings of the seventeenth international conference on research and development in information retrieval. London: Springer-Verlag, 1994, 311-317.

17. JACKSON, D.M. Classification, relevance and information retrieval. Advances in Computers, 11, 1971, 60-125. http://www.sciencedirect.com/science/article/pii/S0065245808606300

18. SPARCK JONES, K. A performance yardstick for test collections. Journal of Documentation, 31, 1975, 266-272.

19. STIRLING, K.H. The effect of document ranking on retrieval system performance: a search for an optimal ranking rule. PhD thesis, University of California, 1977.

20. SCHÄUBLE, P. On the compatibility of retrieval functions, preference relations, and document descriptions. Zurich: Eidgenössische Technische Hochschule, Institut für Informationssysteme, 1989.

21. HEINE, M.H. and TAGUE, J.M. An investigation of the optimization of search logic for the Medline database. Journal of the American Society for Information Science, 42, 1991, 267-278. http://onlinelibrary.wiley.com/doi/10.1002/(SICI)1097-4571(199105)42:4%3C267::AID-ASI3%3E3.0.CO;2-Y/abstract

22. KEEN, E.M. Presenting results of experimental retrieval comparisons. Information Processing and Management, 28, 1992, 491-502. http://www.sciencedirect.com/science/article/pii/030645739290006L

23. LOSEE, R.M. Upper bounds for retrieval performance and their use for generating optimal Boolean queries: can it get any better than this? Information Processing and Management, 30, 1994, 193-203. http://www.sciencedirect.com/science/article/pii/0306457394900647

24. SHAW, W.M. Term-relevance computations and perfect retrieval performance. Information Processing and Management, 31, 1995, 491-498. http://www.sciencedirect.com/science/article/pii/0306457395000115

25. GOLDBERG, D.E. Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley, 1989.

26. DAVIS, L., ed. Handbook of genetic algorithms. New York: Van Nostrand Reinhold, 1991.

27. MICHALEWICZ, Z. Genetic algorithms + data structures = evolution programs. Berlin: Springer-Verlag, 1992.

28. MITCHELL, M. An introduction to genetic algorithms. Cambridge, MA: MIT Press, 1996.

29. RAGHAVAN, V.V. and AGARWAL, B. Optimal determination of user-orientated clusters: an application for the reproductive plan. In: GREFENSTETTE, J.J., ed. Genetic algorithms and their applications: proceedings of the second international conference on genetic algorithms and their applications. Hillsdale, NJ: Erlbaum, 1987, 241-246.

30. GORDON, M. Probabilistic and genetic algorithms for document retrieval. Communications of the ACM, 31, 1988, 1208-1218. http://dl.acm.org/citation.cfm?doid=63039.63044

31. FRIEDER, O. and SIEGELMANN, H.T. On the allocation of documents in multiprocessor information retrieval systems. In: BOOKSTEIN, A., CHIARAMELLA, Y., SALTON, G. and RAGHAVAN, V.V., eds. SIGIR '91: proceedings of the fourteenth annual international ACM/SIGIR conference on research and development in information retrieval. New York: ACM Press, 1991, 230-239. http://dx.doi.org/10.1145/122860.122884

32. PETRY, F.E., BUCKLES, B.P., PRABHU, D. and KRAFT, D.H. Fuzzy information retrieval using genetic algorithms and relevance feedback. In: BONZI, S., ed. ASIS '93: proceedings of the 56th ASIS annual meeting. Medford, NJ: American Society for Information Science, 1993, 122-125.

33. YANG, J-J., KORFHAGE, R.R. and RASMUSSEN, E.M. Query improvement in information retrieval using genetic algorithms - a report on the experiments of the TREC project. In: HARMAN, D.K., ed. The first text retrieval conference (TREC-1). Washington: National Institute of Standards and Technology, 1993, 31-58. (NIST Special Publication 500-207)

34. ROBERTSON, A.M. and WILLETT, P. Generation of equifrequent groups of words using a genetic algorithm. Journal of Documentation, 50, 1994, 213-232.

35. SMITH, M., SMITH, M.P. and WADE, S.J. Applying genetic programming to the problem of term weight algorithms. New Review of Document and Text Management, 1, 1995, 101-110.

36. WADE, S.J., SMITH, M. and WOLSTENHOLME, M. Application of a genetic algorithm to the production of text signatures. New Review of Document and Text Management, 1, 1995, 147-166.

37. ROBERTSON, A.M. and WILLETT, P. The use of genetic algorithms in information retrieval. London: British Library Research and Development Department, 1995. (British Library R&D report 6201).

38. SPARCK JONES, K. Search term relevance weighting given little relevance information. Journal of Documentation, 35, 1979, 30-48.

Links

Full Text

internal file


Other links