An Evolutionary Approach to Automatic Web Page Categorization and Updating



Reference

V. Loia, P. Luongo: An Evolutionary Approach to Automatic Web Page Categorization and Updating. In: Web Intelligence: Research and Development, First Asia-Pacific Conference, WI 2001, Maebashi City, Japan, 2001, pp. 292–302

DOI

http://dx.doi.org/10.1007/3-540-45490-X_35

Abstract

Catalogues play an important role in most of the current Web search engines. The catalogues, which organize documents into hierarchical collections, are maintained manually with increasing difficulty and costs due to the incessant growth of the WWW. This problem has stimulated many researchers to work on automatic categorization of Web documents. In reality, most of these approaches work well either on special types of documents or on restricted sets of documents. This paper presents an evolutionary approach useful to construct automatically the catalogue as well as to perform the classification of a Web document. This functionality relies on a genetic-based fuzzy clustering methodology that applies the clustering on the context of the document, as opposed to content-based clustering that works on the complete document information.


Extended Abstract

Bibtex

@Inbook{Loia2001,
author="Loia, Vincenzo and Luongo, Paolo",
editor="Zhong, Ning and Yao, Yiyu and Liu, Jiming and Ohsuga, Setsuo",
title="An Evolutionary Approach to Automatic Web Page Categorization and Updating",
bookTitle="Web Intelligence: Research and Development: First Asia-Pacific Conference, WI 2001 Maebashi City, Japan, October 23--26, 2001 Proceedings",
year="2001",
publisher="Springer Berlin Heidelberg",
address="Berlin, Heidelberg",
pages="292--302",
isbn="978-3-540-45490-8",
doi="10.1007/3-540-45490-X_35",
url="http://dx.doi.org/10.1007/3-540-45490-X_35 http://de.evo-art.org/index.php?title=An_Evolutionary_Approach_to_Automatic_Web_Page_Categorization_and_Updating"
}

Used References

1. Attardi, G., Di Marco, S., and Salvi, D. (1998). Categorisation by Context. Journal of Universal Computer Science, 4:719–736.

2. Boley, D., Gini, M., Gross, R., Han, E-H., Hastings, K., Karypis, G., Kumar, V., Mobasher, B., and Moore, J. (1999). Partitioning-based clustering for Web document categorization. Decision Support Systems, 27:329–341. http://dx.doi.org/10.1016/S0167-9236(99)00055-X

3. Chakrabarti, S., Dom, B., Gibson, D., Kleinberg, J., Raghavan, P., and Rajagopalan, S. (1998). Automatic resource list compilation by analyzing hyperlink structure and associated text. Seventh International World Wide Web Conference, 1998.

4. Chang, C-H., and Hsu, C-C. (1997). Customizable Multi-Engine Search tool with Clustering. Sixth International World Wide Web Conference, April 7-11, 1997 Santa Clara, California, USA.

5. Cohen, W. (1998). A web-based information system that reasons with structured collections of text. Agents’98, 1998.

6. Craven, M., DiPasquo, D., Freitag, D., McCallum, A., Mitchell, T., Nigam, K., and Slattery, S. (1998). Learning to extract symbolic knowledge from the World Wide Web. AAAI-98, 1998.

7. Hayes, J., and Weinstein, S. P. (1990). CONSTRUE-TIS: A system for content-based indexing of a database of news stories. Second Annual Conference on Innovative Applications of Artificial Intelligence, 1–5.

8. Iwayama, M. (1995). Cluster-based text categorization: a comparison of category search strategies. SIGIR-95, pp. 273–280.

9. Open Directory Project. URL: http://www.dmoz.org/about.html

10. Lawrence, S. and Giles, C. L. (1999). Accessibility of information on the web. Nature, 400:107–109.

11. Mase, H., Tsuji, H., Kinukawa, H., Hosoya, Y., Koutani, K., and Kiyota, K. (1996). Experimental simulation for automatic patent categorization. Advances in Production Management Systems, 377–382.

12. McCallum, A., Nigam, K., Rennie, J., and Seymore, K. (1999). A Machine Learning Approach to Building Domain-Specific Search Engines. Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99).

13. Sahami, M., Yusufali, S., and Baldonado, M. Q. W. (1998). SONIA: A service for organizing networked information autonomously. Third ACM Conference on Digital Libraries.

14. Selberg, E. (1999). Towards Comprehensive Web Search. PhD thesis, University of Washington.

15. Selberg, E., and Etzioni, O. (2000). On the Instability of Web Search Engines. RIAO 2000.

16. JDK Java 2 Sun. http://www.java.sun.com

17. Zamir, O., and Etzioni, O. (1998). Web Document Clustering: A Feasibility Demonstration. SIGIR’98, Melbourne, Australia, ACM Press.

18. A Lexical Database for English. URL: http://www.cogsci.princeton.edu/wn/

Links

Full Text

internal file


Other Links

https://www.researchgate.net/publication/221158394_An_Evolutionary_Approach_to_Automatic_Web_Page_Categorization_and_Updating