Hierarchical Classification of Web Content
Resource type
Authors/contributors
- Dumais, Susan (Author)
- Chen, Hao (Author)
Title
Hierarchical Classification of Web Content
Abstract
This paper explores the use of hierarchical structure for classifying a large, heterogeneous collection of web content. The hierarchical structure is initially used to train different second-level classifiers. In the hierarchical case, a model is learned to distinguish a second-level category from other categories within the same top level. In the flat non-hierarchical case, a model distinguishes a second-level category from all other second-level categories. Scoring rules can further take advantage of the hierarchy by considering only second-level categories that exceed a threshold at the top level. We use support vector machine (SVM) classifiers, which have been shown to be efficient and effective for classification, but not previously explored in the context of hierarchical classification. We found small advantages in accuracy for hierarchical models over flat models. For the hierarchical approach, we found the same accuracy using a sequential Boolean decision rule and a multiplicative decision rule. Since the sequential approach is much more efficient, requiring only 14%-16% of the comparisons used in the other approaches, we find it to be a good choice for classifying text into large hierarchical structures.
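The two decision rules named in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the toy categories, the fixed scores standing in for trained SVM outputs, and the threshold values are all hypothetical. The sketch only shows why the sequential Boolean rule needs fewer classifier evaluations: it skips every second-level model under a top-level category that fails its threshold.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a trained classifier: maps a document to a score in [0, 1].
Scorer = Callable[[str], float]
Result = Tuple[List[Tuple[str, str]], int]


def sequential_boolean(doc: str,
                       top_models: Dict[str, Scorer],
                       second_models: Dict[str, Dict[str, Scorer]],
                       t_top: float = 0.5,
                       t_second: float = 0.5) -> Result:
    """Assign (top, sub) when both scores pass their thresholds.

    Second-level models are evaluated only under top-level categories
    that pass t_top. Returns the assignments and the evaluation count.
    """
    assigned, evaluations = [], 0
    for top, top_model in top_models.items():
        evaluations += 1
        if top_model(doc) < t_top:
            continue  # prune: no second-level model under this branch is run
        for sub, sub_model in second_models[top].items():
            evaluations += 1
            if sub_model(doc) >= t_second:
                assigned.append((top, sub))
    return assigned, evaluations


def multiplicative(doc: str,
                   top_models: Dict[str, Scorer],
                   second_models: Dict[str, Dict[str, Scorer]],
                   t: float = 0.25) -> Result:
    """Assign (top, sub) when the product of the two scores passes t.

    Every second-level model is evaluated, so nothing is pruned.
    """
    assigned, evaluations = [], 0
    for top, top_model in top_models.items():
        evaluations += 1
        p_top = top_model(doc)
        for sub, sub_model in second_models[top].items():
            evaluations += 1
            if p_top * sub_model(doc) >= t:
                assigned.append((top, sub))
    return assigned, evaluations


# Toy two-level hierarchy with constant scores (illustrative values only).
top_models = {"Computers": lambda d: 0.9, "Sports": lambda d: 0.1}
second_models = {
    "Computers": {"Hardware": lambda d: 0.8, "Software": lambda d: 0.2},
    "Sports": {"Soccer": lambda d: 0.3, "Tennis": lambda d: 0.1},
}

seq_result = sequential_boolean("some page text", top_models, second_models)
mul_result = multiplicative("some page text", top_models, second_models)
print(seq_result)
print(mul_result)
```

In this toy case both rules assign the same category, but the sequential rule runs 4 classifiers and the multiplicative rule runs 6. The gap grows with the number of second-level categories per branch; the 14%-16% figure reported in the abstract comes from the authors' data and is not reproduced by this example.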
Date
2000
Proceedings title
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Conference name
23rd annual international ACM SIGIR conference on Research and development in information retrieval
Publisher
ACM
Pages
256-263
Language
English
ISBN
1-58113-226-3
Short title
Hierarchical Classification of Web Content
Archive
ACM Digital Library
Library catalog
Association for Computing Machinery
Citation
Dumais, S. and Chen, H. (2000). Hierarchical Classification of Web Content. 23rd annual international ACM SIGIR conference on Research and development in information retrieval (pp. 256‑263). https://doi.org/10.1145/345508.345593
Literature review