"Diacronia" Database (BDD)
Title:

Représentation de Textes à l’Aide d’Étiquettes Sémantiques dans le Cadre de la Classification Automatique

Authors:
Publication: Revue roumaine de linguistique, LI (3-4)
p-ISSN: 0035-3957
Publisher: Editura Academiei
Place: Bucharest
Year:
Abstract: This paper describes an algorithm that represents documents in a reduced vector space through feature extraction. The algorithm is evaluated on the supervised classification of news articles. Each document is represented by a profile made up of semantic tags drawn from a machine-readable dictionary. Synonymy is handled by thematic conflation, and polysemy by a statistical word-sense disambiguation method that we developed. We propose four variants of profile generation, depending on whether a recursive system is used and whether a corrective factor for polysemous words is applied. We evaluated 32 variants, defined by the algorithm type and three other parameters: grammatical category selection, a 15% reduction of the profile, and a stop-list of semantic tags. Some parameters, such as profile reduction, have little influence on classifier performance, while others (the corrective factor for ambiguous words and the stop-list) improve it noticeably.
Language: French

Citations of this publication: 0

References in this publication: 0

The list of citations/references includes only texts present in the database and is therefore not exhaustive.
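
The profile generation summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy dictionary `TAGS`, the function `build_profile`, and all parameter names are hypothetical. It assumes that the corrective factor splits a polysemous word's weight evenly across its tags, and that profile reduction drops the lowest-weighted share of tags. The recursive variant and the statistical word-sense disambiguation step are not covered.

```python
from collections import Counter

# Hypothetical toy dictionary: word -> semantic tags.
# A word with several tags stands in for a polysemous word.
TAGS = {
    "bank": ["finance", "geography"],
    "loan": ["finance"],
    "money": ["finance"],
    "river": ["geography"],
    "thing": ["general"],
}


def build_profile(words, tags=TAGS, stop_tags=frozenset(),
                  corrective=True, reduction=0.0):
    """Return a {tag: weight} profile for a list of words.

    Assumptions (not taken from the paper):
    - corrective=True gives each tag of a word a weight of 1/(number of tags);
    - reduction=r drops the int(r * profile size) lowest-weighted tags.
    """
    profile = Counter()
    for word in words:
        senses = tags.get(word, [])  # words missing from the dictionary are skipped
        for tag in senses:
            if tag in stop_tags:  # stop-list of semantic tags
                continue
            profile[tag] += (1.0 / len(senses)) if corrective else 1.0

    # Rank by descending weight (ties broken alphabetically), then trim the tail.
    ranked = sorted(profile.items(), key=lambda kv: (-kv[1], kv[0]))
    n_drop = int(len(ranked) * reduction)
    return dict(ranked[: len(ranked) - n_drop])


doc = ["bank", "loan", "money", "river", "thing"]
print(build_profile(doc, stop_tags={"general"}))
print(build_profile(doc, stop_tags={"general"}, corrective=False))
```

With the corrective factor enabled, the ambiguous word "bank" contributes half a count to each of its two tags instead of a full count to both. The abstract's 15% reduction would correspond to `reduction=0.15` under the assumption stated above.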