Représentation de Textes à l’Aide d’Étiquettes Sémantiques dans le Cadre de la Classification Automatique

Camelia Ignat; François Rousselot

“Diacronia” bibliometric database (BDD)

Title:	Représentation de Textes à l’Aide d’Étiquettes Sémantiques dans le Cadre de la Classification Automatique
Authors:	Camelia Ignat, François Rousselot
Publication:	Revue roumaine de linguistique, LI (3-4)
p-ISSN:	0035-3957
Publisher:	Editura Academiei
Place:	București
Year:	2006
Abstract:	This paper describes an algorithm that represents documents in a reduced vector space through feature extraction. The algorithm is evaluated on the supervised classification of news articles. Each document is represented by a profile made up of semantic tags drawn from a machine-readable dictionary. Synonymy is handled by thematic conflation, and polysemy by a statistical word-sense disambiguation method developed for this purpose. We propose four variants of profile generation, depending on whether a recursive system is used and whether a corrective factor for polysemous words is applied. We evaluated 32 variants, defined by the algorithm type and three further parameters: grammatical category selection, a 15% reduction of the profile, and a stop-list of semantic tags. Some parameters (such as profile reduction) have little influence on classifier performance, while others (the corrective factor for ambiguous words, the stop-list) improve it noticeably.
Language:	French

Citations to this publication: 1

Florin Sterian

Bibliografia românească de lingvistică (BRL, 50, 2007). Lucrări de lingvistică apărute în țara noastră în cursul anului 2007

LR, LVII (3), 241-416

2008


References in this publication: 0

The citations/references list is based on indexed publications only, and may therefore be incomplete.
