Neural networks for multi-lingual multi-label document classification

Jiří Martínek and Ladislav Lenc and Pavel Král
27th International Conference on Artificial Neural Networks (ICANN 2018) (2018)

Research topics

Document classification | Neural networks

Abstract

This paper proposes a novel approach to multi-lingual multi-label document classification based on neural networks. We use popular convolutional neural networks for this task in three different configurations. The first uses static word2vec embeddings that are left as is. The second initializes the embeddings with word2vec and fine-tunes them while learning on the available data. The third initializes the embeddings randomly and then optimizes them for the classification task. The proposed method is evaluated on four languages from the Reuters corpus, namely English, German, Spanish and Italian. Experimental results show that the proposed approach is efficient, and the best obtained F-measure reaches 84%.
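The three embedding configurations described in the abstract can be sketched roughly as follows. This is only an illustration in plain Python, not the authors' implementation: the names `EmbeddingConfig` and `build_embedding`, the mode labels, the uniform initialization range, and the handling of words missing from the pretrained vectors are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass
class EmbeddingConfig:
    """Hypothetical container for an embedding layer's initial state."""
    mode: str
    weights: list      # one vector (list of floats) per vocabulary word
    trainable: bool    # whether training is allowed to update the vectors


def build_embedding(mode, vocab, pretrained, dim, seed=0):
    """Build the initial embedding table for one of three configurations.

    mode: "static"     -> pretrained vectors, frozen during training
          "fine-tuned" -> pretrained vectors, updated during training
          "random"     -> random vectors, updated during training
    pretrained: dict mapping word -> vector (e.g. loaded word2vec vectors)
    """
    rng = random.Random(seed)

    def random_vector():
        # Assumed initialization range; the abstract does not specify one.
        return [rng.uniform(-0.25, 0.25) for _ in range(dim)]

    if mode == "random":
        weights = [random_vector() for _ in vocab]
        return EmbeddingConfig(mode, weights, trainable=True)

    if mode in ("static", "fine-tuned"):
        # Assumption: out-of-vocabulary words fall back to a random vector.
        weights = [
            list(pretrained[w]) if w in pretrained else random_vector()
            for w in vocab
        ]
        return EmbeddingConfig(mode, weights, trainable=(mode == "fine-tuned"))

    raise ValueError(f"unknown embedding mode: {mode!r}")


# Toy usage with made-up two-dimensional vectors.
vocab = ["bank", "oil", "zzz"]
pretrained = {"bank": [0.1, 0.2], "oil": [0.3, 0.4]}
for mode in ("static", "fine-tuned", "random"):
    cfg = build_embedding(mode, vocab, pretrained, dim=2)
    print(cfg.mode, "trainable:", cfg.trainable)
```

In a deep learning framework, the `trainable` flag would typically correspond to freezing or unfreezing the embedding layer's weights, while the convolutional layers on top stay the same across all three configurations.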

