Neural networks for multi-lingual multi-label document classification


Jiří Martínek and Ladislav Lenc and Pavel Král
27th International Conference on Artificial Neural Networks (ICANN 2018) (2018)

Abstract

This paper proposes a novel approach for multi-lingual multi-label document classification based on neural networks. We use popular convolutional neural networks for this task in three different configurations. The first uses static word2vec embeddings that are left as is, while the second initializes the embeddings with word2vec and fine-tunes them while learning on the available data. The third initializes the embeddings randomly and then optimizes them for the classification task. The proposed method is evaluated on four languages from the Reuters corpus, namely English, German, Spanish and Italian. Experimental results show that the proposed approach is efficient, and the best obtained F-measure reaches 84%.
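
The three embedding configurations named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function `build_embeddings`, the mode names, the toy vectors standing in for word2vec, the uniform initialization range, and the random fallback for words without a pretrained vector are all assumptions made for illustration. The sketch only shows how the embedding table is initialized and whether it would be updated during training.

```python
import random

# Hypothetical labels for the three configurations described in the abstract.
STATIC, FINE_TUNED, RANDOM = "static", "fine_tuned", "random"


def build_embeddings(vocab, dim, mode, pretrained=None, seed=0):
    """Return (table, trainable) for one embedding configuration.

    `pretrained` stands in for word2vec vectors: a dict mapping each
    word to a list of `dim` floats (an assumption of this sketch).
    """
    rng = random.Random(seed)

    def random_vector():
        # Small uniform initialization; the range is an assumption.
        return [rng.uniform(-0.25, 0.25) for _ in range(dim)]

    if mode in (STATIC, FINE_TUNED):
        if pretrained is None:
            raise ValueError("pretrained vectors are required for this mode")
        # Words without a pretrained vector fall back to a random one (assumption).
        table = {
            w: list(pretrained[w]) if w in pretrained else random_vector()
            for w in vocab
        }
    elif mode == RANDOM:
        table = {w: random_vector() for w in vocab}
    else:
        raise ValueError(f"unknown mode: {mode}")

    # Only the static configuration keeps its vectors frozen during training.
    trainable = mode != STATIC
    return table, trainable


# Toy data in place of real word2vec vectors.
toy_word2vec = {"bank": [0.1, 0.2, 0.3], "rate": [0.4, 0.5, 0.6]}
vocab = ["bank", "rate", "euro"]

static_table, static_trainable = build_embeddings(vocab, 3, STATIC, toy_word2vec)
tuned_table, tuned_trainable = build_embeddings(vocab, 3, FINE_TUNED, toy_word2vec)
random_table, random_trainable = build_embeddings(vocab, 3, RANDOM)
```

In a real model the `trainable` flag would decide whether the optimizer updates the embedding layer, which is the only difference between the first two configurations.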

BibTeX

@InProceedings{icann2018,
  author    = {Mart{\'i}nek, Ji{\v{r}}{\'i} and Lenc, Ladislav and Kr{\'a}l, Pavel},
  title     = {Neural networks for multi-lingual multi-label document classification},
  booktitle = {27th International Conference on Artificial Neural Networks (ICANN 2018)},
  month     = {October 4-7},
  year      = {2018},
  address   = {Rhodes, Greece},
  volume    = {11139 LNCS},
  pages     = {73-83},
  isbn      = {978-3-030-01417-9},
  doi       = {10.1007/978-3-030-01418-6_8},
  publisher = {Springer International Publishing}
}