NLP group

Multi-label Document Classification in Czech

Michal Hrala and Pavel Král
16th International Conference on Text, Speech and Dialogue (TSD 2013)

Research topics:

Document Classification

Abstract

This paper deals with automatic multi-label document classification in the context of a real application for the Czech news agency. The main goal of this work is to compare and evaluate the three most promising multi-label document classification approaches on the Czech language. We show that the simple method based on a meta-classifier proposed by Zhu et al. significantly outperforms the other approaches. The improvement in classification error rate is about 13%. The Czech document corpus is freely available for research purposes, which is another contribution of this work.

Authors

prof. Ing. Pavel Král, Ph.D.

Team leader
