Multi-label Document Classification in Czech
16th International Conference on Text, Speech and Dialogue (TSD 2013) (2013)
This paper deals with multi-label automatic document classification in the context of a real application for the Czech news agency.
The main goal of this work is to compare and evaluate the three most promising multi-label document classification approaches on Czech.
We show that the simple method based on a meta-classifier proposed by Zhu et al. significantly outperforms the other approaches.
The classification error rate is reduced by about 13%.
The Czech document corpus is freely available for research purposes,
which is another contribution of this work.