NLP group

Segment Representations in Named Entity Recognition

Michal Konkol and Miloslav Konopík
Text, Speech, and Dialogue (2015)

Research topics:

Named Entity Recognition

Abstract

In this paper we study the effects of various segment representations in the named entity recognition (NER) task. The segment representation is responsible for mapping multi-word entities into classes used in the chosen machine learning approach. Usually, the choice of a segment representation in a NER system is arbitrary and made without proper tests. Some authors have presented comparisons of segment representations such as BIO, BIEO, and BILOU, but they usually compared only two of them. Our goal is to show that the segment representation problem is more complex and that selecting the best approach is not straightforward. We provide experiments with a wide set of segment representations. All the representations are tested using two popular machine learning algorithms: Conditional Random Fields and Maximum Entropy. Furthermore, the tests are done on four languages, namely English, Spanish, Dutch, and Czech.
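To make the idea of a segment representation concrete, the following minimal sketch encodes the same entity spans under two of the schemes named in the abstract. It is an illustration only, not the authors' implementation: the function name `encode`, the span format, and the example sentence are assumptions, and "BIO" is taken here to mean the variant in which every entity starts with a B tag.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, end index exclusive, entity type).
Span = Tuple[int, int, str]


def encode(n_tokens: int, spans: List[Span], scheme: str = "BIO") -> List[str]:
    """Map entity spans to per-token tags under the given segment representation."""
    tags = ["O"] * n_tokens  # tokens outside any entity
    for start, end, label in spans:
        for i in range(start, end):
            if scheme == "BIO":
                # Assumption: B marks the first token of every entity.
                prefix = "B" if i == start else "I"
            elif scheme == "BILOU":
                if end - start == 1:
                    prefix = "U"  # single-token entity
                elif i == start:
                    prefix = "B"  # first token
                elif i == end - 1:
                    prefix = "L"  # last token
                else:
                    prefix = "I"  # inner token
            else:
                raise ValueError(f"unsupported scheme: {scheme}")
            tags[i] = f"{prefix}-{label}"
    return tags


# Made-up example sentence with one two-token and one single-token entity.
tokens = ["John", "Smith", "visited", "Prague"]
spans = [(0, 2, "PER"), (3, 4, "LOC")]

for scheme in ("BIO", "BILOU"):
    print(scheme, list(zip(tokens, encode(len(tokens), spans, scheme))))
```

The two schemes assign different class sets to identical spans (three tag prefixes versus five), so the classifier has to learn a different label inventory in each case. That difference is the kind of variation the paper sets out to compare.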

Authors

Ing. Miloslav Konopík, Ph.D.

Researcher
