Enhancing Masked Language Modeling in BERT Models Using Pretrained Static Embeddings


Adam Mištera and Pavel Král
International Conference on Text, Speech, and Dialogue (2025)


Abstract

This paper explores the integration of pretrained static fastText word vectors into a simplified Transformer-based model to improve its efficiency and accuracy. Although these embeddings have been outperformed by large Transformer-based models, they can still contribute useful linguistic information when combined with contextual models, especially in low-resource or computationally constrained environments. We demonstrate this by incorporating static embeddings directly into our own BERT-Tiny-based models prior to pretraining with masked language modeling. We train the models on seven languages covering three distinct language families. The results show that using static fastText embeddings in these models not only improves convergence for all tested languages but also yields significantly higher evaluation accuracy.
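The abstract describes injecting pretrained static fastText vectors into a BERT-Tiny model before MLM pretraining. The Python sketch below shows one plausible way to do this, not the authors' implementation: the tokenizer choice, the vector file name cc.en.300.vec, and the random down-projection from 300 to 128 dimensions are illustrative assumptions, and the paper's actual integration scheme may differ.

# Sketch: initialize a BERT-Tiny embedding matrix from static fastText vectors
# before masked language modeling pretraining. All file names, the tokenizer,
# and the projection step are assumptions made for illustration.
import numpy as np
import torch
from gensim.models import KeyedVectors
from transformers import BertConfig, BertForMaskedLM, BertTokenizerFast

# Assumed tokenizer; the paper trains on seven languages, so the real
# vocabularies would be language-specific.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# BERT-Tiny-style configuration: 2 layers, hidden size 128, 2 attention heads.
config = BertConfig(
    vocab_size=tokenizer.vocab_size,
    hidden_size=128,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=512,
)
model = BertForMaskedLM(config)

# Load pretrained static fastText vectors in word2vec text format
# (file path is an assumption).
ft = KeyedVectors.load_word2vec_format("cc.en.300.vec")

# Project the 300-d static vectors down to the model's hidden size with a
# fixed random linear map; this is one simple option, not the paper's method.
rng = np.random.default_rng(0)
projection = rng.normal(scale=0.02, size=(ft.vector_size, config.hidden_size))

# Overwrite the randomly initialized word embeddings for every vocabulary
# token that has a matching fastText vector.
embedding = model.bert.embeddings.word_embeddings.weight.data
for token, idx in tokenizer.get_vocab().items():
    word = token.lstrip("#")  # drop WordPiece continuation markers
    if word in ft.key_to_index:
        vec = ft[word] @ projection
        embedding[idx] = torch.tensor(vec, dtype=embedding.dtype)

# The model would then be pretrained with the standard MLM objective,
# e.g. using transformers' DataCollatorForLanguageModeling and Trainer.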


BibTeX

@inproceedings{mivstera2025enhancing,
  title={Enhancing Masked Language Modeling in BERT Models Using Pretrained Static Embeddings},
  author={Mi{\v{s}}tera, Adam and Kr{\'a}l, Pavel},
  booktitle={International Conference on Text, Speech, and Dialogue},
  pages={216--227},
  year={2025},
  organization={Springer}
}