Summarization

Automatic summarization is the process of reducing a set of text documents in order to create a summary that retains the most important points of the original documents.

In our research, we focus on language-independent summarization. Our approach, based on latent semantic analysis (LSA), is state-of-the-art in multilingual summarization, as shown by its excellent results in both multilingual summarization shared tasks organized so far (NIST 2011 and ACL 2013). We are also working on the Czech part of the summarization corpus, an effort organized by the Multiling community to create a multilingual summarization evaluation resource. The topics we currently study are: LSA-based summarization of news, social media, and scientific papers; summarization evaluation in multiple languages; opinion and comparative summarization; using coreference for summarization; and summary generation (sentence compression and paraphrasing).
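To give a rough idea of what LSA-based sentence selection can look like, here is a minimal generic sketch. It is not our system: the function name `lsa_summarize`, its parameters, the regex tokenizer, and the raw-count weighting are all assumptions made for illustration. It builds a term-by-sentence matrix, takes its singular value decomposition, and scores each sentence by the length of its vector in the space of the top `rank` latent topics, weighted by the singular values. Because it relies only on word counts, nothing in it is specific to one language.

```python
import re
import numpy as np


def lsa_summarize(sentences, num_sentences=2, rank=2):
    """Pick the highest-scoring sentences (illustrative LSA sketch).

    Returns (selected sentences in original order, score per sentence).
    """
    # Simplistic tokenization: lowercase runs of word characters (assumption).
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(vocab)}

    # Term-by-sentence matrix A with raw counts (real systems usually
    # apply some local/global term weighting instead).
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            A[index[t], j] += 1.0

    # A = U * diag(s) * Vt; row k of Vt describes latent topic k
    # in terms of the sentences.
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(rank, len(s))

    # Score of sentence j: sqrt(sum over top-k topics of (s_k * Vt[k, j])^2).
    scores = np.sqrt(((s[:k, None] * Vt[:k, :]) ** 2).sum(axis=0))

    # Keep the best sentences, restored to document order.
    top = sorted(np.argsort(-scores)[:num_sentences])
    return [sentences[i] for i in top], scores
```

Calling `lsa_summarize(list_of_sentences, num_sentences=3, rank=2)` returns the three selected sentences together with all scores. A small `rank` keeps only the dominant topics; with `rank` equal to the full number of singular values, each score reduces to the Euclidean norm of the sentence's count vector, so the dimensionality reduction is what makes the method topic-sensitive.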

Publications

Josef Steinberger
Aspects of Multilingual News Summarisation
Alessandro Fiori (ed.): Innovative Document Summarization Techniques: Revolutionizing Knowledge Understanding, Advances in Data Mining and Database Management series (2014)

Josef Steinberger
Multilingual Statistical News Summarization
Multilingual Information Extraction and Summarization (2013)