Dialogue Act Recognition Using Visual Information
Lecture Notes in Computer Science (2021)
Automatic dialogue management, including dialogue act (DA) recognition, usually focuses on dialogues in the audio signal. However, some dialogues are also available in written form, and their automatic analysis is equally important. The main goal of this paper is therefore DA recognition from printed documents. For visual DA recognition, we propose a novel deep model that combines two recurrent neural networks. The approach is evaluated on a newly created dataset containing printed dialogues from the English VERBMOBIL corpus. We show that visual information has no positive impact on DA recognition for good-quality images where the OCR result is excellent. We also demonstrate that visual information can significantly improve the DA recognition score on low-quality images with erroneous OCR. To the best of our knowledge, this is the first attempt focused on DA recognition from visual data.