The paper presents an overview of the fourth edition of the Shared Task on Multilingual Coreference Resolution, organized as part of the CODI-CRAC 2025 workshop. As in previous editions, participants were challenged to develop systems that identify mentions and cluster them according to identity coreference. A key innovation of this year’s task was the introduction of a dedicated Large Language Model (LLM) track, featuring a simplified plaintext format designed to be more suitable for LLMs than the original CoNLL-U representation. The task also expanded its coverage with three new datasets in two additional languages, using version 1.3 of CorefUD, a harmonized multilingual collection of 22 datasets in 17 languages. In total, nine systems participated, including four LLM-based approaches (two fine-tuned and two using few-shot adaptation). While traditional systems retained the lead, LLMs showed clear potential, suggesting they may challenge established approaches in future editions.
@inproceedings{novak-etal-2025-findings,
    title = "Findings of the Fourth Shared Task on Multilingual Coreference Resolution: Can {LLM}s Dethrone Traditional Approaches?",
    author = "Nov{\'a}k, Michal and Konopik, Miloslav and Nedoluzhko, Anna and Popel, Martin and Prazak, Ondrej and Sido, Jakub and Straka, Milan and {\v{Z}}abokrtsk{\'y}, Zden{\v{e}}k and Zeman, Daniel",
    editor = "Ogrodniczuk, Maciej and Novak, Michal and Poesio, Massimo and Pradhan, Sameer and Ng, Vincent",
    booktitle = "Proceedings of the Eighth Workshop on Computational Models of Reference, Anaphora and Coreference",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.crac-1.9/",
    pages = "95--118"
}