Marko Robnik-Šikonja: Large language models for cross-lingual transfer
Affiliation: University of Ljubljana
Moderator: Slavko Žitnik
Tuesday, 01/08 19:00-20:30, classroom PA
Abstract:
Currently, the most successful approach to natural language processing is based on large pretrained language models using the transformer architecture of neural networks. These are typically pretrained on huge text corpora on the tasks of predicting next tokens or masked tokens. While most existing models are predominantly monolingual, multilingual variants also exist and can help in cross-lingual transfer of knowledge and models. We will present a few types of large language models, focusing on cross-lingual transfer. We will show their strengths and weaknesses in text classification, summarization, and question answering.
Speaker short bio:
Marko Robnik-Šikonja is Professor of Computer Science and Informatics at the University of Ljubljana, Faculty of Computer and Information Science. His research interests span machine learning, data mining, natural language processing, explainable artificial intelligence, and applications of data science techniques. His most notable scientific results concern deep learning, natural language analysis, feature evaluation, ensemble learning, predictive model explanation, information network analysis, and data generation. He is (co)author of more than 200 scientific publications that have been cited more than 7,500 times. He is the author of several data mining software packages and language resources.
Malvina Nissim: Language Technology <preposition> Society
Affiliation: University of Groningen
Moderator: Valerio Basile
Thursday, 03/08 19:00-21:00, classroom PA
Abstract:
The recognition of society’s role in language technology has become essential and cannot be overlooked. Still, plenty of research in Natural Language Processing does not explicitly account for such interplay. This evening lecture will zoom in on precisely this aspect. “Precisely” is an ambitious term, since the very definition of the relationship between language technology and society is subject to multiple interpretations, both in the context of scientific research and in connection with the general public, which is currently very much exposed to, interested in, and involved with language-based artificial intelligence tools. Through recent work I’ve carried out with my group, and through personal reflections, I will unpack this exciting relationship from different angles.
Speaker short bio:
Malvina Nissim holds a Chair in Computational Linguistics and Society at the University of Groningen, The Netherlands. Her research focuses both on language modelling aspects and on the impact that language technology has on society. She's thus also regularly involved in outreach activities, and is a member of the Ethics Committee of the international Association for Computational Linguistics (ACL). She graduated in Linguistics from the University of Pisa, and obtained her PhD in Linguistics from the University of Pavia. Before joining the University of Groningen, she was a tenured researcher at the University of Bologna (2006-2014), and a postdoc at the Institute for Cognitive Science and Technology in Rome (2006) and at the University of Edinburgh (2001-2005). She is the 2016 University of Groningen Lecturer of the Year.
Beniamino Accattoli: The Cost of the Lambda Calculus and the Semantics of Sharing
Affiliation: Inria
Moderator: Valentin Goranko
Tuesday, 08/08 19:00-20:30, classroom PA
Abstract:
The lambda calculus is an expressive mathematical formalism that elegantly captures the core of functional programming languages, while providing at the same time compact representations of intuitionistic logic proofs. The first part of the talk shall survey the recent advances in the study of reasonable cost models for the lambda calculus, that is, of time and space cost measures that are equivalent to those of Turing machines. In particular, it shall overview how understanding the role of sharing in the evaluation process is crucial for both time and space, but for opposite reasons. The second part of the talk shall show that extending the lambda calculus with first-class sharing is not a minor extension, as crucial semantic properties and results break, and new tools and richer theories need to be developed.
Speaker short bio:
Beniamino Accattoli obtained his PhD at 'La Sapienza' University in Rome in 2011. Since 2015, he has been a researcher at Inria & Ecole Polytechnique.
His work is mainly about the lambda calculus, using a combination of tools from functional programming, logic, rewriting theory, and complexity theory. His most relevant contribution, in collaboration with Ugo Dal Lago, is the result that the lambda calculus can be used as a reasonable model for computational complexity.
Darja Fišer: The Good, the Bad and the Ugly of Language Technology Infrastructure
(Dick Oehrle Memorial Lecture)
Affiliation: CLARIN ERIC
Moderator: John McCrae
Thursday, 10/08 19:00-21:00, classroom PA
Abstract:
Advances in digitization and datafication have been transformative for linguistics and other disciplines that work with language materials. This has increased the need for research infrastructures that support the development, documentation, archiving, dissemination, reuse and citation of language resources and tools, which is a prerequisite for verifiable, reproducible and ethical research. Still, the potential of research infrastructures in language technology remains undervalued and underutilised in the real world of language-based research and education. Based on the work conducted within my research group as well as through personal observations, I will address the good, the bad and the ugly aspects of adopting the research infrastructure principles that are built around the Open Science and FAIR data paradigm.
Speaker short bio:
Darja Fišer is Executive Director of CLARIN. She has a background in corpus linguistics and language resource creation. She has been Associate Professor at the Faculty of Arts, University of Ljubljana, since 2019, Senior Research Fellow at the Institute of Contemporary History since 2021, and is leading the new national research programme for Digital Humanities in Slovenia. She is also serving as a member of the Scientific Advisory Board of the Austrian Centre for Digital Humanities at the Austrian Academy of Sciences, the National Interdisciplinary Research E-Infrastructure for Bulgarian Language and Cultural Heritage Resources and Technologies, and the Czech National Corpus research infrastructure of the Institute of the Czech National Corpus at Charles University.