Selected Publications.

Please see my Google Scholar for an up-to-date list of all publications.

SHADES: Towards a Multilingual Assessment of Stereotypes in Large Language Models

January 2025

Margaret Mitchell et al.

While research has attempted to identify and mitigate biases in large language models (LLMs), most efforts have concentrated on English, lagging behind the rapid advancement of LLMs in multilingual settings. In this paper, we introduce a new multilingual dataset, SHADES, to help address this issue, designed for examining culturally-specific stereotypes that may be learned by LLMs. The dataset includes stereotypes from 20 geopolitical regions and languages, spanning multiple identity categories subject to discrimination worldwide.

Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model

January 2025

Gregor Geigle, Florian Schneider, Carolin Holtermann, Chris Biemann, Radu Timofte, Anne Lauscher and Goran Glavaš

We present a comprehensive investigation into the training strategies for massively multilingual LVLMs. First, we conduct a series of multi-stage experiments spanning 13 downstream vision-language tasks and 43 languages, systematically examining: (1) the number of training languages that can be included without degrading English performance and (2) optimal language distributions of pre-training as well as (3) instruction-tuning data. Further, we (4) investigate how to improve multilingual text-in-image understanding, and introduce a new benchmark for the task.

Why do LLaVA Vision-Language Models Reply to Images in English?

July 2024

Musashi Hinck, Carolin Holtermann, Matthew Lyle Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal

We uncover a surprising multilingual bias occurring in a popular class of multimodal vision-language models (VLMs). Including an image in the query to a LLaVA-style VLM significantly increases the likelihood of the model returning an English response, regardless of the language of the query.

Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ

March 2024

Carolin Holtermann, Paul Röttger, Timm Dill and Anne Lauscher

We investigate the basic multilingual capabilities of state-of-the-art open LLMs beyond their intended use. Specifically, we introduce a new silver standard benchmark which we use to assess the models' multilingual language fidelity and question answering accuracy.

What the Weight?! A Unified Framework for Zero-Shot Knowledge Composition

January 2024

Carolin Holtermann, Markus Frohmann, Navid Rekabsaz and Anne Lauscher

We propose a novel framework for zero-shot module composition, which encompasses existing and some novel variations for selecting, weighting, and combining parameter modules under a single unified notion. Focusing on the scenario of domain knowledge and adapter layers, our framework provides a systematic unification of concepts, allowing us to conduct the first comprehensive benchmarking study of various zero-shot knowledge composition strategies.

ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

October 2023

Markus Frohmann, Carolin Holtermann, Shahed Masoudian, Anne Lauscher and Navid Rekabsaz

Multi-task learning (MTL) has shown considerable practical benefits, particularly when using pre-trained language models (PLMs). On the flip side, current two-stage MTL methods come with the cost of introducing a substantial number of additional parameters. In this work, we address this issue by leveraging the usefulness of linearly scaling the output representations of source adapters for transfer learning.

Fair and Argumentative Language Modeling for Computational Argumentation

May 2022

Carolin Holtermann, Anne Lauscher and Simone Ponzetto

Although much work in NLP has focused on measuring and mitigating stereotypical bias in semantic spaces, research addressing bias in computational argumentation is still in its infancy. In this paper, we address this research gap and conduct a thorough investigation of bias in argumentative language models.
