We present Superlim, a multi-task NLP benchmark and analysis platform for evaluating Swedish language models, a counterpart to the English-language (Super)GLUE suite. We describe the dataset, the tasks, and the leaderboard, and report baseline results from a reference implementation. The tested models do not approach ceiling performance on any of the tasks, which suggests that Superlim is genuinely difficult, a desirable quality for a benchmark. We address methodological challenges such as mitigating Anglocentric bias when creating datasets for a less-resourced language, choosing the most appropriate measures, documenting the datasets, and making the leaderboard convenient and transparent. We also highlight other potential uses of the dataset, such as the evaluation of cross-lingual transfer learning.

Superlim: A Swedish Language Understanding Evaluation Benchmark

This article reports on an ongoing project that aims to automate the pseudonymization of learner essays. The process includes three steps: identifying personal information in unstructured text, labeling it with a category, and pseudonymizing it. We experiment with rule-based methods for detecting 15 of the 19 suggested categories (Megyesi et al., 2018), namely those that we deem important and/or feasible with automatic approaches. For the detection and labeling steps, we use resources covering personal names, geographic names, company and university names, and others. For the pseudonymization step, we replace each item with another item of the same type from the above-mentioned resources. The detection and labeling steps are evaluated on a set of manually anonymized essays. The results are promising and show that 89% of the personal information can be successfully identified in learner data, and annotated correctly with an inter-annotator agreement of 86% measured as Fleiss kappa and Krippendorff’s alpha.

Towards Privacy by Design in Learner Corpora Research: A Case of On-the-fly Pseudonymization of Swedish Learner Essays
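
The three steps named in the abstract (detection, labeling, replacement with an item of the same type) can be illustrated with a minimal sketch. This is not the authors' implementation: the gazetteer contents, the category names, and the functions `detect_and_label` and `pseudonymize` are all hypothetical, and real resources would be far larger and would need handling for inflected forms and overlapping matches.

```python
import random
import re

# Hypothetical toy gazetteers; the project's actual resources are not
# listed in the abstract.
GAZETTEERS = {
    "first_name": ["Anna", "Erik", "Maria", "Johan"],
    "city": ["Uppsala", "Lund", "Malmö", "Umeå"],
}


def detect_and_label(text):
    """Steps 1 and 2: find gazetteer items and label each with its category.

    Returns a sorted list of (start, end, surface, category) tuples.
    """
    spans = []
    for category, items in GAZETTEERS.items():
        for item in items:
            pattern = r"\b" + re.escape(item) + r"\b"
            for match in re.finditer(pattern, text):
                spans.append((match.start(), match.end(), item, category))
    return sorted(spans)


def pseudonymize(text, rng):
    """Step 3: replace each detected item with a different item of the same category."""
    pieces = []
    cursor = 0
    for start, end, surface, category in detect_and_label(text):
        candidates = [c for c in GAZETTEERS[category] if c != surface]
        pieces.append(text[cursor:start])   # untouched text before the match
        pieces.append(rng.choice(candidates))
        cursor = end
    pieces.append(text[cursor:])            # untouched tail
    return "".join(pieces)


if __name__ == "__main__":
    essay = "Anna bor i Lund."
    print(detect_and_label(essay))
    print(pseudonymize(essay, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the replacement reproducible for a given seed, which makes it easier to compare the output against manually anonymized text.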

**Are you attending this poster session virtually?** 
In-person printed posters are available for in-person attendees only.
 
**Are you attending this poster session in person?** 
Hybrid posters are displayed in the East Foyer.

PS3 (In-person) Posters, Demo, Industry, Findings

**Welcome to EMNLP 2023!**

On behalf of the EMNLP 2023 Organizing Committee, I extend a warm and heartfelt welcome to all of you to the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP). It is with immense pleasure and excitement that we gather here in Singapore, a vibrant hub of innovation and technological advancement.

The conference program is packed with insightful presentations, thought-provoking workshops, and engaging networking opportunities. In addition to the technical sessions, we have also planned several social events that will provide you with the opportunity to connect with your colleagues.

The conference is held in person at the Resorts World Convention Centre in Singapore, and available on-line with the help of Underline.

For those in Singapore, I hope that you will find time to explore the exciting Sentosa Island, Gardens by the Bay, Singapore Botanic Gardens and the many other attractions unique to Singapore.

Of course, we are grateful to our sponsors and partners for their generous support of EMNLP 2023, whose contributions make it possible for us to host this world-class event.

Yuji Matsumoto (RIKEN AIP) 
EMNLP 2023 General Chair

EMNLP 2023

EMNLP 2023 took place in Singapore from Dec 6th to Dec 10th, 2023.

POSTER2: Applications

COLING, the International Conference on Computational Linguistics, is one of the premier conferences for natural language processing and computational linguistics. The field is often grouped within artificial intelligence but actually pre-dates it, and advances in computational linguistics and natural language processing are now some of the major drivers behind the use of artificial intelligence for commercial and social applications, for example on-line search, machine translation, and voice-assisted conversational devices.

First established in 1965, the biennial COLING conference is held in diverse parts of the globe and attracts participants from both top-ranked research centers and emerging countries. Today, the most important developments in our field are taking place not only in universities and academic research institutes, but also in industrial research departments and in technological startups. COLING conferences provide opportunities for all these communities to showcase their exciting developments.

COLING 2020

Elena Volodina

Presentations

Superlim: A Swedish Language Understanding Evaluation Benchmark

Towards Privacy by Design in Learner Corpora Research: A Case of On-the-fly Pseudonymization of Swedish Learner Essays
