Upcoming presentations
February 15, 2023
March 15, 2023
March 29, 2023
April 26, 2023
May 10, 2023
June 7, 2023
Previous presentations
Find recordings of all previous presentations here.
Wednesday September 14, 2022, 16.00-17.30:
Title: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?
Speaker: Prof. Dr. Emily M. Bender, University of Washington
Abstract: In this paper, Bender and her co-authors take stock of the recent trend towards ever larger language models (especially for English), which the field of natural language processing has been using to extend the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks. The authors take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks?
Wednesday September 28, 2022, 15.00-16.30:
Title: "Natural" Natural Language Processing
Speaker: Slav Petrov, Distinguished Scientist / Senior Research Director, Google
Abstract: Natural language processing has often relied on intermediate (linguistically motivated) representations. Advances in representation learning are making it increasingly possible to train end-to-end models, even for complex tasks and with little supervision. We are also increasingly seeing multi-task models that perform well on a variety of tasks. In this talk, I will review work in this area carried out at Google Research over the last couple of years and draw attention to some exciting results demonstrating that models can benefit from task descriptions and side information expressed in natural language, opening up opportunities for more "natural" interactions with natural language processing systems.
Wednesday October 12, 2022, 15.00-16.30:
Title: Reading between the lines: a better benchmark for text understanding
Speaker: Prof. Dr. Yoav Goldberg, Bar Ilan University and Research Director at AI2 Israel
Abstract: What does it mean to "understand" a text? Many recent NLP works take a question-answering (QA) perspective to this question, under which understanding a text means the ability to accurately answer questions about it. While this approach is useful, it can also be misleading: some questions are easier than others, some questions can be answered without access to the text at all, and, furthermore, questions often leak information about the answer. Thus, determining understanding based on question-answering ability alone is not sufficient. Indeed, many QA datasets are now considered "solved" by deep learning systems, while it should be clear that these systems do not really understand the text. In this talk I highlight deficiencies of QA for measuring text understanding, and also introduce a new NLP task we devised---together with a large corresponding dataset---which I argue serves as a much stronger indicator of text understanding than QA is. Beyond being a benchmark for text understanding, the task is also linguistically grounded, and is a very useful building block for downstream text processing applications.
Wednesday October 26, 2022, 15.00-16.30:
Title: Detect - Verify - Communicate: Combating Misinformation with More Realistic NLP
Speaker: Prof. Dr. Iryna Gurevych, Technical University Darmstadt, Germany
Abstract: Dealing with misinformation is a grand challenge of the information society, aimed at equipping computer users with effective tools for identifying and debunking misinformation. Current Natural Language Processing (NLP), including its fact-checking research, fails to meet the expectations of real-life scenarios. In this talk, we show why past work on fact-checking has not yet led to truly useful tools for managing misinformation, and discuss our ongoing work on more realistic solutions. NLP systems are expensive in terms of the financial cost, computation, and manpower needed to create data for the learning process. With that in mind, we are pursuing research on the detection of emerging misinformation topics, to focus human attention on the most harmful, novel examples. Automatic methods for claim verification rely on large, high-quality datasets. To this end, we have constructed two corpora for fact checking, considering larger evidence documents and pushing the state of the art closer to the reality of combating misinformation. We further compare the capabilities of automatic, NLP-based approaches to what human fact checkers actually do, uncovering critical research directions for the future. To counter false beliefs, we are collaborating with cognitive scientists and psychologists to automatically detect and respond to attitudes of vaccine hesitancy, encouraging anti-vaxxers to change their minds with effective communication strategies.
Wednesday November 9, 2022, 15.00-16.30:
Title: Revisiting Neural Scaling Laws in Language and Vision
Speaker: Ibrahim Alabdulmohsin, Research Scientist at Google Brain, and Xiaohua Zhai, Staff Researcher at Google Brain
Abstract: The remarkable progress in deep learning in recent years is largely driven by improvements in scale, where bigger models are trained on larger datasets for longer schedules. To predict the benefit of scale empirically, we argue for a more rigorous methodology based on extrapolation loss, instead of reporting the best-fitting (interpolating) parameters. We then present a recipe for estimating scaling law parameters reliably from learning curves. We demonstrate that it extrapolates more accurately than previous methods across a wide range of architecture families and several domains, including image classification, neural machine translation (NMT) and language modeling, in addition to tasks from the BIG-Bench evaluation benchmark. To accelerate research in this domain, we also release a benchmark dataset comprising 92 evaluation tasks. Finally, we take an in-depth look at our efforts to study the scaling properties of Vision Transformers by characterizing the relationships between their error rate, data, and compute, which has resulted in state-of-the-art top-1 accuracy on ImageNet.
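To make the distinction between interpolation and extrapolation concrete, here is a minimal sketch, not the speakers' actual estimator, of the underlying idea: fit a saturating power law to the small-scale points of a learning curve and judge the fit by its prediction error on held-out larger scales. The functional form, the synthetic data, and the train/held-out split are illustrative assumptions.

```python
# A minimal sketch of evaluating a scaling-law fit by extrapolation
# rather than interpolation. The power-law form L(n) = a*n**(-b) + c
# and the synthetic learning curve below are assumptions for illustration.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    # Saturating power law: loss decays toward an irreducible floor c.
    return a * n ** (-b) + c

# Synthetic "learning curve": loss measured at increasing dataset sizes.
sizes = np.array([1e3, 3e3, 1e4, 3e4, 1e5, 3e5, 1e6])
losses = 5.0 * sizes ** (-0.35) + 0.8
losses += np.random.default_rng(0).normal(0.0, 0.005, losses.shape)

# Fit on the small scales only; hold out the largest scales.
train, held_out = slice(0, 4), slice(4, None)
params, _ = curve_fit(power_law, sizes[train], losses[train],
                      p0=(1.0, 0.5, 0.0), maxfev=10_000)

# Extrapolation loss: prediction error on the held-out large scales.
pred = power_law(sizes[held_out], *params)
extrapolation_rmse = np.sqrt(np.mean((pred - losses[held_out]) ** 2))
print(f"fitted (a, b, c) = {params}")
print(f"extrapolation RMSE on held-out scales = {extrapolation_rmse:.4f}")
```

A fit judged only on the training points (interpolation) can look excellent while predicting the benefit of further scale poorly; holding out the largest scales is what exposes that gap.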
Wednesday December 7, 2022, 15.00-16.30:
Title: Text Summarization and Evaluation in the Era of GPT-3
Speaker: Tanya Goyal, PhD student at the University of Texas at Austin
Abstract: The recent success of zero- and few-shot prompting with models like GPT-3 has led to a paradigm shift in NLP research. We study its impact on text summarization, focusing on the classic benchmark domain of news summarization. First, we investigate how zero-shot GPT-3 compares against fine-tuned models trained on large summarization datasets. We show that not only do humans overwhelmingly prefer GPT-3 summaries, but these also do not suffer from common dataset-specific issues such as poor factuality. Next, we study what this means for evaluation, particularly the role of gold standard test sets. Our experiments show that both reference-based and reference-free automatic metrics, e.g. recently proposed QA- or entailment-based factuality approaches, cannot reliably evaluate zero-shot summaries. Finally, we discuss future research challenges beyond generic summarization, specifically, keyword- and aspect-based summarization, showing how dominant fine-tuning approaches compare to zero-shot prompting.
Wednesday February 15, 2023, 15.00-16.30:
Title: NLP and language models in the legal domain in Sweden
Abstract: In this seminar, we take a closer look at the use of NLP and language models in the legal domain in Sweden. Our three speakers represent three very different actors and applications, ranging from the Swedish National Courts Administration (Domstolsverket) and a traditional legal publishing house (Norstedts Juridik) to a legal tech startup (Maigon).
Speakers: John Lagström (Domstolsverket), Selcuk Ünlü (Norstedts Juridik) and Magnus Sundqvist (Maigon)
Wednesday March 15, 2023, 15.00-16.30:
Title: GPT-SW3 (AI Sweden)
Speakers: Amaru Cuba Gyllensten, PhD, Senior Research Scientist (AI Sweden), and Ariel Ekgren, Research Scientist (AI Sweden)
Abstract: AI Sweden, together with RISE and WASP WARA Media & Language, is developing large-scale generative language models for the Nordic languages, primarily Swedish: GPT-SW3. The vision is to provide GPT-SW3 as a foundational resource for Swedish (and Nordic) NLP that is useful across various domains and use cases, ranging from academic research to public and private sector applications. Since January 2023, the first models (up to 40B parameters) have been available for testing and validation in a controlled pre-release. In this seminar, AI Sweden will present GPT-SW3 and ongoing work around the models. As usual, there will be plenty of time for dialogue and questions.
Wednesday March 29, 2023 (online and on site in both Stockholm and Gothenburg):
Title: SuperLim: an evaluation framework for Swedish Language Models
Speakers: Språkbanken Text, KBLab and AI Sweden
Abstract: In this seminar, Språkbanken Text, KBLab and AI Sweden present SuperLim, the national evaluation framework for Swedish language models. SuperLim consists of a standardized collection of benchmark tests for Swedish language models, together with training data, baseline results, and a leaderboard. The SuperLim project has been funded by Vinnova (2020-2023) and supports the national effort to facilitate the development of trustworthy and robust NLP applications.