Evaluating Retrieval-Augmented Generation Systems: Challenges and Practices

Room

Time

Theme

Difficulty

main room

To be released

10:00

To be released

LLM Development

To be released

Retrieval-augmented generation (RAG) systems combine external information sources with large language models (LLMs), promising more factual and context-aware outputs. However, evaluating them remains a significant challenge: it involves assessing not only the quality of the generated text but also the relevance and correctness of the retrieved information. Recent work has introduced new frameworks and metrics, but the field has not converged on widely accepted standards for robust evaluation.

In this talk, we explore current approaches to RAG evaluation and discuss key open questions from ongoing research in this area. Our goal is to encourage more systematic approaches for measuring and comparing RAG performance in a continually advancing field.

Speakers

Yue Liu

Machine Learning Engineer

Modulai

Bio

Yue is a Machine Learning (ML) Engineer at Modulai, an ML consultancy in Sweden, where she has worked on projects in the healthcare, legal, and finance sectors. Before joining Modulai, she completed a PhD at KTH, where her research focused on applying AI models to breast cancer risk assessment and detection in mammograms. Earlier, she pursued her Master’s in Computer Science at KTH, Sweden and TU Delft, the Netherlands.


