Evaluating Retrieval-Augmented Generation Systems: Challenges and Practices

Room: main room
Time: 10:00
Theme: LLM Development
Difficulty: D2

Retrieval-augmented generation (RAG) systems combine external information sources with large language models (LLMs), promising more factual and context-aware outputs. However, evaluating their performance remains a significant challenge: it requires assessing not only the quality of the generated text but also the relevance and correctness of the retrieved information. Recent work has introduced new frameworks and metrics, but the field has not yet converged on widely accepted standards for robust evaluation.

In this talk, we explore current approaches to RAG evaluation and discuss key open questions from ongoing research in this area. Our goal is to encourage more systematic approaches for measuring and comparing RAG performance in a continually advancing field.

Speakers

Yue Liu

Machine Learning Engineer
Modulai

Bio

Yue is a Machine Learning (ML) Engineer at Modulai, an ML consultancy in Sweden, where she has worked on projects in the healthcare, legal, and finance sectors. Before joining Modulai, Yue completed a PhD at KTH, where her research focused on applying AI models to breast cancer risk assessment and detection in mammograms. Earlier, she pursued her Master’s in Computer Science at KTH, Sweden, and TU Delft, the Netherlands.

Recording