Robert Cedergren

Founding Head of AI
Bencha International
Room
Time
Theme
Difficulty
Congress Hall
Room H1+H2
Room G3
To be released
13:55
To be released
Efficiency
 
D2

Small Models Are All We Need

As the second-hand fashion industry grows rapidly, companies face increasing challenges in managing large product inventories—from identifying fashion items accurately to setting competitive prices across multiple resale channels. Bencha addresses these challenges with an automated pricing engine, real-time product identification, and a recommerce hub that guides clients on where, how, and at what price to sell their items. Under the hood, our systems process millions of unstructured product documents every day to power these capabilities.

For large-scale agentic and AI workflows, cost and service-level objectives (SLOs) such as latency and throughput quickly become bottlenecks when relying on large general-purpose models or proprietary APIs. At the same time, there is often headroom to improve accuracy and output quality for narrow, well-defined tasks. As model capabilities advance, one powerful pattern is to leverage large models for supervision while training much smaller specialized models that satisfy strict SLOs—with far lower variance, latency, and resource footprint—and still preserve quality or even outperform larger systems in domain-specific scenarios.

This talk focuses on how resource footprint, latency, throughput, and reliability constraints shape architectural and modeling choices in the context of open-weight AI. Through a real-world production case study using Vision Language Models, it details Bencha’s systematic methodology for scoping an MVP and iterating on data curation, fine-tuning, and evaluation strategies so that they reflect production behavior rather than benchmark scores. Attendees will gain practical strategies for extracting maximum value at scale, especially when defaulting to costly proprietary APIs is not an option.

Bio

Robert Cedergren is Founding Head of AI at Bencha, where he architects the company’s AI platform. He is a hands-on ML/AI engineer with a strong foundation in Computer Vision, NLP, and Generative AI. He has built and scaled production machine learning systems at multiple startups, with work ranging from complex training workflows to cloud-native runtimes that deliver both low latency and high throughput. Robert holds an MSc in Machine Learning from KTH and thrives in high-ownership roles shipping production-grade, end-to-end AI systems.
