
This talk presents Zenseact’s transition from DL1 to DL2, two fundamentally different paradigms for perception and planning in automated vehicles. DL1 consists of task-specific CNNs operating on single images from individual cameras. In contrast, DL2 is a unified, end-to-end, multi-task Transformer architecture that performs spatial and temporal sensor fusion within a single model. Its multi-task design enables joint reasoning across functions such as road estimation, object detection, and trajectory planning.
The move from DL1 to DL2 has reshaped our entire development lifecycle, from data collection, curation, and annotation through large-scale training and deployment. It has required the creation of a new toolchain and driven a significant organizational transformation toward tightly integrated ML/SW engineering at scale. This presentation outlines the key steps in this journey and showcases representative examples of the advanced algorithms emerging from this unified, end-to-end approach.
Erik is currently Senior Director of AI and Perception at Zenseact. Prior to this role, he held several key technical leadership positions at Zenseact (and formerly Zenuity), including Chief AI Officer, Chief Architect, Product Area Owner for Computer Vision, and Technical Expert in Deep Learning.
From 2014 to 2017, he served as Director of Automated Driving and Preventive Safety at Autoliv’s global research division. Erik holds a PhD in superstring theory from Chalmers University of Technology (2006) and has eight years of experience in statistical accident research at Autoliv.