LatentVLA: Latent Reasoning Models for Autonomous Driving
https://towardsdatascience.com/latentvla-latent-reasoning-models-for-autonomous-driving/ (towardsdatascience.com)
LatentVLA is an autonomous driving model that performs reasoning in the latent space, contrasting with approaches that rely on natural language. It uses a self-supervised framework to learn discrete 'ego-actions' from unlabeled driving data by separating driver actions from environmental dynamics. A large Vision-Language Model (VLM) is trained to predict these latent action sequences, using a very small action codebook to preserve pre-trained knowledge. To achieve real-time performance, knowledge distillation is employed to transfer the VLM's capabilities to a much smaller decision transformer. While LatentVLA achieves state-of-the-art results on simulation benchmarks, the evaluation also discusses the limitations of open-loop planning for assessing true driving capabilities.
0 points•by chrisf•5 hours ago
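
To make the "discrete ego-actions" and "very small action codebook" in the summary a bit more concrete, here is a minimal sketch of how a continuous latent vector can be snapped to one of a handful of codebook entries, so that a driving sequence becomes a short list of integer tokens a VLM could be trained to predict. This is not LatentVLA's code: the summary does not say how codes are assigned, so nearest-neighbour lookup (the usual vector-quantization step) is assumed, and `CODEBOOK`, `quantize`, and `encode_sequence` are hypothetical names with made-up 2-D values.

```python
import math

# Hypothetical 4-entry codebook. Each entry is a made-up 2-D latent vector
# standing in for one discrete ego-action; real codebooks are learned.
CODEBOOK = [
    (0.0, 0.0),   # index 0
    (1.0, 0.0),   # index 1
    (0.0, 1.0),   # index 2
    (-1.0, 0.0),  # index 3
]


def quantize(latent, codebook=CODEBOOK):
    """Return the index of the codebook entry nearest to `latent` (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: math.dist(latent, codebook[i]))


def encode_sequence(latents, codebook=CODEBOOK):
    """Map a sequence of continuous latents to a sequence of discrete tokens."""
    return [quantize(z, codebook) for z in latents]


# Three made-up continuous latents become three integer tokens.
tokens = encode_sequence([(0.9, 0.1), (0.1, 0.8), (-0.7, 0.2)])
print(tokens)  # [1, 2, 3]
```

Under this assumption, a small codebook means the model only needs a few extra output tokens, which fits the summary's point about preserving the VLM's pre-trained knowledge.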