reSee.it Podcast Summary
Dmitri Dolgov describes a twenty-year arc from the early Google self-driving car project to Waymo’s current scale, emphasizing the shift from pure research to global deployment. He explains how the Waymo driver relies on three sensing modalities (cameras, lidars, and radars) that provide 360-degree coverage and feed data into onboard AI that encodes perception and decodes driving actions. The conversation highlights a layered approach: foundation models on the off-board side, specialized off-board teachers such as the Waymo driver, the simulator, and the critic, and downstream distillation into smaller, faster models that operate in the car. The team uses a mix of end-to-end learning and intermediate representations to balance data-driven insight with structured world knowledge like objects, roads, signs, and traffic rules. They stress that purely pixel-to-trajectory systems are inefficient for scaling to the full three-pronged driver–simulator–critic ecosystem, which benefits from augmented representations and safety checks. The discussion also explores the role of the simulator and the critic in training through reinforcement learning with human feedback, enabling a safe closed loop for exploring rare scenarios and refining rewards. Dolgov recounts the generational progression from early deployments in Chandler, Arizona, to broader U.S. expansion and international pilots, noting a strategic decision in Gen 5 to lean heavily on AI as the backbone for a more generalizable driver. He underscores the importance of data collection, specialization, and validation in adapting to different cities and weather, while maintaining a core technology that generalizes well across platforms and sensor stacks.
Beyond the core technical narrative, the interview delves into the practical realities of operating at scale: how existing depots are increasingly automated, what the sixth-generation vehicle redesign brings in terms of space, cost, and passenger experience, and the balancing act between driver-assist systems and full autonomy. Dolgov contemplates urban implications such as parking, traffic efficiency, and city design, alongside questions about market access, density, and the economics of deploying autonomous fleets across diverse locales. He closes with reflections on Google’s culture, patience for long-term AI breakthroughs, and the ongoing, iterative nature of solving a problem that remains technically solvable but economically and operationally complex at global scale.