Agents are Robots Too
What Self-Driving Taught Me About Building Agents
Jesse Hu draws parallels between self-driving cars and AI agents, introducing "Agentics"—applying robotics principles to agent development. Learn why the model is only 1% of the work, how to design closed-loop systems, and what robotics can teach us about building better agents.
"When you get into real world applications, the model is only doing 1% of the work and 99% of the work goes into other things."
Jesse Hu, Abundant (01:29)
Why Robotics Matters for Agent Developers
Jesse Hu, a lifelong ML engineer who worked at YouTube and Google on the two-tower embedding model, early BERT work, and mixture of experts, now applies his robotics background to building agentic coding models at Abundant. His central thesis: AI agents and robots face the same fundamental challenges. By studying decades of robotics research and self-driving development, agent engineers can accelerate progress and avoid repeating costly mistakes.
"I've been a lifelong ML engineer and I've worked at places like YouTube and Google where I worked on the two tower embedding model as well as some early work on BERT and mixture of experts."
Speaker background: ML engineer at YouTube/Google, now at Abundant building datasets for agentic coding models
Watch (00:17)
"I've given different variants of this talk in person for different events, but this is the first one that I've done for coding agents."
This talk adapts robotics/self-driving lessons specifically for AI agent developers
Watch (00:04)
The Core Insight
The term "Agentics" refers to applying robotics principles, abstractions, and core concepts to agent development. This moves agent development from "something we hack on" to a dedicated scientific practice with established methodologies. Robotics has spent decades solving problems that agent developers are encountering for the first time—the closed-loop control, action space design, simulation, and failure mode analysis that robotics teams have mastered can directly inform agent development.
The 1% vs 99% Problem
In both robotics and agents, the machine learning model is only the tip of the iceberg. The real engineering challenge lies in the 99% of the system that surrounds the model.
Model Work vs Everything Else
1% → 99%
Robotics 99%
Hardware, sensors, actuators, integration, offline stack (simulation, training)
Agents 99%
APIs, MCPs, terminal, browser, VM, persistent file systems, offline stack (data, training)
"The model is only doing 1% of the work and 99% of the work goes into other things."
In both robotics and agents, the ML model is tiny compared to the full system stack
Watch (01:29)
"And I think this is something that is very analogous to self-driving cars. When you get into real world applications, it's like hardware, sensors, actuators, integration, and the offline stack like simulation and training."
Robotics breakdown of the 99%: hardware, sensors, integration, simulation
Watch (01:39)
"For agents, it's like APIs, MCPs, terminal, browser, VM, persistent file systems, all this kind of stuff. And then the offline stack which is like data and model training."
Agent equivalent of the 99%: APIs, tools, interfaces, and the offline stack
Watch (02:05)
Winning Teams Have the Best Offline Stack
In robotics competitions and self-driving development, the teams that win aren't necessarily those with the best models—they're the teams with the best offline stacks: simulation environments, data pipelines, evaluation frameworks, and training infrastructure. The same applies to agents. Invest in your simulation, evaluation, and data infrastructure—it's where real competitive advantage lies.
Embodiment: Robots vs Agents
Both robots and agents have a "body" that interacts with the world. Understanding these parallels helps transfer insights from robotics to agent development.
Hardware & Sensors
🤖 Robotics: Cameras, LiDAR, radar, GPS - perception systems
🤖 Agents: APIs, web scrapers, database queries - information gathering
💡 Both need rich perception of the environment
Actuators & Tools
🤖 Robotics: Motors, servos, steering - physical action
🤖 Agents: MCPs, terminal commands, function calls - digital action
💡 Action primitives define what the system can do
Fleet Management
🤖 Robotics: Coordinating multiple vehicles
🤖 Agents: Multi-agent orchestration
💡 Systems-level coordination across many autonomous units
Offline Stack
🤖 Robotics: Simulation environments for testing
🤖 Agents: Evaluation frameworks, sandboxed environments
💡 Testing without real-world consequences
Closed-Loop vs Open-Loop Systems
Robotics relies on closed-loop feedback: act, measure, recalibrate. Most agents today use open-loop, turn-based interaction. This implicit design decision has significant implications for agent reliability.
🔄 Robotics: Closed-Loop
- Turn wheel → measure actual turn → recalibrate
- Continuous sampling from environment
- Real-time feedback on every action
- Can immediately respond to changes
💬 Agents: Turn-Based (Open-Loop)
- Execute tool → wait for response
- Discrete turns in conversation
- No real-time feedback during execution
- Can't respond to pop-ups or long-running processes immediately
"In robotics, you turn the wheel, you actually measure, you actually look at like, did the wheel actually turn five degrees? And then you recalibrate if it's off."
Robotics relies on real-time feedback loops between action and measurement
Watch (03:30)
"We don't do this thing that's natural robotics where we keep sampling from the world and we keep interacting in real time."
Agents typically use turn-based interaction instead of continuous sampling
Watch (05:13)
"We've kind of done this implicitly. So in agents we often have a conversation. So we wait to take our turn."
Turn-based conversation is an implicit design choice with implications
Watch (04:22)
The Time Discretization Trade-off
Agent developers have implicitly chosen turn-based interaction (wait for our turn). This is easier to reason about but limits real-time responsiveness. Robotics keeps sampling from the world continuously. For agents, consider whether your use case requires continuous interaction—browser agents dealing with pop-ups, terminal agents watching process output, or agents monitoring real-time data streams may need closed-loop designs.
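To make the trade-off concrete, here is a minimal, self-contained Python sketch (the `NoisyWheel` toy and all function names are illustrative, not from the talk): the open-loop version issues one command and trusts it, while the closed-loop version acts, measures the actual result, and recalibrates until the error is within tolerance.

```python
import random

# Toy actuator that responds imperfectly to commands, standing in for any
# action whose real effect can drift from the intended effect.
class NoisyWheel:
    def __init__(self):
        self.angle = 0.0

    def turn(self, delta):
        # Actuation is imperfect: the wheel moves ~80-120% of the request.
        self.angle += delta * random.uniform(0.8, 1.2)

    def measure(self):
        return self.angle


def open_loop_turn(wheel, target):
    # Open-loop / turn-based: issue one command and assume it worked.
    wheel.turn(target - wheel.angle)


def closed_loop_turn(wheel, target, tolerance=0.1, max_iters=20):
    # Closed-loop: act, measure the actual result, recalibrate, repeat.
    for _ in range(max_iters):
        error = target - wheel.measure()
        if abs(error) <= tolerance:
            break
        wheel.turn(error)


if __name__ == "__main__":
    open_wheel, closed_wheel = NoisyWheel(), NoisyWheel()
    open_loop_turn(open_wheel, target=5.0)      # "turn the wheel five degrees"
    closed_loop_turn(closed_wheel, target=5.0)  # ...and check it actually did
    print(f"open-loop angle:   {open_wheel.measure():.2f}")
    print(f"closed-loop angle: {closed_wheel.measure():.2f}")  # within 0.1 of 5.0
```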
Action Spaces: Beyond Tool Calls
How your agent interacts with the world is a design choice with trade-offs. Tool calls and MCPs are just the beginning; a sketch contrasting discrete tool calls with character-level I/O follows the options below.
Discrete Tools
MCPs, function calls
Easy to reason about, limited flexibility
Character-Level I/O
Terminus agent, TX streams
Finer control, more complex state
Continuous
Velocities, acceleration (Dreamer)
Robotics-style, 20 FPS interaction
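As a rough illustration of the first two options, here is a hedged Python sketch; the function names are hypothetical and the real Terminus implementation differs, but it shows how a discrete tool call and a character-level byte stream expose very different interfaces to the same shell.

```python
import json
import subprocess

# Discrete action space: the agent emits a named tool call with arguments,
# and the harness executes it as one atomic step.
def run_tool_call(call_json: str) -> str:
    call = json.loads(call_json)
    if call["tool"] == "run_command":
        result = subprocess.run(call["args"]["command"], shell=True,
                                capture_output=True, text=True)
        return result.stdout
    raise ValueError(f"unknown tool {call['tool']!r}")

# Character-level action space: the agent writes raw bytes to a shell and
# reads raw bytes back. (A real closed-loop variant would stream output
# incrementally instead of waiting for the shell to exit.)
def run_character_stream(keystrokes: bytes) -> bytes:
    proc = subprocess.Popen(["/bin/sh"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    out, _ = proc.communicate(keystrokes, timeout=10)
    return out

if __name__ == "__main__":
    print(run_tool_call('{"tool": "run_command", "args": {"command": "echo hi"}}'))
    print(run_character_stream(b"echo hi\nexit\n").decode())
```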
"The question is what trade-offs are we making and what implicit or explicit design decisions have we made."
Every action space design involves trade-offs
Watch (05:49)
"There's an agent called Terminus which is from TerminalBench. And instead of using tools, they're doing like TX streams."
Character-level I/O as an alternative to discrete tool calls
Watch (06:41)
"So they're basically interacting at the character level rather than at the tool level."
Character-level interaction enables finer-grained control
Watch (06:51)
Stateful vs Stateless Agents
Agents are evolving from stateless "spawn from nothing" systems to stateful systems with persistent memory and environment context.
"Similarly, we're going from these stateless agents to more stateful agents."
Agents are evolving from stateless to stateful systems
Watch (07:31)
"Before, you kind of spawn from nothing like in a video game. Now you have VMs with persistent file stores."
VMs with persistent storage give agents 'memory' and 'state'
Watch (07:41)
"You have to think about what's the entire space? What Slack messages are active? What's the state of the world?"
Stateful agents require reasoning about the entire environment state
Watch (08:03)
The State Explosion Challenge
Stateful agents must reason about the entire environment state: active Slack messages, file system contents, running processes, browser tabs, and more. This state explosion makes planning and decision-making exponentially harder. Robotics faces the same challenge—the "state" of the real world includes every object, vehicle, pedestrian, and environmental condition. Successful systems use hierarchical abstraction and attention mechanisms to focus on relevant state.
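A minimal sketch of what an explicit world-state snapshot might look like, assuming illustrative field names (none of this is an API from the talk); the `relevant_view` method hints at the hierarchical-abstraction idea of surfacing only task-relevant state.

```python
from dataclasses import dataclass, field

@dataclass
class EnvironmentState:
    # Illustrative slices of the "state of the world" a stateful agent must
    # track; real systems add browser tabs, VM snapshots, running jobs, etc.
    file_tree: dict[str, int] = field(default_factory=dict)        # path -> size
    active_slack_threads: list[str] = field(default_factory=list)
    running_processes: list[str] = field(default_factory=list)

    def relevant_view(self, task_keywords: set[str]) -> "EnvironmentState":
        # Hierarchical abstraction: expose only state that mentions the task,
        # so planning is not overwhelmed by the full state explosion.
        return EnvironmentState(
            file_tree={p: s for p, s in self.file_tree.items()
                       if any(k in p for k in task_keywords)},
            active_slack_threads=[t for t in self.active_slack_threads
                                  if any(k in t for k in task_keywords)],
            running_processes=list(self.running_processes),
        )
```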
Distribution Shift & Cascading Consequences
Unlike pure prediction or classification, actions have consequences. When agents act, they change the environment, creating new states that may be outside their training distribution.
⚠️ Distribution Shift
Browser agents get confused by pop-ups never seen in training
Imitation learning fails at the edges of the training distribution (see the DAgger-style sketch after this list)
🔗 Cascading Issues
Great plans fail when implemented - execution gap
Real-world is messy: actions have unintended consequences
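One robotics answer to this failure mode is DAgger (listed under Recommended Reading below). The following is a minimal, hypothetical skeleton: `train_policy`, `rollout_states`, and `expert_action` are placeholder stubs, but the loop shows the core idea of retraining on the states the agent itself reaches.

```python
def dagger(initial_demos, n_rounds, train_policy, rollout_states, expert_action):
    # DAgger-style data aggregation: counter distribution shift by labeling
    # the states the *agent* visits (pop-ups and all) with expert actions,
    # then retraining on the aggregated dataset.
    dataset = list(initial_demos)             # (state, expert_action) pairs
    policy = train_policy(dataset)            # behavior cloning on expert demos
    for _ in range(n_rounds):
        visited = rollout_states(policy)      # states reached by running the agent
        dataset += [(s, expert_action(s)) for s in visited]
        policy = train_policy(dataset)        # edges of the distribution shrink
    return policy
```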
"And you can start to see this in agents such as browser agents. When you see a pop-up that never happened in training because humans actually interact with pop-ups quite naturally, it gets confused and it gets really confused."
Distribution shift: agents fail on scenarios not seen in training
Watch (08:45)
"Actions have consequences in a very messy real world."
Unlike pure prediction, actions change the environment in unpredictable ways
Watch (10:01)
"We're dealing with a whole new paradigm in which you predict, you act, and then you deal with the consequences of that action and then re-evaluate everything you've done before."
The action loop: predict → act → handle consequences → re-evaluate
Watch (09:36)
"In self-driving from 2017 to 2020, we thought just make boxes and drive around them. That assumption wasn't true."
Self-driving focused on perception, but action models were equally important
Watch (11:25)
"Same thing with agents. You can have a great plan, but when you actually implement it, you realize there's all these cascading issues."
Great plans fail when implemented - the execution gap
Watch (11:56)
Simulation & Counterfactuals
Robotics learned that simulation is essential. It enables exploring multiple possible paths ("counterfactuals") without real-world consequences. The same applies to agents.
🌐 Simulation Benefits
- Explore all possible paths, not just one
- Test failure modes safely
- Represent real-world complexity in starting state
- Play out counterfactuals: "what if" scenarios
🎯 MDP Framework
- State: Environment representation
- Reward: Objective function
- Action Primitives: What can the agent do?
- Useful communication primitives between teams (a minimal interface sketch follows below)
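A minimal sketch of how that shared vocabulary might be written down as an interface; the names (`AgentEnvironment`, `reset`, `step`) are assumptions in the spirit of common RL APIs, not something prescribed in the talk.

```python
from typing import Any, Protocol

class AgentEnvironment(Protocol):
    # State, reward, and action primitives as a typed contract that agent,
    # simulator, and evaluation teams can all point at.
    def reset(self) -> Any:
        """Return the initial state."""

    def actions(self, state: Any) -> list[Any]:
        """Return the action primitives available in this state."""

    def step(self, action: Any) -> tuple[Any, float, bool]:
        """Apply an action; return (next_state, reward, done)."""
```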
"You can play through the real world not just in a single path but all the paths that you could possibly take as your agent changes."
Simulation enables exploring counterfactuals - all possible paths
Watch (10:03)
"You need to be able to represent the complexities and messiness of the real world in your starting state."
Simulation must capture real-world complexity, not simplified scenarios
Watch (10:13)
"State, reward, action primitives. These are very useful primitives for communication between people."
Markov Decision Process (MDP) framework provides common language
Watch (11:04)
"Moving from chat models to agent models."
Agents require different thinking than chat models
Watch (11:16)
Development Process: Hill Climbing & Logs
Agent development follows a hill climbing process: make changes, test against nebulous metrics, hope you improve. Unlike traditional software where features ship and work reliably, ML systems require iteration without guaranteed forward progress.
📜 Traditional Software
Feature → Production
Guaranteed forward progress
🎲 ML / Agents
Nebulous metric → Guess and check → Hope ↑
Iterative hill climbing (sketched below)
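A minimal sketch of that guess-and-check loop, assuming a hypothetical `evaluate` function that runs the agent over a benchmark and returns a score; configurations are plain dicts here purely for illustration.

```python
def hill_climb(base_config, candidate_changes, evaluate):
    # Guess-and-check hill climbing: keep a change only if the (nebulous)
    # metric goes up. Many candidates will not improve anything.
    best_config, best_score = base_config, evaluate(base_config)
    for change in candidate_changes:
        candidate = {**best_config, **change}   # tweak prompt, tools, model, ...
        score = evaluate(candidate)
        if score > best_score:
            best_config, best_score = candidate, score
    return best_config, best_score
```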
"In both cases in self-driving when it comes to robotics and in code when it comes to digital agents we're actually very lucky in both."
Both domains benefit from predefined human-machine interfaces
Watch (12:22)
"When you're exploring new domains, you should ask like, is there a predefined human interface?"
Domains with existing interfaces are easier for agent deployment
Watch (13:07)
"It's basically this iterative process of building or iterating on a complex system such as an LM or an agent when you don't always make forward progress."
Hill climbing: making progress on complex systems without guaranteed forward movement
Watch (13:44)
"The old way is like you ship a feature and you know it's going to work and you can move forward."
Traditional software: feature → production (guaranteed progress)
Watch (14:13)
"The new way is you pick a nebulous metric and you guess and check and you hope you go up."
ML/agents: nebulous metric → guess and check → hope for improvement
Watch (14:21)
Logs Become Critical
In agent development, detailed logs are more important than benchmark scores. A 70% benchmark score tells you very little. But breaking down failures by category, environment, failure mode, and then triaging individual failures gives you actionable insights on how to improve. Build logging infrastructure that captures full execution traces—not just metrics.
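As a rough sketch of that breakdown, assuming each execution trace carries hypothetical `passed`, `category`, and `failure_mode` fields:

```python
from collections import Counter

def failure_breakdown(traces):
    # Turn one headline number ("70% pass rate") into per-category and
    # per-failure-mode counts so individual failures can be triaged.
    failures = [t for t in traces if not t["passed"]]
    pass_rate = 1 - len(failures) / len(traces)
    by_category = Counter(t["category"] for t in failures)
    by_mode = Counter(t["failure_mode"] for t in failures)
    return pass_rate, by_category, by_mode

if __name__ == "__main__":
    traces = [
        {"passed": True,  "category": "browser",  "failure_mode": None},
        {"passed": False, "category": "browser",  "failure_mode": "popup"},
        {"passed": False, "category": "terminal", "failure_mode": "timeout"},
    ]
    rate, cats, modes = failure_breakdown(traces)
    print(f"pass rate {rate:.0%}", dict(cats), dict(modes))
```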
"The logs actually become a much more important part of the process than they are today. You can get a lot more insights than just your numbers."
Detailed logs enable deeper insights than simple benchmark scores
Watch (14:45)
"70% on a benchmark tells you very little. But if you break it down by categories, by cities, by different failure modes, and then you go triage individual failures, you get a lot more insights on how to improve."
Break down failures systematically to understand improvement paths
Watch (15:02)
The Current State of Agentics
Where are we today? Great demos and predictive models, but end-to-end work completion remains elusive, for the same reasons robotics struggled: actions have consequences, and the real world is complex.
Great Demos
Impressive one-off demonstrations
Great Models
Strong predictive capabilities
Not There Yet
End-to-end work completion
"Great demos, great predictive models, but not nearly there on end-to-end work completion."
Current state of agents: impressive demos, limited real-world completion
Watch (15:27)
"Actions have consequences, real world complexity. These are things that we learned in robotics and self-driving."
Robotics learned these lessons the hard way - agents can benefit from that experience
Watch (15:53)
What is Agentics?
"Agentics: Applying robotics principles, abstractions, and core concepts to agent development to move from 'something we hack on' to 'dedicated real science and really becomes a practice.'"
Agentics as a discipline: systematic approach to agent development
Watch (16:20)
Key Takeaways for Agent Engineers
1. The Model is Only 1%
Infrastructure
- 99% of the work is APIs, MCPs, tools, interfaces, and the offline stack
- Winning teams have the best simulation and evaluation infrastructure
- Invest in your offline stack—data, training, and simulation environments
2. Design Closed-Loop Systems
Feedback
- Robotics uses continuous sampling and real-time feedback
- Agents typically use turn-based interaction—an implicit design choice
- Consider whether your use case requires closed-loop, real-time interaction
3. Action Space Design Matters
Primitives
- Tool calls are just one option—consider character-level I/O or continuous actions
- Every action space design involves trade-offs
- Choose action primitives that match your domain
4. Prepare for Distribution Shift
Robustness
- Agents fail on scenarios outside their training distribution
- Actions have consequences in a messy real world
- Browser agents get confused by unseen pop-ups and edge cases
5. Invest in Simulation
Counterfactuals
- Simulation enables exploring all possible paths, not just one
- Play out counterfactuals—"what if" scenarios
- Represent real-world complexity in your starting state
6. Make Logs Central
Observability
- Detailed logs matter more than benchmark scores
- Break down failures by category and triage individual cases
- Build logging infrastructure that captures full execution traces
7. Use the MDP Framework
Communication
- State, reward, action primitives are useful communication tools
- Moving from chat models to agent models requires new thinking
- MDP abstractions help teams reason about agent behavior
8. Embrace Hill Climbing
Process
- Agent development is iterative—nebulous metrics, guess and check, hope for improvement
- Unlike traditional software, there's no guaranteed forward progress
- Learning → Simulation → Deploy with confidence → Real-world logs feed back
Recommended Reading
Jesse recommends studying these robotics and reinforcement learning concepts to deepen your understanding of agent development:
Control Theory
Open-loop and closed-loop control systems
MDPs
Markov Decision Processes and planning
Observability
Fully vs partially observable environments
Distribution Shift
DAgger (Dataset Aggregation) and imitation learning
Offline RL
Reinforcement learning from offline datasets
Robotics Literature
Recent papers on manipulation and navigation
Source Video
Agents are Robots Too: What Self-Driving Taught Me About Building Agents
Jesse Hu • Abundant • AI Engineer Summit
Research Note: All quotes in this report are timestamped and link to exact moments in the video for validation. This analysis covers Jesse Hu's insights on applying robotics principles ("Agentics") to AI agent development, including the 1% vs 99% problem, closed-loop systems, action space design, distribution shift, simulation, and hill climbing development processes.
Key Concepts: Agentics, embodiment, closed-loop control, action spaces, stateful agents, distribution shift, cascading consequences, simulation, counterfactuals, MDP framework, hill climbing, offline stack, self-driving parallels