Ship Agents that Ship: Building Production AI Agents with Guardrails
Kyle Penfound and Jeremy Adams from Dagger demonstrate building production-ready AI agents live—no slides, just real code, failures, debugging, and eventual success.
"The key insight is that LLMs are great at choosing from a menu of options, not great at free-form coding. Give them well-defined tools."
— Kyle Penfound, Dagger Ecosystem Team • 28:15
Workshop Format
Hands-on live coding
Container-Native
Isolated execution
Production-Ready
Real-world patterns
Why This Workshop Matters
The Problem Agents Face
Most AI agent demos look impressive but break in production. Hallucinations, infinite loops, unclear objectives, and lack of guardrails make them unreliable for real-world use. Teams struggle to move from prototype to production.
Common Agent Failures
- Agents that generate syntactically invalid code
- Infinite loops or decision paralysis
- No sandboxing—agents can affect production systems
- Poor error handling and recovery
- Missing context window optimization
The Workshop Approach
This isn't about agent architecture theory. It's about sitting down and building agents that actually work—showing you the failures, the debugging process, and the patterns that emerge from real-world iteration.
"Guardrails aren'''t about limiting the agent, they'''re about giving the agent a safe playground to operate in."
— Kyle Penfound, Ecosystem Team at Dagger
35:42About Dagger
Container-Native CI/CD for AI Agents
Dagger is a programmable CI/CD platform that combines the power of containers with the consistency of programming. Founded by Solomon Hykes (creator of Docker), Dagger treats LLMs as first-class components—perfect for building AI agents that run in isolated, reproducible environments.
Kyle Penfound
Ecosystem Team at Dagger. Background in DevOps and platform engineering. Focuses on making complex infrastructure accessible to developers.
@kpenfound
Jeremy Adams
Ecosystem Team at Dagger. "I've been at Dagger for a few years." Expertise in infrastructure and container orchestration. Brings architecture-focused perspective to agent development.
@jeremyadamsding
Core Insights from the Workshop
"Containers are the perfect sandbox for AI agents because they'''re isolated, reproducible, and ephemeral."
— Jeremy Adams, Ecosystem Team at Dagger
42:18LLMs Excel at Tool Selection, Not Free-Form Coding
The fundamental insight behind effective agent design: structure enables capability. Instead of letting LLMs generate arbitrary code, give them a well-defined menu of tools to choose from. This is why OpenAI's function calling works so well.
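To make that concrete, here is a minimal sketch of a tool menu in the OpenAI function-calling format. The tool names follow the menu mentioned later in the workshop (read_file, write_file, test); the parameter schemas are illustrative assumptions, not the workshop's own definitions.

```python
# Sketch of a tool menu in the OpenAI function-calling format.
# Tool names follow the workshop's menu (read_file, write_file, test);
# the parameter schemas are illustrative assumptions.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "read_file",
            "description": "Read a file from the repository and return its contents.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string", "description": "Repo-relative file path"},
                },
                "required": ["path"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "write_file",
            "description": "Create or overwrite a file in the repository.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string"},
                    "content": {"type": "string"},
                },
                "required": ["path", "content"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "test",
            "description": "Run the project's test suite and return the results.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
]
```

Because the model can only choose from this menu, every action it takes maps to an operation you have already reviewed and documented.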
Guardrails Enable Freedom
Counterintuitively, constraining agents with well-defined tools makes them more capable and reliable. Guardrails aren't limitations—they're safety boundaries that enable confident operation in production environments.
Containers Are Perfect for Agents
Containers provide three critical properties for agent sandboxes: isolation (agent actions can't affect host), reproducibility (same environment every time), and ephemerality (clean slate for each execution).
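A minimal sketch of such a sandbox, assuming the Dagger Python SDK's connection-based client (the SDK has evolved across releases, so treat the exact calls as one possible shape rather than the workshop's code):

```python
# Minimal sketch: run one agent-generated command in an isolated, ephemeral
# container via the Dagger Python SDK. Nothing here touches the host, the
# pinned base image keeps the environment reproducible, and the container
# is discarded after the call.
import sys

import anyio
import dagger


async def run_sandboxed(command: list[str]) -> str:
    async with dagger.Connection(dagger.Config(log_output=sys.stderr)) as client:
        return await (
            client.container()
            .from_("python:3.12-slim")   # reproducible: same base image every run
            .with_exec(command)          # isolated: runs only inside the container
            .stdout()                    # ephemeral: the container is gone afterwards
        )


if __name__ == "__main__":
    print(anyio.run(run_sandboxed, ["python", "-c", "print('hello from the sandbox')"]))
```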
Decomposition is Critical
Break down large tasks into smaller, tool-callable operations. This enables handling tasks of varying complexity and makes debugging easier when something goes wrong.
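One way to picture this, sketched below with hypothetical handler bodies: every tool on the menu maps to one small function, and a dispatcher routes the model's tool call to it, so each step can be logged, tested, and debugged on its own.

```python
# Sketch of decomposition: one small handler per tool, plus a dispatcher.
# Handler names mirror the workshop's tool menu; the bodies are illustrative.
import subprocess
from pathlib import Path


def read_file(path: str) -> str:
    return Path(path).read_text()


def write_file(path: str, content: str) -> str:
    Path(path).write_text(content)
    return f"wrote {len(content)} characters to {path}"


def run_tests() -> str:
    # In production this step would run inside the container sandbox, not on the host.
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.stdout + result.stderr


HANDLERS = {"read_file": read_file, "write_file": write_file, "test": run_tests}


def dispatch(tool_name: str, arguments: dict) -> str:
    # Unknown tools fail loudly instead of letting the agent improvise.
    if tool_name not in HANDLERS:
        raise ValueError(f"Unknown tool: {tool_name}")
    return HANDLERS[tool_name](**arguments)
```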
"One of the most important things is making sure your agent can fail gracefully. When it doesn'''t know what to do, it should ask for help."
— Jeremy Adams, Ecosystem Team at Dagger
68:15Workshop Demonstrations
The workshop featured four live demonstrations showing the complete journey from setup to a production-ready agent.
GitHub Integration Setup
Setting up authentication, configuring the GitHub repository, and creating test issues for the agent to process.
Key Steps:
- Created GitHub personal access token
- Configured Dagger environment variables
- Set up test repository with issues
- Verified GitHub API connectivity (see the sketch after this list)
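The connectivity check referenced above might look like this sketch, which uses the GitHub REST API via the requests library; the owner and repository names are placeholders.

```python
# Sketch of the connectivity check: read the token from the environment and
# list open issues on the test repository. OWNER and REPO are placeholders.
import os

import requests

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]  # the personal access token created above
OWNER, REPO = "your-org", "your-test-repo"

resp = requests.get(
    f"https://api.github.com/repos/{OWNER}/{REPO}/issues",
    headers={
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github+json",
    },
    params={"state": "open"},
    timeout=10,
)
resp.raise_for_status()
for issue in resp.json():
    print(issue["number"], issue["title"])
```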
Agent Reading and Understanding Issues
The agent connects to the GitHub API, reads issue content, parses requirements, and selects appropriate tools.
Agent Decision Process:
- Read issue title and description via GitHub API
- Parse requirements and identify task type
- Select from available tools (read_file, write_file, test)
- Plan multi-step execution strategy (a sketch of the decision step follows this list)
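The decision step can be sketched roughly as follows, assuming the OpenAI Python SDK and a function-calling tool menu like the earlier TOOLS list; the model name, prompts, and ask_human fallback are illustrative rather than the workshop's exact code.

```python
# Sketch of the decision step: give the model the issue text plus the tool
# menu and read back which tool it wants to call. The model and prompts are
# illustrative; `tools` is a function-calling menu like the earlier TOOLS list.
import json

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def plan_next_step(issue_title: str, issue_body: str, tools: list[dict]) -> tuple[str, dict]:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "You resolve GitHub issues using only the provided tools.",
            },
            {"role": "user", "content": f"Issue: {issue_title}\n\n{issue_body}"},
        ],
        tools=tools,
    )
    message = response.choices[0].message
    if not message.tool_calls:
        # The model answered in prose instead of choosing a tool: escalate.
        return "ask_human", {"question": message.content}
    call = message.tool_calls[0]
    return call.function.name, json.loads(call.function.arguments)
```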
Code Generation and File Creation
The agent generates Python code, writes files to the repository, and creates commits in real time, with live debugging along the way.
Live Demo Moments:
- 00:55:00: Live debugging when the agent couldn't access GitHub
- 01:02:30: Real-time code modification during the demo
- 01:06:45: Handling authentication errors on the fly
Pull Request Creation and Automation
Creating pull requests with descriptions, handling merge conflicts, and running automated tests.
Automated Workflow:
- Create feature branch from main
- Commit generated code changes
- Open pull request with description (see the sketch after this list)
- Run tests and validate changes
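The pull-request step of that workflow might look like this sketch against the GitHub REST API; the repository, branch names, and PR text are placeholders.

```python
# Sketch of opening a pull request for the agent's branch via the GitHub
# REST API. OWNER, REPO, branch names, and PR text are placeholders.
import os

import requests

GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
OWNER, REPO = "your-org", "your-test-repo"

resp = requests.post(
    f"https://api.github.com/repos/{OWNER}/{REPO}/pulls",
    headers={
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "title": "Agent: resolve issue #1",
        "head": "agent/issue-1",  # feature branch the agent pushed
        "base": "main",
        "body": "Automated change generated for issue #1. Tests run in CI.",
    },
    timeout=10,
)
resp.raise_for_status()
print("Opened PR:", resp.json()["html_url"])
```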
Production-Ready Patterns
Design Tool Menus, Not Freedom
Don't give agents unrestricted code execution. Provide a curated menu of tools they can choose from. Build function calling schemas with discrete, well-documented operations.
Containerize Everything
Every agent action should run in a container. This provides safety, reproducibility, and clean state management. Use Docker containers as the execution environment for all agent operations.
Implement Explicit Fallbacks
Design your agent to recognize when it's stuck and trigger a human-in-the-loop workflow. Add confidence thresholds and escalation paths to your agent logic.
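A minimal sketch of such a fallback is below; the AgentStep shape and the 0.7 threshold are assumptions for illustration, not taken from the workshop.

```python
# Sketch of an explicit fallback: if the agent asks for help or its
# self-reported confidence is below a threshold, stop and escalate.
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.7  # assumed value; tune per workload


@dataclass
class AgentStep:
    tool_name: str
    arguments: dict
    confidence: float  # e.g. elicited from the model alongside its tool choice


def execute_or_escalate(
    step: AgentStep,
    dispatch: Callable[[str, dict], str],
    notify_human: Callable[[str], None],
) -> str:
    # Escalate when the agent asks for help or is not confident enough.
    if step.tool_name == "ask_human" or step.confidence < CONFIDENCE_THRESHOLD:
        notify_human(
            f"Agent is unsure (confidence={step.confidence:.2f}); "
            f"proposed {step.tool_name} with {step.arguments}"
        )
        return "escalated"
    return dispatch(step.tool_name, step.arguments)
```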
Decompose for Scale
Break complex workflows into small, tool-callable operations. This enables handling tasks of varying complexity and makes systems more maintainable.
Production Agent Checklist
- All agent actions run in isolated containers
- Well-defined tool menu with function calling
- Graceful error handling and human-in-the-loop fallbacks
- Comprehensive testing before accepting agent changes
- Observability and monitoring for agent decisions
- Clear context window management and optimization
Key Takeaways
Practical Agent Development
- Start Simple: Don't over-engineer the initial agent implementation. Begin with a small tool menu and expand as needed.
- Embrace Failure: Expect things to break; design for debugging. The workshop showed real failures and how to fix them.
- Context is King: Every token in the context window matters. Design your prompts and tool descriptions carefully.
- Test Early and Often: Write tests before or alongside agent code. Never accept agent changes without validation.
- Iterate Quickly: Small, fast cycles beat long planning sessions. Run experiments, gather data, and improve.
- Monitor Everything: You can't improve what you don't measure. Log agent decisions, tool usage, and success rates.
- Use Tools Judiciously: More tools ≠ better agent. Curate a focused set of high-quality, well-documented operations.
- Human-in-the-Loop: Know when to let humans intervene. Design agents to ask for help when uncertain.
"If you decompose things down and you can architect things right, it can handle a lot of different sizes."
— Kyle Penfound, Ecosystem Team at Dagger • 77:22
Watch the Full Workshop
Related Resources
Dagger Documentation
Research Notes & Methodology
This highlight page is based on a comprehensive analysis of the workshop transcript from the AI Engineer Summit 2024. The workshop featured live demonstrations, real-time debugging, and practical implementation patterns.
- Full VTT transcript (16,896 lines)
- Complete workshop recording (~80 minutes)
- GitHub repositories and documentation
- Dagger official documentation
- Complete transcript analysis
- Quote extraction and verification
- Fact-checking against official sources
- Cross-reference with documentation
Video: Ship Agents that Ship: A Hands-On Workshop by Kyle Penfound and Jeremy Adams (Dagger)
Event: AI Engineer Summit 2024 • Published: October 29, 2024