AI Engineer Insights

AI Engineering Highlights

Comprehensive analysis and insights from the AI Engineer Summit. Deep dives into the latest trends, technologies, and thought leadership in AI engineering.

In-Depth Analyses

95

Comprehensive AI engineering analysis

Expert Speakers

100

Industry leaders and practitioners

Videos Analyzed

461+

Conference talks and presentations

Fact-Checked

100%

Verified and validated insights

Featured Topics

Vibe Coding
Multi-Agent Systems
AI Infrastructure
Developer Experience
Enterprise AI
LLM Applications

All Highlights

95 total

Devin 2.0 and Moore's Law for AI Agents: 70-Day Doubling Cycle

Scott Wu from Cognition presents Moore's Law for AI Agents - capabilities double every 70 days (16-64x annually). From tab completion to autonomous engineers in 18 months. Learn the 5-tier evolution framework: migrations → bug fixes → complex debugging → project autonomy, plus technical infrastructure including Playbooks, Deep Wiki, With Search, and integrations with Linear, Jira, and Slack.

Scott Wu
AI Engineer Summit
Jun 15, 2025
Devin-2
Scott-Wu
Cognition
Moores-Law-AI
+19 more
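
The 70-day doubling figure in the entry above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch that assumes a 365-day year and steady exponential growth; the function names are illustrative and do not come from the talk.

```python
import math


def annual_multiplier(doubling_days: float, days_per_year: float = 365.0) -> float:
    """Growth factor over one year if capability doubles every `doubling_days`."""
    return 2.0 ** (days_per_year / doubling_days)


def doubling_period_days(multiplier_per_year: float, days_per_year: float = 365.0) -> float:
    """Doubling period (in days) implied by a given yearly growth factor."""
    return days_per_year / math.log2(multiplier_per_year)


if __name__ == "__main__":
    print(f"70-day doubling -> {annual_multiplier(70):.1f}x per year")
    for factor in (16, 64):
        print(f"{factor}x per year -> doubling every {doubling_period_days(factor):.1f} days")
```

Under these assumptions a 70-day doubling works out to roughly 37x per year, which sits inside the quoted 16-64x range; the two ends of that range correspond to doubling periods of about 91 and 61 days.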

Small Bets, Big Impact: Building GenBI at Northwestern Mutual

Asaf Bord shares how a 160-year-old Fortune 100 insurance company built GenBI using 6-week sprints, continuous plug-pulling rights, and incremental value delivery. Learn the crawl-walk-run adoption strategy, why 80% of BI work is report routing (not SQL generation), and the honest assessment that executive-ready AI may never arrive.

Asaf Bord
AI Engineer Summit
Oct 29, 2024
GenBI
Northwestern-Mutual
enterprise-AI
incremental-delivery
+16 more

7 Habits of Highly Effective GenAI Evaluations: AWS Framework for Production AI

Justin Muller, Principal Applied AI Architect at AWS, reveals the battle-tested 7 Habits framework that transformed document processing from 22% to 92% accuracy in 6 months. Learn why evals are the missing piece to scaling GenAI, the 30-second rule for rapid iteration, and how to build evaluation systems that enable production deployment with real-world case studies and practical implementation guidance.

Justin Muller
AI Engineer Summit
Oct 29, 2024
GenAI-evaluations
LLM-evaluation
AWS
prompt-decomposition
+16 more

Form Factors for Your New AI Coworkers: A Design Framework

Craig Wattrus from Flatfile presents a four-form-factor framework for AI coworkers: Invisible, Ambient, Inline, and Conversational. Learn why traditional design processes fail with LLMs and how playful experimentation leads to better AI products through "feeling the material" and character coaching over control.

Craig Wattrus
AI Engineer Summit
Oct 29, 2024
AI-coworkers
form-factors
Flatfile
UX-design
+12 more

How Bolt.new Scaled $0-20M ARR in 60 Days with 15 People

Eric Simons shares how Bolt.new went from near-shutdown to $20M+ ARR in 60 days with just 15 people. Learn the Spartan mentality, community strategies, AI-powered support, and team culture that made it possible.

Eric Simons
AI Engineer Summit
Oct 29, 2024
Bolt.new
StackBlitz
$20M-ARR
Spartan-mentality
+10 more

Llama 3 at 1,000 tokens/s on SambaNova AI Platform

Full workshop on achieving unprecedented Llama 3 inference speeds of 1,000 tokens/second using SambaNova's Composition of Experts architecture and custom RDU hardware. Includes hands-on RAG implementation with LlamaIndex, ChromaDB, and performance benchmarks (16 chips vs 576, full precision).

Relle, Pedro
AI Engineer Summit 2024
Oct 29, 2024
Llama-3
SambaNova
RDU
inference-optimization
+14 more

Why Cisco Ditched RAG for Fine-Tuning in Production AI Agents

Ola Mabadeje from Cisco's Outshift group reveals how they built a 5-agent system for network change management, why fine-tuning beat RAG for knowledge graph queries (drastic token reduction), and how knowledge graphs serve as digital twins for safe testing before production. Complete architecture with real quotes.

Ola Mabadeje
AI Engineer Conference 2025
Dec 30, 2025
multi-agent-AI
Cisco
fine-tuning-vs-RAG
knowledge-graphs
+15 more

Ship Agents that Ship: Building Production AI Agents with Guardrails

Kyle Penfound and Jeremy Adams from Dagger demonstrate building production-ready AI agents in a hands-on workshop. Learn guardrails, container-native development, GitHub integration, function calling patterns, and practical production-ready agent architecture.

Kyle Penfound, Jeremy Adams
AI Engineer Summit 2024
Oct 29, 2024
AI-agents
Dagger
container-native
agent-guardrails
+12 more

From Arc to Dia: How The Browser Company Built AI-Native Tools

Samir Mody from The Browser Company shares lessons from building Arc and Dia. Learn how they achieved 10x iteration speed, built Jeba for automated prompt optimization, embraced non-engineers as AI developers, and navigated security challenges in AI browsers.

Samir Mody
AI Engineer Conference 2024
Oct 29, 2024
The-Browser-Company
Arc-browser
Dia-browser
AI-browsers
+10 more

Why Bolt.new Won and Most DevTools AI Pivots Failed

From Death's Door to $100M ARR: The Three-Step Framework, Anti-Patterns to Avoid, and How to Create Categories Instead of Features

Victoria Melnikova
AI Engineer Conference 2025
Dec 30, 2025
Bolt.new
StackBlitz
AI-pivot
startup-turnaround
+10 more

Rust is the Language of AGI: Why AI Prefers Rust Over Python

Michael Yuan explains why Rust is the perfect language for AI-generated code. Learn how the Rust compiler serves as a reward function for AI, the MCP tools that automate code generation, and the path to AGI through verifiable code with 1,000+ developers already using Rust Coder in production.

Michael Yuan
AI Engineer Conference
Oct 29, 2024
Rust
AGI
AI-code-generation
MCP
+15 more

Code World Model: Building World Models for Computation

Jacob Kahn from Meta FAIR presents Code World Model (CWM) - a 32B parameter model that explicitly models program execution dynamics rather than just syntax. Learn how execution tracing, asynchronous RL with mid-trajectory updates, and "neural debugging" enable AI to simulate code execution without running it, effectively approximating solutions to the halting problem.

Jacob Kahn
AI Engineer Conference
Oct 29, 2024
world-models
execution-tracing
Meta-FAIR
neural-debugging
+18 more

tldraw.computer: The Visual AI Language That Executes Like Code

Steve Ruiz demos tldraw.computer - a visual programming language where LLMs execute through graph-based nodes. See multimodal computing, self-scripting nodes, and AI as collaborator returning structured shapes, not pixels.

Steve Ruiz
AI Engineer Summit 2024
Oct 29, 2024
tldraw
visual-programming
AI-execution
multimodal
+11 more

Agent Reinforcement Fine Tuning: OpenAI's Breakthrough in Training AI Agents

Will Hang and Cathy Zhou from OpenAI introduce Agent RFT - the first time models can interact with the outside world during training. Learn how Cognition, Cosine, and Qodo achieved dramatic improvements with as few as 10 examples, the four success principles, and why parallel tool calling reduces latency from 8-10 steps to 4.

Will Hang, Cathy Zhou
AI Engineer Summit
Oct 29, 2024
Agent-RFT
OpenAI
Will-Hang
Cathy-Zhou
+21 more

AI Copilots for Tech Architecture: The Highest-ROI Use Case

Why Architecture Copilots Deliver Higher ROI Than Coding Copilots: Preventing Costly Mistakes, Justifying Nine-Figure Infrastructure Spends, and Enabling Safe Delegation to Developers

Boris Bogatin
AI Engineer Conference
Dec 30, 2025
AI-architecture
tech-architecture
architecture-copilot
enterprise-AI
+9 more

Claude plays Minecraft! Emergent AI Behavior & Agent Engineering

AWS engineer demonstrates building Rocky, a Minecraft bot powered by Claude Haiku and Amazon Bedrock Agents. Live demo with emergent behavior, real-time gameplay, and practical lessons on architecture evolution from LangChain to Bedrock.

AWS Engineer
AI Engineer Summit 2024
Oct 29, 2024
AI-agents
Claude-Haiku
Amazon-Bedrock
Minecraft
+12 more

AI Kernel Generation: What's Working, What's Not, What's Next

Natalie Serrino from Gimlet Labs on AI-Driven GPU Optimization: 25-70% Speedups, Agentic Synthesis Swarm, Hardware-in-the-Loop Verification, and the Path Forward for PTX Generation and Formal Verification

Natalie Serrino
AI Engineer Conference
Dec 30, 2025
AI-kernel-generation
GPU-optimization
Gimlet-Labs
heterogeneous-compute
+10 more

Giving a Voice to AI Agents: Voice AI 2.0, Contextual AI, and <500ms Latency

Scott Stephenson, CEO at Deepgram, explains the evolution from Siri-era Voice AI 1.0 to LLM-powered Voice AI 2.0, the Intelligence Revolution timeline (25-30 years), accuracy improvements (75% to 90%+), latency breakthroughs (2-5s to 100-200ms), and how contextual AI passes conversation context to enable human-like voice interactions with <500ms roundtrip latency.

Scott Stephenson
AI Engineer Conference
Oct 29, 2024
Voice-AI-2.0
Deepgram
Scott-Stephenson
contextual-AI
+13 more

Efficient Reinforcement Learning: Asynchronous Pipeline RL & GPU Optimization

Rhythm Garg and Linden Li from Applied Compute present efficient RL systems for enterprise applications. Learn about asynchronous vs synchronous RL, GPU utilization optimization, staleness trade-offs, system-level modeling, and first-principles optimization for end-to-end performance.

Rhythm Garg, Linden Li
AI Engineer Summit
Oct 29, 2024
reinforcement-learning
RL
asynchronous-RL
pipeline-RL
+11 more

A Year of Gemini Progress + What's Next: 50x Growth, Universal Assistant, and Agentic AI

Logan Kilpatrick from Google DeepMind recaps a transformative year — 10 years of progress in 12 months, 50x inference growth, Gemini 2.5 Pro final update, organizational evolution, and what's next for universal assistant vision, omnimodal models, agentic AI, and developer platform expansions (Embeddings API, Deep Research API, Veo 3, Imagen 4, AI Studio repositioning).

Logan Kilpatrick
AI Education Summit
Oct 29, 2024
Gemini-2.5-Pro
Google-DeepMind
Logan-Kilpatrick
universal-assistant
+17 more

Top Ten Challenges to Reach AGI

Stephen Chin and Andreas Kollegger explore the fundamental obstacles to AGI through science fiction memes—from Memento's memory problem to The Matrix's simulation control. A concise 4-minute lightning talk covering memory limitations, alignment problems, transparency issues, and the ultimate question: do we know what to ask AGI?

Stephen Chin, Andreas Kollegger (ABK)
AI Engineer World's Fair
Oct 29, 2024
AGI
science-fiction
AI-safety
alignment-problem
+10 more

Taxonomy for Next-Gen Reasoning: Why AI Gains Aren't Free

Nathan Lambert's Four-Pillar Framework: Skills, Calibration, Strategy, Abstraction—10-100x Token Waste Problem, Post-Training RL Compute Revolution (1% → 10%+)

Nathan Lambert
AI Engineer Conference
Dec 30, 2025
Nathan-Lambert
AI-reasoning
post-training-RL
calibration
+12 more

Why Agent Engineering

swyx Landmark Keynote: Why 2025 is the Year of Agents - 6 Enabling Factors, Agent Definitions, PMF Use Cases, ChatGPT Growth Analysis, and the Evolution of AI Engineering as a Discipline

swyx (Shawn Wang)
AI Engineer Summit 2025
Dec 30, 2025
agent-engineering
swyx
AI-Engineer-Summit
agents-2025
+10 more

Latent Space Paper Club: DeepSeek R1/V3 and Test Time Compute

8B = 235B Distillation Breakthrough, Doubled Reasoning Tokens, and the New Scaling Paradigm from Chinchilla to Inference

Vibhu Sapra
AI Engineer World's Fair
Oct 29, 2024
DeepSeek-R1
DeepSeek-V3
test-time-compute
model-distillation
+13 more

Agentic GraphRAG: AI's Logical Edge

Neo4j MCP Tools, GraphRAG Architecture, and Enterprise Case Study with 85% Adoption

Stephen Chin
AI Engineer Conference
Dec 30, 2024
GraphRAG
Neo4j
knowledge-graphs
MCP
+11 more

Anchoring Enterprise GenAI with Knowledge Graphs: 75% Faster Onboarding

Pfizer & Neo4j Case Study: How GraphRAG Achieved 3 Months → 3 Weeks with Knowledge Graphs. Real Enterprise Lessons on Technology Transfer, Workforce Knowledge Retention Crisis (20 Years → 3 Years Tenure), and Navigating Organizational Politics.

Jonathan Lowe, Stephen Chin
AI Engineer Summit
Oct 29, 2024
GraphRAG
Neo4j
Pfizer
enterprise-ai
+18 more

AI Engineering at Jane Street - Building AI Tools in OCaml

John Crepezzi shares how Jane Street builds custom AI infrastructure when off-the-shelf tools won't work. Learn workspace snapshotting, Code Evaluation Service (CES) running 50-100x faster than builds, Aid sidecar architecture for multi-editor support, and why they have more OCaml code than exists publicly worldwide.

John Crepezzi
AI Engineer Summit 2024
Oct 29, 2024
Jane-Street
OCaml
AI-engineering
workspace-snapshotting
+13 more

AI Agents, Meet Test Driven Development

Why TDD is Critical for Reliable AI Agents: L0-L4 Agentic Workflow Framework, Evaluation Loops, and SEO Agent Demo with 60% Performance Improvement

Anita
AI Engineers Conference
Dec 30, 2024
test-driven-development
TDD
AI-agents
agentic-workflows
+12 more

2025 is the Year of Evals!

Why AI Evaluation Finally Breaks Through: Three Converging Forces, C-Suite Alignment & Market Validation

John Dickerson
AI Engineer Conference
Dec 30, 2024
AI-evaluation
ML-monitoring
agentic-systems
enterprise-AI
+11 more

2026: The Year The IDE Died

Why Vibe Coding Will Transform Software Development

Steve Yegge, Gene Kim
AI Engineer Summit 2024
Oct 29, 2024
vibe-coding
ai
ide
future-of-work
+1 more

The Price of Intelligence: AI Agent Pricing in 2025

Comprehensive analysis of AI agent pricing models, cost structures, and the economics of intelligence — outcome-based pricing, prepaid credits, cost optimization strategies, and 2025 predictions from 13+ real company examples

Chz
AI Engineer Conference 2024
Dec 30, 2024
ai-agent-pricing
cost-of-intelligence
token-economics
model-costs
+13 more

BlackRock: 8 Months → 2 Days

How to Build Custom Knowledge Apps at Scale - Human-in-the-Loop Design, LLM Strategies, and Why Autonomous Agents Don't Work in Finance

Infant Vasanth, Vaibhav Page
AI Engineer Summit 2024
Oct 29, 2024
enterprise
financial-services
document-processing
human-in-the-loop
+5 more

How to Build World-Class AI Products

Sarah Sachs (AI Lead, Notion) and Carlos Esteban (Braintrust) share their evaluation-first approach to building AI products. Learn why Notion spends 90% of time on evaluation and 10% on prompting, with practical guidance on data management, trace-based debugging, user feedback analysis, multi-turn conversation evaluation, and production monitoring with online scoring.

Sarah Sachs, Carlos Esteban
AI Engineer Conference
Oct 29, 2024
AI-product-development
Notion
Braintrust
evaluation-framework
+10 more

Five Hard Earned Lessons About Evals: Why Braintrust Ships 2 Weeks After New Model Releases

Ankur Goyal (CEO, Braintrust) shares hard-earned lessons: YAML vs JSON (15% token savings), GPT-4o 10% → Claude 4 Sonnet viable (6x better), Notion's 24-hour model integration, continuous reconciliation, and why great evals must be engineered like any other software system.

Ankur Goyal
AI Engineer Summit
Oct 29, 2024
AI-evaluation
Braintrust
Ankur-Goyal
YAML-vs-JSON
+11 more

12-Factor Agents: Building Reliable LLM Applications

Production AI Methodology from HumanLayer - Transform Unreliable Demos into Dependable Systems

Dex Horthy
AI Engineer Summit 2024
Nov 15, 2024
12-factor-agents
production-AI
agent-frameworks
software-engineering
+3 more

3 Ingredients for Building Reliable Enterprise Agents

The Mathematical Formula for Agent Success: P(success) × Value - Cost(failure) > Cost(running)

Harrison Chase
AI Engineer Summit 2024
Nov 15, 2024
agents
enterprise
reliability
human-in-the-loop
+6 more
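
The inequality in the entry above can be read as a simple go/no-go check. The sketch below encodes it exactly as written in the listing; the function name and all dollar figures are hypothetical, and the listing does not say whether the failure cost should be weighted by the probability of failure, so no weighting is applied here.

```python
def agent_is_worth_running(
    p_success: float,
    value: float,
    cost_failure: float,
    cost_running: float,
) -> bool:
    """Return True when P(success) * Value - Cost(failure) > Cost(running)."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be a probability between 0 and 1")
    return p_success * value - cost_failure > cost_running


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show both outcomes.
    print(agent_is_worth_running(p_success=0.9, value=100.0, cost_failure=20.0, cost_running=5.0))
    print(agent_is_worth_running(p_success=0.3, value=100.0, cost_failure=20.0, cost_running=15.0))
```

Read this way, the check can be tipped toward "worth running" by raising the success probability, raising the value of a success, or lowering the cost of a failure.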

Agents are Robots Too

What Self-Driving Taught Me About Building Agents: Agentics, 1% vs 99% Problem, and Closed-Loop Systems

Jesse Hu
AI Engineer Summit
Jan 1, 2024
robotics
self-driving
agents
Agentics
+8 more

AI Music Generation: From Prompt to Production

Hands-on workshop exploring AI music generation tools (Udio, Suno, Stable Audio), voice cloning (RVC), stem separation (Wave, UVR5), and the RIAA legal battle. Learn practical workflows for generating professional-quality music from text prompts.

Phlo Young
AI Engineer World's Fair
Dec 30, 2024
AI-music
Udio
Suno
Stable-Audio
+17 more

AI + Security & Safety: Why Your Agent Can't Go to Production

The Single-Process Security Flaw, Real-World Production Blocker, and Three-Layer Defense Framework from Apache Ranger's Creator - Don Bosco Durai, Priv

Don Bosco Durai
AI Engineer Summit
Dec 30, 2024
AI-security
agent-security
zero-trust
enterprise-AI
+11 more

Vibes Won't Cut It

Production Reality vs. AI Hype in Software Engineering - Why Professional Engineers Are Skeptical and What Actually Works

Chris Kelly
AI Engineer Conference
Oct 29, 2024
vibe-coding
production-engineering
AI-coding-skepticism
software-engineering
+7 more

MongoDB Atlas Vector Search: RAG Without the Complexity

Unified Platform: HNSW Algorithm, Search Nodes for Independent Scaling, Framework Integrations (LangChain, LlamaIndex), and Production-Ready RAG with 4,096 Dimensions Support

Ben Flast
AI Engineer Conference
Dec 30, 2024
MongoDB
RAG
vector-search
HNSW
+12 more

Building in the Gemini Era

Google DeepMind's Vision for AI-Assisted Development

Kat Kampf, Ammaar Reshi
AI Engineer Summit
Jan 1, 2024
gemini-3-pro
ai-studio
vibe-coding
image-generation
+3 more

#define AI Engineer: Technical Humility & Research-Engineering Symbiosis

Greg Brockman (OpenAI President) & Jensen Huang (NVIDIA CEO) on the evolution from AlexNet to 100K GPU clusters, why technical humility matters, and the future of domain-specific AI agents. "If you don't have the idea, you're dead in the water. But if you don't have the engineering, that idea is not going to live and see the light of day." Learn about the 3-phase evolution of AI engineering at OpenAI, the cultural divide between engineers and researchers, and predictions for AGI-era development workflows.

Greg Brockman, Jensen Huang
AI Engineer Summit
Nov 15, 2024
Greg-Brockman
Jensen-Huang
OpenAI
NVIDIA
+12 more

The Next Unicorns: 7 Top AI Startups from HF0 Residency

Real Revenue, Validated Models: 25M Users, $100M ARR Across Portfolio. Meet Krea, OpenHome, Koframe, Federous AI, Upside, OpenAudio, Glow, Favored, and OpenRouter.

Diego Rodriguez, Sua, Josh, Eugene, Jonas, David Vorick, David, Alex Atala
HF0 Residency Demo Day
Dec 30, 2024
HF0
AI-startups
unicorns
venture-capital
+13 more

The AI Developer Experience Doesn't Have to Suck

Why and How Modal Rebuilt Cloud Infrastructure from Scratch

Erik Bernhardsson
AI Engineer Summit
Jan 1, 2024
serverless-gpu
container-cold-start
memory-snapshotting
modal
+2 more

AI Native Company

Jan 1, 2024

What Data from 20M Pull Requests Reveal About AI Transformation

Jellyfish Research: 2x Throughput, 24% Faster Cycles, and the Architecture Correlation That Determines AI Success (4x vs 0x Gains)

Nicholas Arcolano
AI Engineer Conference (2024)
Dec 30, 2024
AI-transformation
Jellyfish
pull-request-analytics
GitHub-Copilot
+10 more

Shipping AI That Works: An Evaluation Framework for PMs

LLM-as-Judge Methodology with 4 Components: Role, Task, Context, Goal. Why Even OpenAI and Anthropic CPOs Say Models Hallucinate. Transition from Vibe Coding to Thrive Coding in Production.

Aman Khan
AI Engineer Conference
Oct 29, 2024
AI-evaluation
LLM-as-judge
vibe-coding
thrive-coding
+14 more

Architecting Agent Memory: Principles, Patterns, and Best Practices

MongoDB's Guide to Building Stateful AI Agents: Memory Management Lifecycle, Four Memory Types, and Voyage AI Integration

Richmond Alake
AI Engineer Conference
Dec 30, 2024
agent-memory
MongoDB
vector-search
RAG
+10 more

Autonomy Is All You Need

How Replit Broke the One-Hour Autonomy Barrier for Non-Technical Users

Michele Catasta
AI Engineer Summit
Oct 29, 2024
autonomy
multi-hour-agents
Replit
B3
+5 more

Claude plays Minecraft!: When AI Spontaneously Exhibits Unexpected Behavior

AWS Solutions Architect's live demo of Rocky, a Minecraft bot powered by Claude Haiku and AWS Bedrock, showcasing emergent AI behaviors including autonomous escape from a hole, 3D spatial reasoning, and the critical Return of Control pattern for production agents

AWS Solutions Architect
AI Engineer Summit
Oct 29, 2024
emergent-behavior
Claude-Haiku
AWS-Bedrock
Minecraft
+11 more

Continual System Prompt Learning for Code Agents

5-15% Improvement with Only 150 Examples - A Practical Alternative to Reinforcement Learning

Aparna Dhinakaran
AI Engineer Summit 2024
Dec 29, 2024
system-prompt-learning
code-agents
LLM-optimization
SWE-bench
+4 more

Building Cursor Composer: Fast, Smart, and Parallel

Lee Robinson reveals how Cursor built their first agent model with 4x token efficiency, parallel tool calling breakthrough, and RL infrastructure secrets. Learn about the 3.5x Blackwell speedup, semantic search impact, and vertical integration advantages.

Lee Robinson
AI Engineer Summit 2024
Oct 29, 2024
Cursor
Cursor-Composer
Lee-Robinson
parallel-tool-calling
+15 more

Developing Taste in Coding Agents

Meta Neuro-Symbolic RL: 10x PR Increase with Acquired Taste

Ahmad Awais
AI Engineer Summit
Dec 29, 2024
meta-neuro-symbolic-rl
taste-models
acquired-taste
reinforcement-learning
+3 more

Devin 2.0 and Moore's Law for AI Agents

Scott Wu's Framework: 70-Day Doubling Cycle - From Tab Completion to Autonomous Engineers in 18 Months. Deep Wiki, Automated Testing, Backlog Processing, and the Future of Software Engineering

Scott Wu
AI Engineer Summit
Oct 29, 2024
Devin-2
Scott-Wu
Cognition-AI
Moore's-Law-AI
+13 more

The DevOps Engineer Who Never Sleeps: AI Agents at Datadog

Diamond Bishop from Datadog shares what they learned building AI agents that automate on-call duties, handle incident response, and transform DevOps. Covers evaluation strategies, team composition, LLM observability, and predictions about agents surpassing humans as SaaS users.

Diamond Bishop
AI Engineer Summit 2024
Oct 29, 2024
AI-agents
Datadog
on-call-automation
LLM-observability
+13 more

Future-Proof Coding Agents

OpenAI Guide to Building AI That Writes Code and Survives Model Evolution

Bill Chen, Brian Fioca
AI Engineer Summit
Jan 1, 2024
coding-agents
openai-codex
ai-harness
model-evolution
+2 more

How Claude Code Works

Architecture Deep Dive: Flexible Loops, Skills System & Comparison with Cursor, AMP, and OpenAI Codex

Jared Zoneraich
AI Engineer Summit
Nov 15, 2024
claude-code
agent-architecture
skills-system
flexible-loops
+4 more

How to Look at Your Data

A Practical Guide to Evaluating RAG Systems: Fast Evals, Cluster Analysis, and Data-Driven Decision Making

Jeff Huber, Jason Liu
AI Engineer Summit 2025
Dec 30, 2025
RAG-evaluation
fast-evals
cluster-analysis
data-driven-AI
+7 more

Your Personal Open-Source Humanoid Robot for $8,999

How K-Scale Labs Built a $9k Open-Source Humanoid in 5 Months: Sim-to-Real RL, Python SDK, and Democratizing Robotics

JX Mo
AI Engineer Summit
Jan 1, 2024
humanoid-robots
open-source
robotics
RL
+4 more

The Infinite Software Crisis: When AI Generates Faster Than We Understand

Jake Nations from Netflix argues that while AI has dramatically accelerated code generation, it has created a dangerous gap between what we can produce and what we can understand. He presents "context compression" as a three-phase solution (Research, Planning, Implementation) to maintain control over complex systems.

Jake Nations
AI Engineer Summit 2024
Oct 29, 2024
software-complexity
context-compression
Netflix
AI-code-generation
+11 more

How Search Conquered Compiler Complexity

Luminal AI Automatically Rediscovered Flash Attention Using Search-Based Compilation—12 Primitives, E-Graphs, and 3M → 5K Lines of Code

Joe Fioti
AI Engineer Summit 2024
Oct 29, 2024
search-based-compilation
deep-learning-compilers
Flash-Attention
e-graphs
+8 more

Making Codebases Agent Ready

Organizational Readiness for Autonomous AI Development

Eno Reyes
AI Engineer Summit 2024
Jan 1, 2024
ai-agents
codebase-readiness
verification
autonomous-development
+1 more

Production Software Keeps Breaking and It Will Only Get Worse

Why AI Writes Code Faster But Debugging Gets Harder - Three-Part Framework: Causal ML + LLMs + Swarms of Agents. DigitalOcean Case Study: 40% MTTR Reduction

Anish Agarwal, Matt
AI Engineer Conference
Dec 30, 2024
production-reliability
AI-debugging
causal-ML
swarm-intelligence
+11 more

Poolside's Path to AGI

Reinforcement Learning, Defense & Vertical Integration

Jason Warner, Eiso Kant
AI Engineer Summit
Jan 1, 2024
agi
reinforcement-learning
defense-ai
vertical-integration
+1 more

Serving Voice AI at $1/hr: Open-source, LoRAs, Latency, Load Balancing

Neil Dwyer reveals how to achieve $1/hr voice AI costs using Orpheus model, LoRA fine-tuning, vLLM with FP8, and consistent hash load balancing

Neil Dwyer
AI Engineer Summit
Oct 29, 2024
voice-AI
Orpheus
LoRA
vLLM
+8 more
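
For readers unfamiliar with the consistent hash load balancing named in the entry above, here is a generic sketch of the technique. It is not the speaker's implementation: the class name, the replica names, the virtual-node count, and the idea of routing by a LoRA adapter ID are all assumptions made for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to nodes so that adding or removing a node moves few keys."""

    def __init__(self, nodes, vnodes: int = 100):
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first ring point after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = ConsistentHashRing(["replica-a", "replica-b", "replica-c"])
    # Hypothetical adapter IDs: the same ID always lands on the same replica.
    for adapter_id in ("voice-001", "voice-002", "voice-003"):
        print(adapter_id, "->", ring.node_for(adapter_id))
```

One plausible reason to pair this with LoRA serving is cache locality: if requests are keyed by adapter, each replica keeps seeing the same adapters and can keep them loaded. That motivation is an inference, not something the listing states.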

Skills vs Agents

Jan 1, 2024

The Unbearable Lightness of Agent Optimization

Why Your Agent Optimization is Failing (And How to Fix It)

Alberto Romero
AI Engineer Summit
Nov 15, 2024
agent-optimization
meta-ac
weak-reflector
production-ai

Unlocking AI Powered DevOps Within Your Organization

Practical Patterns from GitHub: Realistic Metrics (30% Average), IDE Integration Over Chat Tools, Human-in-the-Loop Autonomous Agents

Jon Peck
AI Engineer Summit
Dec 30, 2024
DevOps
AI-in-DevOps
GitHub-Copilot
IDE-integration
+5 more

Coding Evals: From Code Snippets to Codebases

How AI Code Evaluation Evolved from Single Functions to Hour-Long Challenges—and Why 30% is Reward Hacking

Naman Jain
AI Engineer Summit
Dec 30, 2024
coding-evals
reward-hacking
data-contamination
CodeBench
+5 more

ChatGPT is poorly designed. So I fixed it

Multimodal Voice + Text Integration Using GPT-4o Realtime API - "Shipping the Org Chart" Anti-Pattern and the FaceTime + iMessage Solution

AI Engineer World's Fair
Dec 30, 2024
UX-design
ChatGPT
GPT-4o-Realtime
multimodal-AI
+6 more

Building Conversational AI Agents

Multilingual Architecture with ElevenLabs: 31 Languages Now, 99 Coming in V3, $5M Voice Marketplace, Production-Grade Low-Latency Pipeline

Thor Schaeff
API Day Singapore
Dec 30, 2024
conversational-AI
multilingual-AI
voice-technology
ElevenLabs
+12 more

Code World Model: Building World Models for Computation

Jacob Kahn from FAIR Meta presents a revolutionary framework for understanding code through execution modeling. Learn how 32B parameter models trained on execution traces enable semantic understanding, neural debugging, and approximation of undecidable problems like the halting problem—all through bash-first async RL with mid-trajectory model updates.

Jacob Kahn
AI Engineer Summit
Oct 29, 2024
world-models
code-execution
FAIR-Meta
Jacob-Kahn
+12 more

The Cure for the Vibe Coding Hangover

Systematic Framework for Reliable AI-Augmented Development: 5-Step Planning Phase, Multi-Sensory Feedback Loop, Binary Dependencies, and Circular Resolution Strategies That Transform AI Agents from Erratic Novices into Predictable Implementation Partners

Corey J. Gallon
AI Engineer Conference
Dec 30, 2024
vibe-coding
AI-agents
software-architecture
dependency-management
+13 more

War on Slop

Jan 1, 2024

Enterprise Ready MCP: The Complete Guide

From Localhost to Production: Taking Model Context Protocol Servers from Demo to Enterprise Deployments - Security Challenges, Compliance Requirements, and Implementation Realities

Tobin South
AI Engineer Summit
Oct 29, 2024
MCP
Model-Context-Protocol
enterprise-AI
AI-security
+9 more

LinkedIn 360Brew: One Model to Replace All Recommendation Systems

How LinkedIn replaced dozens of specialized models with a single LLM—achieved 7x latency reduction and 30x throughput improvement through promptification and model distillation

Hamed, Maziar
AI Engineer Summit
Dec 30, 2024
LinkedIn
360Brew
LLM
recommendation-systems
+12 more

Best Practices for Evaluating LLM Applications with llmeval

Niklas Nielsen from Log10 introduces llmeval - a command-line tool for reliable LLM evaluation built on Meta's Hydra. Learn about flexible test criteria for fuzzy model outputs, Python-based metrics, model-based evaluation with its pitfalls (self-preference bias, score inflation), and the innovative "Auto-John" concept for scaling human feedback through AI personas.

Niklas Nielsen
AI Engineer Conference
Oct 29, 2024
llmeval
Log10
Niklas-Nielsen
LLM-evaluation
+14 more

Netflix Foundation Model: One Model to Rule All Recommendations

How Netflix proved scaling laws apply to recommendation systems—applying LLM techniques like multi-token prediction, long-context training, semantic embeddings, and rich multi-task objectives to achieve infrastructure consolidation and quality improvements

Yesu Feng
AI Engineer Summit
Dec 30, 2025
Netflix
recommendation-systems
foundation-model
LLM-techniques
+12 more

Leadership in AI Assisted Engineering

Justin Reock shares aggregated data from 140,000 engineers revealing extreme variability in AI impact (+20% to -20%) and provides a leadership framework for successful AI adoption. Learn why writing code has never been the bottleneck, how Theory of Constraints applies to AI, and the 7 leadership principles that separate +20% outcomes from -20% outcomes.

Justin Reock
AI Engineer Summit
Oct 29, 2024
AI-leadership
DX
Atlassian
software-engineering-productivity
+11 more

Government Agents: AI Agents Meet Tough Regulations

Mark Myshatyn from Los Alamos National Lab reveals how one of the most secure government organizations is deploying AI agents that design fusion capsules and execute code on HPC systems. Learn about 1000+ security controls, FedRAMP compliance, OpenAI models on classified networks, Venado supercomputer (2500+ GraceHopper nodes), and four architecture principles for government-ready AI. "We are not a t-shirt company...People can die if we do this wrong."

Mark Myshatyn
AI Engineer Conference
Oct 29, 2024
government-AI
Los-Alamos-National-Lab
AI-regulations
FedRAMP
+17 more

Explore More Research

Dive deeper into AI engineering with our comprehensive collection of research topics, case studies, and company analysis.