Moore's Law for AI Agents: 70-Day Doubling in Code
Scott Wu (Cognition CEO) presents a framework for understanding exponential AI agent growth, with capabilities doubling every 70 days. Learn about the evolution from tab completion to full AI engineer and why Wu expects another 16-64x improvement in the next year.
The amount of work that an AI agent can do in code goes up somewhere between 16 and 64x every year, at least for the last couple of years that we've seen. The doubling time is about every 70 days.
— Scott Wu, CEO of Cognition (00:01:24)
70 days: AI capability doubling cycle in code
16-64x: annual improvement
18 months: from tab completion to AI engineer
The 4-Tier Evolution of AI Coding Capabilities
In just 18 months, AI coding capabilities evolved from simple autocomplete to autonomous software engineers. Here's the timeline Scott Wu presented:
Tier 1: Tab Completion
The only product experience with PMF in code was single-line completion. Tools like GitHub Copilot dominated, predicting the next line based on context.
"18 months ago, I would say the only really the only product experience that had PMF in code was just tab completion... It was just like here's what I have so far. Predict the next line for me."
Watch (00:02:15)
Tier 2: Repetitive Migrations
AI agents excelled at large-scale, repetitive tasks—JavaScript to TypeScript conversions, version upgrades, migrating 10,000+ files. These were "the easiest thing for AI, which was cool actually because it was also the most annoying thing for humans to do."
"10,000 file migrations were the easiest thing for AI, which was cool actually because it was also the most annoying thing for humans to do."
Watch (00:04:10)
Tier 3: Isolated Bugs & Features
AI reached "intern-level" capability for isolated tasks—fixing bugs, implementing features that spanned multiple files but were well-defined. This is where Devin 2.0's "Playbooks" system became critical for reliable instruction following.
Key capability: Understanding multi-file changes, knowing when to ask for help, and maintaining confidence estimation.
Watch (00:13:20)
Tier 4: Full AI Engineer
The turning point: "when you could just tag Devin in Slack and say 'we've got this bug, please take a look' or 'could you build this thing'—and it would just do it." Autonomous execution of complex multi-hour tasks with minimal supervision.
"Often you want to be able to have points where you closely monitor Devin for 10% of the task, 20% of the task and then have it do work on its own for the other 80-90%."
Watch (00:12:42)
Understanding "Moore's Law for AI Agents"
The Framework
Scott Wu introduces a simple but powerful metric: measure AI capability by "how much uninterrupted work can the agent do before human intervention is needed?"
The Math:
- Doubling every 70 days ≈ 5 doublings/year
- 2⁵ = 32x on average (within the 16-64x range)
- Consistent for "at least the last couple years"
Why Code is Faster
"The doubling time is about every seven months which already is pretty crazy actually. But in code it's actually even faster. It's every 70 days."
Comparison:
- General AI: ~7-month doubling cycle
- Code-specific AI: ~70-day cycle (3x faster)
- Structural advantages in code: tests, clear objectives (see the sketch below)
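To make the arithmetic concrete, here is a minimal Python sketch of the compounding implied by a given doubling time. It assumes simple exponential growth and uses the 70-day and roughly-7-month figures quoted above; the exact outputs (about 37x and 3.3x) bracket the talk's rounded "5 doublings ≈ 32x".

```python
# Annual growth factor implied by a doubling time, assuming simple
# exponential compounding: factor = 2 ** (365 / doubling_days).
def annual_factor(doubling_days: float) -> float:
    return 2 ** (365 / doubling_days)

code_factor = annual_factor(70)          # ~37x: 365/70 ≈ 5.2 doublings/year
general_factor = annual_factor(7 * 30)   # ~3.3x for a ~7-month (210-day) doubling

print(f"Code agents: ~{code_factor:.0f}x per year "
      f"({365 / 70:.1f} doublings; the talk rounds to 2^5 = 32x, inside 16-64x)")
print(f"General AI:  ~{general_factor:.1f}x per year")
print(f"Doubling-time ratio: {210 / 70:.0f}x faster in code")
```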
The Bold Prediction: Next 12 Months
Scott doesn't see this slowing down. In fact, he predicts another 16-64x improvement over the next year.
"We're going to see another 16 to 64x over the next 12 months as well."
Implications: If 70-day doubling continues, by December 2025 we could see AI agents handling tasks roughly 32x longer than what is possible today. Projects that currently require weeks of work might be completed in hours.
Watch (00:15:55)
The Shifting Bottleneck Problem
One of Scott's most important insights: each capability jump creates entirely new bottlenecks. What matters changes every 2-3 months.
Every time you get to the next tier, the bottleneck that you're running into, or the most important capability, or the right way you should be interfacing with it... actually changes at each point.
Watch Scott explain (00:02:53)
Tier 1 Bottleneck: Text Prediction
Challenge: Accurately predicting next tokens given limited context
Tier 2 Bottleneck: Scale & Repetition
Challenge: Maintaining consistency across thousands of files
Tier 3 Bottleneck: Instruction Following
Challenge: Reliable execution of complex multi-step tasks (solved with Playbooks)
Tier 4 Bottleneck: Knowledge & Memory
Challenge: Learning from feedback across tasks, organizational context
Tier 5 Bottleneck: Testing & Validation
Challenge: Self-testing, interpreting results, knowing what to test
Tier 6 Bottleneck: Project-Level Orchestration
Challenge: Coordinating entire projects, "what goes after that?"
Technical Architecture: How Devin 2.0 Works
The Playbooks System
Reliable instruction following for complex multi-step tasks. Playbooks encode procedural knowledge that ensures consistent execution.
Critical for Tier 3 capabilities—moving from single-file changes to multi-file coordination.
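Cognition has not published Devin's playbook format, so the following is a purely hypothetical sketch of the general idea: procedural knowledge encoded as an explicit, ordered list of steps with completion criteria. The Playbook and Step classes and the example task are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    instruction: str   # what the agent should do
    done_when: str     # how the agent can tell the step is complete

@dataclass
class Playbook:
    name: str
    steps: list[Step] = field(default_factory=list)

# Hypothetical playbook for a multi-file rename, the kind of
# well-defined, multi-step task described for Tier 3.
rename_playbook = Playbook(
    name="rename-public-api",
    steps=[
        Step("Find every reference to the old symbol across the repo.",
             "A complete list of affected files exists."),
        Step("Apply the rename in each file, updating imports.",
             "The codebase builds with no references to the old name."),
        Step("Run the test suite and fix any failures caused by the rename.",
             "All tests pass."),
    ],
)

for i, step in enumerate(rename_playbook.steps, start=1):
    print(f"{i}. {step.instruction}  [done when: {step.done_when}]")
```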
Knowledge & Memory
Learning from human feedback across tasks. Devin improves over time by remembering corrections and adapting to organizational patterns.
Addresses the "30th day is dramatically better than day one" problem.
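The talk describes this capability, not its implementation, so here is a hypothetical sketch of the idea: corrections from human feedback are stored per topic and recalled into context before later tasks, which is what makes day 30 better than day one. The FeedbackMemory class and the example notes are invented.

```python
from collections import defaultdict

# Hypothetical sketch of "knowledge & memory": human corrections are
# stored per topic and replayed into the agent's working context on
# later tasks.
class FeedbackMemory:
    def __init__(self) -> None:
        self._notes = defaultdict(list)   # topic -> remembered corrections

    def remember(self, topic: str, correction: str) -> None:
        self._notes[topic].append(correction)

    def recall(self, topic: str) -> list[str]:
        return list(self._notes[topic])

memory = FeedbackMemory()
memory.remember("deployments", "Run the staging smoke tests before tagging a release.")
memory.remember("style", "This repo uses snake_case for API route names.")

# Before starting a new deployment task, prior corrections are pulled in.
for note in memory.recall("deployments"):
    print("Remembered:", note)
```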
Confidence Estimation
Agents must know when they understand a task well enough to execute autonomously vs when to ask for help.
"Rather than just going off and doing things immediately, you have to be able to say, okay, I'm quite sure that this is the task and I'm going to go execute it now versus I don't understand what's going on. Human, please give me help."
Self-Testing & Iteration
The era where testing "gets really really important." Agents need iterative loops to validate their own work before delivering PRs.
"It's just a much higher context problem to solve... is this testing itself?"
Human-AI Collaboration: The 10-20% Supervision Rule
Often you want to be able to have points where you closely monitor Devin for 10% of the task, 20% of the task and then have it do work on its own for the other 80-90%.
This is the optimal collaboration pattern for AI agents in 2024/2025: close supervision at key decision points, autonomy for execution. Not "hands-off" but "selective oversight." The pattern is sketched in code after the breakdown below.
Watch (00:12:42)
Key Decision Points
Monitor at 10-20%: task understanding, architecture decisions, approach selection
Autonomous Execution
Let agent work independently for 80-90%: implementation, testing, refinement
Final Review
Human validates final output, provides feedback for learning
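Here is a small sketch of that checkpoint pattern, assuming a task split into named phases where only the decision-heavy ones require human sign-off. The phase list and approval hook are illustrative, not a description of Devin's workflow.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Phase:
    name: str
    needs_human_checkpoint: bool

PHASES = [
    Phase("Confirm task understanding", needs_human_checkpoint=True),   # ~10-20% of the work
    Phase("Agree on approach / architecture", needs_human_checkpoint=True),
    Phase("Implementation", needs_human_checkpoint=False),              # ~80-90% runs autonomously
    Phase("Self-testing and refinement", needs_human_checkpoint=False),
    Phase("Final review of the PR", needs_human_checkpoint=True),
]

def run_task(approve: Callable[[str], bool]) -> None:
    for phase in PHASES:
        if phase.needs_human_checkpoint and not approve(phase.name):
            print(f"Stopped at checkpoint: {phase.name}")
            return
        print(f"Completed: {phase.name}")

# Example run: auto-approve every checkpoint (a human would do this in practice).
run_task(lambda name: True)
```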
What's Next: Beyond Single Tasks
The Next Frontier: Project-Level Orchestration
Scott teases what comes after Tier 4 (full AI engineer): moving from single autonomous tasks to entire project execution.
"Now what we're thinking about is hey maybe if instead of doing it just one task it's you know how how do we think about tackling an entire project right and after we do a project you know what what goes after that"
Implication: The shift from "build this feature" to "own this project." AI agents that can coordinate multiple features, manage dependencies, and understand project-level objectives.
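As a hypothetical illustration of project-level orchestration, the sketch below models a project as tasks with dependencies and dispatches each task only after its prerequisites finish, using Python's standard graphlib. The project and task names are invented.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on; the orchestrator
# dispatches tasks (to agents) in an order that respects dependencies.
project = {
    "design schema": set(),
    "build API": {"design schema"},
    "build frontend": {"build API"},
    "write end-to-end tests": {"build API", "build frontend"},
}

for task in TopologicalSorter(project).static_order():
    print("Dispatch to agent:", task)
```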
Short-Term (Next 6 Months)
- Improved self-testing and validation
- Better knowledge/memory systems
- Enhanced confidence estimation
- More sophisticated debugging
Medium-Term (6-12 Months)
- Project-level task orchestration
- Multi-repo understanding
- Architectural decision-making
- "16-64x" capabilities realized
Key Takeaways for Engineering Leaders
1. Exponential Growth is Real
The 70-day doubling cycle has been consistent for "at least the last couple years." Plan for 16-64x capability improvements annually.
2. Bottlenecks Shift Every 2-3 Months
What limits AI capabilities changes rapidly. Don't over-optimize for today's bottleneck—focus on adaptable infrastructure.
3. Validation is the Limiter
"The limiter is not the capability of the coding agent. The limit is your organization's validation criteria." Invest in tests, linting, types.
4. 10-20% Supervision Model
Optimal human-AI collaboration: close oversight at decision points, autonomy for execution. Not hands-off, but selective.
5. From Tab Completion to Teammate
18 months from autocomplete to autonomous engineer. The shift from "tool" to "teammate" happened faster than anyone predicted.
6. Testing is the New Bottleneck
"The era where testing and this asynchronous testing gets really really important." Self-testing capabilities are now critical.
Source Video
Devin 2.0 and the Future of SWE
Scott Wu, CEO of Cognition • AI Engineer Summit
Research Note: All quotes in this report are timestamped and link to exact moments in the video for validation. This analysis was conducted using multi-agent transcript analysis with fact-checking against external sources.
Disclaimer: "Moore's Law for AI Agents" is Scott Wu's framework, not an industry-standard metric. Quantitative claims (70-day doubling, 16-64x growth) represent Cognition's internal observations and should be attributed to the speaker rather than presented as objective facts.