A Year of Gemini Progress + What's Next
Logan Kilpatrick from Google DeepMind takes us through a transformative year — roughly 10 years of progress in 12 months, 50x inference growth, a final Gemini 2.5 Pro update, and organizational evolution — plus an exclusive look at what's next: the universal assistant vision, omnimodal models, agentic AI, and developer platform expansions.
"The formula is simple: bring the best people together, find infra advantages, and ship."Logan Kilpatrick, Google DeepMind (00:14:20)
Years in 12 months
Inference growth
2.5 Pro benchmarks
Platform announcements
The Year in Review
10 years of progress packed into 12 months
The pace of innovation has been extraordinary. From organizational restructuring to massive inference scale growth, Google has compressed what feels like a decade of progress into a single year.
"It feels like 10 years of Gemini stuff packed into the last 12 months, which has been awesome."
Opening reflection on the pace of innovation
Watch (00:07:15)"A 50x increase in the amount of AI inference that's being processed through Google servers from one year ago to last month"
Sundar Pichai's I/O slide showing adoption metrics
Watch (00:09:30)"All of these different research bets across DeepMind coming together to like build this incredible mainline Gemini model... all that actually ends up upstreaming into the mainline models."
How AlphaProof, AlphaGeometry, and other research improve mainline Gemini
Watch (00:08:45)
Organizational Evolution
How Google restructured for AI velocity
A critical but often overlooked aspect of AI progress is organizational structure. Google's two-phase consolidation of research and product teams into DeepMind created unprecedented collaboration and velocity.
Phase 1: Late 2023/Early 2024
Brought Google's different AI research teams together and charted a new direction for DeepMind: not just theoretical research, but building models and delivering them to the rest of Google and the world.
Phase 2: Early 2024
Brought product teams INTO DeepMind. DeepMind now creates the models, does the research, builds products, and delivers them to the world, with two product lines: the Gemini app (consumer) and the Gemini API (developer).
"Google brought a bunch of those teams together and charted this new direction for the DeepMind team to not only just like do theoretical foundational research but also to like build models and deliver them to the rest of Google and also the external world."
Phase 1: Late 2023/early 2024 - bringing AI teams together
Watch (00:11:20)"We took the second step of that journey earlier this year which was actually bringing the product teams into DeepMind. So now DeepMind creates the models, does the research but then also builds products and delivers those to the world."
Phase 2: Early 2024 - integrating product teams into DeepMind
Watch (00:12:00)
"It's been like personally for me super fun to get to collaborate with our research team and like help actually be on the frontier with them and bring new models and capabilities to the world."
Collaboration between research and product teams
Watch (00:12:45)
Key Announcements
New models, APIs, and platform updates
Logan made several significant announcements during the talk, including a new Gemini 2.5 Pro final update and major expansions to the developer platform coming soon.
Gemini 2.5 Pro Final Update
SOTA on the Aider and HLE (Humanity's Last Exam) benchmarks. Closes gaps from previous versions. Available at ai.dev, in the Gemini app, and via the API.
50x Inference Growth
Year-over-year increase in AI inference through Google servers. Massive adoption of Gemini models.
Veo Model 'Burning TPUs Down'
SOTA across multiple benchmarks. Extreme demand consuming all available TPU capacity.
Deep Research API
Consumer Deep Research adapted into developer API for autonomous research tasks.
Veo 3 & Imagen 4
Next-gen video and image generation models coming to the API 'very very very soon'.
Embeddings API
State-of-the-art Gemini embeddings model rolling out for RAG applications.
What's Next: Gemini App Strategy
Universal assistant and the thread that unifies Google
The Gemini app is evolving into something far more ambitious than a chatbot — it's becoming the connective tissue that unifies all of Google's products and services.
Universal Assistant
Unlike the passive Google Account of the past, Gemini becomes the stateful thread connecting all Google products. The future of Google looks like Gemini as the unifying layer.
Proactive AI
Most AI products today require user initiation. The next frontier is proactive AI that anticipates needs and acts autonomously. This is the next major UX shift.
"The Gemini app is trying to be this universal assistant... I think now we're seeing with Gemini that it's actually this thread that unifies all of Google. And I think the future for Google is going to look a lot like Gemini is this sort of thread that brings all of our stuff together."
Gemini as the connective tissue across all Google products
Watch (00:18:30)"I think the one that I'm most excited about is proactivity. I think most AI products today are still very like you have to go and do all the work as the user. And I think this proactive next step of AI systems and models coming into play is going to be awesome to see."
Proactive AI as the next major UX shift
Watch (00:20:50)
What's Next: Model Development
Omnimodal, agentic by default, and reasoning
Gemini was originally built as a single multimodal model for audio, image, and video. Significant progress has been made on that front, and the direction from here is models that are omnimodal, more systematic, and agentic by default.
Omnimodal Model
Native audio capabilities (TTS, audio understanding) powering Astra and Gemini Live. The SOTA Veo model will integrate into mainline Gemini. Diffusion experiments targeting very high tokens/sec.
Agentic by Default
Models are becoming more systematic themselves: the reasoning step absorbs what used to be external scaffolding, so the models do more and scaffolding becomes part of reasoning.
Infinite Context
Current attention paradigm doesn't scale to infinite context. New innovations needed beyond transformers to continue scaling context windows.
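The infinite-context point can be made concrete: standard self-attention materializes an n × n score matrix per head, so memory (and compute) grow quadratically with context length. A small back-of-the-envelope sketch — the head count and dtype size below are illustrative assumptions, not Gemini's actual configuration:

```python
def attention_score_bytes(seq_len: int, num_heads: int = 16, dtype_bytes: int = 2) -> int:
    """Bytes needed for one layer's seq_len x seq_len attention score matrices.

    Vanilla self-attention scores every token against every other token,
    so this cost grows quadratically with context length.
    """
    return num_heads * seq_len * seq_len * dtype_bytes

# Doubling the context quadruples the score-matrix memory.
assert attention_score_bytes(4096) == 4 * attention_score_bytes(2048)

# At 1M tokens, a single layer's score matrices alone would need ~32 TB
# (16 heads * (1e6)^2 * 2 bytes) under these assumptions — which is why
# naive attention cannot simply be scaled toward "infinite" context.
print(attention_score_bytes(1_000_000) / 1e12, "TB")
```

The quadratic blow-up is the reason "just scale it up" stops working, and why the talk points to innovations beyond the current attention paradigm.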
"When Gemini was originally created, it was built to be a single multimodal model to do audio, image, video, etc... We've made a lot of progress on that. At IO this year, we announced native audio capabilities in Gemini... It's powering the Astro experience. It's powering Gemini Live."
Omnimodal progress and native audio capabilities
Watch (00:24:00)
"It's becoming very clear to me that the models are becoming more systematic themselves, like they're doing more and more. And I think the reasoning step is this like really interesting place in which a lot of that's going to happen."
Models becoming systematic - agentic by default direction
Watch (00:26:15)"I think the current model paradigm doesn't work for infinite context. I think it's just like impossible to scale up. Attention doesn't work that way. So I think there'll be some new innovations to hopefully help let people continue to scale up the amount of context that they're bringing in."
Challenges with infinite context and attention mechanisms
Watch (00:27:45)
What's Next: Developer Platform
APIs, tools, and AI Studio evolution
Google is making significant investments in the developer platform with new APIs and a strategic repositioning of AI Studio as a pure developer platform.
Embeddings API
Gemini embeddings model (state-of-the-art) rolling out broadly. Powers most RAG applications. "Feels like early AI stuff but still super important."
Deep Research API
Consumer Deep Research adapted into bespoke developer API. Many interesting products built around autonomous research tasks.
Veo 3 & Imagen 4
Next-gen video (Veo 3) and image (Imagen 4) generation models coming to the API "very very very soon."
AI Studio Repositioning
Moving from consumer-y feel to pure developer platform. Agents built in, Jules (coding agent) native integration.
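Of these, the embeddings API is the most code-adjacent: RAG retrieval reduces to embedding documents and queries as vectors, then ranking documents by cosine similarity to the query. A minimal sketch with hand-made toy vectors standing in for real embedding-model output — the vectors, document names, and helper functions are illustrative assumptions, not the Gemini embeddings API:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec: list[float], doc_vecs: dict[str, list[float]], top_k: int = 1) -> list[str]:
    """Return the top_k document ids ranked by similarity to the query."""
    ranked = sorted(doc_vecs.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy 3-dimensional "embeddings"; a real system would get these from an
# embedding model rather than writing them by hand.
docs = {
    "pricing":  [0.9, 0.1, 0.0],
    "api_docs": [0.1, 0.9, 0.1],
    "history":  [0.0, 0.1, 0.9],
}
query = [0.2, 0.8, 0.1]  # closest in direction to "api_docs"
print(retrieve(query, docs))  # -> ['api_docs']
```

In a real RAG pipeline, the vectors would come from an embedding model and the top-ranked documents would be passed to the generation model as grounding context.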
"We have a Gemini embeddings model which is state-of-the-art. So excited to be rolling that out to developers more broadly in the next couple of weeks."
SOTA embeddings API for RAG applications
Watch (00:29:00)
"We're finding ways to bring a bunch of that together into a like bespoke deep research API which will be awesome."
Deep Research API coming for autonomous research tasks
Watch (00:29:45)"And then V3 and Imagine 4 in the API as well. So hopefully we'll see that very very very soon."
Veo 3 and Imagen 4 coming to API
Watch (00:30:15)"AI studio just to be very clear is being built as a developer platform... we'll sort of move away from this like kind of consumerry feel and move much more towards being a developer platform which I'm personally very excited about because I think that's what developers want from us."
AI Studio repositioning as pure developer platform
Watch (00:31:30)
Key Takeaways
Practical insights from Gemini's evolution
1. Organizational Strategy Matters
Research + Product Collaboration
- Two-phase consolidation: research teams brought together first, then product teams into DeepMind
- Research bets (AlphaProof, AlphaGeometry) upstream into mainline models
- Collaboration between research and product creates competitive advantage
- Formula: bring the best people together, find infra advantages, and ship
2. Incredible Scale Achievement
50x Inference Growth
- 10 years of progress compressed into 12 months
- 50x increase in AI inference through Google servers (year-over-year)
- Veo model 'burning all the TPUs down' with demand
- Massive adoption across the Google ecosystem and external developers
3. Universal Assistant Vision
Gemini as Google's Unifying Thread
- Gemini becomes connective tissue across all Google products
- Replaces passive Google Account with stateful AI thread
- Proactive AI as next frontier beyond reactive prompt-response
- Future Google looks like Gemini unifying everything
4. Developer Platform Expansions
Major API Announcements
- SOTA embeddings API rolling out broadly for RAG
- Deep Research API for autonomous research tasks
- Veo 3 and Imagen 4 coming to the API very soon
- AI Studio repositioning as pure developer platform
Source Video
A year of Gemini progress + what comes next
Logan Kilpatrick • Developer stuff at Google DeepMind
Research Note: All quotes in this report are timestamped and link to exact moments in the video for validation. This analysis covers Google's Gemini progress over the past year, organizational changes, new model announcements (Gemini 2.5 Pro, Veo), and what's next for the Gemini app (universal assistant, proactive AI), models (omnimodal, agentic by default, reasoning), and the developer platform (embeddings API, Deep Research API, Veo 3, Imagen 4, AI Studio repositioning).
Key Concepts: Gemini 2.5 Pro, 50x inference growth, universal assistant, proactive AI, omnimodal models, agentic by default, reasoning, infinite context, embeddings API, Deep Research API, Veo 3, Imagen 4, AI Studio, organizational consolidation, research upstreaming, AlphaProof, AlphaGeometry, TPUs