A Year of Gemini Progress + What's Next
Logan Kilpatrick from Google DeepMind takes us through a transformative year — roughly 10 years of progress in 12 months, 50x inference growth, a final Gemini 2.5 Pro update, and organizational evolution — plus an exclusive look at what's next: the universal assistant vision, omnimodal models, agentic AI, and developer platform expansions.
"The formula is simple: bring the best people together, find infra advantages, and ship."Logan Kilpatrick, Google DeepMind (00:14:20)
Years in 12 months
Inference growth
2.5 Pro benchmarks
Platform announcements
The Year in Review
10 years of progress packed into 12 months
The pace of innovation has been extraordinary. From organizational restructuring to massive inference scale growth, Google has compressed what feels like a decade of progress into a single year.
"It feels like 10 years of Gemini stuff packed into the last 12 months, which has been awesome."
Opening reflection on the pace of innovation
Watch (00:07:15)"A 50x increase in the amount of AI inference that's being processed through Google servers from one year ago to last month"
Sundar Pichai's I/O slide showing adoption metrics
Watch (00:09:30)"All of these different research bets across DeepMind coming together to like build this incredible mainline Gemini model... all that actually ends up upstreaming into the mainline models."
How AlphaProof, AlphaGeometry, and other research improve mainline Gemini
Watch (00:08:45)
Organizational Evolution
How Google restructured for AI velocity
A critical but often overlooked aspect of AI progress is organizational structure. Google's two-phase consolidation of research and product teams into DeepMind created unprecedented collaboration and velocity.
Phase 1: Late 2023/Early 2024
Brought Google's different AI research teams together and charted a new direction for DeepMind: not just theoretical research, but building models and delivering them to the rest of Google and the world.
Phase 2: Early 2024
Brought product teams INTO DeepMind. DeepMind now creates the models, does the research, builds products, and delivers them to the world, with two product lines: the Gemini app (consumer) and the Gemini API (developer).
"Google brought a bunch of those teams together and charted this new direction for the DeepMind team to not only just like do theoretical foundational research but also to like build models and deliver them to the rest of Google and also the external world."
Phase 1: Late 2023/early 2024 - bringing AI teams together
Watch (00:11:20)"We took the second step of that journey earlier this year which was actually bringing the product teams into DeepMind. So now DeepMind creates the models, does the research but then also builds products and delivers those to the world."
Phase 2: Early 2024 - integrating product teams into DeepMind
Watch (00:12:00)
"It's been like personally for me super fun to get to collaborate with our research team and like help actually be on the frontier with them and bring new models and capabilities to the world."
Collaboration between research and product teams
Watch (00:12:45)
Key Announcements
New models, APIs, and platform updates
Logan made several significant announcements during the talk, including a new Gemini 2.5 Pro final update and major expansions to the developer platform coming soon.
Gemini 2.5 Pro Final Update
SOTA on the Aider and HLE (Humanity's Last Exam) benchmarks. Closes gaps from previous versions. Available at ai.dev, in the Gemini app, and via the API.
50x Inference Growth
Year-over-year increase in AI inference through Google servers. Massive adoption of Gemini models.
Veo Model 'Burning TPUs Down'
SOTA across multiple benchmarks. Extreme demand consuming all available TPU capacity.
Deep Research API
Consumer Deep Research adapted into developer API for autonomous research tasks.
Veo 3 & Imagen 4
Next-gen video and image generation models coming to the API 'very very very soon'.
Embeddings API
State-of-the-art Gemini embeddings model rolling out for RAG applications.
What's Next: Gemini App Strategy
Universal assistant and the thread that unifies Google
The Gemini app is evolving into something far more ambitious than a chatbot — it's becoming the connective tissue that unifies all of Google's products and services.
Universal Assistant
Unlike the passive Google Account of the past, Gemini becomes the stateful thread connecting all Google products. The future of Google looks like Gemini as the unifying layer.
Proactive AI
Most AI products today require user initiation. The next frontier is proactive AI that anticipates needs and acts autonomously. This is the next major UX shift.
"The Gemini app is trying to be this universal assistant... I think now we're seeing with Gemini that it's actually this thread that unifies all of Google. And I think the future for Google is going to look a lot like Gemini is this sort of thread that brings all of our stuff together."
Gemini as the connective tissue across all Google products
Watch (00:18:30)"I think the one that I'm most excited about is proactivity. I think most AI products today are still very like you have to go and do all the work as the user. And I think this proactive next step of AI systems and models coming into play is going to be awesome to see."
Proactive AI as the next major UX shift
Watch (00:20:50)
What's Next: Model Development
Omnimodal, agentic by default, and reasoning
Gemini was originally built as a single multimodal model for audio, image, and video. Significant progress has been made on that front, and the direction from here is models that are omnimodal, more systematic, and agentic by default.
Omnimodal Model
Native audio capabilities (TTS, audio understanding) powering Astra and Gemini Live. The SOTA Veo model will integrate into mainline Gemini. Diffusion experiments targeting very high tokens/sec.
Agentic by Default
Models are becoming more systematic themselves: the reasoning step absorbs what used to be external scaffolding, so the models do more and scaffolding becomes part of reasoning.
Infinite Context
Current attention paradigm doesn't scale to infinite context. New innovations needed beyond transformers to continue scaling context windows.
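The infinite-context point can be made concrete: standard self-attention materializes an n × n score matrix per head, so memory (and compute) grow quadratically with context length. A small back-of-the-envelope sketch — the head count and dtype size below are illustrative assumptions, not Gemini's actual configuration:

```python
def attention_score_bytes(seq_len: int, num_heads: int = 16, dtype_bytes: int = 2) -> int:
    """Bytes needed for one layer's seq_len x seq_len attention score matrices.

    Vanilla self-attention scores every token against every other token,
    so this cost grows quadratically with context length.
    """
    return num_heads * seq_len * seq_len * dtype_bytes

# Doubling the context quadruples the score-matrix memory.
assert attention_score_bytes(4096) == 4 * attention_score_bytes(2048)

# At 1M tokens, a single layer's score matrices alone would need ~32 TB
# (16 heads * (1e6)^2 * 2 bytes) under these assumptions — which is why
# naive attention cannot simply be scaled toward "infinite" context.
print(attention_score_bytes(1_000_000) / 1e12, "TB")
```

The quadratic blow-up is the reason "just scale it up" stops working, and why the talk points to innovations beyond the current attention paradigm.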
"When Gemini was originally created, it was built to be a single multimodal model to do audio, image, video, etc... We've made a lot of progress on that. At IO this year, we announced native audio capabilities in Gemini... It's powering the Astro experience. It's powering Gemini Live."
Omnimodal progress and native audio capabilities
Watch (00:24:00)
"It's becoming very clear to me that the models are becoming more systematic themselves, like they're doing more and more. And I think the reasoning step is this like really interesting place in which a lot of that's going to happen."
Models becoming systematic - agentic by default direction
Watch (00:26:15)"I think the current model paradigm doesn't work for infinite context. I think it's just like impossible to scale up. Attention doesn't work that way. So I think there'll be some new innovations to hopefully help let people continue to scale up the amount of context that they're bringing in."
Challenges with infinite context and attention mechanisms
Watch (00:27:45)
What's Next: Developer Platform
APIs, tools, and AI Studio evolution
Google is making significant investments in the developer platform with new APIs and a strategic repositioning of AI Studio as a pure developer platform.
Embeddings API
Gemini embeddings model (state-of-the-art) rolling out broadly. Powers most RAG applications. "Feels like early AI stuff but still super important."
Deep Research API
Consumer Deep Research adapted into bespoke developer API. Many interesting products built around autonomous research tasks.
Veo 3 & Imagen 4
Next-gen video (Veo 3) and image (Imagen 4) generation models coming to the API "very very very soon."
AI Studio Repositioning
Moving from consumer-y feel to pure developer platform. Agents built in, Jules (coding agent) native integration.
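Of these, the embeddings API is the most code-adjacent: RAG retrieval reduces to embedding documents and queries as vectors, then ranking documents by cosine similarity to the query. A minimal sketch with hand-made toy vectors standing in for real embedding-model output — the vectors, document names, and helper functions are illustrative assumptions, not the Gemini embeddings API:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec: list[float], doc_vecs: dict[str, list[float]], top_k: int = 1) -> list[str]:
    """Return the top_k document ids ranked by similarity to the query."""
    ranked = sorted(doc_vecs.items(),
                    key=lambda kv: cosine_similarity(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]

# Toy 3-dimensional "embeddings"; a real system would get these from an
# embedding model rather than writing them by hand.
docs = {
    "pricing":  [0.9, 0.1, 0.0],
    "api_docs": [0.1, 0.9, 0.1],
    "history":  [0.0, 0.1, 0.9],
}
query = [0.2, 0.8, 0.1]  # closest in direction to "api_docs"
print(retrieve(query, docs))  # -> ['api_docs']
```

In a real RAG pipeline, the vectors would come from an embedding model and the top-ranked documents would be passed to the generation model as grounding context.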
"We have a Gemini embeddings model which is state-of-the-art. So excited to be rolling that out to developers more broadly in the next couple of weeks."
SOTA embeddings API for RAG applications
Watch (00:29:00)
"We're finding ways to bring a bunch of that together into a like bespoke deep research API which will be awesome."
Deep Research API coming for autonomous research tasks
Watch (00:29:45)"And then V3 and Imagine 4 in the API as well. So hopefully we'll see that very very very soon."
Veo 3 and Imagen 4 coming to API
Watch (00:30:15)"AI studio just to be very clear is being built as a developer platform... we'll sort of move away from this like kind of consumerry feel and move much more towards being a developer platform which I'm personally very excited about because I think that's what developers want from us."
AI Studio repositioning as pure developer platform
Watch (00:31:30)
Key Takeaways
Practical insights from Gemini's evolution
1. Organizational Strategy Matters
Research + Product Collaboration
- Two-phase consolidation: research teams brought together first, then product teams into DeepMind
- Research bets (AlphaProof, AlphaGeometry) upstream into mainline models
- Collaboration between research and product creates competitive advantage
- Formula: bring the best people together, find infra advantages, and ship
2. Incredible Scale Achievement
50x Inference Growth
- 10 years of progress compressed into 12 months
- 50x increase in AI inference through Google servers (year-over-year)
- Veo model 'burning all the TPUs down' with demand
- Massive adoption across the Google ecosystem and external developers
3. Universal Assistant Vision
Gemini as Google's Unifying Thread
- Gemini becomes connective tissue across all Google products
- Replaces passive Google Account with stateful AI thread
- Proactive AI as next frontier beyond reactive prompt-response
- Future Google looks like Gemini unifying everything
4. Developer Platform Expansions
Major API Announcements
- SOTA embeddings API rolling out broadly for RAG
- Deep Research API for autonomous research tasks
- Veo 3 and Imagen 4 coming to the API very soon
- AI Studio repositioning as pure developer platform
Source Video
A year of Gemini progress + what comes next
Logan Kilpatrick • Developer stuff at Google DeepMind
Research Note: All quotes in this report are timestamped and link to exact moments in the video for validation. This analysis covers Google's Gemini progress over the past year, organizational changes, new model announcements (Gemini 2.5 Pro, Veo), and what's next for the Gemini app (universal assistant, proactive AI), models (omnimodal, agentic by default, reasoning), and the developer platform (embeddings API, Deep Research API, Veo 3, Imagen 4, AI Studio repositioning).
Key Concepts: Gemini 2.5 Pro, 50x inference growth, universal assistant, proactive AI, omnimodal models, agentic by default, reasoning, infinite context, embeddings API, Deep Research API, Veo 3, Imagen 4, AI Studio, organizational consolidation, research upstreaming, AlphaProof, AlphaGeometry, TPUs