Google ships Deep Think for builders, not chatters

Google's updated Gemini 3 Deep Think landed this week in the Gemini app for Ultra subscribers, and it's the clearest signal yet that Google is done pretending one model fits all. Deep Think is not here to help you write emails or plan your weekend. It's here to spot logical flaws in math proofs and optimize crystal-growth fabrication methods. If that sentence excites you, this model was built for you.

What shipped

The updated Deep Think went live around March 26 for Google AI Ultra subscribers in the Gemini app. Alongside the consumer rollout, Google opened early API access to researchers, engineers, and enterprises through a dedicated interest form. The positioning is deliberate: Google's own examples focus on scientific reasoning and engineering workflows, not content generation or casual conversation.

This is a model that takes minutes per response, not seconds. It processes problems step by step, blending scientific knowledge with practical analysis. According to Google DeepMind's research blog, the system has already scored up to 90% on IMO-ProofBench Advanced (a PhD-level mathematics benchmark) and autonomously solved four open problems from the Erdős Conjectures database. These aren't toy results. The Erdős problems are open questions that professional mathematicians have been working on.

Google also built a math research agent called Aletheia on top of Deep Think. It generates candidate solutions, verifies them with a natural language checker, and iterates. One research paper was generated entirely without human intervention, calculating structure constants in arithmetic geometry. That's a narrow result, but a real one, and Google was careful to classify it as "publishable quality" rather than claiming a landmark breakthrough.
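To make the generate, verify, iterate pattern concrete, here is a minimal Python sketch of that kind of loop. It is not Aletheia's code, which Google has not published in this announcement. The function name, its parameters, and the toy generator and checker are all hypothetical stand-ins, and a real system would presumably feed the checker's feedback into the next round rather than just trying the next candidate.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def generate_verify_iterate(
    generate: Callable[[int], T],
    verify: Callable[[T], bool],
    max_rounds: int,
) -> Optional[T]:
    """Hypothetical loop: return the first candidate that passes the check, else None."""
    for round_index in range(max_rounds):
        candidate = generate(round_index)  # stand-in for a model proposing a solution
        if verify(candidate):              # stand-in for an independent checker
            return candidate
    return None


# Toy stand-ins (assumptions for illustration only): propose 0, 1, 2, ...
# and accept a candidate x only if it solves x^2 - 5x + 6 = 0.
solution = generate_verify_iterate(
    generate=lambda i: i,
    verify=lambda x: x * x - 5 * x + 6 == 0,
    max_rounds=10,
)
print(solution)
```

The point of the structure is that the verifier is separate from the generator, so a wrong candidate gets rejected instead of reported.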

On the enterprise side, Google framed Deep Think's ability to process messy, incomplete data as a selling point. If your organization has production reports from varied sources and formats, Deep Think is designed to pull actionable insights from that noise, according to Google's announcement.

The bigger picture: a family, not a flagship

The same week Deep Think went live, Google shipped Lyria 3 and Lyria 3 Pro, specialized music generation models. Lyria 3 Pro creates tracks up to 3 minutes long with structural awareness of intros, verses, choruses, and bridges. It's available on Vertex AI, Google AI Studio, the Gemini API, and the Gemini app for paid subscribers.

This pairing is the real story. Google is not iterating on a single flagship model anymore. It's shipping specialized models in parallel: Deep Think for hard reasoning, Lyria for music, Flash Lite for cost-sensitive high-volume tasks, and 3.1 Pro for general frontier work. Think of it like a toolbox rather than a Swiss Army knife. Each tool does one thing well instead of one tool doing everything adequately.

For comparison, the broader Gemini 3.1 Pro model (released February 20) runs at $2 per million input tokens and $12 per million output tokens, with a 1-million-token context window. Deep Think pricing for API access hasn't been publicly detailed yet, though the consumer version requires the AI Ultra subscription. Google hasn't disclosed Ultra's exact pricing publicly in the March updates, but 9to5Google's feature breakdown confirms it sits above the AI Pro tier and includes Deep Think, Project Mariner, Project Genie, 30 TB of storage, and YouTube Premium.
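For a rough sense of what those Gemini 3.1 Pro list prices mean per request, here is a back-of-the-envelope Python sketch. It assumes flat per-token rates with no tiering, caching, or other discounts, the function name is hypothetical, and it says nothing about Deep Think, whose API pricing is not public.

```python
# Rates quoted in this article for Gemini 3.1 Pro (not Deep Think).
INPUT_USD_PER_MILLION = 2
OUTPUT_USD_PER_MILLION = 12


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: flat-rate cost estimate in US dollars."""
    total_micro_usd = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total_micro_usd / 1_000_000


# Example: a 200,000-token prompt with a 50,000-token response.
print(estimate_cost_usd(200_000, 50_000))
```

At these rates output tokens cost six times as much as input tokens, so long responses dominate the bill faster than long prompts do.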

Why builders should pay attention

Deep Think occupies a lane that most AI products don't target. OpenAI's o3 reasoning models and Anthropic's Claude Opus 4.6 are strong general reasoners, but Google is making a specific bet that scientific and engineering professionals need a model tuned for their domain. The DeepMind team's work on STOC '26 conference paper reviews, progress on classic computer science problems like Max-Cut and Steiner Tree, and contributions to published research across physics and combinatorics all point in the same direction: this model is being tested against real professional work, not leaderboard games.

The question is whether API access will come with reasonable rate limits and pricing for smaller teams. Right now, early access requires expressing interest through a form, which suggests limited capacity. If you're building tools for scientific workflows, drug discovery pipelines, or engineering simulation analysis, getting on that waitlist is worth your time. If you're building a chatbot or a content tool, Deep Think is overkill and probably too slow for your use case anyway.

The verdict

Deep Think is Google's most opinionated model release in a while. It's not trying to be everything to everyone, and that's what makes it interesting. The specialization strategy, shipping Deep Think and Lyria 3 in the same release cycle, suggests Google has learned that winning the AI race doesn't mean having one model that tops every benchmark. It means having the right model for each job.

For researchers and engineers: get on the API waitlist now. The math and science results are real, and early access means you can evaluate whether this fits your workflows before broader availability.

For product builders in other domains: skip Deep Think specifically, but watch Google's specialization playbook closely. The pattern of purpose-built models is likely where the whole industry ends up.

Marcus Webb covers AI products and startups for The Daily Vibe.
