Google Releases Gemini 3.1 Pro With Stronger Reasoning

Google has released Gemini 3.1 Pro, an upgraded AI model now rolling out to developers, enterprises, and consumers. The model builds on the Gemini 3 series and marks a significant improvement in core reasoning performance.
On the ARC-AGI-2 benchmark, which tests a model’s ability to solve entirely new logic patterns, Gemini 3.1 Pro scored 77.1% — more than double the score of its predecessor, Gemini 3 Pro. On Humanity’s Last Exam, a benchmark designed to measure AI performance against human ability, the model scored 44.4%, up from Gemini 3 Pro’s 38.3%, which was the highest score of any available model at the time.
The release follows a major update to Gemini 3 Deep Think last week. That update focused on tough research challenges in chemistry, physics, math, and coding. Google describes Gemini 3.1 Pro as the core intelligence behind those improvements. The Deep Think mode scored higher on both benchmarks — 84.6% on ARC-AGI-2 and 48.4% on HLE — but it operates as a reasoning mode with longer inference times, not a standalone model.
Gemini 3.1 Pro is built for tasks that require advanced reasoning. These include synthesizing data from multiple sources, generating animated SVGs from text prompts, building live data visualizations, and translating literary themes into functional code. Google demonstrated the model building a live aerospace dashboard that visualized the International Space Station’s orbit using a public telemetry stream.
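Google did not publish the dashboard’s code. As a minimal sketch of the kind of telemetry handling such a dashboard depends on — assuming a feed shaped like the public Open Notify ISS position endpoint (`http://api.open-notify.org/iss-now.json`) — the parsing step might look like this, with a hardcoded sample payload standing in for a live poll:

```python
import json

# Sample payload in the shape returned by the public Open Notify
# ISS position feed. A live dashboard would poll the endpoint on a
# timer instead of using a string literal.
SAMPLE_PAYLOAD = """
{"message": "success", "timestamp": 1700000000,
 "iss_position": {"latitude": "47.6062", "longitude": "-122.3321"}}
"""

def parse_iss_position(raw: str) -> tuple[float, float]:
    """Extract (latitude, longitude) in degrees from a telemetry payload."""
    data = json.loads(raw)
    if data.get("message") != "success":
        raise ValueError("telemetry feed reported failure")
    pos = data["iss_position"]
    # Coordinates arrive as strings in this feed; convert for plotting.
    return float(pos["latitude"]), float(pos["longitude"])

lat, lon = parse_iss_position(SAMPLE_PAYLOAD)
print(f"ISS over {lat:.2f}, {lon:.2f}")
```

In the demo scenario, values like these would feed a charting layer that redraws the orbit track on each poll.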
Developers can access Gemini 3.1 Pro in preview through the Gemini API in Google AI Studio, Gemini CLI, Google Antigravity, and Android Studio. Enterprise customers can access it via Vertex AI and Gemini Enterprise. Consumer access is available through the Gemini app and NotebookLM, currently limited to Google AI Pro and Ultra plan subscribers.
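For developers trying the preview over the Gemini API’s REST surface, a request is a POST to a model’s `generateContent` method. The sketch below builds such a request with only the standard library; the model identifier string is an assumption — confirm the exact preview name in Google AI Studio — and the network call runs only if an API key is configured:

```python
import json
import os
import urllib.request

# Assumed model identifier for the preview; verify the exact string
# in Google AI Studio before use.
MODEL = "gemini-3.1-pro-preview"
ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Construct a generateContent request without sending it."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{ENDPOINT}/{MODEL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "x-goog-api-key": api_key},
        method="POST",
    )

if __name__ == "__main__":
    key = os.environ.get("GEMINI_API_KEY")
    if key:  # only hit the network when a key is present
        req = build_request("Summarize recent ISS telemetry trends.", key)
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp))
```

Separating request construction from sending keeps the payload shape easy to inspect and test before spending quota on live calls.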
Despite strong benchmark results, Gemini 3.1 Pro does not lead all industry evaluations. Anthropic’s Claude Opus 4.6 holds the top position on the Center for AI Safety text capability leaderboard, which aggregates scores across reasoning and text-based tasks. Anthropic’s Opus 4.5, Sonnet 4.5, and Opus 4.6 also outrank Gemini 3 on the same organization’s risk assessment leaderboard.
Google is making Gemini 3.1 Pro available in preview before a general release. The company said it plans to use the preview period to validate updates and to further develop the model’s agentic workflow capabilities.