Google has made another move in the race for advanced artificial intelligence with the launch of Gemini 3.1 Pro. This seemingly minor revision, judging by the version number, actually represents a significant leap forward compared to the previous Gemini 3 Pro. While the industry is accustomed to a rapid-fire series of announcements, this release is drawing attention because of the magnitude of the improvement in reasoning and how seamlessly it integrates into Google's entire ecosystem of services.
This new iteration arrives as the benchmark model within the Gemini family and is already being deployed globally, for individual users, developers, and businesses alike. More than just a name change, it represents an evolution geared towards solving tasks where a quick, superficial answer falls short: scientific problems, complex analyses, advanced programming, or creative projects that require several sequential steps.
A leap in reasoning that doesn't fit with a simple ".1"
What's striking about Gemini 3.1 Pro is that, despite being labeled an intermediate update, the data shared by Google shows an improvement previously reserved for full generation changes. In the demanding ARC-AGI-2 benchmark, designed to test whether a model can solve entirely new logical patterns it has never seen during training, the new version achieves a success rate of 77.1%.
The improvement over the previous model is drastic: Gemini 3 Pro scored around 31% on ARC-AGI-2, which means abstract reasoning performance has more than doubled. This result places Gemini 3.1 Pro above rivals like Claude Sonnet 4.6 and Opus 4.6, and ahead of OpenAI's best in these kinds of tests, marking a turning point in how AI tackles problems it cannot solve from memory alone.
Google attributes this leap largely to transferring advances from its specialized model, Gemini 3 Deep Think (focused on particularly complex scientific and research tasks), to a more general-purpose engine like 3.1 Pro. Deep Think still performs even better on ARC-AGI-2, hovering around 85%, but its computational cost is higher. With 3.1 Pro, the company is trying to square the circle: offering a more reasonable balance between power and efficiency for everyday use.
Other key tests also show signs of improvement. According to the published results, overall average performance compared to Gemini 3 Pro has increased by approximately 21%, and the advantage over OpenAI's flagship model (GPT-5.2) sits at around 16% across the set of comparable benchmarks. The focus is clearly on what most often fails when a problem ceases to be trivial: structured reasoning, multi-step planning, autonomous agents, and competitive coding.
However, the model doesn't dominate in absolutely every area. In MMLU, the classic "encyclopedia"-style knowledge benchmark, the improvement is minimal, and in specific tests like MMMU, it even falls a tenth of a point behind 3 Pro. There are also areas, such as certain tasks in real-world work environments (GDPval) or programming with intensive terminal interaction, where rivals like Claude or OpenAI maintain an advantage. Even so, in the overall picture, the balance clearly favors Google's new offering.
Benchmarks where Gemini 3.1 Pro takes the lead
Beyond ARC-AGI-2, Google and external evaluators have been breaking down how Gemini 3.1 Pro behaves in other test scenarios. In Humanity's Last Exam, without the use of external tools, the model ranks first with 44.4%, and in variants of the same test with different methodology it reaches 51.4%, above GPT-5.2 and the latest versions of Claude.
Moving to the scientific realm, the new Gemini also sits at the top of the table. In GPQA Diamond, a very strict benchmark focused on advanced science questions, it obtains 94.3%, which indicates a comfortable handling of complex technical explanations. For those working in research, engineering, or highly regulated sectors, this ability to sustain rigorous scientific reasoning is one of the points that truly makes a difference.
Programming is another area where the model advances. In LiveCodeBench Pro, a test focused on competitive coding, Gemini 3.1 Pro achieves an Elo rating of 2,887, surpassing both the previous Gemini 3 Pro and GPT-5.2. In SWE-Bench Verified, which simulates real-world fixes on GitHub repositories, the new model scores around 80.6%, practically tied with Opus 4.6. In other words, it is no longer limited to writing simple functions: it holds its own in complex software maintenance tasks.
Agent-based tests, where the model must autonomously execute chains of actions, also show a significant leap forward. In APEX-Agents, focused on long-horizon tasks, it goes from 18.4% to around 33.5%, a relative increase of more than 80%. In MCP Atlas, focused on multi-step workflows, and in BrowseComp, where the model must browse the web, search for information, and run Python code, the results jump to 69.2% and 85.9% respectively, well above the previous generation.
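As a quick sanity check, the relative jump cited for APEX-Agents follows directly from the two published scores; simple arithmetic confirms the "more than 80%" figure:

```python
# Relative improvement on APEX-Agents, using the scores quoted above.
old_score = 18.4  # Gemini 3 Pro (%)
new_score = 33.5  # Gemini 3.1 Pro (%)

relative_gain = (new_score - old_score) / old_score
print(f"Relative increase: {relative_gain:.1%}")  # prints "Relative increase: 82.1%"
```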
In the multimodal and multilingual sphere, the model also shows notable improvements. In MMMLU (multilingual question answering) it scores around 92.6%, a figure that confirms the AI understands and reasons effectively across multiple languages, something especially relevant for European markets where linguistic diversity is the norm. However, in other, more refined multimodal tests such as MMMU, progress is more modest, and in some specific cases the successor lags slightly behind its predecessor.
It is worth remembering, in any case, that benchmarks are only a partial picture. They are useful for comparing models under equal conditions, but they don't fully reflect how models behave in real-world use cases, with flawed data, ambiguous contexts, or users who mix multiple objectives in the same conversation. Google, like other companies, tends to highlight the metrics most favorable to it, so it's always a good idea to test the model on your own tasks before drawing definitive conclusions.
Beyond chat: live panels, animated SVGs, and working code
One of the clearest shifts in focus with Gemini 3.1 Pro is the type of output Google wants to prioritize. The company insists that the goal is no longer just an AI that "speaks" well in a chat, but an engine capable of generating functional results: production-ready code, automated workflows, or complex data visualizations.
Among the examples the firm has shown, one is particularly representative: the creation of a real-time aerospace dashboard which shows the orbit of the International Space Station using public telemetry. In this type of demonstration, the model not only explains what needs to be done, but also configures the data ingestion, generates the dashboard logic, and produces the code necessary to visualize it.
Much emphasis has also been placed on the model's ability to generate SVG animations from text. Instead of videos or bitmap images, 3.1 Pro returns vector code that can be embedded directly into a website or application, maintaining sharpness at any scale and consuming far fewer resources. This opens the door to interactive graphics, custom visual effects, and dynamic interfaces without relying so heavily on traditional design tools.
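To illustrate the idea (this is a hand-written sketch, not actual model output), an animated vector graphic can be just a few lines of declarative SVG, generated here from Python and embeddable as-is in any web page:

```python
# Minimal example of a self-contained animated SVG: a circle that pulses
# using a SMIL <animate> element, with no bitmap assets and crisp at any scale.
svg = """<svg xmlns="http://www.w3.org/2000/svg" width="120" height="120">
  <circle cx="60" cy="60" r="20" fill="#4285F4">
    <animate attributeName="r" values="20;40;20" dur="2s"
             repeatCount="indefinite"/>
  </circle>
</svg>"""

with open("pulse.svg", "w", encoding="utf-8") as f:
    f.write(svg)  # open pulse.svg in a browser to see the animation
```

Because the output is plain text rather than pixels, it can be versioned, diffed, and styled like any other source file.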
In the creative field, Google has shown cases where the model translates abstract descriptions into functional code, and in tools like its Imagen editor it explores workflows where the result is directly usable by designers and developers: for example, taking the "atmosphere" of a classic novel and transforming it into a web design consistent with that mood, or generating complex 3D simulations, like a flock of starlings, that the user can manipulate with their hands using tracking systems. The key is not just writing code, but understanding the user's intention, or "vibe", and reflecting it in the result.
For European developers, this focus on actionable outputs can be especially useful in projects where time is limited and there's a need to quickly move from a sketch to a prototype that compiles, deploys, and integrates with other services. Some companies that have tested the preview version report fewer roadblocks on lengthy tasks and less need to rewrite instructions repeatedly to achieve their desired results.
Integration with the Google ecosystem: the great competitive advantage
Beyond the numbers, Google's structural advantage lies not only in the fact that Gemini 3.1 Pro is very powerful, but also in that it already lives within the products millions of people use daily. Unlike rivals that rely on the user opening a specific app (ChatGPT, Claude, and others), Google benefits from controlling the main entry points to the Internet: Search, Gmail, YouTube, Android, Docs, Drive, Google Photos, and Maps, among many others.
The company is using this position to integrate its new model into familiar services like Chrome without requiring users to change their habits. In the Gemini mobile app, available in Spain and other European countries, 3.1 Pro becomes the default engine for subscribers to the Google AI Plus, Pro, or Ultra plans, while the free tier allows trying it with certain usage limitations.
It is also being deployed in NotebookLM, Google's tool for summarizing and working with long documents, where the new engine promises better synthesis and fewer errors when handling large volumes of text. In the business sector, version 3.1 Pro is offered through Vertex AI and Gemini Enterprise, so that organizations can connect it to their own data within the usual security and compliance perimeters of Google Cloud.
This integration with the ecosystem creates a "defensive moat" that's difficult for pure AI startups to replicate. Even if a rival model is slightly better in a specific benchmark, the reality is that Google doesn't need to convince users to install anything new: its AI appears in products already on their phone, browser, or email. From a strategic point of view, that factor carries as much weight as the percentages in performance charts.
The medium-term question is how to sustainably monetize this integrated intelligence without compromising the search, office, or video experience. For now, the company seems to be focusing on subscription packages that combine preferential access to AI with storage and service benefits—a formula that, at least in terms of price, is difficult for players without such a broad ecosystem to match.
Where and how can Gemini 3.1 Pro be used
On a practical level, Gemini 3.1 Pro is now available in preview across various channels. End users can access it through the Gemini and NotebookLM apps, with more generous usage limits for subscribers to paid plans. In Spain, the app is integrated into Android as the primary assistant on compatible phones and is also accessible via the web.
Developers have the model available through the Gemini API in Google AI Studio, the official CLI, and development environments such as Android Studio. From there, they can build assistants, specialized agents, technical support tools, or custom integrations with web and mobile applications. The idea is that, with the same endpoint as always, developers can now obtain much more robust reasoning.
Companies and European organizations already working on Google Cloud can consume Gemini 3.1 Pro through Vertex AI and Gemini Enterprise. This allows them to connect the model to their own data to summarize corporate documentation, automate internal processes, create advanced customer-service chatbots, or query large databases in natural language, while maintaining security, audit, and privacy controls adapted to the business environment.
In all cases, Google emphasizes that the model is still in the "preview" phase, meaning that some features are still being tested and may be adjusted over time. However, the rollout is broad enough that both home and professional users in Europe can begin trying it out without having to wait for a "final" launch.
In the educational and academic field, access via application and NotebookLM opens up interesting possibilities: students and teachers can use 3.1 Pro to summarize long texts, prepare materials, generate practical examples or review code, always with the usual precaution of checking the most sensitive data before accepting it as valid.
API Pricing and Value Strategy
One point that has generated considerable debate among developers is the pricing model of Gemini 3.1 Pro. Google has decided to essentially maintain the same pricing structure that Gemini 3 Pro already had for the API, which means the performance upgrade comes at no direct additional cost for those already working with the previous version.
The Google Cloud pricing table indicates that, for prompts of up to 200,000 context tokens, input costs remain around $2 per million tokens, with output at $12 per million. Above that context threshold, rates rise to around $4 per million input tokens and $18 per million output tokens, figures in line with what we already saw in 3 Pro.
Additionally, Google offers context caching, a feature that allows long contexts to be reused at a reduced price (around $0.20 to $0.40 per million cached tokens, plus an hourly storage fee), which can significantly lower the cost of projects with very long, repetitive prompts. A free monthly quota of queries with integrated search (Search Grounding) is also included, beyond which requests are billed in blocks of one thousand.
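Taken together, the quoted rates make per-request costs easy to estimate. The following sketch hardcodes the figures above; it ignores caching discounts and grounding fees, and actual billing should always be checked against the official Google Cloud pricing page:

```python
# Rough per-request cost at the quoted Gemini 3.1 Pro API rates
# (USD per million tokens). Prompts over 200,000 context tokens fall
# into the higher tier; caching and grounding fees are not modeled.
def request_cost(input_tokens: int, output_tokens: int) -> float:
    if input_tokens <= 200_000:
        in_rate, out_rate = 2.0, 12.0   # standard tier
    else:
        in_rate, out_rate = 4.0, 18.0   # long-context tier
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a 50,000-token prompt producing a 2,000-token answer.
print(f"${request_cost(50_000, 2_000):.3f}")  # prints "$0.124"
```

At these rates, even a fairly large prompt costs a fraction of a cent to a few cents, which explains why keeping prices flat while doubling reasoning performance matters so much to cost-sensitive teams.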
For many European startups and SMEs already scrutinizing their computing costs down to the last penny, the fact that the new model offers virtually double the reasoning power for the same price represents a direct improvement in profit margins. In other words, “reasoning per euro invested” is cheaper, something key if AI is at the heart of the product.
For end users, the approach is more focused on subscription packages that bundle premium AI access with additional storage and benefits on Google services. While the details of these packages aren't identical across all European countries, the general idea is that, for a moderate monthly fee, users can access Gemini 3.1 Pro without immediately encountering strict usage limits.
With all of the above on the table, Gemini 3.1 Pro is shaping up to be a particularly relevant step in the evolution of Google's AI: it offers an unusual leap in logical reasoning for a ".1" update, excels in several key benchmarks, maintains a competitive price for developers, and is backed by an already massive ecosystem of services in Spain and the rest of Europe. That doesn't make it a perfect tool, nor does it solve all the challenges of artificial intelligence, but it reinforces the feeling that the next important battle will be fought not over who has more parameters, but over who gets their models to think better and integrate usefully into daily life and work.