Google is implementing a significant evolution of its search architecture, transitioning from a traditional information retrieval engine to a multimodal reasoning and action system. Recent integrations, available to premium U.S. subscribers and built on the Gemini 2.5 Pro model, introduce a deep query engine and an agentic framework capable of executing real-world tasks, redefining the search engine’s operational paradigm.

Integration of Gemini 2.5 Pro for Complex Query Decomposition

The core of the update is the integration of the Gemini 2.5 Pro large language model (LLM), accessible through an experimental interface called “AI Mode.” Unlike previous models, Gemini 2.5 Pro is designed to handle queries that require multi-step reasoning and complex problem decomposition. The model is capable of:

  • Performing computational tasks: Solving mathematical and logical problems that require intermediate calculations.
  • Interpreting and generating code: Addressing programming-related prompts by providing functional code snippets.
  • Generating synthesized responses with citations: The outputs are not simple text extractions but synthesized responses accompanied by a graph of references, an approach similar to Retrieval-Augmented Generation (RAG) that mitigates the risk of hallucinations and ensures the traceability of sources (a sketch of this pattern follows below).

This transforms the search process from keyword matching and ranking into a conversational interface capable of structured problem-solving.
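To make the RAG-style citation flow concrete, here is a minimal Python sketch of the pattern: retrieve candidate passages, ground the model on them, and return the answer alongside its sources. The `Passage` class, the `retrieve` function, and the `generate` callable are illustrative placeholders, not Google's actual APIs.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    url: str
    text: str

def retrieve(query: str, k: int = 5) -> list[Passage]:
    # Hypothetical retrieval stage; a real system would query a search
    # index or vector store. Hardcoded so the sketch is self-contained.
    corpus = [
        Passage("https://example.com/a", "Gemini 2.5 Pro supports multi-step reasoning."),
        Passage("https://example.com/b", "AI Mode returns synthesized, cited answers."),
    ]
    return corpus[:k]

def synthesize_with_citations(query: str, generate) -> dict:
    passages = retrieve(query)
    # Ground the model on retrieved evidence; numbering each source lets
    # the answer cite it inline, keeping outputs traceable and reducing
    # unsupported claims (hallucinations).
    context = "\n".join(f"[{i}] {p.text}" for i, p in enumerate(passages))
    prompt = (
        "Answer using ONLY the sources below, citing them inline as [n].\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )
    return {"answer": generate(prompt), "sources": [p.url for p in passages]}

# Usage with a stand-in generator; swap in any real LLM API wrapper.
print(synthesize_with_citations("What does AI Mode do?",
                                lambda p: "It synthesizes cited answers [1]."))
```

The essential design point is that the model never answers from parametric memory alone: every claim can be traced back to a numbered source in the retrieval set.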

Deep Search: A Federated Query and Synthesis Engine

In parallel, Google is introducing Deep Search, an advanced search architecture designed for open-ended or ambiguous prompts that cannot be satisfied by a single query. Its operation is based on:

  • Parallelized query execution: The system launches hundreds of exploratory queries simultaneously to map a broader knowledge domain.
  • Integration of heterogeneous sources: Deep Search can aggregate and correlate data from unstructured (web text), semi-structured (tables, lists), and structured (Knowledge Graph) sources.
  • Synthesis and reporting: The output is not a list of links (a SERP) but a structured analytical report that summarizes the gathered information, highlights connections, and cites verified sources.

This tool is optimized for complex analytical tasks, such as market research, financial planning, or academic literature reviews, reducing the cognitive load and manual data aggregation required from the user; the fan-out pattern is sketched below.
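The fan-out behavior can be illustrated with a short, self-contained sketch. The `expand` and `fetch` helpers below are assumptions standing in for Google's internal query-expansion and retrieval backends; the key idea is that sub-queries run concurrently rather than sequentially, before a synthesis step merges them into one report.

```python
import asyncio

def expand(prompt: str) -> list[str]:
    # Hypothetical query-expansion step: derive narrower sub-queries from
    # an open-ended prompt (a production system would generate hundreds).
    return [f"{prompt} overview", f"{prompt} statistics", f"{prompt} criticisms"]

async def fetch(sub_query: str) -> dict:
    # Stand-in for one exploratory search; a real system would hit web,
    # semi-structured, and Knowledge Graph backends here.
    await asyncio.sleep(0)  # simulate network I/O
    return {"query": sub_query, "findings": f"results for {sub_query!r}"}

async def deep_search(prompt: str) -> str:
    sub_queries = expand(prompt)
    # Launch every sub-query concurrently instead of one at a time.
    results = await asyncio.gather(*(fetch(q) for q in sub_queries))
    # Synthesis step: fold per-query findings into a single report; a real
    # system would have an LLM write the analytical summary with sources.
    lines = [f"- {r['query']}: {r['findings']}" for r in results]
    return "Report:\n" + "\n".join(lines)

print(asyncio.run(deep_search("EV battery market")))
```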

Agentic Framework for Real-World Task Execution

The most disruptive innovation is the introduction of agentic capabilities, wherein the AI autonomously executes actions on behalf of the user. The automated calling feature is a direct evolution of Google's Duplex technology and relies on a multi-component technology stack:

  • Natural Language Understanding (NLU): For parsing user input and extracting key entities (e.g., service type, preferences, temporal parameters).
  • Dialog Management System: A system that governs the conversational flow with the human interlocutor, capable of handling deviations and requests for clarification.
  • Natural Language Generation (NLG) and Text-to-Speech (TTS): To produce natural, contextually appropriate speech while the agent identifies itself as an AI assistant.

The workflow involves acquiring task parameters from the user, autonomously placing the call, extracting the relevant information during the dialogue, and finally structuring this data into a synthesized summary delivered to the user via messaging APIs (SMS or email), as sketched below.
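A compressed sketch of that workflow, with each stage reduced to a stub, may help clarify the stage boundaries. Every class and function here is a hypothetical placeholder; it illustrates the NLU, dialog, extraction, and delivery hand-offs, not Duplex internals.

```python
from dataclasses import dataclass

@dataclass
class TaskParams:
    service: str     # e.g., "pet groomer"
    preference: str  # e.g., "small dog, nail trim"
    timeframe: str   # e.g., "Saturday morning"

def parse_request(utterance: str) -> TaskParams:
    # NLU stage: entity extraction from the user's request. Hardcoded for
    # the sketch; a real system would use a trained parser or an LLM.
    return TaskParams("pet groomer", "small dog, nail trim", "Saturday morning")

def run_call(params: TaskParams) -> list[str]:
    # Dialog-management + NLG/TTS stage, reduced to a canned transcript.
    # Note the agent identifies itself as an AI assistant up front.
    return [
        "AGENT: Hi, this is an automated Google assistant calling for a customer.",
        f"AGENT: Any availability for a {params.service} on {params.timeframe}?",
        "BUSINESS: Yes, Saturday at 10am works; it's $40.",
    ]

def extract_outcome(transcript: list[str]) -> dict:
    # Information-extraction stage; trivially hardcoded here.
    return {"available": True, "slot": "Saturday 10am", "price": "$40"}

def deliver(outcome: dict, channel: str = "sms") -> None:
    # Delivery stage: in production this would call a messaging API.
    print(f"[{channel}] Booking option found: {outcome}")

params = parse_request("Find a groomer for my small dog on Saturday morning")
deliver(extract_outcome(run_call(params)))
```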

Furthermore, the platform provides a control interface for businesses via Google Business Profile, allowing them to opt out of the AI calling service, a crucial aspect of the system's governance and adoption. The release strategy, limited to specific categories and geographies, follows an A/B testing approach to validate the system's robustness and acceptance before any large-scale deployment.
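As an illustration of where such a control would sit, the following sketch gates the calling agent on a per-business opt-out flag. The flag store and lookup are assumptions for the example, not the actual Business Profile API.

```python
# Hypothetical per-business opt-out store; in production this would be a
# setting managed through Google Business Profile, not an in-memory set.
OPTED_OUT = {"biz-123"}

def may_call(business_id: str) -> bool:
    # Governance check: never place an automated call to a business that
    # has opted out of the AI calling service.
    return business_id not in OPTED_OUT

for biz in ("biz-123", "biz-456"):
    print(biz, "->", "call allowed" if may_call(biz) else "opted out, skip")
```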