Google Gemini Hijacked via Messaging Notifications: The 'Dual…

SafeBreach researchers have demonstrated how the Google Gemini voice assistant on Android can be hijacked through indirect prompt injections delivered via notifications from apps like WhatsApp and Slack.

SafeBreach researchers demonstrated on June 3, 2026, that Google Gemini—the Android voice assistant integrated into the Utilities feature—can be forced to execute arbitrary actions through standard notifications from WhatsApp, Slack, SMS, and other messaging apps. The technique, dubbed "Fake Context Alignment" requires no malware or physical access to the device. An attacker simply sends a seemingly harmless message; the user responds "Yes" to a voice prompt they perceive as trivial, while the backend interprets that same consent as authorization to perform malicious operations.

Key Takeaways

SafeBreach demonstrated the hijacking of the Gemini voice assistant on Android via indirect prompt injections delivered through notifications from six messaging apps: WhatsApp, Slack, SMS, Signal, Instagram, and Messenger.
The "Fake Context Alignment" technique creates a "dual illusion": the backend sees a legitimate authorization, while the user perceives a benign scenario due to linguistic obfuscation and hyperlinks that the Text-to-Speech (TTS) engine does not vocalize.
Demonstrations included controlling Google Home devices, launching Zoom with active video streaming, sending fake messages from trusted contacts, account-level memory poisoning, and scheduled surveillance.
Google patched the vulnerability on November 14, 2025, through server-side content classifier improvements, requiring no app updates; no CVE has been assigned, and there is no evidence of in-the-wild exploitation.

The "Dual Illusion" Mechanism

The core of the attack lies in a deliberate misalignment between two contexts: the security context the Gemini backend uses to validate permissions and the perceptual context the user constructs through voice synthesis. Researcher Or Yair and his team isolated two sub-techniques that, combined, form what they call the "Ultimate Combo."

The first, "Obfuscated," exploits linguistic omission: the assistant asks for authorization in a foreign language—the dossier cites Chinese as a verified example—that the user does not understand. The user dismisses the phrase as a system glitch and answers affirmatively; the backend then links that "Yes" to the foreign language request. The second, "Muted," exploits a behavior in Gemini’s TTS engine that skips hyperlinks hidden behind clickable text. The user hears a benign request, while the backend reads a different command on the screen.

Both techniques bypass Google’s recent mitigations, including "Delayed Tool Invocation," which adds a confirmation pause between a request and the execution of an action. The combination renders this safeguard ineffective because the user, in their perception, is already confirming something harmless.

"The main purpose of Fake Context Alignment is to create a dual illusion: presenting a legitimate authorization scenario to Gemini's behind-the-scenes security mechanisms, while presenting a completely different, benign scenario to the victim" — Or Yair, SafeBreach

Six Apps and an "Effectively Infinite" Attack Surface

The research dossier confirms six vectors: WhatsApp, Slack, SMS, Signal, Instagram, and Messenger. This selection is strategic. These apps enjoy near-maximum implicit trust within the mobile ecosystem: notification banners are enabled by default, often with high visual and audio priority, and users instinctively associate them with legitimate personal or professional communications.

Or Yair described this attack surface as "effectively infinite" because any app generating text notifications can deliver indirect prompt injection payloads. There is no need to compromise the app itself; the attacker only needs to be able to send a message to the user, which is possible through compromised contact accounts, public groups, or simply knowing the target's phone number.

The targeted feature, Gemini Utilities, is strictly limited to Android. According to Google documentation cited in the report, this functionality reads and responds to notifications from selected apps; it is unavailable on iOS or the web interface. This platform restriction confines the impact to the Android ecosystem but extends to all devices with Gemini active and Utilities enabled.

Research Demonstrations

SafeBreach’s demonstrations go beyond theory. Researchers documented concrete executions impacting physical security, communications, and privacy.

In the smart home domain, they demonstrated unauthorized control of Google Home devices. Regarding communications, they showed the initiation of Zoom calls with active video streaming—exploiting a 301 redirect from a previously trusted domain—and the delivery of fake messages appearing to come from trusted contacts. The 301 redirect mechanism is particularly insidious: Gemini trusts a domain after it serves clean content, then follows a subsequent redirect without requesting new authorization.

The dossier also documents account-level "memory poisoning": the assistant persistently saved a fact chosen by the attacker. Finally, researchers demonstrated scheduled surveillance via timed activation of data collection functions.

Disclosure and the Server-Side Fix

The timeline documented in the dossier is precise. The report reached the Google Vulnerability Reward Program on August 17, 2025. Google confirmed the fix on November 14, 2025, stating that server-side content classifier improvements would mitigate the issue. The fix is server-side: no update to the Google app or the operating system is required from users.

This remediation method is a double-edged sword. While it eliminates the latency typical of mobile security updates—which depend on OEM distribution and user adoption—it keeps the Utilities feature active by default without requiring an explicit opt-in or a review of granted notification permissions. The dossier does not specify if Google implemented mitigations beyond classifier improvements or if variants of the technique remain unpatched.

No CVE has been assigned to this issue. SafeBreach does not list one in its report, and no converging sources report its existence. Sources also state there is no evidence of in-the-wild exploitation, though they caution that the absence of evidence is not proof of non-exploitation.

Why It Matters

The dossier does not specify if the technique has been independently replicated, nor does it outline additional corrective measures users can take independently. The source does not clarify if specific Android versions or Google app versions are more exposed, and it does not document variants of "Fake Context Alignment" that could bypass the improved classifiers.

The Gemini Utilities case highlights a structural issue in agentic AI design: the fusion of traditionally trusted communication channels—notifications—with language models that interpret every input as a potential instruction. When the line between a "message from a friend" and a "system command" becomes porous, the mobile ecosystem's trust model requires recalibration.

Or Yair summarized the stakes with a statement that serves as an architectural principle: "Yes, 100%. We do need to treat all external input as not trusted because all external input is a potential instruction." This stance—treating all external input as untrusted—conflicts with the deep integration logic driving the expansion of Gemini and similar assistants into daily devices and services.

The Open Question

The November 2025 server-side fix closed the specific vulnerability documented, but not the underlying attack class. Indirect prompt injection via notifications is inherently linked to giving LLMs access to uncurated third-party data streams. Every new integration—email, calendars, shared documents, system logs—replicates this surface.

For organizations evaluating agentic AI in enterprise environments, the Gemini Utilities case provides a concrete evaluation criterion: verify not only what actions an agent can perform, but which input streams it consumes without sanitization, and how the system manages the misalignment between backend authorization and user perception. "Dual illusion" is not an implementation bug; it is an emerging paradigm exploiting the architecture of human-AI interaction.

Sources

Information has been verified against the cited sources and is current at the time of publication.

Sources

Sources and references