Google Identifies First AI-Authored Zero-Day Exploiting 2FA in the Wild

On May 11, the Google Threat Intelligence Group (GTIG) revealed it had identified a Python-based zero-day exploit in the wild capable of bypassing Two-Factor Authentication (2FA) on a web-based open-source tool. Analysts concluded with high confidence that an AI/LLM was used in its creation, the first time a major vendor has publicly confirmed a weaponized vulnerability bearing artificial fingerprints. The finding signals a sharply narrowed window between vulnerability discovery and functional exploitation, forcing defenders and vendors to re-evaluate threat models for the offensive use of LLMs.

Key Takeaways
  • The exploit targets a semantic logic flaw stemming from a hard-coded trust assumption within an open-source tool, requiring valid user credentials for activation.
  • Google analysts identified specific AI-driven stylistic indicators in the Python code, including didactic docstrings, a hallucinated CVSS score, and a standardized Pythonic structure typical of LLM training data.
  • Google collaborated with the affected vendor for responsible disclosure and the subsequent patch release; however, the identity of the software and the threat actor remain undisclosed.
  • The operation is part of a coordinated mass vulnerability exploitation campaign by cybercriminals, representing a turning point in the automated weaponization of zero-day vulnerabilities.

Python-Based Zero-Day Bypasses 2FA Following Credential Theft

According to the report released on May 11, the exploit discovered by Google leverages a semantic logic flaw found in an open-source web administration tool. The vulnerability originates from a hard-coded trust assumption that allows an authenticated user with valid credentials to circumvent the 2FA verification flow. This is not a brute-force attack against tokens or a universal bypass, but rather an internal logical path that neutralizes the second factor once the initial security layer is breached.
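
Google has not published the vulnerable code, but the flaw class it describes can be sketched in a few lines. The example below is entirely hypothetical (the function names, the `internal_client` flag, and the stub checks are all invented for illustration): a hard-coded trust assumption lets a request that presents valid credentials skip the second-factor check.

```python
# Hypothetical illustration of the flaw class GTIG describes: a hard-coded
# trust assumption that skips the second factor once credentials pass.
# None of this mirrors the actual vulnerable code, which remains undisclosed.

def check_password(username: str, password: str) -> bool:
    """Stub first-factor check (real code would query a credential store)."""
    return (username, password) == ("alice", "correct-horse")

def check_totp(username: str, totp_code: str) -> bool:
    """Stub second-factor check (real code would validate a TOTP token)."""
    return totp_code == "123456"

def verify_login(username: str, password: str, params: dict) -> bool:
    if not check_password(username, password):      # first factor
        return False

    # FLAW: requests flagged as "internal" are trusted implicitly and never
    # reach the second-factor check. An attacker holding stolen credentials
    # only needs to supply this flag to complete authentication.
    if params.get("internal_client") == "true":
        return True

    return check_totp(username, params.get("totp_code", ""))  # second factor

# With valid (stolen) credentials, the bypass flag skips 2FA entirely:
assert verify_login("alice", "correct-horse", {"internal_client": "true"})
```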

The "in-the-wild" nature of the toolkit confirms the code was not an academic proof-of-concept, but a functional asset actively used in cybercriminal operations. Google categorized the campaign as a mass vulnerability exploitation operation, highlighting the attackers' ability to coordinate large-scale exploitation. While Google withheld details regarding the vendor and the threat group to protect users during the patching cycle, the discovery leaves open questions regarding the extent of the attack surface already compromised.

Didactic Docstrings and Hallucinated CVSS: AI Fingerprints in the Code

Google investigators based their assessment on recurring stylistic artifacts within the script rather than intuition. The code contains excessively detailed didactic docstrings, a hallucinated CVSS score embedded directly in the comments, and rigid Pythonic formatting—including structured help menus and a cleanly written ANSI color class. While these elements might appear to be meticulous programming in isolation, together they form a stylometric profile consistent with the output of a language model trained on technical corpora.
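
To make these markers concrete, the invented fragment below (not taken from the recovered exploit) shows how such artifacts tend to co-occur in LLM-generated tooling: an unverified CVSS score asserted in a comment, a tidy ANSI color class, and a tutorial-style docstring complete with Args/Returns sections.

```python
# Invented fragment illustrating the stylometric markers GTIG describes;
# nothing here is taken from the recovered exploit.

# CVSS:3.1 Base Score: 9.8 (Critical)  <- a plausible-looking but unverified
# severity value, the kind of "hallucinated" score embedded in comments.

class Colors:
    """ANSI escape codes for colored terminal output."""
    GREEN = "\033[92m"
    RED = "\033[91m"
    RESET = "\033[0m"

def bypass_second_factor(session_token: str) -> bool:
    """
    Attempt to bypass the two-factor authentication flow.

    This function replays the session token issued after the primary
    credential check against the verification endpoint, exploiting the
    application's implicit trust in already-authenticated sessions.

    Args:
        session_token: The token issued after the first-factor login.

    Returns:
        True if the second factor was skipped successfully.
    """
    # (exploit logic deliberately omitted)
    print(f"{Colors.GREEN}[+] Second factor bypassed{Colors.RESET}")
    return True
```

Any one of these traits can appear in careful human code; it is their co-occurrence in a single script that forms the stylometric profile GTIG flagged.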

"Although we do not believe Gemini was used, based on the structure and content of these exploits, we have high confidence that the actor likely leveraged an AI model to support the discovery and weaponization of this vulnerability." - Google (GTIG), via SecurityWeek

The inclusion of a fabricated CVSS score within the code is particularly telling: the model generated a severity value that appeared plausible at first glance but was not anchored in a real-world assessment of the vulnerability. This type of hallucination, typical of generative LLMs, serves as a forensic fingerprint for Google, distinguishing the work of a human expert from an automated weaponization pipeline.

Google Rules Out Gemini Involvement While Confirming LLM Usage

In its May 11 publication, the Google Threat Intelligence Group explicitly ruled out the use of Gemini in the zero-day's creation, while maintaining high confidence that the malicious actors used an AI model. The distinction is significant: it indicates that third-party or open-source language models are already integrated into the cybercriminal attack chain, rather than pointing to a failure within Google's own ecosystem. Google has not specified which LLM was involved, and the specific model remains unattributed.

The statement from Google, as reported by SecurityWeek, sets a new benchmark for the industry: for the first time, a mainstream threat intelligence vendor has attributed an in-the-wild zero-day to an AI-assisted authoring process. The implication is that the learning curve for identifying and exploiting complex logic flaws has flattened, enabling cybercriminal groups to scale weaponization without traditional reverse engineering expertise.

Strategic Defense Priorities

The confirmation of an active AI-generated zero-day necessitates an immediate review of organizational authentication and monitoring strategies. Four operational priorities follow:

  • Audit authentication flows within internal web-based open-source tools to ensure no hard-coded trust assumptions can be exploited following credential compromise.
  • Strengthen authentication with additional factors independent of the primary application flow, prioritizing adaptive authentication based on behavior and context over rigidly coupled sequential 2FA.
  • Integrate checks for LLM-specific stylistic patterns, such as overly didactic docstrings or inline CVSS scores, into code reviews and SAST processes to identify potentially AI-generated code within repositories (a minimal heuristic sketch follows this list).
  • Map dependencies on open-source web admin tools with privileged access, ensuring that credential theft does not allow for a 2FA bypass without triggering real-time alerts.
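
As a starting point for the third priority, the sketch below shows one way such heuristics might be wired into a review pipeline. The regex, the docstring length threshold, and the reporting format are assumptions chosen for illustration, not a vetted detection standard.

```python
# Minimal heuristic scan for LLM-style artifacts in Python source files.
# The patterns and thresholds are illustrative assumptions, not a standard.
import ast
import re
import sys

# Inline CVSS references in comments are rare in hand-written code.
CVSS_COMMENT = re.compile(r"#.*\bCVSS\b", re.IGNORECASE)

# "Didactic" docstrings tend to run long; 60 words is an arbitrary cutoff.
DOCSTRING_WORD_THRESHOLD = 60

def scan_file(path: str) -> list[str]:
    findings = []
    with open(path, encoding="utf-8") as fh:
        source = fh.read()

    # Line-level check: severity scores asserted directly in comments.
    for lineno, line in enumerate(source.splitlines(), start=1):
        if CVSS_COMMENT.search(line):
            findings.append(f"{path}:{lineno}: inline CVSS reference in a comment")

    # AST-level check: unusually long, tutorial-style docstrings.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Module, ast.ClassDef,
                             ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc and len(doc.split()) > DOCSTRING_WORD_THRESHOLD:
                name = getattr(node, "name", "<module>")
                findings.append(f"{path}: tutorial-style docstring on '{name}'")
    return findings

if __name__ == "__main__":
    for filename in sys.argv[1:]:
        for finding in scan_file(filename):
            print(finding)
```

Hits from a scan like this do not prove AI authorship; as in GTIG's own analysis, it is the co-occurrence of several markers that warrants closer human review.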

The line between theoretical proof-of-concept and in-the-wild weapon has measurably thinned. This is no longer a dystopian scenario; the evidence published by Google proves that cybercriminals have already integrated LLMs into the weaponization chain, reducing the human effort required to find and exploit logical flaws.

For network and application defenders, this means the speed of offensive adaptation has surpassed traditional patching cycles. The next generation of authentication must account for a threat model where artificial intelligence is an operational tool for attack.

Frequently Asked Questions

Does this zero-day allow access without any credentials?
No. According to Google, the exploit requires valid user credentials to trigger the 2FA bypass. It is not an indiscriminate access vulnerability, but a phase-skip following an initial compromise.

Was Gemini used to generate the exploit?
No. The Google Threat Intelligence Group explicitly excluded Gemini, though they maintain high confidence that another AI/LLM model was involved in the discovery and weaponization phases.

Why hasn't the name of the vulnerable tool been released?
Google coordinated responsible disclosure with the affected vendor and chose not to name the software or the actor publicly to facilitate proactive disruption and protect users who have not yet updated.

Information verified against cited sources and current as of the time of publication.

Sources