Mistral AI Hit by Supply Chain Attack; 450 Repositories Put Up for Sale

Mistral AI has confirmed a supply chain compromise involving contaminated SDKs and abused SLSA provenance. The threat actor TeamPCP is demanding $25,000 for approximately 450 internal repositories.

The threat group TeamPCP has listed approximately 450 internal Mistral AI repositories for sale for $25,000, threatening to leak the data for free within a week if no buyer is found. The French AI firm confirmed on May 14, 2026, that a supply chain attack on May 12 briefly compromised its codebase management system and contaminated several SDK packages. However, the company has ruled out any breach of user data, hosted services, or research environments. The incident highlights a sophisticated technical failure: the malicious packages were signed with valid SLSA provenance, rendering the compromise invisible to standard automated verification tools.

Key Takeaways
  • TeamPCP is seeking $25,000 for roughly 450 repositories and has threatened a public leak after seven days; however, the authenticity of this data has not been independently verified.
  • Mistral AI confirmed that on May 12, 2026, its codebase management system was breached via a supply chain attack, resulting in the brief contamination of certain SDK packages.
  • According to the company, the impact was limited to non-core repositories; hosted services, user data, and research environments remain unaffected.
  • The "Mini Shai-Hulud" campaign utilized OIDC token extraction, cache poisoning, and pull_request_target misconfigurations to publish packages with valid SLSA provenance, using the Session network for exfiltration.

When SLSA Signatures Become a Blind Spot

The Mini Shai-Hulud campaign successfully compromised packages on npm and PyPI, including Mistral AI’s SDKs. Technical analysis from vendors including Wiz, Snyk, Socket, and StepSecurity indicates the attackers exploited a chain of three known vulnerability classes. "The attacker chained three known vulnerability classes — a pull_request_target 'Pwn Request' misconfiguration, GitHub Actions cache poisoning across the fork↔base trust boundary, and runtime memory extraction of the OIDC token from the Actions runner process," a TanStack security advisory reported via SecurityWeek.

For Mistral AI, the breach impacted the core SDK, the Azure integration, and the GCP integration, with three malicious versions published for each. The malicious artifacts bypassed provenance checks because they were produced by hijacking the legitimate pipeline: the stolen OIDC token allowed the attackers to obtain a valid Sigstore certificate.

"SLSA provenance is a cryptographic certificate, generated by Sigstore, that is meant to verify a package was built from a trusted source. The worm was able to produce these certificates because it hijacked the legitimate build pipeline itself" — Snyk via SecurityWeek

SLSA provenance has become an industry standard for certifying that a package originates from a trusted pipeline. Developers and platforms rely on it to automatically block suspicious artifacts. However, when an attacker gains control of the pipeline itself, the certificate becomes a technically "correct" attestation of a compromised process, bypassing automated security gates without triggering alarms.
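To illustrate the blind spot, consider what a typical automated gate actually checks. The sketch below parses a SLSA v1 provenance statement (in-toto format) and trusts the artifact if the builder identity matches a known pipeline; the file name, digest, and builder ID are hypothetical placeholders, and the policy logic is a simplified assumption rather than any specific tool's implementation.

```python
import json

# Hypothetical SLSA v1 provenance statement attached to a package.
# With a stolen OIDC token, an attacker can have the legitimate pipeline
# produce one of these for a malicious artifact.
provenance = json.loads("""
{
  "_type": "https://in-toto.io/Statement/v1",
  "subject": [{"name": "example-sdk-1.2.3.tar.gz",
               "digest": {"sha256": "deadbeef"}}],
  "predicateType": "https://slsa.dev/provenance/v1",
  "predicate": {
    "runDetails": {
      "builder": {"id": "https://github.com/actions/runner"}
    }
  }
}
""")

TRUSTED_BUILDERS = {"https://github.com/actions/runner"}

def naive_policy_check(stmt: dict) -> bool:
    """Typical automated gate: accept the artifact if the builder is trusted.

    This is exactly the blind spot described above: a hijacked pipeline
    still reports a trusted builder ID, so the check attests to *who*
    built the package, never to *what* code went into it.
    """
    builder = stmt["predicate"]["runDetails"]["builder"]["id"]
    return builder in TRUSTED_BUILDERS

print(naive_policy_check(provenance))
```

A compromised build sails through this gate, which is why the signature alone cannot distinguish a legitimate release from a hijacked one.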

Consequently, any package manager or security policy relying solely on SLSA signatures would have treated the malicious Mistral AI versions as legitimate. This shifts the security challenge from cryptographic verification to behavioral monitoring of the runner—a level of control many organizations have yet to implement.

The malware exfiltrated data through multiple channels, including a custom domain, themed GitHub repositories, and the Session network. "The Session network channel is new. Decentralized and takedown-resistant, it is significantly harder to disrupt than dedicated domains or GitHub-based exfiltration," Wiz observed via SecurityWeek. Socket and StepSecurity noted that the campaign's speed worsened its impact: the worm-like propagation exploited mutual trust between repositories to scale rapidly, making manual intervention by maintainers difficult.

The 450-Repository Black Market: Claims vs. Verification

On a dark web forum, TeamPCP offered approximately 450 internal repositories totaling an estimated 5 GB of data for an exclusive purchase price of $25,000. The group threatened to release the data publicly if no buyer emerged within a week. At present, there is no independent evidence to confirm the authenticity of this archive.

It also remains unclear whether the repositories contain sensitive source code, internal project names, or non-core materials. While Mistral AI explicitly stated that the compromised repositories were not part of its core code, the company has not confirmed the exact number of exfiltrated assets or whether the theft occurred entirely on May 12. Furthermore, it remains unconfirmed whether stolen credentials were used to gain access beyond the compromised SDK packages.

The sale of internal source code, even if non-core, poses a significant reputational risk for a sovereign AI vendor competing on proprietary models and enterprise partnerships. The exposure of internal project names, dependencies, or code comments can provide competitors and attackers with a roadmap of the technical architecture, as well as entry points for future spear-phishing campaigns against employees.

Damage Control: What Mistral AI Has Confirmed

In a statement to BleepingComputer, Mistral AI clarified that attackers compromised the codebase management system and that "They [the hackers] contaminated some of our SDK packages for a brief period." According to the company's reconstruction, the affected repositories were not part of the core codebase.

The company also ruled out any breach of sensitive infrastructure. "Neither our hosted services, managed user data, nor any of our research and testing environments were compromised," a spokesperson told BleepingComputer. Mistral AI’s statement draws a sharp line between the codebase management breach and the security of its hosted services, though independent technical evidence to fully corroborate this remains unavailable.

The distinction between core and non-core repositories is technically significant but does not resolve concerns about persistence. If access was obtained via OIDC tokens or hijacked credentials, it has not yet been confirmed whether all keys have been rotated or whether backdoors remain within the build systems. Precise as it is, Mistral AI’s statement leaves open questions about the completeness of the remediation.

Recommended Mitigation Strategies

Countermeasures must be applied across multiple pipeline stages, as no single control would have stopped this multi-stage campaign.

  • Audit pull_request_target: Ensure GitHub Actions workflows do not execute code from external forks with write privileges to the base repository, closing off the "Pwn Request" misconfiguration.
  • Isolate Runners and Rotate OIDC Tokens: Limit memory access for CI/CD runners, implement sandboxing, and immediately revoke tokens used for package signing, replacing them with fresh credentials.
  • Invalidate Suspicious Cache: Review GitHub Actions cache policies to prevent cross-boundary poisoning between forks and base repositories, deleting any artifacts generated by unverified pull requests.
  • Re-evaluate the SLSA Chain of Trust: A valid Sigstore certificate is not a substitute for runtime behavior analysis; integrate behavioral scanning before production installation.
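The first audit point can be scripted. The sketch below, a stdlib-only heuristic rather than a complete scanner, flags workflow files that combine a pull_request_target trigger with a checkout of the incoming PR's head ref, the pattern behind the "Pwn Request" class; the `.github/workflows` path is GitHub's default layout.

```python
import re
from pathlib import Path

# Heuristic: a workflow that triggers on pull_request_target AND checks out
# the incoming PR's head ref runs untrusted fork code with write-scoped
# credentials from the base repository.
TRIGGER = re.compile(r"^\s*on:.*pull_request_target|^\s*pull_request_target\s*:", re.M)
HEAD_REF = re.compile(r"github\.event\.pull_request\.head\.(sha|ref)")

def flag_pwn_request(workflow_text: str) -> bool:
    """Return True if the workflow text matches the risky pattern."""
    return bool(TRIGGER.search(workflow_text)) and bool(HEAD_REF.search(workflow_text))

def audit(repo_root: str = ".") -> list[str]:
    """Scan a repository's workflow files and list the ones that match."""
    hits = []
    for wf in Path(repo_root, ".github", "workflows").glob("*.y*ml"):
        if flag_pwn_request(wf.read_text(encoding="utf-8")):
            hits.append(str(wf))
    return hits
```

A regex pass like this only narrows the search; each hit still needs manual review, since pull_request_target is safe when the workflow never executes fork-controlled code.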

Frequently Asked Questions

How can a malicious package have a valid SLSA signature?
The attacker hijacked the legitimate CI/CD pipeline. By using a stolen OIDC token, they obtained a Sigstore certificate in the same manner as an authentic build. The signature only guarantees the pipeline's origin, not the integrity of the injected source code. The certificate identifies the builder but does not analyze the code itself.

Are the repositories listed by TeamPCP authentic?
Currently, there is no public, independent proof confirming the authenticity or content of the approximately 450 repositories. While Mistral AI confirmed a system compromise, it has not verified the exact volume or nature of the exfiltrated data. The lack of a sample or file hash makes third-party validation impossible.

What is the risk for those who installed the compromised Mistral AI SDKs?
The malicious versions, which included the core SDK and Azure/GCP integrations, were available for a brief period. Users who installed them during this window may have introduced code capable of exfiltrating data to the Session network. An immediate audit of dependencies and installation logs from May 12, 2026, is required.
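Teams that may have pulled the SDKs during the exposure window can compare installed versions against a deny list. The sketch below uses Python's importlib.metadata; the package names and version strings are placeholders, not the actual affected releases, which should be taken from the vendor advisory.

```python
from importlib import metadata

# Placeholder names and versions: substitute the advisory's actual affected
# packages and releases. These are NOT the real malicious version numbers.
SUSPECT_VERSIONS = {
    "mistralai": {"0.0.0-example"},
    "mistralai-azure": {"0.0.0-example"},
    "mistralai-gcp": {"0.0.0-example"},
}

def audit_installed(suspect: dict[str, set[str]]) -> list[str]:
    """Return 'name==version' for any installed package on the suspect list."""
    findings = []
    for name, versions in suspect.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # package not present in this environment
        if installed in versions:
            findings.append(f"{name}=={installed}")
    return findings

print(audit_installed(SUSPECT_VERSIONS) or "no suspect versions installed")
```

This only covers the current environment; lockfiles, container images, and CI caches built since May 12 need the same comparison.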

This incident demonstrates that the open-source supply chain is vulnerable not only to a lack of oversight but also to an over-reliance on automated signing. When a provenance certificate becomes the perfect cover for a malicious payload, verification must move upstream: monitoring runners, isolating memory, and invalidating suspicious caches. For Mistral AI, the true test will be convincing clients that the perimeter is secure while the industry awaits confirmation on the 450 auctioned repositories.

Information has been verified against the cited sources and is current at the time of publication.

Sources