iOS AI Apps: 282 Exposed, Only 28% Fixed

Wake Forest study finds 282 of 444 analyzed iOS LLM apps leak API credentials. After 90 days of responsible disclosure, just 28% remediated; 23% remain fully exploitable.

An academic study by Wake Forest University researchers found that 282 iOS applications with large language model capabilities, 64% of a sample of 444 analyzed apps, exposed API credentials or interceptable backend access mechanisms. Editorial coverage first published on June 22, 2026 reported the results of a 90-day retest following responsible disclosure: only 78 apps, 28% of those affected, had fixed the vulnerabilities. A further 66 apps, 23%, remained fully exploitable, due either to developer inaction or to fundamentally flawed authentication implementations.

Key Takeaways

282 of 444 analyzed iOS LLM apps exposed credentials or interceptable backend access, with 146 classified as fully exploitable
After responsible disclosure covering all 282 vulnerable apps and a 90-day retest, only 28% had remediated; 23% remained exploitable
Three dominant leakage patterns: plaintext API keys, JWTs with excessive validity, unauthenticated backend proxies acting as open relays
The most popular vulnerable app exceeds 2.3 million ratings, and 15% of at-risk apps have more than 1,000 ratings

LLMKeyLens: How the Analysis Framework Works

The researchers developed LLMKeyLens, a framework to intercept iOS app network traffic, detect provider-specific API keys, authentication tokens, and exposed backend endpoints, and validate the actual exploitability of leaked credentials. The sample was built from over 38,000 App Store listings, narrowed to more than 5,600 AI-related apps, then filtered to 444 with confirmed LLM integration through dynamic analysis.
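
One way to picture the detection step is a pattern scan over captured request text. The sketch below is not LLMKeyLens itself, whose implementation the coverage does not describe. The pattern names, the key prefixes, and the `find_credentials` helper are illustrative assumptions, and real provider key formats differ and change over time.

```python
import re

# Illustrative patterns only (assumptions): real provider key formats vary.
KEY_PATTERNS = {
    "openai_style": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "google_style": re.compile(r"\bAIza[A-Za-z0-9_-]{35}"),
    "jwt_like": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}


def find_credentials(captured_text: str) -> list[tuple[str, str]]:
    """Return (pattern_name, redacted_prefix) pairs found in captured traffic."""
    findings = []
    for name, pattern in KEY_PATTERNS.items():
        for match in pattern.finditer(captured_text):
            token = match.group(0)
            # Keep only a short prefix so the finding itself does not leak the secret.
            findings.append((name, token[:6] + "..."))
    return findings
```

A pattern match only shows that something credential-shaped appeared in the traffic. Per the description above, the researchers additionally validated whether a leaked credential was actually exploitable, a step this sketch does not attempt.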

The methodology isolated three convergent leakage patterns. Fifty-four apps exposed plaintext API keys in the iOS client, interceptable via network traffic analysis. One hundred thirty-six apps revealed authentication tokens, in one case a JWT with validity exceeding 100 years. Ninety-two apps used backend proxies that, while correctly hiding API keys, required no authentication from the client, effectively turning them into open relays.
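
The excessive-validity pattern can be checked from the token alone, because a JWT's payload is base64url-encoded JSON rather than encrypted. The following is a minimal sketch, assuming the token carries the standard `iat` (issued at) and `exp` (expiry) claims; the function name `jwt_lifetime_years` is hypothetical, and the signature is deliberately not verified.

```python
import base64
import json
from typing import Optional

SECONDS_PER_YEAR = 365 * 24 * 3600


def jwt_lifetime_years(token: str) -> Optional[float]:
    """Decode the unverified JWT payload and return (exp - iat) in years."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWT (expected header.payload.signature)")
    payload_b64 = parts[1]
    # base64url in JWTs omits padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    if "exp" not in claims or "iat" not in claims:
        return None  # lifetime cannot be derived from the token itself
    return (claims["exp"] - claims["iat"]) / SECONDS_PER_YEAR
```

A result above 100 would correspond to the extreme case described above. A token with no `exp` claim at all returns `None` here and deserves separate scrutiny, since it may never expire.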

Twenty-eight of the apps with plaintext API keys also exposed proprietary system prompts, expanding the attack surface beyond mere credential usability. One hundred fifty-five vulnerable apps used custom developer backends, 67 relied on cloud platforms such as Firebase, Google Cloud Run, or AWS, and 60 communicated directly with AI providers without architectural intermediaries.

Sample Data: Distribution and Popularity

The vulnerability is not confined to niche apps. Fifteen percent of vulnerable apps have more than 1,000 user ratings, and the most popular app in the sample exceeds 2.3 million ratings. Productivity apps represent the largest category of at-risk apps, while Health & Fitness records the highest leakage rate relative to apps analyzed in the category.

LLM-powered apps reached 17 billion downloads in 2025, accounting for 13% of all mobile app downloads. At that scale, credential leakage is not a marginal incident but a systemic problem in the iOS ecosystem, affecting apps from little-known developers as well as mainstream titles.

"LLM API key leakage is a widespread and systemic issue in the iOS ecosystem, affecting 64% of analyzed Apps across diverse categories and developer types. The vulnerability's impact extends from niche Apps to popular apps with hundreds of thousands of users"

The Retest Failure: What Happens After Disclosure

The researchers responsibly notified the developers of the 282 vulnerable apps and conducted a retest after 90 days. The result illustrates how ineffective traditional disclosure can be in a fragmented mobile ecosystem. Seventy-eight apps, 28%, had remediated via credential revocation or access control enforcement.

Thirty-six apps took no remediation action whatsoever. Thirty implemented technically insufficient countermeasures, with fundamentally flawed authentication that did not prevent continued abuse of credentials or backend access. Overall, 23% of the vulnerable sample remained exploitable at the end of the observation period.
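
The percentages follow directly from the counts reported above. A short worked check, using only figures stated in this article (the variable names are ours):

```python
vulnerable = 282          # apps notified
remediated = 78           # fixed at the 90-day retest
no_action = 36            # no remediation attempted
insufficient_fix = 30     # flawed countermeasures


def pct(part: int, whole: int) -> int:
    """Share of `whole` as a percentage, rounded to the nearest integer."""
    return round(100 * part / whole)


still_exploitable = no_action + insufficient_fix  # 36 + 30 = 66
other = vulnerable - remediated - still_exploitable

print(pct(remediated, vulnerable))         # 28
print(pct(still_exploitable, vulnerable))  # 23
print(other, pct(other, vulnerable))       # 138 49
```

The last line shows that the two headline percentages do not sum to 100: 138 apps, roughly 49% of the vulnerable sample, fall into neither the remediated nor the still-exploitable group in the figures reported here.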

The researchers highlighted a significant architectural finding: 55% of apps with leakage route LLM traffic through custom developer backends, rendering provider-side mitigations alone insufficient. Cloud platforms and direct API services account for comparable shares of leakage (23% and 21%, respectively), confirming that adopting a proxy architecture does not inherently prevent credential exposure.

"Over half of leaked Apps (55%) route LLM traffic through custom developer backends, making provider-side mitigations alone insufficient. Cloud platforms and direct API services account for comparable shares of leakage (23% and 21%, respectively), confirming that adopting a proxy architecture does not prevent credential exposure"

Why It Matters

The study raises structural questions about security governance in the App Store. Responsible disclosure, understood as direct notification to developers, produced remediation in fewer than one-third of cases. The 90-day observation window is not arbitrary: it represents a standard window in the academic disclosure calendar, yet the results indicate it was insufficient to generate effective fixes for nearly three-quarters of vulnerable apps.

The available coverage does not specify whether Apple responded to the research or plans to integrate leaked-credential detection into the App Store review process. It identifies no common thread linking vulnerable apps to specific development frameworks or documented coding practices. Nor does it document specific corrective measures taken by LLM providers to revoke exposed credentials at the API level.

The impact varies by leakage pattern. Plaintext API keys let an attacker run up API costs billed to the developer and potentially manipulate model responses. Unauthenticated backend proxies open the infrastructure to arbitrary use and put at risk the confidentiality of end-user data passing through the app. JWTs with excessive validity allow a captured token to be replayed for a prolonged period.

The source does not specify the full nature of user data transiting through vulnerable backends, nor quantify the potential economic damage from exploitation of leaked credentials. The original academic paper is not directly accessible through available editorial sources; the technical data reported is reconstructed from convergent coverage by Help Net Security and CyberInsider, but not verifiable against the primary scientific source.

Analysis: Beyond the Single Advisory

The Wake Forest case is not a software vulnerability patchable with an update. It is a sample of flawed LLM integration practices distributed across hundreds of independent apps, with decentralized security governance and no coercive enforcement mechanism. The low remediation rate suggests that responsible disclosure, while ethically correct, operates in an institutional vacuum when the recipient is an individual developer without structural incentives for compliance.

The enterprise reader's perspective is twofold. For those building LLM-powered apps: leakage is an architecture problem, not a single configuration issue. Credentials in the iOS client, tokens that never expire, and backend proxies without authentication are all patterns that can be avoided at the design stage. For those consuming LLM services via third parties: the exposed credentials belong to the app developer, but abuse can translate into content manipulation and potential exfiltration of user-submitted data.

The most relevant architectural finding is that no integration pattern is inherently protected. Custom backends, cloud platforms, direct provider communication: all three modes present vulnerable apps in the sample. Security depends on the authentication and authorization controls implemented, not on the infrastructure choice.
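
As an illustration of what such controls can look like on a backend proxy, the sketch below issues short-lived, HMAC-signed tokens and rejects any request that lacks a valid one. It is a minimal sketch under stated assumptions, not a method taken from the study: `SERVER_SECRET`, `TOKEN_TTL_SECONDS`, and all function names are hypothetical, tokens are assumed to be issued only after some sign-in step not shown here, and the call to the LLM provider is omitted.

```python
import hashlib
import hmac
import time
from typing import Optional

SERVER_SECRET = b"kept-only-on-the-server"  # hypothetical; never shipped in the app
TOKEN_TTL_SECONDS = 15 * 60                 # assumed short lifetime


def _sign(message: str) -> str:
    return hmac.new(SERVER_SECRET, message.encode(), hashlib.sha256).hexdigest()


def issue_token(user_id: str, now: Optional[float] = None) -> str:
    """Issue a token of the form user_id:expires_at:signature."""
    expires_at = int((time.time() if now is None else now) + TOKEN_TTL_SECONDS)
    message = f"{user_id}:{expires_at}"
    return f"{message}:{_sign(message)}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the token is authentic and unexpired, else None."""
    try:
        user_id, expires_raw, signature = token.rsplit(":", 2)
        expires_at = int(expires_raw)
    except ValueError:
        return None
    expected = _sign(f"{user_id}:{expires_at}")
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    if (time.time() if now is None else now) >= expires_at:
        return None
    return user_id


def handle_llm_request(auth_token: Optional[str], prompt: str,
                       now: Optional[float] = None) -> tuple[int, str]:
    """Proxy entry point: refuse unauthenticated callers before any provider call."""
    user_id = verify_token(auth_token or "", now)
    if user_id is None:
        return 401, "authentication required"
    # The provider API key stays on the server; forwarding is omitted in this sketch.
    return 200, f"accepted prompt from {user_id}"
```

The design addresses two of the patterns described earlier at once: the proxy no longer acts as an open relay, and a captured token stops working after minutes rather than decades. Per-user rate limits and quotas would be a natural next layer but are outside this sketch.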

Frequently Asked Questions

Have the vulnerable apps been actively exploited by threat actors? Sources document that the apps were exploitable via benign requests; they do not confirm active in-the-wild exploitation. The retest validated the persistence of vulnerabilities, not actual abuse.
Can I check if a specific app is in the sample? No. The researchers did not disclose the list of 282 apps, in keeping with responsible disclosure.
Can LLM providers block the exposed credentials? For the 55% of apps with custom backends, the exposed credentials belong to the developer and are never presented directly to the provider. In these cases, revocation has to happen at the developer's proxy rather than at the provider.

Information verified against cited sources and current as of publication.
