Assume the Breach: Why Zero-Knowledge Architecture Must Become the Default

On April 7, 2026, Anthropic announced Claude Mythos Preview — a model they decided was too dangerous to release publicly.
The reason: with only minimal direction, it can find zero-day vulnerabilities and construct working exploits across major operating systems and browsers, at a scale and speed no prior model could approach. In the headline benchmark — generating working Firefox JavaScript exploits — Mythos succeeded 181 times. Its predecessor, Claude Opus 4.6, succeeded twice in several hundred attempts.
Anthropic did not explicitly train the model to do this. The capability emerged as a downstream effect of improvements in reasoning, coding, and autonomy.
That last point should give you pause.
The equilibrium just broke
For decades, the security industry operated under a stable assumption: exploitation is expensive. Discovery requires expertise; deployment requires time. That gap created the window of defence.
That equilibrium is breaking. AI is both the entry point and the accelerant:
- The entry point. It expands the attack surface through a web of third-party integrations and autonomous agents.
- The accelerant. It compresses the time needed to exploit whatever access is gained. Discovery that used to take months can now take hours, sometimes minutes.
The April 2026 Vercel security incident illustrates this. Context.ai, a third-party AI tool used by an employee, was compromised, and the attacker used its Google Workspace OAuth access to reach internal systems and customer environment variables. No cryptography was broken. No passkey was bypassed. The attacker walked in through a trusted integration.
The attack surface is expanding at the same moment the cost of exploiting it is collapsing.
This isn't a forecast. Look at the past two weeks.
Mythos was the warning. What followed in the last 14 days is the confirmation.
- The first AI zero-day. On 11 May 2026, Google's Threat Intelligence Group confirmed the first in-the-wild zero-day developed with AI assistance — a 2FA bypass in an open-source system administration tool, discovered by a financially motivated actor who used an AI model as an "expert-level force multiplier" for vulnerability research.
- CopyFail (CVE-2026-31431). A Linux kernel privilege escalation in the AF_ALG crypto interface, present since 2017 and missed by nearly a decade of human review. Found in roughly an hour using AI-assisted analysis.
- Mini Shai-Hulud. A self-spreading supply-chain worm that compromised 170+ npm and PyPI packages, including releases from TanStack and Mistral AI, by poisoning GitHub Actions caches. The malicious builds shipped with valid SLSA Build L3 attestations. The supply-chain integrity guarantee itself became the delivery vehicle.
- YellowKey. A full BitLocker bypass on Windows 11 and Server 2025 in TPM-only mode. The hardware-level disk-encryption guarantee many enterprises rely on, defeated end-to-end on default configurations.
- PAN-OS CVE-2026-0300. Pre-authentication root RCE in Palo Alto firewalls. CVSS 9.3. The perimeter device, owned before it could begin to defend.
- Apple shipped fixes for ~79 CVEs in macOS 26.5 and ~50 in iOS 26.5, in a single release. That is the patched count for two weeks of disclosures, on the two most-shipped consumer operating systems on the planet.
- Vercel issued 13 Next.js security advisories in two days — middleware bypasses, SSRF, cache poisoning — across the most widely deployed React framework on the public internet. Versions 13.x and 14.x will not receive patches.
That is two weeks. Not a year. Not a forecast.
Every layer of the stack is in scope — the Linux kernel, the Windows boot chain, the firewall, the package registry, the framework, the operating systems. And the cost of finding the next one just dropped.
Now ask the question this post exists to ask: when these systems fall — and they are falling — what does the attacker find?
The right response: design for the breach
The instinctive response is to harden perimeters: stronger authentication, stricter OAuth controls, faster patching. All of this is necessary. None of it is sufficient.
In a world where AI systems uncover decades-old flaws overnight, no perimeter is a guarantee. The question is no longer whether a breach will happen. It is what the blast radius looks like after it does.
Zero-knowledge architecture is a system design in which user data is encrypted on the user's device before it is sent to the server, and the keys required to decrypt that data are never accessible to the service provider. The server stores ciphertext; it cannot read, preview, or hand over the underlying content. A breach of the server yields encrypted bytes, not readable data.
This is not a policy guarantee. It is a mathematical one.
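To make the definition concrete, here is a minimal Python sketch of the client/server split. It assumes the third-party `cryptography` package and AES-256-GCM as the cipher. The `StoredRecord` type, the function names, and the in-memory `server_db` are hypothetical stand-ins for illustration, not any particular product's design.

```python
import os
from dataclasses import dataclass

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


@dataclass(frozen=True)
class StoredRecord:
    """Everything the hypothetical server persists: opaque bytes, no key material."""
    record_id: str
    nonce: bytes
    ciphertext: bytes


def encrypt_on_device(key: bytes, record_id: str, plaintext: bytes) -> StoredRecord:
    nonce = os.urandom(12)  # 96-bit nonce; must never repeat under the same key
    # The record id is bound as associated data, so a ciphertext cannot be
    # silently moved to a different record.
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, record_id.encode("utf-8"))
    return StoredRecord(record_id, nonce, ciphertext)


def decrypt_on_device(key: bytes, record: StoredRecord) -> bytes:
    return AESGCM(key).decrypt(
        record.nonce, record.ciphertext, record.record_id.encode("utf-8")
    )


def attacker_can_read(record: StoredRecord, guessed_key: bytes) -> bool:
    """Model a server breach: the attacker holds the record but not the device key."""
    try:
        decrypt_on_device(guessed_key, record)
        return True
    except InvalidTag:
        return False


# The key is generated and kept on the user's device (assumption for this sketch).
device_key = AESGCM.generate_key(bit_length=256)

# The "server" only ever receives the encrypted record.
server_db = {}
record = encrypt_on_device(device_key, "note-1", b"quarterly forecast")
server_db[record.record_id] = record
```

In this sketch, dumping `server_db` gives an attacker the nonce and ciphertext only. Decryption succeeds on the device that holds `device_key` and fails the authentication check under any other key.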
The Vercel incident makes this concrete. Sensitive environment variables survived the breach; default ones did not. The difference was Vercel's "sensitive" environment variable type: a separate handling path in which stored values are not decrypted back into readable plaintext the way default variables are. The attacker reached the storage layer; the protected variables did not become readable as a result.
The encryption layer held. The access control layer didn't.
If your security depends entirely on access control, a breach exposes everything. If your architecture assumes breach, the blast radius is constrained by design.
The AI tool problem
There is an irony in the current moment.
The tools expanding the attack surface — AI assistants, copilots, integrations — are also handling some of the most sensitive data in organisations: code, internal documents, customer data, strategic communication. Much of it is processed and stored in plaintext on third-party servers with broad access scopes.
The Vercel attacker entered through an AI tool. The most valuable data is increasingly flowing through those same systems.
Google's threat intelligence team described the AI model used by the May 2026 zero-day actor as an "expert-level force multiplier" for vulnerability research. That phrase deserves to be read literally. It does not mean a researcher with AI help. It means expert-level capability, available on demand, to whoever is paying.
This is not a reason to avoid AI tools. It is a reason to rethink their architecture. An AI system built on zero-knowledge principles — where plaintext is minimised, exposure is constrained, and providers cannot access user data persistently — is fundamentally different from one that stores everything server-side.
That distinction matters more now than ever.
What zero-knowledge architecture actually requires
The term is often used loosely. A meaningful implementation rests on four pillars:
- Client-side primacy. Data is encrypted on the user's device before it leaves it. The server stores only ciphertext, with no decryption path.
- User-held key material. Keys are derived from user-held secrets — a recovery phrase, a hardware-backed passkey — never stored centrally.
- Confidential computation. When the server must process data — for AI reasoning, search, or indexing — it should happen inside Trusted Execution Environments (TEEs) or secure enclaves, so plaintext exists only in a volatile, shielded memory region, opaque even to the host OS or a compromised root user.
- Verifiable claims. Privacy and security claims should be backed by public cryptographic specifications, independent audits, and open implementation where possible. This is the standard the category should be held to, including us.
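The "user-held key material" pillar can be sketched with the Python standard library alone. The example below is one possible arrangement, not a description of any specific product: a recovery phrase is stretched with scrypt into a root key, and two purpose-specific subkeys are derived from it. The labels `"auth"` and `"vault-enc"`, the function names, and the scrypt parameters are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os


def derive_root_key(recovery_phrase: str, salt: bytes) -> bytes:
    """Stretch the user-held phrase into a 32-byte root key. Runs on the device."""
    return hashlib.scrypt(
        recovery_phrase.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )


def derive_subkey(root_key: bytes, purpose: str) -> bytes:
    """Domain-separated subkey: HMAC-SHA256 keyed by the root key over a purpose label."""
    return hmac.new(root_key, purpose.encode("utf-8"), hashlib.sha256).digest()


# The salt is not secret and can be stored server-side next to the account.
salt = os.urandom(16)
root = derive_root_key("correct horse battery staple example phrase", salt)

auth_token = derive_subkey(root, "auth")      # may be sent to the server to prove identity
vault_key = derive_subkey(root, "vault-enc")  # stays on the device, used to encrypt content
```

The design choice here is that the server can verify `auth_token` without ever seeing `vault_key`, and holding one subkey does not let it compute the other. A server that keeps the salt and the token could still test guesses offline, which is why the phrase needs to be high-entropy rather than a memorable password.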
Tactical checklist: how to "assume breach" today
If you are managing an organisation's security posture in 2026, stop focusing exclusively on the gates. Start focusing on the vaults.
- Audit AI scopes. Treat every AI integration as a potential root user. Audit OAuth scopes and revoke anything with "read all" permissions that doesn't strictly need them.
- "Encrypt at rest" is not enough. Standard encryption at rest, where the provider holds the keys, is just an access-control check dressed up as cryptography. Move toward zero-knowledge encryption for your most sensitive internal documents — designs where the provider cannot read your data even if their own infrastructure is compromised.
- Hardware-back your identity. Non-FIDO 2FA (SMS, TOTP, push approval) is no longer enough against AI-assisted real-time phishing. Move to passkeys (FIDO2): the credential is bound to the legitimate site's origin, so a look-alike page captures nothing it can replay.
- Segment your data tiers. Follow the Vercel model. Identify your "sensitive" variables and ensure they live on a different architectural path from your public data.
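The first checklist item, auditing AI scopes, lends itself to a small script. The sketch below assumes you have already exported each integration's granted OAuth scopes from your identity provider. The app names and the `Grant` type are hypothetical; the scope strings are real Google OAuth scopes, and which ones count as "broad" is a policy decision you would adjust.

```python
from __future__ import annotations

from dataclasses import dataclass

# Scopes treated as "read all" for this sketch; extend to match your own policy.
BROAD_SCOPES = {
    "https://mail.google.com/",                        # full Gmail access
    "https://www.googleapis.com/auth/gmail.readonly",  # read all mail
    "https://www.googleapis.com/auth/drive",           # full Drive access
    "https://www.googleapis.com/auth/drive.readonly",  # read all Drive files
}


@dataclass(frozen=True)
class Grant:
    app: str
    scopes: tuple[str, ...]


def flag_broad_grants(grants: list[Grant]) -> dict[str, list[str]]:
    """Return, per app, the granted scopes that fall in the broad set."""
    findings: dict[str, list[str]] = {}
    for grant in grants:
        broad = sorted(s for s in grant.scopes if s in BROAD_SCOPES)
        if broad:
            findings[grant.app] = broad
    return findings


# Hypothetical inventory for illustration only.
grants = [
    Grant("notes-ai-assistant", (
        "openid",
        "https://www.googleapis.com/auth/drive",
        "https://mail.google.com/",
    )),
    Grant("calendar-widget", (
        "openid",
        "https://www.googleapis.com/auth/userinfo.email",
    )),
]

findings = flag_broad_grants(grants)
for app, scopes in findings.items():
    print(f"{app}: review or revoke {', '.join(scopes)}")
```

Anything the script flags is a candidate for revocation or for replacement with a narrower scope, such as per-file access instead of whole-drive access.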
Brianni: building the black box
Brianni was built so that a breach is a non-event for your data. The reasoning behind that decision — and why we ruled out the easier server-side path — is in Why I Built Brianni.
- Authentication without passwords. Sign-in is built around passkeys, with OTP and OAuth as passwordless alternatives. There are no passwords and no password databases, so the phishing surface that comes with them is removed. The full reasoning on why synced passkeys are not the weakness they are often portrayed as is in Synced Passkeys Are Not the Security Risk You've Been Told.
- The vault. Content is encrypted on your device before it reaches our servers. We are architecturally incapable of reading it. A breach of our infrastructure yields ciphertext, not content.
- Step-up proofs. Reaching the vault, or any other critical resource, requires a second proof at the point of use: a passkey check (biometric on the user's device) or the recovery phrase. This layer sits on top of normal sign-in, regardless of how the user got into their account. An OAuth-compromised session, a hijacked OTP, a stolen device with an unlocked browser tab: none of them is enough on its own.
  This is the part that changes what a compromised sign-in actually gets you. In most systems, getting past authentication is the win condition: once you are in, you are in. In Brianni, you are at the front door. The keys to decrypt anything that matters are produced separately, on demand, by the user's own device.
- Brianni-AI (in active development). The upcoming AI integration is designed around isolated, ephemeral execution environments, so plaintext is never persistently held in a place where a future breach could read it. We don't just promise not to train on your data; we are building the pipes so the data is never persistently available to be harvested in the first place.
This is not about promising security. It is about removing the ability to fail in certain ways.
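The step-up idea described above can be illustrated with a small policy check. This is a generic sketch, not Brianni's implementation: the `Session` type, the function names, and the 120-second freshness window are all assumptions. It models only the rule that a signed-in session is not sufficient on its own; in a zero-knowledge design, the decryption key would additionally be produced on the user's device.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical freshness window for a step-up proof.
STEP_UP_MAX_AGE_SECONDS = 120


@dataclass
class Session:
    user_id: str
    sign_in_method: str                      # e.g. "passkey", "otp", "oauth"
    last_step_up_at: Optional[float] = None  # timestamp of the last verified proof


def record_step_up(session: Session, now: float) -> None:
    """Call only after the device passed a passkey check or supplied the recovery phrase."""
    session.last_step_up_at = now


def may_open_vault(session: Session, now: float) -> bool:
    """A valid session is not enough: a recent step-up proof is also required."""
    if session.last_step_up_at is None:
        return False
    age = now - session.last_step_up_at
    return 0 <= age <= STEP_UP_MAX_AGE_SECONDS


# A session obtained through a compromised OAuth integration has no step-up proof.
hijacked = Session(user_id="u1", sign_in_method="oauth")
print(may_open_vault(hijacked, now=1000.0))  # False
```

Passing `now` explicitly keeps the check easy to test. A real service would read it from a trusted clock and verify the proof cryptographically before calling `record_step_up`.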
Final thought
The era of the safe perimeter ended the moment AI started reasoning at scale.
In the environment we are entering, a breach is no longer a failure of the security team — it is a statistical certainty. The true failure is building a system where a single compromised integration grants the keys to the kingdom.
Build systems where the data protects itself.
Because the perimeter won't.