
The Last Conspiracy

March 12, 2026 · By George Beck
The Working Hypothesis
AI is collapsing the number of participants required for institutional-scale attacks toward one.
Executive Summary

David Grimes' formula showed that conspiracy surface area was the detection mechanism for institutional-scale fraud. AI is removing it.

In 2016, an Oxford physicist named David Robert Grimes did something unusual: he put a number on how long a conspiracy can survive. Using Poisson statistics and three calibration cases — the NSA’s PRISM program, the Tuskegee syphilis experiment, and the FBI forensics scandal — he derived a simple, uncomfortable formula. The more people who are in on something, the faster it falls apart. A conspiracy involving a thousand people, under even the most generous assumptions about human secrecy, becomes untenable within a few years. The math isn’t complicated. Every additional person is another chance for a crisis of conscience, an accidental leak, a disgruntled ex-employee, a congressional staffer going through boxes.
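For intuition, here is the shape of that math: a minimal sketch, using a constant-population simplification of Grimes' model and a per-person annual leak probability of 4 × 10⁻⁴, roughly his best-case calibration. Both are simplifying assumptions, not his exact published equations.

```python
import math

def exposure_probability(n_people: int, years: float, p_leak: float = 4e-4) -> float:
    """Probability that at least one participant exposes the conspiracy
    within `years`, assuming a constant population of n_people and a
    constant per-person annual leak probability p_leak. This is a
    simplification of Grimes' model; 4e-4 is roughly his best-case
    calibration from PRISM, Tuskegee, and the FBI forensics scandal."""
    # Poisson approximation: leaks arrive at rate n_people * p_leak per year,
    # so the chance of zero leaks in t years is exp(-n * p * t).
    return 1 - math.exp(-n_people * p_leak * years)

# A thousand-person conspiracy under generous assumptions:
for years in (1, 3, 5, 10):
    print(years, round(exposure_probability(1000, years), 3))
# Roughly 33% within a year, 70% within three, over 98% within ten.
```

The exact numbers move with the assumptions; the monotonic relationship does not. More people, less time.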

His model wasn’t just an academic exercise. It was a description of how accountability actually works.

Not courts. Not regulators. Not investigative journalism, at least not primarily. The detection mechanism for large-scale institutional fraud has always been, at its core, the surface area created by the number of people who had to be involved.

Enron needed thousands of hands on fabricated transactions — special purpose entities, off-books partnerships, mark-to-market manipulations that left a paper trail 3,000 boxes deep. The fraud required human scale, and human scale eventually produced Sherron Watkins, a VP who wrote an anonymous letter to the CEO warning the company would “implode in a wave of accounting scandals.” Theranos needed lab technicians who could physically see that the Edison machine didn’t work — it was “essentially a pipette on a robotic arm.” You can intimidate people, but you can’t indefinitely suppress what they observe with their own eyes.

The LIBOR scandal — the manipulation of the benchmark interest rate that underpinned trillions of dollars in loans and derivatives worldwide — needed traders at rival banks to coordinate rate submissions in writing, across firms, over years. The emails became evidence. And even before the emails surfaced, the coordination left a statistical fingerprint: LIBOR was essentially constant for over a year while comparable rates varied freely, an anomaly detectable through pure econometric analysis.

Every one of those frauds was exposed because it required too many people. The conspiracy surface area was the detection mechanism.

AI changes the equation by collapsing the number of participants toward one.

The consensus on AI deepfakes and synthetic media is, roughly, this: costs have collapsed, the technology has democratized, and we’re in a race between bad actors and inadequate guardrails. Coffeezilla’s recent investigation laid it out methodically — trust-hijacking scams, synthetic brand narratives, AI-assisted emotional manipulation, propaganda designed not to persuade but to exhaust. Hany Farid, a computer science professor at Berkeley who studies manipulated media, puts the goal of modern AI propaganda starkly: not to trick intelligent people with flawless cinematic lies, but to talk “until you’re too tired to care” about what is real. The numbers are real. Before facing regulatory pressure, X’s Grok AI was generating 6,700 sexualized deepfake images per hour — 84 times the volume of the top five dedicated deepfake forums combined. Between Christmas and New Year’s 2025, it produced 20,000 images, some depicting what appeared to be children.

The consensus is right about all of this. The cost collapse is real. The platform complicity is real. The legislative response is, charitably, behind the curve.

Here’s what the consensus is missing.

It’s looking at the wrong variable. Volume isn’t the threat. Attribution surface area is.

The Grok scandal — thousands of anonymous users, each doing one dumb public thing, posting the results to X in real time — was detectable precisely because it was loud. The images were public. The requests were public. The harm had the same structure as every pre-AI fraud: too many people, too much surface area, too easy to find. What’s coming is the inverse. One careful person. One complex operation. No co-conspirators. No disgruntled lab technician. No email thread across rival firms. No congressional staffer finding the memo in a box.

In January 2026, Check Point Research published a report on a malware framework they called VoidLink. It was, by their assessment, “the first evidently documented case” of an advanced framework authored almost entirely by AI, likely under the direction of a single individual. The malware itself was sophisticated — 88,000 lines of code across three programming languages, rootkits that burrow deep into an operating system’s kernel, over 30 plugins, cloud reconnaissance modules targeting Amazon, Microsoft, and Google’s cloud platforms as well as several Chinese equivalents, and tools for compromising containerized software environments. The kind of thing that, a year earlier, would have suggested a well-resourced threat group with multiple specialized developers.

The developer used TRAE SOLO, an AI coding assistant. The method was what Check Point called “Spec Driven Development”: first direct the AI to generate a comprehensive development plan — sprint schedules, team structures, architecture specifications — then feed that plan back to the AI as the execution blueprint. A 20-to-30-week engineering project, compressed to under a week. Development appears to have begun in late November 2025. The framework was functional by approximately December 4th.
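The pattern itself is not exotic. Here is a hedged sketch of the two-pass loop Check Point describes, with call_model standing in for whatever assistant interface the operator used; the actual TRAE SOLO API is not documented in the public report, so it is stubbed out as an assumption.

```python
def call_model(prompt: str) -> str:
    """Stand-in for an AI coding assistant call. Hypothetical: the actual
    TRAE SOLO interface isn't documented in the public report."""
    raise NotImplementedError

def spec_driven_build(goal: str) -> list[str]:
    # Pass 1: have the model write the project plan a real engineering
    # org would produce: sprint schedule, team roles, architecture spec.
    plan = call_model(
        f"Produce a complete development plan for: {goal}. "
        "Include a sprint schedule, team structure, and architecture spec."
    )
    # Pass 2: feed that plan back as the execution blueprint and have the
    # model play every role it just invented, one work item at a time.
    artifacts = []
    for item in plan.splitlines():
        if item.strip():
            artifacts.append(
                call_model(f"Per this plan:\n{plan}\n\nExecute work item: {item}")
            )
    return artifacts
```

Nothing in that loop requires a team, a payroll, or a second person who knows what the goal was.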

We know any of this only because the developer made mistakes. An open directory on their server leaked source code, sprint documents, and TRAE-generated planning files. A compiled version was uploaded to VirusTotal, triggering Check Point’s investigation. The Chinese-language documentation exposed the methodology.

Check Point’s report ends with a question that should be read carefully: “We only uncovered its true development story because we had a rare glimpse into the developer’s environment, a visibility we almost never get. Which begs the question: how many other sophisticated malware frameworks out there were built using AI, but left no artifacts to tell?”

VoidLink is a malware story. But the structural logic — one person leveraging AI to do the work of a coordinated team, faster and without co-conspirators — applies across any operation that previously required human scale. The difference is that in malware, we can see the artifact. In fraud and influence operations, the absence of the artifact is the point.

To be precise about what the research does and doesn’t show: no single documented case yet proves a solo operator has executed institutional-scale financial fraud using AI with no co-conspirators. The closest we have is the Arup deepfake — a $25.6 million wire fraud executed through AI-generated video of a company’s CFO, completed in a single day across 15 transactions. As of early 2025, no perpetrator has been identified, no funds recovered. Whether it was one person or a small cell remains unknown. That ambiguity is itself evidence for the thesis: if the operation were larger, there would be more surface area to find.

What history tells us is that solo-operator institutional damage existed before AI — it just required exceptional circumstances. Jérôme Kerviel lost Société Générale €4.9 billion through unauthorized trades concealed using years of back-office knowledge. Nick Leeson collapsed Barings Bank — a 233-year-old institution — from a hidden account in Singapore. Edward Snowden exfiltrated 1.5 million classified documents using credentials borrowed from colleagues at the NSA. Each case required something most people don’t have: extraordinary institutional access, or years of specialized knowledge, or both.

AI lowers the access and expertise thresholds simultaneously. The constraint is shifting from “what you know and who you know” to “what you can direct.” That’s a different world.

The 2026 International AI Safety Report — compiled by over 100 independent experts from 30 countries — documented a threat actor who had automated 80 to 90 percent of the effort involved in a network intrusion, with human involvement limited to critical decision points. Not fully autonomous. But not a team, either. A strategic director with an AI workforce.

My working hypothesis: we are at or near the threshold where a single capable operator can execute attacks of institutional scale — financial fraud, influence operations, data exfiltration, infrastructure compromise — with no conspiracy surface area to expose. The first serious solo-operator incident of this kind is probably already in progress. If the operator is smart, we won’t know until someone finds the new attack surface. That may take years.

Here’s where it gets complicated — and, cautiously, more interesting.

The old detection mechanism was conspiracy surface area: human involvement, human error, human conscience. AI removes it. But conspiracy surface area was never the only detection mechanism. It was just the most reliable one.

LIBOR wasn’t caught only through whistleblowers and emails. Researchers subsequently showed that the manipulation was visible through pure econometric analysis — rate submissions “moved simultaneously to the same number from one day to the next” over a sustained period, a statistical impossibility under normal market conditions. The coordination left a fingerprint in the data even before anyone talked.
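That kind of fingerprint is crude enough to check in a few lines. A toy illustration with synthetic data, not the actual submission series: a benchmark pinned to the same value while a comparable rate wanders is flagged by a simple rolling-variance test.

```python
import numpy as np

rng = np.random.default_rng(0)
days = 500
comparable = np.cumsum(rng.normal(0, 0.02, days))  # a market rate that wanders
benchmark = comparable.copy()
benchmark[100:400] = benchmark[100]                # submissions pinned for ~a year

def frozen_windows(series: np.ndarray, window: int = 60, tol: float = 1e-6) -> np.ndarray:
    """Indices where the rolling standard deviation collapses to near zero:
    the 'moved simultaneously to the same number' signature."""
    stds = np.array([series[i:i + window].std() for i in range(len(series) - window)])
    return np.where(stds < tol)[0]

print(len(frozen_windows(benchmark)))   # hundreds of flagged windows
print(len(frozen_windows(comparable)))  # zero
```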

This matters because AI-generated operations may do the same thing. Drexel University’s MISLnet algorithm achieves 98.3% accuracy at identifying AI-generated video through digital fingerprints of the generating model — and can adapt to detect new generators after studying only a small number of examples. Research on “causal fingerprints” of generative models shows that AI outputs carry implicit traces attributable to their source. Behavioral signatures in synthetic content — timing patterns, stylistic consistency, interaction dynamics — are detectably different from human-generated behavior in ways that don’t require a whistleblower to surface.
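To be clear about scope: the sketch below is not MISLnet, which is a trained forensic network. It is a toy version of the behavioral-signature idea, under the assumption that scripted operations produce inter-event timing that distributes differently from bursty human timing; a standard two-sample test can then tell them apart without any insider talking. The specific distributions are illustrative, not measured.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
# Human inter-action gaps: heavy-tailed and bursty (log-normal is a
# common modeling choice, assumed here for illustration).
human_gaps = rng.lognormal(mean=1.0, sigma=1.0, size=400)
# Scripted/agent gaps: tightly clustered around a loop interval.
bot_gaps = rng.normal(loc=2.0, scale=0.05, size=400)

# Two-sample Kolmogorov-Smirnov test: are these from the same distribution?
stat, p_value = ks_2samp(human_gaps, bot_gaps)
print(f"KS statistic={stat:.2f}, p={p_value:.1e}")  # tiny p: distributions differ
```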

This is the honest version of the optimistic half: the replacement for conspiracy surface area isn’t one thing, it’s an ecosystem. Algorithmic detection. Behavioral fingerprinting. Statistical anomaly analysis. And, as the underlying infrastructure, cryptographic provenance baked into content at the point of creation.

That infrastructure is called C2PA — the Coalition for Content Provenance and Authenticity. Think of it as a tamper-evident label embedded in digital media at the moment of capture or generation, cryptographically signed, verifiable by any conforming reader. The Google Pixel 10, released in September 2025, signs every photo by default using hardware-backed keys. Sony, Canon, Leica, and Panasonic have shipping C2PA-capable cameras. LinkedIn displays provenance credentials on uploaded content. And the EU AI Act — Europe’s comprehensive regulation of artificial intelligence — includes a provision, Article 50, enforceable from August 2, 2026, that creates a hard regulatory deadline for machine-readable AI content labeling; C2PA is the most technically mature pathway to compliance.
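Under the hood, the core move is ordinary public-key cryptography: hash the content, bind declared claims to that hash, sign the bundle with a key the device holds. The real C2PA manifest format is far richer (assertions, certificate chains, JUMBF containers), so treat this as a stripped-down illustration of the idea, not the actual wire format.

```python
import hashlib
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# At capture time: the device holds a hardware-backed private key.
device_key = Ed25519PrivateKey.generate()

def sign_capture(content: bytes, claims: dict) -> dict:
    """Bind declared provenance claims to the exact bytes captured."""
    manifest = {"content_sha256": hashlib.sha256(content).hexdigest(), **claims}
    payload = json.dumps(manifest, sort_keys=True).encode()
    return {"manifest": manifest, "signature": device_key.sign(payload).hex()}

def verify(content: bytes, credential: dict, public_key) -> bool:
    """Any conforming reader can check: bytes unchanged, claims untampered."""
    if hashlib.sha256(content).hexdigest() != credential["manifest"]["content_sha256"]:
        return False  # content edited after signing
    payload = json.dumps(credential["manifest"], sort_keys=True).encode()
    try:
        public_key.verify(bytes.fromhex(credential["signature"]), payload)
        return True
    except InvalidSignature:
        return False

cred = sign_capture(b"...jpeg bytes...", {"generator": "camera", "captured": "2026-03-12"})
print(verify(b"...jpeg bytes...", cred, device_key.public_key()))  # True
```

Note what the scheme proves and what it doesn’t: it proves the bytes and the declared claims haven’t changed since signing. It says nothing about content that was never signed.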

The standard is real. It has moved decisively past the white-paper stage.

The gap is just as real. Less than 1% of published content globally carries C2PA metadata, even among the most motivated publishers. Every major social media platform still strips provenance metadata on upload — breaking the chain at exactly the point where synthetic content does its damage. Apple, which controls the majority of smartphone photography in developed markets, is not in the C2PA coalition and has shown no public movement toward joining. And C2PA doesn’t detect AI-generated content — it only records what was declared at the moment of signing. A sufficiently motivated operator simply doesn’t sign their work.

Microsoft’s February 2026 Media Integrity and Authentication report put it directly: “no single method — C2PA provenance, watermarking, or fingerprinting — can prevent digital deception on its own.”

The infrastructure is being built. Whether it matures fast enough is the open question. Whether platforms comply with the EU deadline meaningfully or cosmetically will determine how much of the provenance chain actually holds. Whether Apple joins determines whether the largest category of photographic evidence — smartphone captures of protests, government actions, events that matter — carries verifiable provenance or doesn’t.

The piece needs one more honest accounting. There are things AI cannot compress to a solo operator, and the thesis should say so. Physical presence still requires a body. Long-term trust relationships still require time. Complex multi-stage autonomous attack sequences — per the same International AI Safety Report — still fail without human intervention at key decision points. The current AI-assisted solo operator isn’t omnipotent; they’re amplified. A capable developer amplified by AI. Not a novice elevated to genius.

The dangerous category is specifically digital institutional-scale harm — financial fraud, influence operations, data exfiltration, cyber infrastructure compromise — where the physical world doesn’t constrain the operation. That’s where the conspiracy surface area argument applies most cleanly, and where the detection infrastructure gap matters most acutely.


What would change my mind

One: A documented solo-operator AI attack of institutional scale is attributed and prosecuted within 18 months. That would suggest the detection mechanisms we already have — behavioral fingerprinting, statistical anomaly detection, platform-level AI monitoring — are more robust than the thesis assumes. The Arup deepfake remains unattributed. Law enforcement is demonstrably struggling. This falsification condition is currently unmet.

Two: Apple joins C2PA and major platforms commit to preserving provenance metadata on upload before the EU AI Act deadline of August 2, 2026. Partial movement is plausible. Full movement — the kind that would actually close the distribution gap — looks unlikely from where we are today.

Three: Evidence emerges that the VoidLink developer was a small team, not a solo operator. Check Point’s analysis is consistent with a single individual, but no definitive proof exists either way. If a team built it, the proof-of-concept weakens but the directional argument remains: whatever previously required a well-resourced threat group is now achievable at a dramatically smaller scale.

The old equilibrium was, in retrospect, fragile in an underappreciated way. We trusted that large-scale institutional harm required large-scale coordination, and that large-scale coordination eventually surfaces. The trust was largely justified, for decades. Grimes modeled it. History confirmed it.

What AI removes is not the possibility of detection. It removes the guarantee of it.

The detection mechanisms that replace conspiracy surface area — algorithmic fingerprinting, statistical anomaly analysis, cryptographic provenance — are real and improving. But they require deliberate construction. They don’t emerge automatically from the architecture of the operation the way human surface area did. They have to be built, deployed at scale, and maintained against adversarial pressure.

That’s a different kind of guarantee. Whether it’s enough is what we’re about to find out.


If you found this useful, the best thing you can do is forward it to one person who would push back on it. I’d rather be wrong in public than right in private.
