What we'd build if Anthropic gave us Claude Mythos

On April 7th, Anthropic disclosed Claude Mythos Preview. It scores 93.9% on SWE-bench Verified and 97.6% on USAMO, and over the past few weeks Anthropic has used it to find thousands of zero-day vulnerabilities in every major operating system and every major browser. It is, to put it bluntly, the most capable code-reasoning system that has ever existed. And then they put it in a box.

Mythos isn't going on the API. It's going to about fifty partners under a programme called Project Glasswing: Amazon, Apple, Cisco, CrowdStrike, the Linux Foundation, Microsoft, Palo Alto Networks. Defensive security work only. Anthropic decided the public release was too dangerous because the same model that finds bugs to fix also finds bugs to exploit.

That's the right call. It is also slightly funny, because while a frontier model is quietly fixing the kernel of every operating system on Earth, the average Shopify store is still being held together by seventeen apps that each inject their own JavaScript and nobody has audited any of it.

The gap between the headline and the reality

Anthropic's announcement is about the frontier of AI capability. The frontier of Shopify security is somewhere far behind. We see this every week. Exposed staging stores indexed by Google. Custom Liquid that interpolates user input into HTML without escaping. Third-party scripts loaded over HTTP from domains that no longer exist. API keys committed to public theme repos. Fulfillment apps with permissions they haven't needed since 2023.
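To show how shallow some of these checks are, here is a minimal sketch that flags two items from that list: scripts loaded over plain HTTP, and Liquid output of user-controlled values with no escape filter. The regexes, the list of "user input" objects, and the function name are our assumptions for illustration. A real audit would want a proper Liquid parser rather than pattern matching.

```python
import re

# Illustrative patterns only; these are assumptions, not a complete rule set.
# Matches <script ... src="http://..."> (plain HTTP, not HTTPS).
INSECURE_SCRIPT = re.compile(
    r'<script\b[^>]*\bsrc=["\']http://[^"\']+["\']', re.IGNORECASE
)
# Matches Liquid output tags such as {{ search.terms }} or {{- cart.note -}}.
LIQUID_OUTPUT = re.compile(r"\{\{-?\s*(.*?)\s*-?\}\}", re.DOTALL)

# Assumed shortlist of objects whose values a visitor can influence.
USER_INPUT_HINTS = ("search.terms", "cart.note", "customer.", "form.")
# Filters we treat as making the output safe for HTML.
SAFE_FILTERS = {"escape", "escape_once"}


def scan_template(source: str) -> list[str]:
    """Return human-readable findings for one theme file."""
    findings = []
    for match in INSECURE_SCRIPT.finditer(source):
        findings.append(f"insecure script: {match.group(0)}")
    for match in LIQUID_OUTPUT.finditer(source):
        expr = match.group(1)
        touches_input = any(hint in expr for hint in USER_INPUT_HINTS)
        # Everything after the first "|" is a filter chain; keep the names only.
        filters = {part.split(":")[0].strip() for part in expr.split("|")[1:]}
        if touches_input and not (filters & SAFE_FILTERS):
            findings.append(f"unescaped output: {expr}")
    return findings
```

Run over every file in a theme, something like this produces a noisy but useful first pass. It will miss anything assigned to a variable before output, which is exactly where a model that reads the code does better.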

None of this requires a frontier model to find. Most of it requires somebody to look. The hard part isn't capability, it's coverage. There are millions of Shopify stores and almost nobody is paid to look at any of them.

Which is exactly the problem an AI system is shaped to solve.

What we'd build with it

If Mythos showed up in our inbox tomorrow, this is what we'd point it at first.

1. A continuous Shopify store auditor

Run it once a week against a store. Read every snippet, every section, every custom Liquid file, every installed app, every webhook. Output a ranked list of: things that will break, things that will leak data, things that will slow the page, things that the founder is paying for and not using. Less of a security tool than a structural MRI for an entire commerce stack.
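As a rough sketch of what that ranked list could look like as data: the four categories come from the paragraph above, while the weights, field names, and scoring rule are our assumptions, not a settled design.

```python
from dataclasses import dataclass

# Assumed weights: data leaks outrank breakage, which outranks speed and spend.
CATEGORY_WEIGHT = {
    "leaks_data": 4,
    "will_break": 3,
    "slows_page": 2,
    "unused_spend": 1,
}


@dataclass(frozen=True)
class Finding:
    category: str      # one of the CATEGORY_WEIGHT keys
    location: str      # file, app, or webhook the finding refers to
    summary: str       # one-line description for the merchant
    confidence: float  # 0.0 to 1.0, how sure the auditor is


def rank(findings: list[Finding]) -> list[Finding]:
    """Sort findings so the most severe, most certain ones come first."""
    return sorted(
        findings,
        key=lambda f: CATEGORY_WEIGHT[f.category] * f.confidence,
        reverse=True,
    )
```

Multiplying severity by confidence is a deliberately crude choice. It keeps a shaky guess about a data leak from burying a near-certain breakage.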

2. The "is this app actually safe?" checker

Every Shopify app a merchant installs gets API access to store data, and often to customer data. Most merchants do not read the permissions. Most do not understand the permissions. A model that can read the public-facing JS of an app, watch what it phones home, and produce a one-paragraph "this is what this app does and here is what it can see" report would change how merchants make app decisions overnight. Right now the only equivalent is reading reviews and hoping.
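The "here is what it can see" half of that report can be approximated without a model at all. The sketch below turns a list of requested access scopes into one plain paragraph. The scope names follow Shopify's read_/write_ naming, but the descriptions, the table, and the function name are ours and deliberately incomplete.

```python
# Plain-language notes for a handful of access scopes.
# The wording is our own summary, not Shopify's official description.
SCOPE_NOTES = {
    "read_customers": "read customer records such as names, emails and addresses",
    "write_customers": "create and change customer records",
    "read_orders": "read order history, including what was bought and where it shipped",
    "write_orders": "create and change orders",
    "read_products": "read the product catalogue",
}


def describe_app(app_name: str, scopes: list[str]) -> str:
    """Build a one-paragraph summary of what an app is allowed to do."""
    known = [SCOPE_NOTES[s] for s in scopes if s in SCOPE_NOTES]
    unknown = [s for s in scopes if s not in SCOPE_NOTES]
    sentences = []
    if known:
        sentences.append(f"{app_name} can " + "; ".join(known) + ".")
    else:
        sentences.append(f"{app_name} requests no scopes from our table.")
    if unknown:
        # Never guess at a scope we have not described; surface it instead.
        sentences.append("Not in our table, review by hand: " + ", ".join(unknown) + ".")
    return " ".join(sentences)
```

What this cannot tell you is what the app actually does with that access, which is the part that needs something watching the traffic.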

3. A theme code reviewer that actually understands Shopify

Every other code reviewer treats Liquid like an exotic language. A frontier model with this kind of code reasoning could enforce conventions, flag accessibility regressions, find anti-patterns, and rewrite slow loops without anyone needing to learn the rules first. Imagine a junior developer with the tooling of a principal engineer, on every PR, every time.

4. The pitch surface scanner

For us specifically: when we go to pitch a brand, we already do a manual audit. A model with this capability could read their entire storefront, find the technical debt, the broken redirects, the cannibalised meta titles, the apps doing nothing, and produce the audit before the first call. It's the audit we already write, but in two minutes instead of two days, and probably more thorough.
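One slice of that audit, the cannibalised meta titles, is simple enough to sketch. Assuming a crawl has already produced a mapping of URL to page title (the crawl itself is out of scope here, and the function name is hypothetical), grouping pages that share a title looks roughly like this:

```python
from collections import defaultdict


def find_duplicate_titles(titles_by_url: dict[str, str]) -> dict[str, list[str]]:
    """Group URLs whose titles match after lowercasing and collapsing whitespace."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url, title in titles_by_url.items():
        normalised = " ".join(title.lower().split())
        groups[normalised].append(url)
    # Only titles used by more than one page compete with each other.
    return {title: sorted(urls) for title, urls in groups.items() if len(urls) > 1}
```

The model's job is everything around this: deciding which of the duplicates should win, and writing that up in language a founder will read.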

The honest part

None of this requires Mythos. Opus 4.6 can do most of it today, slower and less reliably. The reason we're writing about Mythos is that the gap between what frontier models can do and what most ecommerce stacks actually use is getting absurd. We're building a part of the Internet where the tool that runs your business has more capability hidden in its API than you will ever ship.

That gap is the opportunity. It is also the thing that should make every Shopify founder slightly nervous. The same systems that can find and fix your problems can also find them for someone else, who has no reason to tell you, because nobody on your side is looking.

Look.
