/api/preview route with the token in the URL instead of a signed cookie. The Anthropic leak is your CMS, with a bigger blast radius.
## The four CMS misconfigurations that cause 80% of "AI lab" leaks
We've reviewed CMS setups for 40+ Indian firms in the last year. These four show up in nearly every audit, and they're the exact pattern that bit Anthropic.
In the commands below, replace yourcompany.com with your own domain.
### Check 1: directory listing on your media folder
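```shell
# If any of these prints a match, the folder serves an HTML listing of files and you're exposed.
curl -s https://yourcompany.com/wp-content/uploads/2026/03/ | grep -i "index of"
curl -s https://yourcompany.com/uploads/ | grep -i "<title>Index of"
# Match the listing title here too: grepping for "<a href" would flag any page that contains a link.
curl -s https://yourcompany.com/assets/ | grep -i "<title>Index of"
```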
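To run Check 1 over several folders in one go, the curl-and-grep test can be wrapped in a small loop. This is a minimal sketch, not a full scanner: the function names `looks_like_listing` and `check_listings` are made up for illustration, and it assumes the server emits the stock "Index of" page title that Nginx and Apache generate by default.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not from any tool.

# Succeeds (exit 0) when the HTML on stdin looks like an auto-generated
# directory index, i.e. its <title> starts with "Index of".
looks_like_listing() {
  grep -qiE '<title>[[:space:]]*index of'
}

# Fetch each path under the base URL and report whether it is a listing.
check_listings() {
  base="$1"
  shift
  for rel in "$@"; do
    if curl -s --max-time 10 "${base}${rel}" | looks_like_listing; then
      echo "EXPOSED ${base}${rel}"
    else
      echo "ok ${base}${rel}"
    fi
  done
}

# Usage (replace the domain and paths with your own):
# check_listings https://yourcompany.com /wp-content/uploads/ /uploads/ /assets/
```

A custom listing theme with a different title would slip past this check, so treat "ok" as "no default listing found", not as proof the folder is private.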
### Check 2: your S3 bucket is listable
```shell
# Replace with your actual bucket. If you get an XML listing, you're exposed.
curl -s https://your-bucket-name.s3.amazonaws.com/
curl -s https://your-bucket-name.s3.ap-south-1.amazonaws.com/
# Also try the path style, which some clients use:
curl -s https://s3.ap-south-1.amazonaws.com/your-bucket-name/
```
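If you are checking more than one bucket, it helps to label each response instead of reading raw XML. The sketch below classifies the body of an anonymous request by the markers S3 puts in its responses (`<ListBucketResult>` for a listing, `<Code>AccessDenied</Code>` or `<Code>NoSuchBucket</Code>` inside an error). The function name `classify_s3_body` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: classify the body of an anonymous GET on a bucket URL (stdin).
classify_s3_body() {
  body=$(cat)
  case "$body" in
    *"<ListBucketResult"*)          echo "LISTABLE" ;;
    *"<Code>AccessDenied</Code>"*)  echo "denied" ;;
    *"<Code>NoSuchBucket</Code>"*)  echo "no-such-bucket" ;;
    *)                              echo "unknown" ;;
  esac
}

# Usage:
# curl -s https://your-bucket-name.s3.ap-south-1.amazonaws.com/ | classify_s3_body
```

Note that "denied" only means the bucket cannot be listed. Individual objects can still be public-read, so a known or guessable object URL may still work.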
### Check 3: draft routes are accessible without auth
```shell
# Common draft URL patterns; test each:
curl -s -o /dev/null -w "%{http_code}\n" https://yourcompany.com/api/preview
curl -s -o /dev/null -w "%{http_code}\n" https://yourcompany.com/admin
curl -s -o /dev/null -w "%{http_code}\n" https://yourcompany.com/wp-admin/admin-ajax.php
curl -s -o /dev/null -w "%{http_code}\n" https://yourcompany.com/_next/data/
```
A 401 or 404 is fine. A 200 with content, or a 403 on a path that robots.txt flags with a "Disallow" line (which confirms the path is real), is where you'll find the leak.
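One way to triage Check 3 is to pull the Disallow paths out of robots.txt and probe each one. The sketch below assumes that 401 and 404 are safe and that 200 and 403 deserve a closer look. Both function names (`triage_status`, `disallowed_paths`) are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: map an HTTP status code to a triage label.
triage_status() {
  case "$1" in
    200)             echo "REVIEW: returns content, open it and look" ;;
    401|404)         echo "ok" ;;
    403)             echo "REVIEW: path exists, compare with robots.txt" ;;
    301|302|307|308) echo "follow: check where the redirect lands" ;;
    *)               echo "unknown" ;;
  esac
}

# Print the paths listed in Disallow lines of a robots.txt body (stdin).
# Trailing comments and empty Disallow values are dropped.
disallowed_paths() {
  grep -i '^[[:space:]]*disallow:' \
    | sed -e 's/^[^:]*:[[:space:]]*//' -e 's/[[:space:]#].*$//' \
    | grep -v '^$'
}

# Usage: probe every path robots.txt asks crawlers to avoid.
# base=https://yourcompany.com
# curl -s "$base/robots.txt" | disallowed_paths | while read -r p; do
#   code=$(curl -s -o /dev/null -w '%{http_code}' "$base$p")
#   echo "$p $code $(triage_status "$code")"
# done
```

Disallow values can contain wildcards such as `*` or `$`. Those are crawler patterns, not real URLs, so skip them or expand them by hand before probing.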
### Check 4: Google has indexed something you didn't want
Search Google for `site:yourcompany.com inurl:draft`, `site:yourcompany.com filetype:pdf`, and `site:yourcompany.com "internal"`. If pages tagged "draft" or "do not publish" show up in the results, they're indexed. Anthropic's drafts were almost certainly findable this way.
## The fix — a 6-step checklist
- Set your S3/GCS/Azure Blob storage bucket policy to "block all public access". Move public assets to a separate bucket explicitly meant to be public, with a CDN in front.
- For Sanity, Contentful, Strapi: rotate preview tokens monthly. Require auth on the preview route — don't trust the token alone. Use signed, short-lived URLs.
- Disable directory listing in Nginx (`autoindex off;`) and Apache (`Options -Indexes`). Verify with curl after the change.
- Move every "draft" or "preview" route behind a real authenticated session, using the same auth as your admin panel. No path obfuscation, no token-in-URL.
- Add a `robots.txt` that doesn't reveal sensitive paths (don't list `/admin-secret/` there; it tells attackers exactly where to look).
- Set up Google Search Console for your domain. Subscribe to "Crawled - currently not indexed" alerts. If anything sensitive appears, file a removal request the same day.
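For the S3 item in the checklist, the AWS CLI can report a bucket's Block Public Access settings via `aws s3api get-public-access-block`. The sketch below checks that all four flags are on. The helper name `fully_blocked` is hypothetical, and it assumes the CLI's default JSON output.

```shell
#!/usr/bin/env bash
# Sketch only: succeeds when all four Block Public Access flags are true in
# the JSON printed by `aws s3api get-public-access-block` (read from stdin).
fully_blocked() {
  json=$(cat)
  for key in BlockPublicAcls IgnorePublicAcls BlockPublicPolicy RestrictPublicBuckets; do
    printf '%s' "$json" \
      | grep -Eq "\"$key\"[[:space:]]*:[[:space:]]*true" || return 1
  done
  return 0
}

# Usage:
# aws s3api get-public-access-block --bucket your-bucket-name | fully_blocked \
#   && echo "blocked" || echo "NOT fully blocked"
```

If the bucket has no Block Public Access configuration at all, the CLI prints an error and nothing on stdout, so the check fails closed and reports "NOT fully blocked".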
If your site serves drafts at /preview/[hash] and assumes nobody will guess the hash, you're trusting an attacker not to enumerate. Anthropic's exposed URLs were ~3,000, well within the range a researcher's automated crawler will hit. URL secrecy is not access control. Use real auth.

/api/preview route accessible with the token and no IP allow-list; their AWS S3 bucket for course PDFs set to public-read. Total exposure: every draft course module for the next 6 weeks, plus 280 PDFs they'd marked "internal use." Fix: 4 working hours, ₹16,000 invoice, plus a 30-minute training for the team on Sanity preview-mode auth. The Sanity docs they'd read 18 months ago had been updated; they hadn't checked.
For more on how seemingly boring CMS bugs become embarrassing leaks, our founder writes about [security for fast-moving startup teams](https://viveksinra.com/blog): same pattern, more case studies.
## FAQ
### Was the Anthropic Mythos leak actually a hack?
No. It was a CMS misconfiguration where draft assets were public by default. Anthropic confirmed this to Fortune, calling it "human error" with an external CMS tool. No vulnerability was exploited; no system was bypassed. Researchers Roy Paz (LayerX Security) and Alexandre Pauwels (Cambridge) found the assets via standard reconnaissance.
### What does "public by default" mean in a CMS?
When you upload an asset (image, PDF, video) to a CMS, the system has to choose: is this asset public on the internet, or only accessible to logged-in editors? Many CMSes default to public — because most content eventually becomes public — which means a draft uploaded "for review" is already live the moment it hits the storage layer.
### How do I check if my WordPress install has this bug?
Try yourdomain.com/wp-content/uploads/ in a browser. If you see a directory listing, disable it. Then, while logged out, try yourdomain.com/?p=ID across a range of numeric post IDs rather than only high ones like 99999; if your drafts have low IDs, they may render even though they're unpublished. Install Wordfence or similar and enable "Hide WordPress version" plus directory protection.
### Is Sanity safe if I'm careful with preview tokens?
Safer than WordPress, riskier than a fully auth-gated CMS. Sanity's preview pattern relies on a shared secret in the URL. Treat the token like a password: rotate quarterly, never commit to git, never paste in Slack, and require auth on top of the token (Vercel team SSO, Cloudflare Access, etc.).
### How does Google end up indexing my drafts?
Three ways: a developer or editor accidentally shares the draft URL externally (Slack, email — referer headers leak it); a sitemap.xml file includes draft slugs; or a third-party preview service like Sanity's hosted preview UI itself is crawlable. Always set drafts to noindex via headers — even behind auth — as defense in depth.
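Two of these are easy to spot-check from a terminal. The sketch below tests whether a response carries an `X-Robots-Tag: noindex` header and lists sitemap entries whose URL mentions "draft" or "preview". The function names are illustrative, and the keyword match assumes your draft slugs contain one of those two words.

```shell
#!/usr/bin/env bash
# Sketch only: succeeds when the response headers on stdin include an
# X-Robots-Tag header that contains "noindex".
has_noindex() {
  grep -qiE '^x-robots-tag:.*noindex'
}

# Print sitemap <loc> URLs (stdin) whose address mentions "draft" or "preview".
draft_urls_in_sitemap() {
  grep -oiE '<loc>[^<]*(draft|preview)[^<]*</loc>' \
    | sed -e 's/<[^>]*>//g'
}

# Usage:
# curl -sI https://yourcompany.com/some-draft-page | has_noindex || echo "missing noindex"
# curl -s https://yourcompany.com/sitemap.xml | draft_urls_in_sitemap
```

An empty result from the sitemap check only rules out the obvious slugs. If your CMS names drafts differently, adjust the pattern.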
### Are these CMS misconfigs reportable under DPDP?
If the leaked content contains personal data — yes. India's Digital Personal Data Protection Act requires breach notification for material exposure of personal data. A misconfigured S3 bucket exposing customer KYC PDFs is a notifiable event, even if no "attacker" downloaded them. Treat misconfig findings as breaches until proven otherwise.
### What's the cheapest tool to scan for these issues continuously?
We run a combination of subfinder, httpx, and nuclei (all open-source, free) weekly against client domains. Total cost: a t3.micro EC2 instance running cron. Findings get emailed to a security inbox. The setup takes a half-day; we wrote about it in our internal runbook. Reach out if you want the bash scripts.
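For a rough idea of what such a pipeline looks like, here is a minimal sketch. It is not our runbook script. It assumes subfinder, httpx, and nuclei are installed and on PATH, and the function name `run_scan`, the tag selection, the file paths, and the `mail` command are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: enumerate subdomains, keep the ones that answer over HTTP,
# then run exposure and misconfiguration templates against them.
run_scan() {
  domain="$1"
  subfinder -d "$domain" -silent \
    | httpx -silent \
    | nuclei -silent -tags exposure,misconfig
}

# Usage, e.g. from a script that cron runs weekly (paths and mailer are
# placeholders; any mailer works):
# run_scan yourcompany.com > /tmp/findings.txt
# [ -s /tmp/findings.txt ] && mail -s "Weekly scan findings" security@yourcompany.com < /tmp/findings.txt
#
# Example crontab entry (Mondays at 03:00):
# 0 3 * * 1 /opt/scan/weekly.sh
```

Only scan domains you own or have written permission to test.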
Want a CMS / Storage Exposure Scan?
We run a one-day external scan of your CMS, S3/GCS buckets, preview routes, and Google index for accidentally-public content. Deliverable: a one-page findings report, severity-ranked, with copy-paste fix instructions. Fixed scope ₹25,000 for under 20 employees, ₹45,000 for 20-200. Suitable if you run WordPress, Sanity, Contentful, Strapi, Next.js + headless CMS, or a custom admin.
