Anubis
I’ve always relied on simple, practical defenses for web services — basic rate limiting, HTTP headers, and cautious CORS rules. But now we’re in the era of AI, and small websites are facing new challenges they’ve never had to deal with before.
Scraping used to be obvious. It usually came from a single IP or a small cluster of addresses within the same network range, often hammering your endpoints without delay. Blocking could be simple: identify the subnet, drop it with a firewall rule, and move on.
But now, the crawlers feeding large language models and training datasets operate through distributed scraping pipelines — thousands of nodes spread across residential proxies, cloud functions, and edge servers. Each one behaves “politely,” sending low-frequency, well-formed requests that look identical to those of legitimate human clients. Yet they are not well-behaved: they ignore your robots.txt, your User-Agent blocks, and your X-Robots-Tag headers, and they will scrape your site until it falls over — and then scrape it again after it comes back online[1][2].
Even search engine bots at least give you visitors. AI scrapers, on the other hand, make money from your content and give you nothing in return.
Recently, I noticed several open-source projects starting to adopt a new tool called Anubis to fight back against AI scraping. It’s already being used by FFmpeg, GNOME, Arch Linux, Codeberg, and others[3].
Anubis is like a CAPTCHA, but flipped. Instead of checking whether a visitor is human, it makes web crawling computationally expensive for companies trying to feed their hungry LLMs. It forces each client to perform a small proof-of-work challenge before any request can reach your app. For real users, the process is transparent and fast; for automated crawlers, it introduces meaningful cost.
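To make that cost concrete, here is a minimal sketch of a hashcash-style proof of work in browser JavaScript. This is illustrative only, not Anubis’s actual scheme: the client searches for a nonce such that hashing the server’s challenge plus the nonce yields a required number of leading zero digits.

// Illustrative hashcash-style proof of work (not Anubis's real code):
// find a nonce so that SHA-256(challenge + nonce) starts with
// `difficulty` zero hex digits.
const enc = new TextEncoder();

async function solve(challenge, difficulty) {
  for (let nonce = 0; ; nonce++) {
    const buf = await crypto.subtle.digest('SHA-256', enc.encode(challenge + nonce));
    const hex = [...new Uint8Array(buf)]
      .map(b => b.toString(16).padStart(2, '0')).join('');
    if (hex.startsWith('0'.repeat(difficulty))) return nonce;
  }
}

// Milliseconds for one visitor at low difficulty; multiplied across
// millions of fresh sessions, it becomes a real bill for a scraper fleet.
solve('server-issued-challenge', 4).then(nonce => console.log('solved:', nonce));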
Setting up Anubis is straightforward: it sits behind your reverse proxy and in front of your app. In this post, I’ll walk through my experiment using Anubis to secure my API.
Since I only want legitimate browser users to access my API, I had to consider how the frontend communicates with the backend. APIs don’t interact directly with humans; they’re called by frontend JavaScript.
To solve this, I designed a setup where the frontend pre-solves the Anubis challenge through a hidden iframe:
- The iframe loads a protected endpoint (e.g. localhost:8081), solving the Anubis challenge silently in the background.
- Once solved, Anubis returns credentials as a cookie, which the frontend JavaScript cannot directly access.
- Later, when the frontend makes API calls, it uses credentials: 'include' in the fetch() request so the cookie is automatically attached.
Here’s a simplified diagram:
frontend  http://frontend.127.0.0.1.sslip.io:5000
  └─► hidden iframe src="http://localhost:8081"
      (solves the Anubis challenge and stores the cookie for localhost)
                ↓
      subsequent API request (cookie header included)
                ↓
      proxy   nginx :8081 (published on localhost:8081)
                ↓
      Anubis  :8080
                ↓
      app     nginx :80 (static site + /api)
Anything that isn’t a real browser will typically fail to access the API, since it won’t have the credentials proving it’s a verified client.
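Once the stack below is running, you can sanity-check enforcement from the command line. A bare HTTP client has no cookie jar and no JavaScript engine, so it should receive Anubis’s challenge interstitial rather than the API’s JSON (exact output depends on your Anubis version):

# Expect the challenge page, not {"message": ...}
curl -i http://localhost:8081/api/hello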
Configuration
docker-compose.yaml
version: '2'
services:
  proxy:
    image: nginx
    container_name: proxy
    ports:
      - "8081:8081"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - anubis
    networks:
      - net

  anubis:
    image: ghcr.io/techarohq/anubis:latest
    environment:
      BIND: ":8080"
      DIFFICULTY: "4"
      TARGET: "http://app"
      POLICY_FNAME: "/data/cfg/botPolicy.yaml"
      OG_PASSTHROUGH: "true"
      OG_EXPIRY_TIME: "24h"
    ports:
      - "8080:8080"
    volumes:
      - "./botPolicy.yaml:/data/cfg/botPolicy.yaml:ro"
    networks:
      - net

  app:
    image: nginx
    container_name: app
    volumes:
      - ./www:/usr/share/nginx/html:ro
      - ./app.nginx.conf:/etc/nginx/nginx.conf:ro
    networks:
      - net

  frontend:
    image: nginx
    container_name: frontend
    ports:
      - "5000:5000"
    volumes:
      - ./frontend/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./frontend/www:/usr/share/nginx/html:ro
    networks:
      - net

networks:
  net:
    driver: bridge
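Once the remaining config files below are in place, the whole stack starts with a single command; the frontend is then reachable at http://frontend.127.0.0.1.sslip.io:5000 and the Anubis-protected proxy at http://localhost:8081.

docker compose up -d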
nginx.conf (proxy → Anubis)
events {}

http {
    upstream anubis_upstream {
        server anubis:8080;
    }

    server {
        listen 8081;
        server_name localhost;

        location / {
            # Forward all requests to Anubis
            proxy_pass http://anubis_upstream;
            proxy_http_version 1.1;

            # Preserve important request metadata
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}
botPolicy.yaml
bots:
  - name: challenge-everything
    description: "Temporary rule to confirm Anubis enforcement"
    action: CHALLENGE
    challenge:
      difficulty: 5  # 16 would be effectively impossible
      report_as: 5   # lie to the operator about the difficulty
      algorithm: fast
    # path_regex: ^/api
    expression:
      all:
        - "true"
app.nginx.conf (simple backend app)
events {}

http {
    server {
        listen 80;
        server_name localhost;
        root /usr/share/nginx/html;
        index index.html;

        # CORS policy for the frontend origin. Declared at server level so
        # that /api/hello inherits it too: add_header directives in one
        # location are not inherited by sibling locations.
        add_header 'Access-Control-Allow-Origin' 'http://frontend.127.0.0.1.sslip.io:5000' always;
        add_header 'Access-Control-Allow-Credentials' 'true' always;
        add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS' always;
        add_header 'Access-Control-Allow-Headers' 'Content-Type, Authorization' always;

        # Short-circuit CORS preflight requests
        if ($request_method = OPTIONS) { return 204; }

        location / {
            try_files $uri /index.html;
        }

        # Simple test API
        location /api/hello {
            default_type application/json;
            return 200 '{"message":"Hello from backend app"}';
        }
    }
}
frontend/nginx.conf
events {}

http {
    server {
        listen 5000;
        server_name frontend.127.0.0.1.sslip.io;
        root /usr/share/nginx/html;
        index index.html;
    }
}
frontend/www/index.html
<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8" />
  <title>Anubis CORS/XHR Test</title>
</head>
<body>
  <h1>Frontend Page (Cross-Origin API Test)</h1>

  <!-- Hidden iframe pre-solves the Anubis challenge and stores the cookie -->
  <iframe src="http://localhost:8081/" style="display:none;"></iframe>

  <button id="callApi">Call Backend /api/hello</button>
  <pre id="result">Result will appear here...</pre>

  <script>
    document.getElementById('callApi').onclick = async () => {
      // credentials: 'include' attaches the Anubis cookie to the request
      const res = await fetch('http://localhost:8081/api/hello', {
        method: 'GET',
        credentials: 'include'
      });
      const text = await res.text();
      document.getElementById('result').textContent =
        `${res.status} ${res.statusText}\n\n${text}`;
    };
  </script>
</body>
</html>
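One wrinkle with this page: the button can race the hidden iframe, and a click before the challenge finishes will hit the interstitial instead of the API. A small hypothetical helper that retries until the cookie is in place:

// Hypothetical helper: retry until the Anubis cookie is set, since the
// hidden iframe may still be solving the challenge on first click.
async function callWithRetry(url, attempts = 5, delayMs = 1000) {
  for (let i = 0; i < attempts; i++) {
    const res = await fetch(url, { method: 'GET', credentials: 'include' });
    if (res.ok) return res;
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  throw new Error(`API still challenged after ${attempts} attempts`);
}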
Anubis vs Cloudflare Turnstile
Compared to Cloudflare Turnstile, Anubis doesn’t require any external verification requests. Turnstile’s design expects a user to solve the challenge in the browser, generate a token and then have your backend verify it through Cloudflare’s /siteverify API endpoint[4].
That’s fine for interactive forms, but it doesn’t scale to API-first architectures — where the backend has no user-facing interface and where constant verification round-trips to Cloudflare add latency and cost.
Anubis, on the other hand, validates everything locally. The proof-of-work challenge and verification happen entirely on your infrastructure — no external API calls, no third-party dependency, no telemetry. It’s open source, self-contained, and faster by design.
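For contrast, here is roughly what the Turnstile server side looks like: every token the browser widget produces has to be POSTed to Cloudflare’s /siteverify endpoint before your backend can trust the request[4].

// Turnstile-style verification: one round trip to Cloudflare per token,
// on every protected request your backend handles.
async function verifyTurnstile(token, secret) {
  const res = await fetch('https://challenges.cloudflare.com/turnstile/v0/siteverify', {
    method: 'POST',
    body: new URLSearchParams({ secret, response: token }),
  });
  const data = await res.json();
  return data.success === true;
}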
Community Reactions and Criticism
There’s still criticism around Anubis. Some users say it annoys visitors, while others argue that powerful AI companies can easily run JavaScript engines to solve its challenges — and, yes, that’s already happened[5][6][7].
The author of Anubis doesn’t work on it full-time and has openly discussed the financial and mental strain of maintaining open-source projects. But this is an arms race — just like youtube-dl vs YouTube. There will always be volunteers keeping the project alive, though the lag between breakage and patches will continue to frustrate users.
[1] https://xeiaso.net/blog/2025/anubis/
[2] https://www.theregister.com/2025/07/09/anubis_fighting_the_llm_hordes/
[3] https://anubis.techaro.lol/docs/user/known-instances
[4] https://developers.cloudflare.com/turnstile/get-started/server-side-validation/
[5] https://news.ycombinator.com/item?id=44914773
[6] https://news.ycombinator.com/item?id=43866626
[7] https://news.ycombinator.com/item?id=45787775