I take my shitposts very seriously.
- 0 Posts
- 7 Comments
Correct. Anubis’ goal is to decrease the web traffic that hits the server, not to prevent scraping altogether. I should also clarify that this works because it costs the scrapers time with each request, not because it bogs down the CPU.
It assumes that we sites are under constant ddos
It is literally happening. https://www.youtube.com/watch?v=cQk2mPcAAWo https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/
It assumes that anubis is effective against ddos attacks
It’s being used by some little-known entities like the LKML, FreeBSD, SourceHut, UNESCO, and the fucking UN, so I’m assuming it probably works well enough. https://policytoolbox.iiep.unesco.org/ https://xeiaso.net/notes/2025/anubis-works/
anti-AI wave
Oh, you’re one of those people. Enough said. (edit) By the way, Anubis’ author seems to be a big fan of machine learning and AI.
(edit 2 just because I’m extra cross that you don’t seem to understand this part)
Do you know what a web crawler does when a process finishes grabbing the response from the web server? Do you think it takes a little break to conserve energy and let all the other remaining processes do their thing? No, it spawns another bloody process to scrape the next hyperlink.
- A web server that can’t discriminate between a request made by a human and one made by a machine has to handle all requests. It may not be an issue for large companies like Amazon or Microsoft, but small websites will suffer timeouts and outages.
- Without a locally hosted solution like Anubis, small websites would have to move behind a large centralized service like Cloudflare.
- Otherwise they might not be able to continue operating and only large corporate-backed services like Twitter and Reddit would survive.
The alternative is having to choose between Reddit and Cloudflare. Does that look “free” and “open” to you?
Anubis is a simple anti-scraper defense that weighs a web client’s soul by giving it a tiny proof-of-work workload (some calculation that doesn’t have an efficient solution, like cryptography) before letting it pass through to the actual website. The workload is insignificant for human users, but very taxing for high-volume scrapers. The calculations are done on the client’s side using Javascript code.
(edit) For clarification: this works because the computation workload takes a relatively long time, not because it bogs down the CPU. Halting each request at the gate for only a few seconds adds up very quickly.
Recently, the FSF published an article that likened Anubis to malware because it’s basically arbitrary code that the user has no choice but to execute:
[…] The problem is that Anubis makes the website send out a free JavaScript program that acts like malware. A website using Anubis will respond to a request for a webpage with a free JavaScript program and not the page that was requested. If you run the JavaScript program sent through Anubis, it will do some useless computations on random numbers and keep one CPU entirely busy. It could take less than a second or over a minute. When it is done, it sends the computation results back to the website. The website will verify that the useless computation was done by looking at the results and only then give access to the originally requested page.
Here’s the article, and here’s aussie linux man talking about it.
Yes Man would never refuse to help you commit tax fraud.