But it would also mean that the Internet Archive is illegal, even though they don’t profit. If scraping the internet is a copyright violation, then they are as guilty as Anthropic.
IA doesn’t make any money off the content. Not that LLM companies do, but that’s what they’d want.
Do you think that would rescue the IA from the type of people who made the IA already pull 300k books?
No. But going after LLMs won’t make the situation for IA any worse, not directly anyway.
If the courts decide that scraping is illegal, IA can close up shop.
In the worst case they could move to a voluntary model, since they don’t profit from it. They could institute a “robots.txt”-style protocol where sites signal opt-in intent to be scraped by the archive.
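To make the opt-in idea concrete, here is a rough sketch of how a crawler might check for explicit consent. Everything specific in it is an assumption for illustration: the agent name `hypothetical-archive-bot` and the rule that only an explicit `Allow: /` counts as consent are made up, not anything the Internet Archive has proposed. Note that this flips normal robots.txt semantics, where silence means "allowed".

```python
def has_opted_in(robots_text: str, agent: str = "hypothetical-archive-bot") -> bool:
    """Return True only if a group naming `agent` explicitly contains 'Allow: /'.

    Assumption: unlike regular robots.txt, a missing or silent file means NO consent.
    """
    in_group = False        # inside a group that names our agent?
    reading_agents = False  # still in the run of User-agent lines opening a group?
    for raw in robots_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                in_group = False  # a new group starts here
                reading_agents = True
            if value.lower() == agent:
                in_group = True
        else:
            reading_agents = False
            if in_group and field == "allow" and value == "/":
                return True
    return False
```

A site that says nothing, or only has a wildcard group, would not be archived under this scheme; only a site that names the (hypothetical) archive agent and allows `/` would be.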
Yeah, that might work, but what will happen to all the data they currently store?
I would imagine someone would still need to actually sue the Internet Archive for this to be a problem for them. The vast majority probably won’t care, and they’ll likely just have to deal with whatever the equivalent of a DMCA takedown notice is for them.