Lemmy may be heading down the path of LLMs

ell1e@leminal.space · edited 15 hours ago

ell1e@leminal.space · edited 4 hours ago

Then the PR can be evaluated, rejected if it’s nonfree or just poor quality

I don’t get the difficulty of rejecting “if it’s nonfree or just poor quality or known LLM code”. I don’t think it’s a vague criterion.

And for many projects, if you admit it’s from a StackOverflow post, unless you can show it’s not a direct copy they will reject it as well. This isn’t commonly taken as incentivizing people to lie.

Now whether you think LLMs are worth the trouble to use is a different discussion, but the enforcement point doesn’t convince me.

There is also a responsibility and liability question here. If something turns out to be a copyright issue and the contributor skirted a known rule, the moral judgement may look different than if the maintainers knew and included it anyway. (I can’t comment on the legal outcomes since I’m not a lawyer.)

Rentlar@lemmy.ca · 5 hours ago

To be specific, the jump you are making is likening LLM output to non-free code. While that makes sense on the surface, it’s much closer to making stuff based on copied code. In the US at least, there’s clear legal precedent that LLM fabrications are not copyrightable.

Blanket AI bans are enforceable; I’m not arguing against that. I just don’t think one is worth instituting, because it’s not a good fit for this project. My argument is that a Lemmy development policy of “please mark which parts of your code are AI-generated and how you used LLMs, and we will evaluate accordingly” is better than “if you indicate anywhere that your code is AI/LLM-generated, we will automatically reject it”.

ell1e@leminal.space · edited 4 hours ago

My opinion is that the data disagrees with you:

1. https://www.psu.edu/news/research/story/beyond-memorization-text-generators-may-plagiarize-beyond-copy-and-paste
2. https://dl.acm.org/doi/10.1145/3543507.3583199
3. https://www.sciencedirect.com/science/article/pii/S2949719123000213#b7
4. https://www.theatlantic.com/technology/2026/01/ai-memorization-research/685552/
5. Related high profile incident that is very telling: https://www.pcgamer.com/software/ai/microsoft-uses-plagiarized-ai-slop-flowchart-to-explain-how-github-works-removes-it-after-original-creator-calls-it-out-careless-blatantly-amateuristic-and-lacking-any-ambition-to-put-it-gently/

In the US at least, there’s clear legal precedent that LLM fabrications are not copyrightable.

I see many people doubt that this says anything about the copyright of the training data, as opposed to the copyright an AI user can claim on the output.

This isn’t legal advice, I’m not a lawyer.

Rentlar@lemmy.ca · 3 hours ago

I don’t mean in any way to imply that your opinion isn’t sound, but simply that I don’t agree with it here in the context of whether or not the Lemmy devs should accept PRs with any reported LLM usage.