Sadly, it seems like Lemmy is going to integrate LLM code going forward: https://github.com/LemmyNet/lemmy/issues/6385 If you comment on the issue, please try to make sure it’s a productive and thoughtful comment and not pure hate brigading.
Edit: perhaps I should also mention this one here as a similar discussion: https://github.com/sashiko-dev/sashiko/issues/31 This one concerns the Linux kernel. I hope you’ll forgive me this slight tangent, but more eyes could benefit this one too.


In my opinion, this argument is exactly the same as saying “we can’t enforce people not stealing GPL-licensed code and copy&pasting it into our project, so we might as well allow it and ask them to disclose it.”
You can try to argue AI may actually be useful, which seems like what they did, and that would more fairly inform a policy in my opinion. I think your argument doesn’t.
Yeah, and on top of that, all the reasons why we hate AI.
My argument is that a total ban on AI use is more comparable to saying “Code from any other coding project is not allowed”. It will start unproductive arguments over boilerplate, struct definitions and other commonly used code.
The broadness and vagueness of “no AI whatsoever” or “no code from any other projects whatsoever” will be more confusing than saying, “if you do copy any code from another project, let us know where from”. Then the PR can be evaluated, and rejected if it’s nonfree or just poor quality, rather than incentivizing people to pretend other people’s code is their own, risking bigger consequences for the whole project. People can be honest if they got inspiration from stackoverflow, a reference book, or another project, if they are allowed to be.
I’m not saying AI should be blanket allowed; the submitter needs to understand the code well enough to revise it for errors themselves if the devs point something out. They can’t just say “I asked AI and it’s confident that the code does this and is bug free”.
I don’t get the difficulty of rejecting “if it’s nonfree or just poor quality or known LLM code”. I don’t think it’s a vague criterion.
And for many projects, if you admit it’s from a StackOverflow post, unless you can show it’s not a direct copy they will reject it as well. This isn’t commonly taken as incentivizing people to lie.
Now whether you think LLMs are worth the trouble to use is a different discussion, but the enforcement point doesn’t convince me.
There is also a responsibility and liability question here. If something turns out to be a copyright issue and the contributor skirted a known rule, the moral judgement may look different than if you knew and included it anyway. (I can’t comment on the legal outcomes since I’m not a lawyer.)
To be specific, the jump you are making is likening LLM output to non-free code. While that makes sense on the surface, it’s much closer to making stuff based on copied code. In the US at least, there’s clear legal precedent that LLM fabrications are not copyrightable.
Blanket AI bans are enforceable; I’m not arguing against that. I just don’t think one is worth instituting, or that it’s a good fit for this project. My argument is that a Lemmy development policy of “please mark which parts of your code are AI-generated and how you used LLMs, and we will evaluate accordingly” is better than “if you indicate anywhere that your code is AI/LLM-generated, we will automatically reject it”.
My opinion is that the data disagrees with you:
1. https://www.psu.edu/news/research/story/beyond-memorization-text-generators-may-plagiarize-beyond-copy-and-paste
2. https://dl.acm.org/doi/10.1145/3543507.3583199
3. https://www.sciencedirect.com/science/article/pii/S2949719123000213#b7
4. https://www.theatlantic.com/technology/2026/01/ai-memorization-research/685552/
5. Related high profile incident that is very telling: https://www.pcgamer.com/software/ai/microsoft-uses-plagiarized-ai-slop-flowchart-to-explain-how-github-works-removes-it-after-original-creator-calls-it-out-careless-blatantly-amateuristic-and-lacking-any-ambition-to-put-it-gently/
I see many people doubt that this says anything about the copyright of the training data, as opposed to the copyright an AI user can claim.
This isn’t legal advice, I’m not a lawyer.
I don’t mean in any way to imply that your opinion isn’t sound, but simply that I don’t agree with it here in the context of whether the Lemmy devs should accept PRs with any reported LLM usage.