The vibe-coding craze rather uniquely combines a whole bundle of moral, ethical, quality, and legal issues all at once, though. That feels like a good enough reason to make an exception for it.
Maintainers or contributors of packages would check this, just like they check the license.
Since this work is done by volunteers, all of that is best effort, so the field can be left empty when unsure. But for packages where we know, we can add that information.
I hope NixOS users who rant on social media about LLM use in software also make PRs to add the information.
That's why I created this discussion instead of letting it happen in the chardet update PR. We should discuss it with the whole community and create a clear policy (if needed).
I think there is a fundamental difference between code written by humans and code written by LLMs. LLMs make mistakes no human would, and for that reason these are also hard to notice in reviews.
@ledettwy do you have any evidence for that? I feel like most veteran free software developers that maintain those projects don't use LLMs.
@crertel so we should remove the unfree and insecure flag?
NixOS is about having control over your system. A flag for LLM use would enable the user to be intentional about what software they use. Like the mandatory AI disclosure on Steam.
I have not suggested forbidding LLM-created software by default, by the way. Users who care can set it to false. Everyone else is not even affected by the change.
This is not about morals.
@truh if you don't act on the information, why bother at all? If you want to be intentional about something, I think some friction helps.
If I understand software correctly, a single character can create a security issue. And since LLMs choose characters (or tokens) by probability (weights) and randomness (temperature), they have a higher chance of producing the one that makes your whole system vulnerable than a human who chose each character intentionally.
so: any
Sure, we have to. Policies apply to all packages.
This is a very relevant question here. If 99.9% of software has LLM-generated code, we can't have a working system without it.
Yes, we could have not one binary flag but categories, like:
- No LLM use
- Responsible LLM use (following best practices)
- Irresponsible LLM use
- Fully vibe-coded (no human has looked at the code)
Technically, I think we can implement that easily in nixpkgs, like the different licenses that you can also selectively allow.
But we would have to come up with useful categories that are clearly defined. I think the optimal solution will become clearer as this discussion goes on, especially in the broader free software and Linux user community.
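To make this concrete, here is a minimal sketch of how such categories could be consumed, assuming a hypothetical `meta.llmUse` attribute on packages and a hypothetical `allowLLMUsePredicate` nixpkgs config option, loosely modelled on how unfree packages are allowed today; none of these names exist in nixpkgs yet:

```nix
# Hypothetical sketch only: neither meta.llmUse nor allowLLMUsePredicate
# exists in nixpkgs today. The idea mirrors allowUnfreePredicate.
{
  nixpkgs.config.allowLLMUsePredicate = pkg:
    builtins.elem (pkg.meta.llmUse or "unknown") [
      "none"        # no LLM use
      "responsible" # responsible LLM use (following best practices)
      # not listed: "irresponsible", "vibe-coded"
    ];
}
```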
That's a great idea.
I think the next steps to make progress here can be:
- collect evidence about LLM use in FOSS projects (especially those we have packaged)
- try to sort them into multiple clearly separated categories
- test an implementation in nixpkgs
Here is a pad that everyone can edit: LLM use in FOSS projects - HedgeDoc
I think we also need more clarity about what exactly the issue is that we are trying to address, to find the best solution. Especially when LLM use becomes the norm and avoiding it is not an option.
It's just speculation on the basis that any open source project that doesn't actively enforce a no-LLM-generated-code policy is going to have AI-generated contributions if it's popular enough. I think it's fair speculation. We know for sure that some people are using it for kernel development.
Might be worth collecting links to other similar attempts as well:
- ai-alternatives/llm-afflicted-software: Free/Open Source Software tainted by LLM developers/developed by genAI boosters, along with alternatives. - Codeberg.org
- small-hack/open-slopware: Free/Open Source Software tainted by LLM developers/developed by genAI boosters, along with alternatives. Fork of the repo by @gen-ai-transparency after its deletion. - Codeberg.org
- brib/slopfree-software-index: A list of open-source projects that reject AI-generated code - Codeberg.org
- Starlight Network: The No-AI list
I think maintaining such a list is a good idea, and we could also write a linter, which could be used by those who do not want to run such software. This could be external to nixpkgs.
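As a rough illustration of such an external check, here is a sketch of a NixOS module that asserts none of the listed packages end up in `environment.systemPackages`; the deny list and its contents are made up, and a real linter would pull the list from one of the projects above:

```nix
# Sketch of an external deny-list check; "some-vibecoded-tool" is a placeholder.
{ config, lib, ... }:

let
  denyList = [ "some-vibecoded-tool" ];
in
{
  assertions = [{
    assertion = !lib.any (p: lib.elem (lib.getName p) denyList)
      config.environment.systemPackages;
    message = "A package from the LLM deny list is in environment.systemPackages.";
  }];
}
```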
Comparing government backdoors and racism to LLM slop is an insane take. Take a moment to reflect on your trolling, then stop. No one is interested.
It's a shame we don't have moderators anymore to stop such bigoted nonsense.
All of those are groups of people who, no matter the purpose or motive of the software they created (not talking about countries, of course), have an actual incentive and (oftentimes) responsibility to keep the software working well and to produce good-quality code. At the very least, they actually know and fully understand the code they wrote, which cannot be said about AI. The morals of the code and the quality of the code are two completely different discussions. The problem with LLMs is that, even putting all morals aside, the code they produce is frankly just not good. They will absolutely produce mistakes, and if there isn't anyone around to catch those mistakes, things will break. Having completely vibe-coded software that no human has reviewed is just a security nightmare or outage waiting to happen. Just look at the recent AWS outage due to vibe-coded slop making it to production.
I think that the more metadata Nixpkgs can have about different packages, the better. We already have sourceProvenance and licenses, to name a few; why not add e.g. authorshipProvenance? It could be a list of the sources of the code, e.g. human, LLM, or both. The filtering stuff would be nice to have as well, but purely from a metadata perspective, gaining visibility into how AI is affecting open source is a huge benefit in itself, in my opinion.
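To sketch what that could look like on a package (modelled on the existing `sourceProvenance` field; `authorshipProvenance` and its values are assumptions, not an existing nixpkgs attribute):

```nix
# Hypothetical meta.authorshipProvenance next to the real sourceProvenance.
{ lib, stdenv, fetchurl }:

stdenv.mkDerivation rec {
  pname = "example";
  version = "1.0";

  src = fetchurl {
    url = "https://example.org/example-${version}.tar.gz";
    hash = lib.fakeHash; # placeholder
  };

  meta = with lib; {
    description = "Example package";
    license = licenses.mit;
    sourceProvenance = [ sourceTypes.fromSource ];
    # assumed new field: where the upstream code came from, author-wise
    authorshipProvenance = [ "human" "llm" ];
  };
}
```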
I think you misunderstand my point. The fact that you dismiss it as bigoted nonsense is what I'm getting at.
If the problem is low-quality code, we should tag for that. If the problem is license issues, we should tag for that. Those problems are not limited to LLMs (or indeed, even indicative of LLMs).
If the problem is… it came from a source we don't like, or that we'd like to dismiss as low-quality, or consider gauche, or are economically threatened by, that starts to look a bit less defensible. And if the problem is that it represents a supply-chain risk, that's a risk that already exists.
I picked somewhat inflammatory examples because they're obviously something where we'd go "Wait, that's bigoted nonsense" if we saw it going on.
We already have people writing code making mistakes.
We already have security nightmares and outages waiting to happen.
These are not new (or even rare) problems.
Yes, but those are mistakes people make. The difference between LLMs and people is that people learn from their mistakes. I like to compare LLMs to gambling, but with code. You are essentially pulling a slot machine (slop machine) and seeing what you win, whereas in reality the odds are completely against you, and the only party that actually wins is the casino (the LLM provider), since you pay them money. Tangent aside, you are correct that people create and introduce security vulnerabilities. But with LLMs, security vulnerabilities and bugs are much more likely, because the LLM has no idea what it's doing outside of its context window. It has no idea about the implications of the code, and as soon as the context is gone, it doesn't even remember it wrote it. You can ask a person why they wrote that code, why something is done a certain way, or what the flaws of the code are. You can't get an honest answer from an LLM to those questions because it doesn't think. It can try to guess, based on patterns in the code and what it was trained on, but it has zero knowledge and, more importantly, zero responsibility for that code.
In other words, who owns the code the LLM generated? Not from a copyright perspective (that is a whole other can of worms) but from an engineering perspective. If a problem is found in that code, who do you talk to? The LLM? Good luck getting a response unless you somehow manage to dig up the conversation that was happening when it was produced. The person that committed it? Well, here is the problem: did they review the code? If not, then that code might as well have been written by someone who has vanished from existence, and that is a liability in and of itself.
Tell me you're an American without telling me you're an American xDDD In $CURRENT_YEAR literally any of those apart from Indians are well deserved.
It might surprise you, but people who have a strong stance against the use of LLMs on moral grounds would often also say that supporting genocidal nations and their violence apparatuses is bad, actually. So you know, either someone already agrees with your laissez-faire attitude, or they will think you're disingenuously trolling by putting an ethnic group alongside obviously evil entities.
I don't really care either way about it (I only really want to reduce the blast radius), but it feels like you're starting your 2026 election baiting campaign kind of early this year. And I'm not going to be sorry for that one, because using those as examples is such an obvious bait. You could have used any other nationalities for the examples and made the argument stronger for it. And yet you chose ones that are actively killing people for the lulz (with the exception of India, I think?), curious.
We already have people writing code making mistakes.
People can't write 10 fully functional projects in under a month. A well-driven Opus 4.6 can. And that is when I cared about the output being good enough; imagine how many more you could slop out if you didn't. This is not a problem of who can or cannot make mistakes. It's a problem of scale, both in terms of the volume of code being produced that needs auditing and in how easy it now is to automate writing supply-chain exploits at scale with LLMs. As long as there is no good way to provide assurance for software written chiefly by LLMs, we should take care not to widen the attack surface.
After some checking, it becomes clear that you already can't have a Linux system without some LLM-generated code, since Linux itself and also systemd have it. Nix also has some LLM code.
I also shared that conclusion on the Fediverse: davidak: "After some research today it becomes clear that y…" - chaos.social
From a very quick check, FreeBSD, NetBSD and OpenBSD have no git commits co-authored by Claude. So you don't have to stop using computers if you can't accept AI.
So, which LLM-generated code is a real problem, and how do we handle that specifically? Maybe don't package problematic software. And accept everything else and hope the software maintainers are responsible.
Whether maintainers follow best practices when using LLMs or accepting contributions can be checked, just like security best practices can be checked, but that is not something NixOS has to be involved in.
I donât really know if I agree with crertel or not, but I do not think that it is accurate to say that crertel is trolling.
That's not accurate. I am interested in what crertel has to say.
I think that part of what you wrote here is also inaccurate. Specifically, I do not think that what crertel wrote is bigoted nonsense.
More info is good. But AI is a tricky thing to filter by. Before even deciding how to categorize AI use in projects we first need to categorize AI use in general.
- Autocomplete (e.g. Supermaven)
- Conversational (e.g. Copilot, Amazon Q)
- Vibes (e.g. Cursor, Kiro)
You could call vibes agentic, but that's not strictly true. Really what we mean is no-code AI workflows where the author does not understand and never has understood the code.
After that, I think the most accurate thing would be to have:
- confirmed used
- confirmed not used
- unknown
For each category for a particular project. For the purposes of putting it on a linear scale, we could assign the following numbers:
- 0: Completely unknown
- 1: No AI at all
- 2: Possible autocomplete
- 3: Confirmed autocomplete
- 4: Possible conversational
- 5: Confirmed conversational
- 6: Suspected vibe coded
- 7: Confirmed vibe coded
This conveniently fits in 3 bits, which means nothing but pleases my C sense of aesthetics. In addition, if a project has an AI policy, we could attach that.
Basically everything would be a 4 or 5, including the Linux kernel. Most everything that has AI bans is going to be a 2 or 3. A 1 would pretty much just be solo projects by luddites like myself.
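If someone wanted to play with that scale in Nix terms, a sketch could look like the following; the `aiUsage` meta attribute and every name in it are invented for illustration:

```nix
# Hypothetical encoding of the 0-7 scale above; nothing here exists in nixpkgs.
let
  aiUsage = {
    unknown                 = 0;
    none                    = 1;
    possibleAutocomplete    = 2;
    confirmedAutocomplete   = 3;
    possibleConversational  = 4;
    confirmedConversational = 5;
    suspectedVibeCoded      = 6;
    confirmedVibeCoded      = 7;
  };
in
  # e.g. accept anything below "suspected vibe coded"
  pkg: (pkg.meta.aiUsage or aiUsage.unknown) < aiUsage.suspectedVibeCoded
```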
It's also worth noting that metadata isn't just for blocking software but also for documenting it in package search tools.
Nix is a programmable system. If I have structured data I can do with it whatever I want.
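For example, with a hypothetical `meta.llmUse` tag in place, you could generate a report instead of blocking anything. A small sketch (the tag does not exist, so this returns an empty list today):

```nix
# report.nix - sketch only; meta.llmUse is a hypothetical attribute.
{ pkgs ? import <nixpkgs> { } }:

let
  inherit (pkgs) lib;
  # whichever packages you care about, e.g. your environment.systemPackages
  packages = [ pkgs.hello pkgs.git ];
in
  map lib.getName
    (builtins.filter (p: (p.meta.llmUse or null) == "vibe-coded") packages)
```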
I think blocking evaluation for packages that were developed with LLMs would be a huge pain. It's a lot worse than for unfree packages, since unfree packages are usually leaf packages in a dependency tree; it's pretty uncommon for FOSS packages to depend on unfree packages.
Also, don't forget that a package can be vibe-coded and that the whole process of packaging it can be vibe-coded too; we need to keep those separate for anyone who considers this something they need to know.
Given the limited capacity of Nix volunteers to implement metadata infrastructure (including some already-approved and uncontroversial cases, like categories), I think we should have a strong bias towards inaction and scope minimalism. If we don't need to do something about this, we shouldn't.
To the extent that this is driven by concern about code quality, therefore, we should only act once we have clear examples of this being a problem for us: for me, this means multiple examples of packages being broken in a way that seems related to them being LLM-written or LLM-assisted, and which our existing QA process is having a hard time dealing with.
To the extent that this is a morally-driven concern, it's a bit more difficult to propose a criterion for action, because you don't often get external signals that you're making a mistake. In that arena I think discussing it here is reasonable, but I don't want a discussion that pretends to be about the pragmatics when it's really about the principle.
allowSlop = mkEnableOption "Allow LLM-contaminated software";
The same way you handle spam in email. By marking it as such.