Policy on vibe coded nix tools?

Somewhat related, but focused on nixpkgs: How should we handle software created with LLMs? - #50 by harryprayiv

I don’t want to call anyone in particular out. But it has happened a few times now where someone announces a new tool that looks exciting, and then it turns out to be vibe coded. I come into these projects super excited, read through their code, realize it’s hastily generated AI slop, spot a few security red flags, and don’t even bother pointing them out because the project is BSL or ARR licensed (I actually don’t hate BSL, but combining it with wholly unoriginal AI code rubs me the wrong way). In fact, it feels like vibe coded projects are quickly becoming nearly as abundant as normal projects.

On one hand, I’m glad AI is enabling folks to tinker with stuff and become invested in NixOS. On the other hand, I don’t want to waste my time learning about a tool nobody even bothered to write. AI-assisted code is one thing. But vibe coding (currently being whitewashed as “spec driven development”) is another thing entirely. Especially because I have yet to find an online-connected vibe coded app of any appreciable size that doesn’t have major security code smells. Although I’ll admit I don’t spend my days scouring the codebases of vibe coded projects (because why would I bother trying to understand what the author has not bothered to understand?).

As far as this relates to NixOS forums, I think the proliferation of AI vibe coded projects harms the community in the following ways:

  1. Wastes time of more senior members who were interested in contributing, only to discover it’s AI slop
  2. Confuses beginners by reducing the signal to noise ratio of “actually helpful tools” vs “flaky and unreliable crap that might open you up to RCE”
  3. Goes against the spirit of human collaboration
  4. Lowers the overall quality of code being shared on the forums

We could name and shame such low effort projects, but I would genuinely hate to go around flaming other Nix enthusiasts just because they were overconfident about AI. It would be really nice if we had a “no posts for vibecoded/sdd projects please” rule that we could politely point to instead.

11 Likes

Hmm, I dunno.

As you can probably tell from the other thread, I’m at minimum for gating such things behind a flag and having Hydra not build them. That said, I’m not sure if I’m for a blanket ban of this kind.

I do have a few “slopengineered” Nix–adjacent projects that could potentially be useful myself; I’m just not sharing them because I wouldn’t feel at ease sharing something I don’t quite understand thoroughly yet. I don’t necessarily think people should be precluded from sharing such things if they want to, as long as they clearly state this as such (e.g. the post is flaired/tagged/labelled or whatever Discourse allows for), because there might be some value in discussing PoCs in terms of the behaviour they implement, rather than code quality — for example, potential tool/ecosystem improvements that it doesn’t make sense to invest time into developing properly if nobody else cares (yet or at all).

All this to say, I would be for a rule to require disclaiming this in such posts (including how the tools are used), but not a blanket ban on discussion. A separate category could work as well.

Also, I don’t think that spec-driven-development/slopengineering/whatever-you-want-to-call-it is the same as vibecoding. With proper guidance, models (or at least Opus 4.6) can get much further, doing far fewer dumb things along the way, than without. Forcing the model to plan, reviewing what it spits out, and having a strict test harness (a NixOS VM test for things I would otherwise have to click through each time, like 2-machine Firefox Sync <3) really does a lot of the heavy lifting. We’re still not at a stage where you could treat software written this way like any other, but it does improve the situation noticeably. I’m not going to fight over terms, but I think it should be treated as distinct from unleashing an LLM agent without a plan or guardrails.
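To make “strict test harness” a bit more concrete: here’s a minimal sketch of the kind of NixOS VM test I mean, using pkgs.testers.runNixOSTest. The package under test and the assertion are illustrative stand-ins (pkgs.hello here), not a real tool:

```nix
# Minimal NixOS VM test sketch: boot a machine with the tool installed
# and assert on observable behaviour, not implementation details.
{ pkgs, ... }:
pkgs.testers.runNixOSTest {
  name = "mytool-smoke-test";
  nodes.machine = { pkgs, ... }: {
    # Stand-in for the tool under test.
    environment.systemPackages = [ pkgs.hello ];
  };
  testScript = ''
    machine.wait_for_unit("multi-user.target")
    # The model's output only counts as working if this passes.
    machine.succeed("hello | grep -i hello")
  '';
}
```

The same pattern scales to multi-machine scenarios (like the Firefox Sync setup above) by adding more entries under nodes.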

PS. Em-dashes et al. do not mean a model wrote this to you; it’s just that the default Polish layout has convenient access to them and I like my typography xD

4 Likes

I really feel that disappointment you describe when someone posts something interesting-looking and I find out that it’s all vibe coded. Back when I tried out vibe coding earlier this year with Opus 4.5, I had a miserable experience getting it to produce something even barely working, let alone good, when I asked it to re-implement part of my custom build hook. I didn’t know it was possible to be bad at vibe coding as a software engineer with a decade of experience using a frontier model, but since it apparently is, that must be some kind of testament to the (lacking) quality of these other projects.

4 Likes

I definitely understand the appeal. And it’s something I’m trying to keep an open mind about, at least with local models. The promise of being able to empty your brain of all those hobby projects is really appealing! At the same time, I’m not learning anything when I use AI. I have also experienced how easy it is to miss things when you use it. Finally, I’ve seen the code so-called AI experts have written, and I don’t think it’s maintainable or secure. Maybe it could be maintainable and secure, but that’s not what I’ve seen in practice.
I guess a disclaimer would be a more reasonable middle ground than an outright ban.

2 Likes

I would like to add another perspective to this whole thing. I don’t like to admit it, but I have Long COVID, and if I try to program, I have only a few hours before my brain fogs out. It used to be even worse, allowing me only an hour or so. As a result I could no longer be an effective professional engineer.

Using LLMs has changed that. I can invest the hours I have available to build a harness and set rigid quality standards. I then let the LLM run with that while I rest (a rough sketch of what I mean below). It’s been a huge force multiplier, and I can now make meaningful progress on things, which brings me some joy.
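For example, the “rigid quality standards” part can be as small as a flake check the LLM has to keep green before anything counts as done. A hypothetical sketch, with illustrative tool choices (nixfmt-rfc-style and statix); the structure is the point, not the specific linters:

```nix
# Hypothetical flake.nix: `nix flake check` is the quality gate the
# agent must pass. Each check fails the build if its tool complains.
{
  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      checks.x86_64-linux = {
        # Formatting must be clean before anything gets reviewed.
        format = pkgs.runCommand "check-format" { } ''
          ${pkgs.nixfmt-rfc-style}/bin/nixfmt --check ${self}
          touch $out
        '';
        # Static analysis catches common generated-code smells.
        lint = pkgs.runCommand "check-lint" { } ''
          ${pkgs.statix}/bin/statix check ${self}
          touch $out
        '';
      };
    };
}
```

I run it in a loop: the LLM proposes a change, `nix flake check` judges it, and I only spend my scarce focused hours on diffs that already pass.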

All I’m asking is that people remember that all technologies have their own good and bad points, and that painting arguments in black and white misses the whole picture. I have a project I’m working on now that I’d like to contribute, but honestly the knee jerk negativity towards AI in the open source community leaves me reluctant to do that.

12 Likes

I think it’s a very touchy subject for sure. A lot of people in OSS, myself included, care deeply about two things:

  1. Ethics in software.
  2. “The craft”

AI threatens both of those. I have a lot of complicated thoughts about it that I still haven’t ironed out myself, so I appreciate everyone’s input. Personally, I use AI at work because I’m practically forced to, but I keep my personal projects an AI-free zone. Which does admittedly mean I end up not doing many personal projects: it’s hard to justify spending additional time coding when you already spend a full work week doing it and need to do other things like stay active, cook food, or just enjoy a book.

4 Likes

I completely agree that LLM generated code should be treated cautiously. I just object to people dismissing projects as “vibe coded” or “ai slop” without even looking at the code. It is toxic.

1 Like

FWIW I’ve used LLMs to assist me recently with Navi, and I considered putting a header on some parts of it, mainly just to make clear that I am loud and proud and try to live outside the LLM closet.

I’m personally not ashamed of using LLMs, any more than of using linters, documentation, or a split keyboard. They’re just objects to me, or tools, not a strong moral stance[1].

As someone who has dealt with a large volume of PRs over the years, both before and after LLMs, and who has worked with a lot of codebases before LLMs came around - both professionally and in open source - I can tell you that many of the well-known and beloved Nix tools are written really poorly from an engineering perspective. They were so long before LLMs were around.

I think a lot of people now over-index on the origin of the code without realizing that most code is generally of a very low quality. One doesn’t need to look further than Nix itself to see that most tools in our space have less than ideal software engineering.

Whether code originates from an LLM, an advanced autocomplete, or someone clicking a keyboard, it should be assessed based on its quality, not its origin[2]. What gets people in trouble is that they either think their code is brilliant just because it’s organic, or they just tell their LLM agent to “build this tool” without caring much for how it’s built.

But whether a human or a transformer typed the code, what really matters is the quality of the output. And I’d go out on a limb and say that LLMs have been a net positive for the amount of output and the quality of the code I’ve written recently. But I’m also not naive about the limitations, and I take my engineering seriously, even if I use an LLM.


[1]: There are real conversations to be had about environmental impact, but don’t be naive: the plane ticket you take to NixCon, the lithium battery in the device you’re reading this on, and the global supply chain (or even the tiered system you live in as someone who understands tech) mean that simply being “anti LLM” doesn’t make you morally superior.

[2]: Unless you’re interested in issues around licensing, but that’s not something you should factor into a code review; code should be taken at face value.

17 Likes

There have been discussions in the NixOS SC - informed by the policy of other projects - around ways to do this.

I think the best way to do this would be to just make sure LLM usage is indicated for the people that care. We could even go as far as having an announcement channel just for LLM-generated projects and one without, if that would help accommodate more people.

I’d personally contest the idea of a blanket ban, however, not just because it’s naively unenforceable, but also because I think it’s a fundamentally Luddite and puritan approach that would limit the amount of interesting and cool things that happen in our project and ecosystem.

4 Likes

I’m actually far more concerned about the morality of the societal impacts of AI than the environmental ones, but point taken. I’ll save everyone here a lengthy rant. But there’s moral landmines in every step of the AI process.

9 Likes

transparency does seem to be at the root of most issues mentioned in the original post.

this should probably also be put in perspective with the US military emitting more than 100 countries combined.

this might be a bit murkier - i did for one predict the evident vibe coding for NixMate.

while i empathize with the sentiment, this sense of the term seems a harmful liberal co-optation made to frame any technological advancement as inevitable, diverging from its historical roots describing not dogmatic abstinence from technology but sabotage.

14 Likes

That is a fair point. I’m not talking about machine breaking obviously. But I do think there are parallels to the actual movement, even if less insurrectionist.

I guess I should clarify that I mean this specifically as a description of a certain world view: programmers who (like me) have spent an enormous amount of time becoming really good at creating code, and are now finding that this skill is being rapidly devalued economically.

I think the original movement is a really appropriate analogy: textile workers were replaced with automation that, while industrially scalable and “efficient” under some economic measures, led to concerns around job security and product quality.

So in a sense we’re entering an era of “fast fashion”, but for code. This will lead to an enormous amount of variety while lowering the barrier to entry massively, but at the same time it will likely lead to massive waste and a lot more disposable software.

I can see many futures where LLMs aren’t inevitable, due to lack of profitability, regulatory capture, or many other scenarios. At the same time, I do expect that they will be around for a very long time at this point, since the ecosystem has rapidly been pushing to run large models on prosumer hardware already.

So to me, it’s already less about an inevitability and more about a system that exists, and couldn’t be ignored even if I wanted to.

3 Likes

Concerning LLM and “agents” in Nixpkgs contributions and Nix* tooling: chaotic enby: "My #Wikipedia request for comment just closed, fi…" - Wikis World, Wikipedia:Writing articles with large language models - Wikipedia.

Whether you’re a LW-fanfic-enjoying rational effective altruist or an “AI” skeptic, Wikipedia’s is the only meaningful stance. That is, unless your bet is “deny the training material by contaminating Nixpkgs”, in which case I think your plan isn’t going to… show much yield.

The human-made decisions and holistic human-created designs are the main value proposition of this project. Going slow, once, in a centralized venue is the point. Think of Nixpkgs as that “12 bits per second” shared memory of Software Development.

If we wish for our trust processes to keep functioning, we need to keep the Nixpkgs codebase explicitly clean, and “agents” explicitly out of the loop.
If we wish to train our own language models, we also need to keep Nixpkgs explicitly clean (to the extent it’s possible, given that our cognitive processes are already affected), and the existing contaminants labeled.

CC How should we handle software created with LLMs?

10 Likes

maybe it’s helpful here to mention some more examples of ML-generated code we’ve seen in the nix ecosystem so far:

3 Likes

“The Luddite period of 1811-16 saw traditional practices being pursued as the machine breakers of the Midlands used the vulnerability of their employers’ property as a means of attempting to bring pressure to bear upon them over their various demands. These attacks on machines did not imply any necessary hostility to machinery as such; machinery was just a conveniently exposed target against which an attack could be made.” - Thomis, 1970.

It’s reasonable to disagree with property damage as their method of resistance, but I also think it’s fair to say the Luddites were seeing the world more clearly than workers embracing their exploitation via chatbot in the present day.

13 Likes

Imagine being a newbie programmer who wants to make a Nix tool and hearing something like “AI slop is not allowed, request rejected“ from maintainers, just because they have some “AI policy“ and you may not know how to code like a 10+ year senior.

In reality, people would just hide AI usage and claim it’s their own work.
Anyone who writes code with some mistakes would automatically be labeled a “slop manager“.
Anyone who wants to give helpful advice would hold back, because they’re unsure whether the code is human- or robot-written.
I don’t think AI markers or whatever are a good way to do quality checks.

2 Likes

And if you keep outsourcing your learning to machines, you never will.

I don’t know what our community policy on AI-written tools should be, but I’m of the opinion that the greatest harm the current generation of AI assistants is doing to the software community is to the kids who are denying themselves the opportunity to become senior developers instead of senior ‘slop managers’, if that’s the term.

15 Likes

there may be some nuance between vibe-coding and LLM queries there as well

PRs that involved vibe-coding: Pull request search results · GitHub

This is utterly insane. No 10-year-old should be feeling discouraged because they cannot publish software. These are literal children. They shouldn’t even have social media accounts to publish with, as far as I’m concerned.

Teach kids to program in more suitable environments. There are plenty of cool projects out there; personally I think the Minecraft environments designed for this are really freaking cool. LLMs, on the other hand, are a terrible way to instruct anyone to do anything but ask questions in obtuse ways.

This community is not a kindergarten. We shouldn’t be responsible for raising children, or be especially accommodating when they try to fit in, even if it’s understandable that they do.

I do. When it’s people using LLMs, it just burns me, because most of the time they have no interest in learning or understanding; they just need a smarter person to give them the magic words they need to continue their pursuit.

Often enough they end up copying what I say into their LLM, come to wild conclusions that then somehow do work, and then disappear, never to engage with my explanation of what, why, and how.

I get this at work a lot too, these days. Many underqualified hires suddenly expect everyone else to solve their problems for them. I wasn’t the type to judge people for not being on par, or for not putting in much effort, but hell, being asked to literally repeat an error message at someone and then just giving up and sending them a command - while they make the same salary I do - is turning me into one of those toxic workaholics.

Work multiplier my ass. It’s just concentrating the effort more than it already was.


For the record, I’m opposed to a policy like what is described here, because as you say, at best people will just hide their use. If the resulting project isn’t obviously vaporware, there’s no way to tell anyway.

But I do believe using LLMs to generate code is generally harmful to your ability to be a good engineer, even if I don’t believe all use is inherently bad or leads to poor quality. Nor do I believe human-made code is inherently good. I’m, for example, much more comfortable with LLM patch review: I don’t think it should replace human review, but review is difficult and more interpretation cannot hurt (as long as we can assume there was some long-term thinking behind the original patch; LLMs reviewing LLM code is not the end goal).

I do also think the impact of the associated industry is immensely negative (and no, this is not “knee-jerk”; stop pirating my work, and stop assuming I disregard the environment or other people’s literal lives as much as you do). But the related area I’m most expert in, and where I have seen the effects most clearly, is coaching and teaching around software, so that’s where I’ll leave my 2c.

13 Likes

Is this thread about discussion posts or about actual code?

If it’s about actual code, we already have a PR review and merge process. Your code needs to meet the same standards whether you got help from an AI agent or from Linus Torvalds.

If it’s about the discussion forum, I’m not sure how this intersects with AI-generated code. Are you saying that every time I link to a code repo in a discussion post, I need to first investigate whether that repo has AI-generated code? That seems undesirable to me, and it puts a lot of burden on quality assurance just to participate in a discussion forum.

1 Like