Let's create skills for nixpkgs development

Like it or not, AI agents are here to stay, and they automate a heck of a lot of the boring stuff. If you ask nicely, they can even do the whole flow of a PR without intervention and ship good code, taking into account gotchas that a human would spend a lot of time checking.

Today, Nixpkgs is heavily documented with well-written guidelines, but it lacks an entry point for LLMs. Would an AGENTS.md help? It would!

But for the details, it’s good to have a standard process, or checklist, that the LLM can quickly take a look at and do tasks more in line with the quality we expect.

I am not suggesting adding CodeRabbit or that Gemini review bot that brings a lot of noise. I am proposing that we formalize the procedures in a way that works for agents. Otherwise people will eventually contribute incorrectly, feeding this anger, or simply leave for another community that is friendlier to this workflow.

Today, the skill location is standard: place them in .agents/skills and all relevant agents can see them, with no need to symlink them around.
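
As a rough illustration of what a single shared location buys you, here is a minimal Python sketch of how a tool might discover skills. It assumes each skill is a directory under .agents/skills that contains a SKILL.md file; that file name and the discover_skills helper are assumptions made for illustration, not something nixpkgs defines.

```python
from pathlib import Path


def discover_skills(repo_root: str, skills_dir: str = ".agents/skills") -> list[str]:
    """Return the sorted names of skill directories found under repo_root.

    Assumption: a skill is a directory that holds a SKILL.md file.
    """
    base = Path(repo_root) / skills_dir
    if not base.is_dir():
        # No skills directory at all: nothing to report.
        return []
    # Each match is <skills_dir>/<name>/SKILL.md, so the parent is the skill.
    return sorted(p.parent.name for p in base.glob("*/SKILL.md"))


if __name__ == "__main__":
    for name in discover_skills("."):
        print(name)
```

Because every agent would look in the same directory, a lookup like this needs no per-tool configuration or symlinks.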

The idea is to start with something and iterate on it over time as processes evolve.

8 Likes

This sounds like a great way to get people to start pushing AI-generated PRs with ever less control over them. Having agents shove PRs into our backlog is not something I would like to see.

On a personal level, I engage in nixpkgs entirely to decompress from my internship, which has required me to use LLMs. I am sick of AI everywhere and the constant push to automate everything without leaving people any breathing space.

42 Likes

Where is “here”? It’s definitely not the NixOS project, and it sounds like a pretty terrible place to live.

22 Likes

Those PRs will come anyway. With skills and AGENTS at least they can get the basics right.

Models are evolving fast. Today, any well-contextualized model can do most of this stuff with minor supervision.

6 Likes

Great. Who will deal with these PRs?

6 Likes

I’ll give examples of the kind of thing we can achieve, essentially in one shot, from an issue or a build failure.

I am working on a private prototype of a nixpkgs skill and made these PRs.

The first one started from an issue, the other two from the message that the Matrix bot sends when nixpkgs-update isn’t able to bump a package.

I didn’t actually write this code; I only reviewed it. The agent dispatched the nixpkgs-review run via my GitHub Actions trick and took the hashes from the logs.

Do you think that any model would give this output from an uncontextualized prompt? Are there any blind spots in them that I missed?

Standardizing this process would give anyone the chance to produce a good v1 without bothering anyone, and without silly follow-up changes over minor nitpicks, because the model did read every guideline and checklist while the person was doing something else.

It’s like recruiting a new maintainer, but this new maintainer can read really fast and doesn’t forget half the things on the way. If you don’t give it context it will mess up.

2 Likes

@lucasew I’m sorry for the overly hostile responses you’re receiving; obviously you mean well and are trying to contribute positively to the project. Unfortunately AI is quite a sore subject in this community; for instance you could look at this thread for some more recent context. In short, right now it’s simply difficult to have productive discussions about these topics.

My read on it is that right now there is a fairly large and vocal part of the community that feels quite strongly about AI. So I don’t see anything like this making its way into the Nixpkgs repo any time soon. In my personal experience, I’ve found that if you put in the effort to ensure your pull requests are of sufficient quality (which it sounds like you’ve been spending your effort on by developing these agent skills) then you can open these PRs according to the current policy and some of them will be merged. So from the perspective of making the project better, that’s a net positive.

Obviously it’s fairly thankless work, but that’s just how contributing to Nixpkgs is in general (especially for committers like @Sigmanificient who spend a lot of their free time filtering through PRs, including many low-quality ones, some of which come from AI).

8 Likes

@lucasew I don’t find these examples particularly impressive. However, I invite you to run your model on all the entries listed at https://nfd.1l.is. It’s not fully up to date because I haven’t had the time or energy to maintain it this month, sadly.

While I disagree with your methodology, it would still be great to see some progress on our build failure listing. However, keep in mind that PRs are not cost-free for reviewers. I appreciate that you do some validation and keep the scope restricted, but I fear that will not be the case for everyone if you lower the friction of AI-based contributions.

7 Likes

Interesting

BTW, on the approach: I was able to do this thing in one afternoon with AI: https://docharvest.github.io/. Made with Astro, DaisyUI and the impeccable skill because I am not a designer.

It essentially ingests markdown from other repos and makes a site with llms.txt and stuff.

You could do a similar approach using a scheduled GitHub Actions workflow that sends a PR, then host it on GitHub Pages. I only moved docharvest to a different org because of the domain.

The maximalism and pushism of AI really sucks, but there are really great things coming from it if you know how to review, test and guide. Like, you know what the answer looks like, so you guide the model towards it.

1 Like

Like it or not, AI agents are here to stay,

Just a tip for the next time you want to convince people not already on board with the whole LLM thing: Don’t start your posts like this.

It’s presumptuous and paints those that obviously disagree with that statement as ignorant or obstinate.

31 Likes

Sounds great in my view! Good behaviour should be the easiest behaviour, and vice versa. Bad behaviour (like AI PRs from people who don’t read the guidelines) should be the hardest behaviour.
Though I think the guidelines can be improved in many places.

7 Likes

Yeah, V1 probably will have a lot of blind spots.

That skill I made is heavily coupled to my nixpkgs-review trick. I am not the best skill maker in the world either, but I am trying; more people dogfooding it can make it better too.

Initial draft: agents/nixpkgs: add nixpkgs agent skill by lucasew · Pull Request #536518 · NixOS/nixpkgs · GitHub

2 Likes

While I agree that nixpkgs would profit from LLM instructions, I personally think your skill is too much at once. Why is it not an AGENTS.md for example? I propose we start with such a file and only add hand written text. Let’s not generate these files and keep them short and reviewable themselves.

Maybe we could collect LLM generated PRs, where the author seems to care and has also responded to feedback. Those would be good candidates for improving our AGENTS.md with minimal instructions that have a verified impact.

2 Likes

My personal suggestion is that whatever docs you’re trying to write, please consider writing them primarily for people instead.

On one hand, whether or not “AI agents are here to stay”, it’s clear that things will continue to change a lot, causing massive churn in what counts as an effective way to write these “skill files”. It’s very unclear whether what we now consider best practices for prompting LLMs will still be relevant even a few months from now. Prompts like “do not invent flags” could literally become irrelevant overnight; who knows.

On the other hand, LLMs are supposed to be able to ingest natural language, so by writing for people, you will also reap the benefits of clearer documentation immediately for people working on the code.

18 Likes

Nixpkgs has a shortage of reviewers, not patches. My experience with LLM reviewers is that they provide more noise than signal, and as a result I have learned to scroll past the Copilot review unconsciously. Could skills improve review?

3 Likes

That’s probably a better approach.

But it’s harder to test.

Starting your post with

Like it or not, AI agents are here to stay, …

is, at the very least, going to rub people opposed to AI contributions the wrong way. Please respect people with other opinions. Many threads like this become very heated, very quickly. Let’s do all we can to keep it civil :).

they automate a heck of a lot of the boring stuff

I’m all for automation, but not via LLMs. The idea with automation is that it reduces maintenance. LLMs regularly botch simple repetitive tasks, and output here would need to be verified intensively, which is arguably more boring to do.

An argument I see very often is how ‘fast’ LLMs make you go:

I was able to do this thing in one afternoon with AI

I see these people jump from project to project to project every week. Nixpkgs does not have a need for these ‘faster’ developers IMO. Maintaining software is not a race. We need more maintainers who are willing to take it slow, and understand the ‘gotchas’ that humans take time to understand.

I don’t think investing effort into making LLM power users take less time to use nixpkgs is a step in the right direction. The barrier to entry (time/effort-wise) is high for nix/nixpkgs/nixos, not because it’s LLM-unfriendly, but because nix/nixpkgs/nixos is a large, complex ecosystem that takes time and effort to understand.

This effort should be invested into humans who are here to stay.

28 Likes

I don’t think skills files are necessary or helpful, even for people who find LLMs useful tools, except in specific situations (more below).

  • LLMs can read the manual, and are good at that.
  • Creating a second hierarchy of “docs for bots” in skills/ will duplicate lots of information that’ll be hard to keep in sync.
  • LLM-written skills files are also often overly verbose (hard to review), overly specific, and often contain a huge amount of information that LLMs would be able to come up with on the fly anyway.
    • In my opinion, the most useful skills (or other LLM-target files) contain info that the LLM failed to come up with. This makes writing them inherently a task for humans.
      (Of course there are minor exceptions to this, such as using a smarter LLM to generate instructions such that dumber, cheaper LLMs can follow them repeatedly, or generating a skill after a session with lots of LLM failure and human feedback. In any case, that still boils down to having them mainly derived from human input.)

What I think should be done instead:

  • If there’s something that should be documented, put it in the manual.
  • nixpkgs should have a simple AGENTS.md that tells the LLM that there is a manual, and where it is, so it can efficiently find it.
    Likely, writing “* Before making any changes, read CONTRIBUTING.md.” will already take care of most of that.
    The main point of AGENTS.md is to avoid it having to search around or guess, and instead efficiently point to the right location.
  • Skills files should only be written for highly specific LLM tasks that affect non-humans only or mainly, and that are too complex to put into AGENTS.md.
    I cannot come up with one right now for nixpkgs, but as an example I wrote one for our work project that described the specific way we wanted agents to do interactive git rebases, in a way that doesn’t apply to humans.
  • It might be useful to have a separate Discourse thread or GitHub issue where people can collect things that LLMs fail to do in nixpkgs that they should really be able to do, to inform what’s worth writing down into AGENTS.md or skills files.
    • For example, my Claude sometimes runs find /nix/store to inspect a built package’s code, which takes forever, when nix-instantiate plus follow-up commands would be much faster and more accurate.
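
To make the last point concrete, here is a minimal Python sketch of the faster lookup. The post does not say which follow-up commands are meant; using nix-store --query --outputs is an assumption, as are the helper names and the default of evaluating the current directory as a nixpkgs checkout.

```python
import subprocess


def instantiate_command(attr: str, nixpkgs: str = ".") -> list[str]:
    """Build the nix-instantiate call that evaluates one attribute to a .drv path."""
    return ["nix-instantiate", nixpkgs, "-A", attr]


def outputs_command(drv_path: str) -> list[str]:
    """Build the nix-store query that lists a derivation's output paths."""
    return ["nix-store", "--query", "--outputs", drv_path]


def parse_drv_paths(stdout: str) -> list[str]:
    """Extract .drv paths from nix-instantiate output.

    A line may carry an output suffix such as "!bin"; it is dropped here.
    """
    return [line.split("!", 1)[0] for line in stdout.splitlines() if line.strip()]


def output_paths(attr: str, nixpkgs: str = ".") -> list[str]:
    """Resolve the store paths of attr without scanning /nix/store.

    Requires Nix to be installed. The paths only exist on disk once the
    package has been built or substituted.
    """
    result = subprocess.run(
        instantiate_command(attr, nixpkgs),
        check=True, capture_output=True, text=True,
    )
    paths: list[str] = []
    for drv in parse_drv_paths(result.stdout):
        query = subprocess.run(
            outputs_command(drv), check=True, capture_output=True, text=True
        )
        paths.extend(query.stdout.split())
    return paths
```

Calling output_paths("hello") from a nixpkgs checkout would return the exact store paths to inspect, so the agent never has to walk the whole store.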
11 Likes

That’s where I am pivoting to: AGENTS.md: init by lucasew · Pull Request #536668 · NixOS/nixpkgs · GitHub

1 Like