Should commercial actors ship telemetry in nixpkgs?

It does if you use the “I want to use AI to convert a written prompt into a devenv environment” functionality. It’s open source; instead of speculating, you could just read the code. But where would the fun be in that?

If my understanding of the code is correct, the table is pretty irrelevant. The feature will tar up whatever it can find in your git repository, source code and all, and upload it to Domen’s server. Opting out only adds a flag to the URL stating that you don’t want the tarball to be used as telemetry, but the upload happens nonetheless.
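To make the distinction concrete, here is a minimal Python sketch contrasting the two meanings of “opt-out”. It is not devenv’s code: the endpoint, the `disable_telemetry` query parameter, and both function names are hypothetical and exist only to illustrate the behavior as described in this post.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://example.invalid/generate"


@dataclass
class Request:
    url: str
    body: bytes


def flag_only(tarball: bytes, opted_out: bool) -> Request:
    """Opt-out as described above: the flag changes the URL, not the upload."""
    # "disable_telemetry" is a made-up parameter name.
    query = urlencode({"disable_telemetry": str(opted_out).lower()})
    return Request(f"{BASE_URL}?{query}", tarball)


def gated(tarball: bytes, opted_out: bool) -> Optional[Request]:
    """Stricter reading: if the user opted out, no request is built at all."""
    if opted_out:
        return None
    return Request(BASE_URL, tarball)
```

With `flag_only`, the server still receives the tarball and has to be trusted to honor the flag. With `gated`, the feature simply refuses to run for users who opted out.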

That would be a very liberal interpretation of the word “telemetry”.

Given that, if you ran this on an existing code base that you do not own or that is proprietary, I think the legal term is “You’re f***ed”, but IANAL.

15 Likes

Are you sure this is actually sending the content of the files and not just a list of files in the git repository? And also only if you use the AI-generate feature AND don’t give it a prompt? Seems like a pretty bold statement to make.

I could be wrong, I did not actually run any of the code, just glanced over it.
In any case, it’s already way more than advertised and it shows that the “Opt-out” does not actually opt you out.

4 Likes

Well what is your suggestion on how such an AI feature is going to work? Not send any prompt to a server? I mentioned it (albeit largely ignored for the generic outrage) above. If you don’t want your AI prompt to be processed by any server, maybe just don’t use the AI feature then.

I honestly don’t care how as I don’t use or intend to use it at all. It’s neither my problem nor the topic of this discussion.
If you are trying to make the point that not shipping software with this kind of telemetry enabled means we cannot ship AI software in nixpkgs, that’s a point the community can debate.

5 Likes

Guess I sometimes have some remaining hopes that a rational discussion based on facts is possible. But clearly, taking small slivers of information (even code), grossly misinterpreting them and then blurting out conclusions based on assuming the worst possible intent trumps this for some.

Sorry for being so naive. I will take the L on this one, the community has spoken and it chose outrage.

3 Likes

You could point out what the code is doing (if you believe it’s doing something else) instead of being snarky. Or, you know, make a comment that’s on-topic (since this thread is about telemetry in nixpkgs and not what devenv specifically is or isn’t doing). But where’s the fun in that?

EDIT: In any case, Pre-RFC: Add Telemetry Meta Attribute seems to be a more fruitful discussion than whatever’s going on here.

11 Likes

If my understanding of the code is correct, the table is pretty irrelevant.

In the context of collecting telemetry IMHO it is relevant to give users an understanding of what is transmitted and collected vs discarded.

E.g. I’m fine if my IP is transmitted (because that’s how networks work), less so if it’s collected.

This said;

The feature will tar up whatever it can find in your git repository, source code and all, and upload it to Domen’s server. Opting out only adds a flag to the URL stating that you don’t want the tarball to be used as telemetry, but the upload happens nonetheless.

Thanks for the pointer. My understanding is that the method grabs all non-binary tracked files in a git repo (git ls-files -z), filters paths based on an exclude list, and adds what’s left to a tarball via tokio_tar. The doc of tokio_tar::Builder::append_path suggests it will add content, not just the file name, to the archive.
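For readers who would rather see the shape of that flow than read the Rust, here is a rough Python sketch of the steps described above. It is an approximation under stated assumptions, not devenv’s implementation: the exclude patterns and all function names are hypothetical, the binary-file check is omitted, and Python’s `tarfile` stands in for tokio_tar.

```python
import fnmatch
import io
import subprocess
import tarfile
from pathlib import Path

# Hypothetical exclude patterns; the real exclude list was not inspected.
EXCLUDE_PATTERNS = ["*.lock", ".env*", "secrets/*"]


def parse_ls_files(raw: bytes) -> list:
    """Split the NUL-separated output of `git ls-files -z` into paths."""
    return [p.decode("utf-8", "surrogateescape") for p in raw.split(b"\0") if p]


def list_tracked_files(repo: Path) -> list:
    """Ask git for every tracked file in the repository."""
    result = subprocess.run(
        ["git", "ls-files", "-z"], cwd=repo, check=True, capture_output=True
    )
    return parse_ls_files(result.stdout)


def is_excluded(path: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """True if the path matches any exclude pattern."""
    return any(fnmatch.fnmatch(path, pattern) for pattern in patterns)


def build_tarball(repo: Path, paths) -> bytes:
    """Archive every non-excluded path; tar.add stores the file's contents."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for rel in paths:
            if is_excluded(rel):
                continue
            tar.add(str(repo / rel), arcname=rel)
    return buffer.getvalue()
```

Calling `build_tarball(repo, list_tracked_files(repo))` yields an archive holding the contents of every tracked file that survives the exclude list, which is why the size and quality of that list matters so much here.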

With the caveat that I have not looked into what goes in the exclude list, this behavior seems pretty aggressive and goes beyond just logging prompts (which is expected of a GenAI tool). My assumption (to be validated) would be that the exclude list contains all non-devenv-managed files. But even if that’s the case, the approach would still be a liability.

Again, I assume good intentions, but this behavior was not clear to me just by reading the documentation at devenv 1.4: Generating Nix Developer Environments Using AI - devenv.

Anyway, I think this is going a bit off-topic. Happy to discuss elsewhere if we can keep the convo constructive :).

My takeaway is that maybe what constitutes “telemetry” is a bit of a gray area, and this reinforces my preference for opt-out by default.

12 Likes

What in the world are you talking about? The code is public and you can read it. It lists all the files in the user’s git repository and then adds them to a tar archive. Go read the code and reference the library docs yourself. Yes, people are outraged; they should be! This is an egregious breach of trust and would break every single internal security policy for every company I have ever worked for.

29 Likes

My apologies to all in this thread, I messed up there. Uploading the whole source like this is not something that should happen.

15 Likes

Could this thread be split out between discussion of telemetry policies in nixpkgs, and discussion of the data Devenv is collecting? Both are valid topics, and I think they deserve individual spaces for discussion.

8 Likes

I think referring to a command uploading the contents of the repo you run it in without warning or clear indication that it will happen as “telemetry” is very odd and am confused why discussion started with that phrase.
Upon seeing complaints about telemetry my initial thought is that it’s going to be a nothingburger and the actual details here seem much worse than that.

8 Likes

Having a telemetry option for Nixpkgs would be pretty useful, but I don’t necessarily want to disable all telemetry. I don’t want to disable telemetry for KDE Plasma, for example. It is purely opt-in and I feel comfortable with it. Unfortunately, standards like consoledonottrack don’t seem to make it possible to distinguish this nuance, so the software most likely to respect user preferences will probably be unfairly impacted.

All of this commotion could be avoided if software packages would just get explicit consent for telemetry. It’s a pity that getting accidental metrics from people who would’ve surely said “no” if they were asked is seemingly too valuable to pass up. I’m hoping regulations across the world catch up here eventually.


That all said, I’m more bothered by the instant merge. Is this OK?

Since I’m the maintainer and author of this package, I was never asked if I agree with this change and I don’t.

It’s going to make it really hard to improve the generation without telemetry, otherwise we wouldn’t have done it.

My understanding is that Nixpkgs maintainers explicitly don’t get absolute control over their packages. Per the README, it does seem like it is considered reasonable for a maintainer to submit a revert PR in case of a conflict where something is merged before the maintainer can respond with a review. OTOH, though, I am not under the impression that implies you should just go and commit it on your own without review, and changes should still be subject to some kind of a consensus, just like any changes a maintainer typically makes are.

Am I wrong? I hope there is some further discussion on this, but it seems like most of the discussion has turned towards the telemetry aspect instead.

Even if there is not, I’d just add my own personal plea: If you are a Nixpkgs committer, please reconsider this practice. It looks pretty bad, and I think it’s pretty unnecessary. I really don’t think a few days of lost telemetry from unstable users is such an urgent issue to warrant this.

4 Likes

Committers (not maintainers, domenkozar enjoys very special privileges, which makes all the controversy even more… severe) get this privilege, and in theory should only use it for “small” things - which the devenv story was; the PR landed without maintainer review and therefore a revert was warranted. It’s good that a committer can undo a change that happened with too little oversight without waiting like a month for review.

The fact that this change happened to be controversial is unfortunate, but it’s well within policy and the policy is not unreasonable.

We assume committers can be trusted, and to a degree need to for nixpkgs to work - a specific committer losing trust doesn’t mean the role should not exist.

Honestly I don’t think the revert even matters, it was a pretty reasonable action by a respected community member. Devenv’s features - even with the not-so-adequate opt-out - are a third and different matter entirely, this thing borderline behaves like malware.

Final edit: thanks @domenkozar for taking it seriously, for the record, I think a less problematic implementation would be a really cool thing to see.

You can just unset the variable in a wrapper for software you do want to do tracking. Either way, the discussion seems to be gravitating towards something like allowUnfreePredicate, which you will be able to tweak to your liking.
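As a rough illustration of the wrapper idea, here is a small Python sketch that launches a program with the opt-out variable removed from its environment. It assumes the variable in question is `DO_NOT_TRACK` from the Console Do Not Track proposal; the function names are made up for this example.

```python
import os
import subprocess

# Assumption: the opt-out variable is DO_NOT_TRACK (Console Do Not Track).
OPT_OUT_VAR = "DO_NOT_TRACK"


def env_without(var: str, env=None) -> dict:
    """Return a copy of the environment with `var` removed, if present."""
    cleaned = dict(os.environ if env is None else env)
    cleaned.pop(var, None)
    return cleaned


def run_with_tracking_allowed(argv: list) -> int:
    """Run one program without the opt-out variable; the parent env is untouched."""
    return subprocess.run(argv, env=env_without(OPT_OUT_VAR)).returncode
```

The global setting stays in place for everything else; only the wrapped program sees an environment without the variable.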

8 Likes

Based on feedback, Domen just announced that the feature of generating devenvs from source code will be removed :slight_smile:

20 Likes

Rather than try and decide what kind of spying is an unnecessary evil I think we should be making policies that balance competing interests between maintainers.

Maybe we shouldn’t allow upstream maintainers to merge PRs that concern their packages. I merge my own software into the tree because there are no disadvantages to myself. I don’t like that situation and you shouldn’t either.

Separating commercial and non-commercial interests in a package is probably intractable, but we can at least track who has an interest in pushing packages in front of users and who just wants a package to not be broken on their machine. Maybe putting another field next to meta.maintainers is good enough?

Disclosing who is upstream for what would be mostly self-reported. We could designate someone to handle reports of violations, but of course that raises the question of who is paying the oversight person for a thankless job.

5 Likes

I don’t think that’s the problem. If you couldn’t trust upstream to behave reasonably, why would you even use their package?

7 Likes

That’s the whole point. Up to devenv 1.3, Domen seemed like someone who could be trusted. With devenv 1.4, that trust was shattered.

Domen is one, we’re a bunch, so we’re drowning his voice and this is likely unfair to him. I’m listing the below not to ostracize him, but as a short list of what I hope all maintainers will consider, in hope we will reduce the occurrences of similar events in the future:

  1. Don’t upload my stuff to your servers without my consent
  2. Don’t pretend you’re not uploading anything when I ask not to be tracked
  3. Don’t call such uploads “anonymous” or “telemetry”
  4. Don’t work over a year on a feature without considering basic moral, privacy and security issues
  5. Don’t assume your corporate interests such as you paying for this feature make such behavior acceptable
  6. When confronted by multiple people, don’t reply with anything except “It looks like I messed up. Give me some time to consider these comments and I’ll revert”. Then take a walk outside, carefully consider what others are talking about, any conflict of interest, consult with other members and so on, then try mitigating any damage you caused and start building trust from zero

One of the reasons I use open source wherever I can is to avoid nefarious issues like this. If even high-profile members like Domen can have such serious lapses of judgement, who can I trust?

20 Likes

It’s not the biggest problem here but I think it is a problem.

I don’t trust upstream if they only ship an appflapsnat-image full of obfuscated JavaScript, but I trust that the nixpkgs maintainer has the same interests that I do as a downstream user.

8 Likes