I’m not sure where to put this, so I’ll put it here.
The events around devenv 1.4 sending telemetry by default to train their AI model have demonstrated (at least for me) that we need some clarification on the leeway that commercial actors should have in nixpkgs and its ecosystem.
My personal opinion is that nixpkgs is a community project.
Commercial actors are of course welcome to contribute and maintain their own packages, but the wishes and needs of the community should always have precedence over their commercial interest when doing so. For me, the current situation shows a clear conflict of interest that we, as a community, should address.
There have already been some suggestions that we might want to pick up on, e.g. @quasigod has suggested that we add a general telemetry-related option to nixpkgs that packages / modules must abide by. I could also see a more formal stance on the matter in the form of an RFC (which may of course go either way).
Generally, I’d love some form of consensus on where we as a community want to land on this topic.
I don’t feel that managing telemetry is within the scope of nixpkgs. Whatever the package does by default should probably just be done.
That being said, this is my assumption of the scope of nixpkgs, and certainly, if we want that to be in scope, then we should enforce it.
I agree having a globally named argument in nixpkgs is a good idea though. Maybe not “must” have, but certainly optionally have.
The particular software in question is following the https://consoledonottrack.com/ model. At the very least we can have a nixpkgs level configuration for this like nixpkgs.config.allowUnfree.
We could quite uncontroversially add nixpkgs.config.doNotTrack or nixpkgs.config.consoleDoNotTrack, with the default being null/unset.
In this case devenv could check whether nixpkgs.config.doNotTrack is true and only then add the env var. That keeps the “opt out” model, where false is treated as false and unset is also treated as false.
Other packages could treat only an explicit false as allowing tracking, with true and unset both disabling it.
There may be concerns for packages with many dependents, where it’d require excessive rebuilds, but for leaf packages it shouldn’t be a problem.
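To make the tri-state idea concrete, here is a minimal sketch of the decision logic in Python rather than Nix. The function name and the `unset_allows_tracking` parameter are made up for illustration; neither `nixpkgs.config.doNotTrack` nor this helper exists today.

```python
from typing import Optional


def tracking_allowed(do_not_track: Optional[bool], unset_allows_tracking: bool) -> bool:
    """Decide whether a package may send telemetry.

    do_not_track stands in for the proposed (hypothetical) option:
    True, False, or None when the user has not set it.
    unset_allows_tracking is the per-package choice of what "unset" means.
    """
    if do_not_track is None:
        return unset_allows_tracking
    return not do_not_track


# "Opt out" model described for devenv: unset behaves like false.
print(tracking_allowed(None, unset_allows_tracking=True))    # tracking stays on
# Stricter model for other packages: unset behaves like true.
print(tracking_allowed(None, unset_allows_tracking=False))   # tracking is off
```

An explicit true or false always wins; only the unset case differs between the two models.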
I very much agree with you @kampka, thanks for bringing this up. nixpkgs should first and foremost serve its users, and this includes removing anti-features.
Regarding tracking specifically, I absolutely think it’s an anti-feature and should be disabled by default. In my opinion, non-consensual[1] approaches like DO_NOT_TRACK are backwards. The right thing to do (meaning in the interest of users) would be to make tracking subject to explicit consent[2]. When this is not easily doable (for example, if the package does not provide a way to disable the tracking by default), the tracking code should just be stripped.
I like the definition from GDPR Article 4: “‘consent’ of the data subject means any freely given, specific, informed and unambiguous indication of the data subject’s wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her;”
For this software in particular, disabling telemetry is very much a “trust me” thing anyways.
Since the AI runs on a server, your request needs to go there to be processed whether or not you have telemetry enabled. All DO_NOT_TRACK does is append a URL parameter disable_telemetry with the selected value.
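As a rough illustration of that mechanism (not devenv's actual code), the sketch below appends the parameter to a request URL. The endpoint, the function name, the `"true"`/`"false"` encoding and the rule that only `DO_NOT_TRACK=1` counts as set are all assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_telemetry_flag(url: str, environ: dict) -> str:
    """Append disable_telemetry to the request URL based on DO_NOT_TRACK."""
    # Assumption: only the value "1" opts out.
    disable = environ.get("DO_NOT_TRACK") == "1"
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # Assumption: the server expects the strings "true" / "false".
    query.append(("disable_telemetry", "true" if disable else "false"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical endpoint; the request is sent to the server in both cases.
print(with_telemetry_flag("https://example.invalid/generate", {"DO_NOT_TRACK": "1"}))
print(with_telemetry_flag("https://example.invalid/generate", {}))
```

Either way the same request reaches the server, so honouring the flag is up to whoever runs it.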
You can have one opinion or another on the defaults being set. But actually having this option to disable telemetry is probably better than 99.9% of the other AI tools out there. So please consider this before this gets blown out of proportion. If this had just been done and the option to disable telemetry had never been offered, I am 99% certain no one would have batted an eye.
This is a non-comprehensive list, just what I put together over 40 mins. From what I can gather, when we make a telemetry option available in a service, we always default to disabling it.
(Just to mention, in home-manager there are some general settings which give examples of disabling telemetry but don’t set anything by default)
It’s not going to be possible to track every case: packages and services which have telemetry that the maintainer is unaware of, ones where the maintainer purposefully left telemetry at its default settings, or ones where telemetry exists but is disabled by default (opt-in) or at least asks the user upstream.
Some packages could be purposely left with telemetry enabled by default because it logs out “To disable telemetry do XYZ” during initial (or every) use.
Domen should not be allowed to revert disabling telemetry in nixpkgs. There is a clear conflict of interest here (again, sigh), and the majority of nixpkgs users don’t want telemetry, and they come before Domen.
If telemetry is not removed, it will quite frankly be a massive failure: putting the interests of corporate and established voices over users.
Do you have some good data for that statement? Or, in this case, that the majority of users of Domen’s package don’t want the telemetry? I don’t mean to dispute it, but it’s simple to just say things. I wouldn’t dare to claim what users want without first at least running a poll or something.
Note that the one who merged the disablement then encouraged Domen to revert, so…
The instant self-merge is concerning. I recommend devenv use its own method for distributing its product if it wants to maintain this level of control over how it is used by people.
Then it would also be worth mentioning that the revert PR was created at 8:37 UTC and self-merged at 8:37 UTC.
The apology and offer to revert came 1h later, at 9:38 UTC, after it had already been reverted.
No opinions on the timeline. IMO it’s fair to revert something which lacked sufficient prior discussion. I’m against self-merging, but it’s also not a biggie IMO for a small revert.
The interest here is to improve the generation to give more accurate results over time.
Here’s the table of what we collect into an sqlite database:
```sql
CREATE TABLE `runs`(
  `id` INTEGER NOT NULL PRIMARY KEY,
  `source` TEXT NOT NULL,
  `duration_sec` INTEGER NOT NULL,
  `finished_at` TIMESTAMP NOT NULL,
  `devenv_nix` TEXT NOT NULL,
  `devenv_yaml` TEXT NOT NULL
);
```
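For anyone who wants to poke at that schema locally, here is a small sketch that creates the table in an in-memory SQLite database and stores one row. Every value in the row is invented for illustration, and what `source` actually contains is not stated here, so treat the sample as a guess.

```python
import sqlite3

SCHEMA = """
CREATE TABLE `runs`(
  `id` INTEGER NOT NULL PRIMARY KEY,
  `source` TEXT NOT NULL,
  `duration_sec` INTEGER NOT NULL,
  `finished_at` TIMESTAMP NOT NULL,
  `devenv_nix` TEXT NOT NULL,
  `devenv_yaml` TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Hypothetical row: none of these values come from real devenv data.
conn.execute(
    "INSERT INTO runs (source, duration_sec, finished_at, devenv_nix, devenv_yaml) "
    "VALUES (?, ?, ?, ?, ?)",
    ("example source", 12, "2025-01-01 00:00:00", "{ pkgs, ... }: { }", "inputs: {}"),
)

# List the column names, i.e. everything a single record holds.
columns = [row[1] for row in conn.execute("PRAGMA table_info(runs)")]
print(columns)
```

Each record holds free-text fields alongside timing data, which is why people ask what exactly ends up in `source`, `devenv_nix` and `devenv_yaml`.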
The issue with AI models is that they are extremely unpredictable, so when you update to a newer model you need to understand if the results you’re getting are at least equally good.
That’s not an easy task to do, given that the output from AI is non-deterministic.
The interest here is that the results get more accurate over time. AI generation is something I pay for out of my own pocket, and it is offered for free.
Do you think that maybe users of devenv might benefit from the work we’re doing?
Or do you believe we have some ill intentions, and which?
I think this is missing the point. I’m sure the telemetry is going to benefit users of devenv (or at least it’s supposed to), but the issue is that users may not even be aware that telemetry is being gathered, and not opting out isn’t exactly consent.
Regardless, I think there needs to be a clear policy within Nixpkgs with regards to how telemetry is handled (if at all possible for a package).
I can’t find a privacy policy on data handling on the devenv website. According to GDPR principles, if user data is being collected, a privacy policy needs to be in place and users need to explicitly opt in to data collection and retention. Judging only by the blog post about the AI feature, there seems to be only an opt-out, without any indication that data is being collected, which might be a breach of GDPR.
We shouldn’t redistribute software that doesn’t adhere to GDPR laws and general data protection principles. Everybody please keep in mind that devenv is likely not the only potential offender in nixpkgs, so a policy is IMHO needed.
This is going off topic a bit, but the sad truth is: people do not care about your intentions.
That’s not your fault though.
Day in and day out, we (as a society) are barraged by corporate actors with the same corporate spiel of “All we want is to give you a better product, all we want is your data. We won’t do anything bad, pinky promise!”. Most of those actors are shady and act out of a motivation that is mostly reflected in their shareholder profits.
People are simply tired of having to be on the lookout, being on the defensive against a never ending onslaught. So when you bring that attitude into the FOSS space, which many perceive as a safe space, you are bound to get push back.
If you want to win over the trust of such a community, a five line blog post is just not going to cut it. You’ll simply have to do better.
Yes, I invited Domen to revert the change (which he had already done when I sent the answer).
I said that because it was illegitimate for me to merge the original PR. It had nothing to do with the actual patch, only with the way it was merged.