CVE-2024-3094: Malicious code in xz 5.6.0 and 5.6.1 tarballs

Yeah, you can also check it by enabling the backdoor's build-time condition (here, setting RPM_ARCH) and comparing the result against the default build:

$ nix build -f '<nixpkgs>' xz.out --out-link before

$ nix build --impure --expr 'with import <nixpkgs> {}; (xz.overrideAttrs (oa: { env.RPM_ARCH = "x86_64";})).out' --out-link after

$ ls -lh before/lib/liblzma.so.5.6.1 after/lib/liblzma.so.5.6.1
-r-xr-xr-x 2 root root 258K Jan  1  1970 after/lib/liblzma.so.5.6.1
-r-xr-xr-x 2 root root 210K Jan  1  1970 before/lib/liblzma.so.5.6.1

That is a difference of about 48 KB (258K vs. 210K). And one of them contains the malware's _get_cpuid calls:

$ nix-shell -p binutils-unwrapped
$$ diff -u0 <(nm --format=just-symbols before/lib/liblzma.so.5.6.1) <(nm --format=just-symbols after/lib/liblzma.so.5.6.1)
--- /dev/fd/63  2024-03-31 10:02:48.977464772 +0100
+++ /dev/fd/62  2024-03-31 10:02:48.977464772 +0100
@@ -27,0 +28,2 @@
+__tls_get_addr@GLIBC_2.3
+_cpuid
@@ -28,0 +31 @@
+_get_cpuid

Thus, by default, at least those ~48 KB of malicious payload were not included in nixpkgs.
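A minimal sketch of the same check as a reusable filter, assuming symbol names arrive one per line on stdin (the way nm --format=just-symbols prints them above). The function name has_backdoor_symbols is hypothetical, and the symbol list is only the two cpuid additions visible in the diff; a clean result does not prove a binary is safe.

```shell
# Hypothetical helper (not part of nixpkgs or xz).
# Reads symbol names on stdin, one per line, and succeeds (exit 0) if
# either of the cpuid symbols added in the diff above is present.
has_backdoor_symbols() {
    grep -qxF -e _cpuid -e _get_cpuid
}

# Possible usage, mirroring the nm call above:
#   nm --format=just-symbols after/lib/liblzma.so.5.6.1 | has_backdoor_symbols \
#       && echo "suspicious symbols found"
```

__tls_get_addr is left out on purpose, since it is an ordinary glibc symbol and would match plenty of unrelated libraries.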

12 Likes

Why do you think that we should remove or mark as insecure all related packages without waiting for the rebuild?

1 Like

It is not as simple as that. As already mentioned, xz is part of the bootstrap binaries and, through them, of stdenv, which means you cannot even write a simple string to $out without first compiling multiple gccs if you have no cache hits. On my system I would need to rebuild close to 14k packages, and some of them won’t even compile on my laptop unless I reduce max-jobs to 1.
If the revert were simply pushed to master, no PR created or updated in the next couple of days would pass its CI checks, and if it were pushed to nixos-unstable, we would prevent people from updating their systems.

5 Likes

(Point of order: could the “what kinds of source distribution should nixpkgs accept” discussion please go to a new thread? It’s unrelated to the current remediation efforts.)

7 Likes

14k store paths? Why does everything depend on xz anyway? How long would it take to build it for you? Isn’t it worth it to not have the vulnerability?

10 days is a very long time for such a critical security update, and we need to be better for the next time this happens.

xz is part of stdenv (along with a few other tools). It’s needed to unpack .tar.xz tarballs. Virtually everything depends on stdenv. nix why-depends might help you figure out the details:

$ nix why-depends --derivation nixpkgs#mc nixpkgs#xz
/nix/store/s724zymxglwzll28qkigb43aja6g8zm4-mc-4.8.31.drv
└───/nix/store/17gdfyx2nzzcbhh8c2fm6zm8973nnrsd-stdenv-linux.drv
    └───/nix/store/3mn2armpm7zvykml4aqy9rxvafczcpxx-xz-5.6.1.drv

5 Likes

I would estimate it to be at least a couple of days, if not a week, and I would need to babysit it and restart it a couple of times, and that is with multiple remote builders available that have 96 cores and 256 GB RAM.

Since the vulnerability is not exploitable at this point, it’s not worth it for me.

This is just a best-case estimate. It could just as well take 4 days, or 6, or 8. It is really hard to tell.

1 Like

It seems that the main problem here is not that xz (the program) is included in stdenv, but that the xz binary in stdenv is provided by the same derivation that provides liblzma.

I wonder if it would be possible to make stdenv only expose the binary build tools, without exposing the associated libraries. That way, in the future we could first quickly update liblzma (rebuilding only the packages that actually link to it via buildInputs etc.) and then update the xz binary used by stdenv at a later date.

2 Likes

I’m not sure I understand. xz does link against (its own) liblzma:

$ lddtree `which xz`
/run/current-system/sw/bin/xz (interpreter => /nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44/lib/ld-linux-x86-64.so.2)
    liblzma.so.5 => /nix/store/yyqzw7xvsrn3h2zrvincbs1b291yzx8c-xz-5.6.1/lib/liblzma.so.5
    libpthread.so.0 => /nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44/lib/libpthread.so.0
    libc.so.6 => /nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44/lib/libc.so.6

If we are to fix liblzma we should relink xz as well.
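To make that check repeatable, here is a small sketch that pulls the resolved liblzma path out of lddtree-style output like the listing above. The helper name liblzma_path is hypothetical, and it assumes the "name => path" line layout shown above.

```shell
# Hypothetical helper: reads lddtree/ldd-style lines on stdin and prints
# the path that liblzma resolves to (the third field of "name => path").
liblzma_path() {
    awk '$1 ~ /^liblzma\.so/ && $2 == "=>" { print $3 }'
}

# Possible usage:
#   lddtree "$(command -v xz)" | liblzma_path
```

After a fix lands, the printed store path should point at the fixed xz output rather than the 5.6.1 one.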

1 Like

Yes. My point is that we don’t need to update the xz binary that is used in stdenv (or at least not right away), if we ensure that it’s only used during isolated builds. Even if the xz binary is technically using the “vulnerable” liblzma library, it doesn’t really matter since it can extract .tar.xz files just fine and the build process is isolated anyway.

We should have one version of stdenv.xz (and other build tools) that only exposes the binary and is used only during stdenv builds (we might even want to enforce that its path doesn’t appear anywhere in build outputs), and another “nixpkgs-wide” version of xz that would be explicitly used by derivations that actually need xz (either to link to or to call the xz binary at runtime).

1 Like

Unless you have a more systemic fix you’re thinking of, this is really just overfitting to a fringe/one-off event.

6 Likes

The maximal version of what I am proposing is a systemic fix. Basically, split all the dependencies of stdenv into two groups “build tools” and “libraries provided by default”. Enforce that the first group doesn’t appear in the build outputs. Reduce the second group as much as possible (ideally, just core stuff like glibc).

If we do that, then the next time a vulnerability is found in one of the packages used by stdenv, we can quickly rebuild only the packages that actually link against or reference the vulnerable package and then bump the stdenv build tool version at a later date.
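The “doesn’t appear in the build outputs” part could be checked roughly like this. It is only a sketch: the function name output_references is made up, and it does a plain string search for the store path in an output directory, which is an assumption about how such a check might look rather than how nixpkgs does it (Nix itself already has a disallowedReferences derivation attribute for this kind of rule).

```shell
# Hypothetical helper: succeeds (exit 0) if any file under the output
# directory $1 contains the store path $2 as a literal string.
output_references() {
    grep -rqF -- "$2" "$1"
}

# Possible usage with placeholder paths:
#   output_references ./result /nix/store/<hash>-xz-5.6.1 \
#       && echo "build tool leaked into the output"
```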

3 Likes

Aside from the complexity of the implementation, how much do you expect to gain from such a change? What rebuild decrease would you call a net benefit? A 2x, 10x, or 100x rebuild speedup?

Let’s imagine we can isolate liblzma. What would it take to rebuild packages against a new liblzma outside stdenv? My silly grep against my currently running system reveals the following direct users of the library:

$ fgrep -Rl liblzma.so $(nix path-info -r /run/current-system) 2>/dev/null | tr '/' ' ' | awk '{print $3}' | uniq

yyqzw7xvsrn3h2zrvincbs1b291yzx8c-xz-5.6.1
0v0wrr6ngh9d487lhwicwr5z61kz40zw-kmod-31
b4hxc9cg3700ac8p50gcj6hrcp17f9c3-kmod-31-lib
s2d4y6k2lanq8v8vg3skaxhmdflv12px-elfutils-0.190
2zvi5q6fvrmznavnqgzc947wssilv9vy-xz-5.6.1-bin
3np3qw5y5xarl4hxbhk9vj2d5kmgqsir-systemd-255.2
n5r9q9hxnbk168ps5kgxz7c2b8ym63pn-xz-5.6.1
bd2rgypp76p9mh7cc8152v57ckcpa92n-elfutils-0.190
mpbhjn9188gjgfj33nciif90x1zrz2zk-libunwind-1.8.1
plxvn2qhfa298rvwnazflvf1a8can4ih-libarchive-3.7.2-lib
dgbkx58nibgmav24mdaa1kxp634c3bym-xz-5.6.1-bin
g1af0mi9dnhpzw569zh50hw99661bhkv-python3-3.11.8
0xyqy6xlhgc63skigila2s5ifbhqqy0d-squashfs-4.6.1
n351xy2dk3m93s66flf993fhdzhznrn1-libtiff-4.6.0
1bwr5a2jinva4m5rzrbbhbzxpdbl1bk8-rizin-0.7.2
7wz6hm9i8wljz0hgwz1wqmn2zlbgavrq-python3-3.11.8
1spv5a8yi21zvi5mc7d0nfc46r79fnh4-ffmpeg-headless-6.1.1-lib
v7myppkzzsqvbl8230kld6z6g7dxshq9-libunwind-1.8.1
95zlvlyij0lxrlvsp1kgln58wxmjhr0s-karchive-5.115.0
zpafyxg75x3giyimh0c377sgwyypbyql-libtiff-4.6.0-bin
2sg8lk8k6ddvmj5nps2c213nkvhjlymq-ffmpeg-headless-6.1.1-lib
acbnmbypm3chs3ich1x99if4z0wnvr23-ffmpeg-6.1.1-lib
a6kpglzpj6nan8bxfjiqfcvvzqi2sgb2-kmod-31-lib
cs7zpcypgdvn2pjl98sph7m4dclj1cf3-kmod-31
3h0ikvb7jcfmqd1gz9is9ln7zsf526ah-systemd-255.2
4ifz2p14l5zivj6nc8l9s28kwq1cnz9w-xz-5.6.1-doc
fw6ws2d0assaiidcvlaahraa1pavgcfj-rpm-4.18.1
c57hvlkji0waj4zq0yxv1dfdw438rjmm-libxmlb-0.3.15-lib
npvqxns3miwkryagf4clrlldxbs649i0-libarchive-3.7.2
yvsxjd4zm7dkgl97d8vksinsdbhshnf6-python3-3.11.8-env
9hk7mrhmjfncx9aabrx7c9x393zqpm8r-boost-1.81.0
sl3h5z7q1ii0vbm3329iiz2vk59ywrrj-source
i1kn97pqkhg00glv080rla291wf05bzf-expose-flakes-inputs
ir3hy542khqxakcyb3d3b7pjq61g96qd-perf-linux-6.8.2
ha08hi6c7ak2iv682vapycr91h4cvk0s-libtiff-4.6.0
xlyfsi4v0kn8cy8lzdblp8rgp237586p-ffmpeg-6.1.1-lib
mpqmb9lv0i804vm2yi58h0w4ddnn3gzr-python3-3.11.8
d5i2w6dwgpcwhza8ywnd273jnvyvq58w-libarchive-3.7.2-lib
jwmiqziglj42a3a357cjd8vwp4rn7l7z-python3-3.11.8
6yb3nkk9jc8gd4fwigi8ipxv3wydyk95-ffmpeg-4.4.4-lib
y7bx3zmi7s06aifbn5wb8pk6q9ik3nx5-gdb-14.1
asn5nzbf4rs4mgbbgg8llqrnrmvxgnbi-python3-3.11.8-env
42yf6sfapwip0wbsph9giig6gqr99088-system-path
wismz59j4g8fbxc1zkkx9x3nz5kpp300-systemd

They all better be updated if we update a vulnerable library, right?
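Since uniq only drops adjacent duplicates and the names still carry their store hashes, the same package shows up several times in that list. A rough sketch that collapses it to package names, assuming full /nix/store paths on stdin with the usual 32-character hash prefix (store_names is a hypothetical name):

```shell
# Hypothetical helper: reads /nix/store paths on stdin, strips the
# directory prefix, the 32-character hash and anything below the store
# path, then prints each remaining "name-version" once, sorted.
store_names() {
    sed -E 's|^/nix/store/[0-9a-z]{32}-([^/]+).*$|\1|' | LC_ALL=C sort -u
}

# Possible usage, replacing the tr/awk/uniq tail of the pipeline above:
#   fgrep -Rl liblzma.so $(nix path-info -r /run/current-system) 2>/dev/null | store_names
```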

To simulate a liblzma update, I tweaked each package individually in my local checkout to change its output hash and ran $ ./maintainers/scripts/rebuild-amount.sh HEAD^ to get the rebuild counts in nixpkgs:

The ones with the most rebuilds are:

  • elfutils: 36058 x86_64-linux rebuilds
  • libunwind: 7790 x86_64-linux
  • python: 66773 x86_64-linux
  • libxml2: 50197 x86_64-linux

python rebuild is probably a full nixpkgs rebuild.
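To put those counts in proportion, here is a tiny sketch that expresses one rebuild count as a percentage of another. Treating the python number as the full-rebuild baseline is only the guess made above, and pct is a hypothetical helper name.

```shell
# Hypothetical helper: prints $1 as a rounded percentage of $2.
# Assumes both arguments are numbers and $2 is non-zero.
pct() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.0f\n", 100 * part / whole }'
}

# Possible usage with the counts above (python as the assumed baseline):
#   pct 7790 66773    # libunwind
#   pct 36058 66773   # elfutils
#   pct 50197 66773   # libxml2
```

With the python count as baseline this gives roughly 12%, 54% and 75% respectively.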

Looking at the numbers above, I would say it’s not worth the complexity of fiddling with xz outputs. We would rebuild most things anyway, even if xz were not in stdenv, because liblzma is used widely enough outside it.

Note that there are more libraries (like pcre2) that cause a stdenv rebuild. I don’t think holding those back just for stdenv would be of any benefit rebuild-wise either.

4 Likes

I’m also wondering about this. I saw a post on Hacker News saying that there were downsides, like stuff being impure.

1 Like

FYI, the downgrade of xz is in nixpkgs master now. *-linux binaries are basically all there.

13 Likes

hey, that didn’t take too long at all!

heads up that the PR reverting xz is now in nixos-unstable and nixpkgs-unstable https://nixpk.gs/pr-tracker.html?pr=300028

11 Likes

This prompted us to add support for the content-addressable store to Cachix and to see how much it would help with saving rebuilds. I’ll report back once we have some results.

19 Likes

@domenkozar Do you have any results by now?

3 Likes