Another three weeks have passed; here is the update!
More information about the topic and goals of this project: https://discourse.nixos.org/t/tweag-fellowship-fuzzing-nix-0 .
Previous updates can be found at:
- https://discourse.nixos.org/t/tweag-fellowship-fuzzing-nix-1 ;
- https://discourse.nixos.org/t/tweag-fellowship-fuzzing-nix-2 .
The most depressing period so far: no progress on previous issues, and I even encountered strong pain points when trying to explore different directions!
After failing to figure out where the side effects of the harness exercising parsing and evaluation came from, I decided to move on and fuzz a different part of the codebase.
I aimed to make a simpler fuzz target, to at least have a working example, so I targeted Store::parseStorePath.
The fuzz target is pretty straightforward, and I used it to run a fuzzing session lasting less than a day, which (un)fortunately did not produce any crasher.
At least, I have an operational target!
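For readers unfamiliar with libFuzzer, the general shape of such a target can be sketched as follows. This is not the actual target: parsePathStandIn is a hypothetical stand-in for the store path parser, so that the sketch compiles without the nix libraries, and the only libFuzzer-specific part is the LLVMFuzzerTestOneInput entry point.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-in for the function under test (an assumption made for
// this sketch). A real target would call into the nix store library instead.
static std::string parsePathStandIn(std::string_view path) {
    constexpr std::string_view prefix = "/nix/store/";
    if (path.substr(0, prefix.size()) != prefix)
        throw std::invalid_argument("path is not in the store");
    return std::string(path.substr(prefix.size()));
}

// libFuzzer entry point: called once for every generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    std::string_view input(reinterpret_cast<const char *>(Data), Size);
    try {
        parsePathStandIn(input);
    } catch (const std::exception &) {
        // Rejecting malformed input is expected behaviour, not a bug.
        // Only crashes and sanitizer reports should count as findings.
    }
    return 0;
}
```

With clang, a file like this is typically built with -fsanitize=fuzzer,address, which links in the libFuzzer driver and its main function.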
Encouraged by this meager success, I got interested in targeting another critical component of nix (along with parsing and evaluation): the daemon.
the Nix daemon […] is a required component in multi-user Nix installations. It performs build actions and other operations on the Nix store on behalf of non-root users. 1
The daemon receives data over a Unix domain socket, and performs the appropriate actions.
Note that one could directly make the fuzz target use the library functions that are called once the data has been received, but I thought it could be more beneficial to fuzz the way the daemon handles the protocol itself (especially as there is no protocol specification).
The produced fuzz target thus needs to:
- Simulate a client: Send data (received as a parameter of the fuzz target) to the daemon;
- Receive data using
So far, the harness creates two threads: one for the client, the other for the daemon, communicating through Unix sockets. What’s left is for the former to send data to the latter…
The most naive approach would be to have the client forward the raw byte array to the daemon, let it interpret what it receives and act correspondingly.
However, the daemon input is highly structured; it expects: a magic byte, the client version, extra parameters, a sequence of operations and their arguments, an end-of-file.
Leaving it to luck that random mutations of Data will produce a sequence of bytes respecting this structure is pretty optimistic.
libFuzzer provides two means to increase the odds with structure-aware fuzzing:
- Define a custom mutator: to manually mutate the Data fed to the fuzz target;
- Use libprotobuf-mutator to produce mutations respecting a given structure definition.
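To give an idea of the first option, here is a minimal sketch of a custom mutator. The two-byte header (kMagic, kVersion) is a made-up layout for illustration only and is not the real daemon protocol; the only part that comes from libFuzzer is the LLVMFuzzerCustomMutator hook and its signature.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>

// Made-up header for illustration only; NOT the real nix daemon protocol.
constexpr uint8_t kMagic = 0x42;
constexpr uint8_t kVersion = 0x01;
constexpr size_t kHeaderSize = 2;

// libFuzzer calls this hook, when it is defined, instead of its built-in
// mutations. The input is mutated in place and the new size is returned.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
    // Not enough room for the header: leave the input untouched.
    if (MaxSize < kHeaderSize) return Size;

    std::mt19937 rng(Seed);

    // Always (re)write the fixed header, so every input starts out valid.
    Data[0] = kMagic;
    Data[1] = kVersion;
    size_t newSize = std::max(Size, kHeaderSize);

    // Sometimes grow the payload by one random byte.
    if (newSize < MaxSize && rng() % 2 == 0)
        Data[newSize++] = static_cast<uint8_t>(rng() & 0xff);

    // Overwrite one random payload byte, never touching the header.
    if (newSize > kHeaderSize) {
        size_t idx = kHeaderSize + rng() % (newSize - kHeaderSize);
        Data[idx] = static_cast<uint8_t>(rng() & 0xff);
    }
    return newSize;
}
```

A real mutator for the daemon would have to understand operations and their arguments as well, which is where most of the effort would go.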
Implementing either of these solutions felt like diving into another rabbit hole, so I left this idea on the side for now, to focus my efforts on a more tangible outcome.
Integration to OSS-fuzz
OSS-fuzz offers open source projects the capability to run fuzzers on a dedicated infrastructure, making use of Google’s computing power, (hypothetically) without too much hassle.
To add nix to the OSS-fuzz projects, we need:
- One or several fuzz targets. This was the motivation behind our simple target;
- A build script. Somewhat already dealt with, as we keep scripts to build the fuzzers in our fork of nix. However, those scripts use meson, which is not (yet?) used upstream;
- A Dockerfile, for build reproducibility.
OSS-fuzz champions Docker for reproducibility.
An easy-to-use Docker image is provided to simplify toolchain distribution. This also simplifies our support for a variety of Linux distributions and provides a reproducible and secure environment for fuzzer building and execution. 2
However, it appears that their use of Docker only provides “build time” reproducibility.
Packages that are installed via Dockerfile or built as part of build.sh are not available on the bot runtime environment (where the fuzz targets run). 3
All build artifacts needed during fuzz target execution should be inside the $OUT directory. Only those artifacts are archived and used on the bots. Everything else is ignored (e.g. artifacts in $WORK, $SRC, etc) and hence is not available in the execution environment.
We strongly recommend static linking because it just works. […] 4
Another thing to note is that they provide a base image, gcr.io/oss-fuzz-base/base-builder, which is based on Ubuntu 16.04…
I didn’t see any other project’s image use a different base, so I assume OSS-fuzz requires the Dockerfile to use theirs.
So, here is what needs to be done, in an Ubuntu 16.04 Docker container: install nix's dependencies, and build the statically linked fuzzers.
After failing to install nix's dependencies from the Ubuntu package repositories, I wondered if I could leverage the flake.nix that we use for development anyway. I made the Dockerfile install nix from a tarball, for the root user. 5
I can successfully nix develop from the container.
I have yet to update the build of the fuzzers to produce statically linked binaries.
Only three weeks are left before the end of the fellowship!
I intend to focus 100% on bringing nix to OSS-fuzz.
1: nix daemon
5: Again, I assume it needs to use the root user, as no other project uses another one.