Tweag Fellowship: Fuzzing Nix #1

Pamplemousse · July 2, 2021, 10:46pm

Three weeks ago, I started working on fuzzing nix. It’s time for an update!

You can find more information about the topic and goals of this project in the original announcement: https://discourse.nixos.org/t/tweag-fellowship-fuzzing-nix-0 .

Progress

I am quite happy to report that we made good progress on key topics.

First working fuzz target

We wrote and compiled our first fuzz target to exercise the parsing and evaluation logic of nix.

fuzz target - a function that accepts an array of bytes and does something interesting with these bytes using the API under test ¹

Using libFuzzer, fuzz targets are functions respecting the following template:

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // ... Do something with the Data.
  return 0;
}

In the body of the function, we use nix functions and data structures, fed with the Data received as a parameter. The function will be executed many times, with different Data, until something goes awry.
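As a minimal sketch of what such a body can look like: the raw bytes are copied into a string and handed to the code under test. `parseAndEval` is a hypothetical placeholder, not a nix function; a real target would call the nix parsing and evaluation code there.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the nix parsing and evaluation entry point.
// It is NOT part of the nix API; a real target would call into nix here.
static void parseAndEval(const std::string &expression) {
    // Placeholder: only touch the input so the sketch stays self-contained.
    (void)expression.size();
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // The fuzzer hands over raw bytes that are not NUL-terminated, so they
    // are copied into a std::string with an explicit length.
    std::string expression(reinterpret_cast<const char *>(Data), Size);
    parseAndEval(expression);
    return 0;
}
```

Copying with an explicit length matters because the input may contain NUL bytes or lack a terminator altogether.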

Gathering coverage information

As the fuzzing session progresses, the fuzzer gathers a corpus: a set of inputs that are interesting. In coverage-guided fuzzing, an “interesting” input is one that increases the coverage, i.e. executes instructions that were not executed by runs on previous inputs.

From our fuzz target code, we compiled another binary (different from the fuzzer) using LLVM’s Source-based Code Coverage, which instruments the code to gather coverage information.
Now, we can run this binary on the corpora generated by fuzzing sessions to identify which parts of the codebase were reached, and which were not.
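One way to picture such a coverage binary is a small driver that replays corpus files through the fuzz target. The sketch below is an assumption about how this could be done, not the project's actual driver: `replayFile` and `bytesReplayed` are hypothetical names, and the target is a placeholder. Built with clang's `-fprofile-instr-generate -fcoverage-mapping` flags, a driver along these lines would record which code each replayed input reaches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Bookkeeping for the sketch only: total number of bytes fed to the target.
static std::size_t bytesReplayed = 0;

// Placeholder fuzz target; the real one would exercise nix.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    (void)Data;
    bytesReplayed += Size;
    return 0;
}

// Feed the content of one corpus file to the fuzz target.
// Returns false when the file cannot be opened.
bool replayFile(const std::string &path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return false;
    std::vector<char> bytes((std::istreambuf_iterator<char>(file)),
                            std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(reinterpret_cast<const uint8_t *>(bytes.data()),
                           bytes.size());
    return true;
}
```

A `main` function would then call `replayFile` on every path received on the command line.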

Compiling with ASan (Address Sanitizer)

AddressSanitizer (aka ASan) is a memory error detector for C/C++.

By instrumenting code, ASan adds runtime checks for common memory errors; among other things, it makes memory errors fail explicitly, while they might have gone undetected without it.

This is especially useful during fuzzing, where we want to be notified of any kind of memory corruption; we thus compiled our fuzz target with ASan as a shared library.

Dealing with memory leaks

Our fuzzing sessions reported a lot of Out-Of-Memory errors. So far, all of them were false positives: the offending inputs execute gracefully when run independently.

nix code uses the Boehm garbage collector to automatically identify and free unused memory. Sadly, this GC does not always identify all the memory regions that are not in use anymore. Thus, over the course of a single run, nix leaks memory.

(For that reason, when running the fuzzer with ASan, we need to disable ASan’s LeakSanitizer using ASAN_OPTIONS=detect_leaks=false.)
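As an alternative to setting the environment variable on every run, ASan also lets a program supply default options by defining the `__asan_default_options` hook. The sketch below assumes the fuzz target is the place to put it; nothing in this post says the project does so.

```cpp
#include <cassert>
#include <cstring>

// If the program defines this function, the ASan runtime calls it at
// startup and uses the returned string as default options.
// Here it turns off leak detection, like ASAN_OPTIONS=detect_leaks=false.
extern "C" const char *__asan_default_options() {
    return "detect_leaks=0";
}
```

This keeps the setting next to the fuzz target, so it cannot be forgotten when launching a session by hand.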

Under normal runs, memory leaks are not problematic (as the program is short-lived), but an in-process fuzzer using nix code will need more and more memory as time passes.

In-process fuzzing is a fuzzing technique where the fuzzing happens in only one process, i.e., for every test case the process isn’t restarted but the values are changed in memory. ²

LLVMFuzzerTestOneInput is called in a loop and fed new Data every time; as its logic is executed again and again, the leaks accumulate and the fuzzer needs more and more memory.

To ward off some of the memory leaks, we disabled the GC ³, and implemented an “arena” ⁴: a big chunk of memory allocated at the beginning and freed at the end of the LLVMFuzzerTestOneInput function, where all the allocations made by nix will happen.

Comparatively, our solution seems to prevent a decent amount of leaks.

$ ./fuzzer-with-GC ./leak-3c363836cf4e16666669a25da280a1865c2d2874
[...]
SUMMARY: AddressSanitizer: 14544 byte(s) leaked in 322 allocation(s).

$ ./fuzzer-with-arena ./leak-3c363836cf4e16666669a25da280a1865c2d2874
[...]
SUMMARY: AddressSanitizer: 1184 byte(s) leaked in 80 allocation(s).
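The arena idea described above can be sketched as a bump-pointer allocator whose lifetime is one call of the fuzz target. This is an illustration under assumptions, not the project's implementation: the `Arena` class, its 1 MiB size and the placeholder target are all hypothetical, and the real arena serves the allocations made by nix rather than a single copy of the input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Minimal bump-pointer arena: one big allocation up front, individual
// requests carved out of it, everything released at once.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : base_(static_cast<unsigned char *>(std::malloc(capacity))),
          capacity_(base_ ? capacity : 0), used_(0) {}
    ~Arena() { std::free(base_); }
    Arena(const Arena &) = delete;
    Arena &operator=(const Arena &) = delete;

    // Return `size` bytes aligned for any fundamental type, or nullptr
    // when the arena is exhausted.
    void *allocate(std::size_t size) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t start = (used_ + align - 1) / align * align;
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return base_ + start;
    }

private:
    unsigned char *base_;
    std::size_t capacity_;
    std::size_t used_;
};

// Placeholder target: every allocation of one run lives in a fresh arena.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    Arena arena(1 << 20);  // 1 MiB, an arbitrary size for the sketch
    void *copy = arena.allocate(Size);
    if (copy && Size > 0) std::memcpy(copy, Data, Size);
    return 0;  // the Arena destructor frees everything in one go
}
```

Because nothing is freed individually, whatever a run forgets to release disappears with the arena at the end of that run.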

Seeds corpus, dictionary

A fuzzer can generate inputs from scratch, but to reach good coverage quickly, providing seeds is customary. The seeds, or initial corpus, are a set of inputs the fuzzer will run on and mutate from. In our case, we use the nix expressions that the nix codebase uses for its own tests.

It is also possible to supply a dictionary (a list of tokens) to libFuzzer for it to mutate from, and inject into inputs. We gathered a list of builtins (attrNames, import, etc.), keywords (assert, if, then, else, etc.), and operators (+, ++, //, etc.) for that use.
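libFuzzer reads the dictionary from a text file passed with `-dict=`, one double-quoted token per line. As a sketch of how a token list could be turned into such lines, the hypothetical helper `dictionaryEntry` below quotes a token and escapes the characters the format treats specially; it is an illustration, not the script used in this project.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format one token as a dictionary line: a double-quoted string in which
// backslashes and quotes are escaped with a backslash, and bytes outside
// printable ASCII are written as \xNN.
std::string dictionaryEntry(const std::string &token) {
    std::string line = "\"";
    for (unsigned char c : token) {
        if (c == '\\' || c == '"') {
            line += '\\';
            line += static_cast<char>(c);
        } else if (c < 0x20 || c > 0x7e) {
            char hex[5];  // room for "\xNN" plus the terminating NUL
            std::snprintf(hex, sizeof(hex), "\\x%02X", static_cast<unsigned>(c));
            line += hex;
        } else {
            line += static_cast<char>(c);
        }
    }
    line += '"';
    return line;
}
```

Applied to the tokens listed above, `attrNames` becomes `"attrNames"` and `//` becomes `"//"`.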

Difficulties encountered

Not everything went smoothly, and some points are still very much work-in-progress, if not show stoppers.

Code throwing `*Error`

nix has many code paths that raise exceptions. When an input leads to the execution of these paths, the program exits with an error, and the fuzzer identifies it as a crash. Surely, those are false positives: exiting with an error is legitimate, and such behavior does not lead to any memory corruption.

We updated our fuzz target to catch errors we identified as legitimate, and return 0, indicating to the fuzzer that everything went as expected.

So far, we preferred to catch specific errors. However, nix sometimes raises its generic Error.
The pull request "Prefer to throw specific errors" (NixOS/nix #4967) has been opened to make the errors thrown more precise, but this does not always seem possible.
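Catching only the errors considered legitimate can be sketched as follows. The two exception types and `parseAndEval` are hypothetical stand-ins defined for the example; they are not the nix classes or API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical error types standing in for the specific errors nix throws.
struct ParseError : std::runtime_error {
    using std::runtime_error::runtime_error;
};
struct EvalError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for nix parsing and evaluation.
static void parseAndEval(const std::string &expression) {
    if (expression.empty()) throw ParseError("unexpected end of input");
    if (expression == "throw") throw EvalError("evaluation aborted");
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    std::string expression(reinterpret_cast<const char *>(Data), Size);
    try {
        parseAndEval(expression);
    } catch (const ParseError &) {
        // Rejecting malformed input is expected behavior, not a crash.
    } catch (const EvalError &) {
        // Same for expressions that legitimately fail to evaluate.
    }
    // Any other exception propagates, and the fuzzer reports it as a crash.
    return 0;
}
```

Catching a generic base class instead would silence the false positives too, but it would also hide any unexpected error deriving from it, which is why specific types are preferable.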

Remaining Out-Of-Memory errors

Despite the use of the arena described earlier, our fuzzing sessions still report false Out-Of-Memory errors. Although we overrode malloc and calloc so that they return addresses from the arena, nix calls other functions that expect the caller to free the returned pointer (such as strdup).

If Out-Of-Memory errors become more prevalent, we will need to improve our solution to cover all allocation mechanisms.
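Covering one more allocation mechanism could look like the sketch below: an arena-backed counterpart of strdup. All names (`arenaAllocBytes`, `arenaStrdup`, `arenaReset`) and the fixed 4096-byte buffer are assumptions made for the example; whether this is the right way to handle such functions in nix remains open.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Tiny fixed-size bump arena, standing in for the one used by the fuzz
// target. It hands out byte-aligned memory only, which is enough for
// character strings.
static unsigned char arenaBuffer[4096];
static std::size_t arenaUsed = 0;

static void *arenaAllocBytes(std::size_t size) {
    if (size > sizeof(arenaBuffer) - arenaUsed) return nullptr;
    void *p = arenaBuffer + arenaUsed;
    arenaUsed += size;
    return p;
}

// Arena-backed counterpart of strdup: the copy lives in the arena, so the
// caller does not have to free it. Returns nullptr when the arena is full.
char *arenaStrdup(const char *s) {
    const std::size_t len = std::strlen(s) + 1;  // include the NUL byte
    char *copy = static_cast<char *>(arenaAllocBytes(len));
    if (copy) std::memcpy(copy, s, len);
    return copy;
}

// Release every allocation at once, e.g. at the end of a fuzzing run.
static void arenaReset() { arenaUsed = 0; }
```

With such a helper, a string duplicated during a run is reclaimed by `arenaReset` together with everything else.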

Compilation

libFuzzer requires the code to be compiled with clang; ASan, as well as other sanitizers, are also LLVM libraries.

Currently, nix’s flake.nix needs to be updated to provide a development environment respecting these requirements. The open issue "Restore the ability to build with clang on linux" (NixOS/nix #4129) points out this lack of flexibility.

Integration with the existing build process

As sketched through the previous sections, we need to compile each of our fuzz targets, and its required nix shared libraries, with slightly different options (for coverage, libFuzzer with ASan and other sanitizers…), to produce everything we need for a complete fuzzing session.

Ideally, we want to make use of the building mechanisms available in the repository as much as possible, as they already provide the way to compile and glue all the components (headers, libraries, binaries) together. Sadly, our use case (several flavors of the same code compiled with different flags) does not seem to fit this setup straightforwardly.

Fuzzer segfaults

We sometimes witnessed the fuzzer binary with ASan segfaulting.
After a bit of investigation, we managed to recreate an input reproducing this behavior.
It seems to be related to an evaluation going into an infinite recursion (or at least a deep enough one), for which ASan is unable to unwind the stack trace; more analysis is needed to get to the bottom of it.

Future plans

For the next couple of weeks, we will focus on the two most critical difficulties described above:

Integration with the existing build process;
Determine the cause of the fuzzer segfaults, and fix it.

Then, we will move onto the following topics:

Run the fuzzing sessions for a “long” period of time (couple of days);
Create a fuzz target that uses UBSan;
The memory management using the “arena”: integrate it more nicely upstream, and expand the functions it handles if needed.

Stay tuned!

1: Cf. libFuzzer – a library for coverage-guided fuzz testing, LLVM documentation.

2: Cf. Threats & Research Archives, F-Secure Blog.

3: After correcting the --enable-gc=no compile option; see the pull request "Allow to compile after `./configure --enable-gc=no`" (NixOS/nix #4947).

4: Cf. Region-based memory management, Wikipedia.