Six weeks in, time for an update!
More information about the topic and goals of this project: https://discourse.nixos.org/t/tweag-fellowship-fuzzing-nix-0 .
Previous update can be found at: https://discourse.nixos.org/t/tweag-fellowship-fuzzing-nix-1 .
Generally, progress seemed slower than during the first three weeks.
I caught the first bug using a fuzzing harness exercising the code instrumented with ASan.
As far as I can tell, it did not have serious security implications, but it is nice to see our efforts starting to pay off!
The underlying issue received an easy fix in NixOS/nix pull request #5011 (libexpr: Fix read out-of-bound on the heap, by Pamplemousse), which got merged quickly.
In the previous update, I mentioned the difficulties I had making the current build system fit my use case.
After a couple of days trying, without success, to build the components needed for fuzzing with make, I found NixOS/nix pull request #3160 (added meson support, by p01arst0rm), which introduces meson as a replacement.
Standing on its shoulders, I managed to get something working within a couple of hours.
Hence, I plan to keep working on top of this PR for the time being, and to humbly help it approach a “production-ready” state.
After setting up dedicated hardware, I was able to run a fuzzing session for a longer time.
Interestingly, it managed to find hundreds of crashers in a couple of hours, after which I paused it, as there is no point in gathering more than can humanly be triaged.
A “crasher” (sometimes “crash file”, or simply “crash”) is a file containing the Data causing the fuzz target to fail (either from critical memory corruption, that is segfaults, or from corruption detected by ASan).
Although libFuzzer does its best to avoid redundant crashers based on the coverage they produce (only one of two crashers exercising the same code path would be reported), a single bug can still produce several crashers.
I spent some time investigating the crashers obtained earlier, prioritising the fourteen produced by running the fuzzer with ASan on the test expressions.
Sadly, this made it apparent that all of them were due to a faulty fuzz target, which offers a great transition to the next section.
fuzz target - a function that accepts an array of bytes and does something interesting with these bytes using the API under test [2]
With libFuzzer, our fuzz target is implemented as the body of the LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) function.
libFuzzer calls this function thousands of times in a row, providing different Data each time.
To make the process deterministic, LLVMFuzzerTestOneInput should avoid mutating any global state; otherwise, the outcome of a call can depend on the calls that preceded it, and a failure may not be reproducible in isolation.
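To make the shape of such a function concrete, here is a minimal sketch of a fuzz target that keeps all of its state local. It is not the harness used for Nix: parseAndCheck is a hypothetical stand-in for the API under test, invented purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the API under test (NOT a real Nix function):
// it scans the input and rejects unbalanced parentheses.
static void parseAndCheck(const std::string &expr) {
    int depth = 0;
    for (char c : expr) {
        if (c == '(') ++depth;
        if (c == ')' && --depth < 0)
            throw std::runtime_error("unbalanced ')'");
    }
    if (depth != 0)
        throw std::runtime_error("unbalanced '('");
}

// Entry point that libFuzzer calls repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t *Data, std::size_t Size) {
    // Copy the bytes into a local object: nothing outlives this call.
    const std::string input(reinterpret_cast<const char *>(Data), Size);
    try {
        parseAndCheck(input);
    } catch (const std::runtime_error &) {
        // Rejecting invalid input is expected behaviour, not a bug.
    }
    return 0;  // ordinary inputs return 0
}
```

Such a file would typically be built with something like clang++ -g -fsanitize=fuzzer,address, which links in libFuzzer's own main; since every object here is local to the call, running the same input twice should behave the same way both times.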
Take two minimized crashers:
$ cat map.bug
map
$ cat seq.bug
builtins.seq
Passing them one by one through the fuzzer [1] does not trigger any error:
$ ASAN_OPTIONS=detect_leaks=false ./buildir/fuzz/parse_eval-fuzzer-with-asan -detect_leaks=0 map.bug 2>/dev/null && echo "success"
success
$ ASAN_OPTIONS=detect_leaks=false ./buildir/fuzz/parse_eval-fuzzer-with-asan -detect_leaks=0 seq.bug 2>/dev/null && echo "success"
success
But ASan reports a memory violation when the fuzzer runs them consecutively (the order does not matter):
$ ASAN_OPTIONS=detect_leaks=false ./buildir/fuzz/parse_eval-fuzzer-with-asan -detect_leaks=0 map.bug seq.bug 2>&1 | head -n 9 | tail -n 5
Running: map.bug
Executed map.bug in 26 ms
Running: seq.bug
=================================================================
==19174==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7fffffff2e48 at pc 0x7ffff5279eb5 bp 0x7fffffff2d90 sp 0x7fffffff2d88
$ ASAN_OPTIONS=detect_leaks=false ./buildir/fuzz/parse_eval-fuzzer-with-asan -detect_leaks=0 seq.bug map.bug 2>&1 | head -n 9 | tail -n 5
Running: seq.bug
Executed seq.bug in 23 ms
Running: map.bug
=================================================================
==19300==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7fffffff3168 at pc 0x7ffff5296f58 bp 0x7fffffff3060 sp 0x7fffffff3058
It is as if the first run of LLVMFuzzerTestOneInput left behind some global state that causes the second run to fail…
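The pattern can be reproduced in miniature. The following toy is only an illustration of the symptom, not the actual Nix bug: g_seen, toyTarget and runSequence are hypothetical names, and the "failure" is a plain return value rather than a real memory error.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical global mimicking interpreter-wide state (e.g. a symbol table).
static std::vector<std::string> g_seen;

// Toy target: reports failure (false) once the global holds two entries,
// a condition that no single input can reach in a fresh process.
static bool toyTarget(const std::string &input) {
    g_seen.push_back(input);   // mutation that outlives the call
    return g_seen.size() < 2;
}

// Simulates one fuzzer process: start from clean globals, run inputs in order.
// Returns the index of the first failing input, or -1 if all of them pass.
static int runSequence(const std::vector<std::string> &inputs) {
    g_seen.clear();  // a new process starts with fresh globals
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        if (!toyTarget(inputs[i]))
            return static_cast<int>(i);
    }
    return -1;
}
```

Calling runSequence with either input alone reports no failure, while passing both reports a failure on the second one regardless of the order, which mirrors what the two sessions above show.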
I am still in the process of trying to understand this bug better and to find a way to fix it.
It is known that the evaluation function (EvalState::eval) is not reentrant, so any global state involved in the evaluation is one of our primary suspects.
Any other global state that gets mutated might also be on this list.
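Should the culprit turn out to be a plain global variable, one common way to keep a harness from leaking it is to snapshot the variable and restore it at the end of each call. The sketch below assumes exactly that situation; Restore, g_depth and harnessBody are hypothetical names, and nothing here is taken from the Nix code base.

```cpp
#include <utility>

// Hypothetical global touched by the code under test.
static int g_depth = 0;

// RAII guard: snapshots a variable on construction and restores it at scope
// exit, including when an exception unwinds the stack.
template <typename T>
class Restore {
public:
    explicit Restore(T &ref) : ref_(ref), saved_(ref) {}
    ~Restore() { ref_ = std::move(saved_); }
    Restore(const Restore &) = delete;
    Restore &operator=(const Restore &) = delete;

private:
    T &ref_;
    T saved_;
};

// Hypothetical harness body: mutates g_depth while running, leaves no trace.
static int harnessBody(int nesting) {
    Restore<int> guard(g_depth);
    g_depth += nesting;  // stands in for whatever the code under test mutates
    return g_depth;      // value is copied out before the guard restores g_depth
}
```

This only helps for state that can be copied and put back; state hidden behind allocators or third-party libraries would need a different approach, such as building a fresh instance per call.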
Currently buggy, the harness is useless, so fixing it is a priority.
Once that is done, I intend to resume where I left off:
- Run long fuzzing sessions;
- Triage and patch bugs;
- Enrich the fuzzing toolset (build fuzzers with different sanitizers).