Speeding up fuzzing Rust with shared initialization

Having ported the Pi searcher to Rust, I've started exploring fuzzing frameworks to help find bugs.  I settled on Rust's cargo-fuzz using the libfuzzer backend, since I'm familiar with libfuzzer from using it on TensorFlow.

Naturally, since the Pi Searcher had been comparatively stable for 7 years in Go, and the translation from Go to Rust was fairly straightforward... I introduced a bug that my tests didn't find, and the fuzzer found over dinner:

Whoopsie.  That bug arose because of an error in handling the boundary condition of searching for just "9" using the index (which doesn't happen normally because the search function uses a simple sequential search for short strings):
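The listing below is only a toy sketch of how a boundary bug of this shape can arise, not the Pi searcher's actual code. It assumes the index is a suffix array over the digits and that the upper end of a match range is found by searching for the next-larger prefix; `build_suffix_array`, `prefix`, `successor`, and `match_range` are hypothetical names. A query made only of 9s has no next-larger prefix, so the range has to end at the number of indexed digits. Passing a smaller value there, such as the byte size of a compressed file, can yield a range whose end precedes its start:

```rust
/// Sorts suffix start positions lexicographically (fine for a toy input).
pub fn build_suffix_array(digits: &[u8]) -> Vec<u32> {
    let mut sa: Vec<u32> = (0..digits.len() as u32).collect();
    sa.sort_by(|&a, &b| digits[a as usize..].cmp(&digits[b as usize..]));
    sa
}

/// First `len` digits of the suffix starting at `start` (shorter near the end).
fn prefix(digits: &[u8], start: u32, len: usize) -> &[u8] {
    let s = &digits[start as usize..];
    &s[..s.len().min(len)]
}

/// Smallest digit string that sorts after every string starting with `pat`,
/// or None when `pat` is all 9s and nothing sorts after it.
fn successor(pat: &[u8]) -> Option<Vec<u8>> {
    let last = pat.iter().rposition(|&d| d != 9)?;
    let mut next = pat[..=last].to_vec();
    next[last] += 1;
    Some(next)
}

/// Half-open range of suffix-array slots whose suffixes start with `pat`.
/// `num_digits` is the fallback upper bound used when `pat` has no successor;
/// it must be the number of indexed digits, not the size of the digit file.
pub fn match_range(digits: &[u8], sa: &[u32], pat: &[u8], num_digits: usize) -> (usize, usize) {
    // Index of the first slot whose suffix prefix is not smaller than `p`.
    let lower = |p: &[u8]| sa.partition_point(|&i| prefix(digits, i, p.len()) < p);
    let hi = match successor(pat) {
        Some(next) => lower(next.as_slice()),
        None => num_digits,
    };
    (lower(pat), hi)
}
```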

Here, num_digits is 200 million, but the digits are stored compressed, two per byte. The buggy code used the size of the file (100 million bytes) instead, which caused the binary search to return a bad range.  Fuzzing is awesome, and adapting the fuzzer to the Pi searcher was straightforward:

But one thing that bugged me:  Fuzzing was a little slow.

Why was it slow?  Every time the fuzz function ran with a new input, it had to instantiate a new PiSearcher.  Each PiSearcher opens and memory-maps (via mmap) two files:  100MB of compressed digits of Pi and an 800MB suffix array index into those bytes.  This adds a lot of unnecessary overhead compared to the relatively fast searching code.

Libfuzzer's documentation suggests that initialization should use a statically initialized global object.  But this is Rust.  One does not simply willy-nilly toss around globals.  Some quick Googling didn't turn up much that was useful, so I put it aside, as this was merely a speed optimization.

Fortunately, the most recent "This Week in Rust" pointed to Paul Kernfeld's very useful Guide to Global Data in Rust.  Which, in turn, linked to the once_cell crate.  Just what the fuzzer ordered!  A quick rewrite of the fuzzing target yielded:

Which allows the (read-only) PiSearcher object to be reused across fuzzing calls.  Simple and easy.
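As a minimal sketch of the pattern (not the actual fuzz target), the difference between the two versions looks roughly like this. To stay dependency-free it uses `std::sync::OnceLock`, the standard library's counterpart to once_cell's `sync::OnceCell`, available since Rust 1.70. `Searcher`, `SETUPS`, `fuzz_one_fresh`, and `fuzz_one_shared` are hypothetical stand-ins, and the tiny in-memory digit string replaces the memory-mapped files:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive setup runs (demonstration only).
pub static SETUPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for PiSearcher: read-only data that is costly to load.
pub struct Searcher {
    digits: Vec<u8>,
}

impl Searcher {
    /// Stands in for opening and memory-mapping the digit and index files.
    pub fn new() -> Searcher {
        SETUPS.fetch_add(1, Ordering::SeqCst);
        Searcher { digits: b"14159265358979323846".to_vec() }
    }

    /// Sequential search; returns the offset of the first match, if any.
    pub fn search(&self, needle: &[u8]) -> Option<usize> {
        if needle.is_empty() {
            return None;
        }
        self.digits.windows(needle.len()).position(|w| w == needle)
    }
}

/// Old shape: every fuzz input pays for a fresh Searcher.
pub fn fuzz_one_fresh(data: &[u8]) -> Option<usize> {
    Searcher::new().search(data)
}

/// Built on first use, then shared read-only by every later call.
static SHARED: OnceLock<Searcher> = OnceLock::new();

/// New shape: the setup cost is paid once per process.
pub fn fuzz_one_shared(data: &[u8]) -> Option<usize> {
    SHARED.get_or_init(Searcher::new).search(data)
}
```

With cargo-fuzz, the closure passed to `fuzz_target!` would call the shared version. On toolchains older than Rust 1.70, once_cell's `sync::Lazy` or `sync::OnceCell` fill the same role.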

How much did this help?

> cargo fuzz run coretest

Debug build (default):
Old version (new mmap per function call):  304 tests/second.
New version (statically initialized PiSearcher):  340 tests/second.

About 12% is nice and all, but that's not a particularly large gain.  But wait - that's very slow compared to the normal speed of the Pi searcher.  Ahh, yes:  cargo fuzz builds in debug mode by default.  Let's try release, keeping in mind that coverage may not be quite as good (but with the speed gains, it's probably worth it):

> cargo fuzz run coretest --release --jobs 4

Release build, 4 jobs:
Old version:  10800 tests/second.
New version:  26050 tests/second.

Not bad - about 12% in debug mode and roughly 2.4x faster in release mode with multiple jobs.  Thanks, once_cell!

