Fuzzing beast with libFuzzer

published at 10.07.2017 12:24 by Jens Weller

During the weekend I wanted to take a closer look at beast, an http library proposed for boost. I planned to write an http client class, as that's something I'll need in some project later anyway. I've been looking at beast on and off for a few months now, and started by reviewing the documentation and examples to get a feel for the library itself.

I also followed the review on the boost mailing list, with its many discussions on different topics related to beast. One of these was about how beast is tested for security. The author of the library mentioned that so far nobody had fuzzed the library, something which should be done. Also, in the last week there was a link on reddit to an introductory course on fuzzing with libFuzzer. So I decided to give fuzzing a shot with beast, using the course, which gives you an easy start.

Setup

I used one of my linux laptops, and hence had to run checkout_build_install_llvm.sh, which takes a while. From going through the course in the meantime, I knew some other things needed to be done, like generating valid example data for the fuzzer. For this a test utility already exists in beast, but it's not set up to generate files as output, so I wrote this little program to do this:

#include <iostream>
#include <fstream>
#include <string>

#include <beast.hpp>
#include <http/message_fuzz.hpp>
#include <boost/asio.hpp>

// Writes s generated requests to the files <path>req<s> ... <path>req1.
// path is used as a plain prefix, so pass a directory with a trailing '/'.
void writeRequests(const std::string& path,beast::http::message_fuzz& mfuzz, size_t s = 10)
{
    for(;s > 0;--s)
    {
        beast::multi_buffer buf;
        std::ofstream out(path + "req"+ std::to_string(s),std::ios::out|std::ios::binary);
        mfuzz.request(buf);
        out << beast::buffers(buf.data());
    }
}

// Same as above, but for responses: <path>response<s> ... <path>response1.
void writeResponse(const std::string& path,beast::http::message_fuzz& mfuzz, size_t s = 10)
{
    for(;s > 0;--s)
    {
        beast::multi_buffer buf;
        std::ofstream out(path + "response"+ std::to_string(s),std::ios::out|std::ios::binary);
        mfuzz.response(buf);
        out << beast::buffers(buf.data());
    }
}

int main(int argc, char *argv[])
{
    // Optional first argument: output prefix for the generated files.
    std::string path;
    if(argc > 1)
        path = argv[1];
    beast::http::message_fuzz mfuzz;
    writeRequests(path,mfuzz,50);
    writeResponse(path,mfuzz,50);
}

But this only generates messages as input for the fuzzer; all of them are valid, and beast should not have problems with them. The fuzzer, however, will mutate them, so it tests mostly invalid inputs. Next is the fuzzer.cpp file, which does the fuzzing itself. There is one entry point, called by the fuzzer, which provides the input as a const uint8_t* and a size_t. As I also fuzzed the websocket implementation, the fuzzer.cpp file has two functions to be called for the actual fuzzing:

#include <cstddef>
#include <cstdint>

#include <beast.hpp>
#include <http/test_parser.hpp>
#include <boost/asio.hpp>
#include <beast/test/pipe_stream.hpp>

// Feeds the raw fuzzer input to the http parser.
// Parse errors are reported through ec and ignored on purpose.
void fuzz_basic_parser(const uint8_t *data, size_t size)
{
    beast::http::test_parser parser;
    auto buf = boost::asio::buffer(data,size);
    beast::error_code ec;
    parser.put(buf,ec);
}

// Places the fuzzer input into the client side of a test pipe
// and lets the websocket stream read it as incoming data.
void fuzz_websocket_stream(const uint8_t *data, size_t size)
{
    boost::asio::io_service io;
    beast::test::pipe p{io};
    // Assumption: the template argument is the pipe's stream type, held by reference.
    beast::websocket::stream<beast::test::pipe::stream&> ws{p.client};
    auto buf = boost::asio::buffer(data,size);
    beast::ostream(p.client.buffer) << beast::buffers(buf);
    beast::multi_buffer mbuf;
    beast::error_code ec;
    ws.read(mbuf,ec);
}

// the actual interface to libFuzzer
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    fuzz_websocket_stream(data,size);
    return 0;
}

As one can see, it's a bit tedious to set up the data from libFuzzer so that it goes into beast, as with the websocket. Also, there is no main function; this is already provided by libFuzzer. Only two minor things are missing. First, the build.sh:

#!/bin/bash -eux
rm -f beast_fuzzer
clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard \
    -I/home/jens/cpp/libraries/beast/Beast/include \
    -I/home/jens/cpp/libraries/beast/Beast/test \
    -I/home/jens/cpp/libraries/beast/Beast/extras \
    -I/home/jens/cpp/libraries/boost_1_64_0 \
    fuzzer.cpp ../workshop/libFuzzer/Fuzzer/libFuzzer.a /home/jens/cpp/libraries/boost_1_64_0/stage/lib/libboost_system.a \
    -o beast_fuzzer

Clang is a required dependency, and the build script from the workshop works really well. Running build.sh produces the actual executable used for the fuzzing.
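Since libFuzzer provides main, fuzzer.cpp only has to supply the entry point. As a minimal sketch of what a driver does with a single input, the following helper reads one file and passes its bytes to a function with the entry point's signature. The names replay_file and FuzzTarget are made up for this example and are not part of libFuzzer or beast. Something like this can be handy for re-running one input without the fuzzing loop.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Signature of a libFuzzer-style entry point.
using FuzzTarget = int (*)(const std::uint8_t* data, std::size_t size);

// Hypothetical helper (not part of libFuzzer or beast): reads one file and
// passes its bytes to the target exactly once. Returns -1 if the file cannot
// be opened, otherwise whatever the target returns.
int replay_file(const std::string& path, FuzzTarget target)
{
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in)
        return -1;

    // Read the whole file into memory as raw bytes.
    std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());

    return target(reinterpret_cast<const std::uint8_t*>(bytes.data()),
                  bytes.size());
}
```

With the fuzzer.cpp from above, and assuming the seed files were written into seed_corpus/, a call could look like replay_file("seed_corpus/req2", &LLVMFuzzerTestOneInput).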

And with run.sh it's easy to start the fuzzing itself:

#!/bin/bash -eux

./beast_fuzzer -max_total_time=300 -print_final_stats=1 -timeout=5 corpus2 seed_corpus -jobs=100

Each run lasts at most 300 seconds, and the timeout for a single input is set to 5 seconds. The fuzzer uses two directories: corpus2 contains the evolved fuzzing inputs, while seed_corpus contains the valid inputs from beast's message_fuzz object. The jobs parameter lets libFuzzer run 100 fuzzing jobs, and the output of each job is written to its own fuzz-<JOB>.log file.
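To make the difference between the seed inputs and the evolved inputs a bit more concrete, here is a minimal sketch of the kind of small, byte-level changes a mutation-based fuzzer applies to a valid input. The three helper functions are hypothetical and only meant as an illustration; libFuzzer's own mutators are more varied and are guided by coverage feedback.

```cpp
#include <cstddef>
#include <string>

// Toy mutations for illustration only (not libFuzzer's implementation).

// Flips one bit of the byte at index; out-of-range indices leave the input unchanged.
std::string flip_bit(std::string input, std::size_t index, unsigned bit)
{
    if (index < input.size())
    {
        const unsigned char flipped =
            static_cast<unsigned char>(input[index]) ^ (1u << (bit % 8));
        input[index] = static_cast<char>(flipped);
    }
    return input;
}

// Removes the byte at index; out-of-range indices leave the input unchanged.
std::string erase_byte(std::string input, std::size_t index)
{
    if (index < input.size())
        input.erase(index, 1);
    return input;
}

// Inserts value before index; indices past the end append to the input.
std::string insert_byte(std::string input, std::size_t index, char value)
{
    if (index > input.size())
        index = input.size();
    input.insert(index, 1, value);
    return input;
}
```

Applied to a valid request, a single step like these is often enough to make the message invalid, which is exactly the kind of input a parser rarely sees in normal testing.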

Results

My expectation was that it would take some time to find anything. But the first run already found the bug, which is fixed now. It turned out that the other runs, which also showed lots of results, all found the same bug. This bug is related to handling "obs-fold" in http fields: the parser can return a nullptr, but this case was not handled. It was fixed within hours; the author of the library, Vinnie Falco, was very supportive and helped where he could.

This is the resulting bugfix. The fuzzing continued with the fixed version, but did not bring any other results so far.
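To illustrate the class of bug described above, here is a minimal sketch of a helper that can return a nullptr and a caller that checks for it. This is not beast's code and not the actual fix; skip_obs_fold and unfold_value are hypothetical names. The sketch assumes an obs-fold is a CRLF followed by at least one space or tab, and replaces each fold with a single space.

```cpp
#include <string>

// Hypothetical helper, NOT beast's code: returns a pointer just past an
// obs-fold (CRLF followed by at least one SP or HTAB) starting at p,
// or nullptr if the bytes at p do not form a complete obs-fold.
const char* skip_obs_fold(const char* p, const char* last)
{
    if (last - p < 3 || p[0] != '\r' || p[1] != '\n')
        return nullptr;
    if (p[2] != ' ' && p[2] != '\t')
        return nullptr;
    p += 3;
    while (p != last && (*p == ' ' || *p == '\t'))
        ++p;
    return p;
}

// Unfolds a field value, replacing each obs-fold with a single space.
// Returns false on malformed input instead of using a null pointer.
bool unfold_value(const std::string& in, std::string& out)
{
    out.clear();
    const char* p = in.data();
    const char* const last = p + in.size();
    while (p != last)
    {
        if (*p == '\r')
        {
            const char* next = skip_obs_fold(p, last);
            if (next == nullptr) // without this check, the loop would continue from nullptr
                return false;
            out.push_back(' ');
            p = next;
        }
        else
        {
            out.push_back(*p++);
        }
    }
    return true;
}
```

A value like "a\r\n b" unfolds to "a b", while a truncated or malformed fold such as "a\r" is rejected. Inputs of the second kind are what a fuzzer produces in large numbers.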

Review...

For most of us, beast is not what one expects from an http library. Beast does not provide ready-to-use "endpoints" to the user. Its aim is to support the low-level http interfaces in combination with asio. Asio is the networking part of beast; beast itself is more or less a protocol implementation.

Beast has the clear goal of integrating well into boost, but it also wants to follow up with standardization once asio is in the standard. It could become a foundation of a C++ networking stack, as its focus on http alone is the right one for this.

But it clearly needs another library, building on beast, to provide client and server primitives to the end user. That library could be called cage ;)

You can find my review of beast here.