Fuzzing

Fuzzing, or fuzz testing, is an automated software testing technique that involves providing invalid, unexpected, or random data as input to a computer program. The program is then monitored for exceptions such as crashes, failing built-in code assertions, or potential memory leaks.
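At its core, a fuzzer is a simple loop: generate malformed input, hand it to the target, and watch what happens. The following is a minimal sketch in Python; the length-prefixed parse_record target and its assertion are hypothetical stand-ins for the program under test.

import random

def parse_record(data: bytes) -> bytes:
    # Hypothetical target: a length-prefixed record parser.
    if len(data) == 0:
        raise ValueError("empty input")                 # graceful rejection
    length = data[0]
    payload = data[1:1 + length]
    assert len(payload) == length, "truncated record"   # built-in code assertion
    return payload

def random_input(max_len: int = 32) -> bytes:
    # Invalid, unexpected, or simply random data.
    return bytes(random.randrange(256) for _ in range(random.randrange(max_len)))

crashes = []
for _ in range(10_000):
    data = random_input()
    try:
        parse_record(data)
    except ValueError:
        pass                                            # expected rejection, not a bug
    except Exception as exc:                            # crash or failed assertion
        crashes.append((data, exc))

print(f"{len(crashes)} inputs triggered unexpected behaviour")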

An effective fuzzer generates semi-valid inputs that are "valid enough" in that they are not directly rejected by the parser, but do create unexpected behaviors deeper in the program and are "invalid enough" to expose corner cases that have not been properly dealt with.

The term "fuzz" originates from a 1988 class project taught by Prof. Barton Miller at the University of Wisconsin. The project was designed to test the reliability of UNIX command line programs by executing a large number of random inputs in quick succession until they crashed.

To allow other researchers to conduct similar experiments with other software, the source code of the tools, the test procedures, and the raw result data were made publicly available.

According to Prof. Barton Miller, "In the process of writing the project description, I needed to give this kind of testing a name."

In April 2012, Google announced ClusterFuzz, a cloud-based fuzzing infrastructure for security-critical components of the Chromium web browser.

In August 2016, the Defense Advanced Research Projects Agency (DARPA) held the finals of the first Cyber Grand Challenge, a fully automated capture-the-flag competition that lasted 11 hours.[14]

The objective was to develop automatic defense systems that can discover, exploit, and correct software flaws in real-time.

In September 2016, Microsoft announced Project Springfield, a cloud-based fuzz testing service for finding security-critical bugs in software.[16]

In December 2016, Google announced OSS-Fuzz, which allows for continuous fuzzing of several security-critical open-source projects.[17]

At Black Hat 2018, Christopher Domas demonstrated the use of fuzzing to expose the existence of a hidden RISC core in a processor.

In September 2020, Microsoft released OneFuzz, a self-hosted fuzzing-as-a-service platform that automates the detection of software bugs.[21]

Testing programs with random inputs dates back much further, to the 1950s, when data was still stored on punched cards.

In 1983, Steve Capps at Apple developed "The Monkey",[25] a tool that would generate random inputs for classic Mac OS applications, such as MacPaint.

Even items not normally considered as input can be fuzzed, such as the contents of databases, shared memory, environment variables, or the precise interleaving of threads.
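For example, an environment variable can be treated as just another fuzzable input by launching the target with a mutated environment. The sketch below assumes a hypothetical ./target binary that reads the LANG variable; only the mechanism is the point.

import os
import random
import subprocess

TARGET = "./target"    # hypothetical binary that consults the LANG environment variable

def random_text(max_len: int = 64) -> str:
    # Random, mostly invalid variable content (null bytes are not allowed in environments).
    length = random.randrange(1, max_len)
    return bytes(random.randrange(1, 256) for _ in range(length)).decode("latin-1")

for _ in range(100):
    env = dict(os.environ)
    env["LANG"] = random_text()                      # fuzz a single environment variable
    result = subprocess.run([TARGET], env=env, capture_output=True)
    if result.returncode < 0:                        # on POSIX, the process was killed by a signal
        print("crash with LANG =", repr(env["LANG"]))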

An effective fuzzer generates semi-valid inputs that are "valid enough" so that they are not directly rejected by the parser and "invalid enough" so that they might stress corner cases and exercise interesting program behaviours.

For instance, if the input can be modelled as an abstract syntax tree, then a smart mutation-based fuzzer[33] would employ random transformations to move complete subtrees from one node to another.
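As an illustration of such a structure-aware mutation, the following Python sketch copies one complete expression subtree of a program's abstract syntax tree onto another node, so that the mutant remains syntactically valid. The seed program and the use of Python's ast module are illustrative assumptions, not a description of any particular fuzzer.

import ast
import copy
import random

class SubtreeGraft(ast.NodeTransformer):
    # Replace one chosen expression node with a copy of another subtree.
    def __init__(self, target: ast.expr, donor: ast.expr):
        self.target, self.donor = target, donor

    def visit(self, node):
        if node is self.target:
            return copy.deepcopy(self.donor)         # graft the donor subtree here
        return self.generic_visit(node)

def mutate(source: str) -> str:
    tree = ast.parse(source)
    exprs = [n for n in ast.walk(tree) if isinstance(n, ast.expr)]
    if len(exprs) < 2:
        return source
    target, donor = random.sample(exprs, 2)          # pick the subtree to move and where to put it
    tree = SubtreeGraft(target, donor).visit(tree)
    return ast.unparse(ast.fix_missing_locations(tree))

seed = "print(len('abc') + 2 * max(1, 7))"
print(mutate(seed))                                  # e.g. print(len('abc') + 2 * len('abc'))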

However, a dumb fuzzer might generate a lower proportion of valid inputs and stress the parser code rather than the main components of a program.

The disadvantage of dumb fuzzers can be illustrated with the construction of a valid checksum for a cyclic redundancy check (CRC).

A CRC is an error-detecting code that ensures that the integrity of the data contained in the input file is preserved during transmission.

A dumb mutation-based fuzzer that modifies the protected data is therefore almost certain to invalidate the recorded checksum, so the mutated file is rejected by the parser before it reaches the deeper components of the program. However, there are attempts to identify and re-compute a potential checksum in the mutated input once the fuzzer has modified the protected data.
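A sketch of such a checksum fix-up, assuming a hypothetical format in which the last four bytes hold a little-endian CRC-32 of the payload:

import random
import struct
import zlib

def dumb_mutate(payload: bytes, flips: int = 8) -> bytes:
    # Dumb mutation: flip a few random bits in the protected data.
    data = bytearray(payload)
    for _ in range(flips):
        data[random.randrange(len(data))] ^= 1 << random.randrange(8)
    return bytes(data)

def fix_checksum(payload: bytes) -> bytes:
    # Re-compute the CRC-32 and append it so the parser's integrity check still passes.
    return payload + struct.pack("<I", zlib.crc32(payload) & 0xFFFFFFFF)

seed = b"HDR\x01hello world"                 # hypothetical protected data
fuzzed_input = fix_checksum(dumb_mutate(seed))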

Because a black-box fuzzer is unaware of the internal program structure and requires no program analysis or instrumentation, it can execute several hundred inputs per second, can be easily parallelized, and can scale to programs of arbitrary size.
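A minimal black-box fuzzer can be little more than a loop that runs the target binary on random bytes and checks how the process exits. In the Python sketch below, the ./parse_png path is a hypothetical target, and a crash is detected on POSIX systems by a negative return code (termination by a signal).

import random
import subprocess

TARGET = "./parse_png"                   # hypothetical target binary that reads stdin

def random_input(max_len: int = 1024) -> bytes:
    return bytes(random.randrange(256) for _ in range(random.randrange(max_len)))

crashes = 0
for i in range(1000):
    data = random_input()
    try:
        proc = subprocess.run([TARGET], input=data, capture_output=True, timeout=5)
    except subprocess.TimeoutExpired:
        continue                         # hangs are tracked separately from crashes
    if proc.returncode < 0:              # on POSIX, -N means the process died from signal N
        crashes += 1
        with open(f"crash_{i:04d}.bin", "wb") as f:
            f.write(data)                # keep the crashing input for later triage

print(f"{crashes} crashes out of 1000 runs")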

For instance, LearnLib employs active learning to generate an automaton that represents the behavior of a web application.

For instance, AFL and libFuzzer utilize lightweight instrumentation to trace basic block transitions exercised by an input.
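The effect of such coverage feedback can be imitated in pure Python: below, sys.settrace stands in for the compile-time instrumentation used by AFL and libFuzzer, recording line-to-line transitions as a crude analogue of basic block transitions, and only mutants that reach new transitions are added to the seed corpus. The parse target is a hypothetical example.

import random
import sys

def parse(data: bytes) -> None:
    # Hypothetical target whose deeper branch is only reached by specific inputs.
    if data[:4] == b"FUZZ":
        if len(data) > 8 and data[8] == 0xFF:
            raise RuntimeError("bug reached")

def trace_transitions(func, data):
    # Record (previous line, current line) pairs while func runs -- a crude transition map.
    transitions, prev = set(), [None]
    def tracer(frame, event, arg):
        if event == "line":
            transitions.add((prev[0], frame.f_lineno))
            prev[0] = frame.f_lineno
        return tracer
    sys.settrace(tracer)
    try:
        func(data)
    except Exception:
        pass
    finally:
        sys.settrace(None)
    return transitions

def mutate(data: bytes) -> bytes:
    data = bytearray(data or b"\x00")
    data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

seeds, seen = [b"FUZZ\x00\x00\x00\x00\x00"], set()
for _ in range(5000):
    candidate = mutate(random.choice(seeds))
    new = trace_transitions(parse, candidate) - seen
    if new:                              # coverage feedback: keep inputs that exercise new transitions
        seen |= new
        seeds.append(candidate)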

In order to expose bugs, a fuzzer must be able to distinguish expected (normal) from unexpected (buggy) program behavior.

In the absence of specifications, a fuzzer typically relies on a simple and objective measure, distinguishing between crashing and non-crashing inputs.[45][46]

Fuzzing in combination with dynamic program analysis can be used to try to generate an input that actually witnesses the reported problem.[54]

The Microsoft Security Engineering Center (MSEC) developed the "!exploitable" tool, which first creates a hash for a crashing input to determine its uniqueness and then assigns an exploitability rating.[55] Previously unreported, triaged bugs might be automatically reported to a bug tracking system.
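The general idea behind such hash-based triage (not the actual !exploitable implementation) can be sketched as follows: the innermost frames of the crash's call stack are hashed, and crashes that share the same hash are treated as duplicates of one bug.

import hashlib
import traceback

def crash_bucket(exc: BaseException, top_frames: int = 3) -> str:
    # Derive a stable hash from the innermost stack frames of a crash.
    frames = traceback.extract_tb(exc.__traceback__)[-top_frames:]
    signature = "|".join(f"{f.filename}:{f.name}:{f.lineno}" for f in frames)
    return hashlib.sha1(signature.encode()).hexdigest()[:16]

seen = {}
def report(data: bytes, exc: BaseException) -> None:
    bucket = crash_bucket(exc)
    if bucket not in seen:               # only previously unreported buckets are filed
        seen[bucket] = data
        print(f"new bug bucket {bucket}: {type(exc).__name__}")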
