Driller: Augmenting Fuzzing Through Selective Symbolic Execution

Driller: Augmenting Fuzzing Through Selective Symbolic Execution
1 · 2 · 3 · 4 · 5 · 6 · 7 — 14 · 15

fuzzing concolic.

It was built for DARPA Cyber Grand Challenge.

Idea: decompose application to different compartments; fuzz inside compartment, use concolic to walk between compartments.

1 Driller is for vulenrability excavation using fuzzing and selective concolic (to go deeper). Concolic is used for conditions that the fuzzer cannot satisfy. They divide analysis approaches into 3 categories: static, dynamic (they mean single run, actually), and concolic. 2 Concolic usually fix the depth but fuzzers usually fail to pass through input-processing code. They propose general (unlimited set of values) and specific (limited) inputs. Application's checks for paticular input values may split execution into compartments (like reading commands from input). Fuzzing is useful inside compartment, not between them. 3 Dowser uses static analysis to identity regions and detect input bytes for defects in this regions using taint tracking. But it requires test cases to reach this regions. Almost same is BuzzFuzz. Flayer allows auditor to skip complex checks. They said blind using of concolic on inputs generated by fuzzing does not work without insight about different compartments of application (because inside compartment concolic is redundant). See Veritesting and Firmalice.

4 Driller assumes that application is dividable into compartments. Corner example is magic number checking. Fuzzing just means minimal instrumentation. 5 They use AFL as a fuzzer. 6 Stuck is when fuzzer has gone through a predetermined amount (proportional to the input length) of mutations without identifying new states. They use angr as a concolic execution engine. It translates binary code into Valgrind's VEX. They use approach popularized by Mayhem: memory model can store both concrete and symbolic values; read addresses may be symbolic, but write address are always concretized. Very importaint!

7 When a crashing input is found by fuzzer, concolic engine “re-randomizes” it to recover the parts are dependent on randomness and other environmental factors.

14 They said their representation of state is imperfect. As I understand, if may be not interesting for inversion when Driller needs to invert it second time. 15 When program use input as generic and specific simultaneously: check magic number and check it's hash. Concolic gives magic number but it does not say about dependency between this number and it's hash. So, fuzzer mutates this number and never found proper hash. Pretty interesting! They want to generate “semi-symbolic” fuzzing input (set of constraints on input) or create generational fuzzing (produce not just inputs but input generators).

Sergey Vartanov, 2007–2020