Dynamic symbolic execution module.
Offers a way to have a symbolic execution along a concrete one.
Basically, this is done through DSEEngine class, with scheme:
dse = DSEEngine(Machine("x86_32"))
dse.attach(jitter)
The DSE state can be updated through:
- .update_state_from_concrete: update the values from the CPU, so the symbolic
execution will be completely concrete from this point (until changes)
- .update_state: inject information, for instance RAX = symbolic_RAX
- .symbolize_memory: symbolize (using .memory_to_expr) memory areas (ie,
reading from an address in one of these areas yield a symbol)
The DSE run can be instrumented through:
- .add_handler: register an handler, modifying the state instead of the current
execution. Can be used for stubbing external API
- .add_lib_handler: register handlers for libraries
- .add_instrumentation: register an handler, modifying the state but continuing
the current execution. Can be used for logging facilities
On branch, if the decision is symbolic, one can also collect "path constraints"
and inverse them to produce new inputs potentially reaching new paths.
Basically, this is done through DSEPathConstraint. In order to produce a new
solution, one can extend this class, and override 'handle_solution' to produce a
solution which fit its needs. It could avoid computing new solution by
overriding 'produce_solution'.
If one is only interested in constraints associated to its path, the option
"produce_solution" should be set to False, to speed up emulation.
The constraints are accumulated in the .z3_cur z3.Solver object.
Here are a few remainings TODO:
- handle endianness in check_state / atomic read: currently, but this is also
true for others Miasm2 symbolic engines, the endianness is not take in
account, and assumed to be Little Endian
- too many memory dependencies in constraint tracking: in order to let z3 find
new solution, it does need information on memory values (for instance, a
lookup in a table with a symbolic index). The estimated possible involved
memory location could be too large to pass to the solver (threshold named
MAX_MEMORY_INJECT). One possible solution, not yet implemented, is to call
the solver for reducing the possible values thanks to its accumulated
constraints.