miasm
Reverse engineering framework
Classes
class _MetaMemStruct
class _MetaMemType
class Array
class BitField
class Bits
class MemArray
class MemBitField
class MemPtr
class MemSelf
class MemSizedArray
class MemStr
class MemStruct
class MemType
class MemUnion
class MemValue
class MemVoid
class Num
class Ptr
class RawStruct
class Self
class Str
class Struct
class Type
class Union
class Void
Functions
def set_allocator(alloc_func)
def to_type(obj)
def indent(s, size=4)
def get_str(vm, addr, enc, max_char=None, end=u'\x00')
def raw_str(s, enc, end=u'\x00')
def set_str(vm, addr, s, enc, end=u'\x00')
def raw_len(py_unic_str, enc, end=u'\x00')
def enc_triplet(enc, max_char=None, end=u'\x00')
Variables
log = logging.getLogger(__name__)
console_handler = logging.StreamHandler()
dictionary DYN_MEM_STRUCT_CACHE = {}
SELF_TYPE_INSTANCE = Self()
VOID_TYPE_INSTANCE = Void()
This module provides classes to manipulate pure C types as well as their representation in memory. A typical use case is to easily manipulate structures backed by a VmMngr object (a miasm sandbox virtual memory):

    class ListNode(MemStruct):
        fields = [
            ("next", Ptr("<I", Self())),
            ("data", Ptr("<I", Void())),
        ]

    class LinkedList(MemStruct):
        fields = [
            ("head", Ptr("<I", ListNode)),
            ("tail", Ptr("<I", ListNode)),
            ("size", Num("<I")),
        ]

    link = LinkedList(vm, addr1)
    link.memset()
    node = ListNode(vm, addr2)
    node.memset()
    link.head = node.get_addr()
    link.tail = node.get_addr()
    link.size += 1
    assert link.head.deref == node
    data = Num("<I").lval(vm, addr3)
    data.val = 5
    node.data = data.get_addr()
    # see examples/jitter/types.py for more info

It provides two families of classes, Type-s (Num, Ptr, Str...) and their associated MemType-s. A Type subclass instance represents a fully defined C type. A MemType subclass instance represents a C LValue (or variable): it is a type attached to the memory.

Available types are:

    - Num: for number (float or int) handling
    - Ptr: a pointer to another Type
    - Struct: equivalent to a C struct definition
    - Union: similar to union in C, list of Types at the same offset in a structure; the union has the size of the biggest Type (~ Struct with all the fields at offset 0)
    - Array: an array of items of the same type; can have a fixed size or not (e.g. char[3] vs char* used as an array in C)
    - BitField: similar to C bitfields, a list of [(<field_name>, <number_of_bits>),]; creates fields that correspond to certain bits of the field; analogous to a Union of Bits (see Bits below)
    - Str: a character string, with an encoding; not directly mapped to a C type, it is a higher level notion provided for ease of use
    - Void: analogous to C void, can be a placeholder in void*-style cases.
    - Self: special marker to reference a Struct inside itself (FIXME: to remove?)
And some less common types:

    - Bits: mask only some bits of a Num
    - RawStruct: abstraction over a simple struct pack/unpack (no mapping to a standard C type)

For each type, the `.lval` property returns a MemType subclass that gives access to the field in memory.

The easiest way to use the API to declare and manipulate new structures is to subclass MemStruct and define a list of (<field_name>, <field_definition>):

    class MyStruct(MemStruct):
        fields = [
            # Scalar field: just struct.pack field with one value
            ("num", Num("I")),
            ("flags", Num("B")),
            # Ptr fields contain two fields: "val", for the numerical value,
            # and "deref" to get the pointed object
            ("other", Ptr("I", OtherStruct)),
            # Ptr to a variable length String
            ("s", Ptr("I", Str())),
            ("i", Ptr("I", Num("I"))),
        ]

And access the fields:

    mstruct = MyStruct(jitter.vm, addr)
    mstruct.num = 3
    assert mstruct.num == 3
    mstruct.other.val = addr2
    # Also works:
    mstruct.other = addr2
    mstruct.other.deref = OtherStruct(jitter.vm, addr)

MemUnion and MemBitField can also be subclassed, the `fields` field being in the format expected by, respectively, Union and BitField.

The `addr` argument can be omitted if an allocator is set, in which case the structure will be automatically allocated in memory:

    my_heap = miasm.os_dep.common.heap()
    # the allocator is a func(VmMngr) -> integer_address
    set_allocator(my_heap)

Note that some structures (e.g. MemStr or MemArray) do not have a static size and cannot be allocated automatically.
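The format strings passed to Num and Ptr above ("I", "<I", "B") are described as plain struct.pack fields, so the standard `struct` module can be used to reason about field sizes and offsets. The following is a minimal sketch, not miasm code: the helper `field_offsets` is hypothetical, and it assumes that fields are laid out back to back with no padding.

```python
import struct

# Hypothetical description of the LinkedList example: three little-endian
# 32-bit unsigned fields. Assumption: fields are packed consecutively.
FIELD_FORMATS = [("head", "<I"), ("tail", "<I"), ("size", "<I")]


def field_offsets(fields):
    """Return {name: (offset, size)} for consecutively packed fields."""
    offsets = {}
    cursor = 0
    for name, fmt in fields:
        size = struct.calcsize(fmt)
        offsets[name] = (cursor, size)
        cursor += size
    return offsets


layout = field_offsets(FIELD_FORMATS)

# Raw bytes as they could appear in memory for head=0x1000, tail=0x1000, size=1.
raw = struct.pack("<III", 0x1000, 0x1000, 1)

# Read the "size" field back from its computed offset.
offset, size = layout["size"]
size_value = struct.unpack("<I", raw[offset:offset + size])[0]
```

Under these assumptions, `layout["size"]` is `(8, 4)` and `size_value` is `1`.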
def miasm.core.types.enc_triplet(enc, max_char=None, end=u'\x00')
Returns a triplet of functions (get_str_enc, set_str_enc, raw_len_enc) for a given encoding (as needed by Str to add an encoding). The prototypes are:

    - get_str_enc: same as get_str without the @enc argument
    - set_str_enc: same as set_str without the @enc argument
    - raw_len_enc: same as raw_len without the @enc argument
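The idea of returning functions with the @enc argument already filled in can be illustrated with a closure. This is only a sketch of the pattern, not the miasm implementation: `raw_len_sketch` and `bind_encoding` are hypothetical names, and only the raw_len-style member of the triplet is shown since it needs no virtual memory.

```python
def raw_len_sketch(py_unic_str, enc, end=u"\x00"):
    """Stand-in for raw_len: byte length of the string plus its terminator."""
    return len((py_unic_str + end).encode(enc))


def bind_encoding(func, enc):
    """Return a version of func(s, enc, end) that no longer takes enc."""
    def bound(s, end=u"\x00"):
        return func(s, enc, end=end)
    return bound


# Same prototype as raw_len_sketch, minus the encoding argument.
raw_len_utf16 = bind_encoding(raw_len_sketch, "utf-16le")
raw_len_ascii = bind_encoding(raw_len_sketch, "ascii")
```

Here `raw_len_utf16(u"foo")` gives 8 (four 2-byte code units, terminator included) and `raw_len_ascii(u"foo")` gives 4.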
def miasm.core.types.get_str(vm, addr, enc, max_char=None, end=u'\x00')
Get a @end (by default '\x00') terminated @enc encoded string from a VmMngr.

For example:

    - get_str(vm, addr, "ascii") will read "foo\x00" in memory and return u"foo"
    - get_str(vm, addr, "utf-16le") will read "f\x00o\x00o\x00\x00\x00" in memory and return u"foo" as well.

Setting @max_char=<n> and @end='' allows reading strings that are not null terminated from memory.

@vm: VmMngr instance
@addr: the address at which to read the string
@enc: the encoding of the string to read.
@max_char: max number of bytes to get in memory
@end: the unencoded ending sequence of the string, by default "\x00". Unencoded here means that the actual ending sequence that this function will look for is end.encode(enc), not directly @end.
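The behaviour described above can be sketched without miasm installed. Everything below is an assumption made for illustration: `FakeVm` is a hypothetical stand-in exposing only a `get_mem(addr, size)` read, and `read_terminated` is one possible reading loop, not the library's code. In particular, handling an empty @end by reading exactly @max_char bytes is a choice made for this sketch.

```python
class FakeVm:
    """Hypothetical stand-in for a VmMngr: a flat buffer mapped at address 0."""

    def __init__(self, data):
        self._data = bytes(data)

    def get_mem(self, addr, size):
        return self._data[addr:addr + size]


def read_terminated(vm, addr, enc, max_char=None, end=u"\x00"):
    """Read an @end terminated, @enc encoded string starting at @addr."""
    end_raw = end.encode(enc)  # the terminator is searched in encoded form
    if not end_raw:
        # No terminator: an explicit byte bound is required (sketch's choice).
        if max_char is None:
            raise ValueError("max_char is required when end is empty")
        return vm.get_mem(addr, max_char).decode(enc)

    step = len(end_raw)
    chunks = []
    offset = 0
    while max_char is None or offset < max_char:
        chunk = vm.get_mem(addr + offset, step)
        if chunk == end_raw or len(chunk) < step:
            break  # terminator found, or ran off the end of the fake memory
        chunks.append(chunk)
        offset += step
    return b"".join(chunks).decode(enc)


# "foo\x00" in ASCII at address 0, then the same string in UTF-16LE at address 4.
vm = FakeVm(b"foo\x00" + u"foo\x00".encode("utf-16le"))
ascii_value = read_terminated(vm, 0, "ascii")
utf16_value = read_terminated(vm, 4, "utf-16le")
bounded_value = read_terminated(vm, 0, "ascii", max_char=2, end=u"")
```

Both terminated reads return u"foo"; the bounded read with an empty terminator returns u"fo".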
def miasm.core.types.indent(s, size=4)
Indent a string with @size spaces
def miasm.core.types.raw_len(py_unic_str, enc, end=u'\x00')
Returns the length in bytes of @py_unic_str in memory (once @end has been added and the full str has been encoded). It returns exactly the room necessary to call set_str with similar arguments.

@py_unic_str: the unicode str to work with
@enc: the encoding to encode @py_unic_str to
@end: the ending string/character to append to the string _before encoding_ (by default \x00)
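Because @end is appended before encoding, the terminator itself takes as many bytes as the encoding requires. The sketch below reproduces that arithmetic with plain `str.encode`; `encoded_room` is a hypothetical helper written for this example, not the library function.

```python
def encoded_room(py_unic_str, enc, end=u"\x00"):
    """Bytes needed to store the string once @end is appended and encoded."""
    return len((py_unic_str + end).encode(enc))


room_ascii = encoded_room(u"foo", "ascii")         # 3 chars + 1-byte terminator
room_utf16 = encoded_room(u"foo", "utf-16le")      # 3 chars + 2-byte terminator
room_utf8 = encoded_room(u"\u00e9", "utf-8")       # 2-byte char + 1-byte terminator
room_no_end = encoded_room(u"foo", "ascii", end=u"")
```

The four values are 4, 8, 3 and 3 respectively, which shows why the length must be computed on the encoded form rather than on the number of characters.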
def miasm.core.types.raw_str(s, enc, end=u'\x00')
Returns a string representing @s as an @end (by default \x00) terminated @enc encoded string.

@s: the unicode str to serialize
@enc: the encoding to apply to @s and @end before serialization.
@end: the ending string/character to append to the string _before encoding_ and serialization (by default '\x00')
def miasm.core.types.set_allocator(alloc_func)
Shorthand to set the default allocator of MemType. See MemType.set_allocator doc for more information.
def miasm.core.types.set_str(vm, addr, s, enc, end=u'\x00')
Encode a string to an @end (by default \x00) terminated @enc encoded string and set it in a VmMngr memory.

@vm: VmMngr instance
@addr: start address to serialize the string to
@s: the unicode str to serialize
@enc: the encoding to apply to @s and @end before serialization.
@end: the ending string/character to append to the string _before encoding_ and serialization (by default '\x00')
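The write path can be sketched in the same spirit. `WritableFakeVm` and `write_terminated` are hypothetical names used only here; the sketch assumes a memory object with a `set_mem(addr, data)` write, and the returned byte count is an addition for the example rather than documented behaviour.

```python
class WritableFakeVm:
    """Hypothetical stand-in for a VmMngr: zero-filled buffer mapped at address 0."""

    def __init__(self, size):
        self._mem = bytearray(size)

    def set_mem(self, addr, data):
        self._mem[addr:addr + len(data)] = data

    def get_mem(self, addr, size):
        return bytes(self._mem[addr:addr + size])


def write_terminated(vm, addr, s, enc, end=u"\x00"):
    """Append @end to @s, encode the result and store it at @addr."""
    raw = (s + end).encode(enc)
    vm.set_mem(addr, raw)
    return len(raw)  # sketch-only convenience: number of bytes written


vm = WritableFakeVm(16)
written = write_terminated(vm, 2, u"foo", "utf-16le")
stored = vm.get_mem(2, written)
```

After the call, `written` is 8 and `stored` holds the UTF-16LE bytes of "foo" followed by a 2-byte terminator, while the bytes before address 2 are left untouched.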
def miasm.core.types.to_type(obj)
If possible, return the Type associated with @obj; otherwise, raise a ValueError. Works with a Type instance (returns obj) or a MemType subclass or instance (returns obj.get_type()).
miasm.core.types.console_handler = logging.StreamHandler()
dictionary miasm.core.types.DYN_MEM_STRUCT_CACHE = {}
miasm.core.types.log = logging.getLogger(__name__)
miasm.core.types.SELF_TYPE_INSTANCE = Self()
miasm.core.types.VOID_TYPE_INSTANCE = Void()