miasm
Reverse engineering framework
miasm.core.types Namespace Reference

Classes

class  _MetaMemStruct
 
class  _MetaMemType
 
class  Array
 
class  BitField
 
class  Bits
 
class  MemArray
 
class  MemBitField
 
class  MemPtr
 
class  MemSelf
 
class  MemSizedArray
 
class  MemStr
 
class  MemStruct
 
class  MemType
 
class  MemUnion
 
class  MemValue
 
class  MemVoid
 
class  Num
 
class  Ptr
 
class  RawStruct
 
class  Self
 
class  Str
 
class  Struct
 
class  Type
 
class  Union
 
class  Void
 

Functions

def set_allocator (alloc_func)
 
def to_type (obj)
 
def indent (s, size=4)
 
def get_str (vm, addr, enc, max_char=None, end=u'\x00')
 
def raw_str (s, enc, end=u'\x00')
 
def set_str (vm, addr, s, enc, end=u'\x00')
 
def raw_len (py_unic_str, enc, end=u'\x00')
 
def enc_triplet (enc, max_char=None, end=u'\x00')
 

Variables

 log = logging.getLogger(__name__)
 
 console_handler = logging.StreamHandler()
 
dictionary DYN_MEM_STRUCT_CACHE = {}
 
 SELF_TYPE_INSTANCE = Self()
 
 VOID_TYPE_INSTANCE = Void()
 

Detailed Description

This module provides classes to manipulate pure C types as well as their
representation in memory. A typical use case is to manipulate structures
backed by a VmMngr object (a miasm sandbox virtual memory):

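    from miasm.core.types import MemStruct, Num, Ptr, Self, Void

    # vm is a VmMngr instance; addr1, addr2 and addr3 are mapped addresses
    class ListNode(MemStruct):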
        fields = [
            ("next", Ptr("<I", Self())),
            ("data", Ptr("<I", Void())),
        ]

    class LinkedList(MemStruct):
        fields = [
            ("head", Ptr("<I", ListNode)),
            ("tail", Ptr("<I", ListNode)),
            ("size", Num("<I")),
        ]

    link = LinkedList(vm, addr1)
    link.memset()
    node = ListNode(vm, addr2)
    node.memset()
    link.head = node.get_addr()
    link.tail = node.get_addr()
    link.size += 1
    assert link.head.deref == node
    data = Num("<I").lval(vm, addr3)
    data.val = 5
    node.data = data.get_addr()
    # see examples/jitter/types.py for more info


It provides two families of classes, Type-s (Num, Ptr, Str...) and their
associated MemType-s. A Type subclass instance represents a fully defined C
type. A MemType subclass instance represents a C LValue (or variable): it is
a type attached to the memory. Available types are:

    - Num: for number (float or int) handling
    - Ptr: a pointer to another Type
    - Struct: equivalent to a C struct definition
    - Union: similar to union in C, list of Types at the same offset in a
      structure; the union has the size of the biggest Type (~ Struct with all
      the fields at offset 0)
    - Array: an array of items of the same type; can have a fixed size or
      not (e.g. char[3] vs char* used as an array in C)
    - BitField: similar to C bitfields, a list of
      [(<field_name>, <number_of_bits>),]; creates fields that correspond to
      certain bits of the field; analogous to a Union of Bits (see Bits below)
    - Str: a character string, with an encoding; not directly mapped to a C
      type, it is a higher level notion provided for ease of use
    - Void: analogous to C void, can be a placeholder in void*-style cases.
    - Self: special marker to reference a Struct inside itself (FIXME: to
      remove?)
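
The Union description above (every member at offset 0, total size equal to
the biggest member) can be sketched with the standard struct module alone.
This is not miasm code: FIELDS, union_size and read_union are hypothetical
names chosen for the illustration, and the buffer is a plain bytes object
rather than a VmMngr.

```python
import struct

# Hypothetical member list: (name, struct format) pairs, all little-endian.
FIELDS = [("as_byte", "<B"), ("as_short", "<H"), ("as_int", "<I")]


def union_size(fields):
    """A union is as large as its biggest member."""
    return max(struct.calcsize(fmt) for _, fmt in fields)


def read_union(buf, fields, offset=0):
    """Decode every member from the same offset of buf."""
    return {
        name: struct.unpack_from(fmt, buf, offset)[0]
        for name, fmt in fields
    }


# One 32-bit value, viewed through the three members of the union.
buf = struct.pack("<I", 0x11223344)
views = read_union(buf, FIELDS)
```

Since all members share offset 0, writing through one member changes what
the others read back, which is the behaviour the "Struct with all the fields
at offset 0" analogy refers to.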

And some less common types:

    - Bits: mask only some bits of a Num
    - RawStruct: abstraction over a simple struct pack/unpack (no mapping to a
      standard C type)
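
Bits and BitField both come down to shifting and masking an underlying
number. The sketch below shows that arithmetic in plain Python. It is an
illustration only: get_bits, set_bits, unpack_bitfield and LAYOUT are
hypothetical names, and placing the first field in the lowest bits is an
assumption, not something this page states.

```python
def get_bits(value, bit_offset, nbits):
    """Extract nbits bits of value, starting at bit_offset."""
    mask = (1 << nbits) - 1
    return (value >> bit_offset) & mask


def set_bits(value, bit_offset, nbits, field):
    """Return value with the selected bits replaced by field."""
    mask = ((1 << nbits) - 1) << bit_offset
    return (value & ~mask) | ((field << bit_offset) & mask)


def unpack_bitfield(value, layout):
    """Split value according to a [(field_name, number_of_bits), ...] list.

    Assumption: the first field occupies the least significant bits.
    """
    fields, offset = {}, 0
    for name, nbits in layout:
        fields[name] = get_bits(value, offset, nbits)
        offset += nbits
    return fields


# Hypothetical 8-bit layout, in the (name, number_of_bits) format.
LAYOUT = [("flag", 1), ("index", 5), ("mode", 2)]
fields = unpack_bitfield(0b10101011, LAYOUT)
```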

For each type, the `.lval` property returns a MemType subclass that gives
access to the field in memory.
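
To make the Type / MemType split concrete, here is a minimal stand-alone
sketch of the idea: a type object that only describes a layout, and an lvalue
object that binds this type to a buffer and an address. NumSketch and
MemValueSketch are hypothetical stand-ins written for this illustration, the
memory is a bytearray instead of a VmMngr, and none of this is the actual
miasm implementation.

```python
import struct
from functools import partial


class MemValueSketch(object):
    """An lvalue: a type attached to a buffer at a given address."""

    def __init__(self, typ, mem, addr):
        self.typ, self.mem, self.addr = typ, mem, addr

    def get_addr(self):
        return self.addr

    @property
    def val(self):
        # Decode the value from memory on every access.
        return struct.unpack_from(self.typ.fmt, self.mem, self.addr)[0]

    @val.setter
    def val(self, value):
        # Encode the value straight into memory.
        struct.pack_into(self.typ.fmt, self.mem, self.addr, value)


class NumSketch(object):
    """A pure type: it knows its layout but not where it lives."""

    def __init__(self, fmt):
        self.fmt = fmt

    @property
    def lval(self):
        # Returns a callable that builds the lvalue once given (mem, addr).
        return partial(MemValueSketch, self)


mem = bytearray(8)
data = NumSketch("<I").lval(mem, 4)
data.val = 5
```

The last three lines mirror the `Num("<I").lval(vm, addr3)` usage shown at
the top of this page: the type is created first, then attached to a memory
location, and only the attached object can be read or written.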


The easiest way to use the API to declare and manipulate new structures is to
subclass MemStruct and define a list of (<field_name>, <field_definition>):

    class MyStruct(MemStruct):
        fields = [
            # Scalar field: just struct.pack field with one value
            ("num", Num("I")),
            ("flags", Num("B")),
            # Ptr fields contain two fields: "val", for the numerical value,
            # and "deref" to get the pointed object
            ("other", Ptr("I", OtherStruct)),
            # Ptr to a variable length String
            ("s", Ptr("I", Str())),
            ("i", Ptr("I", Num("I"))),
        ]

And access the fields:

    mstruct = MyStruct(jitter.vm, addr)
    mstruct.num = 3
    assert mstruct.num == 3
    mstruct.other.val = addr2
    # Also works:
    mstruct.other = addr2
    mstruct.other.deref = OtherStruct(jitter.vm, addr)

MemUnion and MemBitField can also be subclassed, the `fields` field being
in the format expected by, respectively, Union and BitField.

The `addr` argument can be omitted if an allocator is set, in which case the
structure will be automatically allocated in memory:

    my_heap = miasm.os_dep.common.heap()
    # the allocator is a func(VmMngr) -> integer_address
    set_allocator(my_heap)

Note that some structures (e.g. MemStr or MemArray) do not have a static
size and cannot be allocated automatically.

Function Documentation

◆ enc_triplet()

def miasm.core.types.enc_triplet (enc, max_char=None, end=u'\x00')
Returns a triplet of functions (get_str_enc, set_str_enc, raw_len_enc)
for a given encoding (as needed by Str to add an encoding). The prototypes
are:

    - get_str_enc: same as get_str without the @enc argument
    - set_str_enc: same as set_str without the @enc argument
    - raw_len_enc: same as raw_len without the @enc argument
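
The triplet described above is essentially partial application of the
encoding. The following sketch shows that pattern with functools.partial.
The three helpers are simplified stand-ins that operate on a bytearray
instead of a VmMngr and assume a non-empty @end; enc_triplet_sketch is a
hypothetical name, and this is not the miasm implementation.

```python
from functools import partial


def raw_len(py_unic_str, enc, end=u"\x00"):
    """Bytes needed to store py_unic_str + end once encoded."""
    return len((py_unic_str + end).encode(enc))


def set_str(mem, addr, s, enc, end=u"\x00"):
    """Write s + end, encoded with enc, at mem[addr:]."""
    raw = (s + end).encode(enc)
    mem[addr:addr + len(raw)] = raw


def get_str(mem, addr, enc, max_char=None, end=u"\x00"):
    """Read units of len(end.encode(enc)) bytes until the terminator."""
    end_raw = end.encode(enc)
    step = len(end_raw)
    limit = len(mem) if max_char is None else min(len(mem), addr + max_char)
    stop = addr
    while stop + step <= limit and bytes(mem[stop:stop + step]) != end_raw:
        stop += step
    return bytes(mem[addr:stop]).decode(enc)


def enc_triplet_sketch(enc, max_char=None, end=u"\x00"):
    """Return (get_str_enc, set_str_enc, raw_len_enc) with enc pre-bound."""
    return (
        partial(get_str, enc=enc, max_char=max_char, end=end),
        partial(set_str, enc=enc, end=end),
        partial(raw_len, enc=enc, end=end),
    )


get_utf16, set_utf16, len_utf16 = enc_triplet_sketch("utf-16le")
mem = bytearray(16)
set_utf16(mem, 2, u"foo")
```

Callers of the returned functions no longer pass an encoding, which is what
lets a string type carry its encoding as three ready-made accessors.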

◆ get_str()

def miasm.core.types.get_str (vm, addr, enc, max_char=None, end=u'\x00')
Get an @end (by default '\x00') terminated, @enc encoded string from a
VmMngr.

For example:
    - get_str(vm, addr, "ascii") will read "foo\x00" in memory and
      return u"foo"
    - get_str(vm, addr, "utf-16le") will read "f\x00o\x00o\x00\x00\x00"
      in memory and return u"foo" as well.

Setting @max_char=<n> and @end='' makes it possible to read strings that are
not null terminated from memory.

@vm: VmMngr instance
@addr: the address at which to read the string
@enc: the encoding of the string to read.
@max_char: max number of bytes to get in memory
@end: the unencoded ending sequence of the string, by default "\x00".
    Unencoded here means that the actual ending sequence that this function
    will look for is end.encode(enc), not directly @end.
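
The note about the encoded ending sequence matters most for multi-byte
encodings. The sketch below, which reads from a bytearray rather than a
VmMngr and is not the miasm implementation, walks the memory in units of
len(end.encode(enc)) bytes so that a terminator is only recognised on a unit
boundary. get_str_sketch is a hypothetical name; requiring @max_char when
@end is empty is a choice made for this sketch.

```python
def get_str_sketch(mem, addr, enc, max_char=None, end=u"\x00"):
    """Decode the string stored at mem[addr:], stopping at end.encode(enc)."""
    end_raw = end.encode(enc)
    if not end_raw:
        # No terminator: the caller has to bound the read with max_char.
        if max_char is None:
            raise ValueError("max_char is required when end is empty")
        return bytes(mem[addr:addr + max_char]).decode(enc)
    step = len(end_raw)
    chunks = []
    offset = 0
    while max_char is None or offset < max_char:
        chunk = bytes(mem[addr + offset:addr + offset + step])
        if chunk == end_raw or len(chunk) < step:
            break  # terminator found, or end of the buffer reached
        chunks.append(chunk)
        offset += step
    return b"".join(chunks).decode(enc)


ascii_mem = bytearray(b"foo\x00bar\x00")
# u"\u0100" encodes to b"\x00\x01" in UTF-16LE, so the buffer contains two
# consecutive zero bytes at offset 1, before the real terminator.
utf16_mem = bytearray(u"a\u0100".encode("utf-16le") + b"\x00\x00")
naive_cut = bytes(utf16_mem).find(b"\x00\x00")
```

A plain byte search for the terminator (naive_cut above) stops inside the
second character, whereas the unit-by-unit walk returns the full string.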

◆ indent()

def miasm.core.types.indent (s, size=4)
Indent a string with @size spaces

◆ raw_len()

def miasm.core.types.raw_len (py_unic_str, enc, end=u'\x00')
Returns the length in bytes of @py_unic_str in memory (once @end has been
added and the full str has been encoded). It returns exactly the room
necessary to call set_str with similar arguments.

@py_unic_str: the unicode str to work with
@enc: the encoding to encode @py_unic_str to
@end: the ending string/character to append to the string _before encoding_
    (by default '\x00')

◆ raw_str()

def miasm.core.types.raw_str (s, enc, end=u'\x00')
Returns a string representing @s as an @end (by default '\x00')
terminated @enc encoded string.

@s: the unicode str to serialize
@enc: the encoding to apply to @s and @end before serialization.
@end: the ending string/character to append to the string _before encoding_
    and serialization (by default '\x00')

◆ set_allocator()

def miasm.core.types.set_allocator (alloc_func)
Shorthand to set the default allocator of MemType. See
MemType.set_allocator doc for more information.

◆ set_str()

def miasm.core.types.set_str (vm, addr, s, enc, end=u'\x00')
Encode a string to an @end (by default '\x00') terminated @enc encoded
string and set it in a VmMngr memory.

@vm: VmMngr instance
@addr: start address to serialize the string to
@s: the unicode str to serialize
@enc: the encoding to apply to @s and @end before serialization.
@end: the ending string/character to append to the string _before encoding_
    and serialization (by default '\x00')

◆ to_type()

def miasm.core.types.to_type (obj)
If possible, returns the Type associated with @obj; otherwise raises
a ValueError.

Works with a Type instance (returns obj) or a MemType subclass or instance
(returns obj.get_type()).
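
The three accepted inputs can be pictured with the small dispatch sketch
below. TypeSketch, MemTypeSketch, MemIntSketch and to_type_sketch are
hypothetical stand-ins written for this illustration; they only reproduce
the behaviour described above and are not the miasm classes.

```python
class TypeSketch(object):
    """Stand-in for a Type: a pure C type description."""


class MemTypeSketch(object):
    """Stand-in for a MemType: a type attached to memory."""

    _type = None

    @classmethod
    def get_type(cls):
        return cls._type


def to_type_sketch(obj):
    """Return the type behind obj, or raise ValueError."""
    if isinstance(obj, TypeSketch):
        return obj
    is_mem_class = isinstance(obj, type) and issubclass(obj, MemTypeSketch)
    if is_mem_class or isinstance(obj, MemTypeSketch):
        # Works for the class and for its instances (get_type is a classmethod).
        return obj.get_type()
    raise ValueError("%r is not a Type or a MemType" % (obj,))


INT_TYPE = TypeSketch()


class MemIntSketch(MemTypeSketch):
    _type = INT_TYPE
```

Accepting all three forms lets field definitions mix type instances and
memory-backed classes, as in the `Ptr("<I", ListNode)` and
`Ptr("<I", Void())` fields shown earlier on this page.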

Variable Documentation

◆ console_handler

miasm.core.types.console_handler = logging.StreamHandler()

◆ DYN_MEM_STRUCT_CACHE

dictionary miasm.core.types.DYN_MEM_STRUCT_CACHE = {}

◆ log

miasm.core.types.log = logging.getLogger(__name__)

◆ SELF_TYPE_INSTANCE

miasm.core.types.SELF_TYPE_INSTANCE = Self()

◆ VOID_TYPE_INSTANCE

miasm.core.types.VOID_TYPE_INSTANCE = Void()