Package Libs :: Module librecognition :: Class FunctionRecognition

Class FunctionRecognition

Instance Methods

__init__(self, imm, dictionaryfiles=None)
Recognizes a function using different methods (address, signature, or heuristic).

STRING
resolvFunctionByAddress(self, address, heuristic=90)
Look up the loaded dictionaries to find a function match.

INTEGER
checkHeuristic(self, address, reference, refFirstCall=[])
Compare the function at a given address against a precomputed function hash.

compareHeuristic(self, cfg, refcfg)

makeFunctionHashHeuristic(self, address, compressed=False, followCalls=True)
Hash a function from its Control Flow Graph and its generalized instructions (access memory/write memory/use registers/use constant/call/jmp/jcc, and all their combinations).

UNSIGNED LONG
hash_a_list(self, data)
Take a list and return a binary representation of its CRC32.

LIST
searchFunctionByHeuristic(self, csvline, heuristic=90, module=None)
Search memory for a function that matches the given options.

LIST
_searchFunctionByHeuristic(self, search, functionhash=None, firstcallhash=None, exact=None, heuristic=90, module=None, firstbb=None)
Search memory for a function that matches the given options.

LIST
searchFunctionByName(self, name, heuristic=90, module=None, version=None)
Look up the loaded dictionaries to find a function match.

STRING
makeFunctionHashExact(self, address)
Return a SHA-1 hash of the function, taking the raw bytes as data.

LIST
makeFunctionHash(self, address, compressed=False)
Return a list with the best BB to use for a search and the heuristic hash of the function.

selectBasicBlock(self, address)

LIST
generalizeFunction(self, address)
Take an address and return a generalized version of the function, discarding address- and register-dependent information.

STRING
generalizeInstruction(self, inp)
Generalize an instruction given an address or an opCode instance.

DWORD|None
findBasicBlockHeuristically(self, address, firstbb, maxsteps=20)
Try to match a generalized BB with an address range (moving backward).

DWORD|None
findFirstBB(self, address, recursive=False)
Traverse a function backward, following Xrefs, until reaching a point that has no Xrefs other than CALLs.

Method Details

__init__(self, imm, dictionaryfiles=None)
(Constructor)

Recognizes a function using different methods (address, signature, or heuristic).

Parameters:
  • imm (Debugger OBJECT) - Debugger instance
  • dictionaryfiles (STRING|LIST) - Name, or list of names, of the .dat files inside the Data folder where the function patterns are stored. Use an empty string to use all .dat files in the Data folder.

resolvFunctionByAddress(self, address, heuristic=90)

Look up the loaded dictionaries to find a function match.

Parameters:
  • address (DWORD) - Address of the function to search
  • heuristic (INTEGER) - heuristic threshold: the match percentage required to consider a candidate a real function match
Returns: STRING
a STRING with the function's real name, or the given address if there is no match

checkHeuristic(self, address, reference, refFirstCall=[])

Compare the function at a given address against a precomputed function hash. Returns a match percentage, which can be checked against a threshold to decide whether it is a real match.

Parameters:
  • address (DWORD) - Address of the function to compare
  • reference (STRING) - base64 representation of the compressed information about the function
  • refFirstCall (STRING) - the same representation, but for the function targeted by the first call in the first BB (optional)
Returns: INTEGER
the match percentage between the function at address and the reference hash
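
As a rough illustration of how a match percentage and a threshold interact, the sketch below scores two lists of per-BB hash values by their shared entries. The scoring rule, the function names `match_percentage` and `is_match`, and the sample data are all assumptions made for this example; the comparison that compareHeuristic actually performs is not documented here.

```python
from collections import Counter


def match_percentage(candidate, reference):
    """Return an integer percentage of entries shared by two hash lists.

    Hypothetical scoring rule: shared entries (counted as a multiset)
    divided by the length of the longer list.
    """
    if not candidate and not reference:
        return 100
    shared = sum((Counter(candidate) & Counter(reference)).values())
    return 100 * shared // max(len(candidate), len(reference))


def is_match(candidate, reference, threshold=90):
    """Treat the pair as a real match when the score reaches the threshold."""
    return match_percentage(candidate, reference) >= threshold


reference_hashes = [0x1A, 0x2B, 0x3C, 0x4D]
candidate_hashes = [0x1A, 0x2B, 0x3C, 0x5E]
print(match_percentage(candidate_hashes, reference_hashes))
print(is_match(candidate_hashes, reference_hashes))
```

With a default threshold of 90, a candidate that shares only three of four entries would be rejected under this rule; lowering the threshold trades precision for recall.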

makeFunctionHashHeuristic(self, address, compressed=False, followCalls=True)

Build a heuristic hash of a function. The hash considers:
- the Control Flow Graph
- generalized instructions that:
    access memory/write memory/use registers/use constant/call/jmp/jcc
    and all their combinations
- the special case of functions with just 1 BB and a couple of calls (follow the first call)

Parameters:
  • address (DWORD) - address of the function to hash
  • compressed (Boolean) - return a compressed base64 representation or the raw data
  • followCalls (Boolean) - follow the first call in a single basic block function
Returns: LIST
a two-element list. The first element is the hash of the function itself; the second is the result of this same function applied to the first call of a single basic block function (if applicable). Each element is a base64 representation of the compressed version of each BB hash:
    [4 bytes BB(i) start][4 bytes BB(i) 1st edge][4 bytes BB(i) 2nd edge]
    for 0 <= i < BB count
or the same data as a LIST of raw values.
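
The 12-byte-per-BB record layout described above can be sketched as follows. This is only an illustration: the little-endian byte order, the use of zlib for compression, and the helper names `pack_bb_hashes` and `unpack_bb_hashes` are assumptions, not details confirmed by this documentation.

```python
import base64
import struct
import zlib


def pack_bb_hashes(bb_hashes):
    """Pack (start, edge1, edge2) triples of 32-bit values into base64 text.

    Assumed layout: three unsigned little-endian 32-bit fields per BB,
    concatenated, zlib-compressed, then base64-encoded.
    """
    raw = b"".join(struct.pack("<III", start, edge1, edge2)
                   for start, edge1, edge2 in bb_hashes)
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def unpack_bb_hashes(encoded):
    """Reverse pack_bb_hashes and return the list of 3-tuples."""
    raw = zlib.decompress(base64.b64decode(encoded))
    return [struct.unpack_from("<III", raw, offset)
            for offset in range(0, len(raw), 12)]


records = [(0x11111111, 0x22222222, 0), (1, 2, 3)]
encoded = pack_bb_hashes(records)
print(unpack_bb_hashes(encoded) == records)
```

Keeping each record at a fixed width means the BB count can be recovered from the decompressed length alone, without a separate header.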

hash_a_list(self, data)

Take a list and return a binary representation of its CRC32.

Parameters:
  • data (LIST) - a list of elements to make the hash
Returns: UNSIGNED LONG
a hash of the given values
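
A minimal sketch of hashing a list with CRC32, using the standard zlib module. How the real method serializes the list elements is not documented here, so joining their string forms is an assumption, and `crc32_of_list` is a hypothetical name.

```python
import zlib


def crc32_of_list(items):
    """Return the CRC32 of a list as an unsigned 32-bit integer.

    Assumption: elements are serialized by concatenating their string
    forms; a different serialization yields a different checksum.
    """
    data = "".join(str(item) for item in items).encode("utf-8")
    # The mask keeps the result unsigned on interpreters where
    # zlib.crc32 can return a signed value.
    return zlib.crc32(data) & 0xFFFFFFFF


print(hex(crc32_of_list(["MOV R32, CONST", "CALL CONST"])))
```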

searchFunctionByHeuristic(self, csvline, heuristic=90, module=None)

Search memory for a function that matches the given options.

Parameters:
  • csvline (STRING) - A line of a Data CSV file. This provides simple support for copying and pasting from a CSV file.
  • heuristic (INTEGER) - heuristic threshold: the match percentage required to consider a candidate a real function match
  • module (STRING) - name of a module to restrict the search
Returns: LIST
a list of tuples with the possible function addresses and the heuristic match percentage

_searchFunctionByHeuristic(self, search, functionhash=None, firstcallhash=None, exact=None, heuristic=90, module=None, firstbb=None)

Search memory for a function that matches the given options.

Parameters:
  • search (STRING) - searchCommand string used to make the first selection
  • functionhash (STRING) - the primary function hash (use makeFunctionHash to generate this value)
  • firstcallhash (STRING) - the hash of the first call on single BB functions (use makeFunctionHash to generate this value)
  • exact (STRING) - an exact function hash; this is a binary byte-per-byte hash (use makeFunctionHash to generate this value)
  • heuristic (INTEGER) - heuristic threshold: the match percentage required to consider a candidate a real function match
  • module (STRING) - name of a module to restrict the search
  • firstbb (STRING) - generalized assembler of the first BB (used to find the function start)
Returns: LIST
a list of tuples with the possible function addresses and the heuristic match percentage

searchFunctionByName(self, name, heuristic=90, module=None, version=None)

Look up the loaded dictionaries to find a function match.

Parameters:
  • name (STRING) - Name of the function to search
  • module (STRING) - name of a module to restrict the search
  • version (STRING) - restrict the search to the given version
  • heuristic (INTEGER) - heuristic threshold: the match percentage required to consider a candidate a real function match
Returns: LIST
a list of tuples with the possible function addresses and the heuristic match percentage

makeFunctionHashExact(self, address)

Return a SHA-1 hash of the function, taking the raw bytes as data.

Parameters:
  • address (DWORD) - address of the function to hash
Returns: STRING
SHA-1 hash of the function
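
An exact hash of this kind can be sketched with the standard hashlib module. Reading the function's raw bytes from the debugger is out of scope here, so the bytes are passed in directly; whether the real method returns a hex string or the raw digest is not stated, and the hex form and the name `sha1_of_function_bytes` are assumptions.

```python
import hashlib


def sha1_of_function_bytes(raw_bytes):
    """Return the SHA-1 of a function's raw bytes as a hex string."""
    return hashlib.sha1(raw_bytes).hexdigest()


# Illustrative prologue/epilogue bytes: push ebp; mov ebp, esp; pop ebp; ret
sample = bytes([0x55, 0x8B, 0xEC, 0x5D, 0xC3])
print(sha1_of_function_bytes(sample))
```

Because every byte contributes to the digest, a single relocated address or changed register produces a different hash, which is why the exact hash complements rather than replaces the heuristic one.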

makeFunctionHash(self, address, compressed=False)

Return a list with the best BB to use for a search and the heuristic hash of the function. These two components make up the function hash.

Parameters:
  • address (DWORD) - address of the function to hash
  • compressed (Boolean) - return a compressed base64 representation or the raw data
Returns: LIST
  • 1st element: the generalized instructions to use with searchCommand
  • 2nd element: the heuristic function hash (makeFunctionHashHeuristic)
  • 3rd element: an exact hash of the function (makeFunctionHashExact)
  • 4th element: a LIST of generalized instructions of the first BB (used to find the function start)

generalizeFunction(self, address)

Take an address and return a generalized version of the function, discarding address- and register-dependent information.

Parameters:
  • address (DWORD) - address of the function start
Returns: LIST
  • 1st value: a DICTIONARY holding the Control Flow Graph of the BB connections (each BB has an arbitrary ID)
  • 2nd value: a DICTIONARY that uses this arbitrary BB ID as the key and a LIST of generalized instructions suitable for searchCommand as the value

generalizeInstruction(self, inp)

Generalize an instruction given an address or an opCode instance

Parameters:
  • inp (DWORD|OpCode OBJECT) - address to generalize or opcode to generalize
Returns: STRING
a generalized assembler instruction
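
To make the idea of generalization concrete, the sketch below rewrites a textual instruction so that specific registers and constants no longer matter. It works on plain strings rather than on an address or opCode instance, and the placeholder tokens `R32` and `CONST`, the register list, and the function name `generalize` are assumptions chosen for illustration.

```python
import re

# Hypothetical rules: 32-bit general-purpose registers and numeric literals.
_REG32 = re.compile(r"\b(?:EAX|EBX|ECX|EDX|ESI|EDI|EBP|ESP)\b")
_CONST = re.compile(r"\b(?:0X[0-9A-F]+|[0-9][0-9A-F]*)\b")


def generalize(asm):
    """Replace concrete registers and constants with placeholder tokens."""
    text = asm.upper()
    # Constants first, so the digits inside the R32 token are never touched.
    text = _CONST.sub("CONST", text)
    return _REG32.sub("R32", text)


print(generalize("mov eax, dword ptr [ebp+8]"))
print(generalize("xor ecx, ecx") == generalize("xor edx, edx"))
```

Two builds of the same function that differ only in register allocation or stack offsets then produce identical generalized text, which is what makes the result usable as a search pattern.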

findBasicBlockHeuristically(self, address, firstbb, maxsteps=20)

Try to match a generalized BB with an address range (moving backward).

Parameters:
  • address (DWORD) - address used to match with the generalized BB
  • firstbb (LIST) - a list of generalized assembler instructions
  • maxsteps (INTEGER) - max amount of steps to go backward looking for a BB
Returns: DWORD|None
starting address of the BB that matches the generalized version, or None if no match is found
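
The backward search can be pictured with the following sketch, which operates on an already-disassembled, already-generalized instruction list instead of live memory. The data layout, the meaning given to a "step" (one instruction back), and the name `find_block_start` are assumptions for this example only.

```python
def find_block_start(instructions, index, firstbb, maxsteps=20):
    """Step backward from instructions[index] looking for firstbb.

    instructions: list of (address, generalized_text) in address order.
    firstbb: list of generalized instruction strings to match.
    Returns the address where the match starts, or None.
    """
    width = len(firstbb)
    for step in range(maxsteps + 1):
        start = index - step
        if start < 0:
            break
        window = [text for _, text in instructions[start:start + width]]
        if window == firstbb:
            return instructions[start][0]
    return None


listing = [
    (0x1000, "PUSH R32"),
    (0x1001, "MOV R32, R32"),
    (0x1003, "SUB R32, CONST"),
    (0x1006, "CALL CONST"),
]
print(find_block_start(listing, 3, ["PUSH R32", "MOV R32, R32"]))
```

Bounding the walk with maxsteps keeps a failed lookup cheap when the address does not belong to the function being searched for.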

findFirstBB(self, address, recursive=False)

Traverse a function backward, following Xrefs, until reaching a point that has no Xrefs other than CALLs.

Parameters:
  • address (DWORD) - address used to find the first BB
Returns: DWORD|None
Address of the first BB of the function, or None if it is not found
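
The traversal idea can be sketched over a precomputed cross-reference table. The table format, the rule of following the lowest-addressed predecessor, and the name `find_first_block` are assumptions for illustration; the real method queries the debugger for Xrefs and may choose predecessors differently.

```python
def find_first_block(start, xrefs_to):
    """Walk backward through non-CALL references until none remain.

    xrefs_to: dict mapping a block address to a list of
    (source_block_address, kind) tuples, with kind such as
    "jump", "flow", or "call".
    Returns the entry block address, or None if the walk loops.
    """
    current = start
    seen = set()
    while current not in seen:
        seen.add(current)
        # CALL references come from other functions, so they are ignored.
        preds = [src for src, kind in xrefs_to.get(current, [])
                 if kind != "call"]
        if not preds:
            return current
        # Hypothetical policy: prefer the lowest-addressed predecessor.
        current = min(preds)
    return None


xrefs = {
    0x30: [(0x20, "jump")],
    0x20: [(0x10, "flow"), (0x30, "jump")],
    0x10: [(0x500, "call")],
}
print(hex(find_first_block(0x30, xrefs)))
```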