input file with "magic numbers" (DEADCAFEh, etc.) into array
given output from DumpPE -disasm:
  get Win32 filename from first line of dumppe output
  ignore everything until see "Disassembly"
  for each line in disassembly
    if it's a function label
      if opstring length > minlength and
      if opstring contains at least one ret or jmp
        if opstring length is maximum, might be junk so:
          chop it off at the first ret or jmp
        output filename, previous function name, and opstring
      set up for next opstring:
        set new function name; clear opstring; oplength = 0
    else if it's an instruction line, and opstring isn't at maxlength
      if line contains a large hex operand
        if the hex operand is found in the "magic" array
          add mnemonic "_" magic number to opstring
      else if it's a Windows API call
        add API name to opstring without trailing 'W' or 'A'
      else if it's a branch target
        add "loc" to opstring
      else if it's common (mov, push, pop, add esp)
        if it's an API mov
          add "mov_" and API name to opstring
        else
          do nothing
      else if it's junk code (nop, int 3, etc.)
        do nothing
      else if it's data ; relying on DumpPE to separate code/data
        do nothing
      else if we've seen this same thing many times in a row
        do nothing
      else if mnemonic has prefix (rep, lock, etc.)
        add prefix "_" mnemonic to opstring
      else
        add mnemonic to opstring
      if added to opstring
        oplength++
  stop processing when see hex dump
  output last one
send output to mkmd5db
Figure 4: Pseudocode for the Opstring program.
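
The core of the pseudocode can be sketched in Python. This is a minimal illustration, not the Opstring program itself. It assumes a simplified, hypothetical input format (one label such as "fn_a:" or one instruction such as "call CreateFileA" per line) rather than real DumpPE output, treats any 8-digit hex value as a "large" operand, and joins tokens with commas. The names token_for and build_opstrings, the separator, and the min_len default are all assumptions made for the example. Branch targets, API movs, data lines, repeat suppression, and the maximum-length chop are left out.

```python
import re

COMMON = {"mov", "push", "pop"}      # too frequent to be distinctive
JUNK = {"nop", "int"}                # padding such as nop and int 3
PREFIXES = {"rep", "repne", "lock"}
# Assumption: a "large" hex operand is 8 hex digits with an 'h' suffix.
HEX_OPERAND = re.compile(r"\b0?([0-9A-Fa-f]{8})h\b")


def token_for(line, magic):
    """Return the opstring token for one instruction line, or None to skip it."""
    parts = line.split()
    mnemonic = parts[0].lower()
    hex_match = HEX_OPERAND.search(line)
    if hex_match:
        # Large hex operands contribute only when they are known magic numbers.
        value = hex_match.group(1).upper()
        return f"{mnemonic}_{value}" if value in magic else None
    if mnemonic == "call" and len(parts) == 2 and parts[1][0].isupper():
        # Assumed API-call syntax; drop one trailing 'A' or 'W'.
        name = parts[1]
        return name[:-1] if name[-1] in "AW" else name
    if mnemonic in COMMON or mnemonic in JUNK:
        return None
    if mnemonic in PREFIXES and len(parts) > 1:
        return f"{mnemonic}_{parts[1].lower()}"
    return mnemonic


def build_opstrings(lines, magic, min_len=3):
    """Yield (function_name, opstring) pairs from simplified disassembly lines."""
    name, ops = None, []

    def worth_keeping():
        # Long enough, and contains at least one ret or jmp.
        return (name is not None and len(ops) > min_len
                and any(t in ("ret", "jmp") for t in ops))

    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):       # assumed function-label syntax
            if worth_keeping():
                yield name, ",".join(ops)
            name, ops = line[:-1], []
        else:
            token = token_for(line, magic)
            if token is not None:
                ops.append(token)
    if worth_keeping():              # output the last one
        yield name, ",".join(ops)
```

Calling list(build_opstrings(lines, {"DEADCAFE"})) on a list of such lines returns one (name, opstring) pair for each function that passes the length and ret/jmp checks. The pairs would then be written out for the next stage of the pipeline.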