[panda-users] How can I get the original assembly code(opcode)?

Brendan Dolan-Gavitt brendandg at gatech.edu
Mon Aug 17 13:03:34 EDT 2015


"pc" and env->eip can be different! QEMU typically only updates
env->eip every basic block. The insn_exec callback will provide the
precise program counter value as its argument though (it stores it at
translation time so it can be passed in).
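
To illustrate the difference, here is a toy model in plain C. It is not
QEMU or PANDA code, and every name in it (ToyCPU, ToyInsn,
toy_exec_block, toy_count_mismatch) is made up for the example. It
assumes the simplified behaviour described above: the CPU field is
written once when a block is entered, while each instruction's pc is
fixed at translation time and handed to the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these types or functions exist in QEMU or PANDA. */
typedef struct {
    uint32_t eip;   /* stands in for env->eip */
} ToyCPU;

typedef struct {
    uint32_t pc;    /* instruction address, fixed when the block is "translated" */
} ToyInsn;

/* Per-instruction callback: receives the precise pc as an argument. */
typedef void (*toy_insn_cb)(const ToyCPU *cpu, uint32_t pc, void *opaque);

/* Run one block: the eip field is written once on entry, while the
 * callback is handed each instruction's own pc. */
void toy_exec_block(ToyCPU *cpu, const ToyInsn *insns, size_t n,
                    toy_insn_cb cb, void *opaque)
{
    assert(cpu != NULL && cb != NULL);
    if (n == 0) {
        return;
    }
    cpu->eip = insns[0].pc;
    for (size_t i = 0; i < n; i++) {
        cb(cpu, insns[i].pc, opaque);
    }
}

/* Example callback: count how often the pc argument and the eip field disagree. */
void toy_count_mismatch(const ToyCPU *cpu, uint32_t pc, void *opaque)
{
    size_t *mismatches = opaque;
    if (pc != cpu->eip) {
        (*mismatches)++;
    }
}
```

With a three-instruction block, the callback sees pc differ from the eip
field on the second and third instructions, since the field still holds
the address of the block's first instruction.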

Manolis is right that this won't give you the original binary back.
One thing you can do is take a memory snapshot during replay and then
use Volatility to extract the binary image from memory. This will
preserve the headers, data sections, etc. However, depending on the
amount of RAM available, some pages might be swapped out, in which case
the extracted image may be incomplete.

If what you're looking to do is just disassemble something, you can
use the recently added panda_disas function:

void panda_disas(FILE *out, void *code, unsigned long size)

Alternatively, if you want to have some machine-parseable description
of the disassembled instruction, you can use distorm; an example of
that can be found in the callstack_instr plugin.
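
Whichever disassembler you pick, the bytes first have to be read from
guest memory at pc. As a rough sketch, the helper below turns such a
buffer into a hex string for logging. format_opcode_bytes is a
hypothetical name, not part of PANDA; the commented-out call is adapted
from the one quoted further down, with pc in place of env->eip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of PANDA): write `len` bytes from `buf`
 * into `out` as space-separated lowercase hex, e.g. "55 8b ec".
 * Returns the number of characters written (excluding the terminating
 * NUL), or -1 if `out` is too small. */
int format_opcode_bytes(const uint8_t *buf, size_t len,
                        char *out, size_t out_size)
{
    assert(out != NULL && (buf != NULL || len == 0));

    /* Each byte needs 2 hex digits plus a separator; the last
     * separator slot is used for the terminating NUL instead. */
    size_t needed = len ? len * 3 : 1;
    if (out_size < needed) {
        return -1;
    }

    size_t pos = 0;
    for (size_t i = 0; i < len; i++) {
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                i ? " %02x" : "%02x", buf[i]);
    }
    out[pos] = '\0';
    return (int)pos;
}

/* Inside a PANDA instruction callback, the buffer would be filled from
 * guest memory at pc before formatting, along the lines of:
 *
 *     uint8_t buf[16];
 *     panda_virtual_memory_rw(env, pc, buf, sizeof(buf), 0);
 */
```

The same buffer could then be passed on to panda_disas or distorm,
assuming both take the raw bytes from a host-side buffer as the void
*code parameter above suggests.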

-Brendan

On Mon, Aug 17, 2015 at 9:08 AM, Manolis Stamatogiannakis
<mstamat at gmail.com> wrote:
> Igor, are you sure that the "pc" argument and "env->eip" will contain
> different values? I'd guess that "pc" is provided as a convenience so that
> you can avoid architecture-specific #ifdef macros in your plugin code
> ("env->eip" is x86-specific).
>
> InGap, could you elaborate on what you are trying to achieve?
>
> Reconstructing mybin.exe from an execution trace is a non-trivial task. Even
> in the (unlikely) case you have full coverage of mybin.exe in the execution
> trace (i.e. every instruction in mybin.exe was executed at least once), the
> order of the instructions as executed still may be different than the order
> they appear in the binary. Moreover, executables are not plain instruction
> dumps. They contain a lot of structured information (see
> https://en.wikipedia.org/wiki/Portable_Executable) that you will not be able
> to recapture just by observing the execution.
>
> M.
>
>
>
> 2015-08-17 8:33 GMT+02:00 Igor R <boost.lists at gmail.com>:
>>
>> > I am trying to get the original assembly code (opcodes) of "mybin.exe"
>> > in a PANDA plugin
>> > (for tracing the binary's opcodes, registers, memory, ...).
>> >
>> > Host OS : ubuntu x64
>> > Guest OS : windows xp x86
>> > Test binary : mybin.exe
>> >
>> > I got the opcode using the panda_virtual_memory_rw function in the
>> > PANDA_CB_INSN_TRANSLATE callback,
>> > e.g. panda_virtual_memory_rw(env, env->eip, buf, 20, 0);
>> >
>> > But it is not the same as the original assembly code of 'mybin.exe'.
>> > It seems to have been translated by PANDA.
>>
>>
>>
>> Quoting from the documentation:
>> <<
>> insn_translate: called before the translation of each instruction
>>
>> Callback ID: PANDA_CB_INSN_TRANSLATE
>>
>> Arguments:
>>
>> CPUState *env: the current CPU state
>> target_ulong pc: the guest PC we are about to translate
>> >>
>>
>> So, if you need the opcode of the instruction being translated, you
>> should read the memory at the "pc" address (rather than env->eip).
>> _______________________________________________
>> panda-users mailing list
>> panda-users at mit.edu
>> http://mailman.mit.edu/mailman/listinfo/panda-users

