[panda-users] A TCG-based taint plugin for Panda

Gabriele Viglianisi vigliag at gmail.com
Mon Nov 20 12:58:23 EST 2017


Hi everybody,

I had some difficulties with taint2 that I was unable to debug with my
current qemu and llvm knowledge, so I decided to make a quick attempt to
port the tcg-based taint instrumentation from Qtrace
(https://github.com/rpaleari/qtrace), a different qemu port by @rpaleari.
Surprisingly, I managed to get something working well enough for my
purposes (or so it appears from a few tests), and I thought I'd share it
with you. Most of the credit goes to Roberto Paleari for publishing the
original Qtrace code.

The plugin is not a full replacement for taint2, as it's not as general or
powerful, but it seems lighter on resources. It only supports the i386
target and doesn't instrument helpers. It is composed of two parts: the
instrumentation (additional helper calls generated inside tcg's frontend)
and the "tcgtaint" plugin, which manages the data structures and exposes an
interface similar to that of taint2.

The approach is probably not the cleanest, as it requires some insertions
into qemu's tcg code, and it may not be general enough to be included in
panda. If, however, the Panda developers are interested in including it (in
the main or in a separate branch), I'd be happy to do some cleanup and
prepare a PR.

All the relevant code is inside ifdefs (you can grep for
"CONFIG_QTRACE_TAINT"), and is only built when the "--enable-tcgtaint"
switch is present. The instrumentation helper calls are only generated when
requested via the plugin api, and shouldn't affect performance when
disabled.
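
To illustrate the "only generated when requested" idea, here is a toy model
in Python (not code from the branch). The function name, the op strings and
the choice of which ops get a helper are all hypothetical; the point is only
that with the flag off, the emitted code is identical to the plain
translation.

```python
def translate_block(guest_ops, taint_enabled):
    """Return the toy IR ops emitted for one guest block (hypothetical model).

    When taint_enabled is False the output equals the input, so the
    executed code carries no extra helper calls.
    """
    emitted = []
    for op in guest_ops:
        # Assumption: only data-movement ops get a taint helper in this toy.
        if taint_enabled and op.startswith(("mov", "load", "store")):
            emitted.append(f"call taint_helper({op!r})")
        emitted.append(op)
    return emitted


block = ["mov eax, ebx", "add eax, 1"]
print(translate_block(block, taint_enabled=False))
print(translate_block(block, taint_enabled=True))
```

Placing the helper call before each op is a simplification of this sketch.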

Differences with respect to taint2:
- comparatively very light on memory (unless you actually taint a lot)
- a little faster
- can be enabled and disabled via the plugin api, can also be told to only
instrument user-level code
- doesn't instrument helpers (the main use case is being able to tell the
provenance of some piece of data, so the focus was on getting movs, memcpys
and similar working correctly)
- the instrumentation is added before stores, so you can safely taint in a
"virt_mem_after_write" callback
- doesn't support tainting through pointer dereference
- doesn't support tainted branches
- only partial support for xmm registers (movs only)
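
A few of the points above (copies propagate taint, pointer taint is ignored
on dereference, and shadow state is updated before the after-write hook
runs) can be sketched with a toy shadow-memory model. This is Python, not
the plugin's code: the class, its methods and the one-byte registers are
hypothetical simplifications.

```python
class ToyTaintTracker:
    """Toy byte-granular shadow state; every name here is hypothetical."""

    def __init__(self):
        self.reg_labels = {}  # register name -> set of labels
        self.mem_labels = {}  # address -> set of labels

    def label_mem(self, addr, size, label):
        """Attach a label to each byte in [addr, addr + size)."""
        for a in range(addr, addr + size):
            self.mem_labels.setdefault(a, set()).add(label)

    def query_mem(self, addr):
        """Return a copy of the labels attached to one byte."""
        return set(self.mem_labels.get(addr, set()))

    def load(self, dst_reg, addr, addr_reg=None):
        """Model a one-byte load: only the loaded byte's labels flow.

        Labels on addr_reg (the pointer) are deliberately ignored, i.e.
        no tainting through pointer dereference.
        """
        self.reg_labels[dst_reg] = self.query_mem(addr)

    def store(self, addr, src_reg, after_write=None):
        """Model a one-byte store.

        Shadow state is updated first and the after-write hook runs last,
        so a label applied inside the hook is not overwritten.
        """
        labels = set(self.reg_labels.get(src_reg, set()))
        if labels:
            self.mem_labels[addr] = labels
        else:
            self.mem_labels.pop(addr, None)
        if after_write is not None:
            after_write(self, addr)

    def memcpy(self, dst, src, size):
        """Copy byte by byte through a scratch register, as a mov loop would."""
        for i in range(size):
            self.load("tmp", src + i)
            self.store(dst + i, "tmp")


tracker = ToyTaintTracker()
tracker.label_mem(0x1000, 4, "input")
tracker.memcpy(0x2000, 0x1000, 4)
print(tracker.query_mem(0x2003))  # the copied byte carries the label
```

If the shadow update ran after the hook instead, a clean source register
would wipe a label the hook had just applied, which is why the ordering
matters for tainting from an after-write callback.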

Differences with respect to the original instrumentation in qtrace:
- it uses panda's memory functions and callbacks instead of qtrace's
- I moved load instrumentation deeper down in tcg's frontend, so as to
support `rep movs` and similar
- added partial support for xmm registers, in order to support `movdqa`
- some bug fixes

The code still lacks some comments, copyright notices, and documentation,
but it should work. Any comment or feedback is welcome!

You can find the code on the tcgtaint branch of my repo:
https://github.com/vigliag/panda/tree/tcgtaint

Changes with respect to panda's master are here:
https://github.com/panda-re/panda/compare/master...vigliag:tcgtaint

Best regards,
Gabriele