<div dir="ltr">Really interesting, I will have to take a look! I saw that the basic shadow memory is essentially an unordered_map from the physical address in memory to a std::set of labels. I'm a little surprised this is performant, because we tried this a few years back and found it was pretty slow (hence our current sparse virtual memory approach).<div><br></div><div>How hard would it be to add additional helpers? (Not saying I would want you to do it, just how hard it would be if someone later wanted to make the taint support more complete)</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Nov 20, 2017 at 9:58 PM, Gabriele Viglianisi <span dir="ltr">&lt;<a href="mailto:vigliag@gmail.com" target="_blank">vigliag@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi everybody, <div><br>I had some difficulties with taint2, which I was unable to debug with my current qemu and llvm knowledge, so I decided to make a quick attempt to port the tcg-based taint instrumentation from Qtrace (<a href="https://github.com/rpaleari/qtrace" target="_blank">https://github.com/rpaleari/<wbr>qtrace</a>), a different qemu port by @rpaleari. Surprisingly, I've managed to get something working well enough for my purposes (or so it appears from a few tests), and I thought I'd share it with you. Most of the credit goes to Roberto Paleari for publishing the original Qtrace code.</div><div><br></div><div>The plugin is not a full replacement for taint2, as it's not as general or powerful, but it seems lighter on resources. 
It only supports the i386 target, doesn't instrument helpers, and is composed of two parts: the instrumentation (additional helper calls generated inside tcg's frontend), and the "tcgtaint" plugin, which manages the data structures and exposes an interface similar to that of taint2.</div><div><br></div><div>The approach is probably not the cleanest, as it requires some insertions into qemu's tcg code, and it may not be general enough to be included in panda.</div><div>That said, if the Panda developers are interested in including it (in the main or in a separate branch), I'd be happy to do some cleanup and prepare a PR.</div><div><br></div><div>All the relevant code is in ifdefs (you can grep for "CONFIG_QTRACE_TAINT"), and is only built when the "--enable-tcgtaint" switch is present. The instrumentation helper calls are only generated when requested via the plugin api, and shouldn't affect performance when disabled.</div><div><br>Differences with respect to taint2:<br></div><div>- comparatively very light on memory (unless you actually taint a lot)<br>- a little faster<br>- can be enabled and disabled via the plugin api, and can also be told to only instrument user-level code<br>- doesn't instrument helpers (the main use case is being able to tell the provenance of some piece of data, so the focus was on getting movs, memcpys and similar working correctly)<br>- the instrumentation is added before stores, so you can safely taint on a "virt_mem_after_write" callback<br>- doesn't support tainting through pointer dereference</div><div>- doesn't support tainted branches<br>- only partial support for xmm registers (movs only) <br><br>Differences with respect to the original instrumentation in qtrace:<br>- it uses panda's memory functions and callbacks instead of qtrace's<br>- I moved load instrumentation deeper down in tcg's frontend, so as to support `rep mov` and similar<br>- added partial support for xmm registers, in order to support `movdqa`<br>- some bug 
fixes</div><div><br>The code still lacks some comments, copyright notices, and documentation, but it should work. Any comment or feedback is welcome! </div><div><br></div><div>You can find the code at the tcgtaint branch in my repo <a href="https://github.com/vigliag/panda/tree/tcgtaint" target="_blank">https://github.com/vigliag/pan<wbr>da/tree/tcgtaint</a></div><div><br></div><div>changes wrt panda's master are here: <a href="https://github.com/panda-re/panda/compare/master...vigliag:tcgtaint" target="_blank">https://github.com/panda-re/pa<wbr>nda/compare/master...vigliag:<wbr>tcgtaint</a><br><br>Best regards,<br>Gabriele<br></div></div>
<br>______________________________<wbr>_________________<br>
panda-users mailing list<br>
<a href="mailto:panda-users@mit.edu">panda-users@mit.edu</a><br>
<a href="http://mailman.mit.edu/mailman/listinfo/panda-users" rel="noreferrer" target="_blank">http://mailman.mit.edu/<wbr>mailman/listinfo/panda-users</a><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Brendan Dolan-Gavitt<br>Assistant Professor, Department of Computer Science and Engineering<br>NYU Tandon School of Engineering</div>
</div>
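<div>The shadow memory described at the top of this message (a hash map from physical address to a set of labels) can be illustrated with a minimal sketch. It is written in Python, with dict and set standing in for unordered_map and std::set. The class name, method names, and the copy semantics are all hypothetical and are not taken from the tcgtaint or Qtrace code.</div>

```python
class ShadowMemory:
    """Hypothetical sparse shadow memory: one set of taint labels per tainted byte."""

    def __init__(self):
        # Maps physical address to a set of labels. Only tainted bytes get an
        # entry, so memory use grows with the amount of taint, not with RAM size.
        self._labels = {}

    def taint(self, addr, label):
        # Add one label to the byte at addr.
        self._labels.setdefault(addr, set()).add(label)

    def labels(self, addr):
        # Read-only view of the labels on one byte (empty if untainted).
        return frozenset(self._labels.get(addr, ()))

    def copy(self, dst, src, size):
        # mov/memcpy-style propagation (an assumption of this sketch):
        # each destination byte takes the source byte's labels, or becomes clean.
        # Snapshot the source first so overlapping ranges behave predictably.
        snapshot = [self._labels.get(src + i) for i in range(size)]
        for i, labels in enumerate(snapshot):
            if labels:
                self._labels[dst + i] = set(labels)
            else:
                self._labels.pop(dst + i, None)

    def tainted_bytes(self):
        # Number of bytes that currently carry at least one label.
        return len(self._labels)


shadow = ShadowMemory()
shadow.taint(0x1000, 7)          # label 7 on one byte
shadow.copy(0x2000, 0x1000, 4)   # 4-byte copy: only the first byte is tainted
```

<div>After the copy, only two entries exist in the map (the source byte and its copy), which is why this layout stays light on memory unless a lot of data is tainted. The cost is a hash lookup and possibly a set allocation per byte touched.</div>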