[panda-users] Presenting my thesis work, SysTaint, which employs PANDA for malware analysis
Gabriele Viglianisi
vigliag at gmail.com
Sun May 6 05:23:01 EDT 2018
Dear PANDA users,
I've finally finished my master's thesis and I'm happy to share it with you.
I've applied PANDA to the study of malware, with the goals of
providing a replacement for debugging and making it easier to study
malware that communicates with external servers. Some of the techniques
I've used are similar to the ones described in the Dispatcher paper by
Caballero et al., but rather than protocol reverse engineering, my
focus was on building an easy-to-use tool for studying a sample by
inspecting its data flow.
My approach consists of:
- Executing the malware sample in a virtual machine (manually or
through Cuckoo Sandbox) and obtaining a PANDA recording
- Collecting information on all processes in the recording via
asidstory and rekall
- Using network logs and "stringsearch" to find the processes of interest
- Collecting statistics on the functions the malware processes called,
and detecting encryption functions
- Tracing system calls and applying taint analysis to find
data-dependencies between them, as well as the function calls using
the tracked data
- Logging the collected data to disk, so that it can be interactively
queried by an analyst to quickly locate relevant data and code,
complementing both Cuckoo Sandbox's analyses and the usual reverse
engineering tools.
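
To illustrate the kind of query the logged data is meant to support, here
is a minimal sketch in Python. It is not the format SysTaint actually
writes: the CallEvent record, its fields, and the call names are all
hypothetical. The assumption is that each logged call gets an identifier,
that data it produces is tainted with that identifier, and that each
later call records which identifiers were found on the data it consumed.
Walking those labels backwards gives the provenance of a network send.

```python
from dataclasses import dataclass, field


@dataclass
class CallEvent:
    """Hypothetical log record for one traced system or function call."""
    event_id: int          # identifier, also used as the taint label of its output
    name: str              # illustrative call name, not a real API
    input_labels: set = field(default_factory=set)  # labels seen on consumed data


def provenance(events, target_id):
    """Return the ids of all earlier calls the target transitively depends on."""
    by_id = {e.event_id: e for e in events}
    seen = set()
    stack = [target_id]
    while stack:
        current = stack.pop()
        for label in by_id[current].input_labels:
            if label in by_id and label not in seen:
                seen.add(label)
                stack.append(label)
    return sorted(seen)


# Made-up trace: two sources, one transformation, one network send.
events = [
    CallEvent(1, "read_config"),
    CallEvent(2, "query_hostname"),
    CallEvent(3, "transform", {1}),
    CallEvent(4, "send", {2, 3}),
]

print(provenance(events, 4))
```

In this made-up trace the send depends directly on calls 2 and 3, and
through call 3 also on call 1.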
I've made changes to some existing PANDA plugins, and developed some new ones:
- Callstack_instr was refactored and expanded, so that it assigns an
identifier to each call, allowing per-call information to be
collected.
- A new "ProcInfoDump" plugin exposes the guest's memory to a
python+rekall script embedded in the same process via PyBind11, so
that it can be used to quickly inspect memory at various points in
time.
- "StringSearch2" is an easier-to-use version of StringSearch.
- "FnMemLogger" collects statistics about the functions the malware
uses, by monitoring the first 5 calls to each function. For each call,
it obtains the size, entropy and number of ASCII characters of each
buffer the function reads or writes, together with the number of basic
blocks and instructions executed, and the ratio of arithmetic
operations over the total. This data is then analyzed by scripts to
automatically detect encryption functions via heuristics.
- "TCGTaint" is a TCG-based taint tracking implementation, adapted
from Qtrace. The way it hooks into QEMU's TCG is not ideal, but it's
fast, flexible and gets the job done.
- "SysTaint" is the main analysis plugin. It collects information on
selected system and function calls by monitoring memory accesses and
employing taint tracking.
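
As a rough illustration of the per-buffer statistics FnMemLogger collects,
the Python sketch below computes the Shannon entropy and the number of
printable ASCII characters of a buffer, and combines them with the
arithmetic ratio in a toy heuristic. The function names, the thresholds
and the rule itself are assumptions made for this example. They are not
the heuristics used in the thesis scripts.

```python
import math
from collections import Counter


def shannon_entropy(buf: bytes) -> float:
    """Shannon entropy of a buffer in bits per byte (0.0 to 8.0)."""
    if not buf:
        return 0.0
    total = len(buf)
    counts = Counter(buf)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def printable_ascii_count(buf: bytes) -> int:
    """Number of bytes in the printable ASCII range 0x20..0x7e."""
    return sum(1 for b in buf if 0x20 <= b <= 0x7E)


def looks_like_encryption(read_buf, written_buf, arithmetic_ratio,
                          min_entropy_gain=2.0, min_arith_ratio=0.3):
    """Toy heuristic with assumed thresholds: flag a call whose output is
    much more random than its input and whose code is arithmetic-heavy."""
    gain = shannon_entropy(written_buf) - shannon_entropy(read_buf)
    return gain >= min_entropy_gain and arithmetic_ratio >= min_arith_ratio


plaintext = b"a" * 256
ciphertext_like = bytes(range(256))  # every byte value once: maximal entropy
print(looks_like_encryption(plaintext, ciphertext_like, arithmetic_ratio=0.5))
```

A real detector would have to aggregate these values over the monitored
calls to each function and over all of its buffers.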
Additionally, I patched Cuckoo Monitor so that it emits hypercalls
when it intercepts a call to a known system library. When the sample's
execution is recorded while Cuckoo Sandbox is analyzing it, this makes
it possible to jump quickly from the entries in Cuckoo's behavioral
log to the in-depth data collected by SysTaint.
I tested my work by analyzing the execution of Zeus, Citadel, Dridex
and Emotet, locating the data sent through the network, finding its
provenance, and the code that transformed and encrypted the original
data. More details are provided in the thesis. There are still many
things that can be improved, but the tool already works as a prototype
and can provide analysts with plenty of information without requiring
them to debug the malware or execute it more than once.
You can find my work here:
- Thesis: https://www.gabrieleviglianisi.com/files/GabrieleViglianisi-SysTaint-Thesis.pdf
- Thesis defense slides:
https://www.gabrieleviglianisi.com/files/GabrieleViglianisi-SysTaint-Thesis-Defense.pdf
- PANDA fork with the added plugins: https://github.com/vigliag/panda
- Fork of Cuckoo Monitor with the added hypercalls:
https://github.com/vigliag/cuckoo_monitor_panda
- I can also share my Python scripts and Jupyter notebooks. They still
need some cleanup, but feel free to ask if you are interested.
Please feel free to contact me if you have any questions. I'd also be
happy to upstream my changes and plugins to PANDA.
Best regards,
Gabriele