[panda-users] Tracing instructions from kernel space

Jayashree Mohan jayashree2912 at gmail.com
Sat Nov 10 00:09:06 EST 2018


To reproduce the problem I am facing, try the following:

I've enabled tracking the entire memory region in the plugin here :
https://github.com/williewillus/panda_scratchpad/blob/jaya_dev/personal_plugins/panda/plugins/writetracker/writetracker.cpp

and the test workload file I am using is here :
https://github.com/williewillus/panda_scratchpad/blob/jaya_dev/communication/write-cacheline.c

It simply writes 64 B of data into the file.

What I am trying to do is:
1. Create a pmem device / ramdisk and mount it with ext4-dax at /mnt/pmem0
2. Start up the VM on QEMU and load the writetracker plugin above
3. On the VM, execute the workload above, which writes 64B, followed by an
fsync
4. Unload the writetracker plugin. By this point it will have traced the
memory writes, with their addresses and data, into the wt.out file on the host.
5. Parse the file to see the addresses and data by using the
src/reader.cpp file in the repo.
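
For readers who cannot open the link, here is a minimal sketch of a workload
of the kind step 3 describes: write a buffer of 'R' bytes with write(), then
fsync(). It is not the linked write-cacheline.c; the function name
write_and_sync and the 4096-byte buffer limit are assumptions made for
illustration only.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from the linked repo): write `len` bytes of 'R'
 * to `path` with a single write() call, then fsync(). Returns 0 on success
 * and -1 on any error or short write. */
int write_and_sync(const char *path, size_t len)
{
    char buf[4096];

    if (len > sizeof(buf))
        return -1;
    memset(buf, 'R', len);

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, buf, len);
    int ok = (n == (ssize_t)len) && (fsync(fd) == 0);

    close(fd);
    return ok ? 0 : -1;
}
```

Calling write_and_sync() on a file under /mnt/pmem0 with len set to 64, and
then to 63, would correspond to the two cases compared in this message.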

After these steps, I see the behaviour I described. When I write 64B, the
data (a series of RRRRR...) does not appear in the parsed wt.out file. If I
change the workload to write 63B instead, the data does appear.

Sample snippet from the parsed file when I write 63B :

[pc 0xffffffff81456121] write to VA 41411000, size 8, Data : RRRRRRRR
[pc 0xffffffff81456121] write to VA 41411008, size 8, Data : RRRRRRRR
[pc 0xffffffff81456121] write to VA 41411010, size 8, Data : RRRRRRRR
[pc 0xffffffff81456121] write to VA 41411018, size 8, Data : RRRRRRRR
[pc 0xffffffff81456121] write to VA 41411020, size 8, Data : RRRRRRRR
[pc 0xffffffff81456121] write to VA 41411028, size 8, Data : RRRRRRRR
[pc 0xffffffff81456121] write to VA 41411030, size 8, Data : RRRRRRRR
[pc 0xffffffff81456149] write to VA 41411038, size 4, Data : RRRR
[pc 0xffffffff8145615d] write to VA 4141103c, size 1, Data : R
[pc 0xffffffff8145615d] write to VA 4141103d, size 1, Data : R
[pc 0xffffffff8145615d] write to VA 4141103e, size 1, Data : R
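
The snippet above can be checked mechanically. The sketch below parses lines
in exactly the textual format shown and sums the traced sizes. It assumes the
data field is printable and contains no spaces (true for the 'R' payload
here). The names trace_rec, parse_trace_line and total_traced_bytes are
hypothetical and are not part of src/reader.cpp.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One parsed record from the textual trace (hypothetical layout). */
struct trace_rec {
    uint64_t pc;       /* program counter of the store */
    uint64_t addr;     /* address printed after "VA" (hex, no 0x prefix) */
    unsigned size;     /* number of bytes written */
    char     data[64]; /* payload as printed; assumed to have no spaces */
};

/* Parse one line of the form
 *   [pc 0x...] write to VA <hex>, size <n>, Data : <bytes>
 * Returns 1 if all four fields were read, 0 otherwise. */
int parse_trace_line(const char *line, struct trace_rec *r)
{
    int n = sscanf(line,
                   "[pc %" SCNx64 "] write to VA %" SCNx64
                   ", size %u, Data : %63s",
                   &r->pc, &r->addr, &r->size, r->data);
    return n == 4;
}

/* Sum the sizes of all lines that parse; lines that do not match are skipped. */
unsigned long total_traced_bytes(const char *const *lines, size_t count)
{
    unsigned long total = 0;
    struct trace_rec r;

    for (size_t i = 0; i < count; i++) {
        if (parse_trace_line(lines[i], &r))
            total += r.size;
    }
    return total;
}
```

For the eleven lines above, the sizes are seven writes of 8 bytes, one of 4
and three of 1, which add up to the 63 bytes written.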

Although the label says VA, the value printed is the physical address, which
is just above 1GB. My emulated pmem device occupies the physical range from
1GB to 1GB+128MB, so this makes sense.
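
As a quick sanity check of that claim, here is a small sketch that tests
whether an address falls inside the range 1GB to 1GB+128MB. The constants
come from the description above; the names PMEM_BASE, PMEM_SIZE and
in_pmem_range are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define PMEM_BASE 0x40000000ULL /* 1GB, start of the emulated pmem device */
#define PMEM_SIZE 0x08000000ULL /* 128MB, size of the emulated pmem device */

/* True if physical address `pa` lies in [PMEM_BASE, PMEM_BASE + PMEM_SIZE). */
bool in_pmem_range(uint64_t pa)
{
    return pa >= PMEM_BASE && (pa - PMEM_BASE) < PMEM_SIZE;
}
```

The first address in the sample, 0x41411000, is about 20MB past the 1GB
mark, so it falls inside this range.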

Irrespective of the address, and even when I run the workload on a ramdisk,
I would expect to see 'RRRR....' in the trace file. I do not see it when the
write is an aligned multiple of 64B.
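
The pattern reported in this thread can be written down as simple arithmetic.
The sketch below splits a write length into the part covered by full 64-byte
cachelines and the remaining tail. It assumes the buffer starts on a
cacheline boundary, and it only describes the observation (the tail is
traced, the full cachelines are not); it says nothing about the cause. The
names CACHELINE, split and split_by_cacheline are hypothetical.

```c
#include <stddef.h>

#define CACHELINE 64u /* cacheline size in bytes, as stated in the thread */

struct split {
    size_t full_bytes; /* bytes covered by whole cachelines (not seen in trace) */
    size_t tail_bytes; /* leftover bytes after the last full cacheline (seen) */
};

/* Split `len` bytes, assumed to start cacheline-aligned, into the
 * full-cacheline portion and the tail. */
struct split split_by_cacheline(size_t len)
{
    struct split s;

    s.tail_bytes = len % CACHELINE;
    s.full_bytes = len - s.tail_bytes;
    return s;
}
```

Under this model a 63B write is all tail (63 bytes visible), a 64B write has
no tail (nothing visible), and a 258B write leaves a 2-byte tail, which
matches the cases described in this thread.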

Hope this helps reproduce the issue. Let me know if you need more details
or if I am making a mistake here.

Thanks,
Jayashree Mohan






On Fri, Nov 9, 2018 at 10:19 PM Brendan Dolan-Gavitt <brendandg at nyu.edu>
wrote:

> Hmm, something definitely seems odd – with this test program:
>
> https://gist.github.com/moyix/ed0d6dde9bc8164ff5e58030282d72af
>
> and then testing with -panda stringsearch:str="averylongstring" I can
> see writes in the kernel in __copy_from_user_ll_nozero – but only if I
> do a memcpy to a different userspace buffer first!
>
> Could you share the userland program you're using to test so I can compare?
>
> -Brendan
>
> On Fri, Nov 9, 2018 at 9:06 PM, Jayashree Mohan <jayashree2912 at gmail.com>
> wrote:
> > Hi Brendan,
> >
> > We verified by enabling tracing the entire memory region rather than
> > confining it to 1-2G. However, the writes still cannot be traced. The
> > behaviour we see is rather interesting. When we write something less
> > than a cacheline(64B) using a write system call followed by fsync, it
> > gets traced by PANDA during the PANDA_CB_VIRT_MEM_AFTER_WRITE
> > callback. However, when we write anything in multiples of aligned
> > cachelines, we don't see any memory write traces. For example, if I
> > write 258B into a file, I can see the last two bytes of data alone.
> > This seems weird as PANDA is not tracing full aligned cachelines. Do
> > you have any idea why this could be happening?
> >
> > Thanks,
> > Jayashree Mohan
> >
> > On Fri, Nov 9, 2018 at 11:41 AM Brendan Dolan-Gavitt <brendandg at nyu.edu>
> > wrote:
> >>
> >> The only thing I can think of from looking at your code briefly is
> >> your use of the physical address range to restrict it to only log
> >> writes in the 1-2GB range. Could it be that the kernel does
> >> copy_from_user at the start and copies it into someplace outside that
> >> range, then writes it back with copy_to_user at the end?
> >>
> >> On Fri, Nov 9, 2018 at 11:01 AM, Jayashree Mohan
> >> <jayashree2912 at gmail.com> wrote:
> >> > Hi Brendan,
> >> >
> >> > Thanks for the reply.
> >> >
> >> > Take a look at the plugin here :
> >> >
> >> > https://github.com/williewillus/panda_scratchpad/blob/master/personal_plugins/panda/plugins/writetracker/writetracker.cpp
> >> >
> >> > We load this plugin, and in the VM, write a simple program that writes to
> >> > the pmem device mounted within the memory region being tracked. I see memcpy
> >> > writes being traced, but not the ones due to write system call.
> >> >
> >> > I'll try checking if any of my writes originate in the kernel.
> >> >
> >> > Thanks,
> >> > Jayashree Mohan
> >> >
> >> >
> >> >
> >> > On Fri, Nov 9, 2018 at 9:56 AM Brendan Dolan-Gavitt <brendandg at nyu.edu>
> >> > wrote:
> >> >>
> >> >> Yes, it should definitely be tracing memory accesses in the kernel (it
> >> >> traces all memory accesses on the system) – could you post your plugin
> >> >> code?
> >> >>
> >> >> You may also want to simply log all memory accesses, along with the
> >> >> current program counter and (optionally) whether or not they originate
> >> >> in the kernel (using the panda_in_kernel API) to debug.
> >> >>
> >> >> On Fri, Nov 9, 2018 at 10:35 AM, Jayashree Mohan
> >> >> <jayashree2912 at gmail.com> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > I am using PANDA to trace all store instructions in an emulated pmem
> >> >> > device. I do this by writing a plugin that issues callbacks on
> >> >> > "PANDA_CB_VIRT_MEM_AFTER_WRITE" events. If I run a simple workload
> >> >> > that does write() system calls followed by mmap and memcpy, I can see
> >> >> > the callbacks being triggered for the user-space memcpy calls to a
> >> >> > file, but not anytime during the write system call. Does PANDA allow
> >> >> > tracing instructions from the kernel space?
> >> >> >
> >> >> > Thanks,
> >> >> > Jayashree Mohan
> >> >> > _______________________________________________
> >> >> > panda-users mailing list
> >> >> > panda-users at mit.edu
> >> >> > http://mailman.mit.edu/mailman/listinfo/panda-users
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Brendan Dolan-Gavitt
> >> >> Assistant Professor, Department of Computer Science and Engineering
> >> >> NYU Tandon School of Engineering