Date:      Sun, 21 Aug 2016 17:59:32 +0200
From:      Roger Pau Monné <royger@freebsd.org>
To:        Akshay Jaggi <akshay1994.leo@gmail.com>
Cc:        soc-status@freebsd.org, Pedro Giffuni <pfg@freebsd.org>
Subject:   Re: Grant Table Userspace Device - Status Update
Message-ID:  <20160821155932.lzqkbldx3ihhccdn@mac>
In-Reply-To: <CAAeUNV=b7=n9LH7vk6iCWXR=eoSe6i9aH05Aa1yEtSYGfw0u5A@mail.gmail.com>
References:  <CAAeUNVnfK29Mck_eRKguij2pYV%2BehAG=Qb5bcYwRJZXmODbPQQ@mail.gmail.com> <CAAeUNVnP9tm4F3b9hCBv=rTc8kd0gBsQ0jKhP7HVNrdrtK8RfA@mail.gmail.com> <CAAeUNV=b7=n9LH7vk6iCWXR=eoSe6i9aH05Aa1yEtSYGfw0u5A@mail.gmail.com>

On Fri, Aug 19, 2016 at 12:50:32AM +0530, Akshay Jaggi wrote:
> Carrying over discussion from IRC.
> 
> 20:11 royger: ghost_rider: hello! I've been doing some testing with the
> > device today, and it seems there's a memory leak somewhere, after shutting
> > down all my domains I still see 1KB of memory used by the device, which
> > AFAICT is not expected (you can check with `vmstat -m |grep gntdev`)
> >
> 
> Nope. That's not a leak.
> 
> I ran `vmstat -m | grep gntdev` just after booting up Dom0, without any
> of the DomUs running, and I still saw 1KB of memory being used by the
> device.
> 
> root@freebsd:~ # vmstat -m | grep gntdev
>        gntdev     2     1K       -        2  64
> 
> That is, 2 allocations have been requested in total and both are still
> in use, without any DomUs active.
> 
> After this I fired up a DomU with qdisk backends, and vmstat returned:
> 
> root@freebsd:~/xen_test # vmstat -m | grep gntdev
>        gntdev  2129   134K       -     2137  32,64,128
> 
> Well in line with expectations. Now, powering off the DomU and running
> vmstat again, we get:
> 
> root@freebsd:~/xen_test # vmstat -m | grep gntdev
>        gntdev     2     1K       -     2845  32,64,128
> 
> The initial 2 allocations are still in use, and this has nothing to do
> with the DomUs. The first malloc() in the device happens in the device
> open function at [1], which means that someone has the device open.
> `fstat` confirmed my suspicions.
> 
> `fstat` with DomU active:
> 
> root@freebsd:~/xen_test # fstat | grep xen/gntdev
> root     qemu-system-i386  1266   29 /dev         62 crw-------  xen/gntdev
> rw
> root     qemu-system-i386  1266   32 /dev         62 crw-------  xen/gntdev
> rw
> root     qemu-system-i386  1266   34 /dev         62 crw-------  xen/gntdev
> rw
> root     xenconsoled   751    6 /dev         62 crw-------  xen/gntdev rw
> root     xenstored    746   11 /dev         62 crw-------  xen/gntdev rw
> 
> `fstat` with DomU powered off:
> 
> root@freebsd:~ # fstat | grep xen/gntdev
> root     xenconsoled   751    6 /dev         62 crw-------  xen/gntdev rw
> root     xenstored    746   11 /dev         62 crw-------  xen/gntdev rw
> 
> So yep! It's no leak. Just that xenconsoled and xenstored keep the gntdev
> device open. I guess this would be expected behaviour. Let me know if it is
> not.
> 
> 20:14 royger: ghost_rider: and I've also seen a "Can't find requested
> > grant-map." after attaching 4 Qdisks to a domain and doing heavy IO to
> > them.
> > 20:16 royger: although this last one I haven't been able to reproduce
> >
> 
> That's pretty strange. I have never noticed this in any of my manual or
> stress tests.
> 
> At this point I would also like to mention that the xen-gnttab code is
> kind of buggy (putting it mildly, no offence).
> As I pointed out in the xen-devel patch thread, there is a place in the
> code where "-1" is used to indicate that there is no clear-byte notify,
> but the function never checks for that sentinel, which should cause a
> clear-byte notification on a different page and corrupt data. The only
> reason this bug doesn't is another bug: the -1 is passed into an
> unsigned 32-bit offset, which keeps it out of bounds for most requests.
> 
> I don't think this has anything to do with our device. If we lost an
> unmap request (which is where this message is generated), we would
> surely have leaked the memory for the gmap structure associated with
> that request (because, 1. it is ref-counted, and 2. it is transferred to
> the global cleanup list only on an unmap request), and that would have
> been visible in `vmstat`.
> 
> Let me know if this repeats.
> 
> 
> > 20:40 royger: and I'm not sure if you tested it, but if you attach a
> > ramdisk to a VM (one created with `mdconfig -t malloc ...`) and try to run
> > newfs against it, it doesn't work, a bunch of read errors appear on both
> > the DomU console and Qemu log. Although it works with a plain file, so I
> > guess this is probably some bad interaction between Qemu and FreeBSD block
> > devices...
> 
> 
> Mhm. Sounds like that. I'll try it out on my setup and post the results.

OK, no problem. As I said, it looks like this is some kind of bad 
interaction between the grant table device and md devices; it's worth 
looking into, but it's not a blocking issue in any case.

I've already reviewed all the remaining FreeBSD code, and I plan to commit 
it once 11.0 is released, so you still have a couple of weeks to look into 
the md issue if you want.

Regarding the Xen code, I'm not a maintainer of the library that you have 
modified, so you will have to wait for the Ack of one of the maintainers 
(next week is XenSummit, so everyone is probably going to be mostly 
offline).

Thanks, Roger.
