From: Roger Pau Monné
Date: Sun, 21 Aug 2016 17:59:32 +0200
To: Akshay Jaggi
Cc: soc-status@freebsd.org, Pedro Giffuni
Subject: Re: Grant Table Userspace Device - Status Update
Message-ID: <20160821155932.lzqkbldx3ihhccdn@mac>
List-Id: Summer of Code Status Reports and Discussion

On Fri, Aug 19, 2016 at 12:50:32AM +0530, Akshay Jaggi wrote:
> Carrying over discussion from IRC.
>
> > 20:11 royger: ghost_rider: hello! I've been doing some testing with the
> > device today, and it seems there's a memory leak somewhere, after shutting
> > down all my domains I still see 1KB of memory used by the device, which
> > AFAICT is not expected (you can check with `vmstat -m |grep gntdev`)
>
> Nope. That's not a leak.
>
> I ran `vmstat -m | grep gntdev` just after booting up Dom0, without any of
> the DomU's running, and I still saw 1KB of memory being used by the device.
>
> root@freebsd:~ # vmstat -m | grep gntdev
>        gntdev     2     1K       -        2  64
>
> That is, 2 requests have been made, both of which are currently active,
> without any DomU's active.
>
> After this I fired up a DomU with qdisk backends, and vmstat returned:
>
> root@freebsd:~/xen_test # vmstat -m | grep gntdev
>        gntdev  2129   134K       -     2137  32,64,128
>
> Well in line with expectations. Now, powering off the DomU and running
> vmstat again, we get:
>
> root@freebsd:~/xen_test # vmstat -m | grep gntdev
>        gntdev     2     1K       -     2845  32,64,128
>
> The initial 2 requests are still active, and this has nothing to do with
> the DomU's. The first malloc() that happens in the device is in the device
> open function at [1]. That means that someone has the device open. `fstat`
> confirmed my suspicions.
>
> `fstat` with DomU active:
>
> root@freebsd:~/xen_test # fstat | grep xen/gntdev
> root     qemu-system-i386  1266   29 /dev  62 crw-------  xen/gntdev rw
> root     qemu-system-i386  1266   32 /dev  62 crw-------  xen/gntdev rw
> root     qemu-system-i386  1266   34 /dev  62 crw-------  xen/gntdev rw
> root     xenconsoled        751    6 /dev  62 crw-------  xen/gntdev rw
> root     xenstored          746   11 /dev  62 crw-------  xen/gntdev rw
>
> `fstat` with DomU powered off:
>
> root@freebsd:~ # fstat | grep xen/gntdev
> root     xenconsoled        751    6 /dev  62 crw-------  xen/gntdev rw
> root     xenstored          746   11 /dev  62 crw-------  xen/gntdev rw
>
> So yep! It's no leak. Just that xenconsoled and xenstored keep the gntdev
> device open. I guess this would be expected behaviour. Let me know if it is
> not.
>
> > 20:14 royger: ghost_rider: and I've also seen a "Can't find requested
> > grant-map." after attaching 4 Qdisk to a domain and doing heavy IO to
> > them.
> > 20:16 royger: although this last one I haven't been able to reproduce
>
> That's pretty strange. I have never noticed this in any of my manual or
> stress tests.
>
> At this point I would also like to mention that the xen-gnttab code is
> kind of buggy (putting it mildly, no offence).
> Like I pointed out in the xen-devel patch thread, there is a place in the
> code where "-1" is being used to specify that there is no CLEAR_BYTE
> notify. But this is not being checked for inside the function, which would
> have caused a clear-byte notification on a different page, causing data
> corruption. The only reason this bug is not doing so is another bug, where
> this -1 is being passed on to an unsigned int32, which would keep it out of
> bounds for most requests.
>
> I don't think this has anything to do with our device. If we lost some
> unmap request (which is where this message is generated) we would have
> surely leaked the memory for the gmap structure associated with that
> request (because, 1. ref-counting, 2. transferred to global clean list only
> on an unmap request), and that would have been visible in `vmstat`.
>
> Let me know if this repeats.
>
> > 20:40 royger: and I'm not sure if you tested it, but if you attach a
> > ramdisk to a VM (one created with `mdconfig -t malloc ...`) and try to run
> > newfs against it, it doesn't work, a bunch of read errors appear on both
> > the DomU console and Qemu log. Although it works with a plain file, so I
> > guess this is probably some bad interaction between Qemu and FreeBSD block
> > devices...
>
> Mhm. Sounds like that. I'll try it out on my setup and post the results.

OK, no problem. As I said, it looks like this is some kind of bad
interaction between the grant table device and md devices. It's worth
looking into, but it's not a blocking issue in any case.

I've already reviewed all the remaining FreeBSD code, and I plan to commit
it once 11.0 is released, so you still have a couple of weeks to look into
the md issue if you want. Regarding the Xen code, I'm not a maintainer of
the library that you have modified, so you will have to wait for the Ack of
one of the maintainers (next week is XenSummit, so everyone is probably
going to be mostly offline).

Thanks, Roger.
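
For reference, the "-1" sentinel problem described in the quoted text can be
sketched in a few lines of C. This is only an illustration under stated
assumptions, not the xen-gnttab or gntdev code: the names SKETCH_PAGE_SIZE,
SKETCH_NO_NOTIFY, notify_would_fire() and sentinel_demo() are hypothetical,
and the bounds check is a guess at the general shape of such a check. It
shows that converting -1 to an unsigned 32-bit value gives UINT32_MAX, which
a bounds check rejects for ordinary mappings even though nothing tests for
the sentinel explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_NO_NOTIFY (-1)   /* hypothetical "no clear-byte notify" sentinel */

/*
 * Hypothetical bounds check, not copied from xen-gnttab or gntdev.
 * Returns 1 if a clear-byte notification at `offset` would be accepted for
 * a mapping of `count` pages, 0 if the offset falls outside the mapping.
 * Note that there is no explicit test for the sentinel value.
 */
int notify_would_fire(uint32_t offset, uint32_t count)
{
    uint64_t mapped_bytes = (uint64_t)count * SKETCH_PAGE_SIZE;
    return (uint64_t)offset < mapped_bytes;
}

/* Self-check of the behaviour described in the quoted text. */
void sentinel_demo(void)
{
    /* Converting -1 to an unsigned 32-bit value yields UINT32_MAX. */
    uint32_t sentinel = (uint32_t)SKETCH_NO_NOTIFY;
    assert(sentinel == UINT32_MAX);

    /* For an ordinary one-page mapping the sentinel is out of bounds. */
    assert(notify_would_fire(sentinel, 1) == 0);

    /* A genuine in-range offset is accepted. */
    assert(notify_would_fire(16, 1) == 1);

    /* In this sketch, only a mapping of 4 GiB or more lets it through. */
    assert(notify_would_fire(sentinel, 1u << 20) == 1);
}
```

In this sketch the sentinel is rejected only as a side effect of the range
check, which is why the quoted text says it stays out of bounds for "most"
requests. An explicit comparison against the sentinel before the range check
would make the intent clear and would not depend on the mapping size.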