From owner-freebsd-xen@FreeBSD.ORG Mon Feb 9 09:39:54 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id F2544E96 for ; Mon, 9 Feb 2015 09:39:54 +0000 (UTC) Received: from mail-pd0-f172.google.com (mail-pd0-f172.google.com [209.85.192.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CB18D64C for ; Mon, 9 Feb 2015 09:39:54 +0000 (UTC) Received: by pdev10 with SMTP id v10so7429784pde.10 for ; Mon, 09 Feb 2015 01:39:47 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:date:message-id:subject:from :to:content-type; bh=l0ihp9yQ9LwkDtmtVtNhc3Q6zR2/PBPrJO28+lRXItQ=; b=jjHbLAfCJ8skhSEDJEJ30gHiyuDD8ALbIX58/qjuwCjg2hljdS7uXq0v7peILVm/tk 3KVtfzdJjrADavI9RYwsU0DwIBw9bMEI9NZPOVrcpLl12BUrsFZhWXFOIj4xonF34860 0DcWPa2rPVUWlIyBpVsiY/A+LPYGgb3V9CZvS7iKZnfXZi0s2xXy0rHkp+1begNQu21I VzhnG/61GgMtwD1J1jliN7M92S0YOcAVfroI44BUSCWM1FZqnAptbX17BRUZwvkr/v5q +LtjmnZ0Y1r7f+mqLRWvhxSUQ7fM3cGzAkEgJEGxaQr58mR6114bhXGCQo2DsahZDrXw TJWA== X-Gm-Message-State: ALoCoQmXwHnPh1/CjXSfzlPh4Hk8tjaIz8xSztaIncAuIFTgCTJKgOacoq+i0+GOamJzHF96n6V2 MIME-Version: 1.0 X-Received: by 10.68.135.37 with SMTP id pp5mr27267929pbb.105.1423474787740; Mon, 09 Feb 2015 01:39:47 -0800 (PST) Sender: andy@fud.org.nz Received: by 10.70.129.238 with HTTP; Mon, 9 Feb 2015 01:39:47 -0800 (PST) Date: Mon, 9 Feb 2015 22:39:47 +1300 X-Google-Sender-Auth: RmMQUhwHYsJR2GOUR_gtoKApb30 Message-ID: Subject: xenstore memory issue From: Andrew Thompson To: freebsd-xen@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: 
freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 09 Feb 2015 09:39:55 -0000

Hi,

I have three VMs with Rackspace and one is behaving oddly with xenstore
memory consumption. Here are the kernel versions and vmstat -m results.

FreeBSD us.e.com 10.0-RELEASE-p9 FreeBSD 10.0-RELEASE-p9 #0: Mon Sep 15 14:35:52 UTC 2014 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

xenbus 16 2K - 86 16,32,64,256
xenstore 409 4837K - 38424052 16,32,64,128,256
xen_hvm 2 8K - 2 4096
xen_intr 25 4K - 25 128

FreeBSD uk.e.com 10.0-RELEASE-p12 FreeBSD 10.0-RELEASE-p12 #0: Tue Nov 4 05:07:17 UTC 2014 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

xenbus 11 2K - 83 16,32,64,256
xenstore 198 2317K - 43428137 16,32,64,128,256,512
xen_hvm 2 8K - 2 4096
xen_intr 24 3K - 24 128

FreeBSD au.e.com 10.0-RELEASE-p12 FreeBSD 10.0-RELEASE-p12 #0: Tue Nov 4 05:07:17 UTC 2014 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

xenbus 11 2K - 83 16,32,64,256
xenstore 8477 101653K - 55249 16,32,64,128,256,512
xen_hvm 2 8K - 2 4096
xen_intr 14 2K - 14 128

As you can see the third VM is using 100MB in xenstore memory and it seems
to be climbing by 1-2MB per hour. Eventually all the processes go in to
pfault state and it grinds to a halt.

How should I be debugging this? Is it either a local leak or the Xen host
is to blame?

cheers, Andew From owner-freebsd-xen@FreeBSD.ORG Mon Feb 9 10:28:44 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0DFD3CDA; Mon, 9 Feb 2015 10:28:44 +0000 (UTC) Received: from SMTP.CITRIX.COM (smtp.citrix.com [66.165.176.89]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mail.citrix.com", Issuer "Cybertrust Public SureServer SV CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 1CD7BBCB; Mon, 9 Feb 2015 10:28:42 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.09,543,1418083200"; d="scan'208";a="224018044" Received: from [IPv6:::1] (10.80.16.47) by smtprelay.citrix.com (10.13.107.80) with Microsoft SMTP Server id 14.3.210.2; Mon, 9 Feb 2015 05:28:38 -0500 Message-ID: <54D88BD5.7050703@citrix.com> Date: Mon, 9 Feb 2015 11:28:37 +0100 From: =?windows-1252?Q?Roger_Pau_Monn=E9?= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Andrew Thompson , Subject: Re: xenstore memory issue References: In-Reply-To: Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-DLP: MIA2 X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 09 Feb 2015 10:28:44 -0000 Hello, El 09/02/15 a les 10.39, Andrew Thompson ha escrit: > Hi, > > > I have three VMs with Rackspace and one is behaving oddly with xenstore > memory consumption. Here are the kernel versions and vmstat -m results. 
>
> FreeBSD us.e.com 10.0-RELEASE-p9 FreeBSD 10.0-RELEASE-p9 #0: Mon Sep 15
> 14:35:52 UTC 2014
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
> amd64
>
> xenbus 16 2K - 86 16,32,64,256
> xenstore 409 4837K - 38424052 16,32,64,128,256
> xen_hvm 2 8K - 2 4096
> xen_intr 25 4K - 25 128
>
>
> FreeBSD uk.e.com 10.0-RELEASE-p12 FreeBSD 10.0-RELEASE-p12 #0: Tue Nov 4
> 05:07:17 UTC 2014
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
> amd64
>
> xenbus 11 2K - 83 16,32,64,256
> xenstore 198 2317K - 43428137 16,32,64,128,256,512
> xen_hvm 2 8K - 2 4096
> xen_intr 24 3K - 24 128
>
>
> FreeBSD au.e.com 10.0-RELEASE-p12 FreeBSD 10.0-RELEASE-p12 #0: Tue Nov 4
> 05:07:17 UTC 2014
> root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
> amd64
>
> xenbus 11 2K - 83 16,32,64,256
> xenstore 8477 101653K - 55249 16,32,64,128,256,512
> xen_hvm 2 8K - 2 4096
> xen_intr 14 2K - 14 128
>
>
> As you can see the third VM is using 100MB in xenstore memory and it seems
> to be climbing by 1-2MB per hour. Eventually all the processes go in to
> pfault state and it grinds to a halt.

That's certainly weird, are you doing something different on this VM as
compared to the others? Did you hot-add a nic, disk or ballooned memory?

Has the VM been saved/restored or migrated?

Tracking down this kind of xenstore leaks can be difficult without
having a way to reproduce them.

>
> How should I be debugging this? Is it either a local leak or the Xen host
> is to blame?

Even if the host is doing something weird we should be able to cope with
it, or at least detect it.

Roger.

From owner-freebsd-xen@FreeBSD.ORG Mon Feb 9 23:07:21 2015 Return-Path: Delivered-To: freebsd-xen@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D8B3714C for ; Mon, 9 Feb 2015 23:07:21 +0000 (UTC) Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2001:1900:2254:206a::16:76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BF4D7E1E for ; Mon, 9 Feb 2015 23:07:21 +0000 (UTC) Received: from bugs.freebsd.org ([127.0.1.118]) by kenobi.freebsd.org (8.14.9/8.14.9) with ESMTP id t19N7LRm073991 for ; Mon, 9 Feb 2015 23:07:21 GMT (envelope-from bugzilla-noreply@freebsd.org) From: bugzilla-noreply@freebsd.org To: freebsd-xen@FreeBSD.org Subject: [Bug 188369] [xen] [panic] FreeBSD 10 XENHVM panic under NetBSD Dom0 (xn_txeof: WARNING: response is -1) Date: Mon, 09 Feb 2015 23:07:22 +0000 X-Bugzilla-Reason: AssignedTo X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: Base System X-Bugzilla-Component: kern X-Bugzilla-Version: unspecified X-Bugzilla-Keywords: X-Bugzilla-Severity: Affects Only Me X-Bugzilla-Who: miguelmclara@gmail.com X-Bugzilla-Status: In Progress X-Bugzilla-Priority: Normal X-Bugzilla-Assigned-To: freebsd-xen@FreeBSD.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 09 
Feb 2015 23:07:21 -0000 https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=188369 --- Comment #11 from miguelmclara@gmail.com --- Is the kgbd info helpful? Any extra info I can add? -- You are receiving this mail because: You are the assignee for the bug. From owner-freebsd-xen@FreeBSD.ORG Tue Feb 10 03:25:33 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1553B2CE for ; Tue, 10 Feb 2015 03:25:33 +0000 (UTC) Received: from mail-pa0-f41.google.com (mail-pa0-f41.google.com [209.85.220.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id DC7FF7BB for ; Tue, 10 Feb 2015 03:25:32 +0000 (UTC) Received: by mail-pa0-f41.google.com with SMTP id kx10so13846917pab.0 for ; Mon, 09 Feb 2015 19:25:26 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=0tcEZbzYiX3s9ZPznD062ay6C13/GHCWqvDlPKQG0KQ=; b=a23TCTKYuVd0RPOtw7i7bSskwRiKVynYKZ6vS6f/eodZ9fnZ7+Ef6ZRvnqHT7uhQUz V7hbr8h4MDJ4qan3D5q0RVfXpKx0T4hVPwFYb+s7i8tiYCD80/0N1nvv4vlrtcgeXxpS 0n8LOPltcpZHT+/IQFBrLJR0hK6/SWNwvz7EDbRLA3Vc49Be0hR+Ch6LfhkOG4c+5+8w WbzQ8whH7YzB0HuNo2FaxDCSJ88echBwj4tCdkxmaWy6mZop6b4opok2/M/np4VqknWd +a6qxKpMv3jzhAx5oXIzdLNm/YzzrAw5N70YFmlpNdbuiqqvUmEH1xui1DASnBvJ6+/R t1Iw== X-Gm-Message-State: ALoCoQmMywXrX9J793QsWfYcC7RqnQfvcj5g76kCD9N9AfKugIlO4vI4xpzLPOiOW6RXI/fCIF4U MIME-Version: 1.0 X-Received: by 10.68.253.101 with SMTP id zz5mr33858182pbc.50.1423538726530; Mon, 09 Feb 2015 19:25:26 -0800 (PST) Sender: andy@fud.org.nz Received: by 10.70.129.238 with HTTP; Mon, 9 Feb 2015 
19:25:26 -0800 (PST) In-Reply-To: <54D88BD5.7050703@citrix.com> References: <54D88BD5.7050703@citrix.com> Date: Tue, 10 Feb 2015 16:25:26 +1300 X-Google-Sender-Auth: xHBtmjT3NeqMyWZ4RRXSrF7CUfE Message-ID: Subject: Re: xenstore memory issue From: Andrew Thompson To: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= , freebsd-xen@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 10 Feb 2015 03:25:33 -0000

On 9 February 2015 at 23:28, Roger Pau Monné wrote:

> Hello,
>
> El 09/02/15 a les 10.39, Andrew Thompson ha escrit:
> > Hi,
> >
> >
> > I have three VMs with Rackspace and one is behaving oddly with xenstore
> > memory consumption. Here are the kernel versions and vmstat -m results.
> >
> > As you can see the third VM is using 100MB in xenstore memory and it seems
> > to be climbing by 1-2MB per hour. Eventually all the processes go in to
> > pfault state and it grinds to a halt.
>
> That's certainly weird, are you doing something different on this VM as
> compared to the others? Did you hot-add a nic, disk or ballooned memory?
>
> Has the VM been saved/restored or migrated?
>
> Tracking down this kind of xenstore leaks can be difficult without
> having a way to reproduce them.
>
>
A bit of trial and error with dtrace has narrowed this down. I can cause
the leak by just opening /dev/xen/xenstore

#include <fcntl.h>

int main() {
    open("/dev/xen/xenstore", O_RDWR, 0);
}

# vmstat -m | grep xenstore; ./open; vmstat -m | grep xenstore
xenstore 8739 104797K - 56078 16,32,64,128,256,512
xenstore 8740 104809K - 56079 16,32,64,128,256,512

Using dtrace probes I can see that xs_dev_close is never called.

# dtrace -n 'fbt::xs_dev_open:{} fbt::xs_dev_close: {} dtmalloc::xenstore: {}' # ./open CPU FUNCTION 0 -> xs_dev_open 0 | xenstore:malloc 0 <- xs_dev_open This is on 10.0-RELEASE-p12. I get the same result with `/usr/local/bin/xenstore-read domid` but the above c program is enough too. regards, Andrew From owner-freebsd-xen@FreeBSD.ORG Tue Feb 10 09:22:37 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 497F7790 for ; Tue, 10 Feb 2015 09:22:37 +0000 (UTC) Received: from mail-pd0-f172.google.com (mail-pd0-f172.google.com [209.85.192.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 1B389D5D for ; Tue, 10 Feb 2015 09:22:36 +0000 (UTC) Received: by pdev10 with SMTP id v10so15325408pde.10 for ; Tue, 10 Feb 2015 01:22:35 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=wdT2EsAtPrnoIr16FWDXElR10gc3gmOJPKSxt3sKNrU=; b=IBug9AzG9nFQCE2cKn/+kxxnRV2RaEUC9t56EqBNW4vwrS/gcQoEmP2UaAPN44M8tp JqqPsEtMMP7oxXnczQDxv3u974WBKrA12uoPz69ZVuA7uDw5q1ZcP/w6D34IqD0kt9mu t1HcRyfJmDNgr9RFfn6Z+koZx4et0sNfjPBYXew4anUQgvqnmHTGiAD3cUhCCMfo1UTl LX5IbSEMdssa/VaGzYxAlNvdHKTkSfI3D8IOtZIv7ALc8v0btXrlNs1og8BPzZ5tFlOk a6P5V90vtOHlkBE+Ngp3rjrJBu//kfAOa1AkIs93fF+XvCnja6j4IFetEpBvvJQJ8DVx Ly/w== X-Gm-Message-State: ALoCoQlwk8KXNY7xR4l6wzxx6nKBcEgYowksVG0Fg4j7iYlbo6Vi4LhOjJeTrh+0XgZrdQVj/gBW MIME-Version: 1.0 X-Received: by 10.70.41.231 with SMTP id i7mr35709372pdl.102.1423560155345; Tue, 10 Feb 2015 01:22:35 -0800 (PST) Sender: andy@fud.org.nz Received: by 10.70.129.238 with HTTP; Tue, 10 
Feb 2015 01:22:35 -0800 (PST) In-Reply-To: References: <54D88BD5.7050703@citrix.com> Date: Tue, 10 Feb 2015 22:22:35 +1300 X-Google-Sender-Auth: VHMkaFlrY9uAcYuo0rzjRpv2F5U Message-ID: Subject: Re: xenstore memory issue From: Andrew Thompson To: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= , freebsd-xen@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 10 Feb 2015 09:22:37 -0000

On 10 February 2015 at 16:25, Andrew Thompson wrote:

> On 9 February 2015 at 23:28, Roger Pau Monné wrote:
>
>> Hello,
>>
>> El 09/02/15 a les 10.39, Andrew Thompson ha escrit:
>> > Hi,
>> >
>> >
>> > I have three VMs with Rackspace and one is behaving oddly with xenstore
>> > memory consumption. Here are the kernel versions and vmstat -m results.
>> >
>> > As you can see the third VM is using 100MB in xenstore memory and it seems
>> > to be climbing by 1-2MB per hour. Eventually all the processes go in to
>> > pfault state and it grinds to a halt.
>>
>> That's certainly weird, are you doing something different on this VM as
>> compared to the others? Did you hot-add a nic, disk or ballooned memory?
>>
>> Has the VM been saved/restored or migrated?
>>
>> Tracking down this kind of xenstore leaks can be difficult without
>> having a way to reproduce them.
>>
>>
> A bit of trial and error with dtrace has narrowed this down. I can cause
> the leak by just opening /dev/xen/xenstore
>
> #include <fcntl.h>
>
> int main() {
>     open("/dev/xen/xenstore", O_RDWR, 0);
> }
>
> # vmstat -m | grep xenstore; ./open; vmstat -m | grep xenstore
> xenstore 8739 104797K - 56078 16,32,64,128,256,512
> xenstore 8740 104809K - 56079 16,32,64,128,256,512
>
>
> Using dtrace probes I can see that xs_dev_close is never called.
>

I think I have worked this out. Rackspace use an agent called nova-agent
which keeps an open handle on /dev/xen/xenstore. Since xenstore isn't using
the D_TRACKCLOSE flag it will not call d_close until the last reference is
dropped. Since xenstore expects to malloc/free on open and close this
assumption breaks and will leak memory.

If I stop nova-agent I can see xs_dev_close being called and the memory
freed when testing with xenstore-read. The correct solution seems to be to
set D_TRACKCLOSE if I understand its purpose correctly.

regards,
Andrew

From owner-freebsd-xen@FreeBSD.ORG Tue Feb 10 16:22:49 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 235B12CF; Tue, 10 Feb 2015 16:22:49 +0000 (UTC) Received: from SMTP.CITRIX.COM (smtp.citrix.com [66.165.176.89]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mail.citrix.com", Issuer "Cybertrust Public SureServer SV CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E4875224; Tue, 10 Feb 2015 16:22:47 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.09,551,1418083200"; d="scan'208";a="224399965" Received: from [IPv6:::1] (10.80.16.47) by smtprelay.citrix.com (10.13.107.80) with Microsoft SMTP Server id 14.3.210.2; Tue, 10 Feb 2015 11:22:24 -0500 Message-ID: <54DA303F.9020203@citrix.com> Date: Tue, 10 Feb 2015 17:22:23 +0100 From: =?UTF-8?B?Um9nZXIgUGF1IE1vbm7DqQ==?= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X
10.9; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Andrew Thompson , Subject: Re: xenstore memory issue References: <54D88BD5.7050703@citrix.com> In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit X-DLP: MIA2 X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 10 Feb 2015 16:22:49 -0000

Hello Andrew,

El 10/02/15 a les 10.22, Andrew Thompson ha escrit:
> On 10 February 2015 at 16:25, Andrew Thompson wrote:
>
>> On 9 February 2015 at 23:28, Roger Pau Monné wrote:
>> A bit of trial and error with dtrace has narrowed this down. I can cause
>> the leak by just opening /dev/xen/xenstore
>>
>> #include <fcntl.h>
>>
>> int main() {
>>     open("/dev/xen/xenstore", O_RDWR, 0);
>> }
>>
>> # vmstat -m | grep xenstore; ./open; vmstat -m | grep xenstore
>> xenstore 8739 104797K - 56078 16,32,64,128,256,512
>> xenstore 8740 104809K - 56079 16,32,64,128,256,512
>>
>>
>> Using dtrace probes I can see that xs_dev_close is never called.
>>
>
> I think I have worked this out. Rackspace use an agent called nova-agent
> which keeps an open handle on /dev/xen/xenstore. Since xenstore isn't using
> the D_TRACKCLOSE flag it will not call d_close until the last reference is
> dropped. Since xenstore expects to malloc/free on open and close this
> assumption breaks and will leak memory.
>
> If I stop nova-agent I can see xs_dev_close being called and the memory
> freed when testing with xenstore-read. The correct solution seems to be to
> set D_TRACKCLOSE if I understand its purpose correctly.

Thanks for doing all this legwork! IMHO the best solution is to switch
xenstore dev to use cdevpriv in order to store each client data. What we
are doing right now (storing client data in dev->si_drv1) is plain
wrong.
I've uploaded two patches (one for HEAD and one for stable/10) so that you can try it also, please report back whether this fixes your problem or not: https://people.freebsd.org/~royger/xenstore_fix/ Roger. From owner-freebsd-xen@FreeBSD.ORG Thu Feb 12 23:46:30 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D014FE9D for ; Thu, 12 Feb 2015 23:46:30 +0000 (UTC) Received: from mail-pd0-f177.google.com (mail-pd0-f177.google.com [209.85.192.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id A0CACF08 for ; Thu, 12 Feb 2015 23:46:30 +0000 (UTC) Received: by pdno5 with SMTP id o5so15302817pdn.8 for ; Thu, 12 Feb 2015 15:46:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=810GG/aDHPTV/d4qW50T+RtXoWR/Gl6LpG/xo7CBbec=; b=m97tgzVB4kWFJ/PEs4Ye4E2OpfgNpAoqy61kDRBvpk4/DJS8CUpI5I4UgnS5EXlvoC Zw5mS/JAXj2OjbpjZLDMsDhZLdtJVbWRMmcq6iy5kR2WM2RPDCgnQtRxYYu75KmfK//3 KtuvHcWHb7c92lDQ9xix1OcGKgNDnD+Pz3c8oj/sdjvsdQ5nqESwGIkP0IrXABCLNStg HbHAi+t/qJQTI5hS6jn1tTsjUI5YWa+2wdVwG0xRiAJ1qjYsG7e/uDAAWP3eCn7z2Pji jrXMUkq1Uf4x59Srqk2Sq4Jc50KzBTgoPCm7xmOQ5PqjjVQXz7QuCFemxTlnvQoPzINs QRtg== X-Gm-Message-State: ALoCoQmIRSP6twdE7RXlW3e9G1N88bJ7u2auNCTT7a0Z0FVMdbQsKVXvwoJQ3OJdMKHFQYr8YhIM MIME-Version: 1.0 X-Received: by 10.70.44.132 with SMTP id e4mr10623987pdm.58.1423784782841; Thu, 12 Feb 2015 15:46:22 -0800 (PST) Sender: andy@fud.org.nz Received: by 10.70.129.238 with HTTP; Thu, 12 Feb 2015 15:46:22 -0800 (PST) In-Reply-To: <54DA303F.9020203@citrix.com> References: 
<54D88BD5.7050703@citrix.com> <54DA303F.9020203@citrix.com> Date: Fri, 13 Feb 2015 12:46:22 +1300 X-Google-Sender-Auth: _58bbvuH7DmI-bYZ1rWmdvLqPiI Message-ID: Subject: Re: xenstore memory issue From: Andrew Thompson To: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-xen@freebsd.org X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 12 Feb 2015 23:46:30 -0000

On 11 February 2015 at 05:22, Roger Pau Monné wrote:

> Hello Andrew,
>
> El 10/02/15 a les 10.22, Andrew Thompson ha escrit:
> > On 10 February 2015 at 16:25, Andrew Thompson wrote:
> >
> >> On 9 February 2015 at 23:28, Roger Pau Monné wrote:
> >> A bit of trial and error with dtrace has narrowed this down. I can cause
> >> the leak by just opening /dev/xen/xenstore
>
> Thanks for doing all this legwork! IMHO the best solution is to switch
> xenstore dev to use cdevpriv in order to store each client data. What we
> are doing right now (storing client data in dev->si_drv1) is plain
> wrong. I've uploaded two patches (one for HEAD and one for stable/10) so
> that you can try it also, please report back whether this fixes your
> problem or not:
>
> https://people.freebsd.org/~royger/xenstore_fix/
>
>
I have tested this on 10.0 and it does fix the issue. After 36000
allocations there are just 6 active at 13K

xenstore 6 13K - 36699 16,32,64,128,256,512

I looked into why this only affected one of my many VMs as they run
xe-update-guest-attrs and nova-agent which should have triggered the leak.
They are all stuck on xsdread when reading from the xenstore so the update
loop has stopped.

UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND 0 13250 13249 0 20 0 16552 1968 xsdread S - 5:57.60 /usr/local/bin/xenstore-read domid (xenstore) Is there an issue with the xs_dev_read and xs_queue_reply logic or would the Xen host be timing out? regards, Andrew From owner-freebsd-xen@FreeBSD.ORG Fri Feb 13 16:31:33 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D8B945E3; Fri, 13 Feb 2015 16:31:33 +0000 (UTC) Received: from SMTP02.CITRIX.COM (smtp02.citrix.com [66.165.176.63]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (Client CN "mail.citrix.com", Issuer "Cybertrust Public SureServer SV CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id E473779F; Fri, 13 Feb 2015 16:31:32 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.09,571,1418083200"; d="scan'208";a="226779481" Received: from [IPv6:::1] (10.80.16.47) by smtprelay.citrix.com (10.13.107.78) with Microsoft SMTP Server id 14.3.210.2; Fri, 13 Feb 2015 11:31:24 -0500 Message-ID: <54DE26DB.90000@citrix.com> Date: Fri, 13 Feb 2015 17:31:23 +0100 From: =?UTF-8?B?Um9nZXIgUGF1IE1vbm7DqQ==?= User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Andrew Thompson Subject: Re: xenstore memory issue References: <54D88BD5.7050703@citrix.com> <54DA303F.9020203@citrix.com> In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit X-DLP: MIA1 Cc: freebsd-xen@freebsd.org X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Feb 2015 16:31:33 -0000 El 13/02/15 a les 0.46, Andrew Thompson ha escrit: > On 11 
February 2015 at 05:22, Roger Pau Monné wrote:
>
>> Hello Andrew,
>>
>> El 10/02/15 a les 10.22, Andrew Thompson ha escrit:
>>> On 10 February 2015 at 16:25, Andrew Thompson wrote:
>>>
>>>> On 9 February 2015 at 23:28, Roger Pau Monné wrote:
>>>> A bit of trial and error with dtrace has narrowed this down. I can cause
>>>> the leak by just opening /dev/xen/xenstore
>>
>> Thanks for doing all this legwork! IMHO the best solution is to switch
>> xenstore dev to use cdevpriv in order to store each client data. What we
>> are doing right now (storing client data in dev->si_drv1) is plain
>> wrong. I've uploaded two patches (one for HEAD and one for stable/10) so
>> that you can try it also, please report back whether this fixes your
>> problem or not:
>>
>> https://people.freebsd.org/~royger/xenstore_fix/
>>
>>
> I have tested this on 10.0 and it does fix the issue. After 36000
> allocations there are just 6 active at 13K
>
> xenstore 6 13K - 36699 16,32,64,128,256,512
>
>
> I looked into why this only affected one of my many VMs as they run
> xe-update-guest-attrs and nova-agent which should have triggered the leak.
> They are all stuck on xsdread when reading from the xenstore so the update
> loop has stopped.
>
> UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
> 0 13250 13249 0 20 0 16552 1968 xsdread S - 5:57.60 /usr/local/bin/xenstore-read domid (xenstore)

Is this with my patch applied? AFAICT xsdread means it's waiting on the
address at dev->si_drv1, which is completely wrong because every time
the device was opened this address changed. With my patch applied it
should no longer be an issue.

Roger.

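The difference discussed in this thread, one si_drv1 slot shared by every
opener versus state attached to each individual open, can be modelled in a
few lines of userland C. This is only a sketch under stated assumptions: it
is not the kernel code and not the patch linked above, and every name in it
(shared_dev, state_alloc, leaked_shared, leaked_private) is hypothetical.
Allocations are represented by a counter rather than real malloc calls so
that the model itself does not leak. In the kernel, the per-open variant
would presumably be built on the devfs_cdevpriv(9) interface that the
message above refers to as cdevpriv.

```c
/*
 * Userland model only: hypothetical names, not FreeBSD kernel APIs.
 * "live" counts per-client states that are currently allocated.
 */
static int live;

static int state_alloc(void)
{
    static int next_id = 1;

    live++;
    return next_id++;
}

static void state_free(int id)
{
    (void)id;
    live--;
}

/* Scheme A: one slot shared by all opens; close hook fires on last close only. */
struct shared_dev {
    int slot;   /* most recent opener's state; earlier ones are overwritten */
    int opens;  /* number of handles currently open */
};

static void shared_open(struct shared_dev *d)
{
    d->slot = state_alloc();
    d->opens++;
}

static void shared_close(struct shared_dev *d)
{
    if (--d->opens == 0) {
        state_free(d->slot);
        d->slot = 0;
    }
}

/* Scheme B: each open handle owns its state and frees it on its own close. */
static int private_open(void) { return state_alloc(); }
static void private_close(int state) { state_free(state); }

/* States stranded after n short-lived opens while one agent stays open. */
int leaked_shared(int n)
{
    struct shared_dev d = { 0, 0 };
    int before = live;

    shared_open(&d);                /* long-lived agent, never closes */
    for (int i = 0; i < n; i++) {
        shared_open(&d);
        shared_close(&d);           /* not the last close: nothing freed */
    }
    return live - before - 1;       /* minus the one state still in the slot */
}

int leaked_private(int n)
{
    int before = live;
    int agent = private_open();     /* long-lived agent, never closes */

    for (int i = 0; i < n; i++)
        private_close(private_open());
    (void)agent;
    return live - before - 1;       /* minus the agent's own state */
}
```

With one long-lived opener standing in for nova-agent and 1000 short-lived
opens, leaked_shared(1000) returns 1000 while leaked_private(1000) returns 0,
which mirrors the behaviour reported before and after the patch.
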
From owner-freebsd-xen@FreeBSD.ORG Fri Feb 13 19:22:56 2015 Return-Path: Delivered-To: freebsd-xen@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 04374C57 for ; Fri, 13 Feb 2015 19:22:56 +0000 (UTC) Received: from mail-pd0-f180.google.com (mail-pd0-f180.google.com [209.85.192.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id C77B1E2F for ; Fri, 13 Feb 2015 19:22:55 +0000 (UTC) Received: by pdbfp1 with SMTP id fp1so21222617pdb.9 for ; Fri, 13 Feb 2015 11:22:54 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=3AkxivkbhgnXn1/pau7mkSIYQZ+MggQPOpcs9Opqxz0=; b=Uz8rFHiwFBcgHPJBgjhljo0xPmS2K98Qkua4hu+v46YoG/GehCPP04AAZqt3ocm4Gf tLwvWYW5UonrSBkTqKjy6QsybCvsAUD7QPdarMXyFPD+kAVJWMnjaIJb5F0d57utYPUx 1SG6vyL+w5aaMpqZWdKBiDd9277v3WLR5TwQCbEpyUfZ6xG8hV39zD9eM8rDACDuYAGl hX7q5Ke0KA8QlyTILP2uu5qCB5gJ3SmW2MV2b7x3ge5DvI3QmdlPb7Nqtr2S5l+0UO+K cWsr83B+4XJKkqPNiY/5ys6cl8bhlAsRgzHAoAW7eogt96MrN3d9kuen+b83D3HgVvq+ ++wA== X-Gm-Message-State: ALoCoQlEmlKBqvnomhW826Vegj4gY8AvQdjveoxAguBNmGCsWnQJNsy9DvCCrxyToS0z8expmxh9 MIME-Version: 1.0 X-Received: by 10.66.161.170 with SMTP id xt10mr17952838pab.14.1423854963282; Fri, 13 Feb 2015 11:16:03 -0800 (PST) Sender: andy@fud.org.nz Received: by 10.70.129.238 with HTTP; Fri, 13 Feb 2015 11:16:03 -0800 (PST) In-Reply-To: <54DE26DB.90000@citrix.com> References: <54D88BD5.7050703@citrix.com> <54DA303F.9020203@citrix.com> <54DE26DB.90000@citrix.com> Date: Sat, 14 Feb 2015 08:16:03 +1300 X-Google-Sender-Auth: 7fvujF7sPr4BQzOJIqZ85a9iAEs Message-ID: Subject: 
Re: xenstore memory issue From: Andrew Thompson To: =?UTF-8?Q?Roger_Pau_Monn=C3=A9?= Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-xen@freebsd.org X-BeenThere: freebsd-xen@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Discussion of the freebsd port to xen - implementation and usage List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 13 Feb 2015 19:22:56 -0000

On 14 February 2015 at 05:31, Roger Pau Monné wrote:

> El 13/02/15 a les 0.46, Andrew Thompson ha escrit:
> > On 11 February 2015 at 05:22, Roger Pau Monné wrote:
> >
> >> Hello Andrew,
> >>
> >> El 10/02/15 a les 10.22, Andrew Thompson ha escrit:
> >>> On 10 February 2015 at 16:25, Andrew Thompson wrote:
> >>>
> >>>> On 9 February 2015 at 23:28, Roger Pau Monné wrote:
> >>>> A bit of trial and error with dtrace has narrowed this down. I can cause
> >>>> the leak by just opening /dev/xen/xenstore
> >>
> >> Thanks for doing all this legwork! IMHO the best solution is to switch
> >> xenstore dev to use cdevpriv in order to store each client data. What we
> >> are doing right now (storing client data in dev->si_drv1) is plain
> >> wrong. I've uploaded two patches (one for HEAD and one for stable/10) so
> >> that you can try it also, please report back whether this fixes your
> >> problem or not:
> >>
> >> https://people.freebsd.org/~royger/xenstore_fix/
> >>
> >>
> > I have tested this on 10.0 and it does fix the issue. After 36000
> > allocations there are just 6 active at 13K
> >
> > xenstore 6 13K - 36699 16,32,64,128,256,512
> >
> >
> > I looked into why this only affected one of my many VMs as they run
> > xe-update-guest-attrs and nova-agent which should have triggered the leak.
> > They are all stuck on xsdread when reading from the xenstore so the update
> > loop has stopped.
> >
> > UID PID PPID CPU PRI NI VSZ RSS MWCHAN STAT TT TIME COMMAND
> > 0 13250 13249 0 20 0 16552 1968 xsdread S - 5:57.60 /usr/local/bin/xenstore-read domid (xenstore)
>
> Is this with my patch applied? AFAICT xsdread means it's waiting on the
> address at dev->si_drv1, which is completely wrong because every time
> the device was opened this address changed. With my patch applied it
> should no longer be an issue.
>
>
No, I was looking at the other unpatched servers, I didn't realise the two
issues were linked. Please commit!

cheers,
Andrew
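
For anyone wanting to put a number on the growth seen earlier in this
thread, two vmstat -m rows taken before and after a single open can be
compared mechanically. A minimal sketch, assuming the column order is Type,
InUse, MemUse, HighUse, Requests, Size(s), which is my reading of the rows
quoted in the thread; the struct and function names (mstat,
parse_vmstat_row, memuse_delta_kb) are made up for illustration.

```c
#include <stdio.h>

/* One `vmstat -m` row: Type InUse MemUse HighUse Requests Size(s). */
struct mstat {
    char type[32];
    long inuse;      /* allocations currently in use */
    long memuse_kb;  /* memory in use, in kilobytes */
    long requests;   /* total allocation requests so far */
};

/* Parse e.g. "xenstore 8739 104797K - 56078 16,32,64"; 0 on success, -1 on error. */
int parse_vmstat_row(const char *line, struct mstat *out)
{
    /* %*s skips the HighUse column, which is "-" in the rows shown here. */
    int n = sscanf(line, "%31s %ld %ldK %*s %ld",
        out->type, &out->inuse, &out->memuse_kb, &out->requests);

    return n == 4 ? 0 : -1;
}

/* Growth in kilobytes between two samples of the same malloc type. */
long memuse_delta_kb(const struct mstat *before, const struct mstat *after)
{
    return after->memuse_kb - before->memuse_kb;
}
```

Fed the two xenstore rows from the ./open run reported earlier in the
thread, this gives one extra allocation in use and 12K more memory for a
single open of the device.
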