Date: Thu, 24 Oct 2013 23:17:59 +0100
From: Roger Pau Monné <roger.pau@citrix.com>
To: Konstantin Belousov <kostikbel@gmail.com>
Cc: Outback Dingo <outbackdingo@gmail.com>, current@freebsd.org
Subject: Re: CUREENT issue with ballon.c
Message-ID: <52699C97.7070105@citrix.com>
In-Reply-To: <20131024211507.GD10625@kib.kiev.ua>
References: <CAKYr3zyVXxL9A9wEY_KL8sp1yivAUG2gpK8daUMu%2BbeyU2b8pw@mail.gmail.com> <5268F37E.9050004@citrix.com> <CAKYr3zw5Se4Z-fZLgp9e9nBMxWNhXuwt1%2BTAqZJeozt=qhpmpQ@mail.gmail.com> <526986E0.2050807@citrix.com> <20131024211507.GD10625@kib.kiev.ua>

On 24/10/13 22:15, Konstantin Belousov wrote:
> On Thu, Oct 24, 2013 at 09:45:20PM +0100, Roger Pau Monné wrote:
>> On 24/10/13 13:01, Outback Dingo wrote:
>>> On Thu, Oct 24, 2013 at 6:16 AM, Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>
>>>     On 24/10/13 03:02, Outback Dingo wrote:
>>>     > --- trap 0, rip = 0, rsp = 0xfffffe00002c6b70, rbp = 0 ---
>>>     > uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
>>>     > exclusive sleep mutex balloon_lock (balloon_lock) r = 0 (0xffffffff816e9c58) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:339
>>>     > exclusive sleep mutex balloon_mutex (balloon_mutex) r = 0 (0xffffffff816e9c38) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:373
>>>     > KDB: stack backtrace:
>>>     > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00002c67c0
>>>     > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00002c6870
>>>     > witness_warn() at witness_warn+0x4a8/frame 0xfffffe00002c6930
>>>     > uma_zalloc_arg() at uma_zalloc_arg+0x3b/frame 0xfffffe00002c69a0
>>>     > malloc() at malloc+0x101/frame 0xfffffe00002c69f0
>>>     > balloon_process() at balloon_process+0x44a/frame 0xfffffe00002c6a70
>>>     > fork_exit() at fork_exit+0x84/frame 0xfffffe00002c6ab0
>>>     > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00002c6ab0
>>>     > --- trap 0, rip = 0, rsp = 0xfffffe00002c6b70, rbp = 0 ---
>>>     > uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
>>>     > exclusive sleep mutex balloon_lock (balloon_lock) r = 0 (0xffffffff816e9c58) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:339
>>>     > exclusive sleep mutex balloon_mutex (balloon_mutex) r = 0 (0xffffffff816e9c38) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:373
>>>     > KDB: stack backtrace:
>>>     > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00002c67c0
>>>     > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00002c6870
>>>     > witness_warn() at witness_warn+0x4a8/frame 0xfffffe00002c6930
>>>     > uma_zalloc_arg() at uma_zalloc_arg+0x3b/frame 0xfffffe00002c69a0
>>>     > malloc() at malloc+0x101/frame 0xfffffe00002c69f0
>>>     > balloon_process() at balloon_process+0x44a/frame 0xfffffe00002c6a70
>>>     > fork_exit() at fork_exit+0x84/frame 0xfffffe00002c6ab0
>>>     > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00002c6ab0
>>>     > --- trap 0, rip = 0, rsp = 0xfffffe00002c6b70, rbp = 0 ---
>>>     > uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
>>>
>>>     Did you do anything specific to trigger the crash? Can you explain the
>>>     steps needed to reproduce it?
>>>
>>> I just recompiled a kernel and booted it. The trace scrolls continuously
>>> across the screen and doesn't seem to ever stop.
>>
>> I've tried r257051 and it seems to work fine. Could you please post your
>> Xen version, the config file used to launch the VM and the toolstack used?
>
> Do you have witness enabled in your kernel config?

Yes, but I'm not touching the balloon memory target.

> There is an obvious case of calling malloc(M_WAITOK) while holding both
> balloon_lock and balloon_mutex:
> balloon_process->decrease_reservation->balloon_append.

Yes, I'm aware of that. It shouldn't happen unless you actually trigger a
balloon memory decrease, which should not happen automatically AFAIK. That's
why I was asking whether this was happening without the user specifically
requesting it.

Anyway, this clearly should be fixed and pulled into 10 no matter what
triggered it. I will send a patch as soon as possible.
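
For illustration only, here is a minimal userland sketch of one common way to avoid this class of warning: do the allocation that may sleep before taking the lock, and hold the lock only to link the new entry into the list. This is not the balloon.c code and not the patch mentioned above. It uses pthreads and libc malloc() as stand-ins for the kernel mutex and malloc(M_WAITOK), and every name in it (page_entry, page_list_append, page_list_retrieve, page_list_count) is hypothetical.

```c
/*
 * Userland sketch, NOT FreeBSD kernel code. pthread mutexes and libc
 * malloc() stand in for the kernel primitives. All names are hypothetical.
 */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

struct page_entry {
	unsigned long pfn;		/* page frame number being tracked */
	struct page_entry *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct page_entry *page_list;	/* protected by list_lock */
static size_t page_count;		/* protected by list_lock */

/*
 * Allocate first (the step that could block), then take the lock only
 * for the pointer updates. Returns 0 on success, -1 if allocation fails.
 */
int
page_list_append(unsigned long pfn)
{
	struct page_entry *entry;

	entry = malloc(sizeof(*entry));	/* no lock is held here */
	if (entry == NULL)
		return (-1);
	entry->pfn = pfn;

	pthread_mutex_lock(&list_lock);
	entry->next = page_list;
	page_list = entry;
	page_count++;
	pthread_mutex_unlock(&list_lock);
	return (0);
}

/*
 * Unlink the most recently appended entry under the lock and free it
 * after the lock is dropped. Returns 0 and stores the pfn, or -1 if the
 * list is empty.
 */
int
page_list_retrieve(unsigned long *pfn)
{
	struct page_entry *entry;

	assert(pfn != NULL);

	pthread_mutex_lock(&list_lock);
	entry = page_list;
	if (entry != NULL) {
		page_list = entry->next;
		page_count--;
	}
	pthread_mutex_unlock(&list_lock);

	if (entry == NULL)
		return (-1);
	*pfn = entry->pfn;
	free(entry);			/* again, no lock is held */
	return (0);
}

/* Return the current number of tracked entries. */
size_t
page_list_count(void)
{
	size_t n;

	pthread_mutex_lock(&list_lock);
	n = page_count;
	pthread_mutex_unlock(&list_lock);
	return (n);
}
```

A caller would use page_list_append(pfn) wherever a page needs to be recorded and page_list_retrieve(&pfn) to take one back. In the kernel the same idea could instead be expressed by allocating with M_NOWAIT and handling failure, or by avoiding the allocation altogether. Which of those suits balloon.c is a separate question that this sketch does not try to answer.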