From: Roger Pau Monné
Date: Thu, 24 Oct 2013 23:17:59 +0100
To: Konstantin Belousov
Cc: Outback Dingo, current@freebsd.org
Subject: Re: CUREENT issue with ballon.c
Message-ID: <52699C97.7070105@citrix.com>
In-Reply-To: <20131024211507.GD10625@kib.kiev.ua>
List-Id: Discussions about the use of FreeBSD-current

On 24/10/13 22:15, Konstantin Belousov wrote:
> On Thu, Oct 24, 2013 at 09:45:20PM +0100,
> Roger Pau Monné wrote:
>> On 24/10/13 13:01, Outback Dingo wrote:
>>>
>>> On Thu, Oct 24, 2013 at 6:16 AM, Roger Pau Monné wrote:
>>>
>>> On 24/10/13 03:02, Outback Dingo wrote:
>>> > --- trap 0, rip = 0, rsp = 0xfffffe00002c6b70, rbp = 0 ---
>>> > uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
>>> > exclusive sleep mutex balloon_lock (balloon_lock) r = 0 (0xffffffff816e9c58) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:339
>>> > exclusive sleep mutex balloon_mutex (balloon_mutex) r = 0 (0xffffffff816e9c38) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:373
>>> > KDB: stack backtrace:
>>> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00002c67c0
>>> > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00002c6870
>>> > witness_warn() at witness_warn+0x4a8/frame 0xfffffe00002c6930
>>> > uma_zalloc_arg() at uma_zalloc_arg+0x3b/frame 0xfffffe00002c69a0
>>> > malloc() at malloc+0x101/frame 0xfffffe00002c69f0
>>> > balloon_process() at balloon_process+0x44a/frame 0xfffffe00002c6a70
>>> > fork_exit() at fork_exit+0x84/frame 0xfffffe00002c6ab0
>>> > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00002c6ab0
>>> > --- trap 0, rip = 0, rsp = 0xfffffe00002c6b70, rbp = 0 ---
>>> > uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
>>> > exclusive sleep mutex balloon_lock (balloon_lock) r = 0 (0xffffffff816e9c58) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:339
>>> > exclusive sleep mutex balloon_mutex (balloon_mutex) r = 0 (0xffffffff816e9c38) locked @ /usr/src/sys/dev/xen/balloon/balloon.c:373
>>> > KDB: stack backtrace:
>>> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00002c67c0
>>> > kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe00002c6870
>>> > witness_warn() at witness_warn+0x4a8/frame 0xfffffe00002c6930
>>> > uma_zalloc_arg() at uma_zalloc_arg+0x3b/frame 0xfffffe00002c69a0
>>> > malloc() at malloc+0x101/frame 0xfffffe00002c69f0
>>> > balloon_process() at balloon_process+0x44a/frame 0xfffffe00002c6a70
>>> > fork_exit() at fork_exit+0x84/frame 0xfffffe00002c6ab0
>>> > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00002c6ab0
>>> > --- trap 0, rip = 0, rsp = 0xfffffe00002c6b70, rbp = 0 ---
>>> > uma_zalloc_arg: zone "16" with the following non-sleepable locks held:
>>>
>>> Did you do anything specific to trigger the crash? Can you explain the
>>> steps needed to reproduce it?
>>>
>>>
>>> just recompiled a kernel, and booted it scrolls continuously across the
>>> screen
>>> doesnt seem to ever stop.
>>
>> I've tried r257051 and it seems to work fine, could you please post your
>> Xen version, the config file used to launch the VM and the toolstack used?
>
> Do you have witness enabled in your kernel config ?

Yes, but I'm not touching the balloon memory target.

> There is an obvious case of calling malloc(M_WAITOK) while holding both
> balloon_lock and balloon_mutex:
> balloon_process->decrease_reservation->balloon_append.

Yes, I'm aware of that. It shouldn't happen unless you actually trigger a
balloon memory decrease, which should not happen automatically AFAIK.
That's why I was asking whether this was happening without the user
specifically requesting it.

Anyway, this clearly should be fixed and pulled into 10 no matter what
triggered it. I will send a patch as soon as possible.
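
For readers following the thread, here is a minimal userspace sketch of one
common way to avoid this class of warning: do the allocation that may block
before taking the lock, and only link the new entry into the list while the
lock is held. This is not the patch mentioned above and not the actual
balloon.c code. The names struct page_entry, balloon_append_sketch and
balloon_count_sketch are hypothetical, and a pthread mutex stands in for the
kernel mutex. In the kernel, another option would be to keep the allocation
under the lock but pass M_NOWAIT to malloc(9) and handle a NULL return.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical bookkeeping entry, one per ballooned page. */
struct page_entry {
	unsigned long pfn;
	struct page_entry *next;
};

/* Userspace stand-in for the kernel's balloon mutex. */
static pthread_mutex_t balloon_lock = PTHREAD_MUTEX_INITIALIZER;
static struct page_entry *ballooned_pages = NULL;
static unsigned long balloon_count = 0;

/*
 * Allocate first, lock second: the allocation (the step that may block)
 * runs with no lock held, so only the short list update is done under
 * the mutex. Returns 0 on success, -1 if the allocation fails.
 */
int
balloon_append_sketch(unsigned long pfn)
{
	struct page_entry *entry;

	entry = malloc(sizeof(*entry));	/* no lock held here */
	if (entry == NULL)
		return (-1);
	entry->pfn = pfn;

	pthread_mutex_lock(&balloon_lock);
	entry->next = ballooned_pages;
	ballooned_pages = entry;
	balloon_count++;
	assert(ballooned_pages == entry);	/* new entry is the list head */
	pthread_mutex_unlock(&balloon_lock);
	return (0);
}

/* Read the number of tracked pages under the lock. */
unsigned long
balloon_count_sketch(void)
{
	unsigned long n;

	pthread_mutex_lock(&balloon_lock);
	n = balloon_count;
	pthread_mutex_unlock(&balloon_lock);
	return (n);
}
```

Calling balloon_append_sketch(pfn) once per page keeps the critical section
down to a few pointer updates. The cost is that an entry may be allocated
and then have to be freed if the caller decides, once it holds the lock,
that the page should not be added after all.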