From owner-freebsd-current@freebsd.org Mon Jan 8 21:26:55 2018
From: John Baldwin
To: freebsd-current@freebsd.org
Cc: Michael Jung
Subject: Re: witness_lock_list_get: witness exhausted
Date: Mon, 08 Jan 2018 10:39:47 -0800
Message-ID: <1684681.MCyL5Ev91y@ralph.baldwin.cx>
In-Reply-To: <6eecc842ba7a37af6b2ffe146dfd91da@mikej.com>
References: <6eecc842ba7a37af6b2ffe146dfd91da@mikej.com>
List-Id: Discussions about the use of FreeBSD-current

On Tuesday, November 28, 2017 02:46:03 PM Michael Jung wrote:
> Hi!
>
> I've recently up'd my processor count on our poudriere box and have
> started noticing the error
> "witness_lock_list_get: witness exhausted" on the console.  The kernel
> *DOES NOT* crash but I thought the report may be useful to someone.
>
> $ uname -a
> FreeBSD poudriere 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r325999: Sun Nov
> 19 18:41:20 EST 2017
> mikej@poudriere:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
>
> The machine is pretty busy running four poudriere build instances.
>
> last pid: 76584;  load averages: 115.07, 115.96, 98.30  up 6+07:32:59  14:44:03
> 763 processes: 117 running, 581 sleeping, 2 zombie, 63 lock
> CPU: 59.0% user,  0.0% nice, 40.7% system,  0.1% interrupt,  0.1% idle
> Mem: 12G Active, 2003M Inact, 44G Wired, 29G Free
> ARC: 28G Total, 11G MFU, 16G MRU, 122M Anon, 359M Header, 1184M Other
>      25G Compressed, 32G Uncompressed, 1.24:1 Ratio
>
> Let me know what additional information I might supply.

This just means that WITNESS stopped working because it ran out of
pre-allocated objects.  In particular the objects used to track how
many locks are held by how many threads:

/*
 * XXX: This is somewhat bogus, as we assume here that at most 2048 threads
 * will hold LOCK_NCHILDREN locks.  We handle failure ok, and we should
 * probably be safe for the most part, but it's still a SWAG.
 */
#define LOCK_NCHILDREN  5
#define LOCK_CHILDCOUNT 2048

Probably the '2048' (max number of concurrent threads) needs to scale
with MAXCPU.  2048 threads is probably a bit low on big x86 boxes.

-- 
John Baldwin
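
The failure mode described above can be illustrated with a minimal
user-space sketch in C.  This is not the kernel's WITNESS code: the
names list_entry, entry_pool, pool_init, pool_get, pool_put and every
SKETCH_* macro are hypothetical, and the per-CPU thread budget of 16
is an arbitrary assumption chosen only to show what "scale with
MAXCPU" might look like.  Only LOCK_NCHILDREN and LOCK_CHILDCOUNT come
from the message.  The sketch also assumes that exhaustion is sticky,
matching the description that WITNESS "stopped working".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Values quoted in the message above. */
#define LOCK_NCHILDREN  5
#define LOCK_CHILDCOUNT 2048

/* Hypothetical stand-ins for illustration, not kernel values. */
#define SKETCH_MAXCPU          256
#define SKETCH_THREADS_PER_CPU 16
#define SKETCH_SCALED          (SKETCH_MAXCPU * SKETCH_THREADS_PER_CPU)

/* Scaled pool size that never drops below the current fixed value. */
#define SKETCH_CHILDCOUNT \
	(SKETCH_SCALED > LOCK_CHILDCOUNT ? SKETCH_SCALED : LOCK_CHILDCOUNT)

/* One pre-allocated entry with room for LOCK_NCHILDREN held locks. */
struct list_entry {
	struct list_entry *next;
	const void *locks[LOCK_NCHILDREN];
	int count;
};

struct entry_pool {
	struct list_entry *free;	/* head of the free list */
	int disabled;			/* set once the pool runs dry */
};

/* Thread every entry of a caller-supplied array onto the free list. */
void
pool_init(struct entry_pool *p, struct list_entry *storage, size_t n)
{
	assert(p != NULL && storage != NULL && n > 0);
	p->free = NULL;
	p->disabled = 0;
	for (size_t i = 0; i < n; i++) {
		storage[i].count = 0;
		storage[i].next = p->free;
		p->free = &storage[i];
	}
}

/* Take an entry; on exhaustion, report once and stay disabled. */
struct list_entry *
pool_get(struct entry_pool *p)
{
	struct list_entry *e;

	if (p->disabled)
		return (NULL);
	e = p->free;
	if (e == NULL) {
		p->disabled = 1;
		fprintf(stderr, "%s: pool exhausted\n", __func__);
		return (NULL);
	}
	p->free = e->next;
	e->next = NULL;
	e->count = 0;
	return (e);
}

/* Return an entry to the free list. */
void
pool_put(struct entry_pool *p, struct list_entry *e)
{
	e->next = p->free;
	p->free = e;
}
```

With a pool of two entries, the third pool_get() call returns NULL and
prints the message once; later calls return NULL silently, even after
an entry is put back.  Under the assumed values SKETCH_CHILDCOUNT works
out to 4096, which would postpone exhaustion on a busy machine rather
than rule it out.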