Date:      Mon, 08 Jan 2018 18:41:20 -0500
From:      Michael Jung <mikej@mikej.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-current@freebsd.org, owner-freebsd-current@freebsd.org
Subject:   Re: witness_lock_list_get: witness exhausted
Message-ID:  <54018b1b2feaab3b05d7ed406eb8273c@mikej.com>
In-Reply-To: <1684681.MCyL5Ev91y@ralph.baldwin.cx>
References:  <6eecc842ba7a37af6b2ffe146dfd91da@mikej.com> <1684681.MCyL5Ev91y@ralph.baldwin.cx>

On 2018-01-08 13:39, John Baldwin wrote:
> On Tuesday, November 28, 2017 02:46:03 PM Michael Jung wrote:
>> Hi!
>> 
>> I've recently up'd my processor count on our poudriere box and have
>> started noticing the error "witness_lock_list_get: witness exhausted"
>> on the console.  The kernel *DOES NOT* crash but I thought the report
>> may be useful to someone.
>> 
>> $ uname -a
>> FreeBSD poudriere 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r325999: Sun Nov 19 18:41:20 EST 2017
>> mikej@poudriere:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
>> 
>> The machine is pretty busy running four poudriere build instances.
>> 
>> last pid: 76584;  load averages: 115.07, 115.96, 98.30  up 6+07:32:59  14:44:03
>> 763 processes: 117 running, 581 sleeping, 2 zombie, 63 lock
>> CPU: 59.0% user,  0.0% nice, 40.7% system,  0.1% interrupt,  0.1% idle
>> Mem: 12G Active, 2003M Inact, 44G Wired, 29G Free
>> ARC: 28G Total, 11G MFU, 16G MRU, 122M Anon, 359M Header, 1184M Other
>>       25G Compressed, 32G Uncompressed, 1.24:1 Ratio
>> 
>> Let me know what additional information I might supply.
> 
> This just means that WITNESS stopped working because it ran out of
> pre-allocated objects.  In particular the objects used to track how
> many locks are held by how many threads:
> 
> /*
>  * XXX: This is somewhat bogus, as we assume here that at most 2048 threads
>  * will hold LOCK_NCHILDREN locks.  We handle failure ok, and we should
>  * probably be safe for the most part, but it's still a SWAG.
>  */
> #define LOCK_NCHILDREN  5
> #define LOCK_CHILDCOUNT 2048
> 
> Probably the '2048' (max number of concurrent threads) needs to scale with
> MAXCPU.  2048 threads is probably a bit low on big x86 boxes.
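
To make the scaling idea above concrete, here is a minimal sketch in C. It is
not a patch against sys/kern/subr_witness.c and has not been tested in a
kernel. LOCK_NCHILDREN and the 2048 floor come from the excerpt quoted above;
SKETCH_ENTRIES_PER_CPU and both sketch_* function names are hypothetical and
exist only for illustration.

```c
/* Values quoted from the WITNESS source excerpt above. */
#define LOCK_NCHILDREN		5
#define LOCK_CHILDCOUNT_MIN	2048	/* today's fixed LOCK_CHILDCOUNT, kept as a floor */

/* ASSUMPTION: hypothetical tuning factor, not taken from FreeBSD. */
#define SKETCH_ENTRIES_PER_CPU	32

/*
 * Return a lock-list entry count that grows with the CPU limit but
 * never drops below the current fixed value.
 */
static unsigned int
sketch_lock_childcount(unsigned int maxcpu)
{
	unsigned int scaled = maxcpu * SKETCH_ENTRIES_PER_CPU;

	return (scaled > LOCK_CHILDCOUNT_MIN ? scaled : LOCK_CHILDCOUNT_MIN);
}

/* Held-lock slots the pool could track: entries times locks per entry. */
static unsigned int
sketch_lock_slots(unsigned int maxcpu)
{
	return (sketch_lock_childcount(maxcpu) * LOCK_NCHILDREN);
}
```

With the assumed factor of 32, a limit of 64 CPUs stays at the existing 2048
entries, while a limit of 256 CPUs would give 8192.  If the pool is sized at
compile time, the same arithmetic would have to be written as a preprocessor
expression on MAXCPU rather than as a function; the right factor is a guess
here and would need measuring on a busy machine.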


Thank you for your explanation.  We are expanding our ESXi cluster, and
even though the standard edition only lets me assign 64 vCPUs to a guest
(with as much RAM as I want), I do like to help with edge cases.  If I can
trigger them by pushing the boundaries, I will, in the hope that it leads
to additional improvements in FreeBSD.

--mikej


