Date:      Mon, 30 Jun 2025 17:17:24 +0000 (UTC)
From:      "Bjoern A. Zeeb" <bzeeb-lists@lists.zabbadoz.net>
To:        John Baldwin <jhb@FreeBSD.org>
Cc:        Zhenlei Huang <zlei@FreeBSD.org>, FreeBSD Current <current@freebsd.org>,  Olivier Certner <olce@freebsd.org>
Subject:   Re: regression: memory issues on main/arm64 over sched/runq changes
Message-ID:  <2qp0845s-p0oq-qsr9-0n64-3snn4466s139@yvfgf.mnoonqbm.arg>
In-Reply-To: <274774b0-9137-4fa7-a0a5-a0ce8976dc27@FreeBSD.org>
References:  <43005447-2rq0-6nn2-pnr5-4939s112npr4@yvfgf.mnoonqbm.arg> <0A01B9F5-C49C-41D8-BAB7-4378DEDBF647@FreeBSD.org> <28o26o81-so5r-qq79-6q6n-0q6746o7oo79@yvfgf.mnoonqbm.arg> <6A003013-415A-4594-AB04-AF5A9B2D660D@FreeBSD.org> <23n1773o-10o2-5p5o-25s4-r623rnn44649@yvfgf.mnoonqbm.arg> <907D042E-AE8A-4818-A807-AD45F36354FD@FreeBSD.org> <274774b0-9137-4fa7-a0a5-a0ce8976dc27@FreeBSD.org>


On Mon, 30 Jun 2025, John Baldwin wrote:

> On 6/28/25 11:35, Zhenlei Huang wrote:
>> I boot from disk.
>> 
>> An update on this locking issue:
>> 
>> I think I finally figured out why. Here is more of the stack trace from my video:
>> 
>> ```
>> shared lock of (sx) ifnet_sx @/usr/home/zlei/freebsd-src/sys/net/if.c:1467
>> while exclusively locked from /usr/home/zlei/freebsd-src/sys/net/if.c:1416
>> panic: excl->share
>> ...
>> witness_checkorder() at ...
>> _sx_slock_int() at _sx_slock_int+0x64/frame ....
>> if_addgroup() at ...
>> if_attach_internal() at ...
>> ether_ifattach() at ...
>> iflib_device_register() at ...
>> iflib_device_attach() at ...
>> device_attach() at ...
>> ...
>> root_bus_configure() at ...
>> configure() at ...
>> mi_startup() at ...
>> ```
>> 
>> The ifnet_sx lock has the SX_RECURSE flag set, so it can be locked recursively.
>> 
>> iflib_device_register() acquires ifnet_sx exclusively and then calls 
>> ether_ifattach(), which eventually calls if_addgroup(). Re-acquiring the 
>> same lock shared while it is held exclusively is prohibited, so witness 
>> complains.
>> 
>> I think witness should show the file location of the first exclusive 
>> acquisition, i.e. sys/net/iflib.c rather than sys/net/if.c:1416, so that it 
>> is more straightforward to figure out how that happens. CC John to see if 
>> that can be improved.
>
> Hmm, I think we have stopped at the first lle we found walking
> back up the lle list (find_instance() always works this way).
>
> You could add a 'find_last_instance' and use it in a few places
> perhaps.  I guess both the share->excl and excl->share are places
> where you would maybe use it.  Alternatively, you could have a
> 'find_next_instance' and maybe output all of them before the
> panic?

Having the full chain would be really nice :)  +1 from me for that.
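
To make the suggestion concrete, here is a rough userspace sketch of what a
'find_last_instance' could look like.  It assumes heavily simplified stand-ins
for the witness structures (the real ones in the kernel carry more fields, and
LOCK_NCHILDREN here is an arbitrary value), and find_last_instance() itself is
hypothetical: it is the helper proposed above, not existing code.  The only
difference from the find_instance() model is that it keeps walking instead of
returning at the first match.

```c
#include <assert.h>
#include <stddef.h>

#define	LOCK_NCHILDREN	3	/* arbitrary for this sketch */

/* Simplified stand-ins for the witness structures (assumption). */
struct lock_object {
	const char *lo_name;
};

struct lock_instance {
	struct lock_object *li_lock;
	const char *li_file;
	int li_line;
};

struct lock_list_entry {
	struct lock_list_entry *ll_next;	/* older entries */
	struct lock_instance ll_children[LOCK_NCHILDREN];
	int ll_count;
};

/*
 * Model of the current behaviour: walk from the newest entry and stop
 * at the first match, i.e. the most recent acquisition.
 */
static struct lock_instance *
find_instance(struct lock_list_entry *list, const struct lock_object *lock)
{
	struct lock_list_entry *lle;
	int i;

	for (lle = list; lle != NULL; lle = lle->ll_next) {
		assert(lle->ll_count <= LOCK_NCHILDREN);
		for (i = lle->ll_count - 1; i >= 0; i--)
			if (lle->ll_children[i].li_lock == lock)
				return (&lle->ll_children[i]);
	}
	return (NULL);
}

/*
 * Hypothetical helper: same walk, but remember every match and return
 * the last one seen, i.e. the oldest (first) acquisition.
 */
static struct lock_instance *
find_last_instance(struct lock_list_entry *list, const struct lock_object *lock)
{
	struct lock_list_entry *lle;
	struct lock_instance *last = NULL;
	int i;

	for (lle = list; lle != NULL; lle = lle->ll_next) {
		assert(lle->ll_count <= LOCK_NCHILDREN);
		for (i = lle->ll_count - 1; i >= 0; i--)
			if (lle->ll_children[i].li_lock == lock)
				last = &lle->ll_children[i];
	}
	return (last);
}
```

With a recursed ifnet_sx, find_instance() would hand back the if.c
acquisition while find_last_instance() would hand back the one from iflib.c.
Printing li_file/li_line for every match inside the same loop would give the
full chain instead.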

/bz

-- 
Bjoern A. Zeeb                                                     r15:7