Date:      Wed, 22 May 2024 22:45:33 +0200
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        Warner Losh <imp@bsdimp.com>
Cc:        Current <current@freebsd.org>, Alexander Motin <mav@freebsd.org>
Subject:   Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469
Message-ID:  <d7138e8c2d6888cfe9ec73b76e6ae98b@Leidinger.net>
In-Reply-To: <CANCZdfo-k_ScVQY1MtOC2wUG4nCatbea9JwS7xzJc_OduVLyhA@mail.gmail.com>
References:  <730565997ef678bbfe87d7861075edae@Leidinger.net> <CANCZdfo-k_ScVQY1MtOC2wUG4nCatbea9JwS7xzJc_OduVLyhA@mail.gmail.com>

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)

--=_dc0f07b562c2d8155c3c32185a8aac04
Content-Type: multipart/alternative;
 boundary="=_b3cd9b8a523236327cf34f2d64752b6f"

--=_b3cd9b8a523236327cf34f2d64752b6f
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII;
 format=flowed

On 2024-05-22 20:53, Warner Losh wrote:

> First order:
> 
> Looks like we're trying to schedule a trim, but that fails due to a 
> malloc issue. So then, since it's a
> malloc issue, we wind up trying to automatically reschedule this I/O, 
> which recurses into the driver
> with a bad lock held and boop.
> 
> Can you reproduce this?

So far I have seen it once; at least, I have only one crashdump. There was 
one more reboot/crash, but without a dump. I also have a watchdog running 
on this system, so I am not sure what caused that (unusual) reboot. A 
poudriere build was running both times. I have not run poudriere since 
the crashdump.

> If so, can you test this patch?

I will give it a try tomorrow anyway, and I will try to stress the system 
again with poudriere.

The NVMe device is a cache and also a log device for a zpool, so there is 
no really deterministic way to trigger access to it.

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netchild@FreeBSD.org  : PGP 0x8F31830F9F2772BF

--=_b3cd9b8a523236327cf34f2d64752b6f--


--=_dc0f07b562c2d8155c3c32185a8aac04
Content-Type: application/pgp-signature;
 name=signature.asc
Content-Disposition: attachment;
 filename=signature.asc;
 size=833
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEER9UlYXp1PSd08nWXEg2wmwP42IYFAmZOWYAACgkQEg2wmwP4
2IY4wA/9GhlwJBIeQvQaGnoH632EzWeZR8d3/tOkGxFYUoid9gSW4KDkxElE/i92
3RL2axaAKzhqnIMUo4R7qbJ5TImQqQn4Eh60NAPqm/IdkZoUcAno7Q8npzFiSyMc
MZV3t9cY+OnxLfA9FAR628Zx1k8u0nNz4VG5xT2QIa7FtRjxxpfw7VVJOIcNQsPV
kMmh4IJ4JbVc4N41VgGfOiLcihbh+6RVu4Yj0GaHSaeexV6knIe1g7jkCoo7vlwf
OtKEu8Ua67yiB/VfpFTHcxljFUmOXeadXqw5TVHTAQJXdtJ4No0NK4RbcmVGojEh
0viPxTr1CPlk7sFjFtEPtKTQhHyD5Mpeq8OGDTVKabkROK1iY/4YQeIr2NuzTLyr
hRygUld7Wnt2jhEiRbAXuIP3Mp5PRvNVSAZ+txNwMCHLveCVMtGuxvBDMYHI9lni
mEmQxo9yJb85A4J7MQNRBJkohfAR/4kxIrP83xJj5lhaNI/DgVe0JrBXXJH6Q5gI
+Muq4n327h/mYGy8SQfSebOnq4Mbsusi9eLurGs7gjAbPqf5SyGteFlJ2fIGQOmc
gRSYUpw0ZcyogKHBqHZ2tvhoRTbBVBjX1z4cEEhMYwaebNRkYofyL8oN5YT+arkf
dpJX7q6ls/ChwVAFNmn5n6qq0t2heuCLIXCH0YdpRbFU9n0tQG8=
=BqXp
-----END PGP SIGNATURE-----

--=_dc0f07b562c2d8155c3c32185a8aac04--


