Date: Wed, 22 May 2024 22:45:33 +0200
From: Alexander Leidinger <Alexander@Leidinger.net>
To: Warner Losh <imp@bsdimp.com>
Cc: Current <current@freebsd.org>, Alexander Motin <mav@freebsd.org>
Subject: Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469
Message-ID: <d7138e8c2d6888cfe9ec73b76e6ae98b@Leidinger.net>
In-Reply-To: <CANCZdfo-k_ScVQY1MtOC2wUG4nCatbea9JwS7xzJc_OduVLyhA@mail.gmail.com>
References: <730565997ef678bbfe87d7861075edae@Leidinger.net> <CANCZdfo-k_ScVQY1MtOC2wUG4nCatbea9JwS7xzJc_OduVLyhA@mail.gmail.com>

On 2024-05-22 20:53, Warner Losh wrote:

> First order:
>
> Looks like we're trying to schedule a trim, but that fails due to a
> malloc issue. So then, since it's a
> malloc issue, we wind up trying to automatically reschedule this I/O,
> which recurses into the driver
> with a bad lock held and boop.
>
> Can you reproduce this?

So far I have had it once. At least, I have only one crashdump. I had one more reboot/crash, but no dump. I also have a watchdog running on this system, so I am not sure what caused the (unusual) reboot. A poudriere build was running both times. I haven't run poudriere again since the crashdump.

> If so, can you test this patch?

I will give it a try tomorrow anyway, and I will try to stress the system again with poudriere.

The nvme is a cache and also a log device for a zpool, so there is no really deterministic way to trigger access to it.

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org netchild@FreeBSD.org : PGP 0x8F31830F9F2772BF

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?d7138e8c2d6888cfe9ec73b76e6ae98b>