Date: Wed, 22 May 2024 12:53:04 -0600 From: Warner Losh <imp@bsdimp.com> To: Alexander Leidinger <Alexander@leidinger.net> Cc: Current <current@freebsd.org>, Alexander Motin <mav@freebsd.org> Subject: Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469 Message-ID: <CANCZdfo-k_ScVQY1MtOC2wUG4nCatbea9JwS7xzJc_OduVLyhA@mail.gmail.com> In-Reply-To: <730565997ef678bbfe87d7861075edae@Leidinger.net> References: <730565997ef678bbfe87d7861075edae@Leidinger.net>
next in thread | previous in thread | raw e-mail | index | archive | help
--0000000000008455b106190f7288 Content-Type: multipart/alternative; boundary="0000000000008455ae06190f7286" --0000000000008455ae06190f7286 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable First order: Looks like we're trying to schedule a trim, but that fails due to a malloc issue. So then, since it's a malloc issue, we wind up trying to automatically reschedule this I/O, which recurses into the driver with a bad lock held and boop. Can you reproduce this? If so, can you test this patch? diff --git a/sys/cam/nvme/nvme_da.c b/sys/cam/nvme/nvme_da.c index 3f6cf8702870..357e612200e9 100644 --- a/sys/cam/nvme/nvme_da.c +++ b/sys/cam/nvme/nvme_da.c @@ -1077,7 +1077,9 @@ ndastart(struct cam_periph *periph, union ccb *start_ccb) trim =3D malloc(sizeof(*trim), M_NVMEDA, M_ZERO | M_NOWAIT); if (trim =3D=3D NULL) { + cam_periph_unlock(periph); biofinish(bp, NULL, ENOMEM); + cam_periph_lock(periph); xpt_release_ccb(start_ccb); ndaschedule(periph); return; (the mailer may mangle it, so I've also attached it in case people want to comment on this). The logic here is that we have special logic in the ENOMEM case that will recursively call the start routine, which calls the scheduler which expects to be able to take out the periph lock. But it can't, because it's already locked. It also invokes a pacing protocol that slows down things even more. What I'm now not sure about is whether or not we need to just release start_ccb or if we also need to call ndaschedule too here. Seems like we might not want to (but it's a safe nop if not). I've cc'd mav@ to see if he has opinions on what's going on. Warner On Wed, May 22, 2024 at 11:22=E2=80=AFAM Alexander Leidinger < Alexander@leidinger.net> wrote: > Hi, > > I got the panic in $Subject. Anyone an idea? > > Complete crashlog available at https://wiki.leidinger.net/core.txt.6 > (1.2 MB) > > Short version: > ---snip--- > [11417] KDB: stack backtrace: > [11417] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > 0xfffffe043133f830 > [11417] vpanic() at vpanic+0x13f/frame 0xfffffe043133f960 > [11417] panic() at panic+0x43/frame 0xfffffe043133f9c0 > [11417] __mtx_lock_sleep() at __mtx_lock_sleep+0x491/frame > 0xfffffe043133fa50 > [11417] __mtx_lock_flags() at __mtx_lock_flags+0x9c/frame > 0xfffffe043133fa70 > [11417] ndastrategy() at ndastrategy+0x3c/frame 0xfffffe043133faa0 > [11417] g_disk_start() at g_disk_start+0x569/frame 0xfffffe043133fb00 > [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fb30 > [11417] g_io_deliver() at g_io_deliver+0x1cc/frame 0xfffffe043133fb80 > [11417] g_disk_done() at g_disk_done+0xee/frame 0xfffffe043133fbc0 > [11417] ndastart() at ndastart+0x4a3/frame 0xfffffe043133fc20 > [11417] xpt_run_allocq() at xpt_run_allocq+0xa5/frame 0xfffffe043133fc70 > [11417] ndastrategy() at ndastrategy+0x6d/frame 0xfffffe043133fca0 > [11417] g_disk_start() at g_disk_start+0x569/frame 0xfffffe043133fd00 > [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd30 > [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd60 > [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd90 > [11417] vdev_geom_io_start() at vdev_geom_io_start+0x257/frame > 0xfffffe043133fdc0 > [11417] zio_vdev_io_start() at zio_vdev_io_start+0x321/frame > 0xfffffe043133fe10 > [11417] zio_execute() at zio_execute+0x78/frame 0xfffffe043133fe40 > [11417] taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame > 0xfffffe043133fec0 > [11417] taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame > 0xfffffe043133fef0 > ---snip--- > > This is with a world from 2024-05-17-084543. > > Bye, > Alexander. > > -- > http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF > http://www.FreeBSD.org netchild@FreeBSD.org : PGP 0x8F31830F9F2772BF > --0000000000008455ae06190f7286 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable <div dir=3D"ltr"><div dir=3D"ltr"><div>First order: <br></div><div><br></di= v><div>Looks like we're trying to schedule a trim, but that fails due t= o a malloc issue. So then, since it's a</div><div>malloc issue, we wind= up trying to automatically reschedule this I/O, which recurses into the dr= iver</div><div>with a bad lock held and boop.</div><div><br></div><div>Can = you reproduce this?</div><div><br></div><div>If so, can you test this patch= ?</div><div><br></div><div>diff --git a/sys/cam/nvme/nvme_da.c b/sys/cam/nv= me/nvme_da.c<br>index 3f6cf8702870..357e612200e9 100644<br>--- a/sys/cam/nv= me/nvme_da.c<br>+++ b/sys/cam/nvme/nvme_da.c<br>@@ -1077,7 +1077,9 @@ ndast= art(struct cam_periph *periph, union ccb *start_ccb)<br>=C2=A0<br>=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 trim =3D malloc(sizeof(*trim), M_NVMEDA, M_ZERO | M_NOWAIT);<br>=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 if (trim =3D=3D NULL) {<br>+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cam_periph_unlock(periph);<br>=C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 biofinish(bp, NULL, ENOMEM);<br>+ =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 cam_periph_lock(periph);<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 xpt_r= elease_ccb(start_ccb);<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=C2=A0=C2=A0 =C2=A0 =C2=A0 =C2=A0 ndasche= dule(periph);<br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return;</div><div><= br></div><div>(the mailer may mangle it, so I've also attached it in ca= se people want to comment on this).</div><div><br></div><div>The logic here= is that we have special logic in the ENOMEM case that will recursively</di= v><div>call the start routine, which calls the scheduler which expects to b= e able to take out the</div><div>periph lock. But it can't, because it&= #39;s already locked. It also invokes a pacing protocol that</div><div>slow= s down things even more.</div><div><br></div><div>What I'm now not sure= about is whether or not we need to just release start_ccb or if we also ne= ed to call ndaschedule too here. Seems like we might not want to (but it= 9;s a safe nop if not). I've cc'd mav@ to see if he has opinions on= what's going on.<br></div><div><br></div><div>Warner</div><div><br></d= iv></div><br><div class=3D"gmail_quote"><div dir=3D"ltr" class=3D"gmail_att= r">On Wed, May 22, 2024 at 11:22=E2=80=AFAM Alexander Leidinger <<a href= =3D"mailto:Alexander@leidinger.net">Alexander@leidinger.net</a>> wrote:<= br></div><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8e= x;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi,<br> <br> I got the panic in $Subject. Anyone an idea?<br> <br> Complete crashlog available at <a href=3D"https://wiki.leidinger.net/core.t= xt.6" rel=3D"noreferrer" target=3D"_blank">https://wiki.leidinger.net/core.= txt.6</a> <br> (1.2 MB)<br> <br> Short version:<br> ---snip---<br> [11417] KDB: stack backtrace:<br> [11417] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame <br> 0xfffffe043133f830<br> [11417] vpanic() at vpanic+0x13f/frame 0xfffffe043133f960<br> [11417] panic() at panic+0x43/frame 0xfffffe043133f9c0<br> [11417] __mtx_lock_sleep() at __mtx_lock_sleep+0x491/frame <br> 0xfffffe043133fa50<br> [11417] __mtx_lock_flags() at __mtx_lock_flags+0x9c/frame <br> 0xfffffe043133fa70<br> [11417] ndastrategy() at ndastrategy+0x3c/frame 0xfffffe043133faa0<br> [11417] g_disk_start() at g_disk_start+0x569/frame 0xfffffe043133fb00<br> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fb30<br> [11417] g_io_deliver() at g_io_deliver+0x1cc/frame 0xfffffe043133fb80<br> [11417] g_disk_done() at g_disk_done+0xee/frame 0xfffffe043133fbc0<br> [11417] ndastart() at ndastart+0x4a3/frame 0xfffffe043133fc20<br> [11417] xpt_run_allocq() at xpt_run_allocq+0xa5/frame 0xfffffe043133fc70<br= > [11417] ndastrategy() at ndastrategy+0x6d/frame 0xfffffe043133fca0<br> [11417] g_disk_start() at g_disk_start+0x569/frame 0xfffffe043133fd00<br> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd30<br> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd60<br> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd90<br> [11417] vdev_geom_io_start() at vdev_geom_io_start+0x257/frame <br> 0xfffffe043133fdc0<br> [11417] zio_vdev_io_start() at zio_vdev_io_start+0x321/frame <br> 0xfffffe043133fe10<br> [11417] zio_execute() at zio_execute+0x78/frame 0xfffffe043133fe40<br> [11417] taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame <br> 0xfffffe043133fec0<br> [11417] taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame <br> 0xfffffe043133fef0<br> ---snip---<br> <br> This is with a world from 2024-05-17-084543.<br> <br> Bye,<br> Alexander.<br> <br> -- <br> <a href=3D"http://www.Leidinger.net" rel=3D"noreferrer" target=3D"_blank">h= ttp://www.Leidinger.net</a> Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF= <br> <a href=3D"http://www.FreeBSD.org" rel=3D"noreferrer" target=3D"_blank">htt= p://www.FreeBSD.org</a>=C2=A0 =C2=A0 netchild@FreeBSD.org=C2=A0 : PGP 0x8F3= 1830F9F2772BF<br> </blockquote></div></div> --0000000000008455ae06190f7286-- --0000000000008455b106190f7288 Content-Type: text/x-patch; charset="US-ASCII"; name="nda.diff" Content-Disposition: attachment; filename="nda.diff" Content-Transfer-Encoding: base64 Content-ID: <f_lwi6hnts0> X-Attachment-Id: f_lwi6hnts0 ZGlmZiAtLWdpdCBhL3N5cy9jYW0vbnZtZS9udm1lX2RhLmMgYi9zeXMvY2FtL252bWUvbnZtZV9k YS5jCmluZGV4IDNmNmNmODcwMjg3MC4uMzU3ZTYxMjIwMGU5IDEwMDY0NAotLS0gYS9zeXMvY2Ft L252bWUvbnZtZV9kYS5jCisrKyBiL3N5cy9jYW0vbnZtZS9udm1lX2RhLmMKQEAgLTEwNzcsNyAr MTA3Nyw5IEBAIG5kYXN0YXJ0KHN0cnVjdCBjYW1fcGVyaXBoICpwZXJpcGgsIHVuaW9uIGNjYiAq c3RhcnRfY2NiKQogCiAJCQl0cmltID0gbWFsbG9jKHNpemVvZigqdHJpbSksIE1fTlZNRURBLCBN X1pFUk8gfCBNX05PV0FJVCk7CiAJCQlpZiAodHJpbSA9PSBOVUxMKSB7CisJCQkJY2FtX3Blcmlw aF91bmxvY2socGVyaXBoKTsKIAkJCQliaW9maW5pc2goYnAsIE5VTEwsIEVOT01FTSk7CisJCQkJ Y2FtX3BlcmlwaF9sb2NrKHBlcmlwaCk7CiAJCQkJeHB0X3JlbGVhc2VfY2NiKHN0YXJ0X2NjYik7 CiAJCQkJbmRhc2NoZWR1bGUocGVyaXBoKTsKIAkJCQlyZXR1cm47Cg== --0000000000008455b106190f7288--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfo-k_ScVQY1MtOC2wUG4nCatbea9JwS7xzJc_OduVLyhA>