From: Warner Losh <wlosh@bsdimp.com>
Date: Wed, 22 May 2024 12:53:04 -0600
Subject: Re: _mtx_lock_sleep: recursed on non-recursive mutex CAM device lock @ /..../sys/cam/nvme/nvme_da.c:469
To: Alexander Leidinger
Cc: Current, Alexander Motin
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

First order: Looks like
we're trying to schedule a trim, but that fails due to a malloc issue. So then, since it's a malloc issue, we wind up trying to automatically reschedule this I/O, which recurses into the driver with a bad lock held and boop.

Can you reproduce this? If so, can you test this patch?

diff --git a/sys/cam/nvme/nvme_da.c b/sys/cam/nvme/nvme_da.c
index 3f6cf8702870..357e612200e9 100644
--- a/sys/cam/nvme/nvme_da.c
+++ b/sys/cam/nvme/nvme_da.c
@@ -1077,7 +1077,9 @@ ndastart(struct cam_periph *periph, union ccb *start_ccb)
 
 			trim = malloc(sizeof(*trim), M_NVMEDA, M_ZERO | M_NOWAIT);
 			if (trim == NULL) {
+				cam_periph_unlock(periph);
 				biofinish(bp, NULL, ENOMEM);
+				cam_periph_lock(periph);
 				xpt_release_ccb(start_ccb);
 				ndaschedule(periph);
 				return;

(The mailer may mangle it, so I've also attached it in case people want to comment on this.)

The logic here is that we have special logic in the ENOMEM case that will recursively call the start routine, which calls the scheduler, which expects to be able to take out the periph lock. But it can't, because it's already locked. It also invokes a pacing protocol that slows things down even more.

What I'm not sure about is whether we only need to release start_ccb here, or whether we also need to call ndaschedule. It seems like we might not want to (but it's a safe nop if not). I've cc'd mav@ to see if he has opinions on what's going on.

Warner

On Wed, May 22, 2024 at 11:22 AM Alexander Leidinger <Alexander@leidinger.net> wrote:

> Hi,
>
> I got the panic in $Subject. Anyone an idea?
>
> Complete crashlog available at https://wiki.leidinger.net/core.txt.6 (1.2 MB)
>
> Short version:
> ---snip---
> [11417] KDB: stack backtrace:
> [11417] db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe043133f830
> [11417] vpanic() at vpanic+0x13f/frame 0xfffffe043133f960
> [11417] panic() at panic+0x43/frame 0xfffffe043133f9c0
> [11417] __mtx_lock_sleep() at __mtx_lock_sleep+0x491/frame 0xfffffe043133fa50
> [11417] __mtx_lock_flags() at __mtx_lock_flags+0x9c/frame 0xfffffe043133fa70
> [11417] ndastrategy() at ndastrategy+0x3c/frame 0xfffffe043133faa0
> [11417] g_disk_start() at g_disk_start+0x569/frame 0xfffffe043133fb00
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fb30
> [11417] g_io_deliver() at g_io_deliver+0x1cc/frame 0xfffffe043133fb80
> [11417] g_disk_done() at g_disk_done+0xee/frame 0xfffffe043133fbc0
> [11417] ndastart() at ndastart+0x4a3/frame 0xfffffe043133fc20
> [11417] xpt_run_allocq() at xpt_run_allocq+0xa5/frame 0xfffffe043133fc70
> [11417] ndastrategy() at ndastrategy+0x6d/frame 0xfffffe043133fca0
> [11417] g_disk_start() at g_disk_start+0x569/frame 0xfffffe043133fd00
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd30
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd60
> [11417] g_io_request() at g_io_request+0x2b6/frame 0xfffffe043133fd90
> [11417] vdev_geom_io_start() at vdev_geom_io_start+0x257/frame 0xfffffe043133fdc0
> [11417] zio_vdev_io_start() at zio_vdev_io_start+0x321/frame 0xfffffe043133fe10
> [11417] zio_execute() at zio_execute+0x78/frame 0xfffffe043133fe40
> [11417] taskqueue_run_locked() at taskqueue_run_locked+0x1c7/frame 0xfffffe043133fec0
> [11417] taskqueue_thread_loop() at taskqueue_thread_loop+0xd3/frame 0xfffffe043133fef0
> ---snip---
>
> This is with a world from 2024-05-17-084543.
>
> Bye,
> Alexander.
>
> --
> http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netchild@FreeBSD.org  : PGP 0x8F31830F9F2772BF