Date:      Fri, 29 Jul 2016 08:44:50 -0700
From:      Jim Harris <jim.harris@gmail.com>
To:        Borja Marcos <borjam@sarenet.es>
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: Intel NVMe troubles?
Message-ID:  <CAJP=Hc-3ogfoSZ0cjycm+sb0M80B6M5ZrGtWn1BjfOFPteGgdA@mail.gmail.com>
In-Reply-To: <AAC8E93B-F263-4B7E-91DF-9EAC77FB2C3C@sarenet.es>
References:  <CBC304D0-AA57-4EF5-A2DD-1888FB88DE12@sarenet.es> <CAJP=Hc-KdmScZtCRDcF=CTpNcMkn2brXiPx4XwJA0aTYgkxm+g@mail.gmail.com> <AAC8E93B-F263-4B7E-91DF-9EAC77FB2C3C@sarenet.es>


[-- Attachment #1 --]
On Fri, Jul 29, 2016 at 1:10 AM, Borja Marcos <borjam@sarenet.es> wrote:

>
> > On 28 Jul 2016, at 19:25, Jim Harris <jim.harris@gmail.com> wrote:
> >
> > Yes, you should worry.
> >
> > Normally we could use the dump_debug sysctls to help debug this - these
> > sysctls will dump the NVMe I/O submission and completion queues.  But in
> > this case the LBA data is in the payload, not the NVMe submission
> > entries,
> > so dump_debug will not help as much as dumping the NVMe DSM payload
> > directly.
> >
> > Could you try the attached patch and send output after recreating your
> > pool?
>
> Just in case the evil anti-spam ate my answer, I sent the results to your
> Gmail account.
>
>
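
For reference, the dump_debug knobs mentioned above are per-queue sysctls
under each controller.  A minimal sketch, assuming the drive probes as nvme0
and the stock queue naming (adjust the unit number and queue count for your
system):

    # dump the admin queue and the first I/O queue of nvme0
    # (unit number and queue names here are assumptions about your setup)
    sysctl dev.nvme.0.adminq.dump_debug=1
    sysctl dev.nvme.0.ioq0.dump_debug=1
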
Thanks Borja.

It looks like all of the TRIM commands are formatted properly.  The
failures do not show up until about 10 seconds after the last TRIM to each
drive was submitted, and immediately before TRIMs start on the next drive,
so I'm assuming the failures are for the last few TRIM commands to each
drive, but I cannot say for sure.  Could you apply patch v2 (attached),
which will dump the TRIM payload contents inline with the failure messages?
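
In case it helps, here is a rough sketch of applying the patch and
reproducing the failure.  The patch file name, kernel config, and the
pool/device names below are placeholders; substitute your own:

    # apply the attached v2 patch to a stock source tree and rebuild the kernel
    cd /usr/src
    patch -p1 < nvme-trim-dump-v2.patch    # hypothetical name for attachment #2
    make buildkernel KERNCONF=GENERIC && make installkernel
    shutdown -r now

    # after reboot, recreate the pool to kick off the initial TRIMs again,
    # then collect the nvme_printf output from the console or dmesg
    zpool destroy tank
    zpool create tank /dev/nvd0 /dev/nvd1
    dmesg | grep nvme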

Thanks,

-Jim

[-- Attachment #2 --]
diff --git a/sys/dev/nvme/nvme_ns.c b/sys/dev/nvme/nvme_ns.c
index 754d074..293dd25 100644
--- a/sys/dev/nvme/nvme_ns.c
+++ b/sys/dev/nvme/nvme_ns.c
@@ -461,6 +461,7 @@ nvme_ns_bio_process(struct nvme_namespace *ns, struct bio *bp,
 		    bp->bio_bcount/nvme_ns_get_sector_size(ns);
 		dsm_range->starting_lba =
 		    bp->bio_offset/nvme_ns_get_sector_size(ns);
+		nvme_printf(ns->ctrlr, "length=%ju lba=%ju\n", (uintmax_t)dsm_range->length, (uintmax_t)dsm_range->starting_lba);
 		bp->bio_driver2 = dsm_range;
 		err = nvme_ns_cmd_deallocate(ns, dsm_range, 1,
 			nvme_ns_bio_done, bp);
diff --git a/sys/dev/nvme/nvme_qpair.c b/sys/dev/nvme/nvme_qpair.c
index 92fe672..6d36d33 100644
--- a/sys/dev/nvme/nvme_qpair.c
+++ b/sys/dev/nvme/nvme_qpair.c
@@ -319,6 +319,13 @@ nvme_qpair_complete_tracker(struct nvme_qpair *qpair, struct nvme_tracker *tr,
 
 	if (error && print_on_error) {
 		nvme_qpair_print_command(qpair, &req->cmd);
+		if (qpair->id > 0 && req->cmd.opc == NVME_OPC_DATASET_MANAGEMENT) {
+			struct nvme_dsm_range *dsm_range;
+
+			dsm_range = req->u.payload;
+			nvme_printf(qpair->ctrlr, "trim failed: len=%ju lba=%ju\n",
+				    (uintmax_t)dsm_range->length, (uintmax_t)dsm_range->starting_lba);
+		}
 		nvme_qpair_print_completion(qpair, cpl);
 	}
 
