Date: Mon, 1 Aug 2016 08:32:10 -0700
From: Michael Loftis <mloftis@wgops.com>
To: Borja Marcos <borjam@sarenet.es>
Cc: Jim Harris <jim.harris@gmail.com>, FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject: Re: Intel NVMe troubles?
Message-ID: <CAHDg04s9krSA3HpPPYU8Gv0gHwXZMc8tEu_e0+zcwT2MHi8N8Q@mail.gmail.com>
In-Reply-To: <4996AF96-76BA-47F1-B328-D4FE7AC777EE@sarenet.es>
References: <CBC304D0-AA57-4EF5-A2DD-1888FB88DE12@sarenet.es>
 <CAJP=Hc-KdmScZtCRDcF=CTpNcMkn2brXiPx4XwJA0aTYgkxm+g@mail.gmail.com>
 <AAC8E93B-F263-4B7E-91DF-9EAC77FB2C3C@sarenet.es>
 <CAJP=Hc-3ogfoSZ0cjycm+sb0M80B6M5ZrGtWn1BjfOFPteGgdA@mail.gmail.com>
 <4996AF96-76BA-47F1-B328-D4FE7AC777EE@sarenet.es>
FWIW, I've had similar issues with Intel 750 PCIe NVMe drives when
attempting to use 4K blocks on Linux with EXT4 on top of MD RAID1
(software mirror). I didn't dig into it much at the time because there
were too many layers to reduce, but it looked like the drive
misreported its number of blocks, and a subsequent TRIM command or
write of the last sector then errored. I mention it because, despite
the differences, the similarities (Intel NVMe, LBA format #3 / 4K
blocks, an error writing to a nonexistent block) might give someone
enough information to figure it out fully.

On Monday, August 1, 2016, Borja Marcos <borjam@sarenet.es> wrote:
>
> > On 29 Jul 2016, at 17:44, Jim Harris <jim.harris@gmail.com> wrote:
> >
> > On Fri, Jul 29, 2016 at 1:10 AM, Borja Marcos <borjam@sarenet.es> wrote:
> >
> > > On 28 Jul 2016, at 19:25, Jim Harris <jim.harris@gmail.com> wrote:
> > >
> > > Yes, you should worry.
> > >
> > > Normally we could use the dump_debug sysctls to help debug this - these
> > > sysctls will dump the NVMe I/O submission and completion queues. But in
> > > this case the LBA data is in the payload, not the NVMe submission
> > > entries, so dump_debug will not help as much as dumping the NVMe DSM
> > > payload directly.
> > >
> > > Could you try the attached patch and send output after recreating your
> > > pool?
> >
> > Just in case the evil anti-spam ate my answer, I sent the results to
> > your Gmail account.
> >
> > Thanks Borja.
> >
> > It looks like all of the TRIM commands are formatted properly. The
> > failures do not happen until about 10 seconds after the last TRIM to
> > each drive was submitted, and immediately before TRIMs start to the
> > next drive, so I'm assuming the failures are for the last few TRIM
> > commands, but I cannot say for sure. Could you apply patch v2
> > (attached), which will dump the TRIM payload contents inline with the
> > failure messages?
>
> Sure, this is the complete /var/log/messages starting with the system
> boot. Before booting I destroyed the pool so that you could capture
> what happens when booting, zpool create, etc.
>
> Remember that the drives are in LBA format #3 (4 KB blocks). As far as
> I know that's preferred to the old 512 byte blocks.
>
> Thank you very much and sorry about the belated response.
>
> Borja.
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

-- 
"Genius might be described as a supreme capacity for getting its
possessors into trouble of all kinds." -- Samuel Butler
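For context on the payload being dumped above: an NVMe Dataset Management
(deallocate/TRIM) command carries an array of 16-byte range descriptors,
each giving a starting LBA and a length in logical blocks, and the
namespace size (nsze) is reported in logical blocks of whatever LBA format
is active. The sketch below is illustrative only, not the FreeBSD nvme(4)
driver's actual definitions; the struct, function names, and the 400 GB
capacity figure are assumptions. It shows the failure mode Michael
describes: a block count carried over from the 512-byte format, or an
off-by-one on the last LBA, makes a TRIM range fall past the end of a
4K-formatted namespace.

```c
#include <stdbool.h>
#include <stdint.h>

/* One NVMe Dataset Management (DSM) range descriptor, per the NVMe
 * spec: 16 bytes, up to 256 ranges per deallocate (TRIM) command.
 * Field names here are illustrative. */
struct dsm_range {
    uint32_t attributes; /* context attributes */
    uint32_t nlb;        /* length in logical blocks */
    uint64_t slba;       /* starting LBA */
};

/* Number of logical blocks a namespace exposes at a given LBA size.
 * LBA format #3 (4096-byte blocks) yields 1/8 the count of the
 * 512-byte formats, so a count derived from the wrong format points
 * far past the real end of the namespace. */
static uint64_t
namespace_blocks(uint64_t capacity_bytes, uint32_t lba_size)
{
    return capacity_bytes / lba_size;
}

/* True if a TRIM range stays inside a namespace of nsze blocks.
 * A range whose last LBA is >= nsze is exactly the "TRIM/write of a
 * nonexistent block" error discussed in this thread. */
static bool
dsm_range_ok(const struct dsm_range *r, uint64_t nsze)
{
    if (r->nlb == 0 || r->slba >= nsze)
        return false;
    return r->nlb <= nsze - r->slba;
}
```

For a hypothetical 400 GB namespace in LBA format #3, namespace_blocks()
gives 97,656,250 blocks, so the last valid LBA is 97,656,249: a one-block
TRIM starting there passes the check, while a two-block TRIM does not.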