Date: Tue, 6 Oct 2015 11:03:43 -0700
From: Jim Harris <jimharris@freebsd.org>
To: Steven Hartland <killing@multiplay.co.uk>, Sean Kelly <smkelly@smkelly.org>
Cc: FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject: Re: Dell NVMe issues
Message-ID: <CAJP=Hc9-oQnk2r48OBXVCQbMDn0URDMDb80a0i0XvUDPuuLkrA@mail.gmail.com>
In-Reply-To: <5613FA02.2080205@multiplay.co.uk>
References: <BC5F191D-FEB2-4ADC-9D6B-240C80B2301C@smkelly.org>
 <5613FA02.2080205@multiplay.co.uk>

On Tue, Oct 6, 2015 at 9:42 AM, Steven Hartland <killing@multiplay.co.uk> wrote:

> Also looks like nvme exposes a timeout_period sysctl. You could try
> increasing that, as it could be too small for a full disk TRIM.
>
> Under CAM SCSI da support we have a delete_max which limits the max single
> request size for a delete. It may be we need something similar for nvme as
> well to prevent this, as it should still be chunking the deletes to ensure
> this sort of thing doesn't happen.

See attached. Sean - can you try this patch with TRIM re-enabled in ZFS?

I would also be curious whether TRIM passes without this patch if you
increase the timeout_period as suggested.

-Jim

> On 06/10/2015 16:18, Sean Kelly wrote:
>
>> Back in May, I posted about issues I was having with a Dell PE R630 with
>> 4x800GB NVMe SSDs. I would get kernel panics due to the inability to assign
>> all the interrupts because of
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321. Jim Harris
>> helped fix this issue so I bought several more of these servers, including
>> ones with 4x1.6TB drives…
>>
>> While the new servers with 4x800GB drives still work, the ones with
>> 4x1.6TB drives do not. When I do a
>>     zpool create tank mirror nvd0 nvd1 mirror nvd2 nvd3
>> the command never returns and the kernel logs:
>>     nvme0: resetting controller
>>     nvme0: controller ready did not become 0 within 2000 ms
>>
>> I’ve tried several different things trying to understand where the actual
>> problem is.
>>
>> WORKS: dd if=/dev/nvd0 of=/dev/null bs=1m
>> WORKS: dd if=/dev/zero of=/dev/nvd0 bs=1m
>> WORKS: newfs /dev/nvd0
>> FAILS: zpool create tank mirror nvd[01]
>> FAILS: gpart add -t freebsd-zfs nvd[01] && zpool create tank mirror nvd[01]p1
>> FAILS: gpart add -t freebsd-zfs -s 1400g nvd[01] && zpool create tank nvd[01]p1
>> WORKS: gpart add -t freebsd-zfs -s 800g nvd[01] && zpool create tank nvd[01]p1
>>
>> NOTE: The above commands are more about getting the point across, not
>> validity. I wiped the disk clean between gpart attempts and used GPT.
>>
>> So it seems like zpool works if I don’t cross past ~800GB. But other
>> things like dd and newfs work.
>>
>> When I get the kernel messages about the controller resetting and then
>> not responding, the NVMe subsystem hangs entirely. Since my boot disks are
>> not NVMe, the system continues to work but no more NVMe stuff can be done.
>> Further, attempting to reboot hangs and I have to do a power cycle.
>>
>> Any thoughts on what the deal may be here?
>>
>> 10.2-RELEASE-p5
>>
>> nvme0@pci0:132:0:0: class=0x010802 card=0x1f971028 chip=0xa820144d rev=0x03 hdr=0x00
>>     vendor   = 'Samsung Electronics Co Ltd'
>>     class    = mass storage
>>     subclass = NVM

Attachment: nvd.patch

diff --git a/sys/dev/nvd/nvd.c b/sys/dev/nvd/nvd.c
index d752832..3015e39 100644
--- a/sys/dev/nvd/nvd.c
+++ b/sys/dev/nvd/nvd.c
@@ -32,6 +32,7 @@ __FBSDID("$FreeBSD: releng/10.2/sys/dev/nvd/nvd.c 285919 2015-07-27 17:50:05Z ji
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/module.h>
+#include <sys/sysctl.h>
 #include <sys/systm.h>
 #include <sys/taskqueue.h>
 
@@ -85,6 +86,11 @@ struct nvd_controller {
 static TAILQ_HEAD(, nvd_controller)	ctrlr_head;
 static TAILQ_HEAD(disk_list, nvd_disk)	disk_head;
 
+static SYSCTL_NODE(_hw, OID_AUTO, nvd, CTLFLAG_RD, 0, "nvd driver parameters");
+static uint64_t nvd_delete_max = (4 * 1024 * 1024 * 1024);  /* 4GB */
+SYSCTL_UQUAD(_hw_nvd, OID_AUTO, delete_max, CTLFLAG_RWTUN, &nvd_delete_max, 0,
+	     "nvd maximum BIO_DELETE size");
+
 static int nvd_modevent(module_t mod, int type, void *arg)
 {
 	int error = 0;
@@ -279,6 +285,8 @@ nvd_new_disk(struct nvme_namespace *ns, void *ctrlr_arg)
 	disk->d_sectorsize = nvme_ns_get_sector_size(ns);
 	disk->d_mediasize = (off_t)nvme_ns_get_size(ns);
 	disk->d_delmaxsize = (off_t)nvme_ns_get_size(ns);
+	if (disk->d_delmaxsize > nvd_delete_max)
+		disk->d_delmaxsize = nvd_delete_max;
 
 	if (TAILQ_EMPTY(&disk_head))
 		disk->d_unit = 0;
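
For readers following the thread, the effect of the patch can be sketched in userland C. This is only an illustration, not driver code: clamp_delmaxsize() mirrors the two lines the patch adds to nvd_new_disk(), while count_delete_chunks() and DELETE_MAX_DEFAULT are hypothetical names introduced here, and the sketch assumes the layers above the driver split a large delete into requests no bigger than d_delmaxsize. The constant is written with a ULL suffix so the multiplication is done in 64-bit arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the hw.nvd.delete_max tunable (4 GiB).
 * The ULL suffix keeps the multiplication in 64-bit arithmetic.
 */
#define DELETE_MAX_DEFAULT	(4ULL * 1024 * 1024 * 1024)

/* Mirrors the clamp the patch adds: never advertise more than delete_max. */
static uint64_t
clamp_delmaxsize(uint64_t ns_size, uint64_t delete_max)
{
	return (ns_size > delete_max ? delete_max : ns_size);
}

/*
 * Hypothetical helper: count how many requests are needed to delete
 * `length` bytes when no single request may exceed `delmaxsize`.
 */
static uint64_t
count_delete_chunks(uint64_t length, uint64_t delmaxsize)
{
	uint64_t chunks = 0;

	assert(delmaxsize > 0);
	while (length > 0) {
		/* Each request covers at most delmaxsize bytes. */
		uint64_t n = length < delmaxsize ? length : delmaxsize;

		length -= n;
		chunks++;
	}
	return (chunks);
}
```

Under these assumptions, a whole-device delete on a 1.6 TB namespace would be issued as a few hundred 4 GiB requests rather than as one request covering the entire device, so each individual command has far less work to finish inside the controller timeout.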