Date: Tue, 6 Oct 2015 11:03:43 -0700
From: Jim Harris <jimharris@freebsd.org>
To: Steven Hartland <killing@multiplay.co.uk>, Sean Kelly <smkelly@smkelly.org>
Cc: FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject: Re: Dell NVMe issues
Message-ID: <CAJP=Hc9-oQnk2r48OBXVCQbMDn0URDMDb80a0i0XvUDPuuLkrA@mail.gmail.com>
In-Reply-To: <5613FA02.2080205@multiplay.co.uk>
References: <BC5F191D-FEB2-4ADC-9D6B-240C80B2301C@smkelly.org> <5613FA02.2080205@multiplay.co.uk>
[-- Attachment #1 --]

On Tue, Oct 6, 2015 at 9:42 AM, Steven Hartland <killing@multiplay.co.uk> wrote:
> Also, it looks like nvme exposes a timeout_period sysctl; you could try
> increasing that, as it could be too small for a full-disk TRIM.
>
> Under CAM SCSI da support we have a delete_max which limits the maximum
> size of a single delete request. We may need something similar for nvme
> as well to prevent this, since the driver should still be chunking the
> deletes to ensure this sort of thing doesn't happen.

See attached.

Sean - can you try this patch with TRIM re-enabled in ZFS?

I would also be curious whether TRIM passes without this patch if you
increase the timeout_period as suggested.

-Jim

>
> On 06/10/2015 16:18, Sean Kelly wrote:
>
>> Back in May, I posted about issues I was having with a Dell PE R630
>> with 4x800GB NVMe SSDs. I would get kernel panics due to the inability
>> to assign all the interrupts because of
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321. Jim Harris
>> helped fix this issue, so I bought several more of these servers,
>> including ones with 4x1.6TB drives…
>>
>> While the new servers with 4x800GB drives still work, the ones with
>> 4x1.6TB drives do not. When I do a
>>     zpool create tank mirror nvd0 nvd1 mirror nvd2 nvd3
>> the command never returns and the kernel logs:
>>     nvme0: resetting controller
>>     nvme0: controller ready did not become 0 within 2000 ms
>>
>> I’ve tried several different things trying to understand where the
>> actual problem is.
>> WORKS: dd if=/dev/nvd0 of=/dev/null bs=1m
>> WORKS: dd if=/dev/zero of=/dev/nvd0 bs=1m
>> WORKS: newfs /dev/nvd0
>> FAILS: zpool create tank mirror nvd[01]
>> FAILS: gpart add -t freebsd-zfs nvd[01] && zpool create tank mirror nvd[01]p1
>> FAILS: gpart add -t freebsd-zfs -s 1400g nvd[01] && zpool create tank nvd[01]p1
>> WORKS: gpart add -t freebsd-zfs -s 800g nvd[01] && zpool create tank nvd[01]p1
>>
>> NOTE: The above commands are more about getting the point across than
>> strict validity. I wiped the disks clean between gpart attempts and
>> used GPT.
>>
>> So it seems like zpool create fails once I cross past ~800GB, while
>> other things like dd and newfs work regardless of size.
>>
>> When I get the kernel messages about the controller resetting and then
>> not responding, the NVMe subsystem hangs entirely. Since my boot disks
>> are not NVMe, the system continues to work, but no more NVMe operations
>> are possible. Further, attempting to reboot hangs and I have to power
>> cycle the machine.
>>
>> Any thoughts on what the deal may be here?
>>
>> 10.2-RELEASE-p5
>>
>> nvme0@pci0:132:0:0: class=0x010802 card=0x1f971028 chip=0xa820144d rev=0x03 hdr=0x00
>>     vendor     = 'Samsung Electronics Co Ltd'
>>     class      = mass storage
>>     subclass   = NVM
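[Editor's note: a minimal sketch of the timeout experiment suggested above,
assuming the hw.nvme.timeout_period loader tunable and the per-controller
dev.nvme.X.timeout_period sysctl from the stock nvme(4) driver; the exact
names, and whether the sysctl is writable at runtime on 10.2, should be
verified. The 120-second value is an arbitrary example, not a
recommendation from the thread.]

    # Inspect the current I/O timeout (in seconds) on the first controller.
    sysctl dev.nvme.0.timeout_period

    # Raise it at boot via the loader tunable, then reboot; this also
    # covers kernels where the sysctl is read-only.
    echo 'hw.nvme.timeout_period=120' >> /boot/loader.conf

    # With TRIM re-enabled in ZFS, retry the pool creation that hung.
    zpool create tank mirror nvd0 nvd1 mirror nvd2 nvd3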
[-- Attachment #2 --]

diff --git a/sys/dev/nvd/nvd.c b/sys/dev/nvd/nvd.c
index d752832..3015e39 100644
--- a/sys/dev/nvd/nvd.c
+++ b/sys/dev/nvd/nvd.c
@@ -32,6 +32,7 @@ __FBSDID("$FreeBSD: releng/10.2/sys/dev/nvd/nvd.c 285919 2015-07-27 17:50:05Z ji
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/module.h>
+#include <sys/sysctl.h>
 #include <sys/systm.h>
 #include <sys/taskqueue.h>
 
@@ -85,6 +86,11 @@ struct nvd_controller {
 static TAILQ_HEAD(, nvd_controller)	ctrlr_head;
 static TAILQ_HEAD(disk_list, nvd_disk)	disk_head;
 
+static SYSCTL_NODE(_hw, OID_AUTO, nvd, CTLFLAG_RD, 0, "nvd driver parameters");
+static uint64_t nvd_delete_max = (4ULL * 1024 * 1024 * 1024);	/* 4GB */
+SYSCTL_UQUAD(_hw_nvd, OID_AUTO, delete_max, CTLFLAG_RWTUN, &nvd_delete_max, 0,
+    "nvd maximum BIO_DELETE size");
+
 static int nvd_modevent(module_t mod, int type, void *arg)
 {
 	int error = 0;
@@ -279,6 +285,8 @@ nvd_new_disk(struct nvme_namespace *ns, void *ctrlr_arg)
 	disk->d_sectorsize = nvme_ns_get_sector_size(ns);
 	disk->d_mediasize = (off_t)nvme_ns_get_size(ns);
 	disk->d_delmaxsize = (off_t)nvme_ns_get_size(ns);
+	if (disk->d_delmaxsize > nvd_delete_max)
+		disk->d_delmaxsize = nvd_delete_max;
 
 	if (TAILQ_EMPTY(&disk_head))
 		disk->d_unit = 0;
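[Editor's note: the constant here is written as 4ULL * 1024 * 1024 * 1024;
as plain int arithmetic, 4 * 1024 * 1024 * 1024 overflows a 32-bit int.
Since the patch registers hw.nvd.delete_max with CTLFLAG_RWTUN, it can be
set at runtime or from /boot/loader.conf; note that the value is consulted
only in nvd_new_disk(), so a runtime change affects only namespaces
attached afterwards, while the loader tunable covers disks present at
boot. Below is a sketch of how the patched driver might be exercised,
assuming TRIM was previously disabled via the vfs.zfs.trim.enabled
tunable; the 1GB cap is an arbitrary example, not a value from the
thread.]

    # Show the compiled-in default (4GB), then lower the cap to 1GB for
    # the next boot.
    sysctl hw.nvd.delete_max
    echo 'hw.nvd.delete_max=1073741824' >> /boot/loader.conf

    # Re-enable ZFS TRIM and repeat the failing case from the report.
    echo 'vfs.zfs.trim.enabled=1' >> /boot/loader.conf
    zpool create tank mirror nvd0 nvd1 mirror nvd2 nvd3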
