From: Jim Harris
To: Steven Hartland, Sean Kelly
Cc: FreeBSD-STABLE Mailing List
Subject: Re: Dell NVMe issues
Date: Tue, 6 Oct 2015 11:03:43 -0700
In-Reply-To: <5613FA02.2080205@multiplay.co.uk>
References: <5613FA02.2080205@multiplay.co.uk>
List-Id: Production branch of FreeBSD source code
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=089e0115f3c453352f05217373e9

--089e0115f3c453352f05217373e9
Content-Type: text/plain; charset=UTF-8

On Tue, Oct 6, 2015 at 9:42 AM, Steven Hartland wrote:

> Also looks like nvme exposes a timeout_period sysctl; you could try
> increasing that, as it could be too small for a full disk TRIM.
>
> Under CAM SCSI da support we have a delete_max which limits the max single
> request size for a delete; it may be we need something similar for nvme as
> well to prevent this, as it should still be chunking the deletes to ensure
> this sort of thing doesn't happen.

See attached. Sean - can you try this patch with TRIM re-enabled in ZFS?

I would also be curious whether TRIM passes without this patch if you
increase the timeout_period as suggested.

-Jim

>
>
> On 06/10/2015 16:18, Sean Kelly wrote:
>
>> Back in May, I posted about issues I was having with a Dell PE R630 with
>> 4x800GB NVMe SSDs. I would get kernel panics due to the inability to assign
>> all the interrupts because of
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=199321. Jim Harris
>> helped fix this issue so I bought several more of these servers, including
>> ones with 4x1.6TB drives…
>>
>> While the new servers with 4x800GB drives still work, the ones with
>> 4x1.6TB drives do not.
>> When I do a
>> zpool create tank mirror nvd0 nvd1 mirror nvd2 nvd3
>> the command never returns and the kernel logs:
>> nvme0: resetting controller
>> nvme0: controller ready did not become 0 within 2000 ms
>>
>> I’ve tried several different things trying to understand where the actual
>> problem is.
>> WORKS: dd if=/dev/nvd0 of=/dev/null bs=1m
>> WORKS: dd if=/dev/zero of=/dev/nvd0 bs=1m
>> WORKS: newfs /dev/nvd0
>> FAILS: zpool create tank mirror nvd[01]
>> FAILS: gpart add -t freebsd-zfs nvd[01] && zpool create tank mirror nvd[01]p1
>> FAILS: gpart add -t freebsd-zfs -s 1400g nvd[01] && zpool create tank nvd[01]p1
>> WORKS: gpart add -t freebsd-zfs -s 800g nvd[01] && zpool create tank nvd[01]p1
>>
>> NOTE: The above commands are more about getting the point across, not
>> validity. I wiped the disk clean between gpart attempts and used GPT.
>>
>> So it seems like zpool works if I don’t cross past ~800GB. But other
>> things like dd and newfs work.
>>
>> When I get the kernel messages about the controller resetting and then
>> not responding, the NVMe subsystem hangs entirely. Since my boot disks are
>> not NVMe, the system continues to work but no more NVMe stuff can be done.
>> Further, attempting to reboot hangs and I have to do a power cycle.
>>
>> Any thoughts on what the deal may be here?
>>
>> 10.2-RELEASE-p5
>>
>> nvme0@pci0:132:0:0: class=0x010802 card=0x1f971028 chip=0xa820144d rev=0x03 hdr=0x00
>>     vendor     = 'Samsung Electronics Co Ltd'
>>     class      = mass storage
>>     subclass   = NVM
>>

--089e0115f3c453352f05217373e9
Content-Type: application/octet-stream; name="nvd.patch"
Content-Disposition: attachment; filename="nvd.patch"

diff --git a/sys/dev/nvd/nvd.c b/sys/dev/nvd/nvd.c
index d752832..3015e39 100644
--- a/sys/dev/nvd/nvd.c
+++ b/sys/dev/nvd/nvd.c
@@ -32,6 +32,7 @@ __FBSDID("$FreeBSD: releng/10.2/sys/dev/nvd/nvd.c 285919 2015-07-27 17:50:05Z ji
 #include <sys/kernel.h>
 #include <sys/malloc.h>
 #include <sys/module.h>
+#include <sys/sysctl.h>
 #include <sys/systm.h>
 #include <sys/taskqueue.h>
 
@@ -85,6 +86,11 @@ struct nvd_controller {
 static TAILQ_HEAD(, nvd_controller)	ctrlr_head;
 static TAILQ_HEAD(disk_list, nvd_disk)	disk_head;
 
+static SYSCTL_NODE(_hw, OID_AUTO, nvd, CTLFLAG_RD, 0, "nvd driver parameters");
+static uint64_t nvd_delete_max = (4 * 1024 * 1024 * 1024);  /* 4GB */
+SYSCTL_UQUAD(_hw_nvd, OID_AUTO, delete_max, CTLFLAG_RWTUN, &nvd_delete_max, 0,
+	     "nvd maximum BIO_DELETE size");
+
 static int nvd_modevent(module_t mod, int type, void *arg)
 {
 	int error = 0;
@@ -279,6 +285,8 @@ nvd_new_disk(struct nvme_namespace *ns, void *ctrlr_arg)
 	disk->d_sectorsize = nvme_ns_get_sector_size(ns);
 	disk->d_mediasize = (off_t)nvme_ns_get_size(ns);
 	disk->d_delmaxsize = (off_t)nvme_ns_get_size(ns);
+	if (disk->d_delmaxsize > nvd_delete_max)
+		disk->d_delmaxsize = nvd_delete_max;
 
 	if (TAILQ_EMPTY(&disk_head))
 		disk->d_unit = 0;

--089e0115f3c453352f05217373e9--
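
For illustration only, the chunking that a delete_max-style cap implies can be sketched in plain C. This is a minimal user-space sketch, not the nvd(4) or GEOM code: the names chunked_delete, delete_fn, count_bytes and DELETE_MAX are hypothetical and exist only for this example. The cap is written with UINT64_C so that the multiplication is carried out in 64 bits; with plain int operands the product 4 * 1024 * 1024 * 1024 does not fit in a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4 GiB cap, computed in 64-bit arithmetic. */
static const uint64_t DELETE_MAX = UINT64_C(4) * 1024 * 1024 * 1024;

/* Hypothetical callback: issues one delete for [offset, offset + length). */
typedef void (*delete_fn)(uint64_t offset, uint64_t length, void *ctx);

/*
 * Split one large delete into pieces of at most max_chunk bytes and
 * hand each piece to the callback. Returns the number of pieces issued.
 */
static unsigned
chunked_delete(uint64_t offset, uint64_t length, uint64_t max_chunk,
    delete_fn issue, void *ctx)
{
    unsigned pieces = 0;

    assert(max_chunk > 0);
    while (length > 0) {
        /* Each piece is the smaller of what is left and the cap. */
        uint64_t n = length < max_chunk ? length : max_chunk;

        issue(offset, n, ctx);
        offset += n;
        length -= n;
        pieces++;
    }
    return pieces;
}

/* Example callback: only adds up the bytes it was asked to delete. */
static void
count_bytes(uint64_t offset, uint64_t length, void *ctx)
{
    (void)offset;
    *(uint64_t *)ctx += length;
}
```

Called as chunked_delete(0, size, DELETE_MAX, count_bytes, &total), a nominal 1.6 x 10^12-byte range is issued as 373 separate requests instead of one, so no single request has to complete a whole-disk TRIM within the controller timeout.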