From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 211852] Unsafe shutdowns on Intel 750 SSD
Date: Sat, 21 Jan 2017 03:15:08 +0000
X-Bugzilla-Product: Base System
X-Bugzilla-Component: kern
X-Bugzilla-Version: 10.3-RELEASE
X-Bugzilla-Severity: Affects Some People
X-Bugzilla-Who: rpokala@panasas.com
X-Bugzilla-Status: New

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211852

--- Comment #2 from Ravi Pokala ---
Neat, I didn't know `smartctl' had been extended to understand NVMe! :-)

In any case, the code for handling power-down looks grossly correct:

sys/dev/nvme/nvme_ctrlr.c (r308431)

    void
    nvme_ctrlr_shutdown(struct nvme_controller *ctrlr)
    {
            union cc_register       cc;
            union csts_register     csts;
            int                     ticks = 0;

            cc.raw = nvme_mmio_read_4(ctrlr, cc);
            cc.bits.shn = NVME_SHN_NORMAL;
            nvme_mmio_write_4(ctrlr, cc, cc.raw);
            csts.raw = nvme_mmio_read_4(ctrlr, csts);
            while ((csts.bits.shst != NVME_SHST_COMPLETE) && (ticks++ < 5*hz)) {
                    pause("nvme shn", 1);
                    csts.raw = nvme_mmio_read_4(ctrlr, csts);
            }
            if (csts.bits.shst != NVME_SHST_COMPLETE)
                    nvme_printf(ctrlr, "did not complete shutdown within 5 seconds "
                        "of notification\n");
    }

In English, that's roughly: notify the controller about a normal shutdown (as
opposed to an "abrupt" shutdown), then wait until the controller status
indicates that shutdown is complete; if the controller doesn't indicate
complete shutdown within 5 seconds, print a log message and continue anyway.

It has been in that state since r254302 (2013-08-13). (That's in -HEAD, but the
same code is in 10.3-RELEASE.)

Hmmm... In NVMe-1.2.1, section 7.6.2:

"It is recommended that the host wait a minimum of the RTD3 Entry Latency
reported in the Identify Controller data structure for the shutdown operations
to complete; if the value reported in RTD3 Entry Latency is 0h, then the host
should wait for a minimum of one second."

The "RTD3 Entry Latency" is described in section 5.11, Figure 90:

"Bytes 91:88: RTD3 Entry Latency (RTD3E): This field indicates the typical
latency in microseconds to enter Runtime D3 (RTD3). Refer to section 8.4.4 for
test conditions. A value of 0h indicates RTD3 Entry Latency is not reported."

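The recommendation quoted from section 7.6.2 can be sketched as a small helper that turns RTD3E into a tick budget for the polling loop. This is only an illustration under stated assumptions: `shutdown_wait_ticks` and `SHUTDOWN_FALLBACK_US` are hypothetical names, not anything in the nvme(4) driver, and `hz` is passed in as a parameter rather than read from the kernel global.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the FreeBSD nvme(4) driver.
 * The spec text quoted above recommends waiting at least one second
 * when RTD3E is reported as 0h.
 */
#define SHUTDOWN_FALLBACK_US 1000000u

/*
 * Convert an RTD3 Entry Latency value (microseconds) into a number of
 * clock ticks to wait, given the tick rate in ticks per second.
 */
static uint64_t
shutdown_wait_ticks(uint32_t rtd3e_us, uint32_t hz)
{
        uint64_t wait_us;

        /* 0h means "not reported": fall back to the one-second minimum. */
        wait_us = (rtd3e_us == 0) ? SHUTDOWN_FALLBACK_US : rtd3e_us;

        /* Round up so the wait is never shorter than the reported latency. */
        return ((wait_us * hz + 999999u) / 1000000u);
}
```

In the loop quoted from nvme_ctrlr_shutdown(), a value like this would stand in for the `5*hz` bound. Whether the driver should also clamp it to some minimum or maximum is a separate question.
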
So, that hard-coded 5 seconds might not be correct.

It looks like (struct nvme_controller_data) treats the part of the "Identify
Controller" data structure which contains RTD3E as reserved. It looks like it
was in fact reserved in NVMe-1.1, but was defined later.
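
For illustration, pulling RTD3E out of a raw Identify Controller buffer might look like the sketch below. It assumes the 4096-byte Identify data layout and the little-endian field encoding that NVMe uses. `identify_rtd3e_us` and the macro names are hypothetical, and the real driver would more likely add a named field to struct nvme_controller_data than index raw bytes.

```c
#include <stdint.h>

/* Hypothetical names for this sketch; not taken from the nvme(4) driver. */
#define IDENTIFY_DATA_SIZE      4096    /* assumed size of the Identify buffer */
#define RTD3E_OFFSET            88      /* bytes 91:88, per the quoted Figure 90 */

/*
 * Read the RTD3 Entry Latency (microseconds) from a raw Identify
 * Controller buffer of IDENTIFY_DATA_SIZE bytes. The field is assembled
 * byte by byte as little-endian, so the result does not depend on host
 * endianness. A return value of 0 means the latency is not reported.
 */
static uint32_t
identify_rtd3e_us(const uint8_t *cdata)
{
        return ((uint32_t)cdata[RTD3E_OFFSET] |
            ((uint32_t)cdata[RTD3E_OFFSET + 1] << 8) |
            ((uint32_t)cdata[RTD3E_OFFSET + 2] << 16) |
            ((uint32_t)cdata[RTD3E_OFFSET + 3] << 24));
}
```
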