From: Dan Langille <dan@langille.org>
To: freebsd-questions@freebsd.org
Date: Wed, 04 Aug 2021 13:08:31 -0400
Subject: nvme detached

Yesterday I had an NVMe stick detach. This degraded a zpool, but zpool status indicated the device was still online. Yet it was not visible in /dev/.

More details are at https://gist.github.com/dlangille/bc8af0f5a098d3a106fa5fbf40a88d42

I first noticed the issue when multiple ssh sessions froze up. Then Nagios started alerting.

A reboot cleared this up. Scrubs did not find any errors.

The /var/log/messages entries are below.

Thank you.

Aug 3 15:06:02 knew kernel: nvme0: Resetting controller due to a timeout.
Aug 3 15:06:02 knew kernel: nvme0: resetting controller
Aug 3 15:06:32 knew kernel: nvme0: controller ready did not become 0 within 30500 ms
Aug 3 15:06:32 knew kernel: nvme0: failing queued i/o
Aug 3 15:06:32 knew kernel: nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
Aug 3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
Aug 3 15:06:32 knew kernel: nvme0: failing outstanding i/o
Aug 3 15:06:32 knew kernel: nvme0: READ sqid:2 cid:123 nsid:1 lba:250153507 len:5
Aug 3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:2 cid:123 cdw0:0
Aug 3 15:06:32 knew kernel: nvme0: failing outstanding i/o
Aug 3 15:06:32 knew kernel: nvme0: WRITE sqid:3 cid:118 nsid:1 lba:454009346 len:1
Aug 3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:3 cid:118 cdw0:0
Aug 3 15:06:32 knew kernel: nvme0: failing outstanding i/o
Aug 3 15:06:32 knew kernel: nvme0: WRITE sqid:4 cid:122 nsid:1 lba:454009345 len:1
Aug 3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:122 cdw0:0
Aug 3 15:06:32 knew kernel: nvd0: detached

--
Dan Langille
dan@langille.org