From: Warner Losh
Date: Thu, 30 Apr 2020 12:06:50 -0600
Subject: Re: nvme0 error
To: Stefan Bethke
Cc: freebsd-stable
In-Reply-To: <636DB3B3-E4C7-4A17-AB79-8AFDC6352712@lassitu.de>

On Thu, Apr 30, 2020 at 11:48 AM Stefan Bethke wrote:

> nvme0: async event occurred (type 0x1, info 0x00, page 0x02)
> nvme0: device reliability degraded
>

type 1: SMART event
info 0: reliability error
page 2: look at what's up here

The NVMe 1.4 standard says:

NVM subsystem Reliability: NVM subsystem reliability has been compromised.
This may be due to significant media errors, an internal error, the media
being placed in read only mode, or a volatile memory backup device failing.
This status value shall not be used if the read-only condition on the media
is due to a change in the write protection state of a namespace (refer to
section 8.19.1).

> Should I be concerned? I'm using this Samsung SSD as cache and log for ZFS
> on a 12-stable machine.
>
> nvd0: NVMe namespace
> nvd0: 122104MB (250069680 512 byte sectors)
>
> # nvmecontrol logpage -p 2 nvme0
> SMART/Health Information Log
> ============================
> Critical Warning State: 0x04
>  Available spare: 0
>  Temperature: 0
>  Device reliability: 1
>  Read only: 0
>  Volatile memory backup: 0
> Temperature: 311 K, 37.85 C, 100.13 F
> Available spare: 100
> Available spare threshold: 10
> Percentage used: 110
> Data units (512,000 byte) read: 18417596
> Data units written: 164091845
> Host read commands: 499986873
> Host write commands: 1491808067
> Controller busy time (minutes): 48315
> Power cycles: 59
> Power on hours: 20432
> Unsafe shutdowns: 26
> Media errors: 0
> No. error info log entries: 22
> Warning Temp Composite Time: 0
> Error Temp Composite Time: 0
> Temperature Sensor 1: 311 K, 37.85 C, 100.13 F
> Temperature Sensor 2: 330 K, 56.85 C, 134.33 F
> Temperature 1 Transition Count: 0
> Temperature 2 Transition Count: 0
> Total Time For Temperature 1: 0
> Total Time For Temperature 2: 0
>

I'm thinking the percentage used of 110 may be the thing it's alerting on.
The standard says:

Percentage Used: Contains a vendor specific estimate of the percentage of
NVM subsystem life used based on the actual usage and the manufacturer’s
prediction of NVM life. A value of 100 indicates that the estimated
endurance of the NVM in the NVM subsystem has been consumed, but may not
indicate an NVM subsystem failure. The value is allowed to exceed 100.
Percentages greater than 254 shall be represented as 255. This value shall
be updated once per power-on hour (when the controller is not in a sleep
state). Refer to the JEDEC JESD218A standard for SSD device life and
endurance measurement techniques.

Warner

>
> Stefan
>
> --
> Stefan Bethke Fon +49 151 14070811
>
>
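
As a rough cross-check of the numbers quoted above, here is a minimal Python sketch that decodes the Critical Warning byte and converts the data-unit counters to bytes. The function and table names are made up for illustration and are not part of nvmecontrol. The bit order is assumed to match the order of the five flags nvmecontrol prints under "Critical Warning State", which agrees with 0x04 showing up as "Device reliability: 1" in the log.

```python
# Hypothetical helper names; this is a sketch, not nvmecontrol's implementation.

# Assumed bit positions of the Critical Warning byte, in the order the
# flags appear in the nvmecontrol output quoted in this thread.
CRITICAL_WARNING_BITS = {
    0: "Available spare",
    1: "Temperature",
    2: "Device reliability",
    3: "Read only",
    4: "Volatile memory backup",
}

# Size of one data unit, taken from the "(512,000 byte)" label in the log.
DATA_UNIT_BYTES = 512_000


def decode_critical_warning(value):
    """Return the names of the warning flags that are set in the byte."""
    return [
        name
        for bit, name in CRITICAL_WARNING_BITS.items()
        if value & (1 << bit)
    ]


def data_units_to_bytes(units):
    """Convert a 'Data units' counter to bytes."""
    return units * DATA_UNIT_BYTES


def endurance_consumed(percentage_used):
    """True once the vendor's endurance estimate is used up (100 or more)."""
    return percentage_used >= 100


if __name__ == "__main__":
    print(decode_critical_warning(0x04))   # ['Device reliability']
    print(data_units_to_bytes(164091845))  # 84015024640000 bytes written
    print(endurance_consumed(110))         # True
```

With the values from the log, that works out to roughly 84 TB written to a drive reporting 110 percent of its estimated endurance used, with only the device reliability flag set and zero media errors.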