From: Warner Losh
Date: Thu, 3 May 2018 22:33:53 -0600
Subject: Re: nvme0: async event occurred (log page id=0x2)
To: Craig Leres
Cc: "freebsd-hackers@freebsd.org"
In-Reply-To: <8b1eadc2-8c9d-3f11-b877-b9a0a57512ec@freebsd.org>
References: <960be682-9991-f8c6-0253-7d6f782d4cbe@freebsd.org> <8b1eadc2-8c9d-3f11-b877-b9a0a57512ec@freebsd.org>

On Thu, May 3, 2018 at 10:28 PM, Craig Leres wrote:

> On 5/3/2018 9:07 PM, Warner Losh wrote:
> > Async events are 'something went wrong' messages. Log page 2 is the
> > smart log page.
> >
> > what does 'nvmecontrol logpage -p 2 nvme0' tell you right after this
> > happens. My guess is that it's overheating.
>
> Interesting. I try to run smartd anywhere it's supported and have
> appended the last few entries before things went sideways; 60° C/140° F
> is a bit toasty!
>
> This system is a couple of years old, might be time to blow the dust out
> with compressed air and see if the bios has more aggressive fan settings.
>
> Is the Raw_Read_Error_Rate changed a problem?
>
> (Thanks!)
>
> Craig
>
> May 3 13:59:22 tiny smartd[770]: Device: /dev/ada0, SMART Usage
> Attribute: 190 Airflow_Temperature_Cel changed from 59 to 60
> May 3 13:59:22 tiny smartd[770]: Device: /dev/ada0, SMART Usage
> Attribute: 194 Temperature_Celsius changed from 41 to 40
> May 3 14:59:23 tiny smartd[770]: Device: /dev/ada0, SMART Usage
> Attribute: 190 Airflow_Temperature_Cel changed from 60 to 58
> May 3 14:59:23 tiny smartd[770]: Device: /dev/ada0, SMART Usage
> Attribute: 194 Temperature_Celsius changed from 40 to 42
> May 3 17:29:23 tiny smartd[770]: Device: /dev/ada0, SMART Prefailure
> Attribute: 1 Raw_Read_Error_Rate changed from 75 to 76
>

Things are getting hot, and there was a recoverable error (recoverable
since you didn't report a read error, though you could also check log
page 1 for any errors). Chances are the controller shut down completely,
though the few data points you've given aren't enough for me to be sure.

Warner
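
For anyone who wants to line up smartd entries like the ones quoted above against the time of an async event, here is a rough Python sketch that pulls the "changed from X to Y" records out of wrapped, ">"-quoted log text. The names parse_changes, CHANGE_RE, QUOTE_RE and SAMPLE are made up for illustration and are not part of smartd or nvmecontrol. The pattern is only an assumption based on the five lines quoted in this thread. The values are returned as logged, with no attempt to interpret them.

```python
import re

# Pattern assumed from the smartd lines quoted in this thread, e.g.
# "Device: /dev/ada0, SMART Usage Attribute: 190 Airflow_Temperature_Cel changed from 59 to 60"
CHANGE_RE = re.compile(
    r"Device: (?P<dev>[^,\s]+), SMART (?P<kind>Usage|Prefailure) Attribute: "
    r"(?P<attr_id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)

# Leading email quote markers ("> " or "> > ") at the start of a line.
QUOTE_RE = re.compile(r"^(?:>\s?)+")


def parse_changes(text):
    """Return (device, kind, attr_id, name, old, new) tuples found in text.

    Quote markers are stripped and wrapped lines are rejoined first,
    because mail clients split each syslog entry across two lines.
    """
    lines = (QUOTE_RE.sub("", line) for line in text.splitlines())
    joined = " ".join(" ".join(lines).split())
    return [
        (m["dev"], m["kind"], int(m["attr_id"]), m["name"],
         int(m["old"]), int(m["new"]))
        for m in CHANGE_RE.finditer(joined)
    ]


# Two of the entries quoted above, wrapped the way they arrived in the mail.
SAMPLE = """\
> May 3 13:59:22 tiny smartd[770]: Device: /dev/ada0, SMART Usage
> Attribute: 190 Airflow_Temperature_Cel changed from 59 to 60
> May 3 17:29:23 tiny smartd[770]: Device: /dev/ada0, SMART Prefailure
> Attribute: 1 Raw_Read_Error_Rate changed from 75 to 76
"""

if __name__ == "__main__":
    for change in parse_changes(SAMPLE):
        print(change)
```

Note that these entries are for /dev/ada0. The nvme0 health data would still have to come from 'nvmecontrol logpage -p 2 nvme0' as suggested earlier in the thread.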