From: Grant Gray
To: "M. Casper Lewis"
Cc: freebsd-fs@freebsd.org
Date: Fri, 31 Aug 2018 10:39:59 +1000 (AEST)
Subject: Re: Failing ZFS log devices/panic
Message-ID: <707525919.257415.1535675999891.JavaMail.zimbra@grantgray.id.au>
In-Reply-To: <20180831003436.GW1473@genomecenter.ucdavis.edu>
References: <20180831003436.GW1473@genomecenter.ucdavis.edu>

SAS? SATA? Make/model? HBA? Are you mixing SAS and SATA on the same bus?
Pool configuration?

----- On 31 Aug, 2018, at 10:34 AM, M. Casper Lewis mclewis@genomecenter.ucdavis.edu wrote:

> Greetings,
>
> We are having an issue with stability problems on one of our ZFS fileservers.
> The system will run fine for a few days, but gradually report the log
> devices as failing, and then eventually panic. After several rounds
> of this, we finally removed the log devices and the machine has not
> panicked since.
>
> We have tried several different types of SSD (both datacenter and non) and
> the issue happens with all of them. When queried with the vendor tools, the
> drives all report themselves healthy, and after a reboot they all report
> healthy as well.
>
> The same SSDs are serving as cache devices without issue.
>
> This is FreeBSD 11.2-RELEASE-p2 #2 r337991
>
> Here is a backtrace:
>
> KDB: stack backtrace:
> #0 0xffffffff80b3d3c7 at kdb_backtrace+0x67
> #1 0xffffffff80af6a37 at vpanic+0x177
> #2 0xffffffff80af68b3 at panic+0x43
> #3 0xffffffff80deabea at vm_fault_hold+0x244a
> #4 0xffffffff80de8755 at vm_fault+0x75
> #5 0xffffffff80f7810c at trap_pfault+0x14c
> #6 0xffffffff80f777d7 at trap+0x2c7
> #7 0xffffffff80f5740c at calltrap+0x8
> #8 0xffffffff823442b9 at zfs_log_write+0x169
> #9 0xffffffff82350a30 at zfs_freebsd_write+0xb50
> #10 0xffffffff810faea3 at VOP_WRITE_APV+0x103
> #11 0xffffffff80a32ffb at nfsvno_write+0x12b
> #12 0xffffffff80a2af45 at nfsrvd_write+0x4a5
> #13 0xffffffff80a1866b at nfsrvd_dorpc+0x11bb
> #14 0xffffffff80a287e7 at nfssvc_program+0x557
> #15 0xffffffff80d6bcd9 at svc_run_internal+0xe09
> #16 0xffffffff80d6c18b at svc_thread_start+0xb
> #17 0xffffffff80aba073 at fork_exit+0x83
>
> Any suggestions on what to try next? We are at a loss as to why the devices
> are being marked failed when they clearly are not.
>
> --
> M. Casper Lewis | mclewis@ucdavis.edu
> Systems Administrator | Voice: (530) 754-7978
> Genome Center |
> University of California, Davis |