Date: Mon, 9 Sep 2019 13:45:52 +0200
From: Albert Shih
To: Julien Cigar
Cc: freebsd-questions@freebsd.org
Subject: Re: Verry serious problem with ZFS & 12.0
Message-ID: <20190909114552.GD13411@io.chezmoi.fr>
References: <20190828224547.GA1557@io.chezmoi.fr> <20190829083727.GC38457@home.lan>
In-Reply-To: <20190829083727.GC38457@home.lan>

On 29/08/2019 at 10:37:28+0200, Julien Cigar wrote:
> On Thu, Aug 29, 2019 at 12:45:47AM +0200, Albert Shih wrote:
> > Hi
> >
> > After updating 4 servers from 11.2 to 12.0 without any problem, I waited a few
> > weeks to see if everything worked well, and it did. I then upgraded my mail
> > server.
> >
> > During the upgrade I also upgraded all the firmware for the hardware.
> >
> > And now I have a very serious issue with my server.
> >
> > Configuration:
> >
> > Dell PowerEdge R740Xd with H730P, 192 GB RAM, 2 mechanical SAS disks for the system,
> > 2 SSDs (in a ZFS pool) for the mail index (cyrus), and 28 mechanical disks
> > (in a second ZFS pool) for the mailboxes.
> >
> > The problem:
> >
> > After running for a few days, the ZFS pool with the 2 SSDs stops responding.
> >
> > The system itself keeps working perfectly.
> >
> > The second zpool (mechanical disks) keeps working perfectly.
> >
> > I get zero logs, zero messages on the console or in dmesg.
> >
> > The arc_size is correct, it's around 70-75 %.
> >
> > The moment the ZFS pool stops responding is random, not related to
> > any activity (human or cron).
> >
> > The only options I pass to the kernel related to ZFS are vfs.zfs.min_auto_ashift=12 and
> > vfs.zfs.prefetch_disable=1. Without the second one the system stopped
> > responding (under 11.2) when the server sent data (through zfs send) to another
> > server.
> >
> > After the first problem I ran a zfs upgrade, thinking maybe that was the
> > problem, so I'm not sure I can downgrade to 11.2 (and 11.2 is EOL).
> >
> > In your opinion:
> >
> > 1/ What should I do to try to find the problem?
> >
> > 2/ Do you think it's a hardware/firmware problem or a FreeBSD problem?
> > The point is that the second zpool is working perfectly, so I'm thinking of
> > some firmware/hardware/compatibility problem.
> >
> >
> > Regards.
>
> looks like PR 236480
>
> see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236480
>

So I can confirm: with this patch the server works fine without any hang or crash.

Thanks folks.

Regards

-- 
Albert SHIH
Observatoire de Paris
xmpp: jas@obspm.fr
Heure local/Local time:
Mon 09 Sep 2019 01:44:42 PM CEST