From: Dennis Glatting
To: Albert Shih, freebsd-questions@freebsd.org
Subject: Re: Verry serious problem with ZFS & 12.0
Date: Wed, 28 Aug 2019 16:08:32 -0700
In-Reply-To: <20190828224547.GA1557@io.chezmoi.fr>
References: <20190828224547.GA1557@io.chezmoi.fr>

On Thu, 2019-08-29 at 00:45 +0200, Albert Shih wrote:
> Hi
>
> After updating 4 servers from 11.2 to 12.0 without any problem, I waited
> a few weeks to see if everything worked well, and it did. I just
> upgraded my mail server.
>

I am running 12.0 on a Supermicro with two 6-core E5-2620 processors and
192G of RAM, with two RAIDz2 pools *but not* ZFS root; root is on
hardware RAID1. One volume has 250G of NVMe cache and the other 480G of
SSD cache. Both volumes have SLOG. The only problem I've experienced is
at boot: it forgets the boot volume during the boot process. I fixed
that problem in loader.conf with:

  vfs.root.mountfrom="ufs:/dev/gpt/disk0"

I have another Supermicro with 128G RAM and AMD processors (16x2), but
it is running 11.3 because I'm too lazy to shut everything down just to
upgrade the NAS. The NAS has two ZFS volumes and also uses hardware
RAID1 for root. The NAS is also my back-up server, using dump and ZFS
send from other systems, including ZFS systems.

The slowdowns that I experience under 12.0 occur when I have a lot of
network activity and am running three videos (it's my primary
workstation). The disk arrays are responsive. The 12.0 system sees
heavy use but not very heavy use, which includes several rsync/etc
operations under cron (including an rsync of a Debian archive) and
service as a network gateway/router.

On different hardware and under FreeBSD 9.0, I had a lot of problems
with crap disks that I did not have under 8.x, and so I downgraded.
I didn't have a problem under 10.0.

I believe in keeping my BIOS/firmware updated on an annual cycle, but
firmware can be tricky. It turns out that a lot of bug fixes never make
their way into release notes, and new bugs get introduced.

> During the upgrade I also upgraded all firmware for the hardware.
>
> And now I have a very serious issue with my server.
>
> Configuration:
>
> Dell PowerEdge R740Xd with H730P, 192 GB RAM, 2 SAS mechanical disks
> for the system, 2 SSDs (in a zfs pool) for the mail index (cyrus),
> and 28 mechanical disks (in a second zfs pool) for the mailboxes.
>
> The problem:
>
> After running for a few days, the zfs pool with the 2 SSDs stops
> responding.
>
> The system is working perfectly.
>
> The second zpool (mechanical disks) is working perfectly.
>
> I get zero logs, zero messages on the console or in dmesg.
>
> The arc_size is correct; it's around 70-75 %.
>
> The moment the zfs pool stops responding is random, not related to
> any activity (human or cron).
>
> The only options I pass to the kernel related to ZFS are
> vfs.zfs.min_auto_ashift=12 and vfs.zfs.prefetch_disable=1. Without
> the second one the system stopped responding (under 11.2) when the
> server sent data (through zfs send) to another server.
>
> After the first problem I ran a zfs upgrade, thinking that might be
> the problem, so I'm not sure I can downgrade to 11.2 (and 11.2 is EOL).
>
> In your opinion:
>
> 1/ What should I do to try to find the problem?
>
> 2/ Do you think that's a hardware/firmware problem or a FreeBSD
> problem? The point is that the second zpool is working perfectly, so
> I'm thinking of some firmware/hardware/compatibility problem.
>
>
> Regards.
>
>
> --
> Albert SHIH
> DIO bâtiment 15
> Observatoire de Paris
> Heure local/Local time:
> Thu 29 Aug 2019 12:26:55 AM CEST
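
Regarding the loader.conf override mentioned above: a quick way to confirm
that a setting such as vfs.root.mountfrom is really present in the file is
to read it back. The following is only a minimal sketch, assuming simple
key="value" lines; loader_value is a hypothetical helper name invented for
this illustration, not a FreeBSD tool.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): print the value assigned to a
# key in a loader.conf-style file. Assumes plain key=value or key="value"
# lines; if the key appears more than once, the last assignment is printed.
loader_value() {
    # $1 = path to the file, $2 = key to look up
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }                      # skip comment lines
        {
            k = $1
            gsub(/[ \t]/, "", k)                 # trim whitespace around the key
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                sub(/^[ \t]*"?/, "", v)          # drop leading blanks and quote
                sub(/"?[ \t]*$/, "", v)          # drop trailing quote and blanks
                val = v
                found = 1
            }
        }
        END { if (found) print val; else exit 1 } # non-zero status if key is absent
    ' "$1"
}
```

With the line quoted earlier in place, running
loader_value /boot/loader.conf vfs.root.mountfrom should print
ufs:/dev/gpt/disk0. On a running system, kenv vfs.root.mountfrom shows
what the loader actually handed to the kernel, which is the more direct
check.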