From: Julien Cigar
Date: Mon, 9 Sep 2019 13:55:33 +0200
To: Albert Shih
Cc: freebsd-questions@freebsd.org
Subject: Re: Verry serious problem with ZFS & 12.0

On Mon, Sep 09, 2019 at 01:45:52PM +0200, Albert Shih wrote:
> On 29/08/2019 at 10:37:28+0200, Julien Cigar wrote:
> > On Thu, Aug 29, 2019 at 12:45:47AM +0200, Albert Shih wrote:
> > > Hi
> > >
> > > After updating 4 servers from 11.2 to 12.0 without any problem, I waited a few
> > > weeks to see if everything worked well, and it did. I have just upgraded my mail
> > > server.
> > >
> > > During the upgrade I also upgraded all the firmware for the hardware.
> > >
> > > And now I have a very serious issue with my server.
> > >
> > > Configuration:
> > >
> > >   Dell PowerEdge R740Xd with H730P, 192 GB RAM, 2 SAS mechanical disks for the system,
> > >   2 SSDs (in a ZFS pool) for the mail index (cyrus), and 28 mechanical disks
> > >   (in a second ZFS pool) for the mailboxes.
> > >
> > > The problem:
> > >
> > >   After running for a few days, the ZFS pool with the 2 SSDs stops responding.
> > >
> > >   The system itself works perfectly.
> > >
> > >   The second zpool (mechanical disks) works perfectly.
> > >
> > >   I get zero logs and zero messages on the console or in dmesg.
> > >
> > >   The arc_size is normal, around 70-75 %.
> > >
> > >   The moment the ZFS pool stops responding is random, not related to
> > >   any activity (human or cron).
> > >
> > >   The only ZFS-related options I pass to the kernel are vfs.zfs.min_auto_ashift=12 and
> > >   vfs.zfs.prefetch_disable=1. Without the second one the system stopped
> > >   responding (under 11.2) when the server sent data (through zfs send) to another
> > >   server.
> > >
> > >   After the first occurrence I ran a zfs upgrade, thinking that might be the
> > >   problem, so I'm not sure I can downgrade to 11.2 (and 11.2 is EOL).
> > >
> > > In your opinion:
> > >
> > >   1/ What should I do to try to find the problem?
> > >
> > >   2/ Do you think this is a hardware/firmware problem or a FreeBSD problem?
> > >   The second zpool is working perfectly, so I suspect
> > >   some firmware/hardware/compatibility problem.
> > >
> > >
> > > Regards.
> >
> > looks like PR 236480
> >
> > see https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236480
> >
>
> So I can confirm, with this patch the server works fine without any hang or
> crash.

Cool, it would be worth adding a comment to the PR (the problem had been observed
with PostgreSQL only until now)

>
> Thanks folks.
>
> Regards
>
> --
> Albert SHIH
> Observatoire de Paris
> xmpp: jas@obspm.fr
> Local time:
> Mon 09 Sep 2019 01:44:42 PM CEST

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.