Date: Tue, 19 Feb 2019 10:17:17 +0100
From: Ole
To: freebsd-questions@freebsd.org
Subject: Re: ZFS deadlock on parallel ZFS operations FreeBSD 11.2 and 12.0
Message-ID: <20190219101717.61526ab1.ole@free.de>
In-Reply-To: <20190215113423.01edabe9.ole@free.de>

Hi,

OK, now I have an unkillable ZFS process again. It is only a single
'zfs send' command. Any idea how to kill this process without powering
off the machine?

root@jails1:/usr/home/admin # ps aux | grep 'zfs send'
root  17617  0.0  0.0 12944 3856  -  Is  Sat04  0:00.00 sudo zfs send -e -I cryptopool/iocage/jails/2fe7ae89-760e-423c-8e7f-4f504e0f08bf@2019-
root  17618  0.0  0.0 12980 4036  -  D   Sat04  0:00.01 zfs send -e -I cryptopool/iocage/jails/2fe7ae89-760e-423c-8e7f-4f504e0f08bf@2019-02-16
root  19299  0.0  0.0 11320 2588  3  S+  09:53  0:00.00 grep zfs send
root@jails1:/usr/home/admin # kill -9 17618
root@jails1:/usr/home/admin # ps aux | grep 'zfs send'
root  17617  0.0  0.0 12944 3856  -  Is  Sat04  0:00.00 sudo zfs send -e -I cryptopool/iocage/jails/2fe7ae89-760e-423c-8e7f-4f504e0f08bf@2019-
root  17618  0.0  0.0 12980 4036  -  D   Sat04  0:00.01 zfs send -e -I cryptopool/iocage/jails/2fe7ae89-760e-423c-8e7f-4f504e0f08bf@2019-02-16
root  19304  0.0  0.0 11320 2588  3  S+  09:53  0:00.00 grep zfs send

It is a FreeBSD 12.0 VM image running in a bhyve VM. Basically only
py36-iocage is installed, and there are 7 running jails. The VM has
30G of RAM, and the sysctl vfs.zfs.arc_max is set to 20G.
It seems that the whole zpool is in some kind of deadlock. All jails
have crashed, are unkillable, and I cannot run any command inside them.

regards
Ole

Fri, 15 Feb 2019 11:34:23 +0100 - Ole:

> Hi,
>
> I observed that FreeBSD systems with ZFS will run into a deadlock if
> there are many parallel zfs send/receive/snapshot processes.
>
> I observed this on bare metal and virtual machines with FreeBSD 11.2
> and 12.0, with RAM from 20 to 64G.
>
> If the system is also on ZFS, the whole system crashes. With only jails
> on ZFS they freeze, but the host system stays stable. But you can't
> kill -9 the zfs processes. Only a poweroff stops the machine.
>
> On a FreeBSD 12.0 VM (bhyve) with 30G RAM and 5 CPUs, about 30 zfs
> operations, mostly send and receive, will crash the system.
>
> There is no heavy load on the machine:
>
> # top | head -8
> last pid: 91503;  load averages:  0.34,  0.31,  0.29  up 0+22:50:47  11:24:00
> 536 processes: 1 running, 529 sleeping, 6 zombie
> CPU:  0.9% user,  0.0% nice,  1.5% system,  0.2% interrupt, 97.4% idle
> Mem: 165M Active, 872M Inact, 19G Wired, 264M Buf, 9309M Free
> ARC: 11G Total, 2450M MFU, 7031M MRU, 216M Anon, 174M Header, 1029M Other
>      8423M Compressed, 15G Uncompressed, 1.88:1 Ratio
> Swap: 1024M Total, 1024M Free
>
> I wonder if this is a bug or normal behaviour. I could live with a
> limited number of parallel ZFS operations, but I don't want the whole
> system to crash.
>
> Reducing vfs.zfs.arc_max won't help.
>
> Any idea how to handle this?
>
> regards
> Ole
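
To see where such a process is stuck, one option is to pick out the PIDs in uninterruptible wait (state D, like PID 17618 in the listing above) and look at their kernel stacks. The following is only a minimal sketch and not a fix for the deadlock: stuck_pids is a hypothetical helper name, it assumes the `ps aux` column layout shown above (PID in column 2, STAT in column 8), and the procstat step in the usage comment is FreeBSD-specific. Kernel processes often sit in state D as well, which is why the helper takes an optional pattern to narrow the match.

```shell
#!/bin/sh
# stuck_pids (hypothetical helper): read `ps aux`-style text on stdin and
# print the PID of every line whose STAT column starts with "D"
# (uninterruptible wait). An optional regex argument restricts the match
# to lines containing that pattern; by default every line qualifies.
# Assumes PID is column 2 and STAT is column 8, as in the listing above.
stuck_pids() {
    awk -v pat="${1:-.}" '$8 ~ /^D/ && $0 ~ pat { print $2 }'
}

# Usage on FreeBSD (procstat -kk prints the kernel stack of each thread):
#   ps aux | stuck_pids 'zfs send' | while read -r pid; do
#       procstat -kk "$pid"
#   done
```

The header row of `ps aux` is skipped automatically because its eighth field is the literal word STAT. The resulting stack traces show which kernel function the process is sleeping in, which is usually the most useful detail to include in a bug report.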