Date: Mon, 12 Apr 2021 11:44:11 +0200
From: Felix Palmen
To: freebsd-stable@freebsd.org
Subject: Frequent disk I/O stalls while building (poudriere), processes in "zfs tear" state
Message-ID: <20210412094411.j3s7us5ru2d7dzcz@nexus.home.palmen-it.de>
Organization: palmen-it.de
List-Id: Production branch of FreeBSD source code

Hello all,

Since I started following the releng/13.0 branch, I have been seeing disk
I/O stall quite often (about once per minute) while building packages with
poudriere. When this happens, the CPU goes almost idle and `top` shows
several processes in state "zfs te" (procstat reports it as "zfs tear").
For up to several seconds, no disk I/O completes (even starting a new
process is impossible), then it recovers. Only twice have I seen the system
deadlock instead, printing messages similar to this on the serial console:

    swap_pager: indefinite wait buffer ...

I have seen this behavior since -RC3 (I have followed releng/13.0 up to
-RELEASE). Before that, I had the vnlru-related problem that was fixed with
faa41af1fed350327cc542cb240ca2c6e1e8ba0c.

Some details:

* CPU: Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz
* RAM: 64GB (ECC)
* Four HDDs (Seagate NAS models), 4TB each
* Swap 16GB, striped over the 4 disks
* Pool: 12TB raid-z on GELI-encrypted partitions. NOT upgraded yet, so I
  have a way back to 12.2.
* Two bhyve VMs running with 1GB and 8GB RAM, both wired
* Several jails running services like samba, an MTA, nginx...
* Several NFS shares mounted by other machines
* Poudriere running at idprio 22 with 8 parallel build jobs

Reducing the number of parallel jobs in poudriere also reduces the
frequency of the problem, but it doesn't seem to go away completely. I also
have the impression that these stalls are more likely when a lot of
compilation jobs can be satisfied from ccache.
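
One way to record which processes are stuck while a stall is in progress is
to tally wait channels from `ps` output. What follows is only a minimal
sketch, not something taken from the report above. The `ps` invocation, the
helper names and the "zfs te" prefix match are all assumptions made for
illustration; check the keywords against ps(1) on the system in question
before relying on it.

```python
import collections
import subprocess
import time

# Assumed ps(1) invocation: empty headers ("keyword=") suppress the header
# line, and wchan comes last because wait messages may contain spaces.
PS_COMMAND = ["ps", "-ax", "-o", "pid=", "-o", "state=", "-o", "wchan="]


def parse_ps(ps_output):
    """Yield (pid, state, wchan) tuples from 'pid state wchan' lines."""
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # blank line or missing wait channel column
        pid, state, wchan = fields
        yield int(pid), state, wchan.strip()


def tally_wait_channels(ps_output):
    """Count how many processes sit on each wait channel."""
    return collections.Counter(wchan for _, _, wchan in parse_ps(ps_output))


def pids_waiting_on(ps_output, prefix="zfs te"):
    """Return the PIDs whose wait channel starts with the given prefix."""
    return [pid for pid, _, wchan in parse_ps(ps_output)
            if wchan.startswith(prefix)]


if __name__ == "__main__":
    # Sample once per second and report only when something looks stuck.
    while True:
        out = subprocess.run(PS_COMMAND, capture_output=True, text=True).stdout
        stuck = pids_waiting_on(out)
        if stuck:
            print(time.strftime("%H:%M:%S"), len(stuck), "stuck:", stuck)
        time.sleep(1)
```

Because the loop starts a new `ps` process for every sample, it may itself
block during a stall, so a sample taken mid-stall might only be printed once
I/O recovers.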

Thanks for any ideas and insight (e.g. what this "zfs tear" status means).

Best regards,
Felix Palmen

--
 Dipl.-Inform. Felix Palmen                          ,.//..........
 {web}  http://palmen-it.de  {jabber} [see email]   ,//palmen-it.de
 {pgp public key}  http://palmen-it.de/pub.txt      //   """""""""""
 {pgp fingerprint} A891 3D55 5F2E 3A74 3965 B997 3EF2 8B0A BC02 DA2A