From: Peter Eriksson
To: Karli Sjöberg via freebsd-fs
Date: Mon, 2 Dec 2019 23:39:01 +0100
Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini()
List: freebsd-fs@freebsd.org

Sigh. Slight correction: the output below should have said uma_zdestroy() and not uma_zfree_arg() (wrong printf text, but the right times).

After an uptime of 7 hours, a reboot gives these times (I removed the “uma” printf in this run):

kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds
kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds
kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds
kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds
kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds
kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds
kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds
kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds

This is on a mostly idle server (well, apart from compiling the kernel code :-) and some snapshots being taken of all filesystems (once per hour).

So now on to finding out why uma_zdestroy() is taking so long… :-).

- Peter

> On 2 Dec 2019, at 15:32, Peter Eriksson wrote:
>
> I’ve been trying to figure out why our servers take so long to reboot, where most of the time is spent doing a “shutdown”. We’ve seen examples where it has taken 10-20 minutes (or more).
>
> These are Dell PowerEdge R730xd servers with 256GB RAM and ~140TB of disks. FreeBSD 11.3. With ~24000 filesystems per server.
> We normally cap ARC to 128GB RAM.
>
> Adding a lot of debugging printf() calls to relevant parts of the code points to:
>
> kern_shutdown() ->
>   EVENTHANDLER_INVOKE(shutdown_post_sync) ->
>     zfsshutdown() ->
>       zfs__fini() ->
>         spa_fini() ->
>           zio_fini():
>
> Debug output from a test run:
> zio_fini: kmem_cache_destroy(zio_buf_cache & zio_data_buf_cache):
>
> kmem_cache_destroy: uma_zfree_arg(0xfffff803465eec00) [zio_buf_12288] took 16 seconds
> kmem_cache_destroy(zio_buf_cache[20]) took 16 seconds
>
> kmem_cache_destroy: uma_zfree_arg(0xfffff803465eeb00) [zio_buf_16384] took 61 seconds
> kmem_cache_destroy(zio_buf_cache[28]) took 61 seconds
>
> kmem_cache_destroy: uma_zfree_arg(0xfffff8034c9018c0) [zio_buf_131072] took 87 seconds
> kmem_cache_destroy(zio_buf_cache[224]) took 87 seconds
>
> kmem_cache_destroy: uma_zfree_arg(0xfffff8034c901880) [zio_data_buf_131072] took 5 seconds
> kmem_cache_destroy(zio_data_buf_cache[224]) took 5 seconds
>
> (I modified the code here to print the time spent if it took 2 seconds or more)
>
> This is on a newly rebooted server (with all filesystems mounted). Seems like uma_zfree_arg() is taking really long to execute. Now that code isn’t exactly easy to read (for me at least)… Lots of barrier/locks and stuff.
>
> I wonder why this code should take so long? There shouldn’t be any disk I/O involved and it’s just a cache, so I wonder if there might be some way to get rid of it quicker? Any UMA experts online? :-)
>
> The reason for this is that I’d like to be able to make sure a server reboots more quickly in case of problems. Now with the parallel ZFS mount stuff being done at boot time, that part is much quicker :-).
>
> - Peter
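
The instrumentation described in the quoted message (print the time spent only if a call took 2 seconds or more) could look roughly like the userland C sketch below. This is not the actual kernel patch, which is not shown in this thread. The names elapsed_sec(), format_if_slow(), timed_call() and SLOW_CALL_THRESHOLD_SEC are hypothetical, and kernel code would use a kernel time source rather than clock_gettime().

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() and CLOCK_MONOTONIC */
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Assumed threshold: the message reports calls that took 2 seconds or more. */
#define SLOW_CALL_THRESHOLD_SEC 2

/* Whole seconds elapsed between two monotonic timestamps. */
static long
elapsed_sec(const struct timespec *start, const struct timespec *end)
{
	long sec = (long)(end->tv_sec - start->tv_sec);

	if (end->tv_nsec < start->tv_nsec)
		sec--;	/* borrow: the final second was not completed */
	return (sec);
}

/*
 * Format a report line in the style of the debug output above.
 * Returns 1 and fills buf if the call was slow enough to report, else 0.
 */
static int
format_if_slow(char *buf, size_t len, const char *label, long sec)
{
	assert(buf != NULL && label != NULL);
	if (sec < SLOW_CALL_THRESHOLD_SEC)
		return (0);
	snprintf(buf, len, "%s took %ld seconds", label, sec);
	return (1);
}

/* Hypothetical wrapper: time an arbitrary callback, print only if slow. */
static void
timed_call(const char *label, void (*fn)(void *), void *arg)
{
	struct timespec start, end;
	char line[128];

	clock_gettime(CLOCK_MONOTONIC, &start);
	fn(arg);
	clock_gettime(CLOCK_MONOTONIC, &end);
	if (format_if_slow(line, sizeof(line), label, elapsed_sec(&start, &end)))
		printf("%s\n", line);
}
```

A monotonic clock is used so that wall-clock adjustments during shutdown cannot distort the measured interval. A caller would wrap each suspect call, for example timed_call("kmem_cache_destroy(zio_buf_cache[20])", fn, arg), where fn and arg stand in for the real call and its argument.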