From owner-freebsd-fs@freebsd.org Sun Dec 1 07:19:03 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id C37961C1CA5 for ; Sun, 1 Dec 2019 07:19:03 +0000 (UTC) (envelope-from isafonov@me.com) Received: from mr85p00im-hyfv06021301.me.com (mr85p00im-hyfv06021301.me.com [17.58.23.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47QfkF69fMz4XLm for ; Sun, 1 Dec 2019 07:19:01 +0000 (UTC) (envelope-from isafonov@me.com) Received: from [192.168.37.155] (ip-097-206-122-091.static.atnet.ru [91.122.206.97]) by mr85p00im-hyfv06021301.me.com (Postfix) with ESMTPSA id 2086540D5A for ; Sun, 1 Dec 2019 07:18:56 +0000 (UTC) From: Ivan Safonov Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: zfs panic: solaris assert Message-Id: <78272047-A0E6-4EE0-B3B3-1F927D29EC12@me.com> References: To: freebsd-fs@freebsd.org Date: Sun, 1 Dec 2019 10:18:53 +0300 X-Mailer: Apple Mail (2.3445.104.11) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2019-12-01_01:, , signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=1 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 mlxscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1908290000 definitions=main-1912010064 X-Rspamd-Queue-Id: 47QfkF69fMz4XLm X-Spamd-Bar: --- X-Spamd-Result: default: False [-3.60 / 15.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:17.58.0.0/16:c]; FREEMAIL_FROM(0.00)[me.com]; TO_DN_NONE(0.00)[]; MV_CASE(0.50)[]; DKIM_TRACE(0.00)[me.com:+]; DMARC_POLICY_ALLOW(-0.50)[me.com,quarantine]; RCVD_IN_DNSWL_LOW(-0.10)[188.23.58.17.list.dnswl.org : 127.0.5.1]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~]; FREEMAIL_ENVFROM(0.00)[me.com]; ASN(0.00)[asn:714, ipnet:17.58.16.0/20, country:US]; MID_RHS_MATCH_FROM(0.00)[]; ARC_NA(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; R_DKIM_ALLOW(-0.20)[me.com:s=1a1hai]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; IP_SCORE_FREEMAIL(0.00)[]; RCPT_COUNT_ONE(0.00)[1]; DWL_DNSWL_LOW(-1.00)[me.com.dwl.dnswl.org : 127.0.5.1]; IP_SCORE(0.00)[ip: (-3.62), ipnet: 17.58.16.0/20(-1.57), asn: 714(-2.31), country: US(-0.05)]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[] Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Dec 2019 07:19:03 -0000 Hi there I have a small home server with a ZFS pool created in FreeBSD 11.1 and = updated to 12.0 -> 12.1. A few days ago, the server rebooted with panic panic: solaris assert: size <=3D (1ULL << 24) (0x1401000 <=3D = 0x1000000), file: = /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c, line: 296 cpuid =3D 3 time =3D 1574981747 KDB: stack backtrace: #0 0xffffffff80c1d297 at kdb_backtrace+0x67 #1 0xffffffff80bd05cd at vpanic+0x19d #2 0xffffffff80bd0423 at panic+0x43 #3 0xffffffff82a6e22c at assfail3+0x2c #4 0xffffffff8284a8f7 at abd_alloc+0x67 #5 0xffffffff82850319 at arc_hdr_alloc_pabd+0x99 #6 0xffffffff8284d554 at arc_hdr_alloc+0x124 #7 0xffffffff8284ef13 at arc_read+0x243 #8 0xffffffff8287942d at traverse_prefetch_metadata+0xbd #9 0xffffffff828788cc at traverse_visitbp+0x3dc #10 0xffffffff82878930 at traverse_visitbp+0x440 #11 0xffffffff82878930 at traverse_visitbp+0x440 #12 0xffffffff82878930 at traverse_visitbp+0x440 #13 0xffffffff82878930 at traverse_visitbp+0x440 #14 0xffffffff82879513 at traverse_dnode+0xd3 #15 0xffffffff82878c30 at traverse_visitbp+0x740 #16 0xffffffff828780a7 at traverse_impl+0x317 #17 0xffffffff8287837c at traverse_pool+0x14c Uptime: 1m0s I tested the memory and drives and they look OK. Then I installed = FreeBSD 12.1 on a USB drive zpool import pool: zroot id: 5722521002676846505 state: ONLINE status: Some supported features are not enabled on the pool. action: The pool can be imported using its name or numeric identifier, = though some features will not be available without an explicit 'zpool = upgrade'. config: zroot ONLINE raidz1-0 ONLINE gpt/D0-33GTM0VGS ONLINE gpt/D1-Z1E3Q7KP ONLINE gpt/D2-Z4ZARG1V ONLINE I tried to import the pool but got the same panic zpool import -o readonly -f -F -N -R /pool zroot zdb -ue zroot Uberblock: magic =3D 0000000000bab10c version =3D 5000 txg =3D 11751485 guid_sum =3D 14638353410936556308 timestamp =3D 1574776136 UTC =3D Tue Nov 26 16:48:56 2019 checkpoint_txg =3D 0 zdb -l /dev/gpt/D0-33GTM0VGS Code: ------------------------------------ LABEL 0 ------------------------------------ version: 5000 name: 'zroot' state: 0 txg: 11750848 pool_guid: 5722521002676846505 hostid: 1550328424 hostname: '' top_guid: 10071190156355008053 guid: 4016597551985842896 vdev_children: 1 vdev_tree: type: 'raidz' id: 0 guid: 10071190156355008053 nparity: 1 metaslab_array: 39 metaslab_shift: 35 ashift: 12 asize: 5997325713408 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 4016597551985842896 path: '/dev/gpt/D0-33GTM0VGS' whole_disk: 1 DTL: 70406 create_txg: 4 children[1]: type: 'disk' id: 1 guid: 18015782253695591313 path: '/dev/gpt/D1-Z1E3Q7KP' whole_disk: 1 DTL: 70405 create_txg: 4 children[2]: type: 'disk' id: 2 guid: 13705750593642370773 path: '/dev/gpt/D2-Z4ZARG1V' whole_disk: 1 DTL: 70323 create_txg: 4 features_for_read: com.delphix:hole_birth com.delphix:embedded_data ------------------------------------ LABEL 1 ------------------------------------ version: 5000 name: 'zroot' state: 0 txg: 11750848 pool_guid: 5722521002676846505 hostid: 1550328424 hostname: '' top_guid: 10071190156355008053 guid: 4016597551985842896 vdev_children: 1 vdev_tree: type: 'raidz' id: 0 guid: 10071190156355008053 nparity: 1 metaslab_array: 39 metaslab_shift: 35 ashift: 12 asize: 5997325713408 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 4016597551985842896 path: '/dev/gpt/D0-33GTM0VGS' whole_disk: 1 DTL: 70406 create_txg: 4 children[1]: type: 'disk' id: 1 guid: 18015782253695591313 path: '/dev/gpt/D1-Z1E3Q7KP' whole_disk: 1 DTL: 70405 create_txg: 4 children[2]: type: 'disk' id: 2 guid: 13705750593642370773 path: '/dev/gpt/D2-Z4ZARG1V' whole_disk: 1 DTL: 70323 create_txg: 4 features_for_read: com.delphix:hole_birth com.delphix:embedded_data ------------------------------------ LABEL 2 ------------------------------------ version: 5000 name: 'zroot' state: 0 txg: 11750848 pool_guid: 5722521002676846505 hostid: 1550328424 hostname: '' top_guid: 10071190156355008053 guid: 4016597551985842896 vdev_children: 1 vdev_tree: type: 'raidz' id: 0 guid: 10071190156355008053 nparity: 1 metaslab_array: 39 metaslab_shift: 35 ashift: 12 asize: 5997325713408 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 4016597551985842896 path: '/dev/gpt/D0-33GTM0VGS' whole_disk: 1 DTL: 70406 create_txg: 4 children[1]: type: 'disk' id: 1 guid: 18015782253695591313 path: '/dev/gpt/D1-Z1E3Q7KP' whole_disk: 1 DTL: 70405 create_txg: 4 children[2]: type: 'disk' id: 2 guid: 13705750593642370773 path: '/dev/gpt/D2-Z4ZARG1V' whole_disk: 1 DTL: 70323 create_txg: 4 features_for_read: com.delphix:hole_birth com.delphix:embedded_data ------------------------------------ LABEL 3 ------------------------------------ version: 5000 name: 'zroot' state: 0 txg: 11750848 pool_guid: 5722521002676846505 hostid: 1550328424 hostname: '' top_guid: 10071190156355008053 guid: 4016597551985842896 vdev_children: 1 vdev_tree: type: 'raidz' id: 0 guid: 10071190156355008053 nparity: 1 metaslab_array: 39 metaslab_shift: 35 ashift: 12 asize: 5997325713408 is_log: 0 create_txg: 4 children[0]: type: 'disk' id: 0 guid: 4016597551985842896 path: '/dev/gpt/D0-33GTM0VGS' whole_disk: 1 DTL: 70406 create_txg: 4 children[1]: type: 'disk' id: 1 guid: 18015782253695591313 path: '/dev/gpt/D1-Z1E3Q7KP' whole_disk: 1 DTL: 70405 create_txg: 4 children[2]: type: 'disk' id: 2 guid: 13705750593642370773 path: '/dev/gpt/D2-Z4ZARG1V' whole_disk: 1 DTL: 70323 create_txg: 4 features_for_read: com.delphix:hole_birth com.delphix:embedded_data zdb -AAA -F -X zroot ... capacity operations bandwidth ---- = errors ---- description used avail read write read write read = write cksum zroot 1.47T 3.97T 82 0 351K 0 0 = 0 30 raidz1 1.47T 3.97T 82 0 351K 0 0 = 0 121 /dev/gpt/D0-33GTM0VGS 27 0 146K 0 0 = 0 0 /dev/gpt/D1-Z1E3Q7KP 27 0 146K 0 0 = 0 0 /dev/gpt/D2-Z4ZARG1V 27 0 146K 0 0 = 0 0 ... Is it possible to somehow restore the pool or data?= From owner-freebsd-fs@freebsd.org Sun Dec 1 12:13:21 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 0DAB91A99D3 for ; Sun, 1 Dec 2019 12:13:21 +0000 (UTC) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from uucp.dinoex.org (uucp.dinoex.sub.de [IPv6:2001:1440:5001:1::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "uucp.dinoex.sub.de", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47QnFq4VNZz3HT4 for ; Sun, 1 Dec 2019 12:13:18 +0000 (UTC) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from uucp.dinoex.sub.de (uucp.dinoex.org [185.220.148.12]) by uucp.dinoex.org (8.16.0.41/8.16.0.41) with ESMTPS id xB1CD5XP059870 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO) for ; Sun, 1 Dec 2019 13:13:06 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) X-MDaemon-Deliver-To: X-Authentication-Warning: uucp.dinoex.sub.de: Host uucp.dinoex.org [185.220.148.12] claimed to be uucp.dinoex.sub.de Received: from citylink.dinoex.sub.org (uucp@localhost) by uucp.dinoex.sub.de (8.16.0.41/8.16.0.41/Submit) with UUCP id xB1CD53A059869 for freebsd-fs@FreeBSD.ORG; Sun, 1 Dec 2019 13:13:05 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from gate.oper.dinoex.org (gate-e [192.168.98.2]) by citylink.dinoex.sub.de (8.15.2/8.15.2) with ESMTP id xB1C6rBE073197 for ; Sun, 1 Dec 2019 13:06:53 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from gate.oper.dinoex.org (gate-e [192.168.98.2]) by gate.oper.dinoex.org (8.15.2/8.15.2) with ESMTP id xB1C411i071315 for ; Sun, 1 Dec 2019 13:04:01 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: (from news@localhost) by gate.oper.dinoex.org (8.15.2/8.15.2/Submit) id xB1C41qC071311 for freebsd-fs@FreeBSD.ORG; Sun, 1 Dec 2019 13:04:01 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) X-Authentication-Warning: gate.oper.dinoex.org: news set sender to li-fbsd@citylink.dinoex.sub.org using -f From: "Peter Much" Subject: ZFS overgrowing arc_max -> OOM kills Date: Sun, 01 Dec 2019 12:52:53 +0100 Organization: n/a Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Injection-Info: oper.dinoex.de; logging-data="68697"; mail-complaints-to="usenet@citylink.dinoex.sub.org" User-Agent: Opera Mail/12.16 (FreeBSD) Sender: li-fbsd@citylink.dinoex.sub.org To: freebsd-fs@FreeBSD.ORG X-Milter: Spamilter (Reciever: uucp.dinoex.sub.de; Sender-ip: 185.220.148.12; Sender-helo: uucp.dinoex.sub.de; ) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (uucp.dinoex.org [185.220.148.12]); Sun, 01 Dec 2019 13:13:08 +0100 (CET) X-Rspamd-Queue-Id: 47QnFq4VNZz3HT4 X-Spamd-Bar: ++++ Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=none (mx1.freebsd.org: domain of li-fbsd@citylink.dinoex.sub.org has no SPF policy when checking 2001:1440:5001:1::2) smtp.mailfrom=li-fbsd@citylink.dinoex.sub.org X-Spamd-Result: default: False [4.03 / 15.00]; ARC_NA(0.00)[]; RCVD_COUNT_FIVE(0.00)[5]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MV_CASE(0.50)[]; MIME_GOOD(-0.10)[text/plain]; HAS_XAW(0.00)[]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; AUTH_NA(1.00)[]; RCPT_COUNT_ONE(0.00)[1]; HAS_ORG_HEADER(0.00)[]; NEURAL_SPAM_MEDIUM(0.38)[0.376,0]; MIME_TRACE(0.00)[0:+]; IP_SCORE(0.46)[ip: (1.23), ipnet: 2001:1440::/32(0.61), asn: 8469(0.49), country: DE(-0.01)]; NEURAL_SPAM_LONG(0.99)[0.992,0]; TO_DN_NONE(0.00)[]; R_SPF_NA(0.00)[]; FORGED_SENDER(0.30)[pmc@citylink.dinoex.sub.org,li-fbsd@citylink.dinoex.sub.org]; DMARC_NA(0.00)[sub.org]; R_DKIM_NA(0.00)[]; MID_RHS_NOT_FQDN(0.50)[]; ASN(0.00)[asn:8469, ipnet:2001:1440::/32, country:DE]; FROM_NEQ_ENVFROM(0.00)[pmc@citylink.dinoex.sub.org,li-fbsd@citylink.dinoex.sub.org]; RCVD_TLS_LAST(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Dec 2019 12:13:21 -0000 Hi all, I thought this matter was solved a long time ago! In fact, I am using ZFS almost from the beginning, and I never observed it seriousely overgrowing ArcMax, until now (this is 11.3-RELEASE-p3). > System Memory: > 9.38% 187.32 MiB Active, 0.37% 7.35 MiB Inact > 78.60% 1.53 GiB Wired, 0.00% 0 Cache > 1.11% 22.14 MiB Free, 10.54% 210.59 MiB Gap > Real Installed: 2.00 GiB > Real Available: 99.19% 1.98 GiB > Real Managed: 98.33% 1.95 GiB > Logical Total: 2.00 GiB > Logical Used: 98.56% 1.97 GiB > Logical Free: 1.44% 29.49 MiB > Kernel Memory: 768.84 MiB > Data: 98.21% 755.05 MiB > Text: 1.79% 13.79 MiB > Kernel Memory Map: 1.38 GiB > Size: 98.79% 1.36 GiB > Free: 1.21% 17.10 MiB See the KMEM! Obviousely, this was the last useful vital sign from the machine, then it stalled normal I/O and did only network gatewaying: > last pid: 51252; load averages: 0.39, 3.09, 3.68 up 5+07:46:39 > 03:09:25 > 1569 processes: 3 running, 1530 sleeping, 36 waiting > CPU 0: 1.1% user, 0.0% nice, 5.0% system, 0.4% interrupt, 93.5% idle > CPU 1: 0.0% user, 0.0% nice, 3.4% system, 1.9% interrupt, 94.6% idle > Mem: 169M Active, 4K Inact, 204M Laundry, 1582M Wired, 85M Buf, 43M Free > ARC: 1026M Total, 483M MFU, 160M MRU, 11M Anon, 188M Header, 184M Other > 127M Compressed, 526M Uncompressed, 4.16:1 Ratio > Swap: 8192M Total, 1965M Used, 6227M Free, 23% Inuse, 56K In Now see the ARC: > ARC Summary: (HEALTHY) > Memory Throttle Count: 0 > ARC Misc: > Deleted: 22.35m > Recycle Misses: 0 > Mutex Misses: 22.11m > Evict Skips: 733.63m > ARC Size: 127.01% 1016.06 MiB > Target Size: (Adaptive) 25.00% 200.00 MiB > Min Size (Hard Limit): 25.00% 200.00 MiB > Max Size (High Water): 4:1 800.00 MiB This was steadily growing since 03:01 - likely due to the influx of metadata from these daily/security/100/110 "find" jobs. I did my calculations to cater for a 25% safety margin (which appear to be quite accurate), but nevertheless, ZFS goes beyond that. :( (Yes, this is i386 - it's a twenty-year-old machine. ;)) The problem started with OOM kills at 03:02 reporting "out of swap", which is bogus (there is no swap shortage); so I found a way to switch such annoying behaviour off: > vm.pageout_oom_seq=512 Then, instead, I got massive "indefinite buffer wait" messages. And these seem to come from what You can read above: out of kernel heap. PMc From owner-freebsd-fs@freebsd.org Mon Dec 2 14:32:44 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 5644E1AB981 for ; Mon, 2 Dec 2019 14:32:44 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47RSJB6yjfz3QY2 for ; Mon, 2 Dec 2019 14:32:42 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 0455D40015 for ; Mon, 2 Dec 2019 15:32:40 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:cd44:8234:6c47:5d2e] (unknown [IPv6:2001:6b0:17:f002:cd44:8234:6c47:5d2e]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id DF8D240011 for ; Mon, 2 Dec 2019 15:32:39 +0100 (CET) From: Peter Eriksson Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Message-Id: Date: Mon, 2 Dec 2019 15:32:39 +0100 To: freebsd-fs@freebsd.org X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47RSJB6yjfz3QY2 X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 130.236.254.3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-4.12 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; TO_DN_NONE(0.00)[]; RCPT_COUNT_ONE(0.00)[1]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; MV_CASE(0.50)[]; RCVD_IN_DNSWL_MED(-0.20)[3.254.236.130.list.dnswl.org : 127.0.11.2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; IP_SCORE(-1.62)[ipnet: 130.236.0.0/16(-4.50), asn: 2843(-3.60), country: SE(-0.03)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:2843, ipnet:130.236.0.0/16, country:SE]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Dec 2019 14:32:44 -0000 I=E2=80=99ve been looking at trying to figure out why our servers take = so long to reboot, where the most time is spent doing a =E2=80=9Cshutdown=E2= =80=9D. We=E2=80=99ve seen examples where it has taken 10-20 minutes (or = more). This is Dell PowerEdge R730xd servers with 256GB RAM and ~140TB of = disks. FreeBSD 11.3. With ~24000 filsystems per server. We normally cap ARC to 128GB RAM. Adding a lot of debugging printf() calls to relevant parts of the code = points to: kern_shutdown() -> EVENTHANDLER_INVOKE(shutdown_post_sync) -> zfsshutdown() ->=20 zfs__fini() ->=20 spa_fini() ->=20 zio_fini(): Debug output from a test run: zio_fini: kmem_cache_destroy(zio_buf_cache & zio_data_buf_cache): kmem_cache_destroy: uma_zfree_arg(0xfffff803465eec00) [zio_buf_12288] = took 16 seconds kmem_cache_destroy(zio_buf_cache[20]) took 16 seconds kmem_cache_destroy: uma_zfree_arg(0xfffff803465eeb00) [zio_buf_16384] = took 61 seconds kmem_cache_destroy(zio_buf_cache[28]) took 61 seconds kmem_cache_destroy: uma_zfree_arg(0xfffff8034c9018c0) [zio_buf_131072] = took 87 seconds kmem_cache_destroy(zio_buf_cache[224]) took 87 seconds kmem_cache_destroy: uma_zfree_arg(0xfffff8034c901880) = [zio_data_buf_131072] took 5 seconds kmem_cache_destroy(zio_data_buf_cache[224]) took 5 seconds (I modified the code here to print the time spent if it took 2 seconds = or more) This is on a newly rebooted server (with all filesystems mounted). Seems = like uma_zfree_arg() is taking really long to execute. Now that code = isn=E2=80=99t exactly easy to read (for me atleast)=E2=80=A6 Lot=E2=80=99s= of barrier/locks and stuff. I wonder why this code should take so long? There shouldn=E2=80=99t be = any disk I/O involved and it=E2=80=99s just a cache so I wonder if there = might be some way to get rid of it quicker? Any UMA experts online? :-) Reason for this is that I=E2=80=99d like to be able to make sure a = server reboots more quickly in case of problems. Now with the parallell = ZFS mount stuff being done at boot time that part is much quicker :-). - Peter From owner-freebsd-fs@freebsd.org Mon Dec 2 21:13:22 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 861741B86F2 for ; Mon, 2 Dec 2019 21:13:22 +0000 (UTC) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from uucp.dinoex.org (uucp.dinoex.sub.de [IPv6:2001:1440:5001:1::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "uucp.dinoex.sub.de", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47RdBT4xQCz4Pq9 for ; Mon, 2 Dec 2019 21:13:21 +0000 (UTC) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from uucp.dinoex.sub.de (uucp.dinoex.org [185.220.148.12]) by uucp.dinoex.org (8.16.0.41/8.16.0.41) with ESMTPS id xB2LD4XK059143 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO) for ; Mon, 2 Dec 2019 22:13:05 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) X-MDaemon-Deliver-To: X-Authentication-Warning: uucp.dinoex.sub.de: Host uucp.dinoex.org [185.220.148.12] claimed to be uucp.dinoex.sub.de Received: from citylink.dinoex.sub.org (uucp@localhost) by uucp.dinoex.sub.de (8.16.0.41/8.16.0.41/Submit) with UUCP id xB2LD4Vr059142 for freebsd-fs@FreeBSD.ORG; Mon, 2 Dec 2019 22:13:04 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from gate.oper.dinoex.org (gate-e [192.168.98.2]) by citylink.dinoex.sub.de (8.15.2/8.15.2) with ESMTP id xB2KQGDQ038453 for ; Mon, 2 Dec 2019 21:26:16 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: from gate.oper.dinoex.org (gate-e [192.168.98.2]) by gate.oper.dinoex.org (8.15.2/8.15.2) with ESMTP id xB2KO1WF038018 for ; Mon, 2 Dec 2019 21:24:01 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) Received: (from news@localhost) by gate.oper.dinoex.org (8.15.2/8.15.2/Submit) id xB2KO0s8038017 for freebsd-fs@FreeBSD.ORG; Mon, 2 Dec 2019 21:24:00 +0100 (CET) (envelope-from li-fbsd@citylink.dinoex.sub.org) X-Authentication-Warning: gate.oper.dinoex.org: news set sender to li-fbsd@citylink.dinoex.sub.org using -f From: Peter Subject: Re: ZFS overgrowing arc_max -> OOM kills Date: Mon, 02 Dec 2019 21:12:30 +0100 Organization: n/a Message-ID: References: Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Injection-Info: oper.dinoex.de; logging-data="35134"; mail-complaints-to="usenet@citylink.dinoex.sub.org" User-Agent: Opera Mail/12.16 (FreeBSD) Sender: li-fbsd@citylink.dinoex.sub.org To: freebsd-fs@FreeBSD.ORG X-Milter: Spamilter (Reciever: uucp.dinoex.sub.de; Sender-ip: 185.220.148.12; Sender-helo: uucp.dinoex.sub.de; ) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.6.2 (uucp.dinoex.org [185.220.148.12]); Mon, 02 Dec 2019 22:13:07 +0100 (CET) X-Rspamd-Queue-Id: 47RdBT4xQCz4Pq9 X-Spamd-Bar: ++++ Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=none (mx1.freebsd.org: domain of li-fbsd@citylink.dinoex.sub.org has no SPF policy when checking 2001:1440:5001:1::2) smtp.mailfrom=li-fbsd@citylink.dinoex.sub.org X-Spamd-Result: default: False [4.27 / 15.00]; ARC_NA(0.00)[]; RCVD_COUNT_FIVE(0.00)[5]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MV_CASE(0.50)[]; MIME_GOOD(-0.10)[text/plain]; HAS_XAW(0.00)[]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; AUTH_NA(1.00)[]; RCPT_COUNT_ONE(0.00)[1]; HAS_ORG_HEADER(0.00)[]; NEURAL_SPAM_MEDIUM(0.69)[0.692,0]; MIME_TRACE(0.00)[0:+]; IP_SCORE(0.39)[ip: (1.04), ipnet: 2001:1440::/32(0.52), asn: 8469(0.42), country: DE(-0.01)]; NEURAL_SPAM_LONG(0.98)[0.982,0]; TO_DN_NONE(0.00)[]; R_SPF_NA(0.00)[]; FORGED_SENDER(0.30)[peter@citylink.dinoex.sub.org,li-fbsd@citylink.dinoex.sub.org]; DMARC_NA(0.00)[sub.org]; R_DKIM_NA(0.00)[]; MID_RHS_NOT_FQDN(0.50)[]; ASN(0.00)[asn:8469, ipnet:2001:1440::/32, country:DE]; FROM_NEQ_ENVFROM(0.00)[peter@citylink.dinoex.sub.org,li-fbsd@citylink.dinoex.sub.org]; RCVD_TLS_LAST(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Dec 2019 21:13:22 -0000 It seems this issue is a combination of two different matters. On Sun, 01 Dec 2019 12:48:29 +0100, Peter Much wrote: >> Kernel Memory Map: 1.38 GiB >> Size: 98.79% 1.36 GiB >> Free: 1.21% 17.10 MiB Now I worked myself thru the code in arc.c, and everything there appears to work correctly. What I seem to have is about 85% metadata, and 75% of what is called "overhead" data (arcstats.overhead_size), and that seems not evictable. So in that case the ARC goes beyond it's upper limit, and it goes beyond the provisioned kmem just the same way, because it's the same code. So, the exitus is inevitable. The other issue is the OOM kills, and that is the same issue as already reported by many people (bug 232800 and 231457, where Mark Millard declared this behaviour "works as designed"). It is probably unrelated to ZFS (except that ZFS does not tend to improve memory abundance). So now I go and hack my arc.c, and probably hack the vm pageout as well, and see how to make that stuff behave. (Commentary and inspiration is welcome, as usual.) rgds, PMc From owner-freebsd-fs@freebsd.org Mon Dec 2 22:39:08 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 0FB4D1BAD7E for ; Mon, 2 Dec 2019 22:39:08 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [IPv6:2001:6b0:17:f0a0::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47Rg5P5lWNz4VTV for ; Mon, 2 Dec 2019 22:39:05 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 16C2340026 for ; Mon, 2 Dec 2019 23:39:02 +0100 (CET) Received: from [192.168.1.132] (h-201-140.A785.priv.bahnhof.se [98.128.201.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id EA55D4001A for ; Mon, 2 Dec 2019 23:39:01 +0100 (CET) From: Peter Eriksson Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Date: Mon, 2 Dec 2019 23:39:01 +0100 References: To: =?utf-8?Q?Karli_Sj=C3=B6berg_via_freebsd-fs?= In-Reply-To: Message-Id: X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47Rg5P5lWNz4VTV X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 2001:6b0:17:f0a0::3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-3.10 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-0.999,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCPT_COUNT_ONE(0.00)[1]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.a.0.f.7.1.0.0.0.b.6.0.1.0.0.2.list.dnswl.org : 127.0.11.2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; MV_CASE(0.50)[]; IP_SCORE(-0.60)[ip: (-2.13), ipnet: 2001:6b0::/32(-0.48), asn: 1653(-0.39), country: EU(-0.00)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:1653, ipnet:2001:6b0::/32, country:EU]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Dec 2019 22:39:08 -0000 Sigh. Slight correction, the output below should have said uma_zdestroy() and = not uma_zfree_arg() (wrong printf text, but the right times). After an uptime of 7 hours, a reboot have these times (I removed the = =E2=80=9Cuma=E2=80=9D printf in this run): kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds This is on a mostly idle server (well, apart from compiling the kernel = code :-) and some snapshots being taken of all filesystems (once per = hour). So now on to finding out why uma_destroy() is taking so long=E2=80=A6 = :-). - Peter > On 2 Dec 2019, at 15:32, Peter Eriksson wrote: >=20 > I=E2=80=99ve been looking at trying to figure out why our servers take = so long to reboot, where the most time is spent doing a =E2=80=9Cshutdown=E2= =80=9D. We=E2=80=99ve seen examples where it has taken 10-20 minutes (or = more). >=20 > This is Dell PowerEdge R730xd servers with 256GB RAM and ~140TB of = disks. FreeBSD 11.3. With ~24000 filsystems per server. > We normally cap ARC to 128GB RAM. >=20 > Adding a lot of debugging printf() calls to relevant parts of the code = points to: >=20 > kern_shutdown() -> > EVENTHANDLER_INVOKE(shutdown_post_sync) -> > zfsshutdown() ->=20 > zfs__fini() ->=20 > spa_fini() ->=20 > zio_fini(): >=20 > Debug output from a test run: > zio_fini: kmem_cache_destroy(zio_buf_cache & zio_data_buf_cache): >=20 > kmem_cache_destroy: uma_zfree_arg(0xfffff803465eec00) [zio_buf_12288] = took 16 seconds > kmem_cache_destroy(zio_buf_cache[20]) took 16 seconds >=20 > kmem_cache_destroy: uma_zfree_arg(0xfffff803465eeb00) [zio_buf_16384] = took 61 seconds > kmem_cache_destroy(zio_buf_cache[28]) took 61 seconds >=20 > kmem_cache_destroy: uma_zfree_arg(0xfffff8034c9018c0) [zio_buf_131072] = took 87 seconds > kmem_cache_destroy(zio_buf_cache[224]) took 87 seconds >=20 > kmem_cache_destroy: uma_zfree_arg(0xfffff8034c901880) = [zio_data_buf_131072] took 5 seconds > kmem_cache_destroy(zio_data_buf_cache[224]) took 5 seconds >=20 > (I modified the code here to print the time spent if it took 2 seconds = or more) >=20 > This is on a newly rebooted server (with all filesystems mounted). = Seems like uma_zfree_arg() is taking really long to execute. Now that = code isn=E2=80=99t exactly easy to read (for me atleast)=E2=80=A6 = Lot=E2=80=99s of barrier/locks and stuff. >=20 > I wonder why this code should take so long? There shouldn=E2=80=99t be = any disk I/O involved and it=E2=80=99s just a cache so I wonder if there = might be some way to get rid of it quicker? Any UMA experts online? :-) >=20 > Reason for this is that I=E2=80=99d like to be able to make sure a = server reboots more quickly in case of problems. Now with the parallell = ZFS mount stuff being done at boot time that part is much quicker :-). >=20 > - Peter >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Mon Dec 2 22:54:30 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id C9C7E1BB52A for ; Mon, 2 Dec 2019 22:54:30 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qk1-x72b.google.com (mail-qk1-x72b.google.com [IPv6:2607:f8b0:4864:20::72b]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47RgR92sVgz4WVX for ; Mon, 2 Dec 2019 22:54:29 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qk1-x72b.google.com with SMTP id v23so1470351qkg.2 for ; Mon, 02 Dec 2019 14:54:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=RsXJYb88LHCvWCJzesynw/K27JrgCfAiPAdKpxbJXew=; b=MPfq213+Kf0l8VPeXvmxCFsiefFD6BIoJoAd5u2eIjWSvNUKi+h3AM33YawmwHFqae hU477Utb6GZd4FQ5kH06QVhqs5ithQYYWWKBVuDCy26SfC9FWQQJxAVDm+dF5+gTIpWK A+BVYd/9ZQjkoq+3CJp7KrtSMHluLyzzqvtNCH6yZRojpXBZ6SrJTSVwZOhwC8r1kvk1 3HwXGyQLGi6SqYe2e0S6v1ieNIOo7u6XxUqOg112CodZcTpwen4D8b+5zTDVUWp8BqPp xkx/b0VDjJ7dDIhkoRn4AKZG+4XCm6nWWHr0TdceqhO4ABThbRKUv+lMu4okKuycIRpv pdrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=RsXJYb88LHCvWCJzesynw/K27JrgCfAiPAdKpxbJXew=; b=gMmjKzev72fPmF+J75NXuKoIx1IXQtboObUIlSpKyhZJY66ySpLIEiGfV5Ckw50vw+ EWgoGl0QonIl6UfnmUP24SNAzBOT27ODpDjzEfLUN8W48WVVd65J49XHD61Rw5T1l2qI ORtOeChJuYaf7p6uGaNUdvC4XVVgjw+clpADg9SZ2XG4ui6xcfk8lmEhyIOi+CEWdKIS 7pZfo41SNA3V8NhdeTTBKq7GOAogkCqTk8IjdabC4DHFW0L+IWM5nXtawHJMsTwvkCio +7zMIVaa0BmYeetCK9yBYI1cUzS4A4McajiS38BNwX7K/RRyOs2XQCS0KbbJHPc4X6lm pioA== X-Gm-Message-State: APjAAAU4TKPbSkGWUsnIBulw8OmZuytP7kf5FRm9U2YTW5IhI/EWsYLu yNWeSuhvhNhNlPQEcEVy0gLip0CxLf8= X-Google-Smtp-Source: APXvYqwR/PF2IiaX0IBOkUhGtT17WWquYixCDNBK3J45fU8JMhlbCqUORb6BNfVLy83N6+id6Hk0ZA== X-Received: by 2002:a37:81c6:: with SMTP id c189mr1679653qkd.223.1575327268093; Mon, 02 Dec 2019 14:54:28 -0800 (PST) Received: from raichu (toroon0560w-lp130-05-69-158-183-252.dsl.bell.ca. [69.158.183.252]) by smtp.gmail.com with ESMTPSA id h28sm621893qte.54.2019.12.02.14.54.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Dec 2019 14:54:27 -0800 (PST) Sender: Mark Johnston Date: Mon, 2 Dec 2019 17:54:24 -0500 From: Mark Johnston To: Peter Eriksson Cc: freebsd-fs@freebsd.org Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Message-ID: <20191202225424.GG43802@raichu> References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.12.2 (2019-09-21) X-Rspamd-Queue-Id: 47RgR92sVgz4WVX X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=pass header.d=gmail.com header.s=20161025 header.b=MPfq213+; dmarc=none; spf=pass (mx1.freebsd.org: domain of markjdb@gmail.com designates 2607:f8b0:4864:20::72b as permitted sender) smtp.mailfrom=markjdb@gmail.com X-Spamd-Result: default: False [-4.43 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; R_DKIM_ALLOW(-0.20)[gmail.com:s=20161025]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_ALLOW(-0.20)[+ip6:2607:f8b0:4000::/36]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; DMARC_NA(0.00)[freebsd.org]; RCVD_COUNT_THREE(0.00)[3]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_SOME(0.00)[]; DKIM_TRACE(0.00)[gmail.com:+]; RCPT_COUNT_TWO(0.00)[2]; RCVD_IN_DNSWL_NONE(0.00)[b.2.7.0.0.0.0.0.0.0.0.0.0.0.0.0.0.2.0.0.4.6.8.4.0.b.8.f.7.0.6.2.list.dnswl.org : 127.0.5.0]; IP_SCORE(-2.73)[ip: (-9.40), ipnet: 2607:f8b0::/32(-2.24), asn: 15169(-1.94), country: US(-0.05)]; FORGED_SENDER(0.30)[markj@freebsd.org,markjdb@gmail.com]; MID_RHS_NOT_FQDN(0.50)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:2607:f8b0::/32, country:US]; FROM_NEQ_ENVFROM(0.00)[markj@freebsd.org,markjdb@gmail.com]; RCVD_TLS_ALL(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 02 Dec 2019 22:54:30 -0000 On Mon, Dec 02, 2019 at 11:39:01PM +0100, Peter Eriksson wrote: > Sigh. > > Slight correction, the output below should have said uma_zdestroy() and not uma_zfree_arg() (wrong printf text, but the right times). > > After an uptime of 7 hours, a reboot have these times (I removed the “uma” printf in this run): > > kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds > kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds > kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds > kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds > kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds > kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds > kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds > kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds > > This is on a mostly idle server (well, apart from compiling the kernel code :-) and some snapshots being taken of all filesystems (once per hour). > > > So now on to finding out why uma_destroy() is taking so long… :-). uma_destroy() frees all of the memory cached in the zone back to the page allocator. This operation takes time proportional to the number of cached items. I would expect most of the time to be spent in zone_reclaim(), called by zone_dtor(). Before rebooting, could you try grabbing the output of vmstat -z? From owner-freebsd-fs@freebsd.org Tue Dec 3 08:34:21 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 65CB11CD677 for ; Tue, 3 Dec 2019 08:34:21 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47RwJ84PcPz44s5; Tue, 3 Dec 2019 08:34:16 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 69BF44002A; Tue, 3 Dec 2019 09:34:10 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075] (unknown [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 321AE40005; Tue, 3 Dec 2019 09:34:10 +0100 (CET) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() From: Peter Eriksson In-Reply-To: <20191202225424.GG43802@raichu> Date: Tue, 3 Dec 2019 09:34:09 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20191202225424.GG43802@raichu> To: FreeBSD FS X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47RwJ84PcPz44s5 X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 130.236.254.3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-5.95 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; MV_CASE(0.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.254.236.130.list.dnswl.org : 127.0.11.2]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; IP_SCORE(-3.45)[ip: (-9.04), ipnet: 130.236.0.0/16(-4.56), asn: 2843(-3.64), country: SE(-0.03)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:2843, ipnet:130.236.0.0/16, country:SE]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 08:34:21 -0000 Sure. Here=E2=80=99s one from just before I rebooted last evening (at = the 7 hour uptime mark): ITEM SIZE LIMIT USED FREE REQ FAIL SLEEP UMA Kegs: 384, 0, 247, 3, 247, 0, 0 UMA Zones: 4736, 0, 264, 0, 264, 0, 0 UMA Slabs: 80, 0, 2894986, 14, 2898224, 0, 0 UMA Hash: 256, 0, 48, 72, 108, 0, 0 4 Bucket: 32, 0, 3497, 9753, 338875, 0, 0 6 Bucket: 48, 0, 1700, 10501, 89938, 0, 0 8 Bucket: 64, 0, 611, 10053, 118928, 15, 0 12 Bucket: 96, 0, 3293, 6096, 131089, 0, 0 16 Bucket: 128, 0, 834, 5304, 55984, 0, 0 32 Bucket: 256, 0, 2747, 5668, 373187, 139, 0 64 Bucket: 512, 0, 1884, 4436, 483281, 139, 0 128 Bucket: 1024, 0, 4828, 2996, 613296,16078, = 0 256 Bucket: 2048, 0, 34106, 1058, 1566552, 164, 0 vmem btag: 56, 0, 904562, 3741, 905062,6399, 0 VM OBJECT: 240, 0, 89303, 5129,24512558, 0, 0 RADIX NODE: 144, 0, 634446, 15444,60638431, 0, 0 MAP: 240, 0, 3, 61, 3, 0, 0 KMAP ENTRY: 128, 0, 12, 639, 12, 0, 0 MAP ENTRY: 128, 0, 105906, 71972,273723932, 0, = 0 VMSPACE: 2520, 0, 120, 294, 1064743, 0, 0 fakepg: 104, 0, 7, 3755, 3136, 0, 0 mt_zone: 16400, 0, 420, 0, 420, 0, 0 16: 16, 0, 1810762, 5474,247600975, 0, = 0 32: 32, 0, 430077, 113173,777148991, 0, = 0 64: 64, 0,24475835, 6313799,350608033, 0, = 0 128: 128, 0, 3914907, 76529,394374364, 0, = 0 256: 256, 0, 508512, 1081773,201807283, 0, = 0 512: 512, 0, 74387, 185061,131179473, 0, = 0 1024: 1024, 0, 127067, 8929,51960099, 0, 0 2048: 2048, 0, 48730, 1396, 5337386, 0, 0 4096: 4096, 0, 276060, 827, 6748826, 0, 0 8192: 8192, 0, 54, 68, 1059804, 0, 0 16384: 16384, 0, 38, 65, 23218, 0, 0 32768: 32768, 0, 139, 129, 21968, 0, 0 65536: 65536, 0, 115, 49, 119907, 0, 0 64 pcpu: 8, 0, 4020, 4172, 4118, 0, 0 SLEEPQUEUE: 80, 0, 2131, 2767, 2131, 0, 0 Files: 80, 0, 454, 4796,28333466, 0, 0 filedesc0: 1104, 0, 152, 433, 1064769, 0, 0 TURNSTILE: 136, 0, 2131, 1289, 2131, 0, 0 rl_entry: 40, 0, 561, 7539, 561, 0, 0 umtx pi: 96, 0, 0, 0, 0, 0, 0 umtx_shm: 88, 0, 31217, 3928, 34273, 0, 0 MAC labels: 40, 0, 0, 0, 0, 0, 0 PROC: 1424, 0, 152, 434, 1064768, 0, 0 THREAD: 1552, 0, 1704, 426, 9448, 0, 0 cpuset: 96, 0, 1208, 3712, 1529, 0, 0 audit_record: 1248, 0, 0, 4929,44799582, 0, 0 mbuf_packet: 256, 104630070, 24552, 12386, 4231181, 0, = 0 mbuf: 256, 104630070, 3, 12394, 8548105, 0, = 0 mbuf_cluster: 2048, 16348446, 36938, 138, 60263, 0, = 0 mbuf_jumbo_page: 4096, 8174223, 0, 141, 542447, 0, = 0 mbuf_jumbo_9k: 9216, 2421992, 0, 0, 0, 0, = 0 mbuf_jumbo_16k: 16384, 1362370, 0, 0, 0, 0, = 0 DMAR_MAP_ENTRY: 120, 0, 0, 0, 0, 0, 0 ttyinq: 160, 0, 360, 765, 990, 0, 0 ttyoutq: 256, 0, 188, 667, 517, 0, 0 FPU_save_area: 832, 0, 0, 0, 0, 0, 0 g_bio: 376, 0, 0, 2720,11874693, 0, 0 nvme_request: 128, 0, 4, 4398, 26030, 0, 0 vtnet_tx_hdr: 24, 0, 0, 0, 0, 0, 0 cryptop: 88, 0, 0, 4005, 647540, 0, 0 cryptodesc: 72, 0, 0, 4015, 647540, 0, 0 taskq_zone: 48, 0, 0, 35192, 2413940, 0, 0 VNODE: 472, 0, 142242, 1286, 601710, 0, 0 VNODEPOLL: 120, 0, 2, 625, 9, 0, 0 BUF TRIE: 144, 0, 0, 105948, 0, 0, 0 S VFS Cache: 108, 0, 89310, 2950, 1595636, 0, 0 STS VFS Cache: 148, 0, 0, 0, 0, 0, 0 L VFS Cache: 328, 0, 67, 1037, 1791, 0, 0 LTS VFS Cache: 368, 0, 0, 0, 0, 0, 0 NAMEI: 1024, 0, 0, 704,154021593, 0, = 0 rentr: 24, 0, 0, 8183, 2858, 0, 0 NCLNODE: 576, 0, 0, 0, 0, 0, 0 DIRHASH: 1024, 0, 0, 0, 0, 0, 0 pipe: 760, 0, 32, 568, 297208, 0, 0 procdesc: 136, 0, 1, 1710, 897, 0, 0 AIO: 208, 0, 0, 0, 0, 0, 0 AIOP: 32, 0, 0, 0, 0, 0, 0 AIOCB: 752, 0, 0, 0, 0, 0, 0 AIOLIO: 280, 0, 0, 0, 0, 0, 0 range_seg_cache: 72, 0, 1117410, 81040,23301057, 0, 0 metaslab_alloc_trace_cache: 72, 0, 0, 10065, 4408239, = 0, 0 zio_cache: 1048, 0, 73, 115556,241609900, 0, = 0 zio_link_cache: 48, 0, 0, 116200,64873334, 0, 0 zio_buf_512: 512, 0, 555291, 1877,54193375, 0, 0 zio_data_buf_512: 512, 0, 1805, 1331, 1491667, 0, 0 zio_buf_1024: 1024, 0, 1159, 24081, 224958, 0, 0 zio_data_buf_1024: 1024, 0, 968, 632, 29721, 0, 0 zio_buf_1536: 1536, 0, 342, 24328, 210638, 0, 0 zio_data_buf_1536: 1536, 0, 1129, 401, 49580, 0, 0 zio_buf_2048: 2048, 0, 48731, 531, 630275, 0, 0 zio_data_buf_2048: 2048, 0, 2176, 160, 21869, 0, 0 zio_buf_2560: 2560, 0, 64, 119, 179116, 0, 0 zio_data_buf_2560: 2560, 0, 2181, 132, 20371, 0, 0 zio_buf_3072: 3072, 0, 46, 110, 177331, 0, 0 zio_data_buf_3072: 3072, 0, 1720, 116, 14832, 0, 0 zio_buf_3584: 3584, 0, 40, 50, 174997, 0, 0 zio_data_buf_3584: 3584, 0, 1262, 94, 15376, 0, 0 zio_buf_4096: 4096, 0, 527, 9091, 8627750, 0, 0 zio_data_buf_4096: 4096, 0, 1139, 376, 236336, 0, 0 zio_buf_5120: 5120, 0, 752, 145, 345173, 0, 0 zio_data_buf_5120: 5120, 0, 1584, 136, 28763, 0, 0 zio_buf_6144: 6144, 0, 5941, 3678, 346446, 0, 0 zio_data_buf_6144: 6144, 0, 1059, 61, 24013, 0, 0 zio_buf_7168: 7168, 0, 167, 177, 325035, 0, 0 zio_data_buf_7168: 7168, 0, 880, 56, 21846, 0, 0 zio_buf_8192: 8192, 0, 1110, 1539, 871622, 0, 0 zio_data_buf_8192: 8192, 0, 698, 81, 24196, 0, 0 zio_buf_10240: 10240, 0, 131387, 66570, 521772, 0, 0 zio_data_buf_10240: 10240, 0, 1177, 39, 47154, 0, 0 zio_buf_12288: 12288, 0, 23008, 22661, 630894, 0, 0 zio_data_buf_12288: 12288, 0, 1499, 252, 39611, 0, 0 zio_buf_14336: 14336, 0, 4, 108, 5683, 0, 0 zio_data_buf_14336: 14336, 0, 884, 532, 32480, 0, 0 zio_buf_16384: 16384, 0, 321809, 122568, 1774887, 0, 0 zio_data_buf_16384: 16384, 0, 826, 67, 32998, 0, 0 zio_buf_20480: 20480, 0, 24, 131, 109109, 0, 0 zio_data_buf_20480: 20480, 0, 1414, 97, 49762, 0, 0 zio_buf_24576: 24576, 0, 5, 201, 208036, 0, 0 zio_data_buf_24576: 24576, 0, 788, 94, 46730, 0, 0 zio_buf_28672: 28672, 0, 0, 517, 260314, 0, 0 zio_data_buf_28672: 28672, 0, 563, 50, 44097, 0, 0 zio_buf_32768: 32768, 0, 2, 121, 71174, 0, 0 zio_data_buf_32768: 32768, 0, 485, 59, 44268, 0, 0 zio_buf_40960: 40960, 0, 5, 189, 120351, 0, 0 zio_data_buf_40960: 40960, 0, 846, 148, 88458, 0, 0 zio_buf_49152: 49152, 0, 2, 496, 114293, 0, 0 zio_data_buf_49152: 49152, 0, 1252, 91, 36323, 0, 0 zio_buf_57344: 57344, 0, 0, 206, 93761, 0, 0 zio_data_buf_57344: 57344, 0, 617, 298, 21824, 0, 0 zio_buf_65536: 65536, 0, 1, 213, 50490, 0, 0 zio_data_buf_65536: 65536, 0, 504, 96, 20772, 0, 0 zio_buf_81920: 81920, 0, 2, 124, 76214, 0, 0 zio_data_buf_81920: 81920, 0, 772, 227, 38083, 0, 0 zio_buf_98304: 98304, 0, 1, 139, 77399, 0, 0 zio_data_buf_98304: 98304, 0, 568, 79, 33028, 0, 0 zio_buf_114688: 114688, 0, 2, 122, 69859, 0, 0 zio_data_buf_114688: 114688, 0, 467, 70, 34427, 0, 0 zio_buf_131072: 131072, 0, 126330, 2110, 803167, 0, 0 zio_data_buf_131072: 131072, 0, 38681, 195, 542909, 0, 0 zio_buf_163840: 163840, 0, 0, 119, 68167, 0, 0 zio_data_buf_163840: 163840, 0, 0, 0, 0, 0, 0 zio_buf_196608: 196608, 0, 0, 116, 26793, 0, 0 zio_data_buf_196608: 196608, 0, 0, 0, 0, 0, 0 zio_buf_229376: 229376, 0, 0, 44, 8159, 0, 0 zio_data_buf_229376: 229376, 0, 0, 0, 0, 0, 0 zio_buf_262144: 262144, 0, 0, 43, 5843, 0, 0 zio_data_buf_262144: 262144, 0, 0, 0, 0, 0, 0 zio_buf_327680: 327680, 0, 0, 44, 7784, 0, 0 zio_data_buf_327680: 327680, 0, 0, 0, 0, 0, 0 zio_buf_393216: 393216, 0, 0, 42, 4732, 0, 0 zio_data_buf_393216: 393216, 0, 0, 0, 0, 0, 0 zio_buf_458752: 458752, 0, 0, 42, 3733, 0, 0 zio_data_buf_458752: 458752, 0, 0, 0, 0, 0, 0 zio_buf_524288: 524288, 0, 0, 42, 1996, 0, 0 zio_data_buf_524288: 524288, 0, 0, 0, 0, 0, 0 zio_buf_655360: 655360, 0, 0, 109, 2377, 0, 0 zio_data_buf_655360: 655360, 0, 0, 0, 0, 0, 0 zio_buf_786432: 786432, 0, 0, 41, 959, 0, 0 zio_data_buf_786432: 786432, 0, 0, 0, 0, 0, 0 zio_buf_917504: 917504, 0, 0, 38, 478, 0, 0 zio_data_buf_917504: 917504, 0, 0, 0, 0, 0, 0 zio_buf_1048576: 1048576, 0, 0, 39, 556, 0, = 0 zio_data_buf_1048576: 1048576, 0, 0, 0, 0, 0, = 0 zio_buf_1310720: 1310720, 0, 0, 3, 4, 0, = 0 zio_data_buf_1310720: 1310720, 0, 0, 0, 0, 0, = 0 zio_buf_1572864: 1572864, 0, 0, 0, 0, 0, = 0 zio_data_buf_1572864: 1572864, 0, 0, 0, 0, 0, = 0 zio_buf_1835008: 1835008, 0, 0, 0, 0, 0, = 0 zio_data_buf_1835008: 1835008, 0, 0, 0, 0, 0, = 0 zio_buf_2097152: 2097152, 0, 0, 0, 0, 0, = 0 zio_data_buf_2097152: 2097152, 0, 0, 0, 0, 0, = 0 zio_buf_2621440: 2621440, 0, 0, 0, 0, 0, = 0 zio_data_buf_2621440: 2621440, 0, 0, 0, 0, 0, = 0 zio_buf_3145728: 3145728, 0, 0, 0, 0, 0, = 0 zio_data_buf_3145728: 3145728, 0, 0, 0, 0, 0, = 0 zio_buf_3670016: 3670016, 0, 0, 0, 0, 0, = 0 zio_data_buf_3670016: 3670016, 0, 0, 0, 0, 0, = 0 zio_buf_4194304: 4194304, 0, 0, 0, 0, 0, = 0 zio_data_buf_4194304: 4194304, 0, 0, 0, 0, 0, = 0 zio_buf_5242880: 5242880, 0, 0, 0, 0, 0, = 0 zio_data_buf_5242880: 5242880, 0, 0, 0, 0, 0, = 0 zio_buf_6291456: 6291456, 0, 0, 0, 0, 0, = 0 zio_data_buf_6291456: 6291456, 0, 0, 0, 0, 0, = 0 zio_buf_7340032: 7340032, 0, 0, 0, 0, 0, = 0 zio_data_buf_7340032: 7340032, 0, 0, 0, 0, 0, = 0 zio_buf_8388608: 8388608, 0, 0, 0, 0, 0, = 0 zio_data_buf_8388608: 8388608, 0, 0, 0, 0, 0, = 0 zio_buf_10485760: 10485760, 0, 0, 0, 0, 0, = 0 zio_data_buf_10485760: 10485760, 0, 0, 0, 0, 0, = 0 zio_buf_12582912: 12582912, 0, 0, 0, 0, 0, = 0 zio_data_buf_12582912: 12582912, 0, 0, 0, 0, 0, = 0 zio_buf_14680064: 14680064, 0, 0, 0, 0, 0, = 0 zio_data_buf_14680064: 14680064, 0, 0, 0, 0, 0, = 0 zio_buf_16777216: 16777216, 0, 0, 0, 0, 0, = 0 zio_data_buf_16777216: 16777216, 0, 0, 0, 0, 0, = 0 lz4_ctx: 16384, 0, 0, 167, 2371452, 0, 0 abd_chunk: 4096, 0, 1498299, 193360,66794913, 0, 0 sa_cache: 144, 0, 117910, 1970, 610708, 0, 0 dnode_t: 728, 0, 1118459, 1106, 2539158, 0, 0 arc_buf_hdr_t_full: 224, 0, 985562, 1662,19179064, 0, 0 arc_buf_hdr_t_l2only: 88, 0, 0, 4005, 25678, 0, = 0 arc_buf_t: 64, 0, 729765, 134763,19547924, 0, 0 dmu_buf_impl_t: 240, 0, 1236172, 97444,16369819, 0, 0 zil_lwb_cache: 320, 0, 3, 345, 54, 0, 0 zil_zcw_cache: 80, 0, 0, 1650, 51, 0, 0 sio_cache: 128, 0, 0, 0, 0, 0, 0 zfs_znode_cache: 264, 0, 117910, 1235, 553283, 0, 0 Mountpoints: 816, 0, 24268, 202, 24268, 0, 0 ksiginfo: 112, 0, 570, 6570, 61686, 0, 0 itimer: 352, 0, 1, 175, 9, 0, 0 KNOTE: 128, 0, 45, 3458, 66190, 0, 0 socket: 872, 8384332, 130, 562, 923394, 0, = 0 ipq: 56, 51262, 0, 0, 0, 0, 0 udp_inpcb: 464, 8384336, 25, 1199, 274109, 0, = 0 udpcb: 32, 8384375, 25, 8100, 274109, 0, = 0 tcp_inpcb: 464, 8384336, 36, 892, 36547, 0, = 0 tcpcb: 1048, 8384334, 36, 270, 36547, 0, = 0 tcptw: 88, 27810, 0, 2880, 897, 0, 0 syncache: 160, 15375, 0, 1900, 34233, 0, 0 hostcache: 128, 15376, 21, 1126, 23, 0, 0 sackhole: 32, 0, 0, 0, 0, 0, 0 tcpreass: 48, 1021813, 0, 1328, 9, 0, = 0 sctp_ep: 1248, 8384334, 0, 0, 0, 0, = 0 sctp_asoc: 2408, 40000, 0, 0, 0, 0, 0 sctp_laddr: 48, 80012, 0, 1577, 6, 0, 0 sctp_raddr: 736, 80000, 0, 0, 0, 0, 0 sctp_chunk: 152, 400010, 0, 0, 0, 0, 0 sctp_readq: 152, 400010, 0, 0, 0, 0, 0 sctp_stream_msg_out: 112, 400015, 0, 0, 0, 0, 0 sctp_asconf: 40, 400000, 0, 0, 0, 0, 0 sctp_asconf_ack: 48, 400060, 0, 0, 0, 0, 0 udplite_inpcb: 464, 8384336, 0, 0, 0, 0, = 0 ripcb: 464, 8384336, 2, 902, 1792, 0, = 0 unpcb: 240, 8384336, 65, 2447, 610936, 0, = 0 IPsec SA lft_c: 16, 0, 0, 0, 0, 0, 0 rtentry: 208, 0, 18, 723, 20, 0, 0 selfd: 64, 0, 677, 8995,47464291, 0, 0 swpctrie: 144, 32696973, 0, 0, 0, 0, = 0 swblk: 136, 32696949, 0, 0, 0, 0, = 0 pf mtags: 48, 0, 0, 2822, 41, 0, 0 pf states: 296, 100009, 200, 1113, 109141, 0, 0 pf state keys: 88, 0, 200, 5065, 109141, 0, 0 pf source nodes: 136, 50025, 0, 0, 0, 0, 0 pf table entries: 216, 200016, 18, 54, 18, 0, 0 pf frags: 112, 0, 0, 0, 0, 0, 0 pf frag entries: 40, 50000, 0, 0, 0, 0, 0 pf state scrubs: 40, 0, 0, 0, 0, 0, 0 If I compare that one from a production server that has been running for = 56 days and have about 500 active users (mostly SMB but some NFS), then = the zio_* values that differ much are: ITEM SIZE LIMIT USED FREE REQ = FAIL SLEEP zio_buf_512: 512, 0, 23957617, 20604807, = 28379999753, 0, 0 zio_buf_2048: 2048, 0, 48613, 10045, = 152698166, 0, 0 zio_buf_5120: 5120, 0, 147618, 145002, = 114926222, 0, 0 zio_buf_6144: 6144, 0, 213681, 296569, = 96756130, 0, 0 zio_buf_10240: 10240, 0, 15485, 28955, = 26953062, 0, 0 zio_buf_12288: 12288, 0, 8795, 29201, = 383299702, 0, 0. (zio_buf_cache[20]) zio_buf_16384: 16384, 0, 1300372, 391818, = 1519180982, 0, 0. (zio_buf_cache[28]) The test server for comparison: ITEM SIZE LIMIT USED FREE REQ = FAIL SLEEP zio_buf_512: 512, 0, 555291, 1877, 54193375, = 0, 0 zio_buf_5120: 5120, 0, 752, 145, = 345173, 0, 0 zio_buf_6144: 6144, 0, 5941, 3678, = 346446, 0, 0 zio_buf_10240: 10240, 0, 131387, 66570, 521772, = 0, 0 zio_buf_12288: 12288, 0, 23008, 22661, 630894, = 0, 0 zio_buf_16384: 16384, 0, 321809, 122568, 1774887, 0, = 0 If the test server at 7 hours uptime takes 10 minutes to clean up the = zio_buffers, then I can see why it sometimes will seem to take forever = on a production server and why we thought it had freezed up on us (we = usually give up and send it a hard reset after a while). hmm.. I wonder if perhaps one could add a progress counter to = uma_destroy() somewhere. *Reads the code some more* :-) - Peter > On 2 Dec 2019, at 23:54, Mark Johnston wrote: >=20 > On Mon, Dec 02, 2019 at 11:39:01PM +0100, Peter Eriksson wrote: >> Sigh. >>=20 >> Slight correction, the output below should have said uma_zdestroy() = and not uma_zfree_arg() (wrong printf text, but the right times). >>=20 >> After an uptime of 7 hours, a reboot have these times (I removed the = =E2=80=9Cuma=E2=80=9D printf in this run): >>=20 >> kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds >> kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds >> kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds >> kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds >> kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds >> kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds >> kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds >> kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds >>=20 >> This is on a mostly idle server (well, apart from compiling the = kernel code :-) and some snapshots being taken of all filesystems (once = per hour). >>=20 >>=20 >> So now on to finding out why uma_destroy() is taking so long=E2=80=A6 = :-). >=20 > uma_destroy() frees all of the memory cached in the zone back to the > page allocator. This operation takes time proportional to the number = of > cached items. I would expect most of the time to be spent in > zone_reclaim(), called by zone_dtor(). >=20 > Before rebooting, could you try grabbing the output of vmstat -z? From owner-freebsd-fs@freebsd.org Tue Dec 3 09:03:16 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id B806E1CE83F for ; Tue, 3 Dec 2019 09:03:16 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lj1-f196.google.com (mail-lj1-f196.google.com [209.85.208.196]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47Rwxb5qcdz4738; Tue, 3 Dec 2019 09:03:15 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lj1-f196.google.com with SMTP id c19so2813369lji.11; Tue, 03 Dec 2019 01:03:15 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=dYGIdAVrHvAt1eIcBwh8g6Q3s+NpUpIKvFp6uJlcI/k=; b=bIotJoaY1M/nYec/yAk/bGVL8AHYRefQlVHyikCBsYzCENwBtAsDJra5tRsLsjR1Yw Zo5/imXX8GL1rOWZWwqhlP3ayYaBYL4coe9x5DcYh3XIR5g1qp0F7B4F45SBXdAeJX1h qXCmP5wNtx7SFca0hP/ef9tMG+InanfaJVpFWNJKerCYfz/DofuchdqWr7YKyrUHvDnv 8aMBI8tcu8JAfnz/cSSUVWNrAo+hsIDMv/9OQum91Ef/vm8jSn198/ap23L+8fkFMIio nc2IuqPg7k4FNejQZxlUtWUFAA2g3aASA8h6JyXDSp3Z/yH205I70470bdsCQXcGkJZH t6nA== X-Gm-Message-State: APjAAAWifm96t0mxDEGeVqMJAVVOxfOJp3+qvgq4s7oWBBVtyhbqHHSQ zB6ClIWZN2XaaKyquqpC/sk08vM46zs= X-Google-Smtp-Source: APXvYqz+0noW63ra/dt7v9sFeb0Ixl8IOhTMRpMHVAjx5SOGs+ofWo/bzDnuUGCVUgOpZg1HVSRB6A== X-Received: by 2002:a05:651c:327:: with SMTP id b7mr1828072ljp.22.1575363793518; Tue, 03 Dec 2019 01:03:13 -0800 (PST) Received: from [192.168.0.88] (east.meadow.volia.net. [93.72.151.96]) by smtp.googlemail.com with ESMTPSA id v13sm927264lfq.7.2019.12.03.01.03.12 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 03 Dec 2019 01:03:12 -0800 (PST) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() To: Mark Johnston , Peter Eriksson Cc: freebsd-fs@freebsd.org References: <20191202225424.GG43802@raichu> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> Date: Tue, 3 Dec 2019 11:03:11 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <20191202225424.GG43802@raichu> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 47Rwxb5qcdz4738 X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.208.196 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-3.09 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17:c]; RCVD_TLS_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(0.00)[FreeBSD.org]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; TO_DN_SOME(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[196.208.85.209.list.dnswl.org : 127.0.5.0]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; IP_SCORE(-1.09)[ip: (-0.31), ipnet: 209.85.128.0/17(-3.15), asn: 15169(-1.94), country: US(-0.05)]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[196.208.85.209.rep.mailspike.net : 127.0.0.17]; MIME_TRACE(0.00)[0:+]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 09:03:16 -0000 On 03/12/2019 00:54, Mark Johnston wrote: > On Mon, Dec 02, 2019 at 11:39:01PM +0100, Peter Eriksson wrote: >> Sigh. >> >> Slight correction, the output below should have said uma_zdestroy() and not uma_zfree_arg() (wrong printf text, but the right times). >> >> After an uptime of 7 hours, a reboot have these times (I removed the “uma” printf in this run): >> >> kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds >> kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds >> kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds >> kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds >> kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds >> kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds >> kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds >> kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds >> >> This is on a mostly idle server (well, apart from compiling the kernel code :-) and some snapshots being taken of all filesystems (once per hour). >> >> >> So now on to finding out why uma_destroy() is taking so long… :-). > > uma_destroy() frees all of the memory cached in the zone back to the > page allocator. This operation takes time proportional to the number of > cached items. I would expect most of the time to be spent in > zone_reclaim(), called by zone_dtor(). But spending *minutes* there is really unexpected. I have never seen anything like that. I wonder if there is anything untypical about the system's hardware (like a very big number of processors) or configuration. -- Andriy Gapon From owner-freebsd-fs@freebsd.org Tue Dec 3 09:29:25 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 91FC21CF34E for ; Tue, 3 Dec 2019 09:29:25 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [IPv6:2001:6b0:17:f0a0::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47RxWn2R5Pz48YJ; Tue, 3 Dec 2019 09:29:24 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 7094040049; Tue, 3 Dec 2019 10:29:22 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075] (unknown [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 5A77F40005; Tue, 3 Dec 2019 10:29:22 +0100 (CET) From: Peter Eriksson Message-Id: Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Date: Tue, 3 Dec 2019 10:29:22 +0100 In-Reply-To: <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> Cc: Mark Johnston , Andriy Gapon To: FreeBSD FS References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47RxWn2R5Pz48YJ X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-5.99 / 15.00]; NEURAL_HAM_MEDIUM(-0.99)[-0.989,0]; REPLY(-4.00)[]; NEURAL_HAM_LONG(-1.00)[-0.997,0] Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 09:29:25 -0000 It=E2=80=99s a fairly standard Dell PowerEdge R730xd server with Intel = Xeon E5-2620v4 CPUs & 256GB of RAM=E2=80=A6 (and an LSI SAS3 HBA and = Intel 10GE ethernet) CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (2100.04-MHz K8-class = CPU) Origin=3D"GenuineIntel" Id=3D0x406f1 Family=3D0x6 Model=3D0x4f = Stepping=3D1 = Features=3D0xbfebfbff = Features2=3D0x7ffefbff AMD Features=3D0x2c100800 AMD Features2=3D0x121 Structured Extended = Features=3D0x21cbfbb Structured Extended = Features3=3D0x9c000400 XSAVE Features=3D0x1 VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr TSC: P-state invariant, performance statistics real memory =3D 274869518336 (262136 MB) avail memory =3D 267244859392 (254864 MB) Event timer "LAPIC" quality 600 ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads - Peter >>=20 >> uma_destroy() frees all of the memory cached in the zone back to the >> page allocator. This operation takes time proportional to the number = of >> cached items. I would expect most of the time to be spent in >> zone_reclaim(), called by zone_dtor(). >=20 > But spending *minutes* there is really unexpected. > I have never seen anything like that. > I wonder if there is anything untypical about the system's hardware = (like a very > big number of processors) or configuration. >=20 > --=20 > Andriy Gapon From owner-freebsd-fs@freebsd.org Tue Dec 3 09:32:58 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 4F4FF1CF564 for ; Tue, 3 Dec 2019 09:32:58 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lj1-f195.google.com (mail-lj1-f195.google.com [209.85.208.195]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47Rxbs2Mgyz48rt; Tue, 3 Dec 2019 09:32:57 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lj1-f195.google.com with SMTP id j6so2994744lja.2; Tue, 03 Dec 2019 01:32:57 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=NOgwZByuqL/Hej0TLtDWZTupdDgX9+N4Sd2+ZcgQG1M=; b=DBgZExfFvaOjIflG/VxwR2+zKl7JNizRGYJytO3gTfoE2x72gJzqXvkibp9zRN+/Q2 eYAurtYam525doUr/iwqHFJREQAV1cTSDSD6Qv1XvXlfpHMqGxTTrJHpYH1R6m0pEbSe DN2C8hgrkKsWmpIPFRY1z3sqByzKM+0eADKhyT/7SBQrpJAPL4jfRB0E0LmbiUK8vu+3 1rfldLzxD08qQLlbF1OqicmrDnXJ+yyX5mTkbrp2uV9TpE/pmfYgzHDD3PYFzjj+BHDi 8ps9FsYYD+aeBc2ntFjZvyctgvBAwzcSnhir8r78eOvxTxe1EiAjRBuFmQAMVUEuQVnh UcWA== X-Gm-Message-State: APjAAAWFmmmph3OJrmKOdDAXv+JTcTQc1A50iQJDHtneqmTc6ASelBIL ZlPx7YFiG0gm1jS89vmTwzQqYZPwyLs= X-Google-Smtp-Source: APXvYqwpNZpMXnspwFLikifrny6pgHR2CAV+NX4ACJVxyDoU/e6smKB6nLDyYURrnmyi4v8SMvz7BQ== X-Received: by 2002:a2e:7a1a:: with SMTP id v26mr1972150ljc.76.1575365575470; Tue, 03 Dec 2019 01:32:55 -0800 (PST) Received: from [192.168.0.88] (east.meadow.volia.net. [93.72.151.96]) by smtp.googlemail.com with ESMTPSA id a23sm993242ljn.79.2019.12.03.01.32.54 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 03 Dec 2019 01:32:54 -0800 (PST) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() To: Peter Eriksson , FreeBSD FS References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: <381ab886-fc3b-75e9-afa7-7bde1f712055@FreeBSD.org> Date: Tue, 3 Dec 2019 11:32:53 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 47Rxbs2Mgyz48rt X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.208.195 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-3.07 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17:c]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; MIME_TRACE(0.00)[0:+]; DMARC_NA(0.00)[FreeBSD.org]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_NONE(0.00)[195.208.85.209.list.dnswl.org : 127.0.5.0]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; IP_SCORE(-1.07)[ip: (-0.18), ipnet: 209.85.128.0/17(-3.15), asn: 15169(-1.94), country: US(-0.05)]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[195.208.85.209.rep.mailspike.net : 127.0.0.17]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; MID_RHS_MATCH_FROM(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 09:32:58 -0000 On 03/12/2019 11:29, Peter Eriksson wrote: > It’s a fairly standard Dell PowerEdge R730xd server with Intel Xeon E5-2620v4 > CPUs & 256GB of RAM… (and an LSI SAS3 HBA and Intel 10GE ethernet) > > CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (2100.04-MHz K8-class CPU) >   Origin="GenuineIntel"  Id=0x406f1  Family=0x6  Model=0x4f  Stepping=1 >   > Features=0xbfebfbff >   > Features2=0x7ffefbff >   AMD Features=0x2c100800 >   AMD Features2=0x121 >   Structured Extended > Features=0x21cbfbb >   Structured Extended Features3=0x9c000400 >   XSAVE Features=0x1 >   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr >   TSC: P-state invariant, performance statistics > real memory  = 274869518336 (262136 MB) > avail memory = 267244859392 (254864 MB) > Event timer "LAPIC" quality 600 > ACPI APIC Table: > FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs > FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads > > - Peter > >>> >>> uma_destroy() frees all of the memory cached in the zone back to the >>> page allocator.  This operation takes time proportional to the number of >>> cached items.  I would expect most of the time to be spent in >>> zone_reclaim(), called by zone_dtor(). >> >> But spending *minutes* there is really unexpected. >> I have never seen anything like that. >> I wonder if there is anything untypical about the system's hardware (like a very >> big number of processors) or configuration. Yes, this is pretty typical. If you don't have root on ZFS (or can arrange that) maybe you can experiment with kldunload-ing of zfs. That way you will have more tools at your disposal for diagnosing the problem. -- Andriy Gapon From owner-freebsd-fs@freebsd.org Tue Dec 3 16:22:24 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id C1C821B45BA for ; Tue, 3 Dec 2019 16:22:24 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: from mail-qt1-x843.google.com (mail-qt1-x843.google.com [IPv6:2607:f8b0:4864:20::843]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47S6hJ4l7Mz4ZFL; Tue, 3 Dec 2019 16:22:24 +0000 (UTC) (envelope-from markjdb@gmail.com) Received: by mail-qt1-x843.google.com with SMTP id s8so1125260qte.2; Tue, 03 Dec 2019 08:22:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:content-transfer-encoding:in-reply-to :user-agent; bh=fhnwZruAW+hV89B1Oe4xe+b1SQSnVof2N96jwNOigJ8=; b=rr3933G5ZB8A7IMwlvKTtxUsyvXxyCBHLq4BVFLR2dV41bbUFugvTjTyyMleibpP4a y4mOCWexf1uv7VuXjcp5U0XqIgSI9qPQeZ6k+AMiPbRdPry0n35wHq8EAGgiwG7zFwTU 4PBor1/qqMsUYu3Y8wQqeIoNAlYacRsEYW9Hak/8CW8iUAO0iSEBSouWf87Kxcmf9ALU PK/k75xgRuRnk8nUKtNnWtyzZGqPStvc0qB+C1zNwxoCmd4f1fDE3/Z6QoinnLnP9njN 4A+5FWFe4OXM5+EGEgVACsEurEyxVTdNq0UylipeA2J+t9sWXrfMhIbgvdlJTOz5ILkX xe4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=fhnwZruAW+hV89B1Oe4xe+b1SQSnVof2N96jwNOigJ8=; b=hTBrRISIpnslGbKAXdJGPON5tGep4Knkdw8kvkdfeGBGnRlegiWeCL/J3hebfHfenJ cJcbH+dES0oIU1bcscG9pkzkHK8IDbGNhQcu4YT+VfDvHCl84rkvUS6YM1M8jjGeQonN CktrRLYlUy1VuLFheZTK1WadIkcchbQPUM5sugxPghEljUBFy4MK3bqqrdd1CAabhizd b8I2kYOrvRLxSOnwR1BVme+VH1w8a32x7b4r9JWwktEJS6wLVvCaC+KaJUqNxca4yDi+ 4M9z0Q1ZpW8Al3aTlGJqCuYw7PVEL6zCy66Gxr6x6NdAO01ZGUZZgTfOWkZX0uRltAKw KddQ== X-Gm-Message-State: APjAAAV4zcfMaGT77SxzmbLp682mAEsX6As83jhUKUSBg+B/PTBA6oR+ Kp/6qx1dAjvXB5g4a1q8br4r34eLLzc= X-Google-Smtp-Source: APXvYqzUWJWybkoCBPzhH0yum8OiM1dEubYFPg3CmvpNBw+UhVzhwAjwLbYS8m+Tla8wImXprozoOA== X-Received: by 2002:ac8:3946:: with SMTP id t6mr5864326qtb.278.1575390142260; Tue, 03 Dec 2019 08:22:22 -0800 (PST) Received: from raichu (toroon0560w-lp130-05-69-158-183-252.dsl.bell.ca. [69.158.183.252]) by smtp.gmail.com with ESMTPSA id u12sm2054577qta.79.2019.12.03.08.22.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 03 Dec 2019 08:22:21 -0800 (PST) Sender: Mark Johnston Date: Tue, 3 Dec 2019 11:22:19 -0500 From: Mark Johnston To: Andriy Gapon Cc: Peter Eriksson , freebsd-fs@freebsd.org Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Message-ID: <20191203162219.GI43802@raichu> References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> User-Agent: Mutt/1.12.2 (2019-09-21) X-Rspamd-Queue-Id: 47S6hJ4l7Mz4ZFL X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-6.00 / 15.00]; NEURAL_HAM_MEDIUM(-1.00)[-0.996,0]; REPLY(-4.00)[]; NEURAL_HAM_LONG(-1.00)[-0.999,0] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 16:22:24 -0000 On Tue, Dec 03, 2019 at 11:03:11AM +0200, Andriy Gapon wrote: > On 03/12/2019 00:54, Mark Johnston wrote: > > On Mon, Dec 02, 2019 at 11:39:01PM +0100, Peter Eriksson wrote: > >> Sigh. > >> > >> Slight correction, the output below should have said uma_zdestroy() and not uma_zfree_arg() (wrong printf text, but the right times). > >> > >> After an uptime of 7 hours, a reboot have these times (I removed the “uma” printf in this run): > >> > >> kmem_cache_destroy(zio_data_buf_cache[8]) took 2 seconds > >> kmem_cache_destroy(zio_buf_cache[10]) took 6 seconds > >> kmem_cache_destroy(zio_buf_cache[14]) took 2 seconds > >> kmem_cache_destroy(zio_buf_cache[16]) took 136 seconds > >> kmem_cache_destroy(zio_buf_cache[20]) took 31 seconds > >> kmem_cache_destroy(zio_buf_cache[28]) took 303 seconds > >> kmem_cache_destroy(zio_buf_cache[224]) took 89 seconds > >> kmem_cache_destroy(zio_data_buf_cache[224]) took 31 seconds > >> > >> This is on a mostly idle server (well, apart from compiling the kernel code :-) and some snapshots being taken of all filesystems (once per hour). > >> > >> > >> So now on to finding out why uma_destroy() is taking so long… :-). > > > > uma_destroy() frees all of the memory cached in the zone back to the > > page allocator. This operation takes time proportional to the number of > > cached items. I would expect most of the time to be spent in > > zone_reclaim(), called by zone_dtor(). > > But spending *minutes* there is really unexpected. > I have never seen anything like that. > I wonder if there is anything untypical about the system's hardware (like a very > big number of processors) or configuration. Indeed, looking at the vmstat -z output this doesn't really make sense. It would be useful to confirm that we are in fact spending most of the time reclaiming items. From owner-freebsd-fs@freebsd.org Tue Dec 3 17:03:05 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D64A51B5FEC for ; Tue, 3 Dec 2019 17:03:05 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [IPv6:2001:6b0:17:f0a0::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47S7bD2VFgz4dq6; Tue, 3 Dec 2019 17:03:04 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id AD20F4000A; Tue, 3 Dec 2019 18:03:00 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075] (unknown [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 7CAA140006; Tue, 3 Dec 2019 18:03:00 +0100 (CET) From: Peter Eriksson Message-Id: Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Date: Tue, 3 Dec 2019 18:03:00 +0100 In-Reply-To: Cc: Andriy Gapon , Bengt Ahlgren To: FreeBSD FS References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47S7bD2VFgz4dq6 X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 2001:6b0:17:f0a0::3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-3.42 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; MV_CASE(0.50)[]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; RCVD_TLS_LAST(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.a.0.f.7.1.0.0.0.b.6.0.1.0.0.2.list.dnswl.org : 127.0.11.2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; IP_SCORE(-0.92)[ip: (-3.77), ipnet: 2001:6b0::/32(-0.46), asn: 1653(-0.38), country: EU(-0.00)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ASN(0.00)[asn:1653, ipnet:2001:6b0::/32, country:EU]; MID_RHS_MATCH_FROM(0.00)[] Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 17:03:05 -0000 We run without swap on our servers=E2=80=A6 I=E2=80=99m trying to instrument the uma_zdestroy() code now, should = provide some more details in an hour or two=E2=80=A6 :-) - Peter > On 3 Dec 2019, at 17:59, Bengt Ahlgren wrote: >=20 > Peter Eriksson > = writes: >=20 >>>> uma_destroy() frees all of the memory cached in the zone back to = the >>>> page allocator. This operation takes time proportional to the = number of >>>> cached items. I would expect most of the time to be spent in >>>> zone_reclaim(), called by zone_dtor(). >>>=20 >>> But spending *minutes* there is really unexpected. >>> I have never seen anything like that. >>> I wonder if there is anything untypical about the system's hardware >>> (like a very >>> big number of processors) or configuration. >>>=20 >>> --=20 >>> Andriy Gapon >>=20 >> It=E2=80=99s a fairly standard Dell PowerEdge R730xd server with = Intel Xeon >> E5-2620v4 CPUs & 256GB of RAM=E2=80=A6 (and an LSI SAS3 HBA and Intel = 10GE >> ethernet) >>=20 >> CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (2100.04-MHz K8-class = CPU) >> Origin=3D"GenuineIntel" Id=3D0x406f1 Family=3D0x6 Model=3D0x4f = Stepping=3D1 >> = Features=3D0xbfebfbff >> = Features2=3D0x7ffefbff >> AMD Features=3D0x2c100800 >> AMD Features2=3D0x121 >> Structured Extended >> = Features=3D0x21cbfbb >> Structured Extended = Features3=3D0x9c000400 >> XSAVE Features=3D0x1 >> VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr >> TSC: P-state invariant, performance statistics >> real memory =3D 274869518336 (262136 MB) >> avail memory =3D 267244859392 (254864 MB) >> Event timer "LAPIC" quality 600 >> ACPI APIC Table: >> FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs >> FreeBSD/SMP: 2 package(s) x 8 core(s) x 2 hardware threads >>=20 >> - Peter >=20 > I also have a Dell server, but older (R515, 128GB RAM, AMD 2x8 cores, > ~140TB disk, ARC cap 64GB) that has experienced the same behaviour. I > get the impression the shutdown time is related to used swap, but I = have > not investigated further. I think the system disk (with swap - > mechanical!) activity light is on during this time. Now running = 11.3-R, > but I'm sure I saw this with late 10.x as well. >=20 > Bengt From owner-freebsd-fs@freebsd.org Tue Dec 3 17:31:52 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 806E11B6E83 for ; Tue, 3 Dec 2019 17:31:52 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [IPv6:2001:6b0:17:f0a0::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47S8DR2h7lz3Bv5; Tue, 3 Dec 2019 17:31:51 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 94B7440006; Tue, 3 Dec 2019 18:31:49 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075] (unknown [IPv6:2001:6b0:17:f002:182f:9bd3:109e:7075]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 77FF540005; Tue, 3 Dec 2019 18:31:49 +0100 (CET) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() From: Peter Eriksson In-Reply-To: <20191203162219.GI43802@raichu> Date: Tue, 3 Dec 2019 18:31:49 +0100 Cc: Andriy Gapon , Mark Johnston Content-Transfer-Encoding: quoted-printable Message-Id: <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> To: FreeBSD FS X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47S8DR2h7lz3Bv5 X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 2001:6b0:17:f0a0::3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-3.67 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; MV_CASE(0.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.a.0.f.7.1.0.0.0.b.6.0.1.0.0.2.list.dnswl.org : 127.0.11.2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; IP_SCORE(-1.17)[ip: (-5.00), ipnet: 2001:6b0::/32(-0.46), asn: 1653(-0.38), country: EU(-0.00)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:1653, ipnet:2001:6b0::/32, country:EU]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 17:31:52 -0000 zio_fini: kmem_cache_destroy(zio_buf_cache & zio_data_buf_cache): zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 16 seconds uma_zdestroy(zio_buf_12288) took 16 seconds kmem_cache_destroy: uma_zdestroy(0xfffff8034678e980) [zio_buf_12288] = took 16 seconds kmem_cache_destroy(zio_buf_cache[20]) took 16 seconds zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 60 seconds uma_zdestroy(zio_buf_16384) took 60 seconds kmem_cache_destroy: uma_zdestroy(0xfffff8034678e880) [zio_buf_16384] = took 60 seconds kmem_cache_destroy(zio_buf_cache[28]) took 60 seconds Looks like it is the call to zone->uz_dtor that is taking time (added = time check around the various parts of zone_free_item(). - Peter > Indeed, looking at the vmstat -z output this doesn't really make = sense. > It would be useful to confirm that we are in fact spending most of the > time reclaiming items. From owner-freebsd-fs@freebsd.org Tue Dec 3 22:28:54 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 149671BFCA5 for ; Tue, 3 Dec 2019 22:28:54 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47SGq72CSbz42rV for ; Tue, 3 Dec 2019 22:28:50 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 91CCA4000A for ; Tue, 3 Dec 2019 23:28:48 +0100 (CET) Received: from [192.168.1.132] (h-201-140.A785.priv.bahnhof.se [98.128.201.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 5414A40006 for ; Tue, 3 Dec 2019 23:28:48 +0100 (CET) From: Peter Eriksson Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Date: Tue, 3 Dec 2019 23:28:48 +0100 References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> To: FreeBSD FS In-Reply-To: <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> Message-Id: <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47SGq72CSbz42rV X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 130.236.254.3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-6.00 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCPT_COUNT_ONE(0.00)[1]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.254.236.130.list.dnswl.org : 127.0.11.2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; MV_CASE(0.50)[]; IP_SCORE(-3.50)[ip: (-9.17), ipnet: 130.236.0.0/16(-4.60), asn: 2843(-3.68), country: SE(-0.03)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:2843, ipnet:130.236.0.0/16, country:SE]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 03 Dec 2019 22:28:54 -0000 =E2=80=A6 and X levels deeper it gives this (not all filesystems mounted = - I rebooted fairly quickly): keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds [20021 = loops] zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 seconds zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 14 seconds uma_zdestroy(zio_buf_12288) took 14 seconds kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] = took 14 seconds kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds Hmm=E2=80=A6 I wonder if it isn=E2=80=99t this code I = /usr/src/sys/sys/queue.h: > #define SLIST_REMOVE(head, elm, type, field) do { = \ > QMD_SAVELINK(oldnext, (elm)->field.sle_next); = \ > if (SLIST_FIRST((head)) =3D=3D (elm)) { = \ > SLIST_REMOVE_HEAD((head), field); = \ > } = \ > else { = \ > QUEUE_TYPEOF(type) *curelm =3D SLIST_FIRST(head); = \ > while (SLIST_NEXT(curelm, field) !=3D (elm)) = \ > curelm =3D SLIST_NEXT(curelm, field); = \ > SLIST_REMOVE_AFTER(curelm, field); = \ > } = \ > TRASHIT(*oldnext); = \ > } while (0) Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain(): > finished: > KEG_UNLOCK(keg); > > while ((slab =3D SLIST_FIRST(&freeslabs)) !=3D NULL) { > SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink); > keg_free_slab(keg, slab, keg->uk_ipers); > } (The keg_drain print above is the time for the while() loop). If I=E2=80=99m reading this right it looks like it=E2=80=99s potentially = doing a linear search through a (long) linked list, for every entry in = the free slabs list. 20000 entries in 14 seconds is 1400 entries per = second. Hmmm=E2=80=A6 It feels like this code could be optimised at bit. - Peter > On 3 Dec 2019, at 18:31, Peter Eriksson wrote: >=20 > zio_fini: kmem_cache_destroy(zio_buf_cache & zio_data_buf_cache): >=20 > zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 16 seconds > uma_zdestroy(zio_buf_12288) took 16 seconds > kmem_cache_destroy: uma_zdestroy(0xfffff8034678e980) [zio_buf_12288] = took 16 seconds > kmem_cache_destroy(zio_buf_cache[20]) took 16 seconds >=20 > zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 60 seconds > uma_zdestroy(zio_buf_16384) took 60 seconds > kmem_cache_destroy: uma_zdestroy(0xfffff8034678e880) [zio_buf_16384] = took 60 seconds > kmem_cache_destroy(zio_buf_cache[28]) took 60 seconds >=20 >=20 > Looks like it is the call to zone->uz_dtor that is taking time (added = time check around the various parts of zone_free_item(). >=20 > - Peter >=20 >> Indeed, looking at the vmstat -z output this doesn't really make = sense. >> It would be useful to confirm that we are in fact spending most of = the >> time reclaiming items. >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Wed Dec 4 09:48:32 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 441E01CF410 for ; Wed, 4 Dec 2019 09:48:32 +0000 (UTC) (envelope-from agapon@gmail.com) Received: from mail-lj1-f170.google.com (mail-lj1-f170.google.com [209.85.208.170]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1O1" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 47SYvM1wNwz4f0s for ; Wed, 4 Dec 2019 09:48:31 +0000 (UTC) (envelope-from agapon@gmail.com) Received: by mail-lj1-f170.google.com with SMTP id j6so7344112lja.2 for ; Wed, 04 Dec 2019 01:48:31 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:openpgp:autocrypt :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=pB/zAWZTSgoeeyV0AaGHYdDBdi0bWMiwwMr2wAGCAk4=; b=CRuX9EuBALqO/VblvJu/i2+6wYgaFETqlwn9NExsKs+m8tdK6LVg4E4jgm/pHQH2Ef iG0sfFEiT1VzqFiHTKPD8vvX5dH3c2I0KB6d1JgpILY5e+7zDHjv+ZxBdijh1egMtq4T Svv1xeSgWWWFRuW4GfZEWCQ+aNAU+eisslgRrx2/F0/oTL3hvUS0COiU6fqm6/87Ob8x 4VwITOnRq5tP1aZ+PpuEWR3amVEZmMpHUVqL26+CkHO/TFv62JtWIVmINtQxNxbxoMHk js59YyfbknWjO7B8WNWegg0P7J+YZa1hRZccHu9nR6etYEVJPwXYlLnQBIDmCdFD5ueC qO2Q== X-Gm-Message-State: APjAAAWrggIHyh8yG5hpW+qmMXcPBZtwcuUAsSKFa2DZRr6XOxwwlaQZ y/l1KG/yT1u1akWiD5O8QuAqDqhNY0g= X-Google-Smtp-Source: APXvYqwvBECiCFHl+Tgizy9xP9nja/04hJFLJMX4GVVRcXtWMCBexIXvQVFee+I+qTvRUk1X5I5/PQ== X-Received: by 2002:a2e:9855:: with SMTP id e21mr1379063ljj.170.1575452909195; Wed, 04 Dec 2019 01:48:29 -0800 (PST) Received: from [192.168.0.88] (east.meadow.volia.net. [93.72.151.96]) by smtp.googlemail.com with ESMTPSA id n19sm2909290lfl.85.2019.12.04.01.48.27 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 04 Dec 2019 01:48:28 -0800 (PST) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() To: Peter Eriksson , FreeBSD FS References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> From: Andriy Gapon Openpgp: preference=signencrypt Autocrypt: addr=avg@FreeBSD.org; prefer-encrypt=mutual; keydata= mQINBFm4LIgBEADNB/3lT7f15UKeQ52xCFQx/GqHkSxEdVyLFZTmY3KyNPQGBtyvVyBfprJ7 mAeXZWfhat6cKNRAGZcL5EmewdQuUfQfBdYmKjbw3a9GFDsDNuhDA2QwFt8BmkiVMRYyvI7l N0eVzszWCUgdc3qqM6qqcgBaqsVmJluwpvwp4ZBXmch5BgDDDb1MPO8AZ2QZfIQmplkj8Y6Z AiNMknkmgaekIINSJX8IzRzKD5WwMsin70psE8dpL/iBsA2cpJGzWMObVTtCxeDKlBCNqM1i gTXta1ukdUT7JgLEFZk9ceYQQMJJtUwzWu1UHfZn0Fs29HTqawfWPSZVbulbrnu5q55R4PlQ /xURkWQUTyDpqUvb4JK371zhepXiXDwrrpnyyZABm3SFLkk2bHlheeKU6Yql4pcmSVym1AS4 dV8y0oHAfdlSCF6tpOPf2+K9nW1CFA8b/tw4oJBTtfZ1kxXOMdyZU5fiG7xb1qDgpQKgHUX8 7Rd2T1UVLVeuhYlXNw2F+a2ucY+cMoqz3LtpksUiBppJhw099gEXehcN2JbUZ2TueJdt1FdS ztnZmsHUXLxrRBtGwqnFL7GSd6snpGIKuuL305iaOGODbb9c7ne1JqBbkw1wh8ci6vvwGlzx rexzimRaBzJxlkjNfMx8WpCvYebGMydNoeEtkWldtjTNVsUAtQARAQABtB5BbmRyaXkgR2Fw b24gPGF2Z0BGcmVlQlNELm9yZz6JAlQEEwEIAD4WIQS+LEO7ngQnXA4Bjr538m7TUc1yjwUC WbgsiAIbIwUJBaOagAULCQgHAgYVCAkKCwIEFgIDAQIeAQIXgAAKCRB38m7TUc1yj+JAEACV l9AK/nOWAt/9cufV2fRj0hdOqB1aCshtSrwHk/exXsDa4/FkmegxXQGY+3GWX3deIyesbVRL rYdtdK0dqJyT1SBqXK1h3/at9rxr9GQA6KWOxTjUFURsU7ok/6SIlm8uLRPNKO+yq0GDjgaO LzN+xykuBA0FlhQAXJnpZLcVfPJdWv7sSHGedL5ln8P8rxR+XnmsA5TUaaPcbhTB+mG+iKFj GghASDSfGqLWFPBlX/fpXikBDZ1gvOr8nyMY9nXhgfXpq3B6QCRYKPy58ChrZ5weeJZ29b7/ QdEO8NFNWHjSD9meiLdWQaqo9Y7uUxN3wySc/YUZxtS0bhAd8zJdNPsJYG8sXgKjeBQMVGuT eCAJFEYJqbwWvIXMfVWop4+O4xB+z2YE3jAbG/9tB/GSnQdVSj3G8MS80iLS58frnt+RSEw/ psahrfh0dh6SFHttE049xYiC+cM8J27Aaf0i9RflyITq57NuJm+AHJoU9SQUkIF0nc6lfA+o JRiyRlHZHKoRQkIg4aiKaZSWjQYRl5Txl0IZUP1dSWMX4s3XTMurC/pnja45dge/4ESOtJ9R 8XuIWg45Oq6MeIWdjKddGhRj3OohsltKgkEU3eLKYtB6qRTQypHHUawCXz88uYt5e3w4V16H lCpSTZV/EVHnNe45FVBlvK7k7HFfDDkryLkCDQRZuCyIARAAlq0slcsVboY/+IUJdcbEiJRW be9HKVz4SUchq0z9MZPX/0dcnvz/gkyYA+OuM78dNS7Mbby5dTvOqfpLJfCuhaNYOhlE0wY+ 1T6Tf1f4c/uA3U/YiadukQ3+6TJuYGAdRZD5EqYFIkreARTVWg87N9g0fT9BEqLw9lJtEGDY EWUE7L++B8o4uu3LQFEYxcrb4K/WKmgtmFcm77s0IKDrfcX4doV92QTIpLiRxcOmCC/OCYuO jB1oaaqXQzZrCutXRK0L5XN1Y1PYjIrEzHMIXmCDlLYnpFkK+itlXwlE2ZQxkfMruCWdQXye syl2fynAe8hvp7Mms9qU2r2K9EcJiR5N1t1C2/kTKNUhcRv7Yd/vwusK7BqJbhlng5ZgRx0m WxdntU/JLEntz3QBsBsWM9Y9wf2V4tLv6/DuDBta781RsCB/UrU2zNuOEkSixlUiHxw1dccI 6CVlaWkkJBxmHX22GdDFrcjvwMNIbbyfQLuBq6IOh8nvu9vuItup7qemDG3Ms6TVwA7BD3j+ 3fGprtyW8Fd/RR2bW2+LWkMrqHffAr6Y6V3h5kd2G9Q8ZWpEJk+LG6Mk3fhZhmCnHhDu6CwN MeUvxXDVO+fqc3JjFm5OxhmfVeJKrbCEUJyM8ESWLoNHLqjywdZga4Q7P12g8DUQ1mRxYg/L HgZY3zfKOqcAEQEAAYkCPAQYAQgAJhYhBL4sQ7ueBCdcDgGOvnfybtNRzXKPBQJZuCyIAhsM BQkFo5qAAAoJEHfybtNRzXKPBVwQAKfFy9P7N3OsLDMB56A4Kf+ZT+d5cIx0Yiaf4n6w7m3i ImHHHk9FIetI4Xe54a2IXh4Bq5UkAGY0667eIs+Z1Ea6I2i27Sdo7DxGwq09Qnm/Y65ADvXs 3aBvokCcm7FsM1wky395m8xUos1681oV5oxgqeRI8/76qy0hD9WR65UW+HQgZRIcIjSel9vR XDaD2HLGPTTGr7u4v00UeTMs6qvPsa2PJagogrKY8RXdFtXvweQFz78NbXhluwix2Tb9ETPk LIpDrtzV73CaE2aqBG/KrboXT2C67BgFtnk7T7Y7iKq4/XvEdDWscz2wws91BOXuMMd4c/c4 OmGW9m3RBLufFrOag1q5yUS9QbFfyqL6dftJP3Zq/xe+mr7sbWbhPVCQFrH3r26mpmy841ym dwQnNcsbIGiBASBSKksOvIDYKa2Wy8htPmWFTEOPRpFXdGQ27awcjjnB42nngyCK5ukZDHi6 w0qK5DNQQCkiweevCIC6wc3p67jl1EMFY5+z+zdTPb3h7LeVnGqW0qBQl99vVFgzLxchKcl0 R/paSFgwqXCZhAKMuUHncJuynDOP7z5LirUeFI8qsBAJi1rXpQoLJTVcW72swZ42IdPiboqx NbTMiNOiE36GqMcTPfKylCbF45JNX4nF9ElM0E+Y8gi4cizJYBRr2FBJgay0b9Cp Message-ID: Date: Wed, 4 Dec 2019 11:48:26 +0200 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: 47SYvM1wNwz4f0s X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=none; spf=pass (mx1.freebsd.org: domain of agapon@gmail.com designates 209.85.208.170 as permitted sender) smtp.mailfrom=agapon@gmail.com X-Spamd-Result: default: False [-3.11 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; RCVD_TLS_ALL(0.00)[]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+ip4:209.85.128.0/17]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; MIME_GOOD(-0.10)[text/plain]; PREVIOUSLY_DELIVERED(0.00)[freebsd-fs@freebsd.org]; DMARC_NA(0.00)[FreeBSD.org]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; RCVD_COUNT_THREE(0.00)[3]; MIME_TRACE(0.00)[0:+]; TO_DN_ALL(0.00)[]; RCPT_COUNT_TWO(0.00)[2]; RCVD_IN_DNSWL_NONE(0.00)[170.208.85.209.list.dnswl.org : 127.0.5.0]; TO_MATCH_ENVRCPT_SOME(0.00)[]; IP_SCORE(-1.11)[ip: (-0.42), ipnet: 209.85.128.0/17(-3.15), asn: 15169(-1.93), country: US(-0.05)]; FORGED_SENDER(0.30)[avg@FreeBSD.org,agapon@gmail.com]; RWL_MAILSPIKE_POSSIBLE(0.00)[170.208.85.209.rep.mailspike.net : 127.0.0.17]; R_DKIM_NA(0.00)[]; FREEMAIL_ENVFROM(0.00)[gmail.com]; ASN(0.00)[asn:15169, ipnet:209.85.128.0/17, country:US]; FROM_NEQ_ENVFROM(0.00)[avg@FreeBSD.org,agapon@gmail.com]; MID_RHS_MATCH_FROM(0.00)[]; RECEIVED_SPAMHAUS_PBL(0.00)[96.151.72.93.khpj7ygk5idzvmvt5x4ziurxhy.zen.dq.spamhaus.net : 127.0.0.10] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Dec 2019 09:48:32 -0000 On 04/12/2019 00:28, Peter Eriksson wrote: > … and X levels deeper it gives this (not all filesystems mounted - I rebooted fairly quickly): > > keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds [20021 loops] > zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 seconds > zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds > zone_free_item(zone=UMA Zones): zone->uz_dtor() took 14 seconds > uma_zdestroy(zio_buf_12288) took 14 seconds > kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] took 14 seconds > kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds > > > Hmm… I wonder if it isn’t this code I /usr/src/sys/sys/queue.h: > >> #define SLIST_REMOVE(head, elm, type, field) do { \ >> QMD_SAVELINK(oldnext, (elm)->field.sle_next); \ >> if (SLIST_FIRST((head)) == (elm)) { \ >> SLIST_REMOVE_HEAD((head), field); \ >> } \ >> else { \ >> QUEUE_TYPEOF(type) *curelm = SLIST_FIRST(head); \ >> while (SLIST_NEXT(curelm, field) != (elm)) \ >> curelm = SLIST_NEXT(curelm, field); \ >> SLIST_REMOVE_AFTER(curelm, field); \ >> } \ >> TRASHIT(*oldnext); \ >> } while (0) > > Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain(): > >> finished: >> KEG_UNLOCK(keg); >> >> while ((slab = SLIST_FIRST(&freeslabs)) != NULL) { >> SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink); >> keg_free_slab(keg, slab, keg->uk_ipers); >> } > > (The keg_drain print above is the time for the while() loop). > > If I’m reading this right it looks like it’s potentially doing a linear search through a (long) linked list, for every entry in the free slabs list. 20000 entries in 14 seconds is 1400 entries per second. It's not a search. The loop is removing *all* items. There is no other way to do it :-) Expanding your calculations, it takes on the order of a millisecond to process a single slab. I bet that the time is spent in keg_free_slab() and it's a lot of time. I wonder if the OR should actually be AND in this condition: if (!uma_dbg_kskip(keg, slab->us_data + (keg->uk_rsize * i)) || keg->uk_fini != trash_fini) Looks like there is a "ton" of memory reading done here. -- Andriy Gapon From owner-freebsd-fs@freebsd.org Wed Dec 4 10:13:48 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 241C01CFBC7 for ; Wed, 4 Dec 2019 10:13:48 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47SZSW66psz4gDc; Wed, 4 Dec 2019 10:13:47 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 2859B40017; Wed, 4 Dec 2019 11:13:44 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:8d12:4cc9:4751:6d43] (unknown [IPv6:2001:6b0:17:f002:8d12:4cc9:4751:6d43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 0846F4000D; Wed, 4 Dec 2019 11:13:44 +0100 (CET) From: Peter Eriksson Message-Id: Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Date: Wed, 4 Dec 2019 11:13:43 +0100 In-Reply-To: Cc: Andriy Gapon To: FreeBSD FS References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47SZSW66psz4gDc X-Spamd-Bar: ----- Authentication-Results: mx1.freebsd.org; none X-Spamd-Result: default: False [-5.99 / 15.00]; NEURAL_HAM_MEDIUM(-0.99)[-0.993,0]; NEURAL_HAM_LONG(-1.00)[-0.998,0]; REPLY(-4.00)[] Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Dec 2019 10:13:48 -0000 Yeah, I noticed that after I sent that email (should send stuff just = before you go to bed :-) Anyway, some more timing/debug printfs point to the keg->uk_freef(mem) = call in keg_free_slab() which sometimes take about a second to execute. (Should probably switch to some other time source with greater = resolution than 1s next :-) keg_free_slab: keg->uk_freef(mem) took 1 seconds keg_free_slab: keg->uk_freef(mem) took 1 seconds keg_free_slab: keg->uk_freef(mem) took 1 seconds keg_free_slab: keg->uk_freef(mem) took 1 seconds keg_free_slab: keg->uk_freef(mem) took 1 seconds keg_free_slab: keg->uk_freef(mem) took 1 seconds keg_free_slab: keg->uk_freef(mem) took 1 seconds - Peter > On 4 Dec 2019, at 10:48, Andriy Gapon wrote: >=20 > On 04/12/2019 00:28, Peter Eriksson wrote: >> =E2=80=A6 and X levels deeper it gives this (not all filesystems = mounted - I rebooted fairly quickly): >>=20 >> keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds = [20021 loops] >> zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 seconds >> zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds >> zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 14 seconds >> uma_zdestroy(zio_buf_12288) took 14 seconds >> kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] = took 14 seconds >> kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds >>=20 >>=20 >> Hmm=E2=80=A6 I wonder if it isn=E2=80=99t this code I = /usr/src/sys/sys/queue.h: >>=20 >>> #define SLIST_REMOVE(head, elm, type, field) do { = \ >>> QMD_SAVELINK(oldnext, (elm)->field.sle_next); = \ >>> if (SLIST_FIRST((head)) =3D=3D (elm)) { = \ >>> SLIST_REMOVE_HEAD((head), field); = \ >>> } = \ >>> else { = \ >>> QUEUE_TYPEOF(type) *curelm =3D SLIST_FIRST(head); = \ >>> while (SLIST_NEXT(curelm, field) !=3D (elm)) = \ >>> curelm =3D SLIST_NEXT(curelm, field); = \ >>> SLIST_REMOVE_AFTER(curelm, field); = \ >>> } = \ >>> TRASHIT(*oldnext); = \ >>> } while (0) >>=20 >> Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain(): >>=20 >>> finished: >>> KEG_UNLOCK(keg); >>>=20 >>> while ((slab =3D SLIST_FIRST(&freeslabs)) !=3D NULL) { >>> SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink); >>> keg_free_slab(keg, slab, keg->uk_ipers); >>> } >>=20 >> (The keg_drain print above is the time for the while() loop). >>=20 >> If I=E2=80=99m reading this right it looks like it=E2=80=99s = potentially doing a linear search through a (long) linked list, for = every entry in the free slabs list. 20000 entries in 14 seconds is 1400 = entries per second. >=20 > It's not a search. The loop is removing *all* items. There is no = other way to > do it :-) > Expanding your calculations, it takes on the order of a millisecond to = process a > single slab. I bet that the time is spent in keg_free_slab() and it's = a lot of > time. >=20 > I wonder if the OR should actually be AND in this condition: > if (!uma_dbg_kskip(keg, slab->us_data + (keg->uk_rsize = * i)) || > keg->uk_fini !=3D trash_fini) > Looks like there is a "ton" of memory reading done here. >=20 > --=20 > Andriy Gapon From owner-freebsd-fs@freebsd.org Wed Dec 4 12:48:10 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 845CB1B6686 for ; Wed, 4 Dec 2019 12:48:10 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [IPv6:2001:6b0:17:f0a0::3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47Sdtc69cjz3Jq1; Wed, 4 Dec 2019 12:48:08 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 1AA4F4000D; Wed, 4 Dec 2019 13:48:06 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:8d12:4cc9:4751:6d43] (unknown [IPv6:2001:6b0:17:f002:8d12:4cc9:4751:6d43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 998C54000E; Wed, 4 Dec 2019 13:48:05 +0100 (CET) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() From: Peter Eriksson In-Reply-To: Date: Wed, 4 Dec 2019 13:48:05 +0100 Cc: Andriy Gapon Content-Transfer-Encoding: quoted-printable Message-Id: <1119144A-C6DA-4CBF-8616-108FB50D2F8D@lysator.liu.se> References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> To: FreeBSD FS X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47Sdtc69cjz3Jq1 X-Spamd-Bar: --- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 2001:6b0:17:f0a0::3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-3.91 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; MV_CASE(0.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.a.0.f.7.1.0.0.0.b.6.0.1.0.0.2.list.dnswl.org : 127.0.11.2]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; IP_SCORE(-1.41)[ip: (-6.19), ipnet: 2001:6b0::/32(-0.46), asn: 1653(-0.38), country: EU(-0.00)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; ASN(0.00)[asn:1653, ipnet:2001:6b0::/32, country:EU]; MID_RHS_MATCH_FROM(0.00)[] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Dec 2019 12:48:10 -0000 Going deeper=E2=80=A6 now it=E2=80=99s pointing towards pmap_remove(), = called from kmem_unback(), which is called from kmem_free(), which is = called from page_free() which is called from keg_free_slab()... kmem_unback: pmap_remove(kernel_pmap, 18446741884786954240, = 18446741884786970624) took 1 seconds kmem_free: kmem_unback(kmem_object, 18446741884786954240, 16384) took 1 = seconds page_free: kmem_free(kmem_arena, 18446741884786954240, 16384) took 1 = seconds keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds kmem_unback: pmap_remove(kernel_pmap, 18446741884989739008, = 18446741884989755392) took 1 seconds kmem_free: kmem_unback(kmem_object, 18446741884989739008, 16384) took 1 = seconds page_free: kmem_free(kmem_arena, 18446741884989739008, 16384) took 1 = seconds keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds kmem_unback: pmap_remove(kernel_pmap, 18446741885257895936, = 18446741885257912320) took 1 seconds kmem_free: kmem_unback(kmem_object, 18446741885257895936, 16384) took 1 = seconds page_free: kmem_free(kmem_arena, 18446741885257895936, 16384) took 1 = seconds keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds - Peter > On 4 Dec 2019, at 11:13, Peter Eriksson wrote: >=20 > Yeah, I noticed that after I sent that email (should send stuff just = before you go to bed :-) >=20 > Anyway, some more timing/debug printfs point to the keg->uk_freef(mem) = call in keg_free_slab() which sometimes take about a second to execute. > (Should probably switch to some other time source with greater = resolution than 1s next :-) >=20 > keg_free_slab: keg->uk_freef(mem) took 1 seconds > keg_free_slab: keg->uk_freef(mem) took 1 seconds > keg_free_slab: keg->uk_freef(mem) took 1 seconds > keg_free_slab: keg->uk_freef(mem) took 1 seconds > keg_free_slab: keg->uk_freef(mem) took 1 seconds > keg_free_slab: keg->uk_freef(mem) took 1 seconds > keg_free_slab: keg->uk_freef(mem) took 1 seconds >=20 > - Peter >=20 >> On 4 Dec 2019, at 10:48, Andriy Gapon wrote: >>=20 >> On 04/12/2019 00:28, Peter Eriksson wrote: >>> =E2=80=A6 and X levels deeper it gives this (not all filesystems = mounted - I rebooted fairly quickly): >>>=20 >>> keg_drain: while (SLIST_FIRST/SLIST_REMOVE)-loop took 14 seconds = [20021 loops] >>> zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 14 = seconds >>> zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 14 seconds >>> zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 14 seconds >>> uma_zdestroy(zio_buf_12288) took 14 seconds >>> kmem_cache_destroy: uma_zdestroy(0xfffff803467c8ac0) [zio_buf_12288] = took 14 seconds >>> kmem_cache_destroy(zio_buf_cache[20]) took 14 seconds >>>=20 >>>=20 >>> Hmm=E2=80=A6 I wonder if it isn=E2=80=99t this code I = /usr/src/sys/sys/queue.h: >>>=20 >>>> #define SLIST_REMOVE(head, elm, type, field) do { = \ >>>> QMD_SAVELINK(oldnext, (elm)->field.sle_next); = \ >>>> if (SLIST_FIRST((head)) =3D=3D (elm)) { = \ >>>> SLIST_REMOVE_HEAD((head), field); = \ >>>> } = \ >>>> else { = \ >>>> QUEUE_TYPEOF(type) *curelm =3D SLIST_FIRST(head); = \ >>>> while (SLIST_NEXT(curelm, field) !=3D (elm)) = \ >>>> curelm =3D SLIST_NEXT(curelm, field); = \ >>>> SLIST_REMOVE_AFTER(curelm, field); = \ >>>> } = \ >>>> TRASHIT(*oldnext); = \ >>>> } while (0) >>>=20 >>> Combined with this in /usr/src/sys/vm/uma_core.c: keg_drain(): >>>=20 >>>> finished: >>>> KEG_UNLOCK(keg); >>>>=20 >>>> while ((slab =3D SLIST_FIRST(&freeslabs)) !=3D NULL) { >>>> SLIST_REMOVE(&freeslabs, slab, uma_slab, us_hlink); >>>> keg_free_slab(keg, slab, keg->uk_ipers); >>>> } >>>=20 >>> (The keg_drain print above is the time for the while() loop). >>>=20 >>> If I=E2=80=99m reading this right it looks like it=E2=80=99s = potentially doing a linear search through a (long) linked list, for = every entry in the free slabs list. 20000 entries in 14 seconds is 1400 = entries per second. >>=20 >> It's not a search. The loop is removing *all* items. There is no = other way to >> do it :-) >> Expanding your calculations, it takes on the order of a millisecond = to process a >> single slab. I bet that the time is spent in keg_free_slab() and = it's a lot of >> time. >>=20 >> I wonder if the OR should actually be AND in this condition: >> if (!uma_dbg_kskip(keg, slab->us_data + (keg->uk_rsize = * i)) || >> keg->uk_fini !=3D trash_fini) >> Looks like there is a "ton" of memory reading done here. >>=20 >> --=20 >> Andriy Gapon >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@freebsd.org Wed Dec 4 15:20:10 2019 Return-Path: Delivered-To: freebsd-fs@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 03EA51C6FB8 for ; Wed, 4 Dec 2019 15:20:10 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (mail.lysator.liu.se [130.236.254.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 47SjFz1r8Hz3xYm; Wed, 4 Dec 2019 15:20:07 +0000 (UTC) (envelope-from pen@lysator.liu.se) Received: from mail.lysator.liu.se (localhost [127.0.0.1]) by mail.lysator.liu.se (Postfix) with ESMTP id 5344F4000E; Wed, 4 Dec 2019 16:20:04 +0100 (CET) Received: from [IPv6:2001:6b0:17:f002:8d12:4cc9:4751:6d43] (unknown [IPv6:2001:6b0:17:f002:8d12:4cc9:4751:6d43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.lysator.liu.se (Postfix) with ESMTPSA id 1D96A4000D; Wed, 4 Dec 2019 16:20:04 +0100 (CET) From: Peter Eriksson Message-Id: Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3601.0.10\)) Subject: Re: Slow reboots due to ZFS cleanup in kern_shutdown() .. zio_fini() Date: Wed, 4 Dec 2019 16:20:03 +0100 In-Reply-To: <1119144A-C6DA-4CBF-8616-108FB50D2F8D@lysator.liu.se> Cc: Andriy Gapon To: FreeBSD FS References: <20191202225424.GG43802@raichu> <3b71fe37-c29f-e3e5-ff96-5dce15cc7553@FreeBSD.org> <20191203162219.GI43802@raichu> <247C6990-BC6E-4E3D-8CEF-5A861D8A25EC@lysator.liu.se> <4D44B25E-2F15-4536-A653-DA242051C8A9@lysator.liu.se> <1119144A-C6DA-4CBF-8616-108FB50D2F8D@lysator.liu.se> X-Mailer: Apple Mail (2.3601.0.10) X-Virus-Scanned: ClamAV using ClamSMTP X-Rspamd-Queue-Id: 47SjFz1r8Hz3xYm X-Spamd-Bar: ---- Authentication-Results: mx1.freebsd.org; dkim=none; dmarc=pass (policy=none) header.from=liu.se; spf=pass (mx1.freebsd.org: domain of pen@lysator.liu.se designates 130.236.254.3 as permitted sender) smtp.mailfrom=pen@lysator.liu.se X-Spamd-Result: default: False [-4.91 / 15.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; NEURAL_HAM_MEDIUM(-1.00)[-1.000,0]; FROM_HAS_DN(0.00)[]; R_SPF_ALLOW(-0.20)[+a:mail.lysator.liu.se]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[multipart/alternative,text/plain]; MV_CASE(0.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0]; URI_COUNT_ODD(1.00)[3]; RCVD_COUNT_THREE(0.00)[3]; RCVD_TLS_LAST(0.00)[]; TO_DN_ALL(0.00)[]; RCVD_IN_DNSWL_MED(-0.20)[3.254.236.130.list.dnswl.org : 127.0.11.2]; RCPT_COUNT_TWO(0.00)[2]; DMARC_POLICY_ALLOW(-0.50)[liu.se,none]; IP_SCORE(-3.41)[ip: (-8.93), ipnet: 130.236.0.0/16(-4.51), asn: 2843(-3.61), country: SE(-0.03)]; FROM_EQ_ENVFROM(0.00)[]; R_DKIM_NA(0.00)[]; MIME_TRACE(0.00)[0:+,1:+,2:~]; ASN(0.00)[asn:2843, ipnet:130.236.0.0/16, country:SE]; MID_RHS_MATCH_FROM(0.00)[] Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 04 Dec 2019 15:20:10 -0000 Yep, it=E2=80=99s pmap_remove() that occasionally takes ~1s instead of = being very quick. A test I ran: kmem_unback: pmap_remove(kernel_pmap, 18446741878385815552, = 18446741878385827840) took 1 seconds kmem_free: kmem_unback(kmem_object, 18446741878385815552, 12288) took 1 = seconds page_free: kmem_free(kmem_arena, 18446741878385815552, 12288) took 1 = seconds keg_free_slab: keg->uk_freef(mem) {page_free} took 1 seconds keg_drain: while()-keg_free_slab-loop took 17 seconds [23136 loops, 17 = slow calls] zone_drain_wait(): zone_foreach_keg(zone, &keg_drain) took 17 seconds zone_dtor(): zone_drain_wait(zone, M_WAITOK) took 17 seconds zone_free_item(zone=3DUMA Zones): zone->uz_dtor() took 17 seconds uma_zdestroy(zio_buf_12288) took 17 seconds kmem_cache_destroy: uma_zdestroy(0xfffff803465dd980) [zio_buf_12288] = took 17 seconds kmem_cache_destroy(zio_buf_cache[20]) took 17 seconds Ie, 17 out of 23136 calls to = keg_free_slab()->page_free()->kmem_free()->kmem_unback()->pmap_remove() = was slow (took atleast 1s). I created a bug report (242427) on bugs.freebsd.org = for this now in case someone knows how to = debug the PMAP stuff=E2=80=A6 :-) - Peter