Date: Mon, 15 Jul 2024 13:24:37 -0700
From: Jim Long
To: Dan Langille
Cc: freebsd-questions
Subject: Re: Unable to limit memory consumption with vfs.zfs.arc_max

Picking up this old thread since it's still vexing me....

On Sat, May 04, 2024 at 07:56:39AM -0400, Dan Langille wrote:
>
> This is from FreeBSD 14 on an Dell R730 in the basement (primary purpose, poudriere, and PostgreSQL, and running four FreshPorts nodes):
>
> From top:
>
> ARC: 34G Total, 14G MFU, 9963M MRU, 22M Anon, 1043M Header, 9268M Other
>      18G Compressed, 41G Uncompressed, 2.28:1 Ratio
>
> % grep arc /boot/loader.conf
> vfs.zfs.arc_max="36000M"
>
> Looks like the value to set is:
>
> % sysctl -a vfs.zfs.arc | grep max
> vfs.zfs.arc.max: 37748736000
>
> Perhaps not a good example, but this might be more appropriate:
>
> % grep vfs.zfs.arc.max /boot/loader.conf
> vfs.zfs.arc_max="1200M"
>
> with top showing:
>
> ARC: 1198M Total, 664M MFU, 117M MRU, 3141K Anon, 36M Header, 371M Other
>      550M Compressed, 1855M Uncompressed, 3.37:1 Ratio

Thank you, Dan, I appreciate you chiming in. Unfortunately, I think
I have those bases covered, although I'm open to anything I may have
missed:

# grep -i arc /boot/loader.conf /etc/sysctl.conf
/boot/loader.conf:vfs.zfs.arc.max=4294967296
/boot/loader.conf:vfs.zfs.arc_max=4294967296
/boot/loader.conf:vfs.zfs.arc.min=2147483648
/etc/sysctl.conf:vfs.zfs.arc_max=4294967296
/etc/sysctl.conf:vfs.zfs.arc.max=4294967296
/etc/sysctl.conf:vfs.zfs.arc.min=2147483648

# top -b
last pid: 16257;  load averages:  0.80,  1.15,  1.18  up 0+02:03:34    12:05:06
55 processes:  2 running, 53 sleeping
CPU: 11.7% user,  0.0% nice, 18.4% system,  0.1% interrupt, 69.9% idle
Mem: 32M Active, 141M Inact, 11G Wired, 3958M Free
ARC: 10G Total, 5143M MFU, 4679M MRU, 2304K Anon, 44M Header, 219M Other
     421M Compressed, 4744M Uncompressed, 11.28:1 Ratio

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
11057 root          1 127    0    59M    33M CPU0     0  60:16  82.28% ssh
11056 root          5  24    0    22M    12M pipewr   3   6:00   6.25% zfs
 1619 snmpd         1  20    0    34M    14M select   0   0:06   0.00% snmpd
 1344 root          1  20    0    14M  3884K select   3   0:03   0.00% devd
 1544 root          1  20    0    13M  2776K select   3   0:01   0.00% syslogd
 1661 root          1  68    0    22M  9996K select   0   0:01   0.00% sshd
 1587 ntpd          1  20    0    23M  5876K select   1   0:00   0.00% ntpd
14391 root          1  20    0    22M    11M select   3   0:00   0.00% sshd
 2098 root          1  20    0    24M    11M select   1   0:00   0.00% httpd
 1904 root          1  20    0    24M    11M select   2   0:00   0.00% httpd
 1870 root          1  20    0    19M  8688K select   2   0:00   0.00% sendmail
 2067 root          1  20    0    19M  8688K select   1   0:00   0.00% sendmail
 2066 65529         1  20    0    13M  4564K select   2   0:00   0.00% mathlm
 1883 65529         1  20    0    11M  2772K select   3   0:00   0.00% mathlm
14397 root          1  20    0    14M  4568K wait     1   0:00   0.00% bash
 1636 root          1  20    0    13M  2608K nanslp   0   0:00   0.00% cron
 2082 root          1  20    0    13M  2560K nanslp   3   0:00   0.00% cron
 1887 root          1  20    0    13M  2568K nanslp   2   0:00   0.00% cron

# sysctl -a | grep m.u_evictable
kstat.zfs.misc.arcstats.mfu_evictable_metadata: 0
kstat.zfs.misc.arcstats.mfu_evictable_data: 0
kstat.zfs.misc.arcstats.mru_evictable_metadata: 0
kstat.zfs.misc.arcstats.mru_evictable_data: 0

An mrtg graph is attached showing ARC bytes used
(kstat.zfs.misc.arcstats.size) in green, vs. ARC bytes max
(vfs.zfs.arc.max) in blue. We can see that daily, the ARC bytes used
blows right past the 4G limit. Most days, it is brought under control
by two reboots in /etc/crontab ("shutdown -r now" at 02:55, 05:35),
although some days the system is too far gone by the time the cron job
rolls around, and the system stays hung until I can get to the data
center and power cycle it.

I'm not very skilled at kernel debugging, but is a kernel PR in order?
This has happened with a GENERIC kernel across at least two builds of
14-STABLE:

FreeBSD 14.0-STABLE #0 stable/14-n267062-77205dbc1397: Thu Mar 28 12:12:02 PDT 2024
FreeBSD 14.1-STABLE #0 stable/14-n267886-4987c12cb878: Thu Jun 6 12:24:06 PDT 2024

Would it help to reproduce this with a -RELEASE version?

Thank you again, everyone.

Jim
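
P.S. For anyone who wants to run the same used-vs-max comparison that the mrtg graph plots, here is a minimal sketch in Python. It only parses text in the "name: value" form that sysctl prints; the helper names (parse_sysctl, arc_overshoot) are made up for illustration. The arc.max figure is the one from loader.conf above, while the size figure is an assumed stand-in for the "10G Total" that top reports, not a measured value.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines, as printed by sysctl, into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        value = value.strip()
        if value.isdigit():  # keep only plain integer values
            values[name.strip()] = int(value)
    return values


def arc_overshoot(values):
    """Return how many bytes the ARC size exceeds arc.max (0 if within the limit)."""
    size = values["kstat.zfs.misc.arcstats.size"]
    limit = values["vfs.zfs.arc.max"]
    return max(0, size - limit)


# Sample input. arc.max is the configured 4 GiB limit; the size is an
# ASSUMED value (10 GiB) approximating top's "10G Total", not a real reading.
SAMPLE = """\
kstat.zfs.misc.arcstats.size: 10737418240
vfs.zfs.arc.max: 4294967296
"""

values = parse_sysctl(SAMPLE)
over = arc_overshoot(values)
print(f"ARC over limit by {over / 2**30:.1f} GiB")
```

On a live system, the text could instead come from running "sysctl kstat.zfs.misc.arcstats.size vfs.zfs.arc.max" and passing its output to parse_sysctl.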