Subject: Re: RFC: How ZFS handles arc memory use
From: Alexander Motin
To: Rick Macklem, FreeBSD CURRENT, Garrett Wollman, Peter Eriksson
Date: Wed, 22 Oct 2025 11:05:53 -0400
List-Archive: https://lists.freebsd.org/archives/freebsd-current
Hi Rick,

On 22.10.2025 10:34, Rick Macklem wrote:
> A couple of people have reported problems with NFS servers,
> where essentially all of the system's memory gets exhausted.
> They see the problem on 14.n FreeBSD servers (which use the
> newer ZFS code) but not on 13.n servers.
>
> I am trying to learn how ZFS handles arc memory use to try
> to figure out what can be done about this problem.
>
> I know nothing about ZFS internals or UMA(9) internals,
> so I could be way off, but here is what I think is happening.
> (Please correct me on this.)
>
> The L1ARC uses uma_zalloc_arg()/uma_zfree_arg() to allocate
> the arc memory. The zones are created using uma_zcreate(),
> so they are regular zones. This means the pages come
> from a slab in a keg, and they are wired pages.
>
> The only time the size of the slab/keg is reduced by ZFS
> is when it calls uma_zone_reclaim(.., UMA_RECLAIM_DRAIN),
> which is called by arc_reap_cb(), triggered by arc_reap_cb_check().
>
> arc_reap_cb_check() uses arc_available_memory() and triggers
> arc_reap_cb() when arc_available_memory() returns a negative
> value.
>
> arc_available_memory() returns a negative value when
> zfs_arc_free_target (vfs.zfs.arc.free_target) is greater than freemem.
> (By default, zfs_arc_free_target is set to vm_cnt.v_free_target.)
>
> Does all of the above sound about right?

There are two mechanisms to reduce the ARC size: either from the ZFS
side in the way you described, or from the kernel side, when it calls
the ZFS low-memory handler arc_lowmem(). It feels somewhat like
overkill, but it came this way from Solaris.

Once the ARC size is reduced and evictions into the UMA caches have
happened, it is up to UMA how to drain its caches. ZFS might trigger
that itself, the kernel can do it, or, a few years back, I added a
mechanism for UMA caches to slowly shrink by themselves even without
memory pressure.

> This leads me to...
> - zfs_arc_free_target (vfs.zfs.arc.free_target) needs to be larger

There is a very delicate balance between ZFS and the kernel
(zfs_arc_free_target = vm_cnt.v_free_target). An imbalance there makes
one of them suffer.

> or
> - Most of the wired pages in the slab are per-CPU,
> so uma_zone_reclaim() needs to use UMA_RECLAIM_DRAIN_CPU
> on some systems. (Not the small test systems I have, where I
> cannot reproduce the problem.)

Per-CPU caches should be relatively small, IIRC on the order of dozens
or hundreds of allocations per CPU. Draining them is expensive and
should rarely be needed, unless you have too little RAM for the number
of CPUs you have.

> or
> - uma_zone_reclaim() needs to be called under other
> circumstances.
> or
> - ???
>
> How can you tell if a keg/slab is per-CPU?
> (For my simple test system, I only see "UMA Slabs 0:" and
> "UMA Slabs 1:". It looks like "UMA Slabs 0:" is being used for
> ZFS arc allocation on this simple test system.)
>
> Hopefully folk who understand ZFS arc allocation or UMA
> can jump in and help out, rick

Before you dive into UMA, have you checked whether the ARC size really
shrinks and eviction happens? Considering you mention NFS, I wonder
what your number of open files is. Too many open files might in some
cases restrict ZFS's ability to evict metadata from the ARC.
arc_summary may give some insight into the ARC state.

-- 
Alexander Motin
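
As a rough illustration of the reap trigger described above, here is a
much-simplified sketch (not the actual OpenZFS code; the real
arc_available_memory() checks more conditions than just the free page
count, and freemem/zfs_arc_free_target are kernel globals rather than
parameters):

    #include <sys/param.h>   /* PAGE_SIZE */
    #include <stdbool.h>
    #include <stdint.h>

    /*
     * How far the system is above (positive) or below (negative) the
     * free-page target, expressed in bytes.  Both inputs are page
     * counts, standing in for freemem and vfs.zfs.arc.free_target.
     */
    static int64_t
    available_memory_sketch(uint64_t free_pages, uint64_t target_pages)
    {
            return ((int64_t)PAGE_SIZE *
                ((int64_t)free_pages - (int64_t)target_pages));
    }

    /* A reap (UMA cache drain) is requested only below the target. */
    static bool
    should_reap_sketch(uint64_t free_pages, uint64_t target_pages)
    {
            return (available_memory_sketch(free_pages, target_pages) < 0);
    }

The two page counts can be watched at run time via the
vm.stats.vm.v_free_count and vfs.zfs.arc.free_target sysctls.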
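
On the DRAIN vs. DRAIN_CPU point, a minimal kernel-side sketch of the
difference (illustrative only; "zone" stands for one of the ARC's UMA
zones, and whether the per-CPU flush is ever worthwhile here is exactly
the open question in this thread):

    #include <sys/param.h>
    #include <vm/uma.h>

    /*
     * UMA_RECLAIM_DRAIN releases the zone's bucket cache back to the
     * keg; UMA_RECLAIM_DRAIN_CPU additionally flushes the per-CPU
     * buckets, which requires a more expensive cross-CPU operation.
     */
    static void
    drain_zone_sketch(uma_zone_t zone, int drain_percpu)
    {
            uma_zone_reclaim(zone,
                drain_percpu ? UMA_RECLAIM_DRAIN_CPU : UMA_RECLAIM_DRAIN);
    }

For a first look at which zones hold the wired memory, "vmstat -z"
reports per-zone used and free item counts without any code changes.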