From: Rick Macklem <rick.macklem@gmail.com>
Date: Wed, 22 Oct 2025 08:21:42 -0700
Subject: Re: RFC: How ZFS handles arc memory use
To: Alexander Motin
Cc: FreeBSD CURRENT, Garrett Wollman, Peter Eriksson
List-Archive: https://lists.freebsd.org/archives/freebsd-current
On Wed, Oct 22, 2025 at 8:05 AM Alexander Motin wrote:
>
> Hi Rick,
>
> On 22.10.2025 10:34, Rick Macklem wrote:
> > A couple of people have reported problems with NFS servers,
> > where essentially all of the system's memory gets exhausted.
> > They see the problem on 14.n FreeBSD servers (which use the
> > newer ZFS code) but not on 13.n servers.
> >
> > I am trying to learn how ZFS handles ARC memory use, to try
> > to figure out what can be done about this problem.
> >
> > I know nothing about ZFS internals or UMA(9) internals,
> > so I could be way off, but here is what I think is happening.
> > (Please correct me on this.)
> >
> > The L1ARC uses uma_zalloc_arg()/uma_zfree_arg() to allocate
> > the ARC memory. The zones are created using uma_zcreate(),
> > so they are regular zones. This means the pages come
> > from a slab in a keg, and those pages are wired.
> >
> > The only time ZFS reduces the size of the slab/keg is when
> > it calls uma_zone_reclaim(.., UMA_RECLAIM_DRAIN), which is
> > called by arc_reap_cb(), triggered by arc_reap_cb_check().
> >
> > arc_reap_cb_check() uses arc_available_memory() and triggers
> > arc_reap_cb() when arc_available_memory() returns a negative
> > value.
> >
> > arc_available_memory() returns a negative value when
> > zfs_arc_free_target (vfs.zfs.arc.free_target) is greater than freemem.
> > (By default, zfs_arc_free_target is set to vm_cnt.v_free_target.)
> >
> > Does all of the above sound about right?
>
> There are two mechanisms to reduce ARC size: either from the ZFS side,
> in the way you described, or from the kernel side, when it calls the
> ZFS low-memory handler arc_lowmem(). It feels somewhat like overkill,
> but it came this way from Solaris.
>
> Once the ARC size is reduced and evictions into the UMA caches have
> happened, it is up to UMA how to drain its caches. ZFS might trigger
> that itself, or it can be done by the kernel, or, a few years back, I
> added a mechanism for UMA caches to slowly shrink by themselves even
> without pressure.
>
> > This leads me to...
> > - zfs_arc_free_target (vfs.zfs.arc.free_target) needs to be larger
>
> There is a very delicate balance between ZFS and the kernel
> (zfs_arc_free_target = vm_cnt.v_free_target). An imbalance there makes
> one of them suffer.
>
> > or
> > - Most of the wired pages in the slab are per-CPU, so
> >   uma_zone_reclaim() needs to use UMA_RECLAIM_DRAIN_CPU
> >   on some systems. (Not the small test systems I have, where I
> >   cannot reproduce the problem.)
>
> Per-CPU caches should be relatively small, IIRC in the dozens or
> hundreds of allocations per CPU. Draining them is expensive and should
> rarely be needed, unless you have too little RAM for the number of
> CPUs you have.
>
> > or
> > - uma_zone_reclaim() needs to be called under other
> >   circumstances.
> > or
> > - ???
> >
> > How can you tell if a keg/slab is per-CPU?
> > (For my simple test system, I only see "UMA Slabs 0:" and
> > "UMA Slabs 1:". It looks like "UMA Slabs 0:" is being used for
> > ZFS ARC allocation on this simple test system.)
> >
> > Hopefully folk who understand ZFS ARC allocation or UMA
> > can jump in and help out, rick
>
> Before you dive into UMA, have you checked whether the ARC size really
> shrinks and eviction happens? Considering you mention NFS, I wonder
> what your number of open files is. Too many open files might in some
> cases restrict ZFS's ability to evict metadata from the ARC.
> arc_summary may give some insights into the ARC state.
I don't know if this helps, but the original post is here:
https://lists.freebsd.org/archives/freebsd-stable/2025-September/003126.html
Then you'll find the email thread that follows it here:
https://lists.freebsd.org/archives/freebsd-stable/2025-September/003145.html

Hopefully Garrett can respond with more information, rick

> --
> Alexander Motin