Date:      Mon, 16 Mar 2026 15:08:44 -0600
From:      Alan Somers <asomers@freebsd.org>
To:        Garrett Wollman <wollman@bimajority.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: ZFS deadlocks/memory accounting issues
Message-ID:  <CAOtMX2gpNQH9hpCnRP+m5kBJDMf5O3MSHgzPVjBKEObyL8bjdw@mail.gmail.com>
In-Reply-To: <27064.27391.224476.910636@hergotha.csail.mit.edu>


On Mon, Mar 16, 2026 at 2:41 PM Garrett Wollman <wollman@bimajority.org> wrote:
>
> Since we upgraded to 14.3 last summer, we have been experiencing
> numerous memory accounting issues on our NFS servers.  These manifest
> as a server *desperate* to free up memory despite having multiple
> gigabytes of physical RAM available.  (Some of these machines have 1
> TiB of RAM, with more than 64 GiB free, and were swapping and invoking
> the OOM-killer.)
>
> I had a server deadlock just now after only three days of uptime with
> 32 GiB of free memory.  Prior to the crash, about 70 GiB (of 128) was
> used by the ARC, of which some 60 GiB was accounted for as
> "evictable", and the load was pretty modest.
>
> In DDB on the console, I noted:
>
>   pid  ppid  pgrp   uid  state   wmesg   wchan               cmd
> 60673 60672  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60672     1  3008     0  S       wait    0xfffffe031ee41560  nrpe
> 60670  1186 60670     0  Ds      db->db_ 0xfffff8173309f1e8  sshd-session
> 60669  1202  1202     0  D       voffloc 0xfffff8024db4966a  perl
> 60668 60667  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60667     1  3008     0  S       wait    0xfffffe031ee41000  nrpe
> 60665 60664  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60664     1  3008     0  S       wait    0xfffffe031723a5c0  nrpe
> 60662 60661  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60661     1  3008     0  S       wait    0xfffffe03172395a0  nrpe
> 60659 60658  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60658     1  3008     0  S       wait    0xfffffe0317239040  nrpe
> 60656 60655  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60655     1  3008     0  S       wait    0xfffffe0317238ae0  nrpe
> 60653 60652  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60652     1  3008     0  S       wait    0xfffffe0317238580  nrpe
> 60650 60649  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60649     1  3008     0  S       wait    0xfffffe0317238020  nrpe
> 60647 60646  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60646     1  3008     0  S       wait    0xfffffe0317237ac0  nrpe
> 60644 60643  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60643     1  3008     0  S       wait    0xfffffe0317237000  nrpe
> 60641 60640  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60640     1  3008     0  S       wait    0xfffffe00d3cfa040  nrpe
> 60638  1202  1202     0  D       voffloc 0xfffff8024db4966a  perl
> 60637  1186 60637     0  Ds      db->db_ 0xfffff8173309f1e8  sshd-session
> 60636 60635  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60635     1  3008     0  S       wait    0xfffffe00d3cf9ae0  nrpe
> 60633 60632  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60632     1  3008     0  S       wait    0xfffffe00d3cf9580  nrpe
> 60630 60629  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60629     1  3008     0  S       wait    0xfffffe00d3cf9020  nrpe
> 60627 60626  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60626     1  3008     0  S       wait    0xfffffe00d3cf8560  nrpe
> 60624 60623  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60623     1  3008     0  S       wait    0xfffffe00d3cf8000  nrpe
> 60621 60620  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60620     1  3008     0  S       wait    0xfffffe0317188060  nrpe
> 60618 60617  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60617     1  3008     0  S       wait    0xfffffe0317187b00  nrpe
> 60615 60614  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60614     1  3008     0  S       wait    0xfffffe03171875a0  nrpe
> 60612 60611  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60611     1  3008     0  S       wait    0xfffffe0317186ae0  nrpe
> 60609 60608  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60608     1  3008     0  S       wait    0xfffffe0317186580  nrpe
> 60606  1186 60606     0  Ds      db->db_ 0xfffff8173309f1e8  sshd-session
> 60605 60604  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60604     1  3008     0  S       wait    0xfffffe0317186020  nrpe
> 60602 60601  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60601     1  3008     0  S       wait    0xfffffe0317185ac0  nrpe
> 60599  1202  1202     0  D       voffloc 0xfffff8024db4966a  perl
> 60598 60597  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60597     1  3008     0  S       wait    0xfffffe0317185560  nrpe
> 60595 60594  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60594     1  3008     0  S       wait    0xfffffe0317185000  nrpe
> 60592 60591  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60591     1  3008     0  S       wait    0xfffffe031724c5c0  nrpe
> 60589 60588  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60588     1  3008     0  S       wait    0xfffffe031724c060  nrpe
> 60586 60585  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60585     1  3008     0  S       wait    0xfffffe031724b5a0  nrpe
> 60583 60582  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60582     1  3008     0  S       wait    0xfffffe031724a580  nrpe
> 60580 60579  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60579     1  3008     0  S       wait    0xfffffe031724a020  nrpe
> 60577  1186 60577     0  Ds      aw.aew_ 0xfffffe0326e5a608  sshd-session
> 60576 60575  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60575     1  3008     0  S       wait    0xfffffe0317249560  nrpe
> 60573  1202  1202     0  D       aw.aew_ 0xfffffe0326df6478  perl
> 60572 60571  3008     0  D       db->db_ 0xfffff8058173af68  nrpe
> 60571     1  3008     0  S       wait    0xfffffe0317249000  nrpe
>  5015  5010  5015  6263  Ss+     ttyin   0xfffff810aa50a8b0  zsh
>  5010  5006  5006  6263  S       select  0xfffff8024ca966c0  sshd-session
>  5006  1186  5006     0  Ss      select  0xfffff8024ca984c0  sshd-session
>  3008     1  3008     0  Ss      select  0xfffff80209dc98c0  nrpe
>  2910     1  2910     0  Ds+     aw.aew_ 0xfffffe03274d66e8  getty
>
> This getty is the one running on the console tty, which was stuck.
> Note the wait channel is "aw.aew_cv", which is part of the logic for
> evicting buffers from the ARC.  Other threads are waiting for a
> dbuf (ZFS disk buffer) object mutex.
>
> I'm currently planning on taking us to 14.4 later this spring, but it
> would be nice to know if anyone else has seen this bug or has a fix.
> I've tried dropping kern.maxvnodes and increasing
> vfs.zfs.arc_free_target, with no change in symptoms.
>
> This particular server is due to be replaced but the new disk array
> (which was ordered in January) won't ship until late April per the
> vendor.
>
> -GAWollman

I once saw a similar bug.  In my case a process had mmap()ed some
very large files on fusefs, consuming lots of inactive pages.  When
the system came under memory pressure, it asked the ARC to evict
first, so the ARC would end up shrinking down to arc_min every time.
The solution was to set vfs.fusefs.data_cache_mode=0.  I suspect that
similar bugs are possible with UFS or tmpfs if they have giant files
that are mmap()ed.
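
For reference, that workaround is just a runtime sysctl; a minimal
sketch, assuming you can tolerate losing data caching on the fuse
mounts:

  # Disable fusefs data caching so mmap()ed fuse files stop pinning
  # inactive pages (costs read performance on fuse filesystems):
  sysctl vfs.fusefs.data_cache_mode=0
  # Add the same line to /etc/sysctl.conf to make it persistent.

That only helps if fusefs is actually in the picture, of course.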

A less effective workaround was to set vfs.zfs.arc.min to some
reasonable value; that can keep the ARC from shrinking too far.  You
could try that.
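
As an illustration only (the value is in bytes, 16 GiB here is an
arbitrary floor, and I believe it can be set at runtime on 14.x as
well as from loader.conf):

  # Floor the ARC at 16 GiB so memory pressure can't collapse it:
  sysctl vfs.zfs.arc.min=17179869184
  # Or set it in /boot/loader.conf to apply at boot:
  #   vfs.zfs.arc.min="17179869184"
  # Check whether the ARC has collapsed to its floor:
  sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.c_min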

Another thing you could try is to run "vmstat -o" when the system is
in the problematic state.  That will show you which vm objects are
using the most inactive pages.
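
On my systems the INACT column is the third field, so something like
this shows the worst offenders (check the header first in case the
layout differs on yours):

  # Top 10 vm objects by inactive pages (INACT assumed to be column 3):
  vmstat -o | sort -rnk3 | head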

Hope this helps,
-Alan

