Date:      Wed, 07 Aug 2024 15:20:25 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 280671] Memory leak on FreeBSD 13.3 and 14.1
Message-ID:  <bug-280671-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=280671

            Bug ID: 280671
           Summary: Memory leak on FreeBSD 13.3 and 14.1
           Product: Base System
           Version: 14.1-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: sre@truespeed.com

Created attachment 252589
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=252589&action=edit
chart of memory usage

Good afternoon

We recently upgraded the operating system for one of our servers from FreeBSD
12.4 to 13.3.
This server uses the GENERIC kernel, a mirrored ZFS zpool, and has a few jails
on FreeBSD 12.0 that are running standard applications (Java-based web
services, PostgreSQL, RabbitMQ).
The server has 96GB of RAM and was not experiencing memory shortages prior to
upgrading.

We followed the standard upgrade process as described here:
https://docs.freebsd.org/en/books/handbook/cutting-edge/#freebsdupdate-upgrade
We then upgraded our packages and our zpools. We have not upgraded our jails.

After upgrading, we began to experience what seemed like a memory leak on the
server. Over time, Inactive memory would grow and then move gigabytes at a
time into Laundry, which was never cleaned, until the server eventually ran
out of free memory and began thrashing. At this point we lose access to the
server, and the services it is running become unresponsive. We resolve this by
power cycling the server, and it returns to normal use on reboot.

We are currently rebooting the server every few days before it enters the
thrashing state, but this is not a feasible long-term solution, and we believe
there is a memory leak that is causing this situation.

As part of debugging the memory issue, we have tried to recover memory by
turning off existing jails (shown on the chart as a large dip in memory usage
around 1am), but this memory is rapidly consumed again. Also, when the server
was close to entering the thrashing state, we turned off every jail and
service (except for a few critical ones, i.e. sshd) to see how much memory
was being “lost”. It was about 39GB, with 9GB used by ARC.
Mem: 63M Active, 8656M Inact, 18G Laundry, 17G Wired, 50G Free
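
For anyone reading the figures above, the following is a minimal Python
sketch that converts the Mem: line into MiB and adds up the categories. The
helper name parse_mem_line is hypothetical, the K/M/G suffixes are assumed to
be binary units, and the values are rounded as reported, so the totals are
only approximate.

```python
import re

# Assumption: suffixes are binary units (K = KiB, M = MiB, G = GiB).
UNIT_MIB = {"K": 1 / 1024, "M": 1, "G": 1024}


def parse_mem_line(line):
    """Return {category: MiB} from a 'Mem:' summary line (hypothetical helper)."""
    fields = re.findall(r"(\d+)([KMG]) (\w+)", line)
    return {name: int(value) * UNIT_MIB[unit] for value, unit, name in fields}


mem = parse_mem_line(
    "Mem: 63M Active, 8656M Inact, 18G Laundry, 17G Wired, 50G Free"
)
total_mib = sum(mem.values())          # everything the line accounts for
not_free_mib = total_mib - mem["Free"]  # Active + Inact + Laundry + Wired

print(f"accounted for: {total_mib / 1024:.1f} GiB")
print(f"not free:      {not_free_mib / 1024:.1f} GiB")
```

The same function can be applied to successive samples to see which category
is growing between reboots.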

We have also limited ARC usage with the vfs.zfs.arc_max sysctl, but that
hasn’t made any meaningful difference.
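
One way to confirm that the ARC limit is in effect is to compare the cap with
the current ARC size. The following is a minimal Python sketch, assuming the
vfs.zfs.arc_max and kstat.zfs.misc.arcstats.size sysctls are readable on the
host; the function names read_sysctl and arc_status are hypothetical, and a
value of 0 for arc_max is treated as "no explicit cap".

```python
import subprocess


def read_sysctl(name):
    """Read one numeric sysctl value using `sysctl -n` (FreeBSD)."""
    out = subprocess.run(
        ["sysctl", "-n", name], capture_output=True, text=True, check=True
    )
    return int(out.stdout.strip())


def arc_status(reader=read_sysctl):
    """Compare the current ARC size with the configured cap (hypothetical helper)."""
    arc_max = reader("vfs.zfs.arc_max")                # bytes; 0 = no explicit cap
    arc_size = reader("kstat.zfs.misc.arcstats.size")  # bytes currently held by ARC
    capped = arc_max > 0
    return {
        "arc_max": arc_max,
        "arc_size": arc_size,
        "capped": capped,
        "within_cap": (not capped) or arc_size <= arc_max,
    }
```

Calling arc_status() on the affected host reads the live values. The reader
argument exists only so the logic can be exercised with canned numbers on
another machine.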

There is nothing else standing out on the server: no unusual CPU utilisation,
no unusual network traffic, all the crons are as before the upgrades, and we
haven’t deployed any additional jails or services to the server.

We then upgraded from 13.3 to 14.1, as there was a ZFS memory leak erratum in
the 14.1 release notes:
https://www.freebsd.org/security/advisories/FreeBSD-EN-24:10.zfs.asc, but that
hasn’t resolved our issue.
The charts show the memory usage pattern we are dealing with. This data is
pulled from sysctl by the node_exporter for Prometheus.

The attached chart_1 shows the memory usage of the server over the last week.

The attached chart_2 shows the final hour before the server begins thrashing.
Laundry grows to 62GB and Inactive and Free memory are both reduced to <1GB.

The available swap is 4GB, but it does not appear to be used heavily enough
to justify increasing the swap space. We have also temporarily disabled swap
entirely, and that hasn’t made any difference.

If you require any additional information, please let us know.

Kind regards,
Truespeed

--
You are receiving this mail because:
You are the assignee for the bug.


