Date:      Fri, 06 Aug 2021 04:33:30 -0500
From:      Scott Bennett <bennett@sdf.org>
To:        freebsd@oldach.net
Cc:        bu7cher@yandex.ru, freebsd-stable@freebsd.org, freebsd-net@freebsd.org, ozkan.kirik@gmail.com, markj@freebsd.org
Subject:   Re: Wired Memory Increasing about 500MBytes per day
Message-ID:  <202108060933.1769XU6B001550@sdf.org>

     On Tue, 3 Aug 2021 17:46:58 +0200 (CEST) freebsd@oldach.net
(Helge Oldach) wrote:
>Andrey V. Elsukov wrote on Tue, 03 Aug 2021 16:54:26 +0200 (CEST):
>> 03.08.2021 17:30, Mark Johnston wrote:
>> >>> So if there is some wired page leak, the pgcache zones are probably not
>> >>> directly responsible.
>> >>
>> >> We don't see any leaks, but our monitoring shows that "free" memory
>> >> migrates to "wired" and only these zones grow.
>> > 
>> > How are you measuring this?  USED or USED+FREE?
>> 
>> AFAIK, monitoring uses sysctl variables:
>> 
>> vm.stats.vm.v_page_size
>> vm.stats.vm.v_free_count
>> vm.stats.vm.v_wire_count
>
>The VM system is occasionally double counting pages because of lazy
>dequeuing, see old PR 234559. So these figures are not fully reliable.
>
>Also see PR 256507 which is discussing a memory leak issue that might be
>related.
>
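For reference, the three sysctl variables quoted above are page counts plus a page size, so a monitoring script has to multiply them out to get bytes. The following is a minimal sketch in Python, assuming the counters are collected as "name: value" lines in the form sysctl prints them. The function names and the sample numbers are hypothetical and are not taken from anyone's actual monitoring setup. Because of the lazy-dequeuing double counting mentioned above, the results should be read as estimates.

```python
def parse_sysctl(text):
    """Parse 'name: value' lines (the format sysctl prints) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value.strip())
    return counters


def memory_bytes(counters):
    """Convert the free and wired page counts to bytes."""
    page_size = counters["vm.stats.vm.v_page_size"]
    return {
        "free": counters["vm.stats.vm.v_free_count"] * page_size,
        "wired": counters["vm.stats.vm.v_wire_count"] * page_size,
    }


def wired_growth_per_day(earlier, later, seconds_between):
    """Average growth of wired memory, in bytes per day, between two samples."""
    delta = memory_bytes(later)["wired"] - memory_bytes(earlier)["wired"]
    return delta * 86400 / seconds_between


# Hypothetical sample values, for illustration only.
SAMPLE = """\
vm.stats.vm.v_page_size: 4096
vm.stats.vm.v_free_count: 250000
vm.stats.vm.v_wire_count: 500000
"""

if __name__ == "__main__":
    print(memory_bytes(parse_sysctl(SAMPLE)))
```

Feeding two such samples taken a known interval apart to wired_growth_per_day() gives a figure that can be compared with the roughly 500 MB per day in the subject line.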
     Thank you very much, Helge, for the latter PR number above.  These bugs
were first reported on freebsd-stable@ and, I *think*, on
freebsd-questions@ within days of the release of 11.2.  They have not yet
been fixed or even investigated, AFAICT, by the FreeBSD developers.  However,
it is very good to see these reports that at least the FreeBSD 12.2 kernels
now have some kind of self-repair mechanism that appears to allow the system
to recover from the low-free-list condition that results from the bugs.
     11.2, 11.3, and 11.4 are all still vulnerable to unrecoverable system
work stoppages caused by a shortage of free page frames, because those page
frames are still occupied by pagefixed (wired) pages that nothing needs any
longer.  IOW, pages are being pagefixed and not pagefreed when the need has
expired, *or* perhaps there never was a substantial justification to pagefix
them in the first place.  One thing I have noticed is that the kernel
prioritizes file system cache entries over the needs of executing programs.
A low-free-list condition can often be alleviated in 11.4 by unmounting a
file system that has been very active, i.e., one on which many different
files have been accessed (e.g., large ccache trees, /usr/src, /usr/obj,
$WRKDIRPREFIX).  I have seen that immediately return more than 2 GB to the
free list, which then allowed swapped-out processes to begin to be
demand-paged in and eventually returned the system to relatively normal
activity.
However, requiring manual intervention to alleviate a problem caused by a
kernel bug is not suitable in production environments, not to mention that
the alleviative method is not documented anywhere.
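The manual workaround described above can be sketched as a small script that records the free page count, unmounts a file system, and reports how much memory came back. This is only an illustration, assuming it is run as root on a FreeBSD system, that the file system is not busy, and that sysctl and umount are on the PATH. The function names and the run parameter are hypothetical, and nothing here is a documented or recommended procedure.

```python
import subprocess


def run_command(argv):
    """Run a command and return its standard output as text."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def free_bytes(run=run_command):
    """Read the free page count and page size via sysctl and return bytes."""
    pages = int(run(["sysctl", "-n", "vm.stats.vm.v_free_count"]))
    page_size = int(run(["sysctl", "-n", "vm.stats.vm.v_page_size"]))
    return pages * page_size


def unmount_and_measure(mount_point, run=run_command):
    """Unmount mount_point and return the change in free memory, in bytes."""
    before = free_bytes(run)
    run(["umount", mount_point])
    after = free_bytes(run)
    return after - before
```

For example, unmount_and_measure("/usr/obj") would return the number of bytes by which the free list grew across the unmount. The run parameter exists only so the logic can be exercised with a stand-in instead of the real commands.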
     11.2 and 11.3 both went EOL without having these bugs corrected.  11.4
is soon to go the same way.  Methods of minimizing work stoppages and
unjustifiable OOM killings of random processes (e.g., xorg, leaving no console
access to the machine) have been discovered through painful experience and
reports shared on this list, but the condition occasionally recurs anyway.
Tasks that are typically prone to causing the condition include
"make buildworld", "make buildkernel", and "portmaster -a", all of which are
essential to maintaining a FreeBSD system, especially -STABLE and -CURRENT
systems.  It would
be very good to see the FreeBSD kernel developers finally take the complaints
seriously after several years and now two additional minor releases of 11 and
two further major releases of FreeBSD.
     Until these bugs get fixed, I will continue to maintain that the last
production-quality release of FreeBSD was 11.1 or possibly 12.0, which was
branched prior to 11.2-RELEASE.  That 12.2 can recover is a big improvement,
but a kernel that needs a repair mechanism to limit the damage, because the
underlying cause has not been fixed, is still not a production-quality
system.  Forcing the use of 11.2 or later, or 12.1 or later, onto
the FreeBSD community by dropping support for 11.1 when no newer
release had been fixed was an extremely negligent, if not downright
contemptuous, move.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************


