From: Scott Bennett
Date: Fri, 06 Aug 2021 04:33:30 -0500
To: freebsd@oldach.net
Cc: bu7cher@yandex.ru, freebsd-stable@freebsd.org, freebsd-net@freebsd.org, ozkan.kirik@gmail.com, markj@freebsd.org
Subject: Re: Wired Memory Increasing about 500MBytes per day
Message-Id: <202108060933.1769XU6B001550@sdf.org>
List-Id: Production branch of FreeBSD source code
List-Archive: https://lists.freebsd.org/archives/freebsd-stable

On Tue, 3 Aug 2021 17:46:58 +0200 (CEST) freebsd@oldach.net (Helge Oldach) wrote:

>Andrey V. Elsukov wrote on Tue, 03 Aug 2021 16:54:26 +0200 (CEST):
>> 03.08.2021 17:30, Mark Johnston:
>> >>> So if there is some wired page leak, the pgcache zones are probably not
>> >>> directly responsible.
>> >>
>> >> We don't see any leaks, but our monitoring shows that "free" memory
>> >> migrates to "wired" and only these zones are grow.
>> >
>> > How are you measuring this? USED or USED+FREE?
>>
>> AFAIK, monitoring uses sysctl variables:
>>
>> vm.stats.vm.v_page_size
>> vm.stats.vm.v_free_count
>> vm.stats.vm.v_wire_count
>
>The VM system is occasionally double counting pages because of lazy
>dequeuing, see old PR 234559. So these figures are not fully reliable.
>
>Also see PR 256507 which is discussing a memory leak issue that might be
>related.
>

Thank you very much, Helge, for the latter PR number above. These bugs were first complained of on freebsd-stable@ and, I *think*, on freebsd-questions@ within days after the release of 11.2. They have not yet been fixed or even investigated, AFAICT, by the FreeBSD developers.
However, it is very good to see these reports that at least the FreeBSD 12.2 kernels now have some kind of self-repair mechanism that appears to allow the system to recover from the low-free-list condition that results from the bugs. 11.2, 11.3, and 11.4 are all still vulnerable to unrecoverable system work stoppages caused by insufficient free page frames because pages no longer needed by anything that could justify their pagefixing are in those page frames. IOW, pages are being pagefixed and not pagefreed when the need has expired, *or* perhaps there never was a substantial justification to pagefix them in the first place.

One thing I have noticed is that the kernel prioritizes file system cache entries over the needs of executing programs. A low-free-list condition can often be alleviated in 11.4 by unmounting a file system that has been very active in the sense of many different files having been accessed (e.g., large ccache trees, /usr/src, /usr/obj, $WRKDIRPREFIX). I have seen that immediately return more than 2 GB to the free list, which then allowed swapped-out processes to begin to be demand-paged in and eventually returned the system to relatively normal activity. However, requiring manual intervention to alleviate a problem caused by a kernel bug is not suitable in production environments, not to mention that the alleviative method is not documented anywhere.

11.2 and 11.3 both went EOL without having these bugs corrected. 11.4 is soon to go the same way. Methods of minimizing work stoppages and unjustifiable OOM killings of random processes (e.g., xorg, leaving no console access to the machine) have been discovered through painful experience and through reports shared on this list, but the condition occasionally recurs anyway. Tasks that typically are prone to causing the condition include "make buildworld", "make buildkernel", and "portmaster -a", all of which are essential to maintaining a FreeBSD system, especially -STABLE and -CURRENT systems.
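
On the measurement question in the quoted exchange: for anyone who wants to watch the same three counters by hand, the following is a minimal sketch in Python, not a tested monitoring tool. It assumes sysctl(8) prints its usual "name: value" lines. The helper names (parse_sysctl, free_and_wired_bytes, read_counters) are made up for illustration, and the sample numbers are invented rather than taken from any real machine. Per Helge's note about PR 234559, the resulting figures should be treated as approximate.

```python
import subprocess

# Counter names taken from the quoted message above.
COUNTERS = (
    "vm.stats.vm.v_page_size",
    "vm.stats.vm.v_free_count",
    "vm.stats.vm.v_wire_count",
)


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of ints, keeping only COUNTERS."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in COUNTERS:
            values[name.strip()] = int(value.strip())
    return values


def free_and_wired_bytes(values):
    """Convert the free and wired page counts to bytes."""
    page = values["vm.stats.vm.v_page_size"]
    free_bytes = values["vm.stats.vm.v_free_count"] * page
    wired_bytes = values["vm.stats.vm.v_wire_count"] * page
    return free_bytes, wired_bytes


def read_counters():
    """Run sysctl(8) on the local host and return the parsed counters."""
    result = subprocess.run(
        ["sysctl", *COUNTERS], capture_output=True, text=True, check=True
    )
    return parse_sysctl(result.stdout)


if __name__ == "__main__":
    # Invented sample text, used so the sketch runs without a FreeBSD host.
    sample = (
        "vm.stats.vm.v_page_size: 4096\n"
        "vm.stats.vm.v_free_count: 1000\n"
        "vm.stats.vm.v_wire_count: 2500\n"
    )
    free_b, wired_b = free_and_wired_bytes(parse_sysctl(sample))
    print("free bytes:", free_b, "wired bytes:", wired_b)
```

On a FreeBSD machine, calling free_and_wired_bytes(read_counters()) at intervals and logging the two numbers would show whether free memory is migrating to wired over time, which is the pattern described in the quoted exchange.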

It would be very good to see the FreeBSD kernel developers finally take the complaints seriously after several years and now two additional minor releases of 11 and two further major releases of FreeBSD. Until these bugs get fixed, I will continue to maintain that the last production-quality release of FreeBSD was 11.1 or possibly 12.0, which was branched prior to 11.2-RELEASE. That 12.2 can recover is a big improvement, but a release that appears to need a repair method to limit the damage to a system, because the cause of the problem has not been fixed, is not thereby a production-quality system. Forcing use of 11.2 or later or 12.1 or later onto the FreeBSD community by means of dropping support for 11.1 when no newer release had been fixed was an extremely negligent, if not downright contemptuous, move.

                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet: bennett at sdf.org *xor* bennett at freeshell.org        *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************