From: "Jonathan Anderson"
To: "Mark Johnston"
Cc: alc@freebsd.org, "freebsd-current Current"
Subject: Re: PQ_LAUNDRY: unexpected behaviour
Date: Fri, 06 Jan 2017 11:33:53 -0500
Message-ID: <4B9D073A-2F1D-468B-B042-B035734C34FB@FreeBSD.org>
In-Reply-To: <20170102183352.GA46812@wkstn-mjohnston.west.isilon.com>
References: <20170102183352.GA46812@wkstn-mjohnston.west.isilon.com>
List-Id: Discussions about the use of FreeBSD-current

On 2 Jan 2017, at 13:33, Mark Johnston wrote:

> On Mon, Jan 02, 2017 at 10:31:50AM -0330, Jonathan Anderson wrote:
>> Hi all,
>>
>> I'm seeing some unexpected PQ_LAUNDRY behaviour on something fairly close
>> to -CURRENT (drm-next-4.7 with an IFC on 26 Dec).
>> Aside from the use of not-quite-CURRENT, it's also very possible that I
>> don't understand how the laundry queue is supposed to work. Nonetheless,
>> I thought I'd check whether there is a tunable I should change, an issue
>> with the laundry queue itself, etc.
>
> My suspicion is that this is a memory leak of some sort and unrelated to
> PQ_LAUNDRY itself. That is, with the previous policy you would see lots
> of swap usage and a large inactive queue instead.

That sounds very plausible... I'm testing with the new DRM drivers to
see if that helps.

>> After running X overnight (i915 can now run overnight on drm-next-4.7!),
>> I end up with a little over half of my system memory in the laundry
>> queue and a bunch of swap utilization. Even after closing X and shutting
>> down lots of services, I see the following in top:
>>
>> ```
>> Mem: 977M Active, 31M Inact, 4722M Laundry, 1917M Wired, 165M Free
>> ARC: 697M Total, 67M MFU, 278M MRU, 27K Anon, 22M Header, 331M Other
>> Swap: 4096M Total, 2037M Used, 2059M Free, 49% Inuse
>>
>>   PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
>>   911 root       1  52    0 57788K  4308K select 1   0:00  0.00% sshd
>>   974 root       1  20    0 43780K     0K wait   2   0:00  0.00%
>>  1406 jon        1  20    0 33520K  2748K select 0   0:04  0.00% gpg-agent
>>  2038 jon        1  20    0 31280K  5452K ttyin  3   0:18  0.00% zsh
>>  1251 jon        1  22    0 31280K  4500K pause  3   0:02  1.46% zsh
>>  7102 jon        1  20    0 31280K  3744K ttyin  0   0:00  0.00% zsh
>>  1898 jon        1  20    0 31280K  3036K ttyin  1   0:00  0.00% zsh
>>  1627 jon        1  21    0 31280K     0K pause  0   0:00  0.00%
>> 22989 jon        1  20    0 31152K  6020K ttyin  1   0:01  0.00% zsh
>> 22495 jon        1  49    0 31152K  6016K ttyin  0   0:02  0.00% zsh
>>  1621 jon        1  20    0 28196K  8816K select 2   0:40  0.00% tmux
>>  6214 jon        1  52    0 27008K  2872K ttyin  1   0:00  0.00% zsh
>>  6969 jon        1  52    0 27008K  2872K ttyin  3   0:00  0.00% zsh
>>  6609 root       1  20    0 20688K  4604K select 1   0:00  0.00% wpa_supplicant
>>   914 root       1  20    0 20664K  5232K select 2   0:02  0.00% sendmail
>>   917 smmsp      1  20    0 20664K     0K pause  0   0:00  0.00%
>> 24206 jon        1  23    0 20168K  3500K CPU0   0   0:00  0.00% top
>>   921 root       1  20    0 12616K   608K nanslp 1   0:00  0.00% cron
>> ```
>>
>> Are there any things I could do (e.g., sysctls, tunables) to figure out
>> what's happening? Can I manually force the laundry to be done?
>> `swapoff -a` fails due to a lack of memory.
>
> Is that the full list of processes? Does "ipcs -m" show any named shm
> segments?
>
> Looking at the DRM code, the GEM uses swap objects to back allocations
> by the drivers, so this could be the result of a kernel page leak in the
> drm-next branch. If so, you'll need a reboot to recover.

That was the full list of processes, yes. I haven't been able to
reproduce this particular issue on new DRM code, as I'm now confronted
with a different issue. :) If I do get back to this condition, I'll try
checking `ipcs -m`, thanks.


Jon
--
jonathan@FreeBSD.org