Date: Sat, 14 May 2022 01:09:30 -0700
From: Mark Millard <marklmi@yahoo.com>
To: Pete Wright <pete@nomadlogic.org>, freebsd-current <freebsd-current@freebsd.org>
Subject: Re: Chasing OOM Issues - good sysctl metrics to use?
Message-ID: <8C14A90D-3429-437C-A815-E811B7BFBF05@yahoo.com>
References: <8C14A90D-3429-437C-A815-E811B7BFBF05.ref@yahoo.com>
Pete Wright <pete@nomadlogic.org> wrote on Fri, 13 May 2022 13:43:11 -0700:

> On 5/11/22 12:52, Mark Millard wrote:
> >
> > Relative to avoiding hang-ups, so far it seems that
> > use of vm.swap_enabled=0 with vm.swap_idle_enabled=0
> > makes hang-ups less likely/less frequent/harder to
> > produce examples of. But it is no guarantee of lack of
> > a hang-up. It does change the cause of the hang-up
> > (in that it avoids processes with kernel stacks swapped
> > out being involved).
>
> thanks for the above analysis Mark. i am going to test these settings
> out now as i'm still seeing the lockup.
>
> this most recent hang-up was using a patch tijl@ asked me to test
> (attached to this email), and the default setting of vm.pageout_oom_seq: 12.

I had also been running various tests for tijl@, with the same sort
of 'removal of the " + 1"' patch. I had found a basic way to tell
whether a fundamental problem was completely avoided, without having
to wait through long periods of activity. But that does not mean the
test is a good simulation of the sequence that leads to issues in
your context. Nor does it indicate how wide a range of activity is
likely to reach the failing conditions.

You could see how vm.pageout_oom_seq=120 does for you with the patch.
I was never patient enough to wait long enough for that to OOM kill
or hang up in my test context.

I've been reporting the likes of:

# sysctl vm.domain.0.stats # done after the fact
vm.domain.0.stats.inactive_pps: 1037
vm.domain.0.stats.free_severe: 15566
vm.domain.0.stats.free_min: 25759
vm.domain.0.stats.free_reserved: 5374
vm.domain.0.stats.free_target: 86914
vm.domain.0.stats.inactive_target: 130371
vm.domain.0.stats.unswppdpgs: 0
vm.domain.0.stats.unswappable: 0
vm.domain.0.stats.laundpdpgs: 858845
vm.domain.0.stats.laundry: 9
vm.domain.0.stats.inactpdpgs: 1040939
vm.domain.0.stats.inactive: 1063
vm.domain.0.stats.actpdpgs: 407937767
vm.domain.0.stats.active: 1032
vm.domain.0.stats.free_count: 3252526
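(If you want the lead-up to a kill rather than just after-the-fact
figures, even a crude logging loop is enough. A minimal sketch
follows; the 10 s interval and the log path are just my picks, and
it wants to run as root so the file under /var/log is writable:

#!/bin/sh
# Periodically record the per-domain VM stats and swap usage so
# the figures from just before an OOM kill or hang-up are on disk.
while true
do
    date
    sysctl vm.domain.0.stats
    swapinfo -k
    sleep 10
done >> /var/log/vm-oom-trace.log

Timestamped samples like that make it easy to line the VM figures
up against the kernel's OOM messages afterwards.)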
But I also have a (patched) kernel that reports the figures just
before the call that is to cause an OOM kill, ending up with output
like:

vm_pageout_mightbe_oom: kill context: v_free_count: 15306, v_inactive_count: 1, v_laundry_count: 64, v_active_count: 3891599
May 11 00:44:11 CA72_Mbin_ZFS kernel: pid 844 (stress), jid 0, uid 0, was killed: failed to reclaim memory

(I was testing main [so: 14].) So I report that as well.

Since I was using stress as part of my test context, there were also
lines like:

stress: FAIL: [843] (415) <-- worker 844 got signal 9
stress: WARN: [843] (417) now reaping child worker processes
stress: FAIL: [843] (451) failed run completed in 119s

(tijl@ had me add v_laundry_count and v_active_count to what I've
carried forward since back in 2018, when Mark J. provided the
original extra message.)
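(For anyone wanting to reproduce that sort of output: the stress
runs were of the general shape below. This is a sketch, not my exact
command line; the worker count and per-worker allocation size are
hypothetical and want tailoring so the total comfortably exceeds the
machine's RAM:

# Dirty more anonymous memory than the machine has RAM, and keep
# it dirty, to drive the system into the low-free-memory paths.
stress --vm 2 --vm-bytes 12G --vm-keep

The FAIL/WARN lines quoted above are what stress prints when one of
its workers is killed by the kernel's OOM handling.)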
It turns out the kernel debugger (db> prompt) can report the same
general sort of figures:

db> show page
vm_cnt.v_free_count: 15577
vm_cnt.v_inactive_count: 1
vm_cnt.v_active_count: 3788852
vm_cnt.v_laundry_count: 0
vm_cnt.v_wire_count: 272395
vm_cnt.v_free_reserved: 5374
vm_cnt.v_free_min: 25759
vm_cnt.v_free_target: 86914
vm_cnt.v_inactive_target: 130371

db> show pageq
pq_free 15577
dom 0 page_cnt 4077116 free 15577 pq_act 3788852 pq_inact 1 pq_laund 0 pq_unsw 0

(Note: pq_unsw is a non-swappable count that excludes the wired
count, apparently matching vm.domain.0.stats.unswappable .)

The above is the smallest pq_inact+pq_laund total that I saw at OOM
kill time or during a "hang-up". (What I saw across example
"hang-ups" suggests to me a livelock context, not a deadlock
context.)

> interestingly enough with the patch applied i observed a smaller
> amount of memory used for laundry as well as less swap space used
> until right before the crash.

If your logging of values has been made public, I've not (yet?)
looked at it.

None of my testing reached a stage of having much swap space in use.
But the test is biased toward producing the problems quickly, rather
than exploring a range of ways to reach the conditions that show the
problem.

I've stopped testing for now and am doing a round of OS building and
upgrading, port (re-)building and installing, and the like, mostly
for aarch64 but also for armv7 and amd64. (This is without the
'remove " + 1"' patch.) One of the points is to see if I get any
evidence of vm.swap_enabled=0 with vm.swap_idle_enabled=0 ending up
contributing to any problems in my normal usage. So far: no.
vm.pageout_oom_seq=120 is also in use for this; it has been part of
my normal context since sometime in 2018.
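(In case it is useful: all three of those are normal read/write
sysctls, so a sketch of making the combination persistent, using
the values discussed in this thread, is just:

# /etc/sysctl.conf additions
vm.pageout_oom_seq=120   # many more pageout passes before OOM kills start
vm.swap_enabled=0        # avoid swapping out kernel stacks of processes
vm.swap_idle_enabled=0

The same values can be applied to a live system without a reboot,
e.g. "sysctl vm.pageout_oom_seq=120".)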
===
Mark Millard
marklmi at yahoo.com