Date: Mon, 27 Jan 2020 11:07:09 -0800
From: bob prohaska
To: freebsd-arm@freebsd.org
Subject: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147
Message-ID: <20200127190709.GA11328@www.zefox.net>

The latest attempt at buildworld on a Pi3 with kernel and sources at
r357147 stopped with an "out of swap" kill. The activity log recorded
the following in the one-second samples before, during and after the kill:

procs    memory      page                   disks     faults        cpu
r b w     avm    fre  flt re pi po   fr  sr mm0 da0    in   sy   cs us sy id
4 0 0 1670108  58012 1332  5  2  1 1414 597   0   0 10572 1251 2340 80 18  2
dT: 1.015s  w: 1.000s
L(q)  ops/s   r/s  kBps   ms/r   w/s  kBps   ms/w   d/s  kBps  ms/d  %busy Name
Mon Jan 27 10:32:06 PST 2020
Device          1K-blocks     Used    Avail Capacity
/dev/mmcsd0s2b    4404252   135540  4268712     3%
/dev/da0p6        5242880   132148  5110732     3%
Total             9647132   267688  9379444     3%
Jan 26 19:44:00 www sshd[1289]: error: maximum authentication attempts exceeded for invalid user from 45.136.108.85 port 16543 ssh2 [preauth]
Jan 26 19:44:12 www sshd[1298]: error: maximum authentication attempts exceeded for invalid user from 45.136.108.85 port 4581 ssh2 [preauth]
0/254/254/19180 mbuf clusters in use (current/cache/total/max)

procs    memory      page                   disks     faults        cpu
r b w     avm    fre  flt re pi po   fr  sr mm0 da0    in   sy   cs us sy id
3 0 0 1683560  54412 1332  5  2  1 1414 603   0   0 10572 1251 2340 80 18  2
dT: 1.051s  w: 1.000s
L(q)  ops/s   r/s  kBps   ms/r   w/s  kBps   ms/w   d/s  kBps  ms/d  %busy Name
   0     79     3    11  917.7    76  1557  212.6     0     0   0.0  318.6 mmcsd0
   0     79     3    11  918.0    76  1557  212.8     0     0   0.0  318.6 mmcsd0s2
   0     52     1     4    1.4    51   780    1.4     0     0   0.0    6.9 da0
   0     22     0     0    0.0    22   700  177.2     0     0   0.0   32.5 mmcsd0s2a
   0     57     3    11  918.0    54   856  227.2     0     0   0.0  318.6 mmcsd0s2b
   0     22     0     0    0.0    22   700  177.3     0     0   0.0   32.5 ufs/rootfs
   0     51     1     4    1.4    50   780    1.4     0     0   0.0    7.1 da0p6
Mon Jan 27 10:32:12 PST 2020
Device          1K-blocks     Used    Avail Capacity
/dev/mmcsd0s2b    4404252   137184  4267068     3%
/dev/da0p6        5242880   134284  5108596     3%
Total             9647132   271468  9375664     3%
Jan 26 19:44:00 www sshd[1289]: error: maximum authentication attempts exceeded for invalid user from 45.136.108.85 port 16543 ssh2 [preauth]
Jan 26 19:44:12 www sshd[1298]: error: maximum authentication attempts exceeded for invalid user from 45.136.108.85 port 4581 ssh2 [preauth]
0/256/256/19180 mbuf clusters in use (current/cache/total/max)

procs    memory      page                   disks     faults        cpu
r b w     avm    fre  flt re pi po   fr  sr mm0 da0    in   sy   cs us sy id
3 0 0 1394192 142844 1332  5  2  1 1415 607   0   0 10573 1251 2341 80 18  2
dT: 1.006s  w: 1.000s
L(q)  ops/s   r/s  kBps   ms/r   w/s  kBps   ms/w   d/s  kBps  ms/d  %busy Name
   0     25    25   557    1.9     0     0    0.0     0     0   0.0    4.6 mmcsd0
   0     25    25   557    1.9     0     0    0.0     0     0   0.0    4.8 mmcsd0s2
   0     17    17   159    1.3     0     0    0.0     0     0   0.0    2.2 da0
   0      8     8   231    2.4     0     0    0.0     0     0   0.0    1.9 mmcsd0s2a
   0     17    17   326    1.7     0     0    0.0     0     0   0.0    2.9 mmcsd0s2b
   0      8     8   231    2.4     0     0    0.0     0     0   0.0    1.9 ufs/rootfs
   0     17    17   159    1.3     0     0    0.0     0     0   0.0    2.3 da0p6
Mon Jan 27 10:32:21 PST 2020
Device          1K-blocks     Used    Avail Capacity
/dev/mmcsd0s2b    4404252    43020  4361232     1%
/dev/da0p6        5242880    42128  5200752     1%
Total             9647132    85148  9561984     1%
Jan 26 19:44:12 www sshd[1298]: error: maximum authentication attempts exceeded for invalid user from 45.136.108.85 port 4581 ssh2 [preauth]
Jan 27 10:32:18 www kernel: pid 97756 (c++), jid 0, uid 0, was killed: out of swap space

Here's the command used to collect the activity log:

#!/bin/sh
# One sample per pass; gstat in batch mode with -I 1s paces the loop.
while true
do
    vmstat ; gstat -abd -I 1s ; date ; swapinfo ; tail -n 2 /var/log/messages ; netstat -m | grep "mbuf clusters"
done

It looks as if vm.pfault_oom_attempts="-1" no longer shuts OOMA off.
Is there another way to deal with the problem?

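One way to rule out a setting that never took effect would be to record the live sysctl values next to each sample. The following is only a sketch under stated assumptions: the function name log_oom_tunables is made up for illustration, and listing vm.pfault_oom_wait and vm.pageout_oom_seq alongside vm.pfault_oom_attempts is an assumption that other OOM-related knobs might matter here, not something established by the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original logging script):
# print each OOM-related tunable as name=value, or "unavailable"
# when the running system does not provide that OID.
log_oom_tunables() {
    for oid in vm.pfault_oom_attempts vm.pfault_oom_wait vm.pageout_oom_seq
    do
        # sysctl -n prints only the value; failures are folded into a marker.
        val=$(sysctl -n "$oid" 2>/dev/null) || val="unavailable"
        echo "${oid}=${val}"
    done
}

log_oom_tunables
```

Calling log_oom_tunables inside the sampling loop would show, in the same log as the kill message, whether the value in effect at that moment was really -1.
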
As an aside, it appears the activity percentages in top have changed:
formerly the per-cpu numbers totalled about four times the total %busy.
Now the per-cpu numbers roughly add up to the total %busy. Not sure it
matters, but it's certainly different from previous behavior.

Perhaps most surprisingly, after buildworld and the activity logger had
stopped (while I was writing this little missive) the machine again
panicked, reporting:

panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffd0000eff000, blocked for 1800269 ticks

Thanks for reading and any ideas.

bob prohaska