From owner-freebsd-hackers@freebsd.org Sat Jan 13 08:59:48 2018
From: Mark Millard
To: bob prohaska
Cc: Freebsd-arm, FreeBSD Hackers
Subject: Re: Builworld stalls on rpi2 [various processes stuck in pfault and vmwait with 1996M Free Swap listed by top]
Date: Fri, 12 Jan 2018 17:53:03 -0800
Message-Id: <19904500-5819-47AC-9666-7103ED87C1CA@dsl-only.net>
In-Reply-To: <20180113005426.GA48702@www.zefox.net>

On 2018-Jan-12, at 4:54 PM, bob prohaska wrote:

> Trying to self-host a build of r327859 using a GENERIC kernel at
> r327664, make seems to stall, with top showing
>
> last pid: 28822;  load averages:  3.12,  3.95,  5.09  up 0+08:39:01  16:39:49
> 50 processes:  1 running, 47 sleeping, 2 waiting
> CPU:  0.0% user,  0.0% nice,  0.2% system,  0.9% interrupt, 98.9% idle
> Mem: 527M Active, 16M Inact, 98M Laundry, 148M Wired, 86M Buf, 3272K Free
> Swap: 2048M Total, 52M Used, 1996M Free, 2% Inuse
>
>   PID USERNAME THR PRI NICE   SIZE    RES STATE  C  TIME   WCPU COMMAND
>   769 bob        1  20    0  6204K  1344K CPU0   0  3:02  0.68% top
>   674 bob        1  20    0 11188K  1636K select 1  0:18  0.04% sshd
>   719 root       1  20    0  4572K   552K select 0  0:03  0.01% make
> 28760 root       1  52    0   346M   302M pfault 2 13:59  0.00% c++
> 28812 root       1  52    0   208M   167M pfault 2  2:54  0.00% c++
> 28815 root       1  52    0   212M   171M pfault 1  2:20  0.00% c++
> 22172 root       1  20    0 13036K  4484K select 2  2:09  0.00% make
> 28820 root       1  52    0   145M   104M pfault 1  2:00  0.00% c++
> 21438 root       1  20    0  7092K   556K select 0  0:05  0.00% make
>   695 root       1  20    0  4016K   552K select 1  0:04  0.00% make
>   593 root       1  20    0  8156K  1596K vmwait 1  0:04  0.00% sendmail
> 20119 root       1  20    0  4516K   548K select 1  0:03  0.00% make
> 21427 root       1  20    0  4484K   556K select 2  0:02  0.00% make
>   590 root       1  20    0 10148K  1552K vmwait 1  0:02  0.00% sshd
> 22168 root       1  20    0  3956K   560K select 0  0:02  0.00% make
>   600 root       1  20    0  4960K     0K WAIT   2  0:01  0.00%
>   461 root       1  20    0  4916K  1020K select 1  0:01  0.00% syslogd
>
> The machine seems dead, none of the ssh sessions responds to keystrokes,
> nor the serial console.
> There are a smattering of
> smsc0: warning: Failed to write register 0x114
> smsc0: warning: Failed to read register 0x114
> smsc0: warning: MII is busy
> smsc0: warning: Failed to write register 0x114
>
> The machine still answers ping. Typing escape control-b does not
> bring up a debugger, did the keysequence change? Power cycling seems
> to be the only way out.

Was that with or without:

options 	ALT_BREAK_TO_DEBUGGER

With it, <CR>~^B (with <CR> being carriage return and ^ being
control) is an alternate break sequence.

I've seen the smsc0 messages before, but I'm not up to -r327664+
yet. This has been with a non-debug kernel running.

I've had builds of large ports get into such states, especially
while at least one large link operation was active alongside other
fairly large processes, as I remember.

Note all the pfault and vmwait lines. It looks like -r327316 and
-r327468 did not happen to avoid this. It looks like the
paging/swapping has gotten stuck in some way. How tied that might
be to the smsc0 messages, I've no clue.

You might get through by using -j3 or -j2 or -j1, which likely
would use less process space at once (worst case) than -j4
happened to. Of course there are time consequences as you
approach -j1 (or no explicit -j for the buildworld at all).

===
Mark Millard
markmi at dsl-only.net
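
To put rough numbers on the -j suggestion: the sketch below sums the RES
column of the four c++ processes in the quoted top output and shows what
the largest one, two, or three of them would add up to. It assumes
(hypothetically) that a lower -j would simply have run fewer of these same
compiles at once; it is one snapshot, not a measurement of what any given
-j actually needs. The names CXX_RES_MB, worst_case_resident and
table are made up for this illustration.

```python
# Resident sizes (MB) of the four c++ processes, copied from the RES
# column of the quoted top output, keyed by PID.
CXX_RES_MB = {28760: 302, 28812: 167, 28815: 171, 28820: 104}


def worst_case_resident(res_mb, jobs):
    """Sum of the `jobs` largest resident sizes.

    Assumption for illustration only: with -j<jobs>, at most `jobs` of
    these same compiles would be resident at the same time.
    """
    return sum(sorted(res_mb, reverse=True)[:jobs])


def table(res_by_pid, max_jobs=4):
    """Map each job count 1..max_jobs to its worst-case resident total (MB)."""
    sizes = list(res_by_pid.values())
    return {jobs: worst_case_resident(sizes, jobs)
            for jobs in range(1, max_jobs + 1)}


if __name__ == "__main__":
    for jobs, total in table(CXX_RES_MB).items():
        print(f"-j{jobs}: up to about {total}M resident in c++")
```

For this snapshot the totals come to 302M, 473M, 640M and 744M for one
through four jobs, against the 527M Active plus 148M Wired that top
reports.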