From owner-freebsd-hackers@freebsd.org Thu Dec 7 01:28:25 2017
From: Mark Millard
To: Laurent Cimon
Cc: freebsd-arm@freebsd.org, freebsd-hackers@freebsd.org, freebsd-current@freebsd.org
Subject: Re: rpi2 hangup during poudriere build: lots of pfault wmseg status
Date: Wed, 6 Dec 2017 17:01:36 -0800
Message-Id: <36A8BDCC-4ECE-4187-8705-54A9E38E8AD5@dsl-only.net>
References: <05BEA04B-249B-4E7D-855A-46DA1A0DEA16@dsl-only.net>
List-Id: Technical Discussions relating to FreeBSD

On 2017-Dec-6, at 1:54 PM, Laurent Cimon wrote:

>> On Dec 6, 2017, at 00:57, Mark Millard wrote:
>>
>> I tried to build some ports on a rpi2
>> (via poudriere) but it hung up:
>> Ethernet and normal console use. (Note:
>> the root file system is on a USB SSD
>> and the swap partition is also on that
>> USB SSD.)
>>
>> But ~^b worked for getting to the db>
>> prompt on the console.
>>
>> From there a ps suggests that it got hung
>> up in pfault activity. (Possibly insufficient
>> RAM+swap-partition space?) But it is not
>> clear to me that it should end up hung up
>> vs. killing processes or other such.
>
> Hi,
>
> From what I know the raspberry pis use the same controller for ethernet and
> the USB hub on which you’re hosting an SSD. It seems like you make very heavy
> use of the USB ports, and all of the resources used by poudriere except for the
> CPU and the (very limited) memory that’s not in swap is attached to them. If you
> really didn’t have enough memory and swap, the linkers would’ve been stopped.
>
> I think it might just be a swap death. Poudriere compiles and fetches in parallel
> a lot, ethernet and disk I/O is slow because it’s very limited, so linking takes
> longer. You end up linking a few very big binaries at the same time, and they
> all fight for the memory, to get out of swap through page faults, but there
> are too many page faults, all too big, requesting for more CPU time that’s
> allowed to them.
>
> This would explain why you have 3 linkers waiting on a page fault out of the 4
> CPUs poudriere allows builds on, on top of the awk processes.
> It would also explain why you had easy access to the debugger: it was in
> memory already with the kernel.
>
> I’d advise you to disable parallel builds and see if it happens again,
> but it would make building much slower. Using makejobs would help if you
> can afford watching the build. Otherwise be patient, it should resolve itself
> eventually, but it will take a while and it will happen again.

My post was more about how FreeBSD handled the
heavy-use context and less about getting the
builds to finish: it managed to get to a state
of no-progress for processes and a loss of normal
control as far as I could tell.

I did a "c" to ddb and left it until just before
this note then did ~ ^B again. Things looked the
same. [I've finally rebooted the rpi2.]

PARALLEL_JOBS=1 was already in use but
ALLOW_MAKE_JOBS=yes was also in use.
USE_TMPFS=no was already in use.

While an ssh session was monitoring the build,
Ethernet was not in heavy use. (No nfs mounts
to its disks, for example.)

I may try without ALLOW_MAKE_JOBS=yes and with
ALLOW_MAKE_JOBS_PACKAGES empty/undefined to see
if it can complete for such a context without
having the same sort of problem.

Ultimately I can cross-build and install from
those materials when I really want updates. I
have the context for such. This was more about
seeing how well the rpi2 did for self-hosted.
Classically I've used a BPI-M3 with 2 GiBytes
of RAM and a proportionally bigger swap partition
instead (approximately).

FYI (rpi2 after rebooting):

# swapinfo
Device               1K-blocks  Used    Avail Capacity
/dev/label/RPI2swap    1572860     0  1572860     0%

# df -m
Filesystem           1M-blocks  Used   Avail Capacity  Mounted on
/dev/ufs/RPI2rootfs     195378 30791  148957    17%    /
devfs                        0     0       0   100%    /dev
/dev/label/RPI2Aboot        49    12      37    25%    /boot/msdos

An rpi3 (aarch64) with the same amount of RAM,
same type of USB SSD, etc., but well more swap
completed building basically the same set of
ports for the same poudriere settings just fine.
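
For reference, the poudriere settings named above could be written as
poudriere.conf fragments roughly like the following. This is only a
sketch: the three values for the hung build are the ones stated in this
message, the second fragment is one reading of "without
ALLOW_MAKE_JOBS=yes and with ALLOW_MAKE_JOBS_PACKAGES empty/undefined",
and nothing else about the actual poudriere.conf is known here.

```shell
# Settings reported in use when the rpi2 build hung up.
PARALLEL_JOBS=1        # one port built at a time
ALLOW_MAKE_JOBS=yes    # but each port may still run parallel make jobs
USE_TMPFS=no           # no tmpfs, so builds do not consume RAM that way

# Possible retry (assumed layout): leave both make-jobs knobs
# undefined so that each port builds with a single make job.
#PARALLEL_JOBS=1
##ALLOW_MAKE_JOBS=yes
##ALLOW_MAKE_JOBS_PACKAGES=
#USE_TMPFS=no
```

The retry fragment is fully commented out so that the block as a whole
still describes only the configuration that was actually run.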
Interestingly for the default kern.maxswzone:
(Just to show the reported recommended maximum
figures for swap.)

rpi2: . . . exceeds maximum recommended amount (411488 pages).
rpi3: . . . exceeds maximum recommended amount (925680 pages).

(I was running with somewhat under those maximums
for the tests.)

# swapinfo
Device               1K-blocks  Used    Avail Capacity
/dev/gpt/RPI3swap      3702784     0  3702784     0%

# df -m
Filesystem           1M-blocks  Used   Avail Capacity  Mounted on
/dev/ufs/RPI3rootfs     195378 14937  164811     8%    /
devfs                        0     0       0   100%    /dev
/dev/label/RPI3Aboot        49     7      42    15%    /boot/efi

If I restricted the rpi3 to somewhat under what
the rpi2 allows for swap, I do not know if it
would also hang up vs. not. If having more swap
makes the difference, then it would not seem to
be being I/O-bound that would explain the hangup.

===
Mark Millard
markmi at dsl-only.net
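
The recommended maximums above are reported in pages while swapinfo
reports 1K-blocks, so the two sets of figures are not directly
comparable as printed. The sketch below converts one to the other. It
assumes 4 KiB pages on both boards, which this message does not state;
the function names are made up for the illustration and the results
are only as good as that page-size assumption.

```shell
#!/bin/sh
# Assumption (not stated in the message): 4 KiB pages on rpi2 and rpi3.
PAGE_KIB=4

# Convert a page count to KiB.
pages_to_kib() {
    echo $(( $1 * PAGE_KIB ))
}

# Print the recommended maximum in KiB next to the swapinfo size and the
# difference between them (negative means the device is the larger one).
# $1 = label, $2 = recommended maximum in pages, $3 = swapinfo 1K-blocks
report() {
    max_kib=$(pages_to_kib "$2")
    echo "$1: max ${max_kib} KiB, swap $3 KiB, headroom $(( max_kib - $3 )) KiB"
}

report rpi2 411488 1572860
report rpi3 925680 3702784
```

The page counts and 1K-block figures passed to report are the ones
quoted in this message; only the page size is supplied from outside it.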