Subject: Re: Silent hang in buildworld, was Re: Invoking -v for clang during buildworld
From: Mark Millard
In-Reply-To: <20210117174006.GA30728@www.zefox.net>
Date: Sun, 17 Jan 2021 12:30:51 -0800
Cc: Current FreeBSD, freebsd-arm@freebsd.org
Message-Id: <85889EAE-F579-4220-9185-944D9AA5075A@yahoo.com>
References: <20210116043740.GA19523@www.zefox.net> <20210116155538.GA24259@www.zefox.net> <20210116220334.GA26756@www.zefox.net> <20210117174006.GA30728@www.zefox.net>
To: bob prohaska
List-Id: "Porting FreeBSD to ARM processors."

On 2021-Jan-17, at 09:40, bob prohaska wrote:

> On Sat, Jan 16, 2021 at 03:04:04PM -0800, Mark Millard wrote:
>>
>> Other than -j1 style builds (or equivalent), one pretty much
>> always needs to go looking around for a non-panic failure. It
>> is uncommon for all the material to be together in the build
>> log in such contexts.
>
> Running make cleandir twice and restarting -j4 buildworld brought
> the process full circle: A silent hang, no debugger response, no
> console warnings. That's what sent me down the rabbit hole of make
> without clean, which worked at least once...

Unfortunately, such a hang tends to mean that log files and
such were not completely written out to media. We do not get
to see evidence of the actual failure time frame, just
somewhat before. (Compiler/linker output and such can have the
same issues of ending up with incomplete updates.)

So, pretty much my notes are unlikely to be strongly tied to any
solid evidence: more like alternatives to possibly explore that
could be far off the mark.

It is not clear if you were using:

LDFLAGS.lld+= -Wl,--threads=1

or some such to limit the multi-thread linking and its memory.
I'll note that if -j4 gets 4 links running in parallel it used
to be each could have something like 5 threads active on a 4
core machine, so 20 or so threads. (I've not checked llvm11's
lld behavior. It might avoid such for defaults.)

You have not reported any testing of -j2 or -j3 so far, just
-j4. (Another way of limiting memory use, power use, temperature,
etc.)
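
To make the thread arithmetic above concrete, here is a rough sketch
in Python. It is only an illustration: the function name is made up,
the 5-threads-per-link figure is the older lld behavior mentioned
above (not something checked against llvm11), and it assumes the worst
case where every make job happens to be a link step at the same time.

```python
# Rough upper bound on simultaneously active linker threads during a
# parallel build. Assumption: every make job is linking at once, which
# is a worst case, not a typical one.
def worst_case_link_threads(make_jobs: int, threads_per_link: int) -> int:
    """Return make_jobs * threads_per_link (hypothetical helper)."""
    return make_jobs * threads_per_link


if __name__ == "__main__":
    for jobs in (1, 2, 3, 4):
        # 5 threads per link: the older default-style figure (assumed).
        default_style = worst_case_link_threads(jobs, 5)
        # 1 thread per link: what -Wl,--threads=1 would limit it to.
        limited = worst_case_link_threads(jobs, 1)
        print(f"-j{jobs}: up to {default_style} link threads, "
              f"{limited} with --threads=1")
```

The point is only that lowering either factor (-j2 or -j3 instead of
-j4, or one thread per link) shrinks the worst case; actual memory use
would still need to be measured.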

You have not reported if your boot complained about the swap
space size or if you have adjusted related settings to make
non-default tradeoffs for swap management for these specific
tests. I recommend not tailoring and using a swap size total
that is somewhat under what starts to complain when there is
no tailoring.

> The residue of the top screen shows
>
> last pid: 63377; load averages: 4.29, 4.18, 4.15 up 1+07:11:07 04:46:46
> 60 processes: 5 running, 55 sleeping
> CPU: 70.7% user, 0.0% nice, 26.5% system, 2.8% interrupt, 0.0% idle
> Mem: 631M Active, 4932K Inact, 92M Laundry, 166M Wired, 98M Buf, 18M Free
> Swap: 2048M Total, 119M Used, 1928M Free, 5% Inuse, 16K In, 3180K Out
> packet_write_wait: Connection to 50.1.20.26 port 22: Broken pipe
> bob@raspberrypi:~ $ ssh www.zefox.com RES STATE C TIME WCPU COMMAND
> ssh: connect to host www.zefox.com port 22: Connection timed out86.17% c++
> bob@raspberrypi:~ $ 1 99 0 277M 231M RUN 0 3:26 75.00% c++
> 63245 bob 1 99 0 219M 173M CPU0 0 2:10 73.12% c++
> 62690 bob 1 98 0 354M 234M RUN 3 9:42 47.06% c++
> 63377 bob 1 30 0 5856K 2808K nanslp 0 0:00 3.13% gstat
> 38283 bob 1 24 0 5208K 608K wait 2 2:00 0.61% sh
> 995 bob 1 20 0 6668K 1184K CPU3 3 8:46 0.47% top
> 990 bob 1 20 0 12M 1060K select 2 0:48 0.05% sshd
> ....

This does not look like ld was in use as of the last top
display update's content. But the time between reasonable
display updates is fairly long relative to CPU activity,
so it is only suggestive.

> [apologies for typing over the remnants]
>
> I've put copies of the build and swap logs at
>
> http://www.zefox.net/~fbsd/rpi2/buildworld/
>
> The last vmstat entry (10 second repeat time) reports:
> procs memory page disks faults cpu
> r b w avm fre flt re pi po fr sr da0 sd0 in sy cs us sy id
> 4 0 14 969160 91960 685 2 2 1 707 304 0 0 11418 692 1273 45 5 50
>
> Does that point to the memory exhaustion suggested earlier in the thread?
> At this point /boot/loader.conf contains vm.pfault_oom_attempts="-1", but
> that's a relic of long-ago attempts to use USB flash for root and swap.
> Might removing it stimulate more warning messages?
>

vm.pfault_oom_attempts="-1" should only be used in contexts where
running out of swap will not happen. Otherwise a deadlocked system
can result if it does run out of swap. (Run-out has more senses than
just the swap partition being fully used: other internal resources
for keeping track of the swap can run into their limits.) I've no
evidence that the -1 was actually a problem.

I do not find any 1000+ ms/w or ms/r figures in swapscript.log.
I found 3 examples of a little under 405 (on sdda0*), 3 between
340 and 345 (da0*), 4 in the 200s (da0*), and fewer than 60 in the
100s (da0*). It does not look to me like the recorded part had
problems with the long latencies that you used to have happen.

So I've not found any specific evidence about what led to the
hangup. My earlier questions/suggestions are therefore basically
arbitrary, and I would not know what to do with any answers
to the questions.

The only notes that are fairly solid are about the hangup
leading to there being some files that were likely
incompletely updated (logs, compiler output files, etc.).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away
in early 2018-Mar)