From: Mark Millard <marklmi@yahoo.com>
Subject: Re: building LLVM threads gets killed
Date: Sun, 19 Aug 2018 23:37:27 -0700
To: gurenchan@gmail.com, FreeBSD Current <freebsd-current@freebsd.org>
Message-Id: <048B761D-E1A0-4EE3-AA55-E2FFBD19F9F6@yahoo.com>
List-Id: Discussions about the use of FreeBSD-current

[In part a resend from the right Email account. In part adding
a note about another Mark Johnston patch for reporting
information.]

On 2018-Aug-19, at 11:25 PM, Mark Millard wrote:

> blubee blubeeme gurenchan at gmail.com wrote on
> Mon Aug 20 03:02:01 UTC 2018 :
>
>> I am running current compiling LLVM60 and when it comes to linking
>> basically all the processes on my computer gets killed; Chrome, Firefox and
>> some of the LLVM threads as well
>
>> . . .
>
>> last pid: 20965;  load averages: 0.64, 5.79, 7.73   up 12+01:35:46  11:00:36
>> 76 processes: 1 running, 75 sleeping
>> CPU:  0.8% user,  0.5% nice,  1.0% system,  0.0% interrupt, 98.1% idle
>> Mem: 10G Active, 3G Inact, 100M Laundry, 13G Wired, 6G Free
>> ARC: 4G Total, 942M MFU, 1G MRU, 1M Anon, 43M Header, 2G Other
>>      630M Compressed, 2G Uncompressed, 2.74:1 Ratio
>> Swap: 2G Total, 1G Used, 739M Free, 63% Inuse
>> . . .
>
> The timing of that top output relative to the first or
> any OOM kill of a process is not clear. After? Just
> before? How long before? What it is like leading up
> to the first kill is of interest.
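A way to capture what it is like leading up to the first kill is to
sample the memory figures on a loop while the build runs, then line
the snapshots up against the kill times afterward. A minimal sh
sketch (the vm.stats sysctl OIDs and swapinfo are FreeBSD's; the log
path, sample count, and interval are placeholder choices):

```shell
#!/bin/sh
# Sketch: append timestamped memory snapshots to a log so they can
# later be correlated with the time of the first OOM kill.
# The sysctl OIDs and swapinfo are FreeBSD-specific; errors are
# suppressed so the sketch stays runnable elsewhere for illustration.
LOG=${LOG:-/tmp/oom-monitor.log}
SAMPLES=${SAMPLES:-3}     # in real use, run until the build finishes
INTERVAL=${INTERVAL:-1}   # seconds between snapshots

: > "$LOG"
i=0
while [ "$i" -lt "$SAMPLES" ]; do
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Free and inactive page counts: the figures the OOM logic watches.
        sysctl vm.stats.vm.v_free_count vm.stats.vm.v_inactive_count 2>/dev/null
        # Swap usage, if any.
        swapinfo 2>/dev/null
    } >> "$LOG" || true
    i=$((i + 1))
    [ "$i" -lt "$SAMPLES" ] && sleep "$INTERVAL"
done
printf 'wrote %s samples to %s\n' "$SAMPLES" "$LOG"
```

Run it in a second terminal while the port builds, then compare the
snapshot timestamps against the "was killed" lines in
/var/log/messages.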
>
> Folks that deal with this are likely to want to know
> if you got console messages (or /var/log/messages content)
> such as:
>
> pid 49735 (c++), uid 0, was killed: out of swap space
>
> (Note: "out of swap space" can be a misnomer for having
> low Free RAM for "too long" [vm.pageout_oom_seq based],
> even with swap unused or little used.)
>
> And: Were you also getting messages like:
>
> swap_pager_getswapspace(4): failed
>
> and/or:
>
> swap_pager: out of swap space
>
> (These indicate the "killed: out of swap space" is not
> necessarily a misnomer relative to swap space, even if
> low free RAM over time drives the process kills.)
>
> How about messages like:
>
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 28139, size: 65536
>
> or any I/O error reports or retry reports?
>
>
> Notes:
>
> Mark Johnston published a patch used for some investigations of
> the OOM killing:
>
> https://people.freebsd.org/~markj/patches/slow_swap.diff
>
> But this is tied to the I/O swap latencies involved and if they
> are driving some time frames. It just adds more reporting to
> the console (and /var/log/messages). It is not a fix. It may
> not be likely to report much for your context.
>
>
> vm.pageout_oom_seq controls the "how long is low free RAM
> tolerated" (my phrasing), though the units are not directly
> time. In various arm contexts with small boards, going from
> the default of 12 to 120 allowed things to complete or get
> much farther. So:
>
> sysctl vm.pageout_oom_seq=120
>
> but 120 is not the limit: it is a C int parameter.
>
> I'll note that "low free RAM" is as FreeBSD classifies it,
> whatever the details are.
>
>
> Most of the arm examples have been small memory contexts
> and many of them likely avoid ZFS and use UFS instead.
> ZFS and its ARC add an additional complicated
> context to this type of issue.
> There are lots of reports
> around of the ARC growing too big. I do not know the
> status of -r336196 relative to ZFS/ARC memory management
> or if more recent versions have improvements. (I do not
> use ZFS normally.) I've seen messages making suggestions
> for controlling the growth, but I'm no ZFS expert.
>
>
> Just to give an idea what is sufficient to build
> devel/llvm60:
>
> I will note that on a Pine64+ 2GB (so 2 GiBytes of RAM
> in an aarch64 context with 4 cores, 1 HW-thread per core)
> running -r337400, and using UFS on a USB drive and a
> swap partition on that drive too, I have built devel/llvm60
> 2 times via poudriere-devel: just one builder
> allowed, but it being allowed to use all 4 cores in
> parallel; about 14.5 hr each time. (Different USB media
> each time.) This did require the:
>
> sysctl vm.pageout_oom_seq=120
>
> Mark Johnston's slow_swap.diff patch code did not
> report any I/O latency problems in the swap subsystem.
>
> I've also built lang/gcc8 2 times, about 12.5 hrs
> each time.
>
> No ZFS, no ARC, no Chrome, no FireFox. Nothing else
> major going on beyond the devel/llvm60 build (or, later,
> the lang/gcc8 build) in each case.

Mark Johnston in the investigation for the arm context
also had us use the following patch:

diff --git a/sys/vm/vm_pageout.c b/sys/vm/vm_pageout.c
index 264c98203c51..9c7ebcf451ec 100644
--- a/sys/vm/vm_pageout.c
+++ b/sys/vm/vm_pageout.c
@@ -1670,6 +1670,8 @@ vm_pageout_mightbe_oom(struct vm_domain *vmd, int page_shortage,
 	 * start OOM.  Initiate the selection and signaling of the
 	 * victim.
 	 */
+	printf("v_free_count: %u, v_inactive_count: %u\n",
+	    vmd->vmd_free_count, vmd->vmd_pagequeues[PQ_INACTIVE].pq_cnt);
 	vm_pageout_oom(VM_OOM_MEM);
 
 	/*

This patch is not about the I/O latencies but about the
free RAM and inactive RAM at exactly the point of the
OOM kill activity.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
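P.S. The console messages listed earlier can be pulled out of
/var/log/messages after the fact with one grep. A self-contained
sketch (the inline sample log lines are made up for illustration;
the patterns match the message forms quoted in this thread):

```shell
#!/bin/sh
# Sketch: extract OOM-related kernel messages from a log file
# (normally /var/log/messages on FreeBSD). When no file is given,
# a made-up sample log is used so the sketch is self-contained.
LOGFILE=${1:-}
if [ -z "$LOGFILE" ]; then
    LOGFILE=$(mktemp)
    cat > "$LOGFILE" <<'EOF'
Aug 20 03:01:55 host kernel: swap_pager_getswapspace(4): failed
Aug 20 03:01:57 host kernel: pid 49735 (c++), uid 0, was killed: out of swap space
Aug 20 03:02:01 host sshd[711]: unrelated noise
EOF
fi

# One alternative per message form quoted in the thread.
MATCHES=$(grep -E 'was killed: out of swap space|swap_pager_getswapspace|swap_pager: out of swap space|indefinite wait buffer' "$LOGFILE")
printf '%s\n' "$MATCHES"
```

On a real system that would be run against the live log, e.g.
`sh oomgrep.sh /var/log/messages` (the script name is hypothetical).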