Date: Mon, 18 Jun 2018 18:35:29 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: GPT vs MBR for swap devices
Message-ID: <C7835435-30E5-4424-A897-8835AA08E12B@yahoo.com>
In-Reply-To: <BC42DDF9-9383-437B-8AE2-A538050C5160@yahoo.com>
References: <7AB401DF-7AE4-409B-8263-719FD3D889E5@yahoo.com> <20180618230419.GA81275@www.zefox.net> <A8D00616-ADA7-4A33-8787-637AFEF547CF@yahoo.com> <20180619005519.GB81275@www.zefox.net> <BC42DDF9-9383-437B-8AE2-A538050C5160@yahoo.com>
On 2018-Jun-18, at 6:31 PM, Mark Millard <marklmi at yahoo.com> wrote:

> On 2018-Jun-18, at 5:55 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>
>> On Mon, Jun 18, 2018 at 04:42:21PM -0700, Mark Millard wrote:
>>>
>>> On 2018-Jun-18, at 4:04 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>>
>>>> On Sat, Jun 16, 2018 at 04:03:06PM -0700, Mark Millard wrote:
>>>>>
>>>>> Since the "multiple swap partitions across multiple
>>>>> devices" context (my description) is what has problems,
>>>>> it would be interesting to see swapinfo information
>>>>> from around the time frame of the failures: how much is
>>>>> used vs. available on each swap partition? Is only one
>>>>> being (significantly) used? The small one (1 GiByte)?
>>>>>
>>>> There are some preliminary observations at
>>>>
>>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_1gbsdflash_swapinfo/1gbusbflash_1gbsdflash_swapinfo.log
>>>>
>>>> If you search for 09:44: (the time of the OOM kills) it looks like
>>>> both swap partitions are equally used, but only 8% full.
>>>>
>>>> At this point I'm wondering if the gstat interval (presently 10 seconds)
>>>> might well be shortened and the ten-second sleep eliminated. On the runs
>>>> that succeed, swap usage changes little in twenty seconds, but the failures
>>>> seem to culminate rather briskly.
>>>
>>> One thing I find interesting somewhat before the OOM activity is
>>> the 12355 ms/w and 12318 ms/w on da0 and da0d that go along
>>> with having 46 or 33 L(q) and large %busy figures in the same
>>> lines --and 0 w/s on every line:
>>>
>>> Mon Jun 18 09:42:05 PDT 2018
>>> Device          1K-blocks     Used    Avail Capacity
>>> /dev/da0b         1048576     3412  1045164     0%
>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>> Total             2097152     6920  2090232     0%
>>> dT: 10.043s  w: 10.000s
>>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>>>     0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0
>>>    46      0      0      0    0.0      0     16  12355      0      0    0.0   85.9  da0
>>>     0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3
>>>     0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3a
>>>    33      0      0      0    0.0      0     22  12318      0      0    0.0  114.1  da0d
>>> Mon Jun 18 09:42:25 PDT 2018
>>> Device          1K-blocks     Used    Avail Capacity
>>> /dev/da0b         1048576     3412  1045164     0%
>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>> Total             2097152     6920  2090232     0%
>>>
>>> The kBps figures for the writes are not very big above.
>>>
>>
>> If it takes 12 seconds to write, I can understand the swapper getting impatient....
>> However, the delay is on /usr, not swap.
>>
>> In the subsequent 1 GB USB flash-alone test case at
>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_swapinfo/1gbusbflash_swapinfo.log
>> the worst case seems to be at time 13:45:00
>>
>> dT: 13.298s  w: 10.000s
>>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>>     0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0
>>     9     84      0      0    0.0     84   1237   59.6      0      0    0.0   94.1  da0
>>     0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0s3
>>     0      0      0      0    0.0      0      5    5.6      0      0    0.0    0.1  mmcsd0s3a
>>     5     80      0      0    0.0     80   1235   47.2      0      0    0.0   94.1  da0b
>>     4      0      0      0    0.0      0      1   88.1      0      0    0.0    0.7  da0d
>> Mon Jun 18 13:45:00 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576    22872  1025704     2%
>>
>> 1.2 MB/s writing to swap seems not too shabby, hardly reason to kill a process.
>
> That is kBps instead of ms/w.
>
> I see an ms/w (and ms/r) that is fairly large (but notably
> smaller than the ms/w of over 12000):
>
> Mon Jun 18 13:12:58 PDT 2018
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         1048576        0  1048576     0%
> dT: 10.400s  w: 10.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d  %busy Name
>     0      4      0      0    0.0      4     66    3.4      0      0    0.0    1.3  mmcsd0
>     8     18      1     32   1991     17    938   2529      0      0    0.0   88.1  da0
>     0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3
>     0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3a
>     6     11      1     32   1991     10    938   3207      0      0    0.0   94.7  da0d
> Mon Jun 18 13:13:19 PDT 2018
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         1048576        0  1048576     0%
>
> Going in a different direction: I believe that you have
> reported needing more than 1 GiByte of swap space, so the
> 1048576 "1K-blocks" would not be expected to be sufficient.
> The specific failing point may well be odd, but the build
> would not be expected to finish without an OOM kill in this
> context, if I understand right.
>
>> Thus far I'm baffled. Any suggestions?
>
> Can you get a failure without involving da0, the drive that is
> sometimes showing these huge ms/w (and ms/r) figures? (This question
> presumes having sufficient swap space, so, say, 1.5 GiByte or more
> total.)
>
> Having the partition(s) each be sufficiently sized, but with a
> total that does not produce the notice about too large a swap
> space, was my original "additional" suggestion. I still want to
> see what such a setup does as a variation of a failing context.
> But now it also seems to be a good idea to avoid da0 and its
> sometimes large ms/w and ms/r figures.
>

One more point: I'd suggest avoiding da0 for holding any log
files or other such files that are being updated and used:
avoid da0 as much as possible, not just its swap partition(s).

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)
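
For reference, a minimal sketch of the tighter monitoring loop
discussed above (gstat interval shortened, separate sleep dropped).
The script that actually produced the zefox.net logs is not shown in
the thread, so the interval, flags, and log path below are
assumptions:

    #!/bin/sh
    # Sketch only: the interval, flags, and log path are assumptions,
    # not the actual script behind the zefox.net logs.
    LOG=/var/log/swapwatch.log    # hypothetical path; keep the log off da0
    while true
    do
        date                      # timestamp each sample
        swapinfo -k               # per-device swap usage in 1K-blocks
        gstat -b -I 2s            # one batch-mode report over a 2 s window
    done >> "$LOG" 2>&1

Since gstat -b blocks for one full interval before printing and
exiting, the loop paces itself without an explicit sleep, and a 2 s
window should resolve failures that culminate briskly better than a
10 s interval plus a 10 s sleep does.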
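
Along the same lines, a minimal sketch of moving swap off da0 while
keeping 1.5 GiByte or more total, assuming a spare device that shows
up as mmcsd1 (a hypothetical name; the gpart create step destroys
whatever is on that device) and GPT partitioning:

    swapoff /dev/da0b             # may fail if pages cannot be moved back in
    gpart create -s gpt mmcsd1    # assumes mmcsd1 holds nothing needed
    gpart add -t freebsd-swap -s 1536M -l swap0 mmcsd1
    swapon /dev/gpt/swap0         # enable swap via the GPT label
    swapinfo -k                   # confirm the new layout
    # To persist across reboots, /etc/fstab would get a line like:
    # /dev/gpt/swap0  none  swap  sw  0  0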