Date:      Mon, 18 Jun 2018 18:35:29 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: GPT vs MBR for swap devices
Message-ID:  <C7835435-30E5-4424-A897-8835AA08E12B@yahoo.com>
In-Reply-To: <BC42DDF9-9383-437B-8AE2-A538050C5160@yahoo.com>
References:  <7AB401DF-7AE4-409B-8263-719FD3D889E5@yahoo.com> <20180618230419.GA81275@www.zefox.net> <A8D00616-ADA7-4A33-8787-637AFEF547CF@yahoo.com> <20180619005519.GB81275@www.zefox.net> <BC42DDF9-9383-437B-8AE2-A538050C5160@yahoo.com>



On 2018-Jun-18, at 6:31 PM, Mark Millard <marklmi at yahoo.com> wrote:

>> On 2018-Jun-18, at 5:55 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>
>> On Mon, Jun 18, 2018 at 04:42:21PM -0700, Mark Millard wrote:
>>>
>>>
>>> On 2018-Jun-18, at 4:04 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>>
>>>> On Sat, Jun 16, 2018 at 04:03:06PM -0700, Mark Millard wrote:
>>>>>
>>>>> Since the "multiple swap partitions across multiple
>>>>> devices" context (my description) is what has problems,
>>>>> it would be interesting to see swapinfo information
>>>>> from around the time frame of the failures: how much is
>>>>> used vs. available on each swap partition? Is only one
>>>>> being (significantly) used? The small one (1 GiByte)?
>>>>>
>>>> There are some preliminary observations at
>>>>
>>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_1gbsdflash_swapinfo/1gbusbflash_1gbsdflash_swapinfo.log
>>>>
>>>> If you search for 09:44: (the time of the OOM kills) it looks like
>>>> both swap partitions are equally used, but only 8% full.
>>>>
>>>> At this point I'm wondering if the gstat interval (presently 10 seconds)
>>>> might well be shortened and the ten second sleep eliminated. On the runs
>>>> that succeed swap usage changes little in twenty seconds, but the failures
>>>> seem to culminate rather briskly.
>>>
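[Aside on the logging loop: the script itself wasn't posted, so the
following is only a minimal sketch of what I assume it roughly looks
like, with the gstat interval shortened to 5 s and the separate 10 s
sleep dropped (the file name and interval are illustrative):

#!/bin/sh
# Hypothetical reconstruction of the swap/gstat logging loop.
while : ; do
    date
    swapinfo
    gstat -b -d -I 5s   # -b: collect one sample over the interval, print, exit
done >> swap_gstat.log 2>&1

With no dead time between samples, a brisk ramp-up right before the OOM
kills is less likely to fall between snapshots.]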
>>> One thing I find interesting somewhat before the OOM activity is
>>> the 12355 ms/w and 12318 ms/w on da0 and da0d that go along
>>> with having 46 or 33 L(q) and large %busy figures in the same
>>> lines --and 0 w/s on every line:
>>>
>>> Mon Jun 18 09:42:05 PDT 2018
>>> Device          1K-blocks     Used    Avail Capacity
>>> /dev/da0b         1048576     3412  1045164     0%
>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>> Total             2097152     6920  2090232     0%
>>> dT: 10.043s  w: 10.000s
>>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0
>>>  46      0      0      0    0.0      0     16  12355      0      0    0.0   85.9  da0
>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3
>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3a
>>>  33      0      0      0    0.0      0     22  12318      0      0    0.0  114.1  da0d
>>> Mon Jun 18 09:42:25 PDT 2018
>>> Device          1K-blocks     Used    Avail Capacity
>>> /dev/da0b         1048576     3412  1045164     0%
>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>> Total             2097152     6920  2090232     0%
>>>
>>>
>>> The kBps figures for the writes are not very big above.
>>>
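[Spelling out the arithmetic from that da0 line (my reading, not
something gstat prints directly):

   16 kBps * 10.043 s  =  roughly 160 KiBytes actually written in the window
   12355 ms/w          =  about 12.4 s average per completed write
   L(q) = 46           =  requests still waiting in the queue

So within that window da0 was not merely slow, it was effectively
stalled, and da0d showed the same pattern.]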
>>
>> If it takes 12 seconds to write, I can understand the swapper getting impatient....
>> However, the delay is on /usr, not swap.
>>
>> In the subsequent 1 GB USB flash-alone test case at
>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_swapinfo/1gbusbflash_swapinfo.log
>> the worst-case seems to be at time 13:45:00
>>
>> dT: 13.298s  w: 10.000s
>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>   0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0
>>   9     84      0      0    0.0     84   1237   59.6      0      0    0.0   94.1  da0
>>   0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0s3
>>   0      0      0      0    0.0      0      5    5.6      0      0    0.0    0.1  mmcsd0s3a
>>   5     80      0      0    0.0     80   1235   47.2      0      0    0.0   94.1  da0b
>>   4      0      0      0    0.0      0      1   88.1      0      0    0.0    0.7  da0d
>> Mon Jun 18 13:45:00 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576    22872  1025704     2%
>>
>> 1.2 MB/s writing to swap seems not too shabby, hardly reason to kill a process.
>
> That 1.2 MB/s is the kBps (throughput) figure, not ms/w.
>
> I see a ms/w (and ms/r) that is fairly large (but notably
> smaller than the ms/w of over 12000):
>
> Mon Jun 18 13:12:58 PDT 2018
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         1048576        0  1048576     0%
> dT: 10.400s  w: 10.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>    0      4      0      0    0.0      4     66    3.4      0      0    0.0    1.3  mmcsd0
>    8     18      1     32   1991     17    938   2529      0      0    0.0   88.1  da0
>    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3
>    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3a
>    6     11      1     32   1991     10    938   3207      0      0    0.0   94.7  da0d
> Mon Jun 18 13:13:19 PDT 2018
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         1048576        0  1048576     0%
>
>
> Going in a different direction: I believe you have reported needing
> more than 1 GiByte of swap space, so the 1048576 "1K-blocks" would
> not be expected to be sufficient. The specific failing point may
> well be odd, but if I understand right, the build would not be
> expected to finish without an OOM in this context anyway.
>
>> Thus far I'm baffled. Any suggestions?
>
> Can you get a failure without involving da0, the drive that is
> sometimes showing these huge ms/w (and ms/r) figures? (This question
> presumes having sufficient swap space, so, say, 1.5 GiByte or more
> total.)
>
> My original "additional" suggestion was to have each partition be
> sufficiently sized while keeping the total small enough not to
> produce the notice about too large a swap space. I still want to
> see how such a setup behaves as a variation of a failing context.
> But now it would also seem to be a good idea to avoid da0 and its
> sometimes large ms/w and ms/r figures.
>
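If it would help, here is a hypothetical command sequence for that
variation. The partition letter and size are illustrative and depend on
free space in the SD card's layout, so treat it as a sketch only:

# stop swapping to the USB drive
swapoff /dev/da0b
# add another swap piece inside the SD card's slice (letter assigned by
# gpart, e.g. mmcsd0s3d), then enable it alongside the existing one
gpart add -t freebsd-swap -s 512m mmcsd0s3
swapon /dev/mmcsd0s3d
# confirm roughly 1.5 GiByte total, none of it on da0
swapinfo

Each piece stays big enough to be useful while the total stays small
enough not to trigger the notice about too large a swap space.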

One more point: I'd suggest avoiding da0 for holding any log files
or other files that are being updated and used during these tests.
Avoid da0 as much as possible, not just its swap partition(s).
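
A quick way to check where such a file actually lives (the path is just
an example; adjust to the real log location):

df /var/log   # the Filesystem column shows which device backs the path

If that filesystem is on da0, point the logging output at one backed by
mmcsd0 for these runs.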

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)



