Date:      Tue, 19 Jun 2018 07:06:18 +0300
From:      Jukka Ukkonen <jau789@gmail.com>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        Mark Millard <marklmi@yahoo.com>, freebsd-arm@freebsd.org
Subject:   Re: GPT vs MBR for swap devices
Message-ID:  <0C802675-9DE2-4446-B0F1-528D40C69C68@gmail.com>
In-Reply-To: <20180619034232.GA81800@www.zefox.net>
References:  <7AB401DF-7AE4-409B-8263-719FD3D889E5@yahoo.com> <20180618230419.GA81275@www.zefox.net> <A8D00616-ADA7-4A33-8787-637AFEF547CF@yahoo.com> <20180619005519.GB81275@www.zefox.net> <BC42DDF9-9383-437B-8AE2-A538050C5160@yahoo.com> <20180619034232.GA81800@www.zefox.net>


Are you sure it is not /usr/obj activity that you are seeing when
there are large write delays?
On systems using traditional spinning disks for everything else,
it really makes sense to put /usr/obj on its own SSD, making sure
the SSD does not share an I/O channel with any other device.
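As a concrete sketch of that setup (the device name da1p1 and the
partition layout are my assumptions, not anything from this thread),
the fstab entry for such a dedicated SSD could look like:

```shell
# Hypothetical /etc/fstab entry putting /usr/obj on its own SSD.
# da1p1 is an assumed device name; pick an SSD that does not share
# a USB or other I/O channel with the disk holding /usr.
/dev/da1p1   /usr/obj   ufs   rw   2   2
```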

--jau


> On 19 Jun 2018, at 6.42, bob prohaska <fbsd@www.zefox.net> wrote:
>
>> On Mon, Jun 18, 2018 at 06:31:40PM -0700, Mark Millard wrote:
>>> On 2018-Jun-18, at 5:55 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>>
>>>> On Mon, Jun 18, 2018 at 04:42:21PM -0700, Mark Millard wrote:
>>>>
>>>>
>>>>> On 2018-Jun-18, at 4:04 PM, bob prohaska <fbsd at www.zefox.net> wrote:
>>>>>
>>>>>> On Sat, Jun 16, 2018 at 04:03:06PM -0700, Mark Millard wrote:
>>>>>>
>>>>>> Since the "multiple swap partitions across multiple
>>>>>> devices" context (my description) is what has problems,
>>>>>> it would be interesting to see swapinfo information
>>>>>> from around the time frame of the failures: how much is
>>>>>> used vs. available on each swap partition? Is only one
>>>>>> being (significantly) used? The small one (1 GiByte)?
>>>>>>
>>>>> There are some preliminary observations at
>>>>>
>>>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_1gbsdflash_swapinfo/1gbusbflash_1gbsdflash_swapinfo.log
>>>>>
>>>>> If you search for 09:44: (the time of the OOM kills) it looks like
>>>>> both swap partitions are equally used, but only 8% full.
>>>>>=20
>>>>> At this point I'm wondering if the gstat interval (presently 10 seconds)
>>>>> might well be shortened and the ten second sleep eliminated. On the runs
>>>>> that succeed swap usage changes little in twenty seconds, but the failures
>>>>> seem to culminate rather briskly.
>>>>
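A minimal sketch of such a logging loop, with the ten-second sleep
removed and a shorter gstat sample. The 5-second interval, log name,
and the "command -v" guards are my assumptions, not the actual test
script:

```shell
#!/bin/sh
# Sketch: timestamp, swapinfo snapshot, one short batch-mode gstat
# sample, and no extra sleep between passes.
LOG=swapmon.log
: > "$LOG"
n=0
while [ "$n" -lt 3 ]; do            # bounded here just for illustration
    date >> "$LOG"                  # timestamp each pass
    if command -v swapinfo >/dev/null 2>&1; then
        swapinfo >> "$LOG"          # per-device swap usage (FreeBSD)
    fi
    if command -v gstat >/dev/null 2>&1; then
        gstat -bI 5s >> "$LOG"      # -b: one batch report; -I 5s: 5 s sample
    fi
    n=$((n + 1))
done
```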
>>>> One thing I find interesting somewhat before the OOM activity is
>>>> the 12355 ms/w and 12318 ms/w on da0 and da0d that goes along
>>>> with having 46 or 33 L(q) and large %busy figures in the same
>>>> lines --and 0 w/s on every line:
>>>>
>>>> Mon Jun 18 09:42:05 PDT 2018
>>>> Device          1K-blocks     Used    Avail Capacity
>>>> /dev/da0b         1048576     3412  1045164     0%
>>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>>> Total             2097152     6920  2090232     0%
>>>> dT: 10.043s  w: 10.000s
>>>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0
>>>>  46      0      0      0    0.0      0     16  12355      0      0    0.0   85.9  da0
>>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3
>>>>   0      0      0      0    0.0      0      9   10.8      0      0    0.0    0.1  mmcsd0s3a
>>>>  33      0      0      0    0.0      0     22  12318      0      0    0.0  114.1  da0d
>>>> Mon Jun 18 09:42:25 PDT 2018
>>>> Device          1K-blocks     Used    Avail Capacity
>>>> /dev/da0b         1048576     3412  1045164     0%
>>>> /dev/mmcsd0s3b    1048576     3508  1045068     0%
>>>> Total             2097152     6920  2090232     0%
>>>>
>>>>
>>>> The kBps figures for the writes are not very big above.
>>>>
>>>
>>> If it takes 12 seconds to write, I can understand the swapper getting impatient....
>>> However, the delay is on /usr, not swap.
>>>
>>> In the subsequent 1 GB USB flash-alone test case at
>>> http://www.zefox.net/~fbsd/rpi3/swaptests/newtests/1gbusbflash_swapinfo/1gbusbflash_swapinfo.log
>>> the worst-case seems to be at time 13:45:00
>>>
>>> dT: 13.298s  w: 10.000s
>>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>>   0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0
>>>   9     84      0      0    0.0     84   1237   59.6      0      0    0.0   94.1  da0
>>>   0      0      0      0    0.0      0      5    5.5      0      0    0.0    0.1  mmcsd0s3
>>>   0      0      0      0    0.0      0      5    5.6      0      0    0.0    0.1  mmcsd0s3a
>>>   5     80      0      0    0.0     80   1235   47.2      0      0    0.0   94.1  da0b
>>>   4      0      0      0    0.0      0      1   88.1      0      0    0.0    0.7  da0d
>>> Mon Jun 18 13:45:00 PDT 2018
>>> Device          1K-blocks     Used    Avail Capacity
>>> /dev/da0b         1048576    22872  1025704     2%
>>>
>>> 1.2 MB/s writing to swap seems not too shabby, hardly reason to kill a process.
>>
>> That is kBps instead of ms/w.
>>
>> I see a ms/w (and ms/r) that is fairly large (but notably
>> smaller than the ms/w of over 12000):
>>
>> Mon Jun 18 13:12:58 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576        0  1048576     0%
>> dT: 10.400s  w: 10.000s
>> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>>    0      4      0      0    0.0      4     66    3.4      0      0    0.0    1.3  mmcsd0
>>    8     18      1     32   1991     17    938   2529      0      0    0.0   88.1  da0
>>    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3
>>    0      4      0      0    0.0      4     63    3.5      0      0    0.0    1.3  mmcsd0s3a
>>    6     11      1     32   1991     10    938   3207      0      0    0.0   94.7  da0d
>> Mon Jun 18 13:13:19 PDT 2018
>> Device          1K-blocks     Used    Avail Capacity
>> /dev/da0b         1048576        0  1048576     0%
>>
>>
> Yes, but again, it's on /usr, not swap. One could argue that there are other
> write delays, not seen here, that do affect swap. To forestall that objection
> I'll get rid of the ten second sleep in the script when the present test run
> finishes.
>
>
>> Going in a different direction, I believe that you have
>> reported needing more than 1 GiByte of swap space so the
>> 1048576 "1K-blocks" would not be expected to be sufficient.
>> So the specific failing point may well be odd but the build
>> would not be expected to finish without an OOM for this
>> context if I understand right.
>>
> Yes, the actual swap requirement seems to be slightly over 1.4 GB
> at the peak based on other tests. I fully expected a failure, but
> at a much higher swap utilization.
>
>
>>> Thus far I'm baffled. Any suggestions?
>>
>> Can you get a failure without involving da0, the drive that is
>> sometimes showing these huge ms/w (and ms/r) figures? (This question
>> presumes having sufficient swap space, so, say, 1.5 GiByte or more
>> total.)
>>
> If you mean not using da0, no; it holds /usr. If you mean not swapping
> to da0, yes, it's been done. Having 3 GB swap on microSD works.
> Which suggests an experiment: use 1 GB SD swap and 1.3 GB mechanical
> USB swap. That's easy to try.
>
>> Having the partition(s) each be sufficiently sized but for which
>> the total would not produce the notice for too large of a swap
>> space was my original "additional" suggestion. I still want to
>> see what such does as a variation of a failing context.
>
> I'm afraid you've lost me here. With two partitions of 1 GB each,
> one USB and the other SD, OOM kills happen at 8% utilization,
> spread evenly across both. Does the size of a partition affect
> its speed? Capacity does not seem to be the problem.
>
>> it would seem to be a good idea to avoid da0 and its sometimes
>> large ms/w and ms/r figures.
>>
>
> I think the next experiment will be to use 1 GB of SD swap and
> 1.3 GB of mechanical USB swap. We know the SD swap is fast enough,
> and we know the USB mechanical swap is fast enough. If that
> combination works, maybe the trouble is congestion on da0. If the combo
> fails as before, I'll be tempted to think it's USB or the swapper.
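That combination could be set up along these lines (da1b as the
mechanical USB drive's swap partition is an assumed device name;
mmcsd0s3b is taken from the logs above):

```shell
# Hypothetical /etc/fstab swap entries for the proposed test:
# 1 GB on microSD plus ~1.3 GB on a mechanical USB drive.
/dev/mmcsd0s3b   none   swap   sw   0   0
/dev/da1b        none   swap   sw   0   0
```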
>
> Thanks for reading!
>
>
> bob prohaska
>>
>>
> _______________________________________________
> freebsd-arm@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-arm
> To unsubscribe, send any mail to "freebsd-arm-unsubscribe@freebsd.org"


