FreeBSD Mail Archives

Date:      Fri, 8 May 2020 15:08:54 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        bob prohaska <fbsd@www.zefox.net>
Cc:        freebsd-arm@freebsd.org
Subject:   Re: Curious buildworld stoppage on RPi3B running 12.1-stable.
Message-ID:  <E86257BD-42F5-47F4-A0D7-AD66D41DADF9@yahoo.com>
In-Reply-To: <20200508180252.GA29832@www.zefox.net>
References:  <20200508180252.GA29832@www.zefox.net>

On 2020-May-8, at 11:02, bob prohaska <fbsd at www.zefox.net> wrote:

> Here's an example of a curious hang on a Pi3 during -j4 buildworld.
> The machine uses a 37GB mechanical USB disk running 12.1 r360689.
>=20
> /boot/loader.conf contains
> kern.cam.boot_delay=3D"20000"
> vm.pageout_oom_seq=3D"4096"
> vm.pfault_oom_attempts=3D"-1"
>=20
> Top reports:
>=20
> last pid: 84778;  load averages:  0.00,  0.47,  1.35                   =
   up 0+19:11:52  09:43:58
> 55 processes:  1 running, 54 sleeping
> CPU:  0.0% user,  0.0% nice,  0.7% system,  0.0% interrupt, 99.3% idle
> Mem: 243M Active, 26M Inact, 13M Laundry, 171M Wired, 98M Buf, 453M =
Free
> Swap: 2263M Total, 698M Used, 1565M Free, 30% Inuse, 696K In

453 MiByte Free, 698MiByte Swap/paging space used, and, see
below, 263 MiByte RESident for c++ --on a machine with 1 GiByte
of RAM with 171 MiByte Wired. The c++ RESident is 20 MiByte or
so more than the 243 MiByte Active figure.

>  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU =
COMMAND
> 82615 root          1  20    0   726M   263M swread   3   8:43   0.46% =
c++
> 1025 bob           1  20    0    13M  1052K CPU0     0   2:47   0.16% =
top
> 1020 bob           1  20    0    19M   800K select   1   0:18   0.02% =
sshd
> 66244 root          1  20    0    11M   228K select   0   0:03   0.00% =
make
>  656 root          1  20    0    10M   244K select   0   0:00   0.00% =
devd
> 82405 root          1  20    0    17M  2460K select   0   0:25   0.00% =
make
> 27467 root          1  20    0    10M   228K select   0   0:11   0.00% =
make
> 27499 root          1  20    0    11M   232K select   0   0:07   0.00% =
make
> 40948 root          1  20    0    11M   228K select   0   0:06   0.00% =
make
>  928 root          1  20    0    16M  1004K select   0   0:05   0.00% =
sendmail
> 66255 root          1  20    0    14M   236K select   1   0:04   0.00% =
make
> 73291 root          1  20    0    10M   224K select   1   0:02   0.00% =
make
>  932 root          1  20    0    11M   340K nanslp   3   0:01   0.00% =
cron
>  727 root          1  20    0    11M   664K select   3   0:01   0.00% =
syslogd
>=20
> The buildworld output log file has stopped updating:
>=20
> -rw-r--r--  1 root  wheel  18178806 May  8 06:34 buildworld.log
> bob@nemesis:/usr/src % ls -l *.log
> -rw-r--r--  1 root  wheel  18345625 May  8 08:17 buildworld.log
> bob@nemesis:/usr/src % ls -l *.log
> -rw-r--r--  1 root  wheel  18532738 May  8 09:34 buildworld.log
> bob@nemesis:/usr/src % ls -l *.log
> -rw-r--r--  1 root  wheel  18532738 May  8 09:34 buildworld.log
> bob@nemesis:/usr/src % ls -l *.log
> -rw-r--r--  1 root  wheel  18532738 May  8 09:34 buildworld.log
> bob@nemesis:/usr/src % ls -l *.log
> -rw-r--r--  1 root  wheel  18532738 May  8 09:34 buildworld.log
> bob@nemesis:/usr/src %=20
> At this point clock time is 09:46, so it's been stuck 8 minutes.

See later below about how long loading from swap/paging
space might be taking based on the below gstat information
and the earlier top information.

> Gstat reports
> dT: 1.010s  w: 1.000s
> L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
>    0      0      0      0    0.0      0      0    0.0    0.0| mmcsd0
>    1     80     80    321   12.4      0      0    0.0   99.4| da0
>    0      0      0      0    0.0      0      0    0.0    0.0| mmcsd0s1
>    0      0      0      0    0.0      0      0    0.0    0.0| mmcsd0s2
>    0      0      0      0    0.0      0      0    0.0    0.0| =
diskid/DISK-9B9A5304
>    0      0      0      0    0.0      0      0    0.0    0.0| =
msdosfs/MSDOSBOOT
>    0      0      0      0    0.0      0      0    0.0    0.0| =
mmcsd0s2a
>    0      0      0      0    0.0      0      0    0.0    0.0| =
mmcsd0s2b
>    0      0      0      0    0.0      0      0    0.0    0.0| =
diskid/DISK-9B9A5304s1
>    0      0      0      0    0.0      0      0    0.0    0.0| =
diskid/DISK-9B9A5304s2
>    0      0      0      0    0.0      0      0    0.0    0.0| =
ufsid/5bbd5a23da8b6195
>    0      0      0      0    0.0      0      0    0.0    0.0| =
ufs/rootfs
>    0      0      0      0    0.0      0      0    0.0    0.0| =
diskid/DISK-9B9A5304s2a
>    0      0      0      0    0.0      0      0    0.0    0.0| =
diskid/DISK-9B9A5304s2b
>    0      0      0      0    0.0      0      0    0.0    0.0| da0s1
>    1     80     80    321   12.4      0      0    0.0   99.5| da0s2
>    0      0      0      0    0.0      0      0    0.0    0.0| da0s2a
>    1     80     80    321   12.4      0      0    0.0   99.5| da0s2b
> bob@nemesis:~ %=20

Note the lack of write activity and top also is reported
showing: swread . Sort of like it is mostly doing a large
reload of something that had been mostly paged out earlier.

This is combined with that top output showing: 453M Free .
Was there a time earlier with far less free?

Also: top only shows one c++ to match up with your -j4 . This
suggests possibly a period of time when other things have not
started, pending needing what that c++ is building. Or it just
finished a parallel job but has not yet started another.
Somewhat earlier there might have been more than one thing
active.

Can you find out what the c++ is compiling during this period?
that might allow figuring out if it is something that normally
sends some time has the only job despite -j4 .

> It looks as if the swap device is saturated, but swap is not =
exhausted.

You later mention swap_pager_getswapspace errors, leading to the
question if there was a period when swap (or its supporting RAM
data structure space) had been exhausted. Were those the only
messages in such earlier time frames?

As for being "saturated":

At 321 KB/s (the example in the gstat output), if it was loading
something large, such as 512 MiBytes, just basically reading that
much material at a mean of such a rate would be over 25 minutes:

512*1024*1024_bytes * (1sec/(321*1024_bytes)) * (1minute/(60sec))

(Not that I know the 512 MiByte is realistic or that the 321
KiBYtes/sec is a realistic mean rate.) This just gives an idea
what might be involved and that being beyond seconds or even a
small number of minutes for the general time scale may well be
possible.

Extensive paging/swapping can vastly slow things down.

> The machine isn't hung; ssh sessions are responsive, the serial =
console
> refreshes the login prompt when the Enter key is hit.=20
>=20
> Also, there are no _current_ errors on the serial console. However, =
there
> a quite a number of swap_pager_getswapspace(32): failed messages, =
which
> stopped some considerable time  earlier. Buildworld didn't seem =
affected by them.=20

I'm not sure what can be said about the status of the machine
after swap_pager_getswapspace failures are reported that do not
have matching OOM kills involved. It may be that pages stayed
resident that otherwise would have paged out and that other
tradeoffs lead to slow-downs compared to having been able to
free some RAM for other uses. (No deadlock happened, however.)

My expectation here is that you did run out of free swap/paging
space space for a while. (It may be that internal data structures
for managing swap/paging space use are the limit hit, instead of
disk space, for all I know.) You did not indicate how big the
swap/paging partition is.

You did not mention if booting is suggesting kern.maxswzone
adjustments --or if you have done something to control that
context to not use the default.

> Perhaps strangest of all, starting up a gstat script to log system
> activity seems to slowly revive buildworld. %idle gradually drops to=20=

> a few percent.

Did you look to see if it was still just the one c++ process (PID
82615) for the -j4 ?

Other things may also have changed, not just you starting a gstat
script. This sort of judgement requires examining/monitoring the
context for a time.

> If the gstat script is started before buildworld=20
> the stall doesn't seem to happen and buildworld finishes in about 27
> hours, which is as good as it gets. =20

Are all the comparisons of from-scratch buildworld buildkernel
activity and otherwise controlled to be very similar contexts?

> Earlier tests  letting the machine run undisturbed without the gastat=20=

> script running suggests it'll eventually finish buildworld in around=20=

> 36 hours.

Also has the question of how well guaranteed is the
degree of similarity for the compared contexts?

> This vaguely echoes an incident long ago where Xorg was unusably slow
> until a small job was started in the background, at which point it
> became responsive. Can't readily imagine a connection, though.

Definitely changes what the scheduler has to do. That likely has
other non-obvious consequences when pushing the system near its
limits.

> It's rather counterintituve that giving the machine more work to do
> should make it work faster....If anybody can suggest what's going
> on I'd most curious. I could try poking around in the debugger if
> someone can offer guidance. On my own I'm clueless.=20

=3D=3D=3D
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?E86257BD-42F5-47F4-A0D7-AD66D41DADF9>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation