Date: Wed, 5 Sep 2018 19:05:11 -0700
From: Mark Millard <marklmi@yahoo.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: freebsd-arm@freebsd.org
Subject: Re: RPI3 swap experiments (r338342 with vm.pageout_oom_seq="1024" and 6 GB swap)
Message-ID: <FB333A71-47D8-4038-9983-116DA80FC952@yahoo.com>
In-Reply-To: <20180906003829.GC818@www.zefox.net>
References: <0D8B9A29-DD95-4FA3-8F7D-4B85A3BB54D7@yahoo.com>
 <FC0798A1-C805-4096-9EB1-15E3F854F729@yahoo.com>
 <20180813185350.GA47132@www.zefox.net>
 <FA3B8541-73E0-4796-B2AB-D55CE40B9654@yahoo.com>
 <20180814014226.GA50013@www.zefox.net>
 <CANCZdfqFKY3Woa%2B9pVS5hika_JUAUCxAvLznSS4gaLq2kKoWtQ@mail.gmail.com>
 <20180815013612.GB51051@www.zefox.net>
 <CANCZdfoB_AcidFpKT_ZmZWUFnmC4Bw55krK%2BMqEmmj=f9KMQ2Q@mail.gmail.com>
 <20180815225504.GB59074@www.zefox.net>
 <20180901230233.GA42895@www.zefox.net>
 <20180906003829.GC818@www.zefox.net>

[I've omitted Kirk McKusick as my notes are largely off subject
for what he asked about for testing specific to his changes.]

On 2018-Sep-5, at 5:38 PM, bob prohaska <fbsd at www.zefox.net> wrote:

> On Sat, Sep 01, 2018 at 04:02:33PM -0700, bob prohaska wrote:
>>
>> With r338342 and
>> vm.pageout_oom_seq="1024"
>> in /boot/loader.conf the RPI3 is a bit closer to a Mars Rover.
>> No panics, crashes or USB errors, -j4 buildworld runs to completion.
>> When swap usage goes over about 50% the system slows, but doesn't give up.
>> There are six 1 GB swap partitions available, 3 on USB and 3 on microSD.
>>
>> Log files are at
>> http://www.zefox.net/~fbsd/rpi3/swaptests/r338342/
>> for the combinations tried so far.
>>
>
> It looks as if using all six GB of swap doesn't cause any immediate problem,
> at least so long as swap usage stays relatively low, say 1.5 GB. In a final
> test, TRIM was turned on without catastrophe, though it had little to do
> given that all the busy filesystems were on USB. The penalty was about one
> hour extra (25 vs 24 hours) to run -j4 buildworld from a clean start.

What UFS file systems with TRIM enabled were on some /dev/mmcsd0* ?
Did you 1st use "fsck_ffs -E" on any of the file systems where
trim would work?

If I gather right, the "clean start" was on USB where TRIM
during the clean would not be available.

The extra swap space may have contributed to the extra time?
Having more swap uses more kernel memory for keeping track of
the swap if I understand right. That leaves less for other
things. That could have consequences other than outright failure.

Quoting "man 8 loader" related to kern.maxswzone :

    Note that swap metadata can be fragmented, which means that
    the system can run out of space before it reaches the
    theoretical limit. Therefore, care should be taken to not
    configure more swap than approximately half of the
    theoretical maximum.
    Running out of space for swap metadata can leave the system
    in an unrecoverable state.

This wording suggests not allocating 6 GiBytes of swap when
3.5 GiBytes is approximately half the theoretical maximum
--even if the system does still operate with 6 GiBytes.

(Note: The man page's reference to "eight times the amount of
physical memory" and such does not seem to apply to all platforms.
And an rpi2 V1.1 and an rpi3 with the same amount of RAM get rather
different recommended figures according to the messages generated.)

> One chance observation caught my attention, however. I'd always thought
> the VM system would favor fast swap devices over slow, but the gstat log
> recorded this, visible at
> http://www.zefox.net/~fbsd/rpi3/swaptests/r338342/3gbsd_3gbusb/trim_on/swapscript.log
>
>
>
> dT: 10.004s  w: 10.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     3    175     91    673    4.0     84    701    4.0      0      0    0.0    24.4 mmcsd0
>     4    173     88    693  106.6     86    723  176.5      0      0    0.0   103.4 da0
>     1     58     30    224    4.5     28    220    4.1      0      0    0.0    14.5 mmcsd0s2b
>     3    175     91    673    4.0     84    701    4.0      0      0    0.0    24.7 mmcsd0s2
>     1     58     30    223    4.0     28    244    3.8      0      0    0.0    14.0 mmcsd0s2d
>     1     59     31    227    3.7     28    237    4.3      0      0    0.0    14.9 mmcsd0s2e
>     2     57     28    235  140.2     28    236  103.8      0      0    0.0   186.1 da0a
>     0     56     28    224  178.4     28    222   35.9      0      0    0.0   131.5 da0b
>     2     59     31    234    9.4     28    240   59.1      0      0    0.0    99.5 da0d
>     0      0      0      0    0.0      0      3  15011      0      0    0.0   150.1 da0e
>     0      1      0      0    0.0      1     22  13376      0      0    0.0   147.8 da0g

Are there any examples of "d/s kBps ms/d" being non-zero?
If they are always zero then no TRIMing likely happened.
That in turn would make TRIM an unlikely use of an extra hour.
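
One way to answer that question across the whole log, rather than by eye, is a small script. This is only a minimal sketch: it assumes every gstat sample line in the log keeps the 13-column layout of the excerpt quoted above, and the function name delete_activity is made up for illustration (it is not part of gstat or any FreeBSD tool).

```python
# Sketch: scan gstat output for non-zero delete (TRIM) statistics.
# Assumption: rows use the 13-column layout from the quoted log:
# L(q) ops/s r/s kBps ms/r w/s kBps ms/w d/s kBps ms/d %busy Name

def delete_activity(log_text):
    """Map provider name -> highest d/s seen, for rows with any delete activity."""
    seen = {}
    for line in log_text.splitlines():
        fields = line.lstrip("> ").split()   # tolerate mail quote prefixes
        if len(fields) != 13:
            continue                          # not a gstat data row
        try:
            values = [float(f) for f in fields[:12]]
        except ValueError:
            continue                          # header row or unrelated line
        d_per_s, d_kbps, ms_per_d = values[8:11]
        if d_per_s > 0 or d_kbps > 0 or ms_per_d > 0:
            name = fields[12]
            seen[name] = max(seen.get(name, 0.0), d_per_s)
    return seen


# Header plus three rows copied from the log excerpt quoted above.
SAMPLE = """\
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
    3    175     91    673    4.0     84    701    4.0      0      0    0.0    24.4 mmcsd0
    4    173     88    693  106.6     86    723  176.5      0      0    0.0   103.4 da0
    0      1      0      0    0.0      1     22  13376      0      0    0.0   147.8 da0g
"""

if __name__ == "__main__":
    hits = delete_activity(SAMPLE)
    print(hits if hits else "no delete (TRIM) activity in these samples")
```

Feeding the full swapscript.log text to delete_activity() in place of SAMPLE would list any provider that ever showed delete traffic. An empty result would be consistent with TRIM never having been exercised during the run.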

> Tue Sep  4 15:07:39 PDT 2018
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         1048576   236872   811704    23%
> /dev/mmcsd0s2b    1048576   221568   827008    21%
> /dev/da0d         1048576   218636   829940    21%
> /dev/da0a         1048576   222028   826548    21%
> /dev/mmcsd0s2d    1048576   221660   826916    21%
> /dev/mmcsd0s2e    1048576   221392   827184    21%
> Total             6291456  1342156  4949300    21%

As I understand the normal use of multiple swap partitions is
to split the load across channels that can operate independently
in parallel. Having 3 such partitions on the same channel/device
may only add overhead vs. one full-size partition per channel/device.

I also do not know if mmcsd0 and da0 can have independent, parallel
I/O activity in the rpi3 context.

> Sep  4 14:57:52 www sshd[41673]: error: Received disconnect from 103.207.39.197 port 64499:3: com.jcraft.jsch.JSchException: Auth cancel [preauth]
> Sep  4 15:04:19 www kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 2217840, size: 12288

Note: my context is very different from yours and I get no
console messages about I/O or waits during buildworld
buildkernel or other such build/install tests.

> The system has lots of fast swap available on microSD, but is seemingly choking
> trying to use the slow swap on da0 _and_ run traffic to /usr and /var. Buildworld
> doesn't run any faster with less swap, so I don't think the oversupply is the problem.

If I understand right, your only 6 GiByte swap experiment
was slower but you attributed all time variations to
an (inactive? ever used?) TRIM enabled status. You might
want to manipulate the two separately. For all I know
something else may also have contributed.

I've no clue if having so many swap partitions on the same
channel/device has consequences that having only one per
channel/device would avoid.

> Is this expected behavior?

As I understand the approximately even split across the
in-use swap partitions is the normal way things are split.
It is the placement of the partitions themselves that
contributes to how effective that split is at improving the
swap/paging I/O if I understand right.

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in
  early 2018-Mar)