Date:        Mon, 20 Aug 2018 08:59:41 -0600
From:        Warner Losh <imp@bsdimp.com>
To:          Mark Millard <marklmi@yahoo.com>
Cc:          bob prohaska <fbsd@www.zefox.net>, freebsd-arm <freebsd-arm@freebsd.org>, Mark Johnston <markj@freebsd.org>
Subject:     Re: RPI3 swap experiments ["was killed: out of swap space" with: "v_free_count: 5439, v_inactive_count: 1"]
Message-ID:  <CANCZdfroR72T_9Omo0bAoWLs9JsTPvJ4xnYrqabSgUuX-Pe7qg@mail.gmail.com>
In-Reply-To: <BD92A3B6-BCD2-4FA5-A9C6-0B950D3AD378@yahoo.com>
References:  <20180808153800.GF26133@www.zefox.net> <20180808204841.GA19379@raichu> <2DC1A479-92A0-48E6-9245-3FF5CFD89DEF@yahoo.com> <20180809033735.GJ30738@phouka1.phouka.net> <20180809175802.GA32974@www.zefox.net> <20180812173248.GA81324@phouka1.phouka.net> <20180812224021.GA46372@www.zefox.net> <B81E53A9-459E-4489-883B-24175B87D049@yahoo.com> <20180813021226.GA46750@www.zefox.net> <0D8B9A29-DD95-4FA3-8F7D-4B85A3BB54D7@yahoo.com> <20180815221728.GA59074@www.zefox.net> <9EA5D75D-A03F-4B25-B65E-03E93DE30130@yahoo.com> <BD92A3B6-BCD2-4FA5-A9C6-0B950D3AD378@yahoo.com>
On Mon, Aug 20, 2018 at 8:00 AM, Mark Millard via freebsd-arm <
freebsd-arm@freebsd.org> wrote:

> There is a way to explore Mark Johnston's swap information
> reports (from his patches for such).
>
> Taking a Pine64+ 2GB as an example (4 cores with 1
> HW-thread per core, 2 GiBytes of RAM, USB device for
> root file system and swap partition):
>
> In one login:
> # nice -20 gstat -pd
>
> In another login:
> # stress -d 2 -m 4 --vm-keep --vm-bytes 536870912
>
> That "4" and "536870912" total to the 2 GiBytes, so
> swapping is induced for the context in question.
> (Scale --vm-bytes appropriately to context.)
>
> [stress is from sysutils/stress .]
>
> gstat tends to show things such as:
>
> dT: 1.006s  w: 1.000s
>  L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w    d/s   kBps   ms/d   %busy Name
>     0      0      0      0    0.0      0      0    0.0      0      0    0.0    0.0| mmcsd0
>    56    312      0      0    0.0    312  19985  142.6      0      0    0.0   99.6| da0
>
> where the ms/w and kBps are fairly stable but the queue
> length L(q) is widely variable. For the above it makes
> the likes of 56 writes queued * 142.6 ms/write (mean)
> [as an estimate] a rather large total time for the last
> of the queued writes to complete. (If I understand how
> to interpret the above.)

No. 142.6 ms/write is the average of the time that the operations that
completed during the polling interval took to complete. There's no
estimating here. So, with 6 or 7 operations completing per second
(1 / 0.1426 s is about 7/s), coupled with a parallelism factor of 1
(typical for low-end junk flash), we wind up with the 56 operations in
the queue taking 8-10s to complete (56 / 7 is about 8s).

> It appears to me that, compared to an observed capacity of
> roughly 20 MiBytes/sec for writes, large amounts of bytes
> are being queued up to be written in a short time, for
> which it just takes a while for the backlog to be finished.

Yes. That matches my expectation as well. On other devices, I've found
that I needed to rate-limit writes to more like 50-75% of the maximum
value to keep variance in performance low. It's the whole reason I wrote
the CAM I/O scheduler.

> The following is from multiple such runs, several manually
> stopped but some killed because of sustained low free
> memory. I had left vm.pageout_oom_seq=12 in place for this,
> making the kills easier to get than the 120 figure would. It
> does not take very long generally for some sort of message to
> show up.
>
> waited 9s for async swap write
> waited 9s for swap buffer
> waited 9s for async swap write
> waited 9s for async swap write
> waited 9s for async swap write
> v_free_count: 1357, v_inactive_count: 1
> Aug 20 06:04:27 pine64 kernel: pid 1010 (stress), uid 0, was killed: out of swap space
> waited 5s for async swap write
> waited 5s for swap buffer
> waited 5s for async swap write
> waited 5s for async swap write
> waited 5s for async swap write
> waited 13s for async swap write
> waited 12s for swap buffer
> waited 13s for async swap write
> waited 12s for async swap write
> waited 12s for async swap write
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161630, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161630, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161177, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164766, size: 65536
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 161630, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 164064, size: 12288
> waited 65s for async swap write
> waited 65s for swap buffer
> waited 65s for async swap write
> waited 65s for async swap write
> waited 65s for async swap write
> v_free_count: 955, v_inactive_count: 1
> Aug 20 06:11:49 pine64 kernel: pid 1047 (stress), uid 0, was killed: out of swap space
> waited 5s for async swap write
> waited 5s for swap buffer
> waited 5s for async swap write
> waited 5s for async swap write
> waited 5s for async swap write
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314021, size: 12288
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314084, size: 32768
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314856, size: 32768
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314638, size: 131072
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312518, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312416, size: 16384
> waited 39s for async swap write
> waited 39s for swap buffer
> waited 39s for async swap write
> waited 39s for async swap write
> waited 39s for async swap write
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314802, size: 24576
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 314934, size: 40960
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 315617, size: 32768
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 312484, size: 4096
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 328616, size: 131072
> waited 31s for async swap write
> waited 31s for swap buffer
> waited 31s for async swap write
> waited 31s for async swap write
> waited 31s for async swap write
> waited 5s for async swap write
> waited 5s for swap buffer
> waited 5s for async swap write
> waited 5s for async swap write
> waited 5s for async swap write

These numbers are consistent with the theory that the swap device becomes
overwhelmed, spiking latency and causing crappy down-stream effects. You
can use the I/O scheduler to limit the write rates at the low end.
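A minimal sketch of what turning that on looks like, assuming a custom
kernel configuration (the config name and file placement below are invented
for illustration; CAM_IOSCHED_DYNAMIC is the kernel option that builds the
dynamic scheduler in, and the write-limiter sysctls it exposes, which I
believe appear under kern.cam.da.0.iosched for a USB swap disk, differ in
name across releases):

    # Hypothetical kernel config, e.g. sys/arm64/conf/RPI3-IOSCHED.
    # Start from GENERIC and add the dynamic CAM I/O scheduler.
    include GENERIC
    ident   RPI3-IOSCHED
    options CAM_IOSCHED_DYNAMIC    # dynamic CAM I/O scheduler

With that in place, the idea is to cap the sustained write rate to the
50-75% range mentioned above so the device never builds up a multi-second
backlog like the one in the quoted log.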
You might also be able to schedule a lower write queue depth at the top
end as well, but I've not seen good ways to do that.

Warner
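For reference, a sketch of the reproduction recipe from the quoted message,
scaled from the machine's installed RAM rather than hard-coding 536870912
(the divisor matches the -m 4 worker count; the worker counts and the
optional vm.pageout_oom_seq value are illustrative assumptions, and stress
is from sysutils/stress as noted above):

    # Run "nice -20 gstat -pd" in one login as above; in another login:
    physmem=$(sysctl -n hw.physmem)        # installed RAM in bytes
    vmbytes=$((physmem / 4))               # per-worker allocation for 4 vm workers
    sysctl vm.pageout_oom_seq=120          # optional: make OOM kills harder to reach
    stress -d 2 -m 4 --vm-keep --vm-bytes "$vmbytes"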