Date: Fri, 15 Jun 2018 09:57:17 -0600
From: Warner Losh <imp@bsdimp.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: Mark Millard <marklmi@yahoo.com>, "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, "Rodney W. Grimes" <freebsd-rwg@pdx.rh.cn85.dnsmgr.net>
Subject: Re: GPT vs MBR for swap devices
Message-ID: <CANCZdfoCA=E=Sh2X6H=Fi-TBkhiTdyzXAkjXr4usa8ie6%2Buo4g@mail.gmail.com>
In-Reply-To: <20180615154334.GA39777@www.zefox.net>
References: <20180614175622.GC35161@www.zefox.net> <201806142110.w5ELAL0N046840@pdx.rh.CN85.dnsmgr.net> <20180615035225.GA37370@www.zefox.net> <CANCZdfoNasSpvEN-y3bzsDfWT=_atfp62AKvdpwK8bUQKi=bgA@mail.gmail.com> <20180615051527.GB37370@www.zefox.net> <834EA7A6-B567-436F-96B2-0C75FACA3FF9@yahoo.com> <20180615154334.GA39777@www.zefox.net>
On Fri, Jun 15, 2018 at 9:43 AM, bob prohaska <fbsd@www.zefox.net> wrote:

> On Thu, Jun 14, 2018 at 11:37:48PM -0700, Mark Millard wrote:
> >
> > When I look at:
> >
> > # vmstat -c -w 5
> >  procs    memory        page                  disks     faults        cpu
> >  r b w   avm   fre   flt  re  pi  po    fr   sr da0 ad0   in    sy    cs us sy  id
> >  1 0 0  416M  224M  1647   1   0   0  1856  142   0   0  144  1791  1024  4  2  94
> >  0 0 0  416M  224M     9   0   0   0     0    1   0   0    4    85   116  0  0 100
> >  0 0 0  416M  224M    12   0   0   0     0    1   0   0    2    93   113  0  0 100
> >  0 0 0  416M  224M     9   0   0   0     2    1   1   0    4    64   121  0  0 100
> > . . .
> >
> > and "man vmstat" I do not see any column that is the swap space
> > usage (nor any combination of columns to do such a calculation
> > from).
> >
> > I do not expect that vmstat reports what you are likely/primarily
> > looking for.
> >
> > An example is "avm" which for which the man page reports:
> >
> >     . . . Note that the entire
> >     memory object's size is considered mapped even if only a subset
> >     of the object's pages are currently mapped. This statistic is
> >     not related to the active page queue which is used to track real
> >     memory.
> >
> > The free list size ("fre") is not sufficient either.
> >
>
> That seems astonishing. I imagined that among those columns _had_ to be
> reads from and writes to the swap partitions.
>
> It looks as if
> top -d 1000 | grep Swap
> produces a running list of swap usage, but one must guess how many
> times to iterate:
>
> bob@www:/usr/src % top -d 1000 | grep Swap
> Swap: 3072M Total, 30M Used, 3041M Free
> Swap: 3072M Total, 30M Used, 3041M Free
> Swap: 3072M Total, 30M Used, 3041M Free
> Swap: 3072M Total, 30M Used, 3041M Free
> Swap: 3072M Total, 30M Used, 3041M Free
> .......
>
> Replacing the "1000" with "0" or "infinite" triggers
> a syntax error. Is there a special parameter that makes top run till
> it's killed, as in interactive mode? I didn't recognize any hint in the
> man page.
>
> Thanks for reading!
>

Right, this is why I was suggesting gstat.
It's a direct measure of the read/write performance of the device, with some latency numbers. It will give the kind of data I'm looking for. vmstat won't, top won't. I don't care about used/free swap usage. I care about performance to the swap partition. That's what I'm suspecting in the USB thumb drive FTL. I don't care what the total swap usage is. I suspect that's irrelevant to the issue at hand: the OOM isn't triggering because we're filling swap, but because we can't get pages out to the swap device fast enough to satisfy the memory shortages.

As for why it would affect the USB drive and not SD cards, I can only say that USB drives tend to be first to market with bigger capacities. This has traditionally made them less well tuned for anything other than large, long sequential reads or writes that aren't mixed. More so than even SD or uSD cards, which tend to do better than USB drives at that workload. It's the FTL that's the issue, not the NAND itself. The FTL is the software that translates the log-style device you have to have for flash to work to the LBA-style devices that people attach to systems. If it can't cope with a mixed workload, or needs to do too much garbage collection or too many read/modify/write operations due to its poor quality / tuning, that will show up as long delays. USB flash also tends to suck more with BIO_DELETE than others, though the swapper doesn't do that, so that's one fewer wildcard we need to look at.

    gstat -Bd -I 10 -f <regexp for your swap partition> > gstat-swap-data.dat

would be how I'd recommend collecting it. This file may get kinda big depending on how long it takes to trigger the weird state. I'm hoping that if you put this on a known good device, we'll power through the issues. We might not get perfect correlation with this, but the data should show all kinds of crazy before the system drives off the cliff if I'm right, so we don't need perfect data.
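
A minimal sketch of how the collected file could be scanned afterwards for long write latencies. It assumes the usual gstat batch layout with -d (columns L(q), ops/s, r/s, kBps, ms/r, w/s, kBps, ms/w, d/s, kBps, ms/d, %busy, Name, with a "|" after %busy); if your gstat prints a different layout the field indexes need adjusting. The names Sample, parse_gstat, and slow_writes and the 100 ms threshold are hypothetical, picked only for illustration.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Sample(NamedTuple):
    """One per-device row of gstat batch output (assumed -d layout)."""
    queue: int        # L(q): outstanding requests
    write_ops: float  # w/s
    write_ms: float   # ms/w: average write latency
    busy_pct: float   # %busy
    name: str         # provider name, e.g. the swap partition


def parse_gstat(lines: Iterable[str]) -> Iterator[Sample]:
    """Yield a Sample for every data row; skip dT: lines and column headers."""
    for line in lines:
        # %busy is printed as "99.5|", so split the bar off first.
        fields = line.replace("|", " ").split()
        if len(fields) != 13:
            continue  # not a 13-column row in the assumed layout
        try:
            nums = [float(f) for f in fields[:12]]
        except ValueError:
            continue  # column header row ("L(q) ops/s ...")
        yield Sample(int(nums[0]), nums[5], nums[7], nums[11], fields[12])


def slow_writes(samples: Iterable[Sample],
                threshold_ms: float = 100.0) -> List[Sample]:
    """Return samples whose average write latency reaches the threshold.

    The 100 ms default is an arbitrary illustration, not a tuned value.
    """
    return [s for s in samples if s.write_ms >= threshold_ms]
```

Feeding it the lines of gstat-swap-data.dat (for example via open("gstat-swap-data.dat")) and printing the result of slow_writes() should make the "all kinds of crazy" samples easy to spot without reading the whole file.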
There are some higher fidelity numbers we can get from the I/O scheduler with dynamic scheduling compiled in, but I don't think we'll need those.

Warner
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CANCZdfoCA=E=Sh2X6H=Fi-TBkhiTdyzXAkjXr4usa8ie6%2Buo4g>