Date: Tue, 14 Aug 2018 17:50:11 -0600
From: Warner Losh <imp@bsdimp.com>
To: bob prohaska <fbsd@www.zefox.net>
Cc: Mark Millard <marklmi@yahoo.com>, freebsd-arm <freebsd-arm@freebsd.org>, Mark Johnston <markj@freebsd.org>
Subject: Re: RPI3 swap experiments (grace under pressure)
Message-ID: <CANCZdfqFKY3Woa%2B9pVS5hika_JUAUCxAvLznSS4gaLq2kKoWtQ@mail.gmail.com>
In-Reply-To: <20180814014226.GA50013@www.zefox.net>
References: <20180809033735.GJ30738@phouka1.phouka.net> <20180809175802.GA32974@www.zefox.net> <20180812173248.GA81324@phouka1.phouka.net> <20180812224021.GA46372@www.zefox.net> <B81E53A9-459E-4489-883B-24175B87D049@yahoo.com> <20180813021226.GA46750@www.zefox.net> <0D8B9A29-DD95-4FA3-8F7D-4B85A3BB54D7@yahoo.com> <FC0798A1-C805-4096-9EB1-15E3F854F729@yahoo.com> <20180813185350.GA47132@www.zefox.net> <FA3B8541-73E0-4796-B2AB-D55CE40B9654@yahoo.com> <20180814014226.GA50013@www.zefox.net>
On Mon, Aug 13, 2018 at 7:42 PM, bob prohaska <fbsd@www.zefox.net> wrote:

> [Altered subject, philosophical question]
>
> On Mon, Aug 13, 2018 at 01:05:38PM -0700, Mark Millard wrote:
> >
> > Here there is architecture choice and goals/primary
> > contexts. FreeBSD is never likely to primarily target
> > anything with a workload like buildworld buildkernel
> > on hardware like rpi3's and rpi2 V1.1's and
> > Pine64+ 2GB's and so on.
> >
>
> I understand that the RPi isn't a primary platform for FreeBSD.
> But decent performance under overload seems like a universal
> problem that's always worth solving, whether for a computer or
> an office. The exact goals might vary, but coping with too much
> to do and not enough to do it with is humanity's oldest puzzle.
>
> Maybe I should ask what goals the OOMA process serves.
> I always thought an OS's goals were along the lines of:
> 1. maintain control
> 2. get the work done
> 3. remain responsive
>

Simplistically, one can view the VM system as a producer of dirty pages
and a cleaner of dirty pages. These happen at different rates, but are
usually closely matched: we're normally able to launder enough pages to
satisfy the need for new pages from the VM system (since clean pages can
just be thrown away without any loss of data). The problem happens when
we put a large load onto the creation side with a build. This generates
a lot of dirty pages, and we have to flush those dirty pages quickly to
keep up. When the backing store's write rate varies substantially over
time, we run into trouble: we're not able to clean enough pages to keep
up with demand. The system does what it can to slow demand down, but at
some point it just can't keep up and we trigger OOM.

I'm still firmly convinced that it's a combination of bugs that is
making the storage system less robust. The solution? Fix those bugs.
Once you do that, however, you're still stuck with the fact that crappy
hardware is crappy. Swapping to the ultra-low-end is still going to
suck. USB sticks and SD cards are generally geared toward long stretches
of sequential writes and random reads, since they're expected to go into
cameras or be used as sneaker net.

We might be able to avoid overloading the device so much via tweaks to
the swap-out code (to reduce its rate more quickly when the GC on the
card goes wonky). It might also be possible to write bigger, contiguous
blocks when swapping out, which would help avoid the read-modify-write
behavior on 'small' writes that grinds the performance of some USB/SD
flash devices into the ground. That would help this workload (and likely
others). It's tricky because you'd want to do that as part of a single
write, which has some awkward implications for the VM system. These can
be dealt with, of course. And the code that pages it out will need a
scatter/gather list so the DMA works right, so we have to be careful not
to exceed those limits. There's some clustering in the page-out code,
but the swapper looks like it could use some work... I've not studied it
closely enough to start that work. At Netflix we've seen some workloads
that suggest improvements there would be helpful for us, but I don't
know if that's the same problem or a different, related one.

So, philosophically, I agree that the system shouldn't suck. Making it
robust against suckage for extreme events that don't match the historic
usage of BSD, though, is going to take some work.

Warner
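
A minimal sketch of the small-write penalty described above (a
hypothetical illustration, not part of the original mail; the test file
path and I/O sizes are arbitrary assumptions): with O_SYNC every
page-sized write must reach the card individually, typically forcing a
read-modify-write of a much larger flash block, while megabyte-sized
writes let the card stream more or less sequentially.

/*
 * Rough micro-benchmark sketch (hypothetical, not from the thread):
 * time page-sized synchronous writes versus large clustered writes
 * to the same file, to get a feel for the read-modify-write penalty
 * on cheap USB/SD flash.  File size and I/O sizes are arbitrary.
 */
#include <sys/time.h>

#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define FILE_SIZE	(16 * 1024 * 1024)	/* 16 MiB test file */
#define SMALL_IO	(4 * 1024)		/* page-sized writes */
#define BIG_IO		(1024 * 1024)		/* "clustered" writes */

static double
now(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return (tv.tv_sec + tv.tv_usec / 1e6);
}

static void
run(const char *path, size_t iosize)
{
	char *buf;
	double t0;
	off_t off;
	int fd;

	/* O_SYNC so each write actually reaches the device. */
	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY | O_SYNC, 0644);
	if (fd == -1)
		err(1, "open %s", path);
	buf = malloc(iosize);
	if (buf == NULL)
		err(1, "malloc");
	memset(buf, 0xa5, iosize);

	t0 = now();
	for (off = 0; off < FILE_SIZE; off += iosize) {
		if (pwrite(fd, buf, iosize, off) != (ssize_t)iosize)
			err(1, "pwrite");
	}
	printf("%7zu-byte writes: %.1f s\n", iosize, now() - t0);

	free(buf);
	close(fd);
}

int
main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "testfile";

	run(path, SMALL_IO);
	run(path, BIG_IO);
	return (0);
}

Build with "cc -o writetest writetest.c" and point it at a file on the
SD card or USB stick; the gap between the two timings gives a rough
sense of how much clustering swap-out into bigger contiguous writes
could help on a given device.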