Date: Tue, 29 Nov 2016 08:33:24 -0700
From: Warner Losh <imp@bsdimp.com>
To: Fabian Keil <freebsd-listen@fabiankeil.de>
Cc: David Cross <dcrosstech@gmail.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject: Re: FreeBSD 11 i386 disk deadlock (I think) (now with reproduction steps!)
Message-ID: <CANCZdfq==1UNPE6k5jh9oQ=at-pyyc5v=jOCou647cTFwMu1yQ@mail.gmail.com>
In-Reply-To: <20161129131738.792efbd1@fabiankeil.de>
References: <CAM9edeMYMhnkWid7Lig5D-FjhahniFm0VbFRm8ysyb85h29wXg@mail.gmail.com>
 <20161128041847.GA65249@charmander>
 <20161128120046.GP54029@kib.kiev.ua>
 <CAM9edeNDWcJ7R_%2B_Q%2BMksVcL_pcJVR%2BO7t98s5XyfmOpXgc-zw@mail.gmail.com>
 <20161128144135.10f93205@fabiankeil.de>
 <20161128160311.GQ54029@kib.kiev.ua>
 <20161128162240.GM99742@zxy.spb.ru>
 <CAM9edePGR_XBNpKctX9%2Bsr6y2SAROhtRvD_bUq3TsFyUqnOFFg@mail.gmail.com>
 <20161129131738.792efbd1@fabiankeil.de>

On Tue, Nov 29, 2016 at 5:17 AM, Fabian Keil
<freebsd-listen@fabiankeil.de> wrote:
> David Cross <dcrosstech@gmail.com> wrote:
>
>> I wouldn't call this a 'workaround', but the right answer. Something in
>> the disk io path shouldn't be allocating memory out of the pool that can
>> cause paging (since any of that could be IN the path for paging). It was
>> what I assumed Fabian's proposed patch was.
>
> That's indeed what the patch does (for geli).

I took a look at the patch. I think it's the wrong approach in the
detail, though the general idea is good. It seems good enough to work
around the problem.

I think it would be better to have a pre-allocated area for one write
of a certain size. We'd normally not use this at all. In the write
path, we'd try to allocate what we need, and if that fails, we push
down one write with the pre-allocated area. We queue further writes
that fail to allocate the area they need. Once the one write that's
using the pre-allocated area is done, we push down another one. This
allows us to always make progress.

Bonus points if you can do this only for the swapper. To do that
latter bit requires help from the swapper. I've been working on some
back-pressure into the VM layer to replace the current runningbuf
limiter. Part of that work assigns a priority to the I/Os that's
visible down the stack. That could be used to determine whether to dip
into the reserve or not and may produce better results when we're not
in a memory starved situation. It would be better to know you need to
do this than to guess based on it being a onetime provider.

Warner
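
The reserve scheme proposed in the message above can be illustrated with a small user-space model. This is only a sketch of the idea, not geli or GEOM code: the class name `ReservedWritePath`, its methods, and the `allocator` callback are all hypothetical names invented for illustration, and a real kernel implementation would also need locking and I/O completion callbacks.

```python
from collections import deque


class ReservedWritePath:
    """Toy model (hypothetical, not kernel code) of a write path that
    falls back to one pre-allocated buffer when allocation fails."""

    def __init__(self, allocator, reserve_size):
        # allocator(size) returns a buffer, or None on failure
        # (standing in for a non-sleeping kernel allocation).
        self.allocator = allocator
        self.reserve = bytearray(reserve_size)  # allocated once, up front
        self.reserve_busy = False
        self.waiting = deque()   # writes that could not get any buffer
        self.in_flight = {}      # write_id -> (buffer, used_reserve)

    def _try_start(self, write_id, size):
        """Start a write if a buffer is available; return how, or None."""
        buf = self.allocator(size)
        if buf is not None:
            self.in_flight[write_id] = (buf, False)
            return "allocated"
        if not self.reserve_busy:
            # Dip into the reserve: at most one write uses it at a time.
            self.reserve_busy = True
            self.in_flight[write_id] = (self.reserve, True)
            return "reserve"
        return None

    def submit(self, write_id, size):
        if size > len(self.reserve):
            # Assumption of this sketch: larger writes would have to be
            # split by the caller so the reserve can always serve them.
            raise ValueError("write larger than the reserve buffer")
        how = self._try_start(write_id, size)
        if how is None:
            self.waiting.append((write_id, size))
            return "queued"
        return how

    def complete(self, write_id):
        """Called when a write finishes; restarts queued writes in order."""
        _, used_reserve = self.in_flight.pop(write_id)
        if used_reserve:
            self.reserve_busy = False
        while self.waiting:
            next_id, size = self.waiting[0]
            if self._try_start(next_id, size) is None:
                break
            self.waiting.popleft()
```

Even with an allocator that always fails, the model keeps one write in flight at a time through the reserve, which is the forward-progress property the message argues for. Restricting the reserve to swapper I/O would need the priority information mentioned above, which this sketch does not model.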