Date: Mon, 28 Nov 2016 16:19:28 -0700
From: Warner Losh <imp@bsdimp.com>
To: David Cross <dcrosstech@gmail.com>
Cc: Slawa Olhovchenkov <slw@zxy.spb.ru>, Konstantin Belousov <kostikbel@gmail.com>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>, Fabian Keil <freebsd-listen@fabiankeil.de>
Subject: Re: FreeBSD 11 i386 disk deadlock (I think) (now with reproduction steps!)
Message-ID: <CANCZdfoYiXAU8cXvb9xf3g9rsoGTm62S_xOu2mMeF9Da_ith_w@mail.gmail.com>
In-Reply-To: <CAM9edePGR_XBNpKctX9%2Bsr6y2SAROhtRvD_bUq3TsFyUqnOFFg@mail.gmail.com>
References: <CAM9edeMYMhnkWid7Lig5D-FjhahniFm0VbFRm8ysyb85h29wXg@mail.gmail.com> <20161128041847.GA65249@charmander> <20161128120046.GP54029@kib.kiev.ua> <CAM9edeNDWcJ7R_%2B_Q%2BMksVcL_pcJVR%2BO7t98s5XyfmOpXgc-zw@mail.gmail.com> <20161128144135.10f93205@fabiankeil.de> <20161128160311.GQ54029@kib.kiev.ua> <20161128162240.GM99742@zxy.spb.ru> <CAM9edePGR_XBNpKctX9%2Bsr6y2SAROhtRvD_bUq3TsFyUqnOFFg@mail.gmail.com>

On Mon, Nov 28, 2016 at 10:50 AM, David Cross <dcrosstech@gmail.com> wrote:
> I wouldn't call this a 'workaround', but the right answer. Something in
> the disk io path shouldn't be allocating memory out of the pool that can
> cause paging (since any of that could be IN the path for paging). It was
> what I assumed Fabian's proposed patch was.
>
> From looking at the process list on my machine, it seems that geli
> allocates a process per core per provider, is there a reason to not have
> each of these on startup allocate themselves a single buffer of
> sector-size, and just put all operations through that? You're not
> (realistically) going to get more concurrency than that. I guess another
> approach would be to pre-allocate a ring buffer of the desired operational
> depth.. but that seems overkill.

I have some code that helps fix this in the GEOM layer. For the
swapper, it will allocate out of a pool of memory that's set aside for
that. While it is still a pool, the only time things are allocated out
of it is when the swapper is swapping stuff out. So if you hit a
resource shortage and have to wait, you know the wait will be bounded
unless the disk I/O never completes. This is already weakly done with
UMA, but the guarantees aren't strong enough that we'll always make
progress.

There are other places in the stack that allocate shared resources,
but this one bit us at Netflix. I've not yet cleaned up the patches
for upstreaming... I want to let the recent vm changes settle before
tackling this again as well...

Warner

> On Mon, Nov 28, 2016 at 11:22 AM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
>
>> On Mon, Nov 28, 2016 at 06:03:11PM +0200, Konstantin Belousov wrote:
>>
>> > On Mon, Nov 28, 2016 at 02:43:30PM +0100, Fabian Keil wrote:
>> > > David Cross <dcrosstech@gmail.com> wrote:
>> > >
>> > > > This is certainly new behavior, or a new manifestation.
>> > >
>> > > Recently a couple of uma consumers were changed to share uma zones
>> > > instead of using a dedicated zone. As a result geli competes with
>> > > more uma consumers and is more likely to deadlock. The bug isn't
>> > > new, it's just triggered more often now.
>> > The problem happens on layer much lower than UMA, it is whole reusable
>> > page pool which is depleted and cannot be re-filled without allocating
>> > more memory. If you think about it, the deadlock is obviously trivial:
>> > pagedaemon is the main source of the free pages, but if producing free
>> > page requires allocating one, low memory condition is equal to deadlock.
>> >
>> > It was always there, in the sense that for all versions of freebsd, if
>> > file/disk write path requires memory allocation, there is the trouble.
>> >
>> > For geom, some special unique measures were taken so that bio allocations
>> > do not cause the issue in typical situations.
>>
>> Typical workaround for this is pre-allocate some memory for this
>> operation.
>>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
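
[Editorial illustration, not part of the original message.] The reserved
pool described in Warner's reply can be sketched in userland C as follows.
This is not his GEOM patch, which was not posted; it only illustrates the
general idea under stated assumptions. All names here (io_pool, pool_init,
pool_get, pool_try_get, pool_put, POOL_NBUF, POOL_BUFSZ) are hypothetical,
and pthreads stands in for kernel locking primitives. Every buffer is set
aside up front, so the I/O path never calls a general-purpose allocator,
and a caller that finds the pool empty waits only until an in-flight I/O
returns its buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define POOL_NBUF  4     /* hypothetical: desired I/O depth */
#define POOL_BUFSZ 4096  /* hypothetical: one page or sector per buffer */

/* A fixed reserve of buffers, carved out once and never grown. */
struct io_pool {
	pthread_mutex_t lock;
	pthread_cond_t  avail;               /* signalled when a buffer returns */
	size_t          nfree;               /* number of entries in freelist */
	void           *freelist[POOL_NBUF]; /* stack of free buffers */
	unsigned char   storage[POOL_NBUF][POOL_BUFSZ];
};

/* Set up the pool; all memory is already inside the struct. */
void
pool_init(struct io_pool *p)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->avail, NULL);
	for (size_t i = 0; i < POOL_NBUF; i++)
		p->freelist[i] = p->storage[i];
	p->nfree = POOL_NBUF;
}

/* Non-blocking variant: returns NULL when the reserve is exhausted. */
void *
pool_try_get(struct io_pool *p)
{
	void *buf = NULL;

	pthread_mutex_lock(&p->lock);
	if (p->nfree > 0)
		buf = p->freelist[--p->nfree];
	pthread_mutex_unlock(&p->lock);
	return (buf);
}

/*
 * Blocking variant: sleeps until a buffer is returned. The wait is
 * bounded by the completion of an outstanding I/O, not by the state
 * of the system-wide page pool.
 */
void *
pool_get(struct io_pool *p)
{
	void *buf;

	pthread_mutex_lock(&p->lock);
	while (p->nfree == 0)
		pthread_cond_wait(&p->avail, &p->lock);
	buf = p->freelist[--p->nfree];
	pthread_mutex_unlock(&p->lock);
	return (buf);
}

/* Return a buffer to the reserve, e.g. from an I/O completion handler. */
void
pool_put(struct io_pool *p, void *buf)
{
	pthread_mutex_lock(&p->lock);
	assert(p->nfree < POOL_NBUF);        /* catch double release */
	p->freelist[p->nfree++] = buf;
	pthread_cond_signal(&p->avail);
	pthread_mutex_unlock(&p->lock);
}
```

With POOL_NBUF set to 1 and one pool per worker, this reduces to the
single sector-sized buffer per geli thread that David suggests; a larger
value corresponds to his pre-allocated ring of the desired depth.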