Date: Mon, 1 Aug 2016 22:08:43 +0200
From: Mateusz Guzik <mjguzik@gmail.com>
To: John Baldwin <jhb@freebsd.org>
Cc: Konstantin Belousov <kostikbel@gmail.com>, freebsd-current@freebsd.org
Subject: Re: [PATCH] randomized delay in locking primitives, take 2
Message-ID: <20160801200842.GC24633@dft-labs.eu>
In-Reply-To: <15005477.9uZ5EJCdhW@ralph.baldwin.cx>
References: <20160731095706.GB9408@dft-labs.eu> <20160731104928.GW83214@kib.kiev.ua> <20160731124113.GE9408@dft-labs.eu> <15005477.9uZ5EJCdhW@ralph.baldwin.cx>

On Mon, Aug 01, 2016 at 11:37:50AM -0700, John Baldwin wrote:
> On Sunday, July 31, 2016 02:41:13 PM Mateusz Guzik wrote:
> > On Sun, Jul 31, 2016 at 01:49:28PM +0300, Konstantin Belousov wrote:
> > [snip]
> >
> > After an irc discussion, the following was produced (also available at:
> > https://people.freebsd.org/~mjg/lock_backoff_complete4.diff):
> >
> > Differences:
> > - uint64_t usage was converted to u_int (also see r303584)
> > - currently unused features (cap limit and return value) were removed
> > - lock_delay args got packed into a dedicated structure
>
> lock_delay_enabled declaration seems to be stale?
>

Oops, thanks.

> I would maybe just provide a "standard" lock_delay_init function that the
> sysinit's use rather than duplicating the same exact code 3 times. I'm
> not sure we really want to use different tunables for different lock types
> anyway. (Alternatively we could even just have a single 'config' variable
> that is a global. We can always revisit this in the future if we find that
> we need that granularity, but it would remove an extra pointer indirection
> if you just had a single 'lock_delay_config' that was exported as a global
> for now and initialized in a single SYSINIT.)
>

The per-lock type config is partially an artifact of the real version of
the patch, which has different configs per state of the lock; see the
loops with rowner_loops in the current implementation of rw and sx locks.
This is where it mattered. It was cut from this patch for simplicity
(90% of the benefit for 10% of the work).

That said, when fine-tuned it does matter for "mere" spinning as well,
but here I put very low values on purpose.

Putting them all in one config makes for a small compatibility issue,
where debug.lock.delay_* sysctls would disappear later.

So I would prefer to just keep this as I don't think it matters much.

I have further optimisations to the primitives not related to spinning.
They boil down to the fact that KDTRACE_HOOKS-enabled kernels contain an
unconditional function call to lockstat_nsecs even with the lock held.

> I think the idea is fine. I'm less worried about the overhead of the
> divide as you are only doing it when you are contesting (so you are already
> sort of hosed anyway). Long delays in checking the lock cookie can be
> bad (see my local APIC snafu which only polled once per microsecond). I
> don't really think a divide is going to be that long?
>

This should be perfectly fine. One could argue the time wasted should be
wasted efficiently, i.e. the more cpu_spinwait, the better, at least on
amd64.

-- 
Mateusz Guzik <mjguzik gmail.com>
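
For readers following the thread, the randomized delay being discussed can
be sketched in userspace C roughly as follows. This is only a sketch under
stated assumptions, not the code from the patch: the structure and function
names (delay_config, delay_arg, backoff_delay), the field set, and the
values are all hypothetical. A xorshift generator stands in for whatever
cheap entropy source the kernel would use, and a compiler barrier (GCC/Clang
syntax) stands in for cpu_spinwait. The sketch returns the number of
iterations spun purely so the behaviour can be inspected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tunables; names and values are illustrative only. */
struct delay_config {
	unsigned int initial;	/* first delay bound, in spin iterations */
	unsigned int step;	/* added to the bound on each further failure */
	unsigned int min;	/* never spin fewer iterations than this */
	unsigned int max;	/* the bound never grows past this */
};

/* Per-acquisition state packed into one structure. */
struct delay_arg {
	const struct delay_config *config;
	unsigned int delay;	/* current bound; 0 means first contention */
	unsigned int spin_cnt;	/* total iterations spun so far */
	uint32_t seed;		/* xorshift state, must be nonzero */
};

/* Stand-in random source (assumption): xorshift32. */
static uint32_t
next_random(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return (x);
}

/* Stand-in for cpu_spinwait(): a barrier so the loop is not elided. */
static void
spin_hint(void)
{
	__asm__ volatile("" ::: "memory");
}

/* Grow the bound, pick a random point below it, spin that long. */
static unsigned int
backoff_delay(struct delay_arg *da)
{
	const struct delay_config *dc = da->config;
	unsigned int backoff, i;

	assert(dc->initial > 0 && dc->max >= dc->initial);

	if (da->delay == 0)
		da->delay = dc->initial;
	else {
		da->delay += dc->step;
		if (da->delay > dc->max)
			da->delay = dc->max;
	}

	/* The single divide, paid only on the contended path. */
	backoff = next_random(&da->seed) % da->delay;
	if (backoff < dc->min)
		backoff = dc->min;
	for (i = 0; i < backoff; i++)
		spin_hint();

	da->spin_cnt += backoff;
	return (backoff);
}
```

A caller would invoke backoff_delay() each time an attempt to take the lock
fails and then re-check the lock word. Randomizing within a growing bound
is meant to keep contending CPUs from retrying in lockstep.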