Date: Mon, 9 Mar 2020 23:35:12 +0200
From: Konstantin Belousov <kostikbel@gmail.com>
To: Mateusz Guzik <mjguzik@gmail.com>
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r358733 - head/sys/sys
Message-ID: <20200309213512.GS98340@kib.kiev.ua>
In-Reply-To: <CAGudoHE4hTN5niCAsA4tfggywkEnGxiaoPz=rTX3AHA=fgZfkQ@mail.gmail.com>
References: <202003080022.0280MX9j055805@repo.freebsd.org> <CAGudoHE4hTN5niCAsA4tfggywkEnGxiaoPz=rTX3AHA=fgZfkQ@mail.gmail.com>
On Mon, Mar 09, 2020 at 01:56:17AM +0100, Mateusz Guzik wrote:
> On 3/8/20, Mateusz Guzik <mjg@freebsd.org> wrote:
> > Author: mjg
> > Date: Sun Mar 8 00:22:32 2020
> > New Revision: 358733
> > URL: https://svnweb.freebsd.org/changeset/base/358733
> >
> > Log:
> >   seqc: tidy up
> >
> >   - avoid hand-rolled read
> >   - match begin/end in terms of fence style
>
> There were off-list questions about this so let me clarify.
> The first bit is a cosmetic change, but the second one is not.
>
> > Modified:
> >   head/sys/sys/seqc.h
> >
> > Modified: head/sys/sys/seqc.h
> > ==============================================================================
> > --- head/sys/sys/seqc.h	Sat Mar 7 15:37:23 2020	(r358732)
> > +++ head/sys/sys/seqc.h	Sun Mar 8 00:22:32 2020	(r358733)
> > @@ -66,7 +66,8 @@ static __inline void
> >  seqc_write_end(seqc_t *seqcp)
> >  {
> >
> > -	atomic_store_rel_int(seqcp, *seqcp + 1);
> > +	atomic_thread_fence_rel();
> > +	*seqcp += 1;
> >  	MPASS(!seqc_in_modify(*seqcp));
> >  	critical_exit();
> >  }
>
> For correct operation the counter has to be modified *before* any work
> is done and *after* it is completed.
>
> Consider a trivial case:
> seqc++;
> foo[0] = 0;
> foo[1] = 1;
> seqc++;
>
> There are 2 ways in which this can be mucked with:
> - the compiler can be looking at reordering or reworking the increment
>   (e.g., to collapse it to just += 2 instead). A compiler barrier
>   prevents that.
> - even with generated machine code which increments in correct places
>   the cpu can be looking at reordering the operation (which is heavily
>   dependent on architecture), in particular it could end up issuing
>   seqc++; foo[1] = 1; which would defeat the point. This is where
>   release fences come in.
>
> With the patched code there is:
> seqc++;
> atomic_thread_fence_rel(); /* make sure seqc++ is issued before foo
> starts getting modified */
> foo[0] = 0;
> foo[1] = 1;
> atomic_thread_fence_rel(); /* make sure modifications to foo don't
> leak past this spot */
> seqc++;
>
> In comparison, the previous code was:
> seqc++;
> atomic_thread_fence_rel(); /* make sure seqc++ is issued before foo
> starts getting modified */
> foo[0] = 0;
> foo[1] = 1;
> atomic_store_rel_int(seqcp, *seqcp + 1); /* make sure modifications to
> foo don't leak past this spot and update seqc immediately */
>
> There are 2 differences here -- one is an improvement and the second
> one does not matter.
>
> The win is: the previous code forces the compiler to compute an
> incremented value and then store it. On amd64 this translates to the
> following (with rdx holding the address of the counter):
>
> mov    (%rdx),%eax  /* load the value */
> add    $0x1,%eax    /* add 1 */
> mov    %eax,(%rdx)  /* store it */
>
> On a patched kernel this is:
> addl   $0x1,(%rdx)
>
> which is clearly much nicer.

I am not sure that the new code on amd64 is much nicer. But the point
was that on non-amd64, i.e. armv8 and potentially on powerpc 3.0, the
patch substitutes the release store with a full barrier. The latter is
(much) slower.

> The not-a-problem: the code does not issue the release fence after
> incrementing the counter the second time, meaning that in principle it
> can happen arbitrarily late, possibly disturbing code which waits for
> the counter to become even. This is not a problem because whoever
> modifies it is supposed to have a lock and release it shortly after,
> which also posts the release fence and consequently makes sure the
> counter update is visible.
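For illustration, here is a minimal sketch of that locked-writer
pattern. The structure, field and function names are made up;
seqc_write_begin()/seqc_write_end() are the sys/seqc.h routines from
the diff above and mtx_lock()/mtx_unlock() are the usual mutex(9)
primitives:

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/seqc.h>

struct obj {
	struct mtx	o_lock;
	seqc_t		o_seqc;
	int		o_foo[2];
};

static void
obj_update(struct obj *o)
{

	mtx_lock(&o->o_lock);		/* serializes writers */
	seqc_write_begin(&o->o_seqc);	/* counter goes odd, release fence */
	o->o_foo[0] = 0;
	o->o_foo[1] = 1;
	seqc_write_end(&o->o_seqc);	/* release fence, counter goes even */
	/*
	 * There is no fence after the second increment, but mtx_unlock()
	 * has release semantics, so the even counter value is published
	 * no later than the lock release.
	 */
	mtx_unlock(&o->o_lock);
}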
> Also note these routines come with preemption disablement around the
> modification -- if the thread gets preempted as a result of
> critical_exit from seqc_write_end, it will also publish the updated
> counter. If the updater never does anything which flushes the store
> buffer then in the absolute worst case they will get preempted, at
> which point once more we are covered. Note that code which wants more
> deterministic behavior will want to seqc_read_any, test the counter
> once and if it is caught uneven resort to locking.
>
> > @@ -84,7 +85,7 @@ seqc_read(const seqc_t *seqcp)
> >  	seqc_t ret;
> >
> >  	for (;;) {
> > -		ret = atomic_load_acq_int(__DECONST(seqc_t *, seqcp));
> > +		ret = seqc_read_any(seqcp);
> >  		if (__predict_false(seqc_in_modify(ret))) {
> >  			cpu_spinwait();
> >  			continue;
> >
> > _______________________________________________
> > svn-src-all@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/svn-src-all
> > To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org"
>
> --
> Mateusz Guzik <mjguzik gmail.com>
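To make the reader side of that scheme concrete, here is a matching
sketch using the same made-up struct obj as above. seqc_read(),
seqc_read_any() and seqc_in_modify() appear in the diff; seqc_consistent()
is assumed to be the sys/seqc.h helper that re-checks the counter with an
acquire fence:

static void
obj_snapshot(struct obj *o, int out[2])
{
	seqc_t seqc;

	for (;;) {
		seqc = seqc_read(&o->o_seqc);	/* spins until even, acquire */
		out[0] = o->o_foo[0];
		out[1] = o->o_foo[1];
		if (seqc_consistent(&o->o_seqc, seqc))
			break;			/* no writer raced the reads */
	}
}

/*
 * Variant with more deterministic behavior: test the counter once with
 * seqc_read_any() and fall back to taking the lock if it is caught
 * uneven or the lockless snapshot turns out to be inconsistent.
 */
static void
obj_snapshot_once(struct obj *o, int out[2])
{
	seqc_t seqc;

	seqc = seqc_read_any(&o->o_seqc);
	if (!seqc_in_modify(seqc)) {
		out[0] = o->o_foo[0];
		out[1] = o->o_foo[1];
		if (seqc_consistent(&o->o_seqc, seqc))
			return;
	}
	mtx_lock(&o->o_lock);
	out[0] = o->o_foo[0];
	out[1] = o->o_foo[1];
	mtx_unlock(&o->o_lock);
}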