Date:      Mon, 9 Mar 2020 23:35:12 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Mateusz Guzik <mjguzik@gmail.com>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r358733 - head/sys/sys
Message-ID:  <20200309213512.GS98340@kib.kiev.ua>
In-Reply-To: <CAGudoHE4hTN5niCAsA4tfggywkEnGxiaoPz=rTX3AHA=fgZfkQ@mail.gmail.com>
References:  <202003080022.0280MX9j055805@repo.freebsd.org> <CAGudoHE4hTN5niCAsA4tfggywkEnGxiaoPz=rTX3AHA=fgZfkQ@mail.gmail.com>

On Mon, Mar 09, 2020 at 01:56:17AM +0100, Mateusz Guzik wrote:
> On 3/8/20, Mateusz Guzik <mjg@freebsd.org> wrote:
> > Author: mjg
> > Date: Sun Mar  8 00:22:32 2020
> > New Revision: 358733
> > URL: https://svnweb.freebsd.org/changeset/base/358733
> >
> > Log:
> >   seqc: tidy up
> >
> >   - avoid hand-rolled read
> >   - match begin/end in terms of fence style
> >
> 
> There were off-list questions about this, so let me clarify.
> 
> The first bit is a cosmetic change, but the second one is not.
> 
> > Modified:
> >   head/sys/sys/seqc.h
> >
> > Modified: head/sys/sys/seqc.h
> > ==============================================================================
> > --- head/sys/sys/seqc.h	Sat Mar  7 15:37:23 2020	(r358732)
> > +++ head/sys/sys/seqc.h	Sun Mar  8 00:22:32 2020	(r358733)
> > @@ -66,7 +66,8 @@ static __inline void
> >  seqc_write_end(seqc_t *seqcp)
> >  {
> >
> > -	atomic_store_rel_int(seqcp, *seqcp + 1);
> > +	atomic_thread_fence_rel();
> > +	*seqcp += 1;
> >  	MPASS(!seqc_in_modify(*seqcp));
> >  	critical_exit();
> >  }
> 
> For correct operation the counter has to be modified *before* any work
> is done and *after* it is completed.
> 
> Consider a trivial case:
> seqc++;
> foo[0] = 0;
> foo[1] = 1;
> seqc++;
> 
> There are 2 ways in which this can be mucked with:
> - the compiler can reorder or rework the increments (e.g., collapse
> them into a single += 2). A compiler barrier prevents that.
> - even with machine code which increments in the correct places, the
> CPU can reorder the operations (to an extent heavily dependent on the
> architecture); in particular it could issue the second seqc++ before
> foo[1] = 1, which would defeat the point. This is where release
> fences come in; the matching reader side is sketched below.
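> 
> For completeness, the reader side has to pair with this using acquire
> semantics. A minimal sketch of the idea (not the exact seqc.h code;
> snap, v0 and v1 are made-up names):
> 
> for (;;) {
>         snap = atomic_load_acq_int(&seqc); /* reads can't float above */
>         if (seqc_in_modify(snap)) {     /* odd: writer in progress */
>                 cpu_spinwait();
>                 continue;
>         }
>         v0 = foo[0];                    /* read the protected data */
>         v1 = foo[1];
>         atomic_thread_fence_acq();      /* reads done before re-check */
>         if (seqc == snap)               /* unchanged: snapshot is good */
>                 break;
> }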
> 
> With the patched code there is:
> seqc++;
> atomic_thread_fence_rel(); /* make sure seqc++ is issued before foo
> starts getting modified */
> foo[0] = 0;
> foo[1] = 1;
> atomic_thread_fence_rel(); /* make sure modifications to foo don't
> leak past this spot */
> seqc++;
> 
> In comparison, the previous code was:
> seqc++;
> atomic_thread_fence_rel(); /* make sure seqc++ is issued before foo
> starts getting modified */
> foo[0] = 0;
> foo[1] = 1;
> atomic_store_rel_int(seqcp, *seqcp + 1); /* make sure modifications to
> foo don't leak past this spot and update seqc immediately */
> 
> There are 2 differences here -- one is an improvement and the second
> one does not matter:
> the win is: the previous code forces the compiler to compute an
> incremented value and then store it. On amd64 this translates to the
> following (with rdx holding the address of the counter):
> 
> mov    (%rdx),%eax /* load the value */
> add    $0x1,%eax    /* add 1 */
> mov    %eax,(%rdx) /* store it */
> 
> On patched kernel this is:
> addl   $0x1,(%rdx)
> 
> which is clearly much nicer.
I am not sure that the new code on amd64 is much nicer.

But the point was that on non-amd64, i.e. armv8 and potentially on
powerpc 3.0, the patch replaces the release store with a full barrier.
The latter is (much) slower.
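
For reference, the usual compiler mappings on armv8 go along these
lines (sketched from memory, with x0 holding the address of the
counter; the exact barrier domain may differ from what the kernel
emits):

/* atomic_store_rel_int(seqcp, *seqcp + 1): release store */
ldr     w8, [x0]
add     w8, w8, #1
stlr    w8, [x0]

/* atomic_thread_fence_rel(); *seqcp += 1: barrier + plain accesses */
dmb     ish
ldr     w8, [x0]
add     w8, w8, #1
str     w8, [x0]

The stlr only has to order prior accesses against that single store,
while the dmb orders everything against everything that follows, hence
the cost.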

> 
> the not-a-problem: the code does not issue a release fence after
> incrementing the counter the second time, meaning that in principle
> the update can become visible arbitrarily late, possibly disturbing
> code which waits for the counter to become even. This is not a
> problem because whoever modifies the counter is supposed to hold a
> lock and release it shortly after, which also posts a release fence
> and consequently makes sure the counter update is visible. Also note
> these routines come with preemption disablement around the
> modification -- if the thread gets preempted as a result of
> critical_exit from seqc_write_end, that will also publish the updated
> counter. If the updater never does anything which flushes the store
> buffer, then in the absolute worst case it will eventually get
> preempted, at which point once more we are covered. Note that code
> which wants more deterministic behavior will want to use
> seqc_read_any, test the counter once and, if it is caught uneven,
> resort to locking (sketched below).
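> 
> Roughly like this (a sketch; seqc_consistent is the acquire re-check
> helper from seqc.h, the lock and data names are made up):
> 
> snap = seqc_read_any(seqcp);
> if (seqc_in_modify(snap)) {
>         /* writer active right now: don't spin, take the lock */
>         mtx_lock(&foo_lock);
>         v = foo;
>         mtx_unlock(&foo_lock);
> } else {
>         v = foo;                        /* lockless snapshot */
>         if (!seqc_consistent(seqcp, snap)) {
>                 /* raced with a writer: fall back to the lock */
>                 mtx_lock(&foo_lock);
>                 v = foo;
>                 mtx_unlock(&foo_lock);
>         }
> }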
> 
> > @@ -84,7 +85,7 @@ seqc_read(const seqc_t *seqcp)
> >  	seqc_t ret;
> >
> >  	for (;;) {
> > -		ret = atomic_load_acq_int(__DECONST(seqc_t *, seqcp));
> > +		ret = seqc_read_any(seqcp);
> >  		if (__predict_false(seqc_in_modify(ret))) {
> >  			cpu_spinwait();
> >  			continue;
> 
> 
> -- 
> Mateusz Guzik <mjguzik gmail.com>


