From owner-svn-src-head@freebsd.org Thu Mar 12 07:48:15 2020
Subject: Re: svn commit: r358733 - head/sys/sys
To: Mateusz Guzik, Konstantin Belousov
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
References: <202003080022.0280MX9j055805@repo.freebsd.org> <20200309213512.GS98340@kib.kiev.ua> <20200310192326.GC1992@kib.kiev.ua>
From: Alan Cox
Message-ID: <0d1145b3-1c29-aa21-1bd5-1129fdb88144@rice.edu>
Date: Thu, 12 Mar 2020 02:48:06 -0500
Content-Type: text/plain; charset=utf-8; format=flowed
List-Id: SVN commit messages for the src tree for head/-current

On 3/11/20 9:16 PM, Mateusz Guzik wrote:
> On 3/10/20, Konstantin Belousov wrote:
>> On Tue, Mar 10, 2020 at 12:22:09AM +0100, Mateusz Guzik wrote:
>>> On 3/9/20, Konstantin Belousov wrote:
>>>> On Mon, Mar 09, 2020 at 01:56:17AM +0100, Mateusz Guzik wrote:
>>>>> On 3/8/20, Mateusz Guzik wrote:
>>>>>> Author: mjg
>>>>>> Date: Sun Mar 8 00:22:32 2020
>>>>>> New Revision: 358733
>>>>>> URL: https://svnweb.freebsd.org/changeset/base/358733
>>>>>>
>>>>>> Log:
>>>>>>   seqc: tidy up
>>>>>>
>>>>>>   - avoid hand-rolled read
>>>>>>   - match begin/end in terms of fence style
>>>>>>
>>>>> There were off-list questions about this so let me clarify.
>>>>>
>>>>> The first bit is a cosmetic change, but the second one is not.
>>>>>
>>>>>> Modified:
>>>>>>   head/sys/sys/seqc.h
>>>>>>
>>>>>> Modified: head/sys/sys/seqc.h
>>>>>> ==============================================================================
>>>>>> --- head/sys/sys/seqc.h	Sat Mar 7 15:37:23 2020	(r358732)
>>>>>> +++ head/sys/sys/seqc.h	Sun Mar 8 00:22:32 2020	(r358733)
>>>>>> @@ -66,7 +66,8 @@ static __inline void
>>>>>>  seqc_write_end(seqc_t *seqcp)
>>>>>>  {
>>>>>>
>>>>>> -	atomic_store_rel_int(seqcp, *seqcp + 1);
>>>>>> +	atomic_thread_fence_rel();
>>>>>> +	*seqcp += 1;
>>>>>>  	MPASS(!seqc_in_modify(*seqcp));
>>>>>>  	critical_exit();
>>>>>>  }
>>>>> For correct operation the counter has to be modified *before* any work
>>>>> is done and *after* it is completed.
>>>>>
>>>>> Consider a trivial case:
>>>>> seqc++;
>>>>> foo[0] = 0;
>>>>> foo[1] = 1;
>>>>> seqc++;
>>>>>
>>>>> There are 2 ways in which this can be mucked with:
>>>>> - the compiler can be looking at reordering or reworking the increment
>>>>>   (e.g., to collapse it to just += 2 instead). A compiler barrier
>>>>>   prevents that.
>>>>> - even with generated machine code which increments in the correct
>>>>>   places, the CPU can reorder the operations (which is heavily
>>>>>   dependent on architecture); in particular it could end up issuing
>>>>>   seqc++; foo[1] = 1; which would defeat the point. This is where
>>>>>   release fences come in.
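For readers following along outside the kernel, the two-fence scheme described above can be sketched with C11 atomics. This is an illustrative model, not the kernel's seqc(9) API; the names `seqc`, `foo`, and `write_foo` come from the example in the mail, and `atomic_thread_fence(memory_order_release)` stands in for atomic_thread_fence_rel():

```c
#include <stdatomic.h>

static atomic_uint seqc;	/* sequence counter: odd while a write is in flight */
static int foo[2];		/* data guarded by the counter (single writer assumed) */

static void
write_foo(int a, int b)
{
	/* Make the counter odd before touching the data; the release fence
	 * keeps this increment from sinking below the data writes. */
	atomic_fetch_add_explicit(&seqc, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	foo[0] = a;
	foo[1] = b;

	/* The second fence keeps the data writes from leaking past the
	 * closing increment, which makes the counter even again. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_add_explicit(&seqc, 1, memory_order_relaxed);
}
```

A reader that observes the same even counter value before and after copying `foo` knows it saw a consistent snapshot.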
>>>>>
>>>>> With the patched code there is:
>>>>> seqc++;
>>>>> atomic_thread_fence_rel(); /* make sure seqc++ is issued before foo
>>>>> starts getting modified */
>>>>> foo[0] = 0;
>>>>> foo[1] = 1;
>>>>> atomic_thread_fence_rel(); /* make sure modifications to foo don't
>>>>> leak past this spot */
>>>>> seqc++;
>>>>>
>>>>> In comparison, the previous code was:
>>>>> seqc++;
>>>>> atomic_thread_fence_rel(); /* make sure seqc++ is issued before foo
>>>>> starts getting modified */
>>>>> foo[0] = 0;
>>>>> foo[1] = 1;
>>>>> atomic_store_rel_int(seqcp, *seqcp + 1); /* make sure modifications to
>>>>> foo don't leak past this spot and update seqc immediately */
>>>>>
>>>>> There are 2 differences here -- one is an improvement and the second
>>>>> one does not matter:
>>>>> the win is: the previous code forces the compiler to compute an
>>>>> incremented value and then store it. On amd64 this translates to the
>>>>> following (with rdx holding the address of the counter):
>>>>>
>>>>> mov (%rdx),%eax /* load the value */
>>>>> add $0x1,%eax /* add 1 */
>>>>> mov %eax,(%rdx) /* store it */
>>>>>
>>>>> On a patched kernel this is:
>>>>> addl $0x1,(%rdx)
>>>>>
>>>>> which is clearly much nicer.
>>>> I am not sure that the new code on amd64 is much nicer.
>>>>
>>>> But the point was that on non-amd64, i.e. armv8 and potentially on
>>>> powerpc 3.0, the patch substitutes a release store with a full barrier.
>>>> The latter is (much) slower.
>>>>
>>> If an arch performs something significantly more expensive here it's
>>> probably a performance bug in the port.
>> It is a question of how much hardware support exists for fine-grained
>> fences. Relaxed architectures already provide some optimized operations,
>> but have not (yet) implemented the complete C11 optimized set.
>> I think they will eventually converge.
>>
>> Until they do, I think it is more important to not pessimize
>> to-be Tier1 arches than to collapse three fast integer instructions
>> on amd64 into one rw op.
>>
>>> But perhaps more importantly
>>> there are significantly more frequent users of
>>> atomic_thread_fence_rel, like refcount(9). If the issue is real that
>>> should be looked at.
>> For refcount(9) use of release fence removal, we would need
>> atomic_fetchadd_rel_int().
>>
> To reiterate, if atomic_thread_fence_rel is indeed very expensive on
> certain architectures compared to other _rel variants, the kernel is
> already heavily pessimized because of the use of said standalone fence
> in refcount(9) and a few other places, and this needs to be fixed.

Just to be clear, this is not just a matter of whether our release
fences are poorly or inefficiently implemented.  Release fences (as
defined by the C/C++ standards that inspire our atomics) have
*different* semantics than release stores.  Specifically, a release
fence places more ordering restrictions on the nearby store accesses
than a release store does.  (In fact, "man 9 atomic" describes this.)
Thus, it is almost inevitable that a release fence will have a higher
cost than the best possible implementation of a release store.

> If the cost can't be reduced (e.g., perhaps this can do a no-op store
> + fence somewhere akin to how the lock xadd 0,(%esp) trickery used to
> work), then the atomic APIs do indeed need to be completed w.r.t.
> adding all the missing acq/rel variants (that's at least fetchadd and
> swap).
>
> We can also extend the API to do the relevant addition/subtraction
> without guaranteeing atomicity vs concurrent updates, while still
> providing the relevant fences. This would preserve the intended state
> achieved with this commit while not interfering with others.
>
> I can add the missing amd64 bits if other people cover their
> architectures, but otherwise I'm not interested in doing anything
> here.
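The semantic distinction drawn above can be shown side by side in C11 terms (an illustrative sketch; `data`, `flag`, and `other` are made-up variables, and the comments paraphrase the standard's fence rules rather than quote them):

```c
#include <stdatomic.h>

static int data;
static atomic_int flag, other;

/* Release *store*: only the store to 'flag' itself is a release
 * operation. The later relaxed store to 'other' is not ordered
 * against 'data = 1'. */
static void
publish_with_store(void)
{
	data = 1;
	atomic_store_explicit(&flag, 1, memory_order_release);
	atomic_store_explicit(&other, 1, memory_order_relaxed);
}

/* Release *fence*: every atomic store sequenced after the fence
 * becomes a potential release point, so 'data = 1' is ordered before
 * the stores to both 'flag' and 'other'. This broader restriction is
 * why a fence tends to cost more than a single release store. */
static void
publish_with_fence(void)
{
	data = 1;
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&flag, 1, memory_order_relaxed);
	atomic_store_explicit(&other, 1, memory_order_relaxed);
}
```

On strongly ordered amd64 both compile to plain stores, which is why the difference only shows up on relaxed architectures such as armv8.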
>
> seqc itself, apart from being a minor consumer compared to refcount,
> was already using the standalone fence even prior to this change, thus
> I don't think there is any rush to do anything with regards to this
> particular commit.

Indeed, it was.  It was using a release fence where it was semantically
required to do so for *correctness*.  Specifically, where the extra
ordering restrictions of a release fence were needed.  But, it was also
using the simpler release store where the extra ordering restrictions
were not needed.

>
>>>>> the not-a-problem: the code does not issue the release fence after
>>>>> incrementing the counter the second time, meaning that in principle
>>>>> it can happen arbitrarily late, possibly disturbing code which waits
>>>>> for the counter to become even. This is not a problem because
>>>>> whoever modifies it is supposed to hold a lock and release it
>>>>> shortly after, which also posts the release fence and consequently
>>>>> makes sure the counter update is visible. Also note these routines
>>>>> come with preemption disablement around the modification -- if the
>>>>> thread gets preempted as a result of critical_exit from
>>>>> seqc_write_end, it will also publish the updated counter. If the
>>>>> updater never does anything which flushes the store buffer then in
>>>>> the absolute worst case they will get preempted, at which point once
>>>>> more we are covered. Note that code which wants more deterministic
>>>>> behavior will want to use seqc_read_any, test the counter once and,
>>>>> if it is caught uneven, resort to locking.
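The "test the counter once, then fall back to locking" pattern mentioned at the end of that paragraph might look roughly like this in portable C11. This is a sketch only: `seqc_read_any` and `seqc_in_modify` are modeled after the names in the mail (not the kernel definitions), and the pthread mutex is a stand-in for whatever lock serializes the writers:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <pthread.h>

static atomic_uint seqc;
static pthread_mutex_t foo_lock = PTHREAD_MUTEX_INITIALIZER;
static int foo[2];

/* An odd counter value means a writer is mid-update. */
static bool
seqc_in_modify(unsigned v)
{
	return ((v & 1) != 0);
}

/* Read the counter once, without spinning; the acquire load pairs with
 * the writer's release ordering. */
static unsigned
seqc_read_any(atomic_uint *seqcp)
{
	return (atomic_load_explicit(seqcp, memory_order_acquire));
}

/* Deterministic reader: one optimistic attempt, then take the lock
 * instead of spinning until the writer finishes. */
static void
read_foo_once(int out[2])
{
	unsigned before, after;

	before = seqc_read_any(&seqc);
	if (!seqc_in_modify(before)) {
		out[0] = foo[0];
		out[1] = foo[1];
		atomic_thread_fence(memory_order_acquire);
		after = seqc_read_any(&seqc);
		if (before == after)
			return;		/* consistent snapshot */
	}
	pthread_mutex_lock(&foo_lock);	/* writer caught mid-update */
	out[0] = foo[0];
	out[1] = foo[1];
	pthread_mutex_unlock(&foo_lock);
}
```

The point of the single test is bounded latency: the reader never busy-waits on a preempted writer, it just pays the lock cost instead.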
>>>>>
>>>>>> @@ -84,7 +85,7 @@ seqc_read(const seqc_t *seqcp)
>>>>>>  	seqc_t ret;
>>>>>>
>>>>>>  	for (;;) {
>>>>>> -		ret = atomic_load_acq_int(__DECONST(seqc_t *, seqcp));
>>>>>> +		ret = seqc_read_any(seqcp);
>>>>>>  		if (__predict_false(seqc_in_modify(ret))) {
>>>>>>  			cpu_spinwait();
>>>>>>  			continue;
>>>>>
>>>>> --
>>>>> Mateusz Guzik
>>>
>>> --
>>> Mateusz Guzik
>
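For completeness, the spinning read side that the hunk above touches looks roughly like this in portable C11. Again a sketch under assumptions, not the kernel implementation: the acquire load models what atomic_load_acq_int/seqc_read_any provide, and cpu_spinwait() is replaced by a plain retry:

```c
#include <stdatomic.h>

static atomic_uint seqc;
static int foo[2];

/* Spin until an even (stable) counter value is observed; the acquire
 * load orders the counter read before the subsequent data reads. */
static unsigned
seqc_read(atomic_uint *seqcp)
{
	unsigned ret;

	for (;;) {
		ret = atomic_load_explicit(seqcp, memory_order_acquire);
		if ((ret & 1) != 0)	/* seqc_in_modify(): writer active */
			continue;	/* cpu_spinwait() would go here */
		return (ret);
	}
}

/* Typical read-side use: retry the whole snapshot if the counter moved
 * while the data was being copied. */
static void
read_foo(int out[2])
{
	unsigned seq;

	do {
		seq = seqc_read(&seqc);
		out[0] = foo[0];
		out[1] = foo[1];
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&seqc, memory_order_relaxed) != seq);
}
```

The commit simply makes seqc_read() reuse this seqc_read_any()-style load instead of open-coding it, which is the "avoid hand-rolled read" part of the log message.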