From: Ian Lepore
To: Mateusz Guzik, freebsd-current@freebsd.org
Date: Sun, 10 Jul 2016 08:32:01 -0600
Subject: Re: [PATCH] microoptimize locking primitives by introducing randomized delay between atomic ops
Message-ID: <1468161121.72182.115.camel@freebsd.org>
In-Reply-To: <20160710111326.GA7853@dft-labs.eu>
References: <20160710111326.GA7853@dft-labs.eu>

On Sun, 2016-07-10 at 13:13 +0200, Mateusz Guzik wrote:
> If the lock is contended, primitives like __mtx_lock_sleep will spin
> checking if the owner is running or the lock was freed. The problem is
> that once it is discovered that the lock is free, multiple CPUs are
> likely to try to do the atomic op which will make it more costly for
> everyone and throughput suffers.
>
> The standard thing to do is to have some sort of a randomized delay so
> that this kind of behaviour is reduced.
>
> As such, below is a trivial hack which takes cpu_ticks() into account
> and performs % 2048, which in my testing gives reasonably good results.
>
> Please note there is definitely way more room for improvement in
> general.
>
> In terms of results, there was no statistically significant change in
> -j 40 buildworld nor buildkernel.
>
> However, a 40-way find on a ports tree placed on tmpfs yielded the
> following:
>
> x vanilla
> + patched
> [ministat distribution plot omitted]
>     N           Min           Max        Median           Avg        Stddev
> x  20        12.431        15.952        14.897       14.7444    0.74241657
> +  20         8.103        11.863        9.0135       9.44565     1.0059484
> Difference at 95.0% confidence
>         -5.29875 +/- 0.565836
>         -35.9374% +/- 3.83764%
>         (Student's t, pooled s = 0.884057)
>
> The patch:
[...]

What about platforms that don't have a useful implementation of
cpu_ticks()?

What about platforms that don't suffer the large expense for atomic ops
that x86 apparently does?

-- Ian
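
For readers without the elided patch in front of them, here is a rough
userland sketch of the idea under discussion: spin on plain loads until
the lock looks free, burn a pseudo-random number of iterations derived
from a tick counter modulo 2048, and only then attempt the atomic op.
This is not the patch itself. The names fake_cpu_ticks(),
backoff_iterations(), struct spinlock, spin_lock() and spin_unlock() are
all hypothetical, C11 atomics stand in for the kernel's atomic(9)
primitives, and timespec_get() stands in for cpu_ticks(). Only the 2048
modulus comes from the quoted message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Modulus quoted in the original post. */
#define	BACKOFF_MOD	2048

/*
 * Hypothetical stand-in for the kernel's cpu_ticks(): any cheap,
 * fast-moving counter will do for the purposes of this sketch.
 */
static uint64_t
fake_cpu_ticks(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

/* Number of delay iterations to burn before retrying the atomic op. */
static unsigned
backoff_iterations(uint64_t ticks)
{
	return ((unsigned)(ticks % BACKOFF_MOD));
}

struct spinlock {
	atomic_uintptr_t owner;		/* 0 means unowned */
};

static void
spin_lock(struct spinlock *lk, uintptr_t tid)
{
	uintptr_t expected;
	unsigned n;

	for (;;) {
		/* Spin on plain loads until the lock looks free. */
		while (atomic_load_explicit(&lk->owner,
		    memory_order_relaxed) != 0)
			;

		/*
		 * Delay a pseudo-random amount so that waiters do not all
		 * issue the atomic op at once.  The volatile counter keeps
		 * the compiler from deleting the loop; kernel code would
		 * more likely call cpu_spinwait() in the loop body.
		 */
		n = backoff_iterations(fake_cpu_ticks());
		for (volatile unsigned i = 0; i < n; i++)
			;

		/* Only now pay for the atomic op; retry on failure. */
		expected = 0;
		if (atomic_compare_exchange_strong_explicit(&lk->owner,
		    &expected, tid, memory_order_acquire,
		    memory_order_relaxed))
			return;
	}
}

static void
spin_unlock(struct spinlock *lk)
{
	atomic_store_explicit(&lk->owner, 0, memory_order_release);
}
```

A caller would initialize the lock with atomic_init(&lk.owner, 0) and
pass a nonzero per-thread value as tid. Both questions above still apply
to a sketch like this: the spread of delays is only as good as the
counter feeding it, and the delay only pays off where the failed atomic
op is expensive.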