From: Pawel Jakub Dawidek
To: Jeff Roberson
Cc: freebsd-arch@freebsd.org
Date: Sun, 19 Aug 2007 09:57:50 +0200
Subject: Re: Lockless uidinfo.
Message-ID: <20070819075750.GB11792@garage.freebsd.pl>
In-Reply-To: <20070818163503.T568@10.0.0.1>

On Sat, Aug 18, 2007 at 04:35:42PM -0700, Jeff Roberson wrote:
> On Sun, 19 Aug 2007, Pawel Jakub Dawidek wrote:
> >Ok, after implementing atomic_fetchadd_long() on amd64, we get an
> >additional 6% performance improvement:
> >
> >x ./uidinfo_lockfree.txt (atomic_cmpset_long loop)
> >+ ./uidinfo_waitfree.txt (atomic_fetchadd_long)
> >[ministat distribution plot omitted]
> >    N           Min           Max        Median           Avg        Stddev
> >x   5       1561566       1575987       1568964       1569767     5853.1399
> >+   5       1662362       1665936       1665810     1664881.8     1541.2693
> >Difference at 95.0% confidence
> >	95114.8 +/- 6241.96
> >	6.05917% +/- 0.397636%
> >	(Student's t, pooled s = 4279.88)
>
> How does this affect the single-threaded performance?  Do you attribute
> this to atomic fetchadd being cheaper than atomic cmpset?  What is your
> processor?

CPU: Intel(R) Xeon(R) CPU E5310 @ 1.60GHz (1597.65-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x6f7  Stepping = 7
  Features=0xbfebfbff
  Features2=0x4e33d
  AMD Features=0x20100800
  AMD Features2=0x1
  Cores per package: 4

Ok, I changed the code to something like this:

	long old;
	int diff, loops;

	/* Count every call. */
	atomic_add_int(&uidinfo_cnt1, 1);
	if (diff > 0) {
		loops = 0;
		do {
			loops++;
			old = uip->ui_sbsize;
			if (old + diff > max)
				return (0);
		} while (atomic_cmpset_long(&uip->ui_sbsize, old, old + diff) == 0);
		/* Count iterations only for calls that had to retry. */
		if (loops > 1)
			atomic_add_int(&uidinfo_cnt2, loops);
	} else {
		atomic_add_long(&uip->ui_sbsize, (long)diff);
	}

This lets me see how many additional loop iterations we do. With the
lock-free version we can still hit contention and have to retry, which
is why the wait-free version is superior. Actually, I was a bit
surprised by the results:

	debug.uidinfo.cnt1: 88746008
	debug.uidinfo.cnt2: 31296304

(Running 8 processes.)

Which means that, because of contention, we do 31296304 additional
atomic operations, which is about 30% more.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
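
For anyone who wants to play with the two approaches outside the kernel,
here is a minimal userspace sketch of the lock-free (compare-and-swap
loop) and wait-free (fetch-and-add) variants discussed above. It rests on
several assumptions: it uses C11 <stdatomic.h> instead of the kernel's
atomic_cmpset_long()/atomic_fetchadd_long(), the names reserve_cas(),
reserve_fetchadd() and cas_retries are hypothetical, and the "add first,
undo on overflow" handling in the fetchadd variant is only one plausible
way to enforce the limit, not necessarily what the actual patch does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter of failed CAS attempts (extra atomic operations). */
static atomic_ulong cas_retries;

/*
 * Lock-free variant: retry the compare-and-swap until it succeeds or the
 * limit would be exceeded.  Under contention this may loop many times.
 */
static bool
reserve_cas(atomic_long *val, long diff, long max)
{
	long old;

	if (diff <= 0) {
		/* Releasing space can never exceed the limit. */
		atomic_fetch_add(val, diff);
		return (true);
	}
	old = atomic_load(val);
	for (;;) {
		if (old + diff > max)
			return (false);
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_strong(val, &old, old + diff))
			return (true);
		atomic_fetch_add(&cas_retries, 1);
	}
}

/*
 * Wait-free variant: one unconditional fetch-and-add, undone if the
 * limit turns out to be exceeded.  It never loops, but other threads
 * may briefly observe a value above 'max' before the undo lands.
 */
static bool
reserve_fetchadd(atomic_long *val, long diff, long max)
{
	long old;

	old = atomic_fetch_add(val, diff);
	if (diff > 0 && old + diff > max) {
		atomic_fetch_sub(val, diff);
		return (false);
	}
	return (true);
}
```

Calling either function from several threads on a shared atomic_long and
reading cas_retries afterwards gives a rough userspace analogue of the
cnt2 counter above. The trade-off is that the fetchadd variant always
costs a bounded number of atomic operations per call, while the CAS loop
never overshoots the limit, even transiently.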