From owner-freebsd-arch@FreeBSD.ORG Sun Jul 29 17:51:09 2007
Date: Sun, 29 Jul 2007 13:51:08 -0400
From: Kris Kennaway
To: Robert Watson, Attilio Rao, arch@freebsd.org, Alfred Perlstein
Subject: Re: Fine grain select locking.
List-Id: Discussion related to FreeBSD architecture

On Wed, Jul 04, 2007 at 01:05:22PM -0400, Gary Palmer wrote:
> On Wed, Jul 04, 2007 at 05:46:34PM +0100, Robert Watson wrote:
> >
> > On Wed, 4 Jul 2007, Attilio Rao wrote:
> >
> > >2007/7/4, Robert Watson:
> > >>There seem to be two parts of owning a benchmark:
> > >>
> > >>- Establishing baselines over time -- how does FreeBSD 4.8, 5.5, 6.0,
> > >>  6.1, 6.2, 6-STABLE weekly, 7-CURRENT weekly, and maybe a Linux or
> > >>  NetBSD version perform for the workload using otherwise identical
> > >>  configuration.
> > >>
> > >>- Measurement and feedback -- identifying bottlenecks, working with
> > >>  developers to measure the results of specific optimizations, etc,
> > >>  across the life cycle of the patch.
> > >
> > >Another problem here would be about the hardware availability (obviously
> > >I'm speaking about scalability improvements).  Until now, tests have been
> > >done mainly on amd64 machines provided by Kris and Jeff, IIRC. Having a
> > >wider range of targets would help a lot in these cases.
> >
> > The FreeBSD Foundation is currently working on updating the Netperf test
> > cluster from dual-cpu HTT boxes to 8-core systems, and from 1gbps to 10gbps
> > ethernet.  Hopefully this will improve access to larger multicore systems
> > for developers without local hardware.  This project has been "in progress"
> > for a while now, but will wrap up soon.
>
> Hi Robert,
>
> Another way of looking at Attilio's message is that we need to focus on
> more than one type of platform.  In addition to benchmarking any differences
> between large 8 core Opteron and Xeon systems and the Sun "CoolThreads"
> platform, we need to maintain scalability on "more affordable" single
> core hardware as well.  An immediate thought is embedded type systems
> such as the Soekris.  While high-end server farms have always been our
> bread and butter, I think widening our focus might be worthwhile.  I might
> have missed it, but I don't remember results being published to ensure
> that while SMP systems gain performance we don't adversely impact
> UP systems in the process.  (My memory is far from perfect, apologies if
> I'm wrong)

I do keep a close eye on UP performance on "my" benchmark targets, and
you will be pleased to know that the same optimizations that have such
a big effect on SMP systems often also have a positive effect on UP
systems, and do not regress performance in the other cases.

Kris

From owner-freebsd-arch@FreeBSD.ORG Sun Jul 29 18:07:23 2007
Date: Sun, 29 Jul 2007 14:07:22 -0400
From: Kris Kennaway
To: Jeff Roberson
Cc: arch@freebsd.org, Alfred Perlstein
Subject: Re: Fine grain select locking.

On Wed, Jul 04, 2007 at 11:47:35AM -0700, Jeff Roberson wrote:

> >>http://people.freebsd.org/~jeff/select2.diff
> >
> >Jeff, I understand you're trying to speed up mysql micro benchmarks,
> >but have you done any benchmarking on large select operations?
>
> I don't know that I'd call mysql a micro-benchmark.  This patch also
> didn't help there as much as I had hoped and I'm still trying to
> understand why.

Here is a graph of the performance effects on sysbench with this
patch:

http://obsecurity.dyndns.org/select.png

mysql
=====

It appears that at higher loads most of the contention is now in
userland, no longer within the kernel.  There is also significant
contention on the proc lock.

Peak load (8 clients):
   31   1137    169   1280  0  0   126    56 kern/kern_umtx.c:325 (sleep mutex:umtxql)
    7   5688    554   4750  1  0   722   294 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
    3   2335   1155   8732  0  0   669   571 kern/sys_generic.c:955 (sleep mutex:process lock)

Higher load (20 clients):
   88   6714    807   4763  1  0   754   276 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
    3   2342   1228   7656  0  0   650   550 kern/sys_generic.c:955 (sleep mutex:process lock)
    2    431   1299   1023  0  1    53    77 kern/kern_sig.c:996 (sleep mutex:process lock)
    7    371   3545    635  0  5    58   131 kern/kern_mutex.c:141 (sleep mutex:umtxql)
   70   5085   7433   3184  1  2   507   377 kern/kern_umtx.c:325 (sleep mutex:umtxql)

I looked in the past at replacing the proc mutex with a rwlock and
looking for places where shared locking could be used, but at least as
the code is written currently I don't think any of those apply here.

With the select locking patch overall mysql performance does not
change much, but the total amount of time spent waiting for locks is
greatly reduced (by about an order of magnitude), so system time
should be lower with these changes (unless it's counterbalanced by
greater time spent doing other things than lock waits).  I have not
measured this though.

We might be able to obtain some further improvement at higher loads by
improving the contention behaviour of umtx objects (the kernel part of
the libthr pthread mutex).  I suspect most of the problem is in mysql
itself.  What we need is a userland counterpart of lock profiling, for
profiling contention on pthread mutexes.

pgsql
=====

Clear performance benefit from select locking, on the order of 5-10%.
Reduction in lock wait time is about *two* orders of magnitude.

Peak load:

    5   2942   1437   2607  1  0   818   446 kern/subr_turnstile.c:546 (spin mutex:turnstile chain)
   13   9250   1474   9572  0  0  1405   585 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
   39   3019   2856   9458  0  0  1613  1131 kern/sys_generic.c:955 (sleep mutex:process lock)
  120   5540   5494  16954  0  0  3536  2017 kern/kern_sig.c:996 (sleep mutex:process lock)

20 clients:
    8   5336   3506   4338  1  0  1610   910 kern/subr_turnstile.c:546 (spin mutex:turnstile chain)
    2   2828   4261   8787  0  0  1749  1298 kern/sys_generic.c:955 (sleep mutex:process lock)
   56  10717   4568   8968  1  0  3092  1382 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
    4   5390   7646  15766  0  0  3325  2568 kern/kern_sig.c:996 (sleep mutex:process lock)
   79   9423  70619  33525  0  2   154    92 kern/uipc_syscalls.c:135 (sleep mutex:sleep mtxpool)

i.e. much the same lock workload as mysql except for no umtx
contention (pgsql is not threaded), and huge wait time (but not much
contention) on the following:

static int
getsock(struct filedesc *fdp, int fd, struct file **fpp, u_int *fflagp)
{
...

        if (fdp == NULL)
                error = EBADF;
        else {
                FILEDESC_SLOCK(fdp);
                fp = fget_locked(fdp, fd);
                if (fp == NULL)
                        error = EBADF;
                else if (fp->f_type != DTYPE_SOCKET) {
                        fp = NULL;
                        error = ENOTSOCK;
                } else {
                        fhold(fp);
...

}

I think this is mostly because it's called so often, with small
incremental but large total cost.

Kris

From owner-freebsd-arch@FreeBSD.ORG Tue Jul 31 21:08:13 2007
From: Ivan Voras
To: freebsd-arch@freebsd.org
Date: Tue, 31 Jul 2007 21:58:38 +0200
Subject: On schedulers

Hi,

I've just stumbled on the LKML (via Slashdot) discussion on schedulers,
nicely compiled here: http://kerneltrap.org/node/14023 . I don't think
3D performance is of concern for FreeBSD, but I'm wondering how would
ULE and the latest incarnation of 4BSD fare in that discussion?

Specifically, I'm interested in this result in Linux:

        2.6.22-ck1                    2.6.22-cfs-v19
 ------------------------      ------------------------
 quake + 0 loops | 41 fps      quake + 0 loops | 41 fps
 quake + 1 loop  |  3 fps      quake + 1 loop  | 41 fps
 quake + 2 loops |  2 fps      quake + 2 loops | 32 fps
 quake + 3 loops |  1 fps      quake + 3 loops | 24 fps
 quake + 4 loops |  0 fps      quake + 4 loops | 20 fps
 quake + 5 loops |  0 fps      quake + 5 loops | 16 fps

(for the impatient: the benchmark is of running quake with several "idle
loop" processes, presumably on a single CPU machine. On the left is the
SD (staircase deadline) and on the right is the CF (completely fair)
scheduler).

How would this behave on FreeBSD? Is there a paper on how ULE should
behave / is modeled?

From owner-freebsd-arch@FreeBSD.ORG Wed Aug 1 05:14:05 2007
Date: Wed, 1 Aug 2007 07:13:20 +0300
From: Kostik Belousov
To: Ivan Voras
Cc: freebsd-arch@freebsd.org
Subject: Re: On schedulers

On Tue, Jul 31, 2007 at 09:58:38PM +0200, Ivan Voras wrote:
> Hi,
>
> I've just stumbled on the LKML (via Slashdot) discussion on schedulers,
> nicely compiled here: http://kerneltrap.org/node/14023 . I don't think
> 3D performance is of concern for FreeBSD, but I'm wondering how would
Why do you think so? 3D on FreeBSD is quite important for some of the
users, me in particular.

> ULE and the latest incarnation of 4BSD fare in that discussion?

From owner-freebsd-arch@FreeBSD.ORG Wed Aug 1 12:43:15 2007
From: Ivan Voras
To: freebsd-arch@freebsd.org
Date: Wed, 01 Aug 2007 14:42:51 +0200
Subject: Re: On schedulers

Kostik Belousov wrote:

> Why do you think so? 3D on FreeBSD is quite important for some of the
> users, me in particular.

Sorry, I generalized too much from the lack of quality drivers :)

From owner-freebsd-arch@FreeBSD.ORG Wed Aug 1 12:55:39 2007
Date: Wed, 1 Aug 2007 09:55:23 -0300
From: Ricardo Nabinger Sanchez
To: Kostik Belousov
Cc: Ivan Voras, freebsd-arch@freebsd.org
Subject: Re: On schedulers

On Wed, 1 Aug 2007 07:13:20 +0300
Kostik Belousov wrote:

> > I don't think
> > 3D performance is of concern for FreeBSD, but I'm wondering how would
> Why do you think so? 3D on FreeBSD is quite important for some of the
> users, me in particular.

I think Ivan means "not a primary concern" instead of "no concern at all".

-- 
Ricardo Nabinger Sanchez     rnsanchez@wait4.org
Powered by FreeBSD

  "Left to themselves, things tend to go from bad to worse."

From owner-freebsd-arch@FreeBSD.ORG Thu Aug 2 12:25:19 2007
Date: Thu, 02 Aug 2007 14:57:17 +0300
From: Niki Denev
To: Ivan Voras
Cc: freebsd-arch@freebsd.org
Subject: Re: On schedulers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ivan Voras wrote:
> Hi,
>
> I've just stumbled on the LKML (via Slashdot) discussion on schedulers,
> nicely compiled here: http://kerneltrap.org/node/14023 . I don't think
> 3D performance is of concern for FreeBSD, but I'm wondering how would
> ULE and the latest incarnation of 4BSD fare in that discussion?
>
> Specifically, I'm interested in this result in Linux:
>
>         2.6.22-ck1                    2.6.22-cfs-v19
>  ------------------------      ------------------------
>  quake + 0 loops | 41 fps      quake + 0 loops | 41 fps
>  quake + 1 loop  |  3 fps      quake + 1 loop  | 41 fps
>  quake + 2 loops |  2 fps      quake + 2 loops | 32 fps
>  quake + 3 loops |  1 fps      quake + 3 loops | 24 fps
>  quake + 4 loops |  0 fps      quake + 4 loops | 20 fps
>  quake + 5 loops |  0 fps      quake + 5 loops | 16 fps
>
> (for the impatient: the benchmark is of running quake with several "idle
> loop" processes, presumably on a single CPU machine. On the left is the
> SD (staircase deadline) and on the right is the CF (completely fair)
> scheduler).
>
> How would this behave on FreeBSD? Is there a paper on how ULE should
> behave / is modeled?
>

This is on an Intel C2D E6420 with 2G of RAM, Nvidia 7900GTO
(nvidia-driver-1.0.9746) running xorg-server-6.9.0_5 on a recent
-CURRENT:

idle is basically a small C program with just for(;;); in its main()
function.
I've run glxgears for 20 secs each (to get four reports).
Both idle and glxgears are run as a normal user.

SMP+ULE 0 idle
101446 frames in 5.0 seconds = 20289.099 FPS
101590 frames in 5.0 seconds = 20317.975 FPS
101701 frames in 5.0 seconds = 20340.037 FPS
101489 frames in 5.0 seconds = 20297.670 FPS

SMP+ULE 1 idle
97430 frames in 5.0 seconds = 19485.840 FPS
102176 frames in 5.0 seconds = 20435.017 FPS
102402 frames in 5.0 seconds = 20480.318 FPS
102430 frames in 5.0 seconds = 20485.865 FPS

SMP+ULE 2 idle
30 frames in 5.0 seconds = 5.978 FPS
31 frames in 5.0 seconds = 6.182 FPS
31 frames in 5.0 seconds = 6.172 FPS
30 frames in 5.2 seconds = 5.744 FPS

SMP+ULE 3 idle
29 frames in 5.2 seconds = 5.631 FPS
30 frames in 5.0 seconds = 5.952 FPS
31 frames in 5.1 seconds = 6.054 FPS
32 frames in 5.2 seconds = 6.213 FPS

SMP+ULE 4 idle
21 frames in 5.1 seconds = 4.151 FPS
20 frames in 5.1 seconds = 3.942 FPS
21 frames in 5.2 seconds = 4.066 FPS
20 frames in 5.2 seconds = 3.841 FPS

UP+ULE 0 idle
102152 frames in 5.0 seconds = 20430.299 FPS
102572 frames in 5.0 seconds = 20514.236 FPS
102533 frames in 5.0 seconds = 20506.522 FPS
102129 frames in 5.0 seconds = 20425.654 FPS

UP+ULE 1 idle
21 frames in 5.1 seconds = 4.158 FPS
24 frames in 5.2 seconds = 4.624 FPS
26 frames in 5.0 seconds = 5.153 FPS
28 frames in 5.0 seconds = 5.586 FPS

UP+ULE 2 idle
21 frames in 5.1 seconds = 4.093 FPS
21 frames in 5.1 seconds = 4.093 FPS
21 frames in 5.1 seconds = 4.115 FPS
21 frames in 5.1 seconds = 4.115 FPS

UP+ULE 3 idle
20 frames in 5.3 seconds = 3.804 FPS
19 frames in 5.2 seconds = 3.624 FPS
19 frames in 5.2 seconds = 3.619 FPS
19 frames in 5.3 seconds = 3.612 FPS

UP+ULE 4 idle
19 frames in 5.3 seconds = 3.600 FPS
17 frames in 5.0 seconds = 3.388 FPS
17 frames in 5.0 seconds = 3.393 FPS
17 frames in 5.0 seconds = 3.380 FPS

SMP+4BSD 0 idle
102440 frames in 5.0 seconds = 20487.893 FPS
102285 frames in 5.0 seconds = 20456.848 FPS
102276 frames in 5.0 seconds = 20455.065 FPS
102312 frames in 5.0 seconds = 20462.289 FPS

SMP+4BSD 1 idle
101798 frames in 5.0 seconds = 20359.526 FPS
102732 frames in 5.0 seconds = 20546.202 FPS
102619 frames in 5.0 seconds = 20523.692 FPS
102788 frames in 5.0 seconds = 20557.526 FPS

SMP+4BSD 2 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

SMP+4BSD 3 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

SMP+4BSD 4 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

UP+4BSD 0 idle
102864 frames in 5.0 seconds = 20572.665 FPS
102569 frames in 5.0 seconds = 20513.792 FPS
102559 frames in 5.0 seconds = 20511.775 FPS
102333 frames in 5.0 seconds = 20466.543 FPS

UP+4BSD 1 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

UP+4BSD 2 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

UP+4BSD 3 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

UP+4BSD 4 idle
6 frames in 5.0 seconds = 1.193 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS
5 frames in 5.0 seconds = 0.994 FPS

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGscadHNAJ/fLbfrkRAnDOAJ9yipwexiBUrZbS3RJ5R0YDZyn4pACfS/Od
gMVwrhA3NYlaQkPNOaEZ7S8=
=98Za
-----END PGP SIGNATURE-----

From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 00:52:40 2007
Date: Thu, 2 Aug 2007 17:55:12 -0700 (PDT)
From: Jeff Roberson
To: Kris Kennaway
Cc: arch@freebsd.org, Alfred Perlstein
Subject: Re: Fine grain select locking.

On Sun, 29 Jul 2007, Kris Kennaway wrote:

> On Wed, Jul 04, 2007 at 11:47:35AM -0700, Jeff Roberson wrote:
>
>>>> http://people.freebsd.org/~jeff/select2.diff
>>>
>>> Jeff, I understand you're trying to speed up mysql micro benchmarks,
>>> but have you done any benchmarking on large select operations?
>>
>> I don't know that I'd call mysql a micro-benchmark.  This patch also
>> didn't help there as much as I had hoped and I'm still trying to
>> understand why.
>
> Here is a graph of the performance effects on sysbench with this
> patch:
>
> http://obsecurity.dyndns.org/select.png
>

Kris,

Thanks very much for looking into this.
The pgsql numbers and lock profiling output seem to verify the
concurrency of the patch.  Hopefully this and db's microbenchmark are
enough to convince people this should go in after 7.0 branches.

With regard to the proc locking: I believe we need to make separate
locks for signal processing, the various limits, etc.  I think if we
add one or two more locks per-proc we won't need to do rwlocks and we
can fix most of this contention.

I believe file descriptor locking is the place where we are most
lacking.  The new sx helped tremendously.  However, this is still going
to be a scalability limiter.  I have looked into both Linux's and
Solaris's solutions to this problem.  Briefly, Linux uses RCU to protect
the list, which is close to ideal as this is certainly a read-heavy
workload.  Solaris on the other hand uses the actual file lock to
protect the descriptor slot.  So they fetch the file pointer, lock it,
and then check to see if they lost a race with the slot being reassigned
while they were acquiring the lock.  This approach is perhaps better
than RCU in many cases except when the descriptor set is expanded.  Then
they have to lock every file in the set.

I hope we can hash out a good plan to resolve this for 8.0.  filedesc
and lockmgr are the biggest hitters on mysql writes.  I suspect this is
also the case for pgsql and likely other network server type programs.

Thanks,
Jeff

> mysql
> =====
>
> It appears that at higher loads most of the contention is now in
> userland, no longer within the kernel.  There is also significant
> contention on the proc lock.
>
> Peak load (8 clients):
>    31   1137    169   1280  0  0   126    56 kern/kern_umtx.c:325 (sleep mutex:umtxql)
>     7   5688    554   4750  1  0   722   294 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
>     3   2335   1155   8732  0  0   669   571 kern/sys_generic.c:955 (sleep mutex:process lock)
>
> Higher load (20 clients):
>    88   6714    807   4763  1  0   754   276 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
>     3   2342   1228   7656  0  0   650   550 kern/sys_generic.c:955 (sleep mutex:process lock)
>     2    431   1299   1023  0  1    53    77 kern/kern_sig.c:996 (sleep mutex:process lock)
>     7    371   3545    635  0  5    58   131 kern/kern_mutex.c:141 (sleep mutex:umtxql)
>    70   5085   7433   3184  1  2   507   377 kern/kern_umtx.c:325 (sleep mutex:umtxql)
>
> I looked in the past at replacing the proc mutex with a rwlock and
> looking for places where shared locking could be used, but at least as
> the code is written currently I don't think any of those apply here.
>
> With the select locking patch overall mysql performance does not
> change much, but the total amount of time spent waiting for locks is
> greatly reduced (by about an order of magnitude), so system time
> should be lower with these changes (unless it's counterbalanced by
> greater time spent doing other things than lock waits).  I have not
> measured this though.
>
> We might be able to obtain some further improvement at higher loads by
> improving the contention behaviour of umtx objects (the kernel part of
> the libthr pthread mutex).  I suspect most of the problem is in mysql
> itself.  What we need is a userland counterpart of lock profiling, for
> profiling contention on pthread mutexes.
>
> pgsql
> =====
>
> Clear performance benefit from select locking, on the order of 5-10%.
> Reduction in lock wait time is about *two* orders of magnitude.
>
> Peak load:
>
>     5   2942   1437   2607  1  0   818   446 kern/subr_turnstile.c:546 (spin mutex:turnstile chain)
>    13   9250   1474   9572  0  0  1405   585 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
>    39   3019   2856   9458  0  0  1613  1131 kern/sys_generic.c:955 (sleep mutex:process lock)
>   120   5540   5494  16954  0  0  3536  2017 kern/kern_sig.c:996 (sleep mutex:process lock)
>
> 20 clients:
>     8   5336   3506   4338  1  0  1610   910 kern/subr_turnstile.c:546 (spin mutex:turnstile chain)
>     2   2828   4261   8787  0  0  1749  1298 kern/sys_generic.c:955 (sleep mutex:process lock)
>    56  10717   4568   8968  1  0  3092  1382 kern/subr_sleepqueue.c:388 (sleep mutex:process lock)
>     4   5390   7646  15766  0  0  3325  2568 kern/kern_sig.c:996 (sleep mutex:process lock)
>    79   9423  70619  33525  0  2   154    92 kern/uipc_syscalls.c:135 (sleep mutex:sleep mtxpool)
>
> i.e. much the same lock workload as mysql except for no umtx
> contention (pgsql is not threaded), and huge wait time (but not much
> contention) on the following:
>
> static int
> getsock(struct filedesc *fdp, int fd, struct file **fpp, u_int *fflagp)
> {
> ...
>
>         if (fdp == NULL)
>                 error = EBADF;
>         else {
>                 FILEDESC_SLOCK(fdp);
>                 fp = fget_locked(fdp, fd);
>                 if (fp == NULL)
>                         error = EBADF;
>                 else if (fp->f_type != DTYPE_SOCKET) {
>                         fp = NULL;
>                         error = ENOTSOCK;
>                 } else {
>                         fhold(fp);
> ...
>
> }
>
> I think this is mostly because it's called so often, with small
> incremental but large total cost.
> > Kris > From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 01:15:53 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F26E16A41F for ; Fri, 3 Aug 2007 01:15:53 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 178E613C4DA for ; Fri, 3 Aug 2007 01:15:53 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l731FeaR032258 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Thu, 2 Aug 2007 21:15:43 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Thu, 2 Aug 2007 18:18:15 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Niki Denev In-Reply-To: <46B1C69D.6070503@cytexbg.com> Message-ID: <20070802181239.O561@10.0.0.1> References: <46B1C69D.6070503@cytexbg.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Ivan Voras , freebsd-arch@freebsd.org Subject: Re: On schedulers X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Aug 2007 01:15:53 -0000 On Thu, 2 Aug 2007, Niki Denev wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Ivan Voras wrote: >> Hi, >> >> I've just stumbled on the LKML (via Slashdot) discussion on schedulers, >> nicely compiled here: http://kerneltrap.org/node/14023 . I don't think >> 3D performance is of concern for FreeBSD, but I'm wondering how would >> ULE and the latest incarnation of 4BSD fare in that discussion? 
>> >> Specifically, I'm interested in this result in Linux: >> >> 2.6.22-ck1 2.6.22-cfs-v19 >> ------------------------ ------------------------ >> quake + 0 loops | 41 fps quake + 0 loops | 41 fps >> quake + 1 loop | 3 fps quake + 1 loop | 41 fps >> quake + 2 loops | 2 fps quake + 2 loops | 32 fps >> quake + 3 loops | 1 fps quake + 3 loops | 24 fps >> quake + 4 loops | 0 fps quake + 4 loops | 20 fps >> quake + 5 loops | 0 fps quake + 5 loops | 16 fps >> >> (for the impatient: the benchmark is of running quake with several "idle >> loop" processes, presumably on a single CPU machine. On the left is the >> SD (staircase deadline) and on the right is the CF (completely fair) >> scheduler). >> >> How would this behave on FreeBSD? Is there a paper on how ULE should >> behave / is modeled? >> > > This is on a Intel C2D E6420 with 2G of ram, > Nvidia 7900GTO (nvidia-driver-1.0.9746) > running xorg-server-6.9.0_5 on a recent -CURRENT : > > idle is basicaly a small C program with just for(;;); in its main() > function. > I've run glxgears for 20 secs each (to get four reports) > Both idle and glxgears are run as normal user. Can you tell me what % cpu is going to each process during this time? These results are surprising. For workloads like this ULE should essentially implement a 'fair' scheduling policy. However, so should 4BSD. So I'm not yet sure why the slowdown wouldn't be relative to the number of running threads. Also, 'vmstat 1' output would be useful. Can I recreate this test without a fancy video card? I have the following in my laptop: vgapci0@pci1:0:0: class=0x030000 card=0x054f1014 chip=0x4e541002 rev=0x80 hdr=0x00 vendor = 'ATI Technologies Inc.' 
device = 'Radeon Mobility M10 NT (RV350-WS)' Thanks, Jeff > > SMP+ULE 0 idle > 101446 frames in 5.0 seconds = 20289.099 FPS > 101590 frames in 5.0 seconds = 20317.975 FPS > 101701 frames in 5.0 seconds = 20340.037 FPS > 101489 frames in 5.0 seconds = 20297.670 FPS > > SMP+ULE 1 idle > 97430 frames in 5.0 seconds = 19485.840 FPS > 102176 frames in 5.0 seconds = 20435.017 FPS > 102402 frames in 5.0 seconds = 20480.318 FPS > 102430 frames in 5.0 seconds = 20485.865 FPS > > SMP+ULE 2 idle > 30 frames in 5.0 seconds = 5.978 FPS > 31 frames in 5.0 seconds = 6.182 FPS > 31 frames in 5.0 seconds = 6.172 FPS > 30 frames in 5.2 seconds = 5.744 FPS > > SMP+ULE 3 idle > 29 frames in 5.2 seconds = 5.631 FPS > 30 frames in 5.0 seconds = 5.952 FPS > 31 frames in 5.1 seconds = 6.054 FPS > 32 frames in 5.2 seconds = 6.213 FPS > > SMP+ULE 4 idle > 21 frames in 5.1 seconds = 4.151 FPS > 20 frames in 5.1 seconds = 3.942 FPS > 21 frames in 5.2 seconds = 4.066 FPS > 20 frames in 5.2 seconds = 3.841 FPS > > UP+ULE 0 idle > 102152 frames in 5.0 seconds = 20430.299 FPS > 102572 frames in 5.0 seconds = 20514.236 FPS > 102533 frames in 5.0 seconds = 20506.522 FPS > 102129 frames in 5.0 seconds = 20425.654 FPS > > UP+ULE 1 idle > 21 frames in 5.1 seconds = 4.158 FPS > 24 frames in 5.2 seconds = 4.624 FPS > 26 frames in 5.0 seconds = 5.153 FPS > 28 frames in 5.0 seconds = 5.586 FPS > > UP+ULE 2 idle > 21 frames in 5.1 seconds = 4.093 FPS > 21 frames in 5.1 seconds = 4.093 FPS > 21 frames in 5.1 seconds = 4.115 FPS > 21 frames in 5.1 seconds = 4.115 FPS > > UP+ULE 3 idle > 20 frames in 5.3 seconds = 3.804 FPS > 19 frames in 5.2 seconds = 3.624 FPS > 19 frames in 5.2 seconds = 3.619 FPS > 19 frames in 5.3 seconds = 3.612 FPS > > UP+ULE 4 idle > 19 frames in 5.3 seconds = 3.600 FPS > 17 frames in 5.0 seconds = 3.388 FPS > 17 frames in 5.0 seconds = 3.393 FPS > 17 frames in 5.0 seconds = 3.380 FPS > > SMP+4BSD 0 idle > 102440 frames in 5.0 seconds = 20487.893 FPS > 102285 frames in 5.0 seconds = 
20456.848 FPS > 102276 frames in 5.0 seconds = 20455.065 FPS > 102312 frames in 5.0 seconds = 20462.289 FPS > > SMP+4BSD 1 idle > 101798 frames in 5.0 seconds = 20359.526 FPS > 102732 frames in 5.0 seconds = 20546.202 FPS > 102619 frames in 5.0 seconds = 20523.692 FPS > 102788 frames in 5.0 seconds = 20557.526 FPS > > SMP+4BSD 2 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > > SMP+4BSD 3 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > > SMP+4BSD 4 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > > UP+4BSD 0 idle > 102864 frames in 5.0 seconds = 20572.665 FPS > 102569 frames in 5.0 seconds = 20513.792 FPS > 102559 frames in 5.0 seconds = 20511.775 FPS > 102333 frames in 5.0 seconds = 20466.543 FPS > > UP+4BSD 1 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > > UP+4BSD 2 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > > UP+4BSD 3 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > > UP+4BSD 4 idle > 6 frames in 5.0 seconds = 1.193 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > 5 frames in 5.0 seconds = 0.994 FPS > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.7 (FreeBSD) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org > > iD8DBQFGscadHNAJ/fLbfrkRAnDOAJ9yipwexiBUrZbS3RJ5R0YDZyn4pACfS/Od > gMVwrhA3NYlaQkPNOaEZ7S8= > =98Za > -----END PGP SIGNATURE----- > 
_______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 01:45:18 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A351116A417 for ; Fri, 3 Aug 2007 01:45:18 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 9324213C457 for ; Fri, 3 Aug 2007 01:45:18 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id 9F62E1A4D86; Thu, 2 Aug 2007 18:44:45 -0700 (PDT) Date: Thu, 2 Aug 2007 18:44:45 -0700 From: Alfred Perlstein To: Jeff Roberson Message-ID: <20070803014445.GS92956@elvis.mu.org> References: <20070702230728.E552@10.0.0.1> <20070703181242.T552@10.0.0.1> <20070704105525.GU45894@elvis.mu.org> <20070704114005.X552@10.0.0.1> <20070729180722.GB85196@rot26.obsecurity.org> <20070802174819.S561@10.0.0.1> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070802174819.S561@10.0.0.1> User-Agent: Mutt/1.4.2.3i Cc: arch@freebsd.org, Kris Kennaway Subject: Re: Fine grain select locking. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Aug 2007 01:45:18 -0000 * Jeff Roberson [070802 17:52] wrote: > > I believe filedescriptor locking is the place where we are most lacking. > The new sx helped tremendously. However, this is still going to be a > scalability limiter. I have looked into both linux and solaris's solution > to this problem. 
Briefly, linux uses RCU to protect the list, which is > close to ideal as this is certainly a read heavy workload. Solaris on the > other hand uses the actual file lock to protect the descriptor slot. So > they fetch the file pointer, lock it, and then check to see if they lost a > race with the slot being reassigned while they were acquiring the lock. > This approach is perhaps better than rcu in many cases except when the > descriptor set is expanded. Then they have to lock every file in the set. Certainly this is an extreme edge case... ? I could see it happening if we started with low limits, but perhaps by keeping counters/stats we could tell people how to tune their systems, or even autotune them. -Alfred From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 02:03:56 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 17DD716A41B; Fri, 3 Aug 2007 02:03:56 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id BC0A313C480; Fri, 3 Aug 2007 02:03:55 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l7323rr4042573 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Thu, 2 Aug 2007 22:03:54 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Thu, 2 Aug 2007 19:06:28 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Alfred Perlstein In-Reply-To: <20070803014445.GS92956@elvis.mu.org> Message-ID: <20070802190033.J561@10.0.0.1> References: <20070702230728.E552@10.0.0.1> <20070703181242.T552@10.0.0.1> <20070704105525.GU45894@elvis.mu.org> <20070704114005.X552@10.0.0.1> <20070729180722.GB85196@rot26.obsecurity.org> 
<20070802174819.S561@10.0.0.1> <20070803014445.GS92956@elvis.mu.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: arch@freebsd.org, Kris Kennaway
Subject: Re: Fine grain select locking.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 03 Aug 2007 02:03:56 -0000

On Thu, 2 Aug 2007, Alfred Perlstein wrote:

> * Jeff Roberson [070802 17:52] wrote:
>>
>> I believe filedescriptor locking is the place where we are most lacking.
>> The new sx helped tremendously. However, this is still going to be a
>> scalability limiter. I have looked into both linux and solaris's solution
>> to this problem. Briefly, linux uses RCU to protect the list, which is
>> close to ideal as this is certainly a read heavy workload. Solaris on the
>> other hand uses the actual file lock to protect the descriptor slot. So
>> they fetch the file pointer, lock it, and then check to see if they lost a
>> race with the slot being reassigned while they were acquiring the lock.
>> This approach is perhaps better than rcu in many cases except when the
>> descriptor set is expanded. Then they have to lock every file in the set.
>
> Certainly this is an extreme edge case... ?

Well, that may be true, yes. However, there are other problems with this
scheme. For example, flag settings could be done entirely with cmpset,
without using a lock at all. In most cases we're just setting a bit,
which can be done with atomic_set. When we're doing multiple operations
we could compute the value and attempt to set it in a loop. So we can
totally eliminate locking the descriptor here.

We also could use atomic ops to protect the file descriptor reference
count. This would eliminate another use of the FILE_LOCK().
I'm not sure if it's possible to merge this with an approach that uses the
FILE_LOCK() to protect the descriptor table, although I've not thought it
all the way through.

If the ref count and flags were done with atomics, the main consumer of
FILE_LOCK would actually be the unix domain socket garbage collection
code. How's that for old unix baggage? Do many programs actually pass
around descriptors these days? inetd? others? It might be worth it to
lock this separately from the file lock.

Anyway, these things need to be explored for 8.0. With more cores and
more multi-threaded applications, file desc locking is one of the first
points of contention, as evidenced by the lock profiles and the
sophistication of the solutions in other kernels.

Jeff

>
> I could see it happening if we started with low limits, but perhaps
> by keeping counters/stats we could tell people how to tune their
> systems, or even autotune them.
>
> -Alfred
>

From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 07:18:08 2007
Return-Path:
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 501DF16A469; Fri, 3 Aug 2007 07:18:08 +0000 (UTC) (envelope-from rwatson@FreeBSD.org)
Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 114F713C46A; Fri, 3 Aug 2007 07:18:08 +0000 (UTC) (envelope-from rwatson@FreeBSD.org)
Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 28ED6483AC; Fri, 3 Aug 2007 03:18:07 -0400 (EDT)
Date: Fri, 3 Aug 2007 08:18:07 +0100 (BST)
From: Robert Watson
X-X-Sender: robert@fledge.watson.org
To: Jeff Roberson
In-Reply-To: <20070802174819.S561@10.0.0.1>
Message-ID: <20070803081520.B18327@fledge.watson.org>
References: <20070702230728.E552@10.0.0.1> <20070703181242.T552@10.0.0.1> <20070704105525.GU45894@elvis.mu.org> <20070704114005.X552@10.0.0.1>
<20070729180722.GB85196@rot26.obsecurity.org> <20070802174819.S561@10.0.0.1>
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED; BOUNDARY="0-101292342-1186125487=:18327"
Cc: arch@freebsd.org, Alfred Perlstein , Kris Kennaway
Subject: Re: Fine grain select locking.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 03 Aug 2007 07:18:08 -0000

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

--0-101292342-1186125487=:18327
Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE

On Thu, 2 Aug 2007, Jeff Roberson wrote:

> I hope we can hash out a good plan to resolve this for 8.0. filedesc and
> lockmgr are the biggest hitters on mysql writes. I suspect this is also the
> case for pgsql and likely other network server type programs.

Actually, I'd scope this claim a bit -- it's certainly going to be an issue
for MySQL and any *threaded* network server type programs. As process based
parallelism doesn't share the file descriptor table, it shouldn't affect
non-threaded workloads such as Apache and pgSQL. This is one of the
interesting contradictions with threads -- developers are taught that they are
lightweight, and they are in the sense that they cost less to have many of and
support very easy IPC, but they are also more likely to contend kernel locks
as they involve more shared data structures being referenced. In contrast,
process-driven parallelism allows a much higher level of independence between
executing threads of control, allowing IPC to be explicit rather than
continuous.
Anything that reduces the overhead of filedesc locking will
improve the performance of all apps, but reducing the contention on filedesc
will only help applications sharing the file descriptor array between many
threads/processes.

Robert N M Watson
Computer Laboratory
University of Cambridge
--0-101292342-1186125487=:18327--

From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 07:31:29 2007
Return-Path:
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E4EB16A41B for ; Fri, 3 Aug 2007 07:31:29 +0000 (UTC) (envelope-from asmrookie@gmail.com)
Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.186]) by mx1.freebsd.org (Postfix) with ESMTP id 0E89613C45A for ; Fri, 3 Aug 2007 07:31:28 +0000 (UTC) (envelope-from asmrookie@gmail.com)
Received: by nf-out-0910.google.com with SMTP id b2so203787nfb for ; Fri, 03 Aug 2007 00:31:28 -0700 (PDT)
DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=UAr/oCQexNU/d2a0ZPLuZKvTuq+dcbGMgQH4h9XebudA+0NzeKZKq5Etx3q1SnCsfAjFTdYBDDuhAuc0DNgSj9YaJmFVxWi0vkd416npzucsun4QFXSzcvtkpex8Ap2F9yx5qetd3/heA8YpsnF5AT7ignmL/g89b0KKe2b4qkg=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=U6wyr/2TCzlDxP10FrvN5xCfN1d0bB/fD1+W0QXNJ11RenN3qTML3F7GmhMOEgcwHD6p/O1wSklIaH0SI2mv6wGsIoXBLPDjruFMtC2zEIm91PbyKc8/KPDtZmuqE3QNlBsgpbnbRT+Ad72JjMCWz21YFKsaCAILVqABxpCn4XM=
Received: by 10.78.201.10 with SMTP id y10mr756074huf.1186126287739; Fri, 03 Aug 2007 00:31:27 -0700 (PDT)
Received: from ?172.31.5.25?
( [89.97.252.178]) by mx.google.com with ESMTPS id 36sm1126164huc.2007.08.03.00.31.27 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 03 Aug 2007 00:31:27 -0700 (PDT) Message-ID: <46B2D993.6070409@FreeBSD.org> Date: Fri, 03 Aug 2007 09:30:27 +0200 From: Attilio Rao User-Agent: Thunderbird 1.5 (X11/20060526) MIME-Version: 1.0 To: Jeff Roberson References: <20070702230728.E552@10.0.0.1> <20070703181242.T552@10.0.0.1> <20070704105525.GU45894@elvis.mu.org> <20070704114005.X552@10.0.0.1> <20070729180722.GB85196@rot26.obsecurity.org> <20070802174819.S561@10.0.0.1> <20070803014445.GS92956@elvis.mu.org> <20070802190033.J561@10.0.0.1> In-Reply-To: <20070802190033.J561@10.0.0.1> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: Attilio Rao Cc: arch@freebsd.org, Alfred Perlstein , Kris Kennaway Subject: Re: Fine grain select locking. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Aug 2007 07:31:29 -0000 Jeff Roberson wrote: > > On Thu, 2 Aug 2007, Alfred Perlstein wrote: > >> * Jeff Roberson [070802 17:52] wrote: >>> >>> I believe filedescriptor locking is the place where we are most lacking. >>> The new sx helped tremendously. However, this is still going to be a >>> scalability limiter. I have looked into both linux and solaris's >>> solution >>> to this problem. Briefly, linux uses RCU to protect the list, which is >>> close to ideal as this is certainly a read heavy workload. Solaris >>> on the >>> other hand uses the actual file lock to protect the descriptor slot. So >>> they fetch the file pointer, lock it, and then check to see if they >>> lost a >>> race with the slot being reassigned while they were acquiring the lock. 
>>> This approach is perhaps better than rcu in many cases except when the
>>> descriptor set is expanded. Then they have to lock every file in the
>>> set.
>>
>> Certainly this is an extreme edge case... ?
>
> Well that may be true, yes. However, there are other problems with this
> scheme. For example, flags settings could be done entirely with cmpset,
> without using a lock at all. In most cases we're just setting a bit
> which can be done with atomic_set. When we're doing multiple operations
> we could compute the value and attempt to set it in a loop. So we can
> totally eliminate locking the descriptor here.
>
> We also could use atomic ops to protect the file descriptor reference
> count. This would eliminate another use of the FILE_LOCK(). I'm not
> sure if it's possible to merge this with an approach that uses the
> FILE_LOCK() to protect the descriptor table. Although I've not thought
> it all the way through.
>
> If the ref count and flags were done with atomics the main consumer of
> FILE_LOCK would actually be the unix domain socket garbage collection
> code. How's that for old unix baggage. Do many programs actually pass
> around descriptors these days? inetd? others? It might be worth it to
> lock this separately from the file lock.

I'm sure I've already implemented it, but later I realized that there is a
race with the p_fd field (if I understood you correctly, you are referring
to the fdesc_mtx here), so we should probably sort out those paths first.
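
For readers following the thread, the lock-free flag update and atomic
reference count discussed in the quoted text can be sketched roughly as
below. This is a minimal userland illustration using C11 atomics, not the
kernel code and not the select2.diff patch. The struct xfile type, its
fields, and the xfile_* functions are hypothetical names made up for the
example. In the kernel the equivalent operations would presumably go
through the atomic(9) primitives such as atomic_set_int() and
atomic_cmpset_int() rather than <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; names are illustrative only. */
struct xfile {
	atomic_uint xf_flag;	/* flag bits, updated without a lock */
	atomic_uint xf_count;	/* reference count */
};

/* Common case: set a bit with a single atomic OR. */
static void
xfile_setflag(struct xfile *fp, unsigned bit)
{
	atomic_fetch_or(&fp->xf_flag, bit);
}

/*
 * Multiple operations: compute the new value from the old one and try
 * to install it with compare-and-swap, retrying if another thread
 * changed the flags in the meantime.
 */
static void
xfile_chflags(struct xfile *fp, unsigned clear, unsigned set)
{
	unsigned old, new;

	old = atomic_load(&fp->xf_flag);
	do {
		new = (old & ~clear) | set;
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&fp->xf_flag, &old, new));
}

/* Take a reference without holding any lock. */
static void
xfile_hold(struct xfile *fp)
{
	atomic_fetch_add(&fp->xf_count, 1);
}

/* Drop a reference; returns true if this was the last one. */
static bool
xfile_drop(struct xfile *fp)
{
	unsigned old = atomic_fetch_sub(&fp->xf_count, 1);

	assert(old > 0);	/* dropping an unreferenced file is a bug */
	return (old == 1);
}
```

A caller that sees xfile_drop() return true would be responsible for
tearing the object down. The sketch deliberately leaves out the harder
part raised in this thread, namely making the descriptor table lookup
itself safe against a concurrent close or table expansion.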
Thanks, Attilio From owner-freebsd-arch@FreeBSD.ORG Fri Aug 3 10:49:20 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13CB216A417 for ; Fri, 3 Aug 2007 10:49:20 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id C6D5813C442 for ; Fri, 3 Aug 2007 10:49:19 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l73An45Z096845 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Fri, 3 Aug 2007 06:49:06 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Fri, 3 Aug 2007 03:51:39 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Niki Denev In-Reply-To: <20070802181239.O561@10.0.0.1> Message-ID: <20070803034628.U561@10.0.0.1> References: <46B1C69D.6070503@cytexbg.com> <20070802181239.O561@10.0.0.1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Ivan Voras , freebsd-arch@freebsd.org Subject: Re: On schedulers X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Aug 2007 10:49:20 -0000 On Thu, 2 Aug 2007, Jeff Roberson wrote: > On Thu, 2 Aug 2007, Niki Denev wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> Ivan Voras wrote: >>> Hi, >>> >>> I've just stumbled on the LKML (via Slashdot) discussion on schedulers, >>> nicely compiled here: http://kerneltrap.org/node/14023 . 
I don't think >>> 3D performance is of concern for FreeBSD, but I'm wondering how would >>> ULE and the latest incarnation of 4BSD fare in that discussion? >>> >>> Specifically, I'm interested in this result in Linux: >>> >>> 2.6.22-ck1 2.6.22-cfs-v19 >>> ------------------------ ------------------------ >>> quake + 0 loops | 41 fps quake + 0 loops | 41 fps >>> quake + 1 loop | 3 fps quake + 1 loop | 41 fps >>> quake + 2 loops | 2 fps quake + 2 loops | 32 fps >>> quake + 3 loops | 1 fps quake + 3 loops | 24 fps >>> quake + 4 loops | 0 fps quake + 4 loops | 20 fps >>> quake + 5 loops | 0 fps quake + 5 loops | 16 fps >>> >>> (for the impatient: the benchmark is of running quake with several "idle >>> loop" processes, presumably on a single CPU machine. On the left is the >>> SD (staircase deadline) and on the right is the CF (completely fair) >>> scheduler). >>> >>> How would this behave on FreeBSD? Is there a paper on how ULE should >>> behave / is modeled? >>> >> >> This is on a Intel C2D E6420 with 2G of ram, >> Nvidia 7900GTO (nvidia-driver-1.0.9746) >> running xorg-server-6.9.0_5 on a recent -CURRENT : >> >> idle is basicaly a small C program with just for(;;); in its main() >> function. >> I've run glxgears for 20 secs each (to get four reports) >> Both idle and glxgears are run as normal user. > > Can you tell me what % cpu is going to each process during this time? These > results are surprising. For workloads like this ULE should essentially > implement a 'fair' scheduling policy. However, so should 4BSD. So I'm not > yet sure why the slowdown wouldn't be relative to the number of running > threads. Also, 'vmstat 1' output would be useful. > > Can I recreate this test without a fancy video card? I have the following in > my laptop: > > vgapci0@pci1:0:0: class=0x030000 card=0x054f1014 chip=0x4e541002 > rev=0x80 > hdr=0x00 > vendor = 'ATI Technologies Inc.' 
> device = 'Radeon Mobility M10 NT (RV350-WS)'

Well, this must behave very differently when you have hardware
acceleration. I, for example, see ~288 fps with no other CPU hogs running.
This consumes 100% of the CPU. With one CPU hog running I see ~148 fps.
With two I see ~100 fps. It's also worth noting that at no time does
interactivity suffer.

My 'idle' is called loop.sh as I don't think it's particularly idle. ;-)
Here it is:

	while true; do echo -n; done;

This makes no system calls and spends all of its time in user space.

Thanks,
Jeff

>
>
> Thanks,
> Jeff
>
>>
>> SMP+ULE 0 idle
>> 101446 frames in 5.0 seconds = 20289.099 FPS
>> 101590 frames in 5.0 seconds = 20317.975 FPS
>> 101701 frames in 5.0 seconds = 20340.037 FPS
>> 101489 frames in 5.0 seconds = 20297.670 FPS
>>
>> SMP+ULE 1 idle
>> 97430 frames in 5.0 seconds = 19485.840 FPS
>> 102176 frames in 5.0 seconds = 20435.017 FPS
>> 102402 frames in 5.0 seconds = 20480.318 FPS
>> 102430 frames in 5.0 seconds = 20485.865 FPS
>>
>> SMP+ULE 2 idle
>> 30 frames in 5.0 seconds = 5.978 FPS
>> 31 frames in 5.0 seconds = 6.182 FPS
>> 31 frames in 5.0 seconds = 6.172 FPS
>> 30 frames in 5.2 seconds = 5.744 FPS
>>
>> SMP+ULE 3 idle
>> 29 frames in 5.2 seconds = 5.631 FPS
>> 30 frames in 5.0 seconds = 5.952 FPS
>> 31 frames in 5.1 seconds = 6.054 FPS
>> 32 frames in 5.2 seconds = 6.213 FPS
>>
>> SMP+ULE 4 idle
>> 21 frames in 5.1 seconds = 4.151 FPS
>> 20 frames in 5.1 seconds = 3.942 FPS
>> 21 frames in 5.2 seconds = 4.066 FPS
>> 20 frames in 5.2 seconds = 3.841 FPS
>>
>> UP+ULE 0 idle
>> 102152 frames in 5.0 seconds = 20430.299 FPS
>> 102572 frames in 5.0 seconds = 20514.236 FPS
>> 102533 frames in 5.0 seconds = 20506.522 FPS
>> 102129 frames in 5.0 seconds = 20425.654 FPS
>>
>> UP+ULE 1 idle
>> 21 frames in 5.1 seconds = 4.158 FPS
>> 24 frames in 5.2 seconds = 4.624 FPS
>> 26 frames in 5.0 seconds = 5.153 FPS
>> 28 frames in 5.0 seconds = 5.586 FPS
>>
>> UP+ULE 2 idle
>> 21 frames in 5.1 seconds = 4.093 FPS
>> 21 frames in 5.1
seconds = 4.093 FPS >> 21 frames in 5.1 seconds = 4.115 FPS >> 21 frames in 5.1 seconds = 4.115 FPS >> >> UP+ULE 3 idle >> 20 frames in 5.3 seconds = 3.804 FPS >> 19 frames in 5.2 seconds = 3.624 FPS >> 19 frames in 5.2 seconds = 3.619 FPS >> 19 frames in 5.3 seconds = 3.612 FPS >> >> UP+ULE 4 idle >> 19 frames in 5.3 seconds = 3.600 FPS >> 17 frames in 5.0 seconds = 3.388 FPS >> 17 frames in 5.0 seconds = 3.393 FPS >> 17 frames in 5.0 seconds = 3.380 FPS >> >> SMP+4BSD 0 idle >> 102440 frames in 5.0 seconds = 20487.893 FPS >> 102285 frames in 5.0 seconds = 20456.848 FPS >> 102276 frames in 5.0 seconds = 20455.065 FPS >> 102312 frames in 5.0 seconds = 20462.289 FPS >> >> SMP+4BSD 1 idle >> 101798 frames in 5.0 seconds = 20359.526 FPS >> 102732 frames in 5.0 seconds = 20546.202 FPS >> 102619 frames in 5.0 seconds = 20523.692 FPS >> 102788 frames in 5.0 seconds = 20557.526 FPS >> >> SMP+4BSD 2 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> >> SMP+4BSD 3 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> >> SMP+4BSD 4 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> >> UP+4BSD 0 idle >> 102864 frames in 5.0 seconds = 20572.665 FPS >> 102569 frames in 5.0 seconds = 20513.792 FPS >> 102559 frames in 5.0 seconds = 20511.775 FPS >> 102333 frames in 5.0 seconds = 20466.543 FPS >> >> UP+4BSD 1 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> >> UP+4BSD 2 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> >> 
UP+4BSD 3 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> >> UP+4BSD 4 idle >> 6 frames in 5.0 seconds = 1.193 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> 5 frames in 5.0 seconds = 0.994 FPS >> -----BEGIN PGP SIGNATURE----- >> Version: GnuPG v1.4.7 (FreeBSD) >> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org >> >> iD8DBQFGscadHNAJ/fLbfrkRAnDOAJ9yipwexiBUrZbS3RJ5R0YDZyn4pACfS/Od >> gMVwrhA3NYlaQkPNOaEZ7S8= >> =98Za >> -----END PGP SIGNATURE----- >> _______________________________________________ >> freebsd-arch@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-arch >> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" >> > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Sat Aug 4 08:05:36 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9C79416A421; Sat, 4 Aug 2007 08:05:36 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id A585613C46C; Sat, 4 Aug 2007 08:05:36 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from rot26.obsecurity.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id 8349F1A4D7E; Sat, 4 Aug 2007 01:04:57 -0700 (PDT) Received: by rot26.obsecurity.org (Postfix, from userid 1001) id 52514C0F6; Sat, 4 Aug 2007 04:05:35 -0400 (EDT) Date: Sat, 4 Aug 2007 04:05:35 -0400 From: Kris Kennaway To: performance@FreeBSD.org Message-ID: <20070804080535.GA3952@rot26.obsecurity.org> Mime-Version: 
1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="W/nzBZO5zC0uMSeA" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: arch@FreeBSD.org Subject: read-write SQL performance X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 04 Aug 2007 08:05:36 -0000 --W/nzBZO5zC0uMSeA Content-Type: text/plain; charset=us-ascii Content-Disposition: inline I did some benchmarking of sysbench in read-write mode (previous tests have focused on read-only mode). The reason for this is that the disk hardware in my 8-core test system is slow (single disk) and is too easily saturated. In fact mysql and pgsql have identical performance when writing to disk. In other words, I seem to be mostly benchmarking the disk performance and not database or kernel performance. Faster disk hardware is necessary to explore database performance differences or kernel bottlenecks. An upper bound on possible read-write performance comes from using a memory disk instead of physical disk hardware. I replicated the databases onto a suitably large (2gb) tmpfs and reran the tests together with some mutex profiling. Results are here: http://obsecurity.dyndns.org/sysbench-write.png There are a couple of interesting features. mysql has better peak performance than pgsql, but then quickly falls in the toilet. Profiling indicates that at peak there is some contention on lockmgr locks and the proc lock, but most of the contention is in userland (i.e. within mysql itself). At higher loads the bottleneck is overwhelmingly within mysql (and the system is actually 90-100% idle). This seems to be a serious scaling problem within mysql. Peak pgsql performance is lower than mysql, but there is comparatively little degradation at higher loads. 
Profiling shows that the dominant
bottleneck at all workloads is lockmgr.

Fortunately there is a lockmgr rewrite in progress by Attilio for SoC,
so there is great scope for performance improvements to pgsql.
Significant mysql performance improvements may require fundamental
architectural work by the mysql developers.

Kris

--W/nzBZO5zC0uMSeA
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFGtDNPWry0BWjoQKURArsaAJ0TBneLAFm0JZl16qo/wpeCbxJ7NgCgwHcg
NIrOTxurHgldrKsD9BdiPLY=
=xNb8
-----END PGP SIGNATURE-----

--W/nzBZO5zC0uMSeA--

From owner-freebsd-arch@FreeBSD.ORG Sat Aug 4 20:45:26 2007
Return-Path:
Delivered-To: arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id A3B6C16A417
    for ; Sat, 4 Aug 2007 20:45:26 +0000 (UTC)
    (envelope-from almarrie@gmail.com)
Received: from an-out-0708.google.com (an-out-0708.google.com [209.85.132.250])
    by mx1.freebsd.org (Postfix) with ESMTP id 6E71513C45B
    for ; Sat, 4 Aug 2007 20:45:21 +0000 (UTC)
    (envelope-from almarrie@gmail.com)
Received: by an-out-0708.google.com with SMTP id c14so260684anc
    for ; Sat, 04 Aug 2007 13:45:20 -0700 (PDT)
DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta;
    h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
    b=AsqBLMkKarPCF4w4noo3JRiKG6gOt9WLhPHQUs5rc/vdCnmUn1Q/YQwWP7ZxbtAYnxO5BRET29/ZRuoK4NrWCTlwKjqFs4ufR8lbASYoLRZ4kP4K4ECDabCPUZIYoLd33tTkdD9GRLMGBK343rtTRAAhwUZ279t2QS1Uhl3oADI=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta;
    h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references;
    b=cNUSNTe9VsI+LASPQOokydb5SmjxGxEfeubbSwQjMsaXTcTx1PsbmCwrTumLpOCpO8NWNHVbbGmxVvEVG78bHrACKlwMopxRIhGsN7Tp2Yk6xLWRNF2fZypZgvFFTR/Zuu8yow17zJ3f73y577aYjEmUAF7EjeoPtKCGkcgxEiU=
Received: by 10.100.168.13 with SMTP id q13mr2369906ane.1186258840847;
    Sat, 04 Aug 2007 13:20:40 -0700 (PDT)
Received: by 10.100.9.14 with HTTP; Sat, 4 Aug 2007 13:20:40 -0700 (PDT)
Message-ID: <499c70c0708041320r1f51cb3qe6f05376cfb8a470@mail.gmail.com>
Date: Sat, 4 Aug 2007 23:20:40 +0300
From: "Abdullah Ibn Hamad Al-Marri"
To: "Kris Kennaway"
In-Reply-To: <20070804080535.GA3952@rot26.obsecurity.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <20070804080535.GA3952@rot26.obsecurity.org>
Cc: Greg 'groggy' Lehey , arch@freebsd.org, performance@freebsd.org
Subject: Re: read-write SQL performance
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 04 Aug 2007 20:45:26 -0000

On 8/4/07, Kris Kennaway wrote:
> I did some benchmarking of sysbench in read-write mode (previous tests
> have focused on read-only mode). The reason for this is that the disk
> hardware in my 8-core test system is slow (single disk) and is too
> easily saturated. In fact mysql and pgsql have identical performance
> when writing to disk. In other words, I seem to be mostly
> benchmarking the disk performance and not database or kernel
> performance.
>
> Faster disk hardware is necessary to explore database performance
> differences or kernel bottlenecks. An upper bound on possible
> read-write performance comes from using a memory disk instead of
> physical disk hardware. I replicated the databases onto a suitably
> large (2gb) tmpfs and reran the tests together with some mutex
> profiling.
>
> Results are here:
>
> http://obsecurity.dyndns.org/sysbench-write.png
>
> There are a couple of interesting features.
>
> mysql has better peak performance than pgsql, but then quickly falls
> in the toilet. Profiling indicates that at peak there is some
> contention on lockmgr locks and the proc lock, but most of the
> contention is in userland (i.e. within mysql itself). At higher loads
> the bottleneck is overwhelmingly within mysql (and the system is
> actually 90-100% idle). This seems to be a serious scaling problem
> within mysql.
>
> Peak pgsql performance is lower than mysql, but there is comparatively
> little degradation at higher loads. Profiling shows that the dominant
> bottleneck at all workloads is lockmgr.
>
> Fortunately there is a lockmgr rewrite in progress by Attilio for SoC,
> so there is great scope for performance improvements to pgsql.
> Significant mysql performance improvements may require fundamental
> architectural work by the mysql developers.
>
> Kris

Maybe Greg would be interested in the MySQL issues?

--
Regards,

-Abdullah Ibn Hamad Al-Marri
Arab Portal
http://www.WeArab.Net/