From owner-freebsd-arch@FreeBSD.ORG Mon Aug 13 21:08:15 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 95E7316A419; Mon, 13 Aug 2007 21:08:15 +0000 (UTC) (envelope-from lulf@stud.ntnu.no) Received: from signal.itea.ntnu.no (signal.itea.ntnu.no [129.241.190.231]) by mx1.freebsd.org (Postfix) with ESMTP id 5219413C457; Mon, 13 Aug 2007 21:08:15 +0000 (UTC) (envelope-from lulf@stud.ntnu.no) Received: from localhost (localhost [127.0.0.1]) by signal.itea.ntnu.no (Postfix) with ESMTP id A49AD34A20; Mon, 13 Aug 2007 22:40:37 +0200 (CEST) Received: from gaupe.stud.ntnu.no (gaupe.stud.ntnu.no [129.241.56.184]) by signal.itea.ntnu.no (Postfix) with ESMTP; Mon, 13 Aug 2007 22:40:35 +0200 (CEST) Received: by gaupe.stud.ntnu.no (Postfix, from userid 2312) id DB455D0046; Mon, 13 Aug 2007 22:40:35 +0200 (CEST) Date: Mon, 13 Aug 2007 22:40:35 +0200 From: Ulf Lilleengen To: freebsd-current@freebsd.org, freebsd-geom@freebsd.org, freebsd-arch@freebsd.org Message-ID: <20070813204035.GA5338@stud.ntnu.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.9i X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. Cc: le@FreeBSD.org Subject: Testers wanted: Gvinum patches of SoC 2007 work X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2007 21:08:15 -0000 Hi, It's here! The new and hopefully better gvinum patch. This is perhaps my final patch of the work I've done during GSoC 2007 (the patch will be updated when I fix a bug). This doesn't mean I'll stop work on gvinum, but rather that I'm not adding more features until this gets into the tree. But, for this to get into the tree, I need people to test it. 
_ALL_ reports on how it works are good.

So, what should you test?

* Plain normal use.
* Mirror synchronization, rebuild of raid-5 arrays, growing of raid-5 arrays, etc. These should work, and are probably the most tested, but some weird combinations that I have not foreseen might show up.
* Try weird combinations to check if it crashes.
* Test the mirror, concat, stripe and raid5 commands.
* Any issues with usability, e.g. whether the information gvinum gives you is good enough for you to understand what it's doing, or whether one way of doing things seems unnatural to you. I'd like to hear all of this, no matter how bikeshedish it might sound; it might be something that has been overlooked. These things are hard to test for the people who have been developing it, since we know how it "should" be used.

Before you head on, beware that the new gvinum does not give messages back to the userland gvinum (so you won't get them in your terminal). This is because it's not very simple to do with the new event system.

!! This means you'll have to look for messages in /var/log/messages !!

And thanks to everyone for the comments and help I've been getting during the summer.
-- Ulf Lilleengen From owner-freebsd-arch@FreeBSD.ORG Mon Aug 13 21:21:17 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DADD816A418; Mon, 13 Aug 2007 21:21:17 +0000 (UTC) (envelope-from lulf@stud.ntnu.no) Received: from fri.itea.ntnu.no (fri.itea.ntnu.no [129.241.7.60]) by mx1.freebsd.org (Postfix) with ESMTP id 94C5613C457; Mon, 13 Aug 2007 21:21:15 +0000 (UTC) (envelope-from lulf@stud.ntnu.no) Received: from localhost (localhost [127.0.0.1]) by fri.itea.ntnu.no (Postfix) with ESMTP id 10CA6868F; Mon, 13 Aug 2007 22:46:55 +0200 (CEST) Received: from gaupe.stud.ntnu.no (gaupe.stud.ntnu.no [129.241.56.184]) by fri.itea.ntnu.no (Postfix) with ESMTP; Mon, 13 Aug 2007 22:46:51 +0200 (CEST) Received: by gaupe.stud.ntnu.no (Postfix, from userid 2312) id DACCFD0046; Mon, 13 Aug 2007 22:46:51 +0200 (CEST) Date: Mon, 13 Aug 2007 22:46:51 +0200 From: Ulf Lilleengen To: freebsd-current@freebsd.org, freebsd-geom@freebsd.org, freebsd-arch@freebsd.org Message-ID: <20070813204651.GB5338@stud.ntnu.no> References: <20070813204035.GA5338@stud.ntnu.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070813204035.GA5338@stud.ntnu.no> User-Agent: Mutt/1.5.9i X-Content-Scanned: with sophos and spamassassin at mailgw.ntnu.no. Cc: le@FreeBSD.org Subject: Re: Testers wanted: Gvinum patches of SoC 2007 work X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Aug 2007 21:21:17 -0000 On man, aug 13, 2007 at 10:40:35 +0200, Ulf Lilleengen wrote: > Hi, > > It's here! The new and hopefully better gvinum patch. This is perhaps my final > patch of the work I've done during GSoC 2007 (the patch will be updated when I > fix a bug). 
This doesn't mean I'll stop work on gvinum, but rather that I'm not
> adding more features until this gets into the tree. But, for this to get into
> the tree, I need people to test it. _ALL_ reports on how it works is good.
> *SNIP*
>

Ehm,

And of course, the patches can be found here:

http://folk.ntnu.no/lulf/patches/freebsd/gvinum

There is one for releng_6* and one for current.

The patch is applied like this:

    # cd /usr/src && patch < /path/to/patch

Remember not to have the old gvinum module loaded. Then install the module:

    # cd /usr/src/sys/modules/geom/geom_vinum && make && make install clean

Then install the userland gvinum:

    # cd /usr/src/sbin/gvinum && make && make install clean

And you should be ready to go. The updated manpage in /usr/src/sbin/gvinum/gvinum.8 describes how growing is done. Just gzip it and put it in /usr/share/man/man8 to use it.

-- Ulf Lilleengen
From owner-freebsd-arch@FreeBSD.ORG Tue Aug 14 13:53:37 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CCF7916A421 for ; Tue, 14 Aug 2007 13:53:37 +0000 (UTC) (envelope-from flz@xbsd.org) Received: from postfix1-g20.free.fr (postfix1-g20.free.fr [212.27.60.42]) by mx1.freebsd.org (Postfix) with ESMTP id 6AB9413C49D for ; Tue, 14 Aug 2007 13:53:37 +0000 (UTC) (envelope-from flz@xbsd.org) Received: from smtp6-g19.free.fr (smtp6-g19.free.fr [212.27.42.36]) by postfix1-g20.free.fr (Postfix) with ESMTP id 50340189493A for ; Tue, 14 Aug 2007 15:31:40 +0200 (CEST) Received: from smtp6-g19.free.fr (localhost.localdomain [127.0.0.1]) by smtp6-g19.free.fr (Postfix) with ESMTP id 5FDAFB8AA3 for ; Tue, 14 Aug 2007 15:31:39 +0200 (CEST) Received: from smtp.xbsd.org (xbsd.org [82.233.2.192]) by smtp6-g19.free.fr (Postfix) with ESMTP id 47366B8A2E for ; Tue, 14 Aug 2007 15:31:38 +0200 (CEST) Received: from localhost (localhost.xbsd.org [127.0.0.1]) by smtp.xbsd.org (Postfix) with ESMTP id 549D612020 for ; Tue, 
14 Aug 2007 15:31:38 +0200 (CEST) X-Virus-Scanned: amavisd-new at xbsd.org Received: from smtp.xbsd.org ([127.0.0.1]) by localhost (srv1.xbsd.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lazhhujHg5dY for ; Tue, 14 Aug 2007 15:31:29 +0200 (CEST) Received: from [193.120.13.132] (innercity.xbsd.org [193.120.13.132]) by smtp.xbsd.org (Postfix) with ESMTP id 615F11201A for ; Tue, 14 Aug 2007 15:31:29 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v752.2) Content-Transfer-Encoding: 7bit Message-Id: <9EF501DD-BBCA-40E1-B261-99B68B1E0B61@xbsd.org> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed To: arch@FreeBSD.org From: Florent Thoumie Date: Tue, 14 Aug 2007 14:31:24 +0100 X-Mailer: Apple Mail (2.752.2) Cc: Subject: Remove /boot/firmware from BSD.root.dist X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Aug 2007 13:53:37 -0000 /boot/firmware was added to BSD.root.dist for ipw/iwi. Both drivers now use the firmware(9) framework, which means firmwares are installed in /boot/modules. I think it's safe to remove the directory from the mtree file. Any objections? 
-- Florent Thoumie flz@FreeBSD.org FreeBSD Committer
From owner-freebsd-arch@FreeBSD.ORG Thu Aug 16 06:58:27 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 33F3216A421 for ; Thu, 16 Aug 2007 06:58:27 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id E14C513C428 for ; Thu, 16 Aug 2007 06:58:26 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.103] (c-67-160-44-208.hsd1.wa.comcast.net [67.160.44.208]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l7G6wNTV035703 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO) for ; Thu, 16 Aug 2007 02:58:24 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Thu, 16 Aug 2007 00:01:18 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: arch@freebsd.org Message-ID: <20070815233852.X568@10.0.0.1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Subject: file locking. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Aug 2007 06:58:27 -0000 I have been looking at file locking for 8.0 and have come up with a way to completely eliminate the file lock, reduce the size of struct file by ~4 pointers (22%), remove the global list of files and associated lock, and restrict the scope of unp_gc while removing several race conditions. 
The whole thing hinges on reducing the complexity and scope of unp_gc to remove several fields from struct file. The remaining parts can be protected by atomics or are already protected by other locks.

f_count and f_type are now completely updated using atomics. The ref counting with atomics results in significantly fewer atomics and cheaper fhold/fdrop. Protecting f_type was only complicated in cases where there were compound operations done on it, which are now implemented with atomic_cmpset_int loops.

The unix domain socket garbage collection was changed to scan the list of unp sockets rather than the list of all files. This code is only responsible for finding dead cycles of unp sockets which reference each other. Evaluating other descriptors is not necessary. This allowed me to move f_gcflag and f_msgcount into unpcb. This also removed a use of the global filelist allowing me to remove two pointers from struct file.

The only negative part of the new algorithm is that a back-pointer to the referencing struct file must be stored in any unix domain socket that is referenced via another in a rights message. It is a slight layering violation but there is only ever 1 file for each unix domain socket so it is correct. This is required because the garbage collection algorithm needs to know about the external references via the file.

The patch is available at: http://people.freebsd.org/~jeff/fd.diff

pho and kris have both tested this and found it to be stable. Kris has done some performance measurement and found it to be a win on microbenchmarks. I can't imagine a case that would actually be slower, except perhaps endless loops of sysctl kern.file.

This also removes several sources of contention for multithreaded applications in particular, but also a global lock on allocating/freeing files.
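
[Illustration, not taken from the patch.] The lock-free reference counting and the atomic_cmpset_int-style loops mentioned above can be sketched roughly as follows. This is only a userland approximation: it uses C11 <stdatomic.h> instead of the kernel's atomic(9) primitives, and struct xfile, xfile_hold(), xfile_drop() and xfile_update_flags() are hypothetical names made up for the example, not functions from fd.diff.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; NOT the kernel structure. */
struct xfile {
	atomic_uint count;	/* plays the role of f_count */
	atomic_uint flags;	/* a field that sees compound updates */
};

static void
xfile_init(struct xfile *fp)
{
	atomic_init(&fp->count, 1);	/* the creator holds one reference */
	atomic_init(&fp->flags, 0);
}

/* Take a reference: one atomic add, no lock taken. */
static void
xfile_hold(struct xfile *fp)
{
	unsigned int old = atomic_fetch_add(&fp->count, 1);

	assert(old > 0);	/* must not revive a dead object */
}

/* Drop a reference; returns true if the caller released the last one. */
static bool
xfile_drop(struct xfile *fp)
{
	unsigned int old = atomic_fetch_sub(&fp->count, 1);

	assert(old > 0);
	return (old == 1);
}

/*
 * Compound update done as a compare-and-swap loop: set some bits and
 * clear others in one atomic step.  Returns the previous value.
 */
static unsigned int
xfile_update_flags(struct xfile *fp, unsigned int set, unsigned int clear)
{
	unsigned int old = atomic_load(&fp->flags);
	unsigned int next;

	do {
		next = (old | set) & ~clear;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&fp->flags, &old, next));
	return (old);
}
```

In this sketch the caller that sees xfile_drop() return true is the one responsible for tearing the object down; the kernel code has its own rules for that, which this example does not try to reproduce.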
This also resolves several cases where f_flag was not protected when it should be, as well as removing race conditions in the garbage collector code due to dropping locks, and fixing unprotected variables in the garbage collection code. I intend to commit this soon after the 7.0 branch is made. I also have the final revision of my improved select locking ready. Thanks, Jeff From owner-freebsd-arch@FreeBSD.ORG Thu Aug 16 14:57:57 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E16D16A419 for ; Thu, 16 Aug 2007 14:57:57 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 47EEE13C459 for ; Thu, 16 Aug 2007 14:57:57 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8k) with ESMTP id 203659652-1834499 for multiple; Thu, 16 Aug 2007 10:57:57 -0400 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id l7GEvn9o046082; Thu, 16 Aug 2007 10:57:50 -0400 (EDT) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-arch@freebsd.org Date: Thu, 16 Aug 2007 10:56:31 -0400 User-Agent: KMail/1.9.6 References: <20070815233852.X568@10.0.0.1> In-Reply-To: <20070815233852.X568@10.0.0.1> MIME-Version: 1.0 Content-Disposition: inline Message-Id: <200708161056.31494.jhb@freebsd.org> Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Thu, 16 Aug 2007 10:57:50 -0400 (EDT) X-Virus-Scanned: ClamAV 0.88.3/3966/Wed Aug 15 20:48:06 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 
tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Subject: Re: file locking. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Aug 2007 14:57:57 -0000 On Thursday 16 August 2007 03:01:18 am Jeff Roberson wrote: > I have been looking at file locking for 8.0 and have come up with a way to > completely eliminate the file lock, reduce the size of struct file by > ~4 pointers (22%), remove the global list of files and associated lock, > and restrict the scope of unp_gc while removing several race conditions. I like finit(). I would maybe change socketpair() to pass so1 and so2 to finit() rather than setting f_data twice. Did you consider using refcount_* for f_count rather than using the atomic operations directly? It appears you never call unp_discard() and thus 'closef()' on the sockets in unp_gc() now. Perhaps the fdrop()'s in the end of the loop should be unp_discard()'s instead? Also, it's a bit of a shame to lose 'show files' from ddb. 
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Thu Aug 16 20:16:02 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4C24316A417; Thu, 16 Aug 2007 20:16:02 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 11DA713C478; Thu, 16 Aug 2007 20:16:01 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.103] (c-67-160-44-208.hsd1.wa.comcast.net [67.160.44.208]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l7GKFvGf015845 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Thu, 16 Aug 2007 16:15:59 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Thu, 16 Aug 2007 13:18:51 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: John Baldwin In-Reply-To: <200708161056.31494.jhb@freebsd.org> Message-ID: <20070816131327.J568@10.0.0.1> References: <20070815233852.X568@10.0.0.1> <200708161056.31494.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@freebsd.org Subject: Re: file locking. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Aug 2007 20:16:02 -0000 On Thu, 16 Aug 2007, John Baldwin wrote: > On Thursday 16 August 2007 03:01:18 am Jeff Roberson wrote: >> I have been looking at file locking for 8.0 and have come up with a way to >> completely eliminate the file lock, reduce the size of struct file by >> ~4 pointers (22%), remove the global list of files and associated lock, >> and restrict the scope of unp_gc while removing several race conditions. > Thanks for the review. 
> I like finit().  I would maybe change socketpair() to pass so1 and so2 to
> finit() rather than setting f_data twice.

I'm not sure what you mean?

>
> Did you consider using refcount_* for f_count rather than using the atomic
> operations directly?

Yes, I may change it. I was investigating some schemes where it may not have been sufficient but I think it will work fine for what I've settled on.

I also think I'll wrap some atomics for manipulating f_type in macros.

>
> It appears you never call unp_discard() and thus 'closef()' on the sockets in
> unp_gc() now.  Perhaps the fdrop()'s in the end of the loop should be
> unp_discard()'s instead?

unp_discard() happens as a side-effect of sorflush().

>
> Also, it's a bit of a shame to lose 'show files' from ddb.

Yes, I can re-implement that using the same technique as sysctl kern.file. What's more troubling is the continued erosion of support for libkvm as it uses filelist. I don't think libkvm is a strong enough case to keep filelist around. I guess I will have to hack it to work similarly to sysctl as well.

Do we have an official stance on libkvm? Now that we have sysctl for run-time it's only useful for crashdump debugging. Really in most cases it could be replaced with a reasonable set of gcc scripts.

Jeff > > -- > John Baldwin > From owner-freebsd-arch@FreeBSD.ORG Thu Aug 16 20:36:40 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C5C3916A418 for ; Thu, 16 Aug 2007 20:36:40 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 8A3D113C45B for ; Thu, 16 Aug 2007 20:36:40 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8k) with ESMTP id 203717310-1834499 for multiple; Thu, 16 Aug 2007 16:36:19 -0400 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id l7GKaAgM048300; Thu, 16 Aug 2007 16:36:11 -0400 (EDT) (envelope-from jhb@freebsd.org) From: John Baldwin To: Jeff Roberson Date: Thu, 16 Aug 2007 16:35:20 -0400 User-Agent: KMail/1.9.6 References: <20070815233852.X568@10.0.0.1> <200708161056.31494.jhb@freebsd.org> <20070816131327.J568@10.0.0.1> In-Reply-To: <20070816131327.J568@10.0.0.1> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200708161635.20935.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Thu, 16 Aug 2007 16:36:11 -0400 (EDT) X-Virus-Scanned: ClamAV 0.88.3/3967/Thu Aug 16 11:32:14 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: freebsd-arch@freebsd.org Subject: Re: file locking. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Aug 2007 20:36:41 -0000

On Thursday 16 August 2007 04:18:51 pm Jeff Roberson wrote:
> On Thu, 16 Aug 2007, John Baldwin wrote:
>
> > On Thursday 16 August 2007 03:01:18 am Jeff Roberson wrote:
> >> I have been looking at file locking for 8.0 and have come up with a way to
> >> completely eliminate the file lock, reduce the size of struct file by
> >> ~4 pointers (22%), remove the global list of files and associated lock,
> >> and restrict the scope of unp_gc while removing several race conditions.
> >
>
> Thanks for the review.
>
> > I like finit().  I would maybe change socketpair() to pass so1 and so2 to
> > finit() rather than setting f_data twice.
>
> I'm not sure what you mean?

In socketpair() the new code does this:

    fp1->f_data = so1;
    ...
    fp2->f_data = so2;
    ...
    finit(fp1, ..., fp1->f_data, ...);
    finit(fp2, ..., fp2->f_data, ...);

It might be cleaner to do this:

    ...
    ...
    finit(fp1, ..., so1, ...);
    finit(fp2, ..., so2, ...);

> > Did you consider using refcount_* for f_count rather than using the atomic
> > operations directly?
>
> Yes, I may change it.  I was investigating some schemes where it may not
> have been sufficient but I think it will work fine for what I've settled
> on.
>
> I also think I'll wrap some atomics for manipulating f_type in macros.

That would be good.

> > It appears you never call unp_discard() and thus 'closef()' on the sockets in
> > unp_gc() now.  Perhaps the fdrop()'s in the end of the loop should be
> > unp_discard()'s instead?
>
> unp_discard() happens as a side-effect of sorflush().

So, in the old code there's a really big comment about how it makes sure to only do closef() (via unp_discard()) once but does a sorflush() for each f_msgcount. Was that comment no longer true?

> > Also, it's a bit of a shame to lose 'show files' from ddb. > > Yes, I can re-implement that using the same technique as sysctl kern.file. > What's more troubling is the continued erosion of support for libkvm as it > uses filelist. I don't think libkvm is a strong enough case to keep > filelist around. I guess I will have to hack it to work similarly to > sysctl as well. > > Do we have an official stance on libkvm? Now that we have sysctl for > run-time it's only useful for crashdump debugging. Really in most cases > it could be replaced with a reasonable set of gcc scripts. s/gcc/gdb/. At work we do mostly post-mortem analysis, so having working libkvm is still very important for us. xref the way I just fixed netstat to work again on coredumps recently. Breaking fstat on coredumps would probably be very annoying. libkvm can always use the same algo as the sysctl if necessary though. How much overhead is the filelist if it is only used in file creation/destruction? -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Thu Aug 16 22:28:24 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BA61E16A417; Thu, 16 Aug 2007 22:28:24 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 9124413C428; Thu, 16 Aug 2007 22:28:24 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.103] (c-67-160-44-208.hsd1.wa.comcast.net [67.160.44.208]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l7GMSJPj047535 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Thu, 16 Aug 2007 18:28:21 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Thu, 16 Aug 2007 15:31:13 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: John Baldwin In-Reply-To: 
<200708161635.20935.jhb@freebsd.org> Message-ID: <20070816151932.R568@10.0.0.1> References: <20070815233852.X568@10.0.0.1> <200708161056.31494.jhb@freebsd.org> <20070816131327.J568@10.0.0.1> <200708161635.20935.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@freebsd.org Subject: Re: file locking. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Aug 2007 22:28:24 -0000 On Thu, 16 Aug 2007, John Baldwin wrote: > On Thursday 16 August 2007 04:18:51 pm Jeff Roberson wrote: >> On Thu, 16 Aug 2007, John Baldwin wrote: >> >>> On Thursday 16 August 2007 03:01:18 am Jeff Roberson wrote: >>>> I have been looking at file locking for 8.0 and have come up with a way > to >>>> completely eliminate the file lock, reduce the size of struct file by >>>> ~4 pointers (22%), remove the global list of files and associated lock, >>>> and restrict the scope of unp_gc while removing several race conditions. >>> >> >> Thanks for the review. >> >>> I like finit(). I would maybe change socketpair() to pass so1 and so2 to >>> finit() rather than setting f_data twice. >> >> I'm not sure what you mean? > > In socketpair() the new code does this: > > fp1->f_data = so1; > ... > fp2->f_data = so2; > ... > finit(fp1, ..., fp1->f_data, ...); > finit(fp2, ..., fp2->f_data, ...); > > It might be cleaner to do this: > > ... > ... > finit(fp1, ..., so1, ...); > finit(fp2, ..., so2, ...); I did not want to have one file pointing at another without an initialized f_data field. However, I guess the underlying sockets are already setup so this may not be important. The code did go to some effort to setup f_data early before as well so I didn't want to change that. 
>
>>> Did you consider using refcount_* for f_count rather than using the atomic
>>> operations directly?
>>
>> Yes, I may change it.  I was investigating some schemes where it may not
>> have been sufficient but I think it will work fine for what I've settled
>> on.
>>
>> I also think I'll wrap some atomics for manipulating f_type in macros.
>
> That would be good.
>
>>> It appears you never call unp_discard() and thus 'closef()' on the sockets in
>>> unp_gc() now.  Perhaps the fdrop()'s in the end of the loop should be
>>> unp_discard()'s instead?
>>
>> unp_discard() happens as a side-effect of sorflush().
>
> So, in the old code there's a really big comment about how it makes sure to
> only do closef() (via unp_discard()) once but does a sorflush() for each
> f_msgcount.  Was that comment no longer true?

The comment actually says:

 *
 * It is incorrect to simply unp_discard each entry for f_msgcount
 * times

What we do is grab an extra ref to each struct file that is dead and then explicitly sorflush() them. This closes all of the references held by that socket, which would free any unreferenced non-unp descriptors. However, we want to prevent the algorithm from recursing in on itself, so we hold the extra file ref for unp sockets that would be closed. Then, when we loop releasing this one last ref at the end, the actual fo_close will be called.

This portion of the algorithm is not significantly different from before. I just introduced an extra flag so I could remove the race from dropping the lock in between operations and get an accurate count of how big the array needs to be.

>
>>> Also, it's a bit of a shame to lose 'show files' from ddb.
>>
>> Yes, I can re-implement that using the same technique as sysctl kern.file.
>> What's more troubling is the continued erosion of support for libkvm as it
>> uses filelist.  I don't think libkvm is a strong enough case to keep
>> filelist around.  I guess I will have to hack it to work similarly to
>> sysctl as well.
>>
>> Do we have an official stance on libkvm?  Now that we have sysctl for
>> run-time it's only useful for crashdump debugging.  Really in most cases
>> it could be replaced with a reasonable set of gcc scripts.
>
> s/gcc/gdb/.  At work we do mostly post-mortem analysis, so having working
> libkvm is still very important for us.  xref the way I just fixed netstat to
> work again on coredumps recently.  Breaking fstat on coredumps would probably
> be very annoying.  libkvm can always use the same algo as the sysctl if
> necessary though.

Yes, I'll do that.

>
> How much overhead is the filelist if it is only used in file
> creation/destruction?

Well, shaving off two pointers gets us into cacheline size for struct file, which has some perf improvement. Furthermore, having a global lock in open/close will impact even single-threaded workloads on larger machines. Truthfully I haven't seen contention on this yet but I suspect it's just because I haven't tried the right workload. :-) If we can do it without hindering other areas, which we can, I think it's a good idea. I included it because it was easy to do so after the gc changed.

fwiw this is about a 2% improvement in peak throughput on my mysql test on the 8way with better scaling at very high numbers of threads. At peak throughput there was no measured contention on this lock. That improvement is purely from fewer atomics and smaller struct file. Kris created a synthetic benchmark that had over 50% improvement in throughput with lots of threads contending on the same descriptor. Single-threaded performance is probably much better with this as well, although less dramatically so.

> > -- > John Baldwin > From owner-freebsd-arch@FreeBSD.ORG Thu Aug 16 23:20:33 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD3C516A421 for ; Thu, 16 Aug 2007 23:20:33 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 82C3613C457 for ; Thu, 16 Aug 2007 23:20:33 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8k) with ESMTP id 203738904-1834499 for multiple; Thu, 16 Aug 2007 19:20:29 -0400 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id l7GNKKKU049205; Thu, 16 Aug 2007 19:20:20 -0400 (EDT) (envelope-from jhb@freebsd.org) From: John Baldwin To: Jeff Roberson Date: Thu, 16 Aug 2007 19:04:06 -0400 User-Agent: KMail/1.9.6 References: <20070815233852.X568@10.0.0.1> <200708161635.20935.jhb@freebsd.org> <20070816151932.R568@10.0.0.1> In-Reply-To: <20070816151932.R568@10.0.0.1> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200708161904.06299.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Thu, 16 Aug 2007 19:20:20 -0400 (EDT) X-Virus-Scanned: ClamAV 0.88.3/3967/Thu Aug 16 11:32:14 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: freebsd-arch@freebsd.org Subject: Re: file locking. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Aug 2007 23:20:33 -0000 On Thursday 16 August 2007 06:31:13 pm Jeff Roberson wrote: > On Thu, 16 Aug 2007, John Baldwin wrote: > > > On Thursday 16 August 2007 04:18:51 pm Jeff Roberson wrote: > >> On Thu, 16 Aug 2007, John Baldwin wrote: > >> > >>> On Thursday 16 August 2007 03:01:18 am Jeff Roberson wrote: > >>>> I have been looking at file locking for 8.0 and have come up with a way > > to > >>>> completely eliminate the file lock, reduce the size of struct file by > >>>> ~4 pointers (22%), remove the global list of files and associated lock, > >>>> and restrict the scope of unp_gc while removing several race conditions. > >>> > >> > >> Thanks for the review. > >> > >>> I like finit(). I would maybe change socketpair() to pass so1 and so2 to > >>> finit() rather than setting f_data twice. > >> > >> I'm not sure what you mean? > > > > In socketpair() the new code does this: > > > > fp1->f_data = so1; > > ... > > fp2->f_data = so2; > > ... > > finit(fp1, ..., fp1->f_data, ...); > > finit(fp2, ..., fp2->f_data, ...); > > > > It might be cleaner to do this: > > > > ... > > ... > > finit(fp1, ..., so1, ...); > > finit(fp2, ..., so2, ...); > > I did not want to have one file pointing at another without an initialized > f_data field. However, I guess the underlying sockets are already setup > so this may not be important. The code did go to some effort to setup > f_data early before as well so I didn't want to change that. Until f_ops is set, f_data is irrelevant as badfileops ignores f_data. > > So, in the old code there's a really big comment about how it makes sure to > > only do closef() (via unp_discard()) once but does a sorflush() for each > > f_msgcount. Was that comment no longer true? 
> > The comment actually says: > > * > * It is incorrect to simply unp_discard each entry for f_msgcount > * times > > What we do is grab an extra ref to each struct file that is dead and then > explicitly sorflush() them. This closes all of the references held by > that socket, which would free any unreferenced non-unp descriptors. > However, we want to prevent the algorithm from recursing in on itself so > we hold the extra file ref for unp sockets that would be closed. Then > when we loop releasing this one last ref at the end the actually fo_close > will be called. > > This portion of the algorithm is not significantly different from before. > I just introduced an extra flag so I could remove the race from dropping > the lock inbetween operations and get an accurate count of how big the > array needs to be. Ok. > >> Do we have an official stance on libkvm? Now that we have sysctl for > >> run-time it's only useful for crashdump debugging. Really in most cases > >> it could be replaced with a reasonable set of gcc scripts. > > > > s/gcc/gdb/. At work we do mostly post-mortem analysis, so having working > > libkvm is still very important for us. xref the way I just fixed netstat to > > work again on coredumps recently. Breaking fstat on coredumps would probably > > be very annoying. libkvm can always use the same algo as the sysctl if > > necessary though. > > Yes, I'll do that. Cool, thanks! 
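The extra-reference scheme described in the quoted text above can be modeled in a few lines. This is a toy, single-threaded sketch under loud assumptions: toy_sock, toy_flush, toy_close, toy_drop and toy_gc are hypothetical names, each socket queues at most one descriptor, and no locking is modeled. It only shows why holding one extra reference across the flush pass keeps the close path from recursing and makes the final close run exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a dead unix-domain socket; all names are hypothetical. */
struct toy_sock {
	int refs;			/* references to this socket's file */
	int closes;			/* times the final close ran */
	struct toy_sock *inflight;	/* descriptor queued in our buffer */
};

static int toy_close_depth;	/* current nesting of toy_close() */
static int toy_max_depth;	/* deepest nesting observed so far */

static void toy_drop(struct toy_sock *so);

/* Discard whatever is queued in the buffer, dropping the ref it held. */
static void
toy_flush(struct toy_sock *so)
{
	struct toy_sock *held = so->inflight;

	so->inflight = NULL;
	if (held != NULL)
		toy_drop(held);
}

/* Final close; it flushes the buffer, which is where recursion could start. */
static void
toy_close(struct toy_sock *so)
{
	if (++toy_close_depth > toy_max_depth)
		toy_max_depth = toy_close_depth;
	so->closes++;
	toy_flush(so);
	toy_close_depth--;
}

/* Release one reference and close on the last one. */
static void
toy_drop(struct toy_sock *so)
{
	assert(so->refs > 0);
	if (--so->refs == 0)
		toy_close(so);
}

/* Collect a set of dead sockets in three passes. */
static void
toy_gc(struct toy_sock **dead, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)		/* pass 1: hold an extra reference */
		dead[i]->refs++;
	for (i = 0; i < n; i++)		/* pass 2: flush; nothing reaches 0 */
		toy_flush(dead[i]);
	for (i = 0; i < n; i++)		/* pass 3: last ref, close runs once */
		toy_drop(dead[i]);
}
```

With two sockets that each hold the other in flight, toy_gc() closes both exactly once and never nests one close inside another; without pass 1, the first flush would already trigger a nested close.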
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 12:33:32 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9105F16A468 for ; Sat, 18 Aug 2007 12:33:32 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 6DD3E13C4B3 for ; Sat, 18 Aug 2007 12:33:31 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 55CC145685; Sat, 18 Aug 2007 14:01:57 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id BF8CC45683 for ; Sat, 18 Aug 2007 14:01:52 +0200 (CEST) Date: Sat, 18 Aug 2007 14:00:56 +0200 From: Pawel Jakub Dawidek To: freebsd-arch@FreeBSD.org Message-ID: <20070818120056.GA6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/9DWx/yDrRhgMJTb" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: Subject: Lockless uidinfo. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 12:33:32 -0000 --/9DWx/yDrRhgMJTb Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi. 

The patch below removes the per-uidinfo locks:

	http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch

With the patch, uidinfo is handled using atomics only and no locks
(except for the global hash lock, which is not really important, as it's
not used in the fast paths).

I needed to change ui_sbsize from rlim_t (64bit) to long, because we
don't have 64bit atomics on all archs, and because sbsize represents
size in bytes, it can't go beyond 32bit on 32bit archs (PAE might be a
bit of a problem).

I changed the maxval argument in chgproccnt() from int to rlim_t, as this
is what is passed to the function.

In a simple ping-pong test on a unix domain socket, the uidinfo lock was
highly contended:

   max   total  wait_total   count  avg  wait_avg  cnt_hold  cnt_lock  name
  1508 3242859    96267052 1467476    2        65    374553   1024743  /usr/src/sys/kern/kern_resource.c:1339 (sleep mutex:sleep mtxpool)

The ping-pong program you can find here:

	http://people.freebsd.org/~pjd/misc/unixpingpong.c

At the end we reduced the uidinfo structure size by 8 bytes and gained no
measurable performance improvements:)
Yes, I wasn't able to measure anything interesting, unfortunately, but I
still believe the patch should be committed, as I'm sure there are
workloads that will see improvements - note that the uidinfo structure is
per-uid, so if there are thousands of processes running with the same
uid, they all need to fight for this one lock.
Contention is not the only thing that matters: the number of atomic
operations in the chgsbsize() and chgproccnt() functions was also reduced
from 2 to 1.

Ok, while writing this e-mail I came up with a better benchmark:

	http://people.freebsd.org/~pjd/misc/unixpingpong2.c

This one doesn't do ping-pong between two processes, but within one
process only. This way we eliminate cost of context switches.
I was running 8 such processes (I tested it on a 8way machine) and here are the results: x ./uidinfo1.txt + ./uidinfo2.txt +--------------------------------------------------------------------------= ----+ |x = +| |xx = ++| |xx = ++| |A| = |A| +--------------------------------------------------------------------------= ----+ N Min Max Median Avg Stddev x 5 402742 416417 406216 408269.2 5388.3485 + 5 1561566 1575987 1568964 1569767 5853.1399 Difference at 95.0% confidence 1.1615e+06 +/- 8204.54 284.493% +/- 2.00959% (Student's t, pooled s =3D 5625.55) As you can see this was just a matter of good benchmark - here we can see 284% performance improvement, yay:) --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --/9DWx/yDrRhgMJTb Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGxt94ForvXbEpPzQRAs2+AJ9jS36KZJYz3FT/Bi77cgpxp2KtbgCgjKQx OatG9wUb6zvM2mTPvgHyfsQ= =+fmg -----END PGP SIGNATURE----- --/9DWx/yDrRhgMJTb-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 12:44:54 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C818716A41A for ; Sat, 18 Aug 2007 12:44:54 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id EF67013C481 for ; Sat, 18 Aug 2007 12:44:53 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 8268245696; Sat, 18 Aug 2007 14:44:52 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id BDBC945681 for ; Sat, 18 Aug 2007 14:44:47 
+0200 (CEST) Date: Sat, 18 Aug 2007 14:43:52 +0200 From: Pawel Jakub Dawidek To: freebsd-arch@FreeBSD.org Message-ID: <20070818124352.GB6498@garage.freebsd.pl> References: <20070818120056.GA6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="U+BazGySraz5kW0T" Content-Disposition: inline In-Reply-To: <20070818120056.GA6498@garage.freebsd.pl> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: Subject: Re: Lockless uidinfo. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 12:44:55 -0000 --U+BazGySraz5kW0T Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Aug 18, 2007 at 02:00:56PM +0200, Pawel Jakub Dawidek wrote: > Hi. >=20 > The patch below remove per-uidinfo locks: >=20 > http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch >=20 > With the patch uidinfo is handled using atomics only and no locks > (except for the global hash lock, which is not really important, as it's > not used in the fast paths). >=20 > I needed to change ui_sbsize from rlim_t (64bit) to long, because we > don't have 64bit atomics on all archs, and because sbsize represents > size in bytes, it can't go beyond 32bit on 32bit archs (PAE might be a > bit of a problem). >=20 > I changed maxval argument in chgproccnt() from int to rlim_t, as this is > what is passed to the function. 
>=20 > In simple ping-pong test on unix domain socket, uidinfo lock was highly > contented: >=20 > max total wait_total count avg wait_avg cnt_hold cn= t_lock name > 1508 3242859 96267052 1467476 2 65 374553 102= 4743 /usr/src/sys/kern/kern_resource.c:1339 (sleep mutex:sleep mtxpool) >=20 > The ping-pong program you can find here: >=20 > http://people.freebsd.org/~pjd/misc/unixpingpong.c >=20 > At the end we reduced uidinfo structure size by 8 bytes and gain no > measurable performance improvements:) > Yes, I wasn't able to measure anything interesting, unfortunately, but I > still believe the patch should be committed, as I'm sure there are > workloads that will see improvements - note that uidinfo structure is > per-uid, so if there are thousands of processes running with the same > uid, they all need to fight for this one lock. > Not only contention is important, but also the fact that number of > atomic operations in chgsbsize() and chgproccnt() functions was reduced > from 2 to 1. >=20 > Ok, while writting this e-mail I came up with a better benchmark: >=20 > http://people.freebsd.org/~pjd/misc/unixpingpong2.c >=20 > This one doesn't do ping-pong between two processes, but within one > proccess only. This way we eliminate cost of context switches. 
I was
> running 8 such processes (I tested it on a 8way machine) and here are
> the results:
>
> x ./uidinfo1.txt
> + ./uidinfo2.txt
> +------------------------------------------------------------------------------+
> |x                                                                            +|
> |xx                                                                          ++|
> |xx                                                                          ++|
> |A|                                                                          |A|
> +------------------------------------------------------------------------------+
>     N           Min           Max        Median           Avg        Stddev
> x   5        402742        416417        406216      408269.2     5388.3485
> +   5       1561566       1575987       1568964       1569767     5853.1399
> Difference at 95.0% confidence
>         1.1615e+06 +/- 8204.54
>         284.493% +/- 2.00959%
>         (Student's t, pooled s = 5625.55)
>
> As you can see this was just a matter of good benchmark - here we can see
> 284% performance improvement, yay:)

Kris asked me to verify just in case that the unit cost is not higher
and here are the results from running only one ping-pong process (so no
contention):

x ./uidinfo_up1.txt
+ ./uidinfo_up2.txt
[ministat distribution plot: column spacing lost in the archive; five x samples and five + samples, each with its own mean (A) and median (M) bar]
    N           Min           Max        Median           Avg        Stddev
x   5        415023        420059        418405      418016.6     1878.5011
+   5        423731        430984        428084      427735.8     2623.4883
Difference at 95.0% confidence
        9719.2 +/- 3327.59
        2.32508% +/- 0.796043%
        (Student's t, pooled s = 2281.61)

We can see 2.3% of improvement most likely because of atomic operations
reduction.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
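
The "atomic operations reduction" mentioned above can be illustrated with a limit-checked counter that costs one atomic on the common path. This is only a sketch under stated assumptions: it uses C11 atomics in userland, toy_uidinfo and toy_chgsbsize are hypothetical names, and the fetch-add-then-undo approach is one plausible way to do it, not necessarily what the patch does. A locked version would pay for at least a lock and an unlock, which is presumably the "2" in the message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-uid accounting in struct uidinfo. */
struct toy_uidinfo {
	atomic_long ui_sbsize;	/* socket buffer bytes charged to this uid */
};

/*
 * Adjust the charge by `diff` bytes, refusing increases that would push
 * the total above `max`.  The common path is a single atomic fetch-add;
 * only a rejected request pays for a second atomic to undo the charge.
 * Assumption: a concurrent caller may briefly observe the overshoot and
 * be rejected too, which this sketch accepts.
 */
static bool
toy_chgsbsize(struct toy_uidinfo *uip, long diff, long max)
{
	long old;

	old = atomic_fetch_add(&uip->ui_sbsize, diff);
	if (diff > 0 && old + diff > max) {
		atomic_fetch_sub(&uip->ui_sbsize, diff);	/* roll back */
		return (false);
	}
	assert(old + diff >= 0);	/* never release more than was charged */
	return (true);
}
```

Because ui_sbsize is a long here, the same code works on architectures without 64-bit atomics, which matches the reason given for moving away from rlim_t.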
--U+BazGySraz5kW0T Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGxumIForvXbEpPzQRAoRAAKDUmAIgqAc6kx4KvXs+snHuAiqWegCg7z6C VtWabfYPU5PAeGxjX4Tg8cA= =rtmv -----END PGP SIGNATURE----- --U+BazGySraz5kW0T-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 14:56:46 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9DF0416A417; Sat, 18 Aug 2007 14:56:46 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 8D16313C481; Sat, 18 Aug 2007 14:56:46 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id 1E3251A4D7C; Sat, 18 Aug 2007 07:23:37 -0700 (PDT) Date: Sat, 18 Aug 2007 07:23:37 -0700 From: Alfred Perlstein To: Pawel Jakub Dawidek Message-ID: <20070818142337.GW90381@elvis.mu.org> References: <20070818120056.GA6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070818120056.GA6498@garage.freebsd.pl> User-Agent: Mutt/1.4.2.3i Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 14:56:47 -0000 * Pawel Jakub Dawidek [070818 05:31] wrote: > Hi. > > The patch below remove per-uidinfo locks: > > http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch In uifree() is it ok to manually check the refcount for 0? I'm gussing the hashmtx is used as a barrier? 
-Alfred From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 15:01:31 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CEC5916A418; Sat, 18 Aug 2007 15:01:31 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 22F4D13C48E; Sat, 18 Aug 2007 15:01:31 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 20A504569A; Sat, 18 Aug 2007 17:01:30 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 0B67245696; Sat, 18 Aug 2007 17:01:25 +0200 (CEST) Date: Sat, 18 Aug 2007 17:00:28 +0200 From: Pawel Jakub Dawidek To: Alfred Perlstein Message-ID: <20070818150028.GD6498@garage.freebsd.pl> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818142337.GW90381@elvis.mu.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="sgneBHv3152wZ8jf" Content-Disposition: inline In-Reply-To: <20070818142337.GW90381@elvis.mu.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 15:01:31 -0000 --sgneBHv3152wZ8jf Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Aug 18, 2007 at 07:23:37AM -0700, Alfred Perlstein wrote: > * Pawel Jakub Dawidek [070818 05:31] wrote: > > Hi. > >=20 > > The patch below remove per-uidinfo locks: > >=20 > > http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch >=20 > In uifree() is it ok to manually check the refcount for 0? >=20 > I'm gussing the hashmtx is used as a barrier? Yes, to lookup uidinfo you need to hold uihashtbl_mtx mutex, so once you hold it and ui_ref is 0, noone will be able to reference it, because it has to wait to look it up. --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
--sgneBHv3152wZ8jf Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGxwmMForvXbEpPzQRAij1AKD39VFS9BweDqwgf9MRFZ9roIOxGQCgkPvW ekGd4bRXzvLa6EpgqnPzcaE= =PYJc -----END PGP SIGNATURE----- --sgneBHv3152wZ8jf-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 15:52:22 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 49F1C16A417; Sat, 18 Aug 2007 15:52:22 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 37ADA13C45A; Sat, 18 Aug 2007 15:52:22 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id ACC5D1A4D82; Sat, 18 Aug 2007 08:50:41 -0700 (PDT) Date: Sat, 18 Aug 2007 08:50:41 -0700 From: Alfred Perlstein To: Pawel Jakub Dawidek Message-ID: <20070818155041.GY90381@elvis.mu.org> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818142337.GW90381@elvis.mu.org> <20070818150028.GD6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070818150028.GD6498@garage.freebsd.pl> User-Agent: Mutt/1.4.2.3i Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 15:52:22 -0000 * Pawel Jakub Dawidek [070818 07:59] wrote: > On Sat, Aug 18, 2007 at 07:23:37AM -0700, Alfred Perlstein wrote: > > * Pawel Jakub Dawidek [070818 05:31] wrote: > > > Hi. 
> > > > > > The patch below remove per-uidinfo locks: > > > > > > http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch > > > > In uifree() is it ok to manually check the refcount for 0? > > > > I'm gussing the hashmtx is used as a barrier? > > Yes, to lookup uidinfo you need to hold uihashtbl_mtx mutex, so once you > hold it and ui_ref is 0, noone will be able to reference it, because it > has to wait to look it up. And the field doesn't need to be volatile to prevent cached/opportunitic reads? -- - Alfred Perlstein From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 16:15:54 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 918A316A41B; Sat, 18 Aug 2007 16:15:54 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 1E13713C48D; Sat, 18 Aug 2007 16:15:53 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 3941945CD9; Sat, 18 Aug 2007 18:15:51 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id C36AE45685; Sat, 18 Aug 2007 18:15:45 +0200 (CEST) Date: Sat, 18 Aug 2007 18:14:49 +0200 From: Pawel Jakub Dawidek To: Alfred Perlstein Message-ID: <20070818161449.GE6498@garage.freebsd.pl> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818142337.GW90381@elvis.mu.org> <20070818150028.GD6498@garage.freebsd.pl> <20070818155041.GY90381@elvis.mu.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="brEuL7wsLY8+TuWz" Content-Disposition: inline In-Reply-To: <20070818155041.GY90381@elvis.mu.org> User-Agent: Mutt/1.4.2.3i 
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 16:15:54 -0000 --brEuL7wsLY8+TuWz Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Sat, Aug 18, 2007 at 08:50:41AM -0700, Alfred Perlstein wrote:
> * Pawel Jakub Dawidek [070818 07:59] wrote:
> > Yes, to lookup uidinfo you need to hold uihashtbl_mtx mutex, so once you
> > hold it and ui_ref is 0, noone will be able to reference it, because it
> > has to wait to look it up.
>
> And the field doesn't need to be volatile to prevent cached/opportunitic
> reads?

The only chance of something like this will be the scenario below:

	thread1 (uifind)                        thread2 (uifree)
	----------------                        ----------------
	                                        refcount_release(&uip->ui_ref))
	                                        /* ui_ref == 0 */
	mtx_lock(&uihashtbl_mtx);
	refcount_acquire(&uip->ui_ref);
	/* ui_ref == 1 */
	mtx_unlock(&uihashtbl_mtx);
	                                        mtx_lock(&uihashtbl_mtx);
	                                        if (uip->ui_ref > 0) {
	                                                mtx_unlock(&uihashtbl_mtx);
	                                                return;
	                                        }

Now, you suggest that ui_ref in 'if (uip->ui_ref > 0)' may still have
cached 0? I don't think it is possible: first, refcount_acquire() uses
read memory barriers (but we may still need ui_ref to be volatile for this
to make any difference), and second, think of ui_ref as a field protected
by the uihashtbl_mtx mutex in this very case.

Is my thinking correct?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
--brEuL7wsLY8+TuWz Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGxxr5ForvXbEpPzQRAlMmAKDKLAdY8FdL3zxvKydBiyBkSYglGACfR6Bz VWMnsUbbb/z6hvcfToxJWic= =z16K -----END PGP SIGNATURE----- --brEuL7wsLY8+TuWz-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 17:19:19 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 193C516A417; Sat, 18 Aug 2007 17:19:19 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 0563013C49D; Sat, 18 Aug 2007 17:19:19 +0000 (UTC) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id 5071E1A4D7C; Sat, 18 Aug 2007 10:17:38 -0700 (PDT) Date: Sat, 18 Aug 2007 10:17:38 -0700 From: Alfred Perlstein To: Pawel Jakub Dawidek Message-ID: <20070818171738.GB90381@elvis.mu.org> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818142337.GW90381@elvis.mu.org> <20070818150028.GD6498@garage.freebsd.pl> <20070818155041.GY90381@elvis.mu.org> <20070818161449.GE6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070818161449.GE6498@garage.freebsd.pl> User-Agent: Mutt/1.4.2.3i Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 17:19:19 -0000 * Pawel Jakub Dawidek [070818 09:14] wrote: > On Sat, Aug 18, 2007 at 08:50:41AM -0700, Alfred Perlstein wrote: > > * Pawel Jakub Dawidek [070818 07:59] wrote: > > > Yes, to lookup uidinfo you need to hold uihashtbl_mtx mutex, so once you > > > hold it and ui_ref is 0, noone will be able to reference it, because it > > > has to wait to look it up. > > > > And the field doesn't need to be volatile to prevent cached/opportunitic > > reads? > > The only chance of something like this will be the scenario below: > > thread1 (uifind) thread2 (uifree) > ---------------- ---------------- > refcount_release(&uip->ui_ref)) > /* ui_ref == 0 */ > mtx_lock(&uihashtbl_mtx); > refcount_acquire(&uip->ui_ref); > /* ui_ref == 1 */ > mtx_unlock(&uihashtbl_mtx); > mtx_lock(&uihashtbl_mtx); > if (uip->ui_ref > 0) { > mtx_unlock(&uihashtbl_mtx); > return; > } > > Now, you suggest that ui_ref in 'if (uip->ui_ref > 0)' may still have > cached 0? I don't think it is possible, first refcount_acquire() uses > read memory bariers (but we may still need ui_ref to volatile for this > to make any difference) and second, think of ui_ref as a field protected > by uihashtbl_mtx mutex in this very case. > > Is my thinking correct? I don't know, that's why I was asking you. 
:) -- - Alfred Perlstein From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 17:28:26 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A61F16A420; Sat, 18 Aug 2007 17:28:26 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id D1F2E13C442; Sat, 18 Aug 2007 17:28:25 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 9131D487F0; Sat, 18 Aug 2007 19:28:24 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 26E0D4569A; Sat, 18 Aug 2007 19:28:20 +0200 (CEST) Date: Sat, 18 Aug 2007 19:27:19 +0200 From: Pawel Jakub Dawidek To: Alfred Perlstein Message-ID: <20070818172719.GF6498@garage.freebsd.pl> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818142337.GW90381@elvis.mu.org> <20070818150028.GD6498@garage.freebsd.pl> <20070818155041.GY90381@elvis.mu.org> <20070818161449.GE6498@garage.freebsd.pl> <20070818171738.GB90381@elvis.mu.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="eDB11BtaWSyaBkpc" Content-Disposition: inline In-Reply-To: <20070818171738.GB90381@elvis.mu.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 17:28:26 -0000 --eDB11BtaWSyaBkpc Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Aug 18, 2007 at 10:17:38AM -0700, Alfred Perlstein wrote: > * Pawel Jakub Dawidek [070818 09:14] wrote: > > On Sat, Aug 18, 2007 at 08:50:41AM -0700, Alfred Perlstein wrote: > > > * Pawel Jakub Dawidek [070818 07:59] wrote: > > > > Yes, to lookup uidinfo you need to hold uihashtbl_mtx mutex, so onc= e you > > > > hold it and ui_ref is 0, noone will be able to reference it, becaus= e it > > > > has to wait to look it up. > > >=20 > > > And the field doesn't need to be volatile to prevent cached/opportuni= tic > > > reads? > >=20 > > The only chance of something like this will be the scenario below: > >=20 > > thread1 (uifind) thread2 (uifree) > > ---------------- ---------------- > > refcount_release(&uip->ui_ref)) > > /* ui_ref =3D=3D 0 */ > > mtx_lock(&uihashtbl_mtx); > > refcount_acquire(&uip->ui_ref); > > /* ui_ref =3D=3D 1 */ > > mtx_unlock(&uihashtbl_mtx); > > mtx_lock(&uihashtbl_mtx); > > if (uip->ui_ref > 0) { > > mtx_unlock(&uihashtbl_mtx); > > return; > > } > >=20 > > Now, you suggest that ui_ref in 'if (uip->ui_ref > 0)' may still have > > cached 0? I don't think it is possible, first refcount_acquire() uses > > read memory bariers (but we may still need ui_ref to volatile for this > > to make any difference) and second, think of ui_ref as a field protected > > by uihashtbl_mtx mutex in this very case. > >=20 > > Is my thinking correct? >=20 > I don't know, that's why I was asking you. :) In previous version of the patch I had atomic_load() in there, but after some thinking I decided it's not needed and I change it. 
Now that you asked about it, I was afraid that maybe my thinking isn't
correct. Anyway, it'll be good if someone could confirm it's ok.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--eDB11BtaWSyaBkpc Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGxyv3ForvXbEpPzQRAsJXAJ0e00hB+95tdLvWgtkiorckarjC0gCfVsDf pWL2e7NV3LNx7NhWl9B6D2s= =4DHl -----END PGP SIGNATURE----- --eDB11BtaWSyaBkpc-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 20:29:07 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A98BA16A41B for ; Sat, 18 Aug 2007 20:29:07 +0000 (UTC) (envelope-from prog@msobczak.com) Received: from mail1.fluidhosting.com (mx12.fluidhosting.com [204.14.89.2]) by mx1.freebsd.org (Postfix) with SMTP id E393213C46A for ; Sat, 18 Aug 2007 20:29:04 +0000 (UTC) (envelope-from prog@msobczak.com) Received: (qmail 31592 invoked by uid 399); 18 Aug 2007 20:02:24 -0000 Received: from localhost (HELO maciej-sobczaks-computer.local) (maciej@msobczak.com@127.0.0.1) by localhost with ESMTP; 18 Aug 2007 20:02:24 -0000 X-Originating-IP: 127.0.0.1 Message-ID: <46C75045.8000503@msobczak.com> Date: Sat, 18 Aug 2007 22:02:13 +0200 From: Maciej Sobczak User-Agent: Thunderbird 2.0.0.6 (Macintosh/20070728) MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <20070818120056.GA6498@garage.freebsd.pl> <20070818142337.GW90381@elvis.mu.org> <20070818150028.GD6498@garage.freebsd.pl> <20070818155041.GY90381@elvis.mu.org> <20070818161449.GE6498@garage.freebsd.pl> In-Reply-To: <20070818161449.GE6498@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-arch@FreeBSD.org Subject: Re: Lockless uidinfo.
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 20:29:07 -0000

Pawel Jakub Dawidek wrote:

> thread1 (uifind)                    thread2 (uifree)
> ----------------                    ----------------
>                                     refcount_release(&uip->ui_ref))
>                                     /* ui_ref == 0 */
> mtx_lock(&uihashtbl_mtx);
> refcount_acquire(&uip->ui_ref);
> /* ui_ref == 1 */
> mtx_unlock(&uihashtbl_mtx);
>                                     mtx_lock(&uihashtbl_mtx);
>                                     if (uip->ui_ref > 0) {
>                                         mtx_unlock(&uihashtbl_mtx);
>                                         return;
>                                     }
>
> Now, you suggest that ui_ref in 'if (uip->ui_ref > 0)' may still have
> cached 0? I don't think it is possible, first refcount_acquire() uses
> read memory barriers (but we may still need ui_ref to volatile for this
> to make any difference) and second, think of ui_ref as a field protected
> by uihashtbl_mtx mutex in this very case.
>
> Is my thinking correct?

Yes, but I believe you are too conservative even with the above explanation.

Unlocking (thread1) and subsequent locking (thread2) of the same mutex
guarantees memory visibility between threads, at least if the mutex provides
the fundamental release-acquire consistency. In this case, the memory barrier
is part of this process itself and you don't need to do anything else to
guarantee the visibility of ui_ref == 1 in thread2.

The only thing to worry about is caching of values in CPU registers (note
that this issue is separate from visibility), but this should be prevented by
the compiler at the point of mtx_lock. There are basically two ways to
guarantee it: either the compiler is too stupid/conservative to cache the
value across mtx_lock if it's a function call, or it is smart enough to know
(or just see) that there is a membar inside. In any case no register-level
caching will take place.

There should be no need to make anything volatile.
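The release-acquire argument above can be illustrated with a small userland
analogy. This is only a sketch under stated assumptions: it uses POSIX
pthread mutexes rather than the kernel's mtx(9), and struct shared, worker()
and run_counter() are hypothetical names, not code from the patch. The
counter is a plain long, neither volatile nor atomic, and is only touched
while the mutex is held.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

struct shared {
	pthread_mutex_t mtx;	/* plays the role of uihashtbl_mtx */
	long ref;		/* plain field: not volatile, not atomic */
	long iters;		/* increments performed by each thread */
};

/* Each worker bumps the counter only while holding the mutex. */
static void *
worker(void *arg)
{
	struct shared *sp = arg;

	for (long i = 0; i < sp->iters; i++) {
		pthread_mutex_lock(&sp->mtx);
		sp->ref++;	/* sees the value stored by the previous owner */
		pthread_mutex_unlock(&sp->mtx);
	}
	return (NULL);
}

/* Runs nthreads workers; returns the final count, or -1 on thread failure. */
static long
run_counter(int nthreads, long iters)
{
	struct shared s = { .ref = 0, .iters = iters };
	pthread_t tids[MAX_THREADS];
	int n;

	assert(nthreads > 0 && nthreads <= MAX_THREADS);
	pthread_mutex_init(&s.mtx, NULL);
	for (n = 0; n < nthreads; n++)
		if (pthread_create(&tids[n], NULL, worker, &s) != 0)
			break;
	for (int i = 0; i < n; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&s.mtx);
	return (n == nthreads ? s.ref : -1);
}
```

If any thread could keep a stale copy of the counter across a lock
acquisition, increments would be lost and the total would come up short.
With a conforming mutex, run_counter(4, 100000) is expected to return
exactly 400000.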
-- Maciej Sobczak : http://www.msobczak.com/ Programming : http://www.msobczak.com/prog/ From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 22:09:01 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1BC7E16A41A for ; Sat, 18 Aug 2007 22:09:01 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id B33A113C46A for ; Sat, 18 Aug 2007 22:08:59 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 8D0B2487F0; Sun, 19 Aug 2007 00:08:58 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 2D13645683 for ; Sun, 19 Aug 2007 00:08:53 +0200 (CEST) Date: Sun, 19 Aug 2007 00:07:56 +0200 From: Pawel Jakub Dawidek To: freebsd-arch@FreeBSD.org Message-ID: <20070818220756.GH6498@garage.freebsd.pl> References: <20070818120056.GA6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="gKijDXBCEH69PxaN" Content-Disposition: inline In-Reply-To: <20070818120056.GA6498@garage.freebsd.pl> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: Subject: Re: Lockless uidinfo. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 22:09:01 -0000 --gKijDXBCEH69PxaN Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

Two more things...

> The patch below removes per-uidinfo locks:
>
> http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch

We could upgrade from the lock-free algorithm I used here to a wait-free
algorithm, but we don't have atomic_fetchadd_long(). How hard will it be
to implement it?

We could then change:

    do {
        old = uip->ui_proccnt;
        if (old + diff > max)
            return (0);
    } while (atomic_cmpset_long(&uip->ui_proccnt, old, old + diff) == 0);

to something like this:

    if (atomic_fetchadd_long(&uip->ui_proccnt, diff) + diff > max) {
        atomic_subtract_long(&uip->ui_proccnt, diff);
        return (0);
    }

> I needed to change ui_sbsize from rlim_t (64bit) to long, because we
> don't have 64bit atomics on all archs, and because sbsize represents
> size in bytes, it can't go beyond 32bit on 32bit archs (PAE might be a
> bit of a problem).

Currently it's not a problem, because socket buffers have to be mapped in
kernel space, so we can't map more than 4GB. This might eventually become a
problem if we implement unmapped socket buffers and ui_sbsize becomes the
sum of socket buffers from many processes.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
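The two variants discussed above can be compared side by side in userland.
This is a minimal sketch, assuming C11 <stdatomic.h> as a stand-in for the
kernel's atomic(9) primitives; chgcnt_cas() and chgcnt_fetchadd() are
hypothetical names and are not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Lock-free variant: retry the compare-and-swap until it succeeds or the
 * limit would be exceeded. A thread can loop as long as others keep winning.
 */
static bool
chgcnt_cas(atomic_long *cnt, long diff, long max)
{
	long old;

	assert(cnt != NULL);
	old = atomic_load(cnt);
	do {
		if (old + diff > max)
			return (false);
		/* On failure, old is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(cnt, &old, old + diff));
	return (true);
}

/*
 * Fetch-and-add variant: a bounded number of atomic operations per call.
 * The counter may briefly exceed max before the add is undone.
 */
static bool
chgcnt_fetchadd(atomic_long *cnt, long diff, long max)
{
	assert(cnt != NULL);
	if (atomic_fetch_add(cnt, diff) + diff > max) {
		atomic_fetch_sub(cnt, diff);
		return (false);
	}
	return (true);
}
```

One design difference worth noting: in the fetch-and-add form the counter
can temporarily overshoot the limit, so a concurrent caller that would have
fit may be refused while another thread's failed add is still being rolled
back. The compare-and-swap loop never publishes a value above the limit.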
--gKijDXBCEH69PxaN Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGx228ForvXbEpPzQRAj+QAJ9Y4g6hqHOfEhndF71nVdz5e36KRwCgnJPM R3JbjJEwLidFJy9fJ9FGzP0= =yQ2d -----END PGP SIGNATURE----- --gKijDXBCEH69PxaN-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 23:10:22 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9955716A419 for ; Sat, 18 Aug 2007 23:10:22 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 34A1813C47E for ; Sat, 18 Aug 2007 23:10:21 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 54FAC45CD9; Sun, 19 Aug 2007 01:10:20 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id D3EAB45684 for ; Sun, 19 Aug 2007 01:10:14 +0200 (CEST) Date: Sun, 19 Aug 2007 01:09:17 +0200 From: Pawel Jakub Dawidek To: freebsd-arch@FreeBSD.org Message-ID: <20070818230917.GI6498@garage.freebsd.pl> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818220756.GH6498@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Li7ckgedzMh1NgdW" Content-Disposition: inline In-Reply-To: <20070818220756.GH6498@garage.freebsd.pl> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: Subject: Re: Lockless 
uidinfo. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 23:10:22 -0000 --Li7ckgedzMh1NgdW Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Sun, Aug 19, 2007 at 12:07:56AM +0200, Pawel Jakub Dawidek wrote:
> Two more things...
>
> > The patch below removes per-uidinfo locks:
> >
> > http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch
>
> We could upgrade from lock-free algorithm I used here to wait-free
> algorithm, but we don't have atomic_fetchadd_long(). How hard will it be
> to implement it?
>
> We could then change:
>
>     do {
>         old = uip->ui_proccnt;
>         if (old + diff > max)
>             return (0);
>     } while (atomic_cmpset_long(&uip->ui_proccnt, old, old + diff) == 0);
>
> to something like this:
>
>     if (atomic_fetchadd_long(&uip->ui_proccnt, diff) + diff > max) {
>         atomic_subtract_long(&uip->ui_proccnt, diff);
>         return (0);
>     }

Ok, after implementing atomic_fetchadd_long() on amd64, we get an additional
6% performance improvement:

x ./uidinfo_lockfree.txt (atomic_cmpset_long loop)
+ ./uidinfo_waitfree.txt (atomic_fetchadd_long)
[ministat distribution plot: column positions lost; all five x samples fall to the left of all five + samples]
    N           Min           Max        Median           Avg        Stddev
x   5       1561566       1575987       1568964       1569767     5853.1399
+   5       1662362       1665936       1665810     1664881.8     1541.2693
Difference at 95.0% confidence
        95114.8 +/- 6241.96
        6.05917% +/- 0.397636%
        (Student's t, pooled s = 4279.88)

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
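For readers checking the figures, the arithmetic behind the summary lines
can be reproduced from the table alone. This is a sketch, not ministat's
code: struct sample, rel_diff_pct() and pooled_variance() are hypothetical
names, and only the standard pooled-variance formula for Student's t is
assumed.

```c
#include <assert.h>

/* Summary statistics for one data set, as printed in the table. */
struct sample {
	int n;		/* number of measurements */
	double avg;	/* mean */
	double stddev;	/* sample standard deviation */
};

/* Difference of the means, as a percentage of the baseline mean. */
static double
rel_diff_pct(const struct sample *base, const struct sample *other)
{
	assert(base->avg != 0.0);
	return (100.0 * (other->avg - base->avg) / base->avg);
}

/*
 * Pooled variance used by the two-sample Student's t test:
 * ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2).
 * Its square root is the "pooled s" value in the output.
 */
static double
pooled_variance(const struct sample *a, const struct sample *b)
{
	assert(a->n + b->n > 2);
	return (((a->n - 1) * a->stddev * a->stddev +
	    (b->n - 1) * b->stddev * b->stddev) / (a->n + b->n - 2));
}
```

With the x row as the baseline and the + row as the comparison,
rel_diff_pct() gives (1664881.8 - 1569767) / 1569767, which is the reported
6.05917%, and the square root of pooled_variance() is the reported 4279.88.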
--Li7ckgedzMh1NgdW Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFGx3wdForvXbEpPzQRApSQAKDIUM4EV6lomKGgQouhx/RlehhVbwCgoTPf rG0yxVgxqJb74QUyxYDSZKY= =R+1E -----END PGP SIGNATURE----- --Li7ckgedzMh1NgdW-- From owner-freebsd-arch@FreeBSD.ORG Sat Aug 18 23:33:00 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A9B116A41B; Sat, 18 Aug 2007 23:33:00 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 4768713C469; Sat, 18 Aug 2007 23:33:00 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.103] (c-67-160-44-208.hsd1.wa.comcast.net [67.160.44.208]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l7INWsUI026056 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Sat, 18 Aug 2007 19:32:55 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Sat, 18 Aug 2007 16:35:42 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Pawel Jakub Dawidek In-Reply-To: <20070818230917.GI6498@garage.freebsd.pl> Message-ID: <20070818163503.T568@10.0.0.1> References: <20070818120056.GA6498@garage.freebsd.pl> <20070818220756.GH6498@garage.freebsd.pl> <20070818230917.GI6498@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@freebsd.org Subject: Re: Lockless uidinfo. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Aug 2007 23:33:00 -0000

On Sun, 19 Aug 2007, Pawel Jakub Dawidek wrote:

> On Sun, Aug 19, 2007 at 12:07:56AM +0200, Pawel Jakub Dawidek wrote:
>> Two more things...
>>
>>> The patch below removes per-uidinfo locks:
>>>
>>> http://people.freebsd.org/~pjd/patches/uidinfo_lockless.patch
>>
>> We could upgrade from lock-free algorithm I used here to wait-free
>> algorithm, but we don't have atomic_fetchadd_long(). How hard will it be
>> to implement it?
>>
>> We could then change:
>>
>>     do {
>>         old = uip->ui_proccnt;
>>         if (old + diff > max)
>>             return (0);
>>     } while (atomic_cmpset_long(&uip->ui_proccnt, old, old + diff) == 0);
>>
>> to something like this:
>>
>>     if (atomic_fetchadd_long(&uip->ui_proccnt, diff) + diff > max) {
>>         atomic_subtract_long(&uip->ui_proccnt, diff);
>>         return (0);
>>     }
>
> Ok, after implementing atomic_fetchadd_long() on amd64, we get additional
> 6% of performance improvement:
>
> x ./uidinfo_lockfree.txt (atomic_cmpset_long loop)
> + ./uidinfo_waitfree.txt (atomic_fetchadd_long)
> [ministat distribution plot: column positions lost; all five x samples fall to the left of all five + samples]
>     N           Min           Max        Median           Avg        Stddev
> x   5       1561566       1575987       1568964       1569767     5853.1399
> +   5       1662362       1665936       1665810     1664881.8     1541.2693
> Difference at 95.0% confidence
>         95114.8 +/- 6241.96
>         6.05917% +/- 0.397636%
>         (Student's t, pooled s = 4279.88)

How does this affect the single-threaded performance? Do you attribute
this to atomic fetchadd being cheaper than atomic cmpset? What is your
processor?

Thanks,
Jeff

>
> --
> Pawel Jakub Dawidek                       http://www.wheel.pl
> pjd@FreeBSD.org                           http://www.FreeBSD.org
> FreeBSD committer                         Am I Evil?
Yes, I Am! >