From owner-freebsd-current@FreeBSD.ORG Sat Nov 11 09:03:42 2006 Return-Path: X-Original-To: freebsd-current@freebsd.org Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DE16A16A412; Sat, 11 Nov 2006 09:03:41 +0000 (UTC) (envelope-from ssouhlal@FreeBSD.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id 61C2043D73; Sat, 11 Nov 2006 09:03:41 +0000 (GMT) (envelope-from ssouhlal@FreeBSD.org) Received: from [192.168.0.103] (c-24-6-50-125.hsd1.ca.comcast.net [24.6.50.125]) by elvis.mu.org (Postfix) with ESMTP id C7E3C1A4D82; Sat, 11 Nov 2006 01:03:40 -0800 (PST) Message-ID: <455591E0.5070509@FreeBSD.org> Date: Sat, 11 Nov 2006 01:03:28 -0800 From: Suleiman Souhlal User-Agent: Mozilla Thunderbird 1.0.7 (X11/20051204) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Attilio Rao References: <3bbf2fe10610181518k68356528i154267c0bd1b1a77@mail.gmail.com> <3bbf2fe10611051857m4c644ad2o7d71a86e46eaf9a8@mail.gmail.com> In-Reply-To: <3bbf2fe10611051857m4c644ad2o7d71a86e46eaf9a8@mail.gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: pho@freebsd.org, freebsd-current@freebsd.org, kmacy@freebsd.org, freebsd-arch@freebsd.org Subject: Re: sx locks rewriting - needs testers X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Nov 2006 09:03:42 -0000 Attilio Rao wrote: > 2006/10/18, Attilio Rao : > >> In my P4 branch: //depot/user/attilio/attilio_smpng/... >> you can find a sx locks rewriting using the optimized semantic of >> rwlocks; in the end this might result in a valuable performance >> improvement. Excellent work. Do you have any benchmark results? You might have a hard time convincing someone to commit this if you don't show that it's worth it, especially if you consider how much it complicates the code. > After have received very positive stress-test feedbacks from pho@ > (that I would like to thank for his patience and work) I went ahead > and rather completed the implementation. Now this is ready to be > reviewed and possibly committed into the CVS. > Even if I plan a longer work on this branch (about syncronizing > primitives), I think it is time for a revision of the work done until > now from SMPng people (jhb, kmacy, etc.) and possibily an inclusion > into HEAD (patch actually is 58k...). > > Some few hints about the patch: > - LOCK_DEBUG adds a dependence between sx.h and lock.h (as for rwlocks > and mutex) and a the new options SXLOCK_NOINLINE is added > - XFS locking is still disabled in the patch (I hope to do it for > tomorrow, I'm in GMT+1). > - Possibly the sleepqueue interface modifies and new flags might be > documented in the manpages (and NOTES file too, in order to reflect > SXLOCK_NOINLINE inclusion). > - It misses still of the adaptive spinning code, but it can be > inserted after without problems. We might want to avoid adaptive spinning, since sx locks may be held for quite long periods of time. > You can download the code directly from perforce > (//depot/user/attilio/attilio_smpng/...) but patches are available > here: > http://users.gufi.org/~rookie/works/patches/smpng06112006.diff > http://users.gufi.org/~rookie/works/patches/_sx.h > > I hope you will enjoy it (feedbacks, ideas, comments would be very > appreciated). > > Attilio A few comments: --- //depot/vendor/freebsd/src/sys/i386/include/param.h 2006/01/09 06:10:20 +++ //depot/user/attilio/attilio_smpng/i386/include/param.h 2006/10/03 21:33:06 @@ -109,6 +109,15 @@ #endif /* + * Define our own cache alignment mask for syncronizing primitives. Pentium4 + * and Xeon want a 128-byte wide aligned syncronizing primitive in order to + * minimize cache bus traffic on CPUs cache lines movements. + */ +#ifndef SYNC_ALIGN +#define SYNC_ALIGN (128 - 1) +#endif + Please also add this to amd64, and maybe rename it to UMA_ALIGN_SYNC. On other architectures, have it be the same as UMA_ALIGN_CACHE. This way, you don't have to define SYNC_ALIGN in every C file that uses it. While there, you might also want to make UMA_ALIGN_CACHE the correct size for i386/amd64 (64 bytes, on most machines, i believe?). Ideally, this would be a variable set at boot time, depending on what the CPU reports. I also feel that the part that makes turnstiles and sleepqueues 128 byte aligned should be broken up as as a separate patch/commit, as it doesn't really have much to do with your sx rewriting, as I understand it. --- //depot/vendor/freebsd/src/sys/kern/kern_sx.c 2006/08/15 18:31:36 +++ //depot/user/attilio/attilio_smpng/kern/kern_sx.c 2006/11/06 02:24:27 @@ -78,13 +50,8 @@ sx_init(struct sx *sx, const char *description) { - sx->sx_lock = mtx_pool_find(mtxpool_lockbuilder, sx); - sx->sx_cnt = 0; - cv_init(&sx->sx_shrd_cv, description); - sx->sx_shrd_wcnt = 0; - cv_init(&sx->sx_excl_cv, description); - sx->sx_excl_wcnt = 0; - sx->sx_xholder = NULL; + sx->sx_lock = SX_UNHELD; + sx->sx_desc = "sx lock"; lock_init(&sx->sx_object, &lock_class_sx, description, NULL, LO_WITNESS | LO_RECURSABLE | LO_SLEEPABLE | LO_UPGRADABLE); Shouldn't it be sx->sx_desc = description; ? Keep up the good work! I hope to see some benchmark results and to see it committed soon! :-) -- Suleiman