From owner-freebsd-current@FreeBSD.ORG  Thu May  6 11:11:18 2004
From: John Baldwin <jhb@FreeBSD.org>
To: freebsd-current@FreeBSD.org
Cc: Gerrit Nagelhout, 'Andrew Gallatin'
Date: Thu, 6 May 2004 14:11:27 -0400
Subject: Re: 4.7 vs 5.2.1 SMP/UP bridging performance
Message-Id: <200405061411.27216.jhb@FreeBSD.org>
In-Reply-To: <20040506184749.R19447@gamplex.bde.org>
References: <20040506184749.R19447@gamplex.bde.org>
User-Agent: KMail/1.6
List-Id: Discussions about the use of FreeBSD-current

On Thursday 06 May 2004 06:18 am, Bruce Evans wrote:
> On Wed, 5 May 2004, Gerrit Nagelhout wrote:
> > Andrew Gallatin wrote:
> > > If it's really safe to remove the xchg* from non-SMP atomic_store_rel*,
> > > then I think you should do it.  Of course, that still leaves mutexes
> > > as very expensive on SMP (253 cycles on the 2.53GHz from above).
>
> See my other reply [1 memory barrier but not 2 seems to be needed for
> each lock/unlock pair in the !SMP case, and the xchgl accidentally (?)
> provides it; perhaps [lms]fence would give a faster memory barrier].
>
> More ideas on this:
> - compilers should probably now generate memory barrier instructions for
>   volatile variables (so volatile variables would be even slower :-).  I
>   haven't seen gcc on i386's do this.
> - jhb once tried changing mtx_lock_spin(mtx)/mtx_unlock_spin(mtx) to
>   critical_enter()/critical_exit().  This didn't work because it broke
>   mtx_assert().  It might also not work because it removes the memory
>   barrier.  critical_enter() only has the very weak memory barrier in
>   disable_intr() on i386's.

That was only for the UP case, in which case you don't need the membars.
A single CPU always consistently sees what it has written.  The only case
when it doesn't is for memory that can be written to by device DMA, and
that doesn't apply to kernel data structures, especially not to the ones
used for scheduling, etc.  I actually have (untested) patches in the smpng
branch to remove the one use of mtx_owned() on sched_lock (the TSS munging
code); mtx_assert() is not as big a deal, since it can work fine by
checking td_critnest.

The problem with the [lms]fence instructions is that sfence is only on
PIII+, and lfence is only on PIV+.  I don't recall when mfence first
appeared... perhaps PII?
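
To make the UP point concrete, dropping the xchgl from the !SMP case could
look roughly like the untested sketch below (the real macro-generated
versions in i386/include/atomic.h are spelled a bit differently): on UP a
compiler barrier in front of a plain store is all the "release" we need.

#include <sys/types.h>		/* u_int */

#ifdef SMP	/* in the kernel this comes from the config via opt_global.h */
static __inline void
atomic_store_rel_int(volatile u_int *p, u_int v)
{
	/* xchgl with a memory operand is implicitly locked: full barrier. */
	__asm __volatile("xchgl %1,%0" : "+m" (*p), "+r" (v) : : "memory");
}
#else
static __inline void
atomic_store_rel_int(volatile u_int *p, u_int v)
{
	/* Only keep the compiler from sinking earlier accesses below the store. */
	__asm __volatile("" : : : "memory");
	*p = v;
}
#endif

The SMP version keeps the implicitly locked xchgl; the UP version costs
little more than the store itself.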
If the lock is really expensive, then perhaps we could make atomic_cmpset()
be actual functions (ugh) rather than inlines, and have them do a branch to
use foofence on PIV rather than the default.  The branches would suck, but
it might be faster than the lock.  Of course, this would greatly pessimize
non-PIV.

-- 
John Baldwin  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
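
P.S.  A rough, untested sketch of the out-of-line atomic_cmpset() idea,
just to show the shape of it.  "cpu_has_sse2" is a stand-in for whatever
feature test we would actually use (cpu_feature & CPUID_SSE2 on i386), and
the SMP case would still need to grow the lock prefix; only the UP fence
selection is shown here.

#include <sys/types.h>		/* u_int, u_char */

static int cpu_has_sse2;	/* stand-in; set at boot from the CPU feature bits */

int
atomic_cmpset_int(volatile u_int *dst, u_int exp, u_int src)
{
	u_char res;

	if (cpu_has_sse2) {
		/* PIV path: unlocked cmpxchg plus an explicit fence. */
		__asm __volatile(
		"	cmpxchgl %3,%1 ;"
		"	sete	%0 ;"
		"	mfence ;"
		: "=q" (res),		/* 0: success flag */
		  "+m" (*dst),		/* 1 */
		  "+a" (exp)		/* 2: comparand in/old value out */
		: "r" (src)		/* 3 */
		: "memory", "cc");
	} else {
		/* Default path: no fence instruction available. */
		__asm __volatile(
		"	cmpxchgl %3,%1 ;"
		"	sete	%0 ;"
		: "=q" (res),
		  "+m" (*dst),
		  "+a" (exp)
		: "r" (src)
		: "memory", "cc");
	}
	return (res);
}

The call plus the branch is what we would be trading against the locked
instruction, which is why this would pessimize non-PIV.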