Date: Thu, 11 Mar 2004 05:19:07 +1100 (EST)
From: Bruce Evans
To: Tim Robbins
Cc: src-committers@freebsd.org, cvs-src@freebsd.org, Alfred Perlstein,
    cvs-all@freebsd.org, John Birrell, Alexander Kabaev
Subject: Re: cvs commit: src/lib/libc/stdio _flock_stub.c local.h
Message-ID: <20040311044539.S2105@gamplex.bde.org>
In-Reply-To: <20040310131348.GA95975@cat.robbins.dropbear.id.au>

On Thu, 11 Mar 2004, Tim Robbins wrote:

> On Tue, Mar 09, 2004 at 07:59:12PM -0800, Alfred Perlstein wrote:
>
> > * Bruce Evans [040309 07:50] wrote:
> > > This would pessimize even getc_unlocked() and putc_unlocked().  getc()
> > > and putc() are now extern functions, but the old macro/inline versions
> > > are still available as getc_unlocked() and putc_unlocked().  Simple
> > > benchmarks for reading a 100MB file on an Athlon XP1600 overclocked
> > > show that the function versions are up to 9 times slower:
> > > ...
> >
> > Hmm, can't we use macros that do this:
> >
> > #define getc() (__isthreaded ? old_unlocked_code : getc_unlocked())
> >
> > Where __isthreaded is a global that's set by threading libraries
> > to 1 and 0 by non-threaded libc, this should get rid of a lot of
> > the function call overhead.
>
> Sounds like a good idea to me. In my testing, this approach was about 5%
> slower than calling getc_unlocked() directly (due to the conditional jump),
> but roughly 3 times faster than a call to the getc() function.

You must not have tested the dynamic linkage case :-).

> If there aren't any objections, I think we should implement getc()/putc()
> this way (and all the other stdio functions that have traditionally had
> macro equivalents) before 5-stable to try to recoup some of the performance
> losses caused by the removal of the macros.

Is __isthreaded always set early enough?  What about if the application
is dynamically linked and loads thread support later (is this supported)?

The 5% cost of checking on every call can be avoided by pushing the check
into a function.  E.g.: for getc():

% #define __sgetc(p) (--(p)->_r < 0 ? __srget(p) : (int)(*(p)->_p++))

It can be arranged that --(p)->_r < 0 is always true for the threaded case
(by keeping only a flag in it and keeping the real count elsewhere).  The
threaded case would become slightly slower since it would always have the
dummy count check plus a dummy count fixup.  Actually it shouldn't need the
fixup.  The above is hand-optimized for PDP11's and would probably be no
slower on current hardware with current compilers written as:

#define getc(p) ((p)->_r <= 0 ? __srget(p) : (--(p)->_r, (int)(*(p)->_p++)))

I once wrote a version of stdio that optimized the usual case putc()
similarly by arranging that the write count is always 0 except for the
fully buffered case, so that other cases get handled by a function (this
pessimizes the line buffered case relative to the FreeBSD putc_unlocked()).

Bruce
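
A minimal sketch of the "keep only a flag in the count" idea, for
illustration only.  Everything named here (struct xfile, x_getc, x_srget,
x_open_mem, real_r, slow_calls) is hypothetical and is not FreeBSD's stdio;
only the shape of the macro follows the second getc() form above.  The
sketch assumes an in-memory buffer with no refill and omits the actual
locking.

```c
#include <stdio.h>	/* only for EOF */

/* Hypothetical stand-in for FILE with just the fields the macro touches. */
struct xfile {
	const unsigned char *_p;	/* next byte to hand out */
	int _r;			/* count seen by the inline fast path */
	int real_r;		/* threaded mode: the real count lives here */
	int threaded;		/* nonzero if every read must take the slow path */
	int slow_calls;		/* times the function was entered (for inspection) */
};

/*
 * Slow path.  In unthreaded mode it is reached only when the buffer is
 * empty; a real stdio would refill here, this sketch just reports EOF.
 * In threaded mode it is reached on every call, because _r is pinned at 0.
 */
static int
x_srget(struct xfile *fp)
{
	fp->slow_calls++;
	if (!fp->threaded)
		return (EOF);
	/* A real implementation would lock the stream here. */
	if (fp->real_r <= 0)
		return (EOF);
	fp->real_r--;
	return ((int)(*fp->_p++));
	/* ... and unlock before returning. */
}

/* Same shape as the second getc() macro above; note fp is evaluated twice. */
#define x_getc(fp) \
	((fp)->_r <= 0 ? x_srget(fp) : (--(fp)->_r, (int)(*(fp)->_p++)))

/* Point the stream at a memory buffer and choose where the count lives. */
static void
x_open_mem(struct xfile *fp, const char *buf, int len, int threaded)
{
	fp->_p = (const unsigned char *)buf;
	fp->threaded = threaded;
	fp->slow_calls = 0;
	fp->_r = threaded ? 0 : len;	/* threaded: fast path never taken */
	fp->real_r = threaded ? len : 0;
}
```

With this layout the unthreaded stream enters x_srget() only once, at end of
buffer, while the threaded stream enters it on every call without the macro
ever testing a separate global such as __isthreaded.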