Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 22 May 2007 16:45:19 +1000 (EST)
From:      Bruce Evans <bde@optusnet.com.au>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        arch@freebsd.org
Subject:   Re: sched_lock && thread_lock()
Message-ID:  <20070522162819.N5249@besplex.bde.org>
In-Reply-To: <20070521225032.C57233@delplex.bde.org>
References:  <20070520155103.K632@10.0.0.1> <20070521113648.F86217@besplex.bde.org> <20070520213132.K632@10.0.0.1> <4651CAB8.8070007@FreeBSD.org> <4651CE2F.8080908@FreeBSD.org> <20070521022847.D679@10.0.0.1> <20070521195811.G56785@delplex.bde.org> <4651FCB5.7070604@FreeBSD.org> <20070521225032.C57233@delplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 22 May 2007, Bruce Evans wrote:

> Bugs found while grepping for sched_locking of VMCNT fields:
> - v_trap is supposed to be updated by PCPU_LAZY_INC(), but subr_trap.c
>  updates it by VMCNT_ADD().  The direct access used to be perfectly unlocked
>  and non-atomic, since it occurs immediately after releasing sched_lock
>  and just used "++".  Now it is atomic and a micro-pessimization.
> - v_trap's name is now hidden by the macro so grepping for it doesn't work.

I'm still searching for a 1% performance loss in makeworld under UP
that occurred just after the VMCNT changes.  I've only found this
harmless (?) bug so far:

% Index: vnode_pager.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/vm/vnode_pager.c,v
% retrieving revision 1.232
% retrieving revision 1.233
% diff -u -2 -r1.232 -r1.233
% --- vnode_pager.c	14 Oct 2006 23:21:48 -0000	1.232
% +++ vnode_pager.c	18 May 2007 07:10:50 -0000	1.233
% @@ -910,6 +910,6 @@
%  	atomic_add_int(&runningbufspace, bp->b_runningbufspace);
% 
% -	cnt.v_vnodein++;
% -	cnt.v_vnodepgsin += count;
    	                    ^^^^^
% +	VMCNT_ADD(vnodein, 1);
% +	VMCNT_ADD(vnodepgsin, 1);
    	                      ^
% 
%  	/* do the input */

Related API problem: v_vnodein and v_vnodepgsin are not used in the kernel
except to increment them and to export them using sysctl, so they should
be incremented using PCPY_LAZY_INC().  However, PCPU_LAZY_INC() can only
do increments of 1.

The increments of +1 also complicate searching for the regression.  After
removing `volatile', the VMCNT change should have no effect on the binary,
but it has an effect since increments of 1 are replaced by "movl $1,%reg;
addl %reg,mem" in i386 code, since the i386 atomic asms don't use incl
and the "ir" constraint isn't working as intended (this is with gcc-3.4.6).
The "ir" constraint should result in the "i" variant being chosen and
generate "addl $1,mem".

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070522162819.N5249>