Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 22 May 2007 20:44:36 +1000 (EST)
From:      Bruce Evans <bde@optusnet.com.au>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        arch@freebsd.org
Subject:   Re: sched_lock && thread_lock()
Message-ID:  <20070522201336.C87981@besplex.bde.org>
In-Reply-To: <20070522162819.N5249@besplex.bde.org>
References:  <20070520155103.K632@10.0.0.1> <20070521113648.F86217@besplex.bde.org> <20070520213132.K632@10.0.0.1> <4651CAB8.8070007@FreeBSD.org> <4651CE2F.8080908@FreeBSD.org> <20070521022847.D679@10.0.0.1> <20070521195811.G56785@delplex.bde.org> <4651FCB5.7070604@FreeBSD.org> <20070521225032.C57233@delplex.bde.org> <20070522162819.N5249@besplex.bde.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Tue, 22 May 2007, Bruce Evans wrote:

> I'm still searching for a 1% performance loss in makeworld under UP
> that occurred just after the VMCNT changes.  I've only found this
> harmless (?) bug so far:
>
> % Index: vnode_pager.c
> % ===================================================================
> % RCS file: /home/ncvs/src/sys/vm/vnode_pager.c,v
> % retrieving revision 1.232
> % retrieving revision 1.233
> % diff -u -2 -r1.232 -r1.233
> % --- vnode_pager.c	14 Oct 2006 23:21:48 -0000	1.232
> % +++ vnode_pager.c	18 May 2007 07:10:50 -0000	1.233
> % @@ -910,6 +910,6 @@
> %  	atomic_add_int(&runningbufspace, bp->b_runningbufspace);
> % % -	cnt.v_vnodein++;
> % -	cnt.v_vnodepgsin += count;
>   	                    ^^^^^
> % +	VMCNT_ADD(vnodein, 1);
> % +	VMCNT_ADD(vnodepgsin, 1);
>   	                      ^
> % %  	/* do the input */

4 more translation errors breaking 8 counters altogether (v_vnodepgsin
is broken twice):

% Index: kern_fork.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/kern/kern_fork.c,v
% retrieving revision 1.270
% retrieving revision 1.271
% diff -u -2 -r1.270 -r1.271
% --- kern_fork.c	4 Apr 2007 09:11:32 -0000	1.270
% +++ kern_fork.c	18 May 2007 07:10:45 -0000	1.271
% @@ -666,18 +666,18 @@
% 
%  	if (flags == (RFFDG | RFPROC)) {
% -		atomic_add_int(&cnt.v_forks, 1);
% -		atomic_add_int(&cnt.v_forkpages, p2->p_vmspace->vm_dsize +
% +		VMCNT_ADD(forks, 1);
% +		VMCNT_ADD(forkpages, p2->p_vmspace->vm_dsize +
%  		    p2->p_vmspace->vm_ssize);
%  	} else if (flags == (RFFDG | RFPROC | RFPPWAIT | RFMEM)) {
% -		atomic_add_int(&cnt.v_vforks, 1);
    		                      ^^^^^^
% -		atomic_add_int(&cnt.v_vforkpages, p2->p_vmspace->vm_dsize +
    		                      ^^^^^^^^^^
% +		VMCNT_ADD(forks, 1);
    		          ^^^^^
% +		VMCNT_ADD(forkpages, p2->p_vmspace->vm_dsize +
    		          ^^^^^^^^^
%  		    p2->p_vmspace->vm_ssize);
%  	} else if (p1 == &proc0) {
% -		atomic_add_int(&cnt.v_kthreads, 1);
% -		atomic_add_int(&cnt.v_kthreadpages, p2->p_vmspace->vm_dsize +
% +		VMCNT_ADD(kthreads, 1);
% +		VMCNT_ADD(kthreadpages, p2->p_vmspace->vm_dsize +
%  		    p2->p_vmspace->vm_ssize);
%  	} else {
% -		atomic_add_int(&cnt.v_rforks, 1);
% -		atomic_add_int(&cnt.v_rforkpages, p2->p_vmspace->vm_dsize +
% +		VMCNT_ADD(rforks, 1);
% +		VMCNT_ADD(rforkpages, p2->p_vmspace->vm_dsize +
%  		    p2->p_vmspace->vm_ssize);
%  	}

% Index: vnode_pager.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/vm/vnode_pager.c,v
% retrieving revision 1.232
% retrieving revision 1.233
% diff -u -2 -r1.232 -r1.233
% --- vnode_pager.c	14 Oct 2006 23:21:48 -0000	1.232
% +++ vnode_pager.c	18 May 2007 07:10:50 -0000	1.233
% @@ -1158,6 +1159,6 @@
%  	auio.uio_td = (struct thread *) 0;
%  	error = VOP_WRITE(vp, &auio, ioflags, curthread->td_ucred);
% -	cnt.v_vnodeout++;
    	           ^^^
% -	cnt.v_vnodepgsout += ncount;
    	              ^^^
% +	VMCNT_ADD(vnodein, 1);
    	               ^^
% +	VMCNT_ADD(vnodepgsin, ncount);
    	                  ^^
% 
%  	if (error) {

> Related API problem: v_vnodein and v_vnodepgsin are not used in the kernel
> except to increment them and to export them using sysctl, so they should
> be incremented using PCPY_LAZY_INC().  However, PCPU_LAZY_INC() can only
> do increments of 1.

Another problem with using PCU_LAZY_INC() is that all counters might need
to be exported to userland by emulators, but only the vm_meter sysctls
understand PCPU_LAZY_INC().  At least the following counters are currently
exported by at least the linprocfs emulator:

 	v_intr, v_swtch, v_vnodepgsin, v_vnodepgsout

linprocfs accesses all these counters using VMCNT_GET(), but VMCNT_GET()
doesn't work for counters in the pcpu.  v_intr is updated by PCPU_LAZY_INC()
so it is broken in linprocfs.  The others aren't yet updated by
PCPU_LAZY_INC() so they aren't broken in linprocfs (except the above
bugs break v_vnodepgs*).

Apart from these problems and bugs, PCPU_LAZY_INC() should be used for all
of the counters visible in the above patches.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20070522201336.C87981>