Date:      Tue, 22 May 2007 01:16:22 +0200
From:      Attilio Rao <attilio@FreeBSD.org>
To:        Bruce Evans <bde@optusnet.com.au>
Cc:        arch@freebsd.org
Subject:   Re: sched_lock && thread_lock()
Message-ID:  <46522846.5000707@FreeBSD.org>
In-Reply-To: <20070521225032.C57233@delplex.bde.org>
References:  <20070520155103.K632@10.0.0.1> <20070521113648.F86217@besplex.bde.org> <20070520213132.K632@10.0.0.1> <4651CAB8.8070007@FreeBSD.org> <4651CE2F.8080908@FreeBSD.org> <20070521022847.D679@10.0.0.1> <20070521195811.G56785@delplex.bde.org> <4651FCB5.7070604@FreeBSD.org> <20070521225032.C57233@delplex.bde.org>

Bruce Evans wrote:
> Just change the PCPU_LAZY_INC() macro to access the non-per-cpu global,
> using the same locking as you use for the non-pcpu access method.  I
> do this in some (old) sys trees for the !SMP case.  This gives a small
> unpessimization and fixes reading the values from dead kernels, but
> only in the !SMP case.  I also remove pc_cnt from the pcpu data in the
> !SMP case.  This is a smaller optimization and requires larger changes
> to remove adding up the pcpu data in vm_meter.c.  The adding up can
> be left unchanged since it becomes a no-op if the data is left at all
> zeros.  In the old sys trees, there is no locking for individual accesses,
> so PCPU_LAZY_INC(var) becomes (++(var)).  In the current tree,
> PCPU_LAZY_INC(var) is (++(<pcpu var>)) written in asm for some arches to
> ensure that it is atomic with respect to interrupts on these arches.
> Atomicity with respect to interrupts is necessary and sufficient for
> the non-per-cpu global !SMP case too.

So you want these fields to no longer be per-CPU at all; OK, now I see 
what you mean.
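
If I understand correctly, in the !SMP case that would look roughly
like this (just a hypothetical sketch; the real SMP macro is per-arch
asm and the names may differ):

/*
 * Sketch of the mapping Bruce describes (not actual code): with a
 * single CPU the increment only needs to be atomic with respect to
 * interrupts, so the counter can live directly in the plain global
 * struct vmmeter and a simple ++ is enough.
 */
#ifdef SMP
#define	PCPU_LAZY_INC(var)	(++*PCPU_PTR(var))	/* per-CPU copy */
#else
#define	PCPU_LAZY_INC(var)	(++(var))		/* global vmmeter field */
#endif

So a caller would write PCPU_LAZY_INC(cnt.v_trap) and get the per-CPU
pc_cnt.v_trap on SMP but the plain cnt.v_trap on UP.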

> Bugs found while grepping for sched_locking of VMCNT fields:
> - v_trap is supposed to be updated by PCPU_LAZY_INC(), but subr_trap.c
>   updates it by VMCNT_ADD().  The direct access used to be perfectly
>   unlocked and non-atomic, since it occurs immediately after releasing
>   sched_lock and just used "++".  Now it is atomic and a
>   micro-pessimization.
> - v_trap's name is now hidden by the macro so grepping for it doesn't work.

Great, this is a bug introduced in the old code that I didn't catch.
I'm going to switch it to PCPU_LAZY_INC().
Anyway, you can grep for those like this (assuming you are in /usr/src/sys/):
$ grep -r trap * | grep VMCNT_
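
The fix in subr_trap.c should then be something along these lines (the
exact VMCNT_ADD() arguments here are from memory, so they may differ):

-	VMCNT_ADD(v_trap, 1);		/* unwanted atomic update */
+	PCPU_LAZY_INC(cnt.v_trap);	/* lazy, interrupt-atomic only */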

> Non(?)-bugs found while grepping for VMCNT:
> - most updates of v_free_count are via pq->lcnt++ and pq->lcnt-- in
>   vm_page.c and vm_pageq.c.  pq->lcnt is VMCNT_PTR(&cnt.v_free_count)
>   for all pq.  The locking for this seems to be correct except of course
>   in sysctls, since it uses vm_page_queue_free_mtx and never needed
>   VMCNT access methods.  The unneeded VMCNT_GET()s for this alone comprise
>   about 1/3 of the VMCNT_GET()s in vm.

You have to consider that VMCNT_*() is also used to deal with the 
'volatilization' of struct vmmeter cnt, so it entirely hides the 
implementation details of the vmmeter counters.  VMCNT_PTR() is 
currently the right way to access these data if you want to hide the 
volatile qualifier from the sysctls (and please note that VMCNT_GET() 
doesn't add any pessimization to the current code).
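
For reference, here is a minimal sketch of what I mean (my reading of
it, not the literal macros from the patch):

/*
 * Sketch only: the global counters are declared volatile, VMCNT_GET()
 * reads a field through the volatile qualifier, and VMCNT_PTR() strips
 * the qualifier so that consumers such as the sysctls (and pq->lcnt)
 * can keep a plain u_int pointer.
 */
extern volatile struct vmmeter cnt;

#define	VMCNT_GET(member)	(cnt.member)
#define	VMCNT_PTR(ptr)		((u_int *)(uintptr_t)(ptr))

This way a sysctl can export VMCNT_PTR(&cnt.v_free_count) without the
compiler warning about the dropped qualifier, while every direct read
still goes through the volatile object.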

Thanks,
Attilio


