Date:      Sat, 30 Apr 2005 09:27:55 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Jeff Roberson <jroberson@chesapeake.net>
Cc:        cvs-all@FreeBSD.org
Subject:   Re: cvs commit: src/sys/vm uma_core.c uma_int.h
Message-ID:  <20050430091629.O31768@fledge.watson.org>
In-Reply-To: <20050429215256.E5127@mail.chesapeake.net>
References:  <200504291856.j3TIuapc077941@repoman.freebsd.org> <20050429215256.E5127@mail.chesapeake.net>


On Fri, 29 Apr 2005, Jeff Roberson wrote:

>>   Modify UMA to use critical sections to protect per-CPU caches, rather than
>>   mutexes, which offers lower overhead on both UP and SMP.  When allocating
>>   from or freeing to the per-cpu cache, without INVARIANTS enabled, we now
>>   no longer perform any mutex operations, which offers a 1%-3% performance
>>   improvement in a variety of micro-benchmarks.  We rely on critical
>>   sections to prevent (a) preemption resulting in reentrant access to UMA on
>>   a single CPU, and (b) migration of the thread during access.  In the event
>>   we need to go back to the zone for a new bucket, we release the critical
>>   section to acquire the global zone mutex, and must re-acquire the critical
>>   section and re-evaluate which cache we are accessing in case migration has
>>   occurred, or circumstances have changed in the current cache.
>
> Excellent work.  Thanks.  You could also use sched_pin() in uma_zalloc
> to prevent migration, so you can be certain that you're still accessing
> the same cache; you still won't be able to trust the state of that
> cache.  I'm not sure whether this would make a difference, but it could
> be beneficial if we decide to do per-CPU slab lists for locality on
> NUMA machines.
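
For anyone following along: the new allocation fast path has roughly the
following shape.  This is only a sketch; the zone/cache structure and
helper names (sketch_zone, cache_alloc_item(), zone_fetch_full_bucket(),
and so on) are made up for illustration and are not the actual
uma_core.c code, though critical_enter()/critical_exit() and the mutex
calls are the real kernel primitives.

/*
 * Sketch of the critical-section-protected per-CPU fast path.  The
 * structure and helper names are illustrative placeholders, not real
 * UMA structures or functions.
 */
void *
sketch_zalloc(struct sketch_zone *zone)
{
    struct sketch_cache *cache;
    struct sketch_bucket *bucket;
    void *item;

    critical_enter();
    for (;;) {
        /* Cache of whichever CPU we are running on right now. */
        cache = &zone->sz_cpu[curcpu];
        item = cache_alloc_item(cache);
        if (item != NULL)
            break;

        /*
         * The per-CPU bucket is empty.  A mutex may not be acquired
         * while in a critical section, so leave the critical section
         * before taking the zone lock to fetch a full bucket.
         */
        critical_exit();
        mtx_lock(&zone->sz_lock);
        bucket = zone_fetch_full_bucket(zone);
        mtx_unlock(&zone->sz_lock);
        if (bucket == NULL)
            return (NULL);    /* Zone exhausted; slow path omitted. */

        /*
         * Re-enter the critical section.  The thread may have migrated
         * or been preempted while the zone lock was held, so re-read
         * curcpu and re-evaluate the cache before retrying; the real
         * code also copes with the cache having been refilled in the
         * meantime.
         */
        critical_enter();
        cache = &zone->sz_cpu[curcpu];
        cache_install_bucket(cache, bucket);
    }
    critical_exit();
    return (item);
}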

In my first pass, I did use sched_pin(), but since I had to revalidate
the state of the cache anyway whenever the critical section was released
to acquire a mutex, pinning added complexity without any immediate,
measurable benefit.  I'm also a bit wary of over-pinning, as it prevents
the scheduler from balancing load well, for example by migrating the
thread when a higher-priority thread (such as a pinned ithread that
preempts) wants to run on the current CPU.  I don't have any
measurements to suggest to what extent this occurs in practice, but I
think these are issues we should explore.  A case like that isn't quite
priority inversion, since the thread with the higher priority will take
precedence, but it might result in less effective utilization if the
pinned thread uses quite a bit of CPU.
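
To make that concrete, the refill step in the sketch earlier in this
message could be bracketed with sched_pin()/sched_unpin(), roughly as
below.  Again this is only a sketch, not what the first pass actually
looked like, and it reuses the made-up names from the earlier sketch.

    /*
     * sched_pin() variant (sketch only).  Pinning before leaving the
     * critical section keeps the thread on the same CPU across the
     * zone lock, so the same per-CPU cache is reused, but the cache
     * contents may still have changed while the lock was held and
     * must be revalidated either way.
     */
    sched_pin();
    critical_exit();
    mtx_lock(&zone->sz_lock);
    bucket = zone_fetch_full_bucket(zone);
    mtx_unlock(&zone->sz_lock);
    critical_enter();
    sched_unpin();
    /* Same CPU, same cache, but not necessarily the same state. */

Since the cache has to be re-checked in either case, the pin only starts
to pay off if the identity of the cache itself matters, as it would for
per-CPU slab lists.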

Robert N M Watson


