Date:      Thu, 2 Aug 2007 19:06:28 -0700 (PDT)
From:      Jeff Roberson <jroberson@chesapeake.net>
To:        Alfred Perlstein <alfred@freebsd.org>
Cc:        arch@freebsd.org, Kris Kennaway <kris@obsecurity.org>
Subject:   Re: Fine grain select locking.
Message-ID:  <20070802190033.J561@10.0.0.1>
In-Reply-To: <20070803014445.GS92956@elvis.mu.org>
References:  <20070702230728.E552@10.0.0.1> <20070703181242.T552@10.0.0.1> <20070704105525.GU45894@elvis.mu.org> <20070704114005.X552@10.0.0.1> <20070729180722.GB85196@rot26.obsecurity.org> <20070802174819.S561@10.0.0.1> <20070803014445.GS92956@elvis.mu.org>


On Thu, 2 Aug 2007, Alfred Perlstein wrote:

> * Jeff Roberson <jroberson@chesapeake.net> [070802 17:52] wrote:
>>
>> I believe file descriptor locking is the place where we are most lacking.
>> The new sx helped tremendously.  However, this is still going to be a
>> scalability limiter.  I have looked into both linux and solaris's solution
>> to this problem.  Briefly, linux uses RCU to protect the list, which is
>> close to ideal as this is certainly a read heavy workload.  Solaris on the
>> other hand uses the actual file lock to protect the descriptor slot.  So
>> they fetch the file pointer, lock it, and then check to see if they lost a
>> race with the slot being reassigned while they were acquiring the lock.
>> This approach is perhaps better than RCU in many cases except when the
>> descriptor set is expanded.  Then they have to lock every file in the set.
>
> Certainly this is an extreme edge case... ?

Well that may be true, yes.  However, there are other problems with this 
scheme.  For example, flags settings could be done entirely with cmpset, 
without using a lock at all.  In most cases we're just setting a bit which 
can be done with atomic_set.  When we're doing multiple operations we 
could compute the value and attempt to set it in a loop.  So we can 
totally eliminate locking the descriptor here.
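
To make the flags idea concrete, here is a minimal userland sketch using C11 
<stdatomic.h>.  It is not the kernel code: in the kernel the single-bit case 
would map to atomic_set_int() and the loop to atomic_cmpset_int() from 
<machine/atomic.h>.  The struct, the function names, and the DEMO_FLAG_* bits 
are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_FLAG_NONBLOCK 0x1u
#define DEMO_FLAG_APPEND   0x2u
#define DEMO_FLAG_ASYNC    0x4u

/* Stand-in for the flags word of a file structure (assumed layout). */
struct demo_file {
	atomic_uint flags;
};

/* Common case: set one bit with a single atomic OR, no lock held. */
static void
demo_set_flag(struct demo_file *fp, unsigned int bit)
{
	atomic_fetch_or(&fp->flags, bit);
}

/*
 * Multi-operation case: compute the new value from a snapshot and
 * retry the compare-and-swap until no other thread raced with us.
 * Returns the flags value that was replaced.
 */
static unsigned int
demo_update_flags(struct demo_file *fp, unsigned int set, unsigned int clear)
{
	unsigned int old, new;

	old = atomic_load(&fp->flags);
	do {
		new = (old & ~clear) | set;
		/* On failure, 'old' is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&fp->flags, &old, new));
	return (old);
}
```

A caller would use demo_set_flag() for the single-bit case and 
demo_update_flags() when several bits change together; neither path takes 
a lock.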

We also could use atomic ops to protect the file descriptor reference 
count.  This would eliminate another use of the FILE_LOCK().  I'm not sure 
if it's possible to merge this with an approach that uses the FILE_LOCK() 
to protect the descriptor table, although I've not thought it all the way 
through.
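
A rough sketch of the reference count idea, again in userland C11 rather 
than kernel code.  The names demo_fhold() and demo_fdrop() and the struct 
are hypothetical; in the kernel the counter would presumably be adjusted 
with atomic_add_int() and atomic_fetchadd_int().  The sketch assumes the 
caller of demo_fhold() already owns a reference, so it does not address 
the race against a descriptor slot being reassigned.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a file structure with a lock-free reference count. */
struct demo_file {
	atomic_uint refcount;
};

/* Take an extra reference; the caller must already hold one. */
static void
demo_fhold(struct demo_file *fp)
{
	atomic_fetch_add(&fp->refcount, 1);
}

/*
 * Drop a reference.  Returns true when the caller released the last
 * one and is therefore responsible for tearing the file down.
 */
static bool
demo_fdrop(struct demo_file *fp)
{
	/* fetch_sub returns the value before the decrement. */
	return (atomic_fetch_sub(&fp->refcount, 1) == 1);
}
```

Only the thread that sees demo_fdrop() return true frees the object, so 
the hold and drop paths need no lock of their own.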

If the ref count and flags were done with atomics the main consumer of 
FILE_LOCK would actually be the unix domain socket garbage collection 
code.  How's that for old unix baggage?  Do many programs actually pass 
around descriptors these days?  inetd?  others?  It might be worth it to 
lock this separately from the file lock.

Anyway, these things need to be explored for 8.0.  With more cores and 
more multi-threaded applications, file descriptor locking is one of the 
first points of contention, as evidenced by the lock profiles and the 
sophistication of the solutions in other kernels.

Jeff


>
> I could see it happening if we started with low limits, but perhaps
> by keeping counters/stats we could tell people how to tune their
> systems, or even autotune them.
>
> -Alfred
>


