Date:      Wed, 4 Jul 2007 11:47:35 -0700 (PDT)
From:      Jeff Roberson <jroberson@chesapeake.net>
To:        Alfred Perlstein <alfred@freebsd.org>
Cc:        arch@freebsd.org
Subject:   Re: Fine grain select locking.
Message-ID:  <20070704114005.X552@10.0.0.1>
In-Reply-To: <20070704105525.GU45894@elvis.mu.org>
References:  <20070702230728.E552@10.0.0.1> <20070703181242.T552@10.0.0.1> <20070704105525.GU45894@elvis.mu.org>

On Wed, 4 Jul 2007, Alfred Perlstein wrote:

> * Jeff Roberson <jroberson@chesapeake.net> [070703 18:16] wrote:
>> Here is an update that avoids the malloc per fd when there are no
>> collisions.  This unfortunately adds 64 bytes to every socket in the
>> system.  This is less than 10% of the size of the socket.  Vnodes only
>> allocate their selinfo structures on demand so this does not cause a
>> per-file overhead.  This was suggested by Peter.   This patch also uses a
>> vm zone for the selfd structures.  I can shrink them slightly by using a
>> SLIST in one case vs TAILQ as well.
>>
>> http://people.freebsd.org/~jeff/select2.diff
>
> Jeff, I understand you're trying to speed up mysql micro benchmarks,
> but have you done any benchmarking on large select operations?

I don't know that I'd call mysql a micro-benchmark.  This patch also 
didn't help there as much as I had hoped and I'm still trying to 
understand why.

>
> You seemed very dismissive when I brought up caching of the selfd
> objects and malloc'd bitmap space per-thread on IRC, so I'd like
> to know if that was based on anything.

Caching the selfd objects would have a tremendous impact on storage 
overhead.  I'm not sure how valuable caching the bits is.  We avoid the 
malloc using the following stack allocation in most cases:

fd_mask s_selbits[howmany(2048, NFDBITS)];

If you're interested in further optimizing that I welcome you to it.  My
motivation was to address the locking without significantly altering other
characteristics.  If you wanted to cache the bits you might also be able
to cache the selfd registration and do something similar to selrescan().
This would have to be done very carefully to avoid semantic changes.
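
For readers following along, the stack allocation above works roughly as
sketched below.  This is a simplified user-space illustration, not the code
from the patch: sel_mask_t, SEL_NFDBITS, SEL_HOWMANY and scan_with_bits()
are hypothetical stand-ins for the kernel's fd_mask, NFDBITS, howmany() and
the select path, and only a single bitmap is sized here.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's fd_mask, NFDBITS and howmany(). */
typedef unsigned long sel_mask_t;
#define SEL_NFDBITS       (sizeof(sel_mask_t) * CHAR_BIT)
#define SEL_HOWMANY(x, y) (((x) + ((y) - 1)) / (y))
#define SEL_STACK_FDS     2048

/*
 * Size a descriptor bitmap for nd descriptors.  Returns 0 if the on-stack
 * buffer was large enough, 1 if the bitmap had to come from malloc(), and
 * -1 if that allocation failed.
 */
static int
scan_with_bits(unsigned int nd)
{
	sel_mask_t s_selbits[SEL_HOWMANY(SEL_STACK_FDS, SEL_NFDBITS)];
	sel_mask_t *selbits;
	size_t nbytes = SEL_HOWMANY(nd, SEL_NFDBITS) * sizeof(sel_mask_t);
	int used_heap = 0;

	if (nbytes <= sizeof(s_selbits)) {
		/* Common case: the bitmap fits on the stack, no malloc. */
		selbits = s_selbits;
	} else {
		/* Large descriptor sets fall back to the heap. */
		selbits = malloc(nbytes);
		if (selbits == NULL)
			return (-1);
		used_heap = 1;
	}
	memset(selbits, 0, nbytes);

	/* A real implementation would copy in and scan the fd sets here. */

	if (used_heap)
		free(selbits);
	return (used_heap);
}
```

With these assumptions, any request for up to 2048 descriptors stays on the
stack and only larger sets pay for an allocation.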

>
> What are the numbers before and after for selecting on 1000
> or maybe 10000 descriptors before and after your patch?

I agree this needs to be done.  Diane Bruce has a patch to improve the 
instruction cost of scanning the fds as well.  She has a test framework 
for select and has most graciously offered to do these measurements.

>
> This is especially important if you'd like it in the door
> for 7.0, right?

I decided I wouldn't even propose that.  I think it's premature and needs 
to be studied more.

Thanks,
Jeff

>
> -Alfred
>
>
>
>>
>> Thanks,
>> Jeff
>>
>> On Mon, 2 Jul 2007, Jeff Roberson wrote:
>>
>>> I have a diff which makes the following improvements to select:
>>>
>>> 1) Per-thread wait channel rather than global select wait channel.
>>> 2) Per-thread select lock.
>>> 3) Rescan after sleep scans only descriptors which have become active.
>>> 4) No exposed select internals.
>>> 5) selwakeuppri() works again.
>>> 6) No thread_lock()ing in select, no TDF_SELECT required.
>>> 7) No more collisions.
>>>
>>> This is based on an approach from Alfred with some locking and rescan
>>> improvements by me.  It only required changing select users in cases where
>>> they assumed only one thread could select at a time.
>>>
>>> The unfortunate cost of this patch is that a descriptor per select fd must
>>> be allocated to track individual threads.  This is what allows us to know
>>> which descriptor has fired an event and allows us to use per-thread
>>> locking etc.
>>>
>>> The one thing I haven't fixed is netsmb and netncp which both have some
>>> wonky select implementation that could be replaced with kern_select().
>>> That could be done separately from this patch but is required for this to
>>> go in.
>>>
>>> http://people.freebsd.org/~jeff/select.diff
>>>
>>> Comments and suggestions welcome.
>>>
>>> Thanks,
>>> Jeff
>>> _______________________________________________
>>> freebsd-arch@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>>>
>
> -- 
> - Alfred Perlstein
>


