FreeBSD Mail Archives

Date:      Mon, 5 Jul 1999 16:11:07 -0400 (EDT)
From:      Zach Brown <zab@zabbo.net>
To:        Jonathan Lemon <jlemon@americantv.com>
Cc:        Mike Smith <mike@smith.net.au>, hackers@FreeBSD.ORG
Subject:   Re: poll() scalability
Message-ID:  <Pine.LNX.4.10.9907051551020.5548-100000@hoser>
In-Reply-To: <19990705144352.55649@right.PCS>

On Mon, 5 Jul 1999, Jonathan Lemon wrote:

> Yes, but I also need support for temporarily "de-registering" interest
> in an fd, as well as selectively choosing read/write/close events.

yeah, this isn't terribly doable in the sigio/signal model.  as you note
later, this is indeed edge triggered, so if you were to 'turn off' sigio on
the fds for a bit, you might lose a state transition.  this can be
trivially accounted for in the userland code by receiving all events and
filtering out the ones you aren't interested in, and that's what I
currently do, but having a mask in the kernel might be more sensible.
(as it turns out there are very few errant signals)
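
a rough sketch of what that userland filtering could look like, assuming
a fixed-size table indexed by fd.  the names (MAX_FDS, set_interest,
filter_events) are made up for illustration and are not from any real
code; the poll()-style bits just stand in for whatever the signal
carries.

```c
#include <assert.h>
#include <poll.h>

#define MAX_FDS 1024            /* hypothetical table size */

/* per-fd interest mask kept in userland; 0 means "de-registered for now" */
static short interest[MAX_FDS];

/* record which events (POLLIN, POLLOUT, ...) we care about on fd */
static void set_interest(int fd, short events)
{
    assert(fd >= 0 && fd < MAX_FDS);
    interest[fd] = events;
}

/* the kernel delivers everything; keep only the bits we asked for */
static short filter_events(int fd, short delivered)
{
    assert(fd >= 0 && fd < MAX_FDS);
    return (short)(delivered & interest[fd]);
}
```

an event whose filtered mask comes back 0 is simply dropped, which is
where an in-kernel mask would save the wakeup.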

> In this case, it doesn't seem all that different than a poll() type
> call, or an event queue that Peter was talking about.  If the signal
> is blocked, then sigwaitinfo() effectively becomes a poll(), but with
> no timeout parameter.

in that it's blocking, yes, but it's the only thing we're blocking on in
the event engine.  that's the only way I can see that it is like poll() :)
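
for reference, a minimal sketch of that single blocking point using only
POSIX calls.  the helper names are hypothetical.  how the signal gets
attached to an fd is left out because it is system specific (on linux
that would be fcntl() with F_SETSIG, which is an assumption here, not
something this sketch does).

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* block signo so instances are queued instead of running a handler */
static int block_event_signal(int signo)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* sleep until one queued instance of signo arrives; note there is no
 * timeout parameter.  returns signo on success, -1 on error. */
static int wait_event(int signo, siginfo_t *info)
{
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigwaitinfo(&set, info);
}
```

if a timeout is wanted, sigtimedwait() takes the same arguments plus a
struct timespec.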

> I agree.  One aspect of the design that should be explored is whether
> the new call should report "events", or "state".  My current implementation
> reports state; (think of it as a level-triggered design), while the 

the call should be able to do either, I'd guess.  everything will be based
around changes in state, regardless.  the difference is whether you send
those as units of work off to userland or if you toss them at an internal
data structure that will then determine if the user cares about the
resulting state.  (so note that the 'level triggered' system could be
built in user space code provided an 'edge triggered' kernel facility. some
might argue for taking this path)
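
one way the user space layering might look, as a sketch only: edge
notifications set bits in a per-fd table, an EAGAIN clears them, and the
application asks the table for current state.  STATE_TABLE_SIZE and the
function names are invented for this example.

```c
#include <assert.h>
#include <poll.h>

#define STATE_TABLE_SIZE 1024   /* hypothetical table size */

/* last known readiness per fd, built up from edge notifications */
static short ready[STATE_TABLE_SIZE];

/* an edge notification arrived: fd became ready for `events` */
static void on_edge(int fd, short events)
{
    assert(fd >= 0 && fd < STATE_TABLE_SIZE);
    ready[fd] = (short)(ready[fd] | events);
}

/* a read()/write() returned EAGAIN: fd is no longer ready for `events` */
static void on_would_block(int fd, short events)
{
    assert(fd >= 0 && fd < STATE_TABLE_SIZE);
    ready[fd] = (short)(ready[fd] & ~events);
}

/* level-style query: which of `wanted` is fd ready for right now? */
static short current_state(int fd, short wanted)
{
    assert(fd >= 0 && fd < STATE_TABLE_SIZE);
    return (short)(ready[fd] & wanted);
}
```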

> siginfo approach appears to be more of an "edge-triggered" design.  I

definitely..

> Hah.  It showed up on my profiling; the kernel I'm running has routines
> so the child fd inherits certain settings from the parent.

that strikes me as odd.. these calls should be trivial when compared to
other things that are going on..  the only annoying thing this does to the
sigio/siginfo state machine is requiring an initial read() on the socket..
but as that almost always returns the request, it's all right.

> > 	read() in the header (usually done in one read, but rarely
> > 		will block and require falling back to a POLL_IN on
> > 		the new fd)
> 
> Well, we _NEVER_ want to block.  Ever.  And in my particular case, 
> it is quite common for the header/data to be spread out over several
> reads.

this only blocks when there is no work to do :)  what I was referring to
up there was the read() returning EAGAIN and us waiting for the POLL_IN
event on the socket to say that there is data ready..
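
to make the EAGAIN case concrete, here is a small sketch of a
non-blocking read helper.  set_nonblocking, try_read and the
read_result enum are hypothetical names; what the caller does on
READ_WOULD_BLOCK (go back to waiting for the next event on that fd) is
outside the sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

enum read_result { READ_DATA, READ_WOULD_BLOCK, READ_EOF, READ_ERROR };

/* put fd into non-blocking mode; returns 0 on success, -1 on error */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* try one read(); never sleeps on a non-blocking fd */
static enum read_result try_read(int fd, char *buf, size_t cap, size_t *got)
{
    ssize_t n;

    *got = 0;
    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);      /* retry if interrupted */

    if (n > 0) {
        *got = (size_t)n;
        return READ_DATA;
    }
    if (n == 0)
        return READ_EOF;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return READ_WOULD_BLOCK;            /* wait for the next event */
    return READ_ERROR;
}
```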

> Correct.  I need this for a web caching proxy.  The above loop won't work
> in my particular case.

??  works fine for me.  (am getting 3500 reqs/second with a single thread
over localhost with small files on an smp machine)

> Exactly.  Sometimes, we just want to close() an fd, even if there are 
> pending events, and then immediately re-open with a new fd.  Deferred
> closes are not an option.  What I do at the moment is remove queued 
> events/state when the fd is closed.  (actually, my implementation sucks
> a bit, as I re-scan the state for this particular case).

the act of walking the queues could easily be more expensive than doing
the deferred close stuff if your queues are of respectable length.  all the
deferred close really does is add a latency to fd re-use, it doesn't
significantly change the work involved..
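
one possible reading of 'deferred close', sketched under assumptions:
instead of calling close() right away, the fd is parked on a small list
so its number can't be reused while stale events may still be queued,
and the real close() happens once the queue has been drained.  the
names and the fixed-size array are invented for illustration.

```c
#include <assert.h>
#include <unistd.h>

#define MAX_DEFERRED 256        /* hypothetical limit */

static int deferred[MAX_DEFERRED];
static int ndeferred;

/* instead of close(fd): park it; returns -1 if the list is full */
static int defer_close(int fd)
{
    assert(fd >= 0);
    if (ndeferred == MAX_DEFERRED)
        return -1;
    deferred[ndeferred++] = fd;
    return 0;
}

/* is a queued event stale because its fd is awaiting close?
 * linear scan for brevity; a per-fd flag would make this O(1) */
static int is_deferred(int fd)
{
    for (int i = 0; i < ndeferred; i++)
        if (deferred[i] == fd)
            return 1;
    return 0;
}

/* call once the pending events are drained; returns how many closed */
static int flush_deferred(void)
{
    int closed = 0;
    for (int i = 0; i < ndeferred; i++)
        if (close(deferred[i]) == 0)
            closed++;
    ndeferred = 0;
    return closed;
}
```

the cost is the one described above: the fd number stays occupied a
little longer, but nothing has to walk the event queue.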

> Well, I'm sure that we have a lot of engineering talent around here.  :-)

:)

I need to go give that paper a read..

-- zach

- - - - - -
007 373 5963

