Date: Sun, 4 Jul 1999 11:15:13 +0100 (BST)
From: Doug Rabson <dfr@nlsystems.com>
To: Jonathan Lemon <jlemon@americantv.com>
Cc: hackers@freebsd.org, grog@freebie.lemis.com, peter@netplex.com.au
Subject: Re: poll() scalability
Message-ID: <Pine.BSF.4.10.9907041107490.15087-100000@salmon.nlsystems.com>
In-Reply-To: <19990704000042.59954@right.PCS>
On Sun, 4 Jul 1999, Jonathan Lemon wrote:
>
> This is an earlier posting that I attempted to make, perhaps
> it can provide a starting point for discussion. While this
> is already implemented, I'm not averse to tossing it all for
> something better.
> --
> Jonathan
>
>
> ----- Forwarded message from owner-freebsd-arch@FreeBSD.ORG -----
>
> Date: Mon, 5 Apr 1999 17:42:02 -0500
> From: Jonathan Lemon <jlemon@cs.wisc.edu>
> To: freebsd-arch@freebsd.org
>
> I'd like to open discussion on adding a new interface to FreeBSD,
> specifically, a variant of poll().
>
> The problem is that poll() (and select(), as well) do not scale
> well as the number of open file descriptors increases. When there
> are a large number of descriptors under consideration, poll() takes
> an inordinate amount of time. For the purposes of my particular
> application, "large" translates into roughly 40K descriptors.
>
> As having to walk this descriptor list (and pass it between user and
> kernel space) is unpalatable, I would like to have the interface
> simply take a "change" list instead. The kernel would keep the
> state of the descriptors being examined, and would, in turn, return
> a short list of descriptors that actually had any activity.
>
> In essence, I want to move the large "struct pollfd" array that I
> have into the kernel, and then instruct the kernel to add/remove
> entries from this array, and only return the array subset which
> has activity.
How does the kernel manage this? Does each process potentially store a
struct pollfd in struct proc? This seems a bit limiting since it forces a
program to have exactly one call to poll.
Peter's description of David Filo's event queue thing seems to solve that
problem by introducing a new kernel object (the event queue).
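To make the "separate kernel object" idea concrete, here is a minimal sketch that models it entirely in user space. Every name in it (`struct evq`, `evq_init`, `evq_set`, `evq_count`) is made up for illustration and is not taken from David Filo's code or from any existing interface; it only shows that each queue carries its own interest list, so a program can hold several.

```c
#include <assert.h>
#include <stddef.h>

#define EVQ_MAX 8   /* arbitrary capacity for the sketch */

struct evq_entry {
    int   fd;       /* descriptor being watched */
    short events;   /* interest mask; 0 marks an unused slot */
};

/* One independent interest set. A program may own several. */
struct evq {
    struct evq_entry slots[EVQ_MAX];
};

/* Mark every slot unused. */
void evq_init(struct evq *q)
{
    size_t i;

    assert(q != NULL);
    for (i = 0; i < EVQ_MAX; i++) {
        q->slots[i].fd = -1;
        q->slots[i].events = 0;
    }
}

/* Add or update interest in fd; events == 0 removes it.
 * Returns 0 on success, -1 if the queue is full. */
int evq_set(struct evq *q, int fd, short events)
{
    size_t i, free_slot = EVQ_MAX;

    assert(q != NULL);
    for (i = 0; i < EVQ_MAX; i++) {
        if (q->slots[i].events != 0 && q->slots[i].fd == fd) {
            q->slots[i].events = events;
            return 0;
        }
        if (q->slots[i].events == 0 && free_slot == EVQ_MAX)
            free_slot = i;
    }
    if (events == 0)
        return 0;               /* removing an absent fd is a no-op */
    if (free_slot == EVQ_MAX)
        return -1;
    q->slots[free_slot].fd = fd;
    q->slots[free_slot].events = events;
    return 0;
}

/* Number of descriptors currently registered on this queue. */
int evq_count(const struct evq *q)
{
    size_t i;
    int n = 0;

    assert(q != NULL);
    for (i = 0; i < EVQ_MAX; i++)
        if (q->slots[i].events != 0)
            n++;
    return n;
}
```

With two such queues, changes made through one are invisible to the other, which is the property a single list hung off struct proc cannot offer.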
>
> A possible (actually, my current) implementation looks like this:
>
> struct fd_change {
>     short fd;
>     short events;
> };
Limited to 32767 file descriptors. Trivial to change though. Do you remove
an fd from the list by setting events to 0?
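A small sketch of the limit, for reference. The 40K-descriptor case mentioned in the proposal does not fit in a short. `struct fd_change_wide`, `fits_narrow` and `make_narrow` are hypothetical names introduced only for this illustration; the widened layout simply follows struct pollfd, which declares fd as an int.

```c
#include <assert.h>
#include <limits.h>

/* Layout from the proposal: fd is a short. */
struct fd_change {
    short fd;
    short events;
};

/* Hypothetical widened layout, using int for fd as struct pollfd does. */
struct fd_change_wide {
    int   fd;
    short events;
};

/* Returns 1 if fd can be stored in the narrow layout without truncation. */
int fits_narrow(int fd)
{
    return fd >= 0 && fd <= SHRT_MAX;
}

/* Build a narrow entry; the caller must have checked fits_narrow(fd). */
struct fd_change make_narrow(int fd, short events)
{
    struct fd_change fc;

    assert(fits_narrow(fd));
    fc.fd = (short)fd;
    fc.events = events;
    return fc;
}
```

On a platform with a 16-bit short, fits_narrow(40000) is false, so descriptor 40000 could not be named in a change list at all.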
>
> int
> new_poll(
>     int nchanges,                  // entries in new changelist
>     struct fd_change *changelist,  // changes to be made
>     int n_events,                  // max size of output list
>     struct fd_change *event,       // returned list of events
>     int timeout                    // timeout (same as poll)
> );
>
> Where the returned value is either an error, or the number of events
> stored in the returned event list.
>
> Some pseudo-code that would exercise the interface:
>
> struct fd_change fc[ MAXCHANGE ];
>
> fc[0].fd = 20;
> fc[0].events = ADD | READ ; // add, mark read "interest"
>
> fc[1].fd = -1; // ignore this one
>
> fc[2].fd = 32;
> fc[2].events = DELETE ; // delete previous fd
>
> fc[3].fd = 46;
> fc[3].events = WRITE ; // ask for writable events
>
> n_events = new_poll(4, fc, MAXCHANGE, fc, -1); // number of events returned, or an error
>
>
> Comments? Note that I haven't discussed the implementation details;
> the implementation is done, and can probably be altered/improved,
> but I would like to solicit feedback on the feasibility of the interface.
As I said before, I'm uneasy about the kernel tracking the state (list of
fds) in the process. A separate kernel object would be a much cleaner
solution and would be usable by a program which called poll in many
different ways.
With this api, a library would be unable to use the new interface since it
would not know the new_poll state set up by the main program and would not
be able to change it without potentially breaking the caller's state.
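The conflict can be sketched in user space. This is only a model under stated assumptions: `proc_interest`, `proc_set`, `proc_get`, the two caller functions and the `EV_READ`/`EV_WRITE` values are all invented here, standing in for one interest list shared by the whole process.

```c
#include <assert.h>

#define MAXFD 64    /* arbitrary size for the sketch */

/* Placeholder interest bits; the proposal does not give real values. */
enum { EV_READ = 1, EV_WRITE = 2 };

/* Models a single interest set shared by the whole process. */
static short proc_interest[MAXFD];

/* Set (or, with events == 0, clear) interest in fd. Last writer wins. */
void proc_set(int fd, short events)
{
    assert(fd >= 0 && fd < MAXFD);
    proc_interest[fd] = events;
}

/* Current interest mask for fd; 0 means fd is not being watched. */
short proc_get(int fd)
{
    assert(fd >= 0 && fd < MAXFD);
    return proc_interest[fd];
}

/* The main program registers read interest in descriptor 20. */
void main_program_setup(void)
{
    proc_set(20, EV_READ);
}

/* A library handed the same descriptor waits for writability,
 * then removes its entry, unaware that the caller had one too. */
void library_call(void)
{
    proc_set(20, EV_WRITE);
    proc_set(20, 0);
}
```

After main_program_setup() followed by library_call(), proc_get(20) is 0: the main program's read interest is gone and nothing told it so.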
--
Doug Rabson Mail: dfr@nlsystems.com
Nonlinear Systems Ltd. Phone: +44 181 442 9037
