From owner-freebsd-arch  Thu Apr  6 20:41:15 2000
Delivered-To: freebsd-arch@freebsd.org
Received: from ns1.yes.no (ns1.yes.no [195.204.136.10])
	by hub.freebsd.org (Postfix) with ESMTP id 0E2C437BA53
	for <freebsd-arch@freebsd.org>; Thu,  6 Apr 2000 20:41:06 -0700 (PDT)
	(envelope-from eivind@bitbox.follo.net)
Received: from bitbox.follo.net (bitbox.follo.net [195.204.143.218])
	by ns1.yes.no (8.9.3/8.9.3) with ESMTP id FAA16196
	for <freebsd-arch@freebsd.org>; Fri, 7 Apr 2000 05:44:36 +0200 (CEST)
Received: (from eivind@localhost)
	by bitbox.follo.net (8.8.8/8.8.6) id FAA35072
	for freebsd-arch@freebsd.org; Fri, 7 Apr 2000 05:41:00 +0200 (CEST)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by hub.freebsd.org (Postfix) with ESMTP id 3375637BD8C
	for <freebsd-arch@FreeBSD.ORG>; Thu,  6 Apr 2000 20:40:24 -0700 (PDT)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.9.3/8.9.1) id UAA93335;
	Thu, 6 Apr 2000 20:40:19 -0700 (PDT)
	(envelope-from dillon)
Date: Thu, 6 Apr 2000 20:40:19 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200004070340.UAA93335@apollo.backplane.com>
To: Jonathan Lemon <jlemon@flugsvamp.com>
Cc: Archie Cobbs <archie@whistle.com>,
	Jonathan Lemon <jlemon@flugsvamp.com>, freebsd-arch@freebsd.org
Subject: Re: RFC: kqueue API and rough code
References: <200004070107.SAA97591@bubba.whistle.com> <200004070220.TAA92896@apollo.backplane.com> <20000406220454.J80578@prism.flugsvamp.com>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:> 		short	flags;
:> 		long	data;
:> 		union {
:> 		    int idata;
:> 		    void *pdata;
:> 		} udata;
:> 		void 	*reserved2[2];
:> 	};
:
:Um.  One problem with this: the structure just went from 12 bytes
:(on an x86) to 32 bytes.  I'm trying to keep this as small as possible,
:since I really want to minimize both the overhead of copying data
:back and forth, and the amount of space used in the kernel.   I can
:see adding one more field (to 16 bytes), but doubling the size seems
:to be just too much.

    While this is relevant, not having important features just to keep 
    the structure small is only going to hurt the system call's functionality
    and scalability.

    But if you are really that worried, then the most important field you
    can add is 'udata'.

:This essentially moves the state into the kernel.  My feeling is that
:it's not the kernel's job to track the data; if the user wants to 
:associate more state with the identifier, then it can maintain that
:in user space.  The (ident/filter) should be enough to identify the 
:event.

    Udata is *extremely* important.  In order to associate user data with
    an event in user space the user program must associate the data with the
    file descriptor.  Since you can have multiple events associated with
    a single descriptor (e.g. read and write), this complicates matters even
    more.  Worse, by NOT having udata or dispatch capabilities you 
    essentially require that all event handling go through a single user-level
    mechanism, which is very similar to what must be done with select() now.
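
    To make the cost concrete, here is a minimal sketch of the bookkeeping
    a program needs when events carry only (ident, filter) and no udata.
    Every name in it (state_table, state_register, state_lookup, MAX_FD,
    the SLOT_* constants) is hypothetical and invented for illustration;
    none of it comes from the proposed API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FD 64                       /* hypothetical table size */
enum { SLOT_READ = 0, SLOT_WRITE = 1, SLOT_COUNT = 2 };

/* One central table that every module in the program has to share. */
static void *state_table[MAX_FD][SLOT_COUNT];

/* Claim the (fd, slot) pair for some per-event state; -1 on failure. */
static int
state_register(int fd, int slot, void *state)
{
    assert(state != NULL);
    if (fd < 0 || fd >= MAX_FD || slot < 0 || slot >= SLOT_COUNT)
        return -1;
    if (state_table[fd][slot] != NULL)
        return -1;                      /* already claimed by someone else */
    state_table[fd][slot] = state;
    return 0;
}

/* Map a returned (fd, slot) pair back to its state, or NULL if none. */
static void *
state_lookup(int fd, int slot)
{
    if (fd < 0 || fd >= MAX_FD || slot < 0 || slot >= SLOT_COUNT)
        return NULL;
    return state_table[fd][slot];
}
```

    Read and write on the same descriptor need separate slots, and any
    library that wants its own events has to agree on this one table.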

    With udata, especially when combined with a function dispatch of some
    sort (either event-to-thread or a function-dispatch field which is to
    be called on an event), various libraries in the program can all use the
    kernel queue system calls in their own way without interfering with each
    other.

    In fact, I would go as far as to say that a function dispatch field should
    be included in the event data structure as well.  It's even more important
    than thread dispatch fields.

    It is VERY important that you be able to do this if you want to allow
    third party libraries to use the kernel event queueing mechanism
    without interfering with your own use of the mechanism.
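
    As a rough user-space illustration of the idea (not the actual kqueue
    structure), suppose each event record carried a dispatch function and
    a udata pointer.  The struct ev layout, the EV_FILT_* constants, and
    the two "libraries" below are all assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum { EV_FILT_READ = 1, EV_FILT_WRITE = 2 };   /* hypothetical filters */

/* Hypothetical event record with a dispatch field and opaque udata. */
struct ev {
    int   ident;                        /* e.g. a file descriptor */
    short filter;                       /* which condition fired */
    void (*dispatch)(struct ev *);      /* function-dispatch field */
    void *udata;                        /* never interpreted by the queue */
};

/* "Library A": counts bytes it pretends to have read. */
struct liba_conn { int bytes_read; };
static void
liba_on_read(struct ev *e)
{
    struct liba_conn *c = e->udata;
    c->bytes_read += 100;
}

/* "Library B": counts completed writes. */
struct libb_job { int writes_done; };
static void
libb_on_write(struct ev *e)
{
    struct libb_job *j = e->udata;
    j->writes_done++;
}

/* Generic delivery: knows nothing about either library's types. */
static int
deliver(struct ev *evs, size_t n)
{
    int delivered = 0;
    for (size_t i = 0; i < n; i++) {
        if (evs[i].dispatch != NULL) {
            evs[i].dispatch(&evs[i]);
            delivered++;
        }
    }
    return delivered;
}

/* Two libraries share descriptor 3 without sharing any table. */
static int
demo_two_libraries(void)
{
    struct liba_conn a = { 0 };
    struct libb_job b = { 0 };
    struct ev pending[] = {
        { 3, EV_FILT_READ,  liba_on_read,  &a },
        { 3, EV_FILT_WRITE, libb_on_write, &b },
        { 3, EV_FILT_READ,  liba_on_read,  &a },
    };
    int n = deliver(pending, sizeof(pending) / sizeof(pending[0]));
    assert(n == 3);
    return a.bytes_read + b.writes_done;
}
```

    The delivery loop needs no knowledge of who registered what; each
    record carries everything required to route it.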

:> 	thread:	(future) Set thread to return event on, 0 to return the event
:> 		on any thread (or the main thread), -1 to create a new thread
:> 		on the fly to handle the event.
:
:The event is returned in the context of the calling thread; that is,
:the thread itself is what does the dequeuing, so I'm not sure how this
:would be useful.  Could you explain how this would work?
:...
:Jonathan

    Consider a program that is linked against three or four third party
    libraries... for example, consider a program linked against the X11
    libraries.

    Now let's say that the X11 libraries are multi-threaded and want to use
    the kernel queue mechanism to handle events asynchronously in the
    user space of the program using the library.

    It is not possible for the X11 libraries to do this if they happen to use
    a different dispatch mechanism than your main loop uses.  In fact, if you
    look at how X (and other non-embedded subsystems) are organized, it is
    precisely this problem that leads to an inability to scale their 
    interfaces.  i.e. you can't use your neat cool event mechanism if you 
    have to call another library's code that serves as the main loop for your
    program rather than you being able to serve as the main loop for your
    program.

    How much thread programming or embedded work have you ever done?  This
    sort of stuff is bread and butter to those of us that do that.

    Here is an example of what a kernel-managed dispatch system with
    dispatch and udata could do:

    /*
     * Module implements an asynchronous write with timeout.
     */
    void
    somemodule_messing_with_some_tcp_connection(struct manage *m)
    {
	/* register one handler per event type; 'm' is passed back as-is */
	setdispatch(m->fd, m, module_write, priority, EVENT_WRITE);
	setdispatch(m->fd, m, module_read, priority, EVENT_READ);
	/* returns here */
    }

    void
    module_write(int fd, struct manage *m, int r)
    {
	/*
	 * issues write and either clears the dispatch function
	 * or allows it to remain, depending.
	 */
    }

    void
    module_read(int fd, struct manage *m, int r)
    {
	/* issues read, deals with read data. */
    }

    Notice a couple of things?  Like for example the above code is 
    completely independent of any other module or library.  It does not
    require a central loop to read events nor does it have to use the
    same support library that some other library might use to implement
    event handling.  But only if the event dispatch and user data
    ('m' in this case, which the kernel does not interpret but simply
    passes to the event handler) are directly supported by the kernel.
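
    For readers who want something they can compile, below is a small
    user-space simulation of the example above.  It is only a sketch:
    setdispatch() here is a plain table update, fire_event() stands in
    for the kernel noticing readiness, and the handler signature, the
    slots table, MAX_FD, and the fields of struct manage are all
    assumptions, not part of any real interface.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FD 32                       /* hypothetical table size */
enum { EVENT_READ = 0, EVENT_WRITE = 1, EVENT_KINDS = 2 };

typedef void (*dispatch_fn)(int fd, void *udata, int r);

struct slot { dispatch_fn fn; void *udata; int priority; };
static struct slot slots[MAX_FD][EVENT_KINDS];

/* Register (or, with fn == NULL, clear) the handler for one event. */
static int
setdispatch(int fd, void *udata, dispatch_fn fn, int priority, int event)
{
    if (fd < 0 || fd >= MAX_FD || event < 0 || event >= EVENT_KINDS)
        return -1;
    slots[fd][event].fn = fn;
    slots[fd][event].udata = udata;
    slots[fd][event].priority = priority;
    return 0;
}

/* Stand-in for the kernel: call the handler, hand back udata untouched. */
static int
fire_event(int fd, int event, int r)
{
    assert(fd >= 0 && fd < MAX_FD && event >= 0 && event < EVENT_KINDS);
    struct slot s = slots[fd][event];   /* copy: handler may re-register */
    if (s.fn == NULL)
        return 0;
    s.fn(fd, s.udata, r);
    return 1;
}

struct manage { int fd; int reads; int writes; };

static void
module_read(int fd, void *udata, int r)
{
    struct manage *m = udata;
    (void)fd; (void)r;
    m->reads++;                         /* "deals with read data" */
}

static void
module_write(int fd, void *udata, int r)
{
    struct manage *m = udata;
    (void)r;
    m->writes++;
    setdispatch(fd, NULL, NULL, 0, EVENT_WRITE);   /* one-shot: clear it */
}

/* Registers both handlers, then simulates two reads and two writes. */
static int
demo_dispatch(void)
{
    struct manage m = { 7, 0, 0 };
    setdispatch(m.fd, &m, module_write, 0, EVENT_WRITE);
    setdispatch(m.fd, &m, module_read, 0, EVENT_READ);
    fire_event(7, EVENT_READ, 0);
    fire_event(7, EVENT_READ, 0);
    fire_event(7, EVENT_WRITE, 0);
    fire_event(7, EVENT_WRITE, 0);      /* handler was cleared: no call */
    setdispatch(m.fd, NULL, NULL, 0, EVENT_READ);  /* m goes out of scope */
    return m.reads * 10 + m.writes;
}
```

    The second write event finds no handler because module_write cleared
    its own registration, which mirrors the "either clears the dispatch
    function or allows it to remain" behaviour described above.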


					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

