Date:      Thu, 16 Jan 2003 11:01:43 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Robert Watson <rwatson@FreeBSD.ORG>
Cc:        Peter Wemm <peter@wemm.org>, Terry Lambert <tlambert2@mindspring.com>, "Alan L. Cox" <alc@imimic.com>, arch@FreeBSD.ORG
Subject:   Re: getsysfd() patch #1 (Re: Virtual memory question) 
Message-ID:  <200301161901.h0GJ1htn023581@apollo.backplane.com>
References:   <Pine.NEB.3.96L.1030116100226.59693C-100000@fledge.watson.org>

    Hmm.  Well, as an owner of one of the original NeXT boxes I am quite
    familiar with the Mach auxiliary data mechanism.  It has rather serious
    issues, not the least of which is that it is difficult (or impossible)
    to cache mappings to make the mechanism efficient.  This is because
    the userland and kernel do not agree on the mapping prior to the message
    being sent.  This has not changed since the NeXT days, and forcing the
    kernel to repeatedly reinterpret and remap userland pointers on a
    per-message basis is a major problem.

    To do it right we would need a way to extend the interface to support
    pre-registered data areas.  For example, instead of constructing a
    random mach message and calling mach_msg() on it what we really need
    is a system call to allocate a mach_msg() which can then be managed
    both in kernel and user space, removing the mapping overhead for the
    message header (mach_msg_header_t) when mach_msg() is called. 
    Similarly, registering send and receive data areas would be needed to
    solve this endemic problem with Mach.  This would result in an order
    of magnitude faster processing of the Mach message.

    Using the mach port primitive is not a bad idea, though.
    Mach ports are very similar to Amiga message ports and messages and
    I really liked the Amiga mechanism.  I think I could implement the
    Mach port primitives quite easily (at least the core support for it),
    and it would certainly apply to the SYSFD_TIMER and SYSFD_MSGQ brainstorm.
    I'm not sure it applies to SYSFD_MEMORY, however, because you still need
    a handle (file descriptor) and you still need the flexibility to mmap()
    parts of the VM object however and wherever you want, and the mach
    messaging interface is not suited to that at all.  The mach messaging
    interface is designed for discrete data sets.

    So we still have the problem of allocating the file descriptor to
    represent the VM object for SYSFD_MEMORY.  Offhand I do not recall
    a Mach equivalent for something like that.  We could do it with a
    system call (a la getsysfd()), or we could do it with a device.
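
    As a rough illustration of a descriptor that stands for a memory
    object and can be mapped in parts, here is a minimal sketch.  It does
    not use getsysfd(), which is only a proposal in this thread; it
    substitutes an unlinked temporary file as the backing object.  The
    names memobj_open() and memobj_map() are hypothetical, and /tmp is
    assumed to be writable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * memobj_open(): hypothetical stand-in for the proposed
 * getsysfd(SYSFD_MEMORY, ...).  Returns a descriptor for `len` bytes of
 * zero-filled backing store that has no name in the filesystem, or -1.
 */
int
memobj_open(off_t len)
{
    char path[] = "/tmp/memobj.XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* keep only the descriptor as the handle */
    if (ftruncate(fd, len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * memobj_map(): map `len` bytes of the object starting at the
 * page-aligned offset `off`.  Returns NULL on failure.
 */
void *
memobj_map(int fd, off_t off, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, off);

    return (p == MAP_FAILED) ? NULL : p;
}
```

    Mapping the whole object and, separately, only its second page gives
    two views of the same memory: a write through one view is visible
    through the other, which is the flexibility a discrete message
    interface does not offer.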

    I have to say that I don't see how using /dev/zero is any more portable
    than creating a new system call.  Extending an existing mechanism does
    not in any way make an implementation more portable.  In fact, in my
    view, extending an existing mechanism far beyond its original intention
    can result in more confusion and more difficulty because there may not
    be any clear way to determine whether the new API is actually supported
    or not by the target architecture.  /dev/zero is seriously overused and
    that creates a hassle for anyone trying to use its extended mechanisms
    in a portable program.  The Diablo news system I wrote has four different
    ways of managing shared memory and mapped memory due to all the weird
    extensions different operating systems have made to mmap() and
    /dev/zero, and it wasn't fun making it all work.
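
    To give a flavor of the kind of fallback a portable program ends up
    carrying, here is a minimal sketch covering just two of the variants:
    an anonymous shared mapping where MAP_ANON is available, and a shared
    mapping of /dev/zero otherwise.  This is not Diablo's code;
    alloc_shared() is a hypothetical helper, and whether a MAP_SHARED
    mapping of /dev/zero is really shared across fork() is itself one of
    the system-dependent behaviors in question.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Some systems spell the flag MAP_ANONYMOUS; others lack it entirely. */
#if !defined(MAP_ANON) && defined(MAP_ANONYMOUS)
#define MAP_ANON MAP_ANONYMOUS
#endif

/*
 * alloc_shared(): hypothetical helper.  Returns `len` bytes of
 * zero-filled memory mapped MAP_SHARED, or NULL on failure.
 */
void *
alloc_shared(size_t len)
{
    void *p = MAP_FAILED;

#ifdef MAP_ANON
    /* Preferred path: no descriptor needed at all. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_SHARED, -1, 0);
#endif
    if (p == MAP_FAILED) {
        /* Fallback path: lean on the /dev/zero extension. */
        int fd = open("/dev/zero", O_RDWR);

        if (fd < 0)
            return NULL;
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);              /* the mapping outlives the descriptor */
    }
    return (p == MAP_FAILED) ? NULL : p;
}
```

    Note that nothing in the call itself tells the program which path the
    target system actually supports; it has to try one and fall back,
    which is the difficulty described above.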

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

:...
:So I'm not saying a new API would be the wrong thing to do, I just want us
:to explore the options and see which has the lowest impact vs biggest
:bang.  One concern I have with introducing entirely new primitives is how
:to fit them into the MAC Framework (i.e., are there new objects that
:require labels that didn't have labels before, how to document and
:instrument the important operations).  Another concern is application
:portability -- we've actually had a lot of luck with other OS's picking up
:kqueue(), but IPC is likely to be more controversial, especially if it
:overlaps existing functionality provided by other IPC primitives.
:
:And, I'd like to avoid any further System V IPC debacles, where the
:semantics are such a poor match for UNIX that it's almost impossible to do
:useful security things with them.  :-) 
:
:Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
:robert@fledge.watson.org      Network Associates Laboratories

