From: Matthew Dillon
Date: Thu, 16 Jan 2003 11:01:43 -0800 (PST)
To: Robert Watson
Cc: Peter Wemm, Terry Lambert, "Alan L. Cox", arch@FreeBSD.ORG
Subject: Re: getsysfd() patch #1 (Re: Virtual memory question)
Message-Id: <200301161901.h0GJ1htn023581@apollo.backplane.com>

Hmm. Well, as an owner of one of the original NeXT boxes I am quite familiar with the Mach auxiliary data mechanism. It has rather serious issues, not the least of which is that it is difficult (or impossible) to cache mappings to make the mechanism efficient. This is because userland and the kernel do not agree on the mapping before the message is sent. This has not changed since the NeXT days, and forcing the kernel to repeatedly reinterpret and remap userland pointers on a per-message basis is a major problem. To do it right we would need a way to extend the interface to support pre-registered data areas.
For example, instead of constructing a random Mach message and calling mach_msg() on it, what we really need is a system call to allocate a Mach message which can then be managed in both kernel and user space, removing the mapping overhead for the message header (mach_msg_header_t) when mach_msg() is called. Similarly, registering send and receive data areas would be needed to solve this endemic problem with Mach. This would result in an order of magnitude faster processing of the Mach message.

The idea of using the Mach port primitive is not a bad one, though. Mach ports are very similar to Amiga message ports and messages, and I really liked the Amiga mechanism. I think I could implement the Mach port primitives quite easily (at least the core support for them), and they would certainly apply to the SYSFD_TIMER and SYSFD_MSGQ brainstorm. I'm not sure they apply to SYSFD_MEMORY, however, because you still need a handle (file descriptor) and you still need the flexibility to mmap() portions of the VM object however and wherever you want, and the Mach messaging interface is not suited to that at all. The Mach messaging interface is designed for discrete data sets.

So we still have the problem of allocating the file descriptor to represent the VM object for SYSFD_MEMORY. Offhand I do not recall a Mach equivalent for something like that. We could do it with a system call (a la getsysfd()), or we could do it with a device.

I have to say that I don't see how using /dev/zero is any more portable than creating a new system call. Extending an existing mechanism does not in any way make an implementation more portable. In fact, in my view, extending an existing mechanism far beyond its original intention can result in more confusion and more difficulty, because there may not be any clear way to determine whether the new API is actually supported by the target architecture.
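
The portability hazard described above can be sketched in C. This is a minimal illustration only, not code from any existing program: shared_alloc() and shared_roundtrip() are hypothetical names, and the sketch assumes a POSIX system with fork() and mmap(). It tries an anonymous shared mapping where MAP_ANON is defined and falls back to the /dev/zero extension otherwise, which is roughly the kind of per-platform branching a portable program ends up carrying.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: return len bytes of zero-filled memory that
 * remains shared across fork(), or NULL on failure.
 */
void *
shared_alloc(size_t len)
{
    void *p = MAP_FAILED;

#if defined(MAP_ANON)
    /* Method 1: anonymous shared mapping, no descriptor involved. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_ANON | MAP_SHARED, -1, 0);
#endif
    if (p == MAP_FAILED) {
        /* Method 2: the /dev/zero extension (not supported everywhere). */
        int fd = open("/dev/zero", O_RDWR);

        if (fd < 0)
            return NULL;
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);
    }
    return (p == MAP_FAILED) ? NULL : p;
}

/*
 * Hypothetical demo: fork a child that stores `value` in the shared
 * area, then return what the parent reads back (-1 on error).
 */
int
shared_roundtrip(int value)
{
    volatile int *slot = shared_alloc(sizeof(*slot));
    pid_t pid;
    int status, seen;

    if (slot == NULL)
        return -1;
    *slot = 0;

    pid = fork();
    if (pid == 0) {
        *slot = value;          /* child writes through the mapping */
        _exit(0);
    }
    if (pid < 0 || waitpid(pid, &status, 0) < 0) {
        munmap((void *)slot, sizeof(*slot));
        return -1;
    }
    seen = *slot;               /* parent sees the child's store */
    munmap((void *)slot, sizeof(*slot));
    return seen;
}
```

Note that neither branch tells the caller in advance whether the platform really supports shared semantics for that mapping; the program only finds out by trying, which is the difficulty with overloading an existing mechanism.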

/dev/zero is seriously overused and that creates a hassle for anyone trying to use its extended mechanisms in a portable program. The Diablo news system I wrote has four different ways of managing shared memory and mapped memory due to all the weird extensions different operating systems have done with mmap() and /dev/zero, and it wasn't fun making it all work.

					-Matt
					Matthew Dillon

:...
:So I'm not saying a new API would be the wrong thing to do, I just want us
:to explore the options and see which has the lowest impact vs biggest
:bang.  One concern I have with introducing entirely new primitives is how
:to fit them into the MAC Framework (i.e., are there new objects that
:require labels that didn't have labels before, how to document and
:instrument the important operations).  Another concern is application
:portability -- we've actually had a lot of luck with other OS's picking up
:kqueue(), but IPC is likely to be more controversial, especially if it
:overlaps existing functionality provided by other IPC primitives.
:
:And, I'd like to avoid any further System V IPC debacles, where the
:semantics are such a poor match for UNIX that it's almost impossible to do
:useful security things with them. :-)
:
:Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
:robert@fledge.watson.org      Network Associates Laboratories