From: Matthew Dillon
Date: Sun, 27 Jun 2004 23:53:56 -0700 (PDT)
Message-Id: <200406280653.i5S6rufW076565@apollo.backplane.com>
To: Robert Watson
cc: freebsd-current@freebsd.org
cc: Cordula's Web
cc: alex@hightemplar.com
Subject: Re: HEADSUP: ibcs2 and svr4 compat headed for history

Because of a desire to maintain or create compatibility with other operating systems (remaining compatible with FreeBSD-4, possibly adding FreeBSD-5 compatibility, Linux, and of course other architectures that might be used far less), I have for the last year been thinking very carefully about the compatibility code we have in the kernel. The problem that I see is not so much that the compatibility code exists, but that it exists in the kernel. I believe that the solution is to move it to userland and thus unburden the kernel from having to deal with it.
In userland it (A) can be maintained more easily, (B) avoids the security issues involved with it being in the kernel, and (C) is far more portable.

I fully intend to undertake this project for DragonFly, especially because as we move to a messaged syscall interface we need to maintain compatibility with the non-messaged interface, and I want that to be a function set that runs in userland. That is, for DragonFly, when someone calls the 'native' read(), it wouldn't actually be a libc function but would instead be an intermediate user-level function vector whose code space is managed by the kernel, almost like an mmap()'d library (or exactly like an mmap()'d library, but with a vector table).

It would be great if we could come up with a joint methodology, because once such an abstraction is operational all the compatibility code that falls under it, being userland code, would be highly portable to any operating system running the abstraction.

I would recommend that instead of ripping this stuff out of FreeBSD-5 willy nilly, we leave it in for now and spend our energies on the development of an intermediate compatibility layer, abstraction, and API. The actual kernel work required to implement such a layer is not all that complex -- really all the kernel has to do is take an INT 0xN and throw it back in userland's face (or even just make the INT 0xN vector an LDT vector that runs in userland's protection ring and never even enters the kernel).

With regard to where these functions would reside... well, I was thinking that we would reserve a chunk of VM either just below the kernel start, or just above the kernel start, which would contain the intermediate layer. The actual address is almost irrelevant because the entry mechanism is, of course, the system call entry mechanism being emulated. It would be pure read-only code, with no writable data other than the stack, whose purpose is simply to translate system calls into the 'native' form.
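
To make the translation idea above a bit more concrete, here is a minimal sketch in C of what such a userland layer might look like. Everything in it is an assumption made for illustration: the frame layout, the foreign syscall numbers, and the names compat_frame, compat_vector and compat_dispatch are hypothetical and not taken from any real ABI or from DragonFly. It only shows a read-only vector table that turns a trapped 'foreign' call into the native one, using getpid() and write() as the native side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical frame: what the reflected INT 0xN would hand to userland. */
struct compat_frame {
    long     nr;       /* foreign syscall number */
    intptr_t arg[3];   /* foreign arguments, not yet translated */
};

typedef long (*compat_fn)(const struct compat_frame *);

/* Hypothetical foreign syscall numbers, invented for this sketch. */
enum { FOREIGN_GETPID = 1, FOREIGN_WRITE = 2, FOREIGN_NSYS = 3 };

/* Each translator converts the foreign arguments into the native call. */
static long xlate_getpid(const struct compat_frame *f)
{
    (void)f;                       /* the foreign call takes no arguments */
    return (long)getpid();
}

static long xlate_write(const struct compat_frame *f)
{
    return (long)write((int)f->arg[0], (const void *)f->arg[1],
                       (size_t)f->arg[2]);
}

/* Read-only vector table: the layer keeps no writable data of its own. */
static const compat_fn compat_vector[FOREIGN_NSYS] = {
    [FOREIGN_GETPID] = xlate_getpid,
    [FOREIGN_WRITE]  = xlate_write,
};

/* Entry point the kernel would bounce the trapped call to. */
long compat_dispatch(const struct compat_frame *f)
{
    assert(f != NULL);
    if (f->nr < 0 || f->nr >= FOREIGN_NSYS || compat_vector[f->nr] == NULL)
        return -1;                 /* unknown foreign syscall */
    return compat_vector[f->nr](f);
}
```

A caller would fill in a compat_frame with the foreign number and arguments and pass it to compat_dispatch(). In the real design that frame would be built from the trap state rather than by hand, and error reporting would have to follow the emulated ABI's conventions rather than a bare -1.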
Another aspect of this abstraction is that it would be possible to change the kernel's own native entry interface, argument format, and so forth, and yet still maintain compatibility with 'older' userland programs by having an intermediate layer that glues userland programs targeted to version X of the kernel to version Y of the kernel which is actually running. (This is why DFly needs it.)

One would also be able to abstract out optimizations, such as providing non-ring-crossing timestamp functions that utilize memory mapped I/O or other things... these types of functions would be placed in the proposed intermediate (run in user mode) layer.

The intermediate layer would also have a direct access mechanism. That is, userland programs which are aware of the layer could query to get a vector base and call through a vector array into the layer directly. The intermediate layer would then optimize those calls that do not require entry into the kernel and pass the rest on to the kernel. The userland program would not know the difference, which is the whole point of the exercise.

So, as you can see, there is great potential flexibility in such a design. So much so, in fact, that the ability to move things like SysV and IBCS2 out of the kernel becomes a mere side effect of a larger purpose. It would be a huge advance over the crufty syscall methodology that all UNIXes today employ.

-Matt
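
The direct access mechanism described above could be sketched roughly as follows, again in C and again purely as an illustration. The slot numbers and the names layer_query_base, layer_call and fake_kernel_tick are hypothetical, and the 'shared page' is just an ordinary variable standing in for memory the kernel would map and keep up to date. The point is only that the caller goes through the same vector array whether or not the call ends up entering the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical vector slots exported by the layer. */
enum { LAYER_GETPID = 0, LAYER_TIME = 1, LAYER_NVEC = 2 };

typedef long (*layer_fn)(void);

/* Stand-in for a page the kernel would map read-only and keep current. */
static long shared_time_page;

/* Simulates the kernel refreshing the mapped timestamp. */
void fake_kernel_tick(void)
{
    shared_time_page = (long)time(NULL);
}

/* Served entirely in user mode: just reads the mapped value. */
static long layer_time(void)
{
    return shared_time_page;
}

/* Needs the kernel, so it is passed straight through. */
static long layer_getpid(void)
{
    return (long)getpid();
}

static const layer_fn layer_vectors[LAYER_NVEC] = {
    [LAYER_GETPID] = layer_getpid,
    [LAYER_TIME]   = layer_time,
};

/* What a layer-aware program would query once at startup. */
const layer_fn *layer_query_base(void)
{
    return layer_vectors;
}

/* Convenience wrapper: call through the vector array by slot number. */
long layer_call(int slot)
{
    const layer_fn *base = layer_query_base();

    assert(slot >= 0 && slot < LAYER_NVEC);
    return base[slot]();
}
```

A layer-aware program would call layer_query_base() once and then index the returned array, for example base[LAYER_TIME](), without knowing which slots cross into the kernel. How the base address would actually be discovered is left open here, since the post does not specify it.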