Date: Sun, 18 Aug 2002 07:19:07 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Ian Dowse <iedowse@maths.tcd.ie>
Cc: arch@FreeBSD.ORG
Subject: Re: Solving the stack gap issue
Message-ID: <20020818055951.N12475-100000@gamplex.bde.org>
In-Reply-To: <200208171918.aa72556@salmon.maths.tcd.ie>

On Sat, 17 Aug 2002, Ian Dowse wrote:

> Many emulated Linux system calls use the stack gap to store paths
> and structures that need to be converted before calling the native
> system call. This has the well-known problem that shared address
> space threads can corrupt each other's stack gap data if they perform
> system calls concurrently. Especially on SMP boxes this makes many
> Linux applications unusable on FreeBSD.

It also has the not-so-well-known problem that stack gaps don't even
work well for their main purpose of avoiding lots of code having to know
that the args are in a special place. It just moves the problem from
lots of general code having to know this to lots of compat code having
to know this (so it actually increases the problem if there is enough
compat code). Some compat code doesn't know this very well and causes
panics by accessing the stack gap directly. Non-broken code would
require lots more copyins and copyouts to avoid direct accesses:

    copyin input-args from user space
    translate input-args
    copyout input-args to stack gap
    call BSD syscall
    copyin input-args from stack gap
    reverse-translate input-args
    copyout input-args to user space

> A few approaches have been suggested:
> - Lock access to the stack gap, so that only one thread at a time
>   can use it.
> - Use a different address region for each thread.
> - Avoid the need for the stack gap by providing kernel-callable versions
>   of all syscalls.
> ...
> I have attempted to implement the third approach. It requires more

Seems best.

> extensive changes than the others, but it has the advantage of
> aiming to remove the stack gap hack instead of just adding another
> band-aid to it. That said, it does add some overhead to normal system
> calls, so it may be that some ugliness is necessary to balance this
> tradeoff.

I'd like normal calls to have a fast path. We're already 1 or 2 layers
slower than Linux.
(Linux on i386's does something like "pushal; call syscalltable(,%eax,4)"
for the fast path, so it goes directly from the lowest layer to
sys_foo(), but FreeBSD calls syscall() from the lowest label and
syscall() does lots of relatively slow things.)

> The basic change is that many system calls now have a version that
> is called with the FreeBSD ABI syscall arguments, e.g.
>
>     int
>     open(struct thread *td, struct open_args *uap)
>     {
>             return sys_open(td, SCARG(uap, path), UIO_USERSPACE,
>                 SCARG(uap, flags), SCARG(uap, mode));
>     }

I would prefer this to be named something like xxx_open() and in a
translation layer between Xsyscall() and open(), with the translation
layer as null as possible.

> and a version that contains all of the code for implementing open(2)
> and is internal to the kernel, e.g.:
>
>     int sys_open(struct thread *td, char *path, enum uio_seg pathseg,
>         int flags, int mode);

I would prefer this to be named open() and take the same args as
open(2). Passing around td args seems to just lead to pessimizations
and bugs, since syscalls especially almost require td == curthread to
work...

> In this case it is expected that we may need support for both
> userspace and kernel-space paths, and since a path is a simple
> string, it makes sense for the caller to specify the address space
> from which the path should be read. For other functions, it seems
> more appropriate to have the wrapper do the copyin() itself.

I think copyin() would be simpler for pathnames too (but a
pessimization unless it were done for all syscalls that take pathnames
and not done by namei()). Reconsider this later.

Bruce
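
[Editor's sketch, not part of the original message.] The split discussed
in this thread, a thin ABI wrapper in front of a kernel-callable
implementation that is told which address space the path lives in, can be
illustrated with a small userspace simulation. Every demo_* name below is
hypothetical, the "/compat/linux" prefix merely stands in for whatever
translation a compat layer performs, and demo_copyinstr() only imitates
the role of the kernel's copyinstr(). This is a sketch under those
assumptions, not the patch under discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_PATHLEN 64

/* Hypothetical stand-in for the kernel's enum uio_seg. */
enum demo_seg { DEMO_USERSPACE, DEMO_SYSSPACE };

/* Hypothetical stand-in for struct open_args (the ABI argument block). */
struct demo_open_args {
	const char *path;
	int flags;
	int mode;
};

/* Records the path the "kernel" finally acted on, so callers can inspect it. */
static char demo_last_path[DEMO_PATHLEN];

/* Simulated copyinstr(): the only route from "user" memory to a kernel buffer. */
static int
demo_copyinstr(const char *uaddr, char *kaddr, size_t len)
{
	size_t n = strlen(uaddr) + 1;

	if (n > len)
		return (ENAMETOOLONG);
	memcpy(kaddr, uaddr, n);
	return (0);
}

/* Kernel-callable implementation: the caller states where the path lives. */
static int
demo_kern_open(const char *path, enum demo_seg pathseg, int flags, int mode)
{
	char kpath[DEMO_PATHLEN];
	int error;

	assert(path != NULL);
	(void)flags;	/* unused in this simulation */
	(void)mode;	/* unused in this simulation */
	if (pathseg == DEMO_USERSPACE) {
		error = demo_copyinstr(path, kpath, sizeof(kpath));
		if (error != 0)
			return (error);
	} else {
		if (strlen(path) >= sizeof(kpath))
			return (ENAMETOOLONG);
		strcpy(kpath, path);
	}
	strcpy(demo_last_path, kpath);
	return (0);
}

/* Native ABI wrapper: as close to a null translation layer as possible. */
static int
demo_native_open(struct demo_open_args *uap)
{
	return (demo_kern_open(uap->path, DEMO_USERSPACE, uap->flags,
	    uap->mode));
}

/* Compat wrapper: translates the path in a kernel buffer, with no stack gap. */
static int
demo_compat_open(struct demo_open_args *uap)
{
	char in[DEMO_PATHLEN], kpath[DEMO_PATHLEN];
	int error, n;

	error = demo_copyinstr(uap->path, in, sizeof(in));
	if (error != 0)
		return (error);
	n = snprintf(kpath, sizeof(kpath), "/compat/linux%s", in);
	if (n < 0 || (size_t)n >= sizeof(kpath))
		return (ENAMETOOLONG);
	return (demo_kern_open(kpath, DEMO_SYSSPACE, uap->flags, uap->mode));
}
```

In this sketch the translated path never returns to user memory: the
compat wrapper performs one copy in and then hands a kernel buffer to the
implementation, so there is nothing for a second thread sharing the
address space to overwrite between translation and use.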