From owner-freebsd-hackers Tue Mar 18 11:55:05 1997
To: hackers@freebsd.org
Subject: REPOST: dup3() - interesting feature-in-training or silly hack?
Date: Tue, 18 Mar 1997 11:55:00 -0800
Message-ID: <17699.858714900@time.cdrom.com>
From: "Jordan K. Hubbard"
Sender: owner-hackers@freebsd.org

[I posted this last night to -hackers, but several people reported back
that they never saw it, and freefall sort of crashed about 3 hours later,
so perhaps it just got eaten.]

From: "Jordan K. Hubbard"
Date: Mon, 17 Mar 1997 22:35:23 -0800
Subject: Whee! jkh adds his first syscall...

The subject alone should have the kernel hackers all running for shelter
at this point - "Aigh! He's looking in /usr/src/sys!" they yell.
"Somebody stop him!!" :-)

Well, OK, maybe I have to confess that I've just always wanted to see
what would be involved in adding a system call, and in particular *this*
system call, so that I might implement a long-standing wishlist item of
mine with redirection and piping (I've got the first part of this, but
not the second yet).

The system call:

	int dup3(int oldd, pid_t tpid, int newd)

dup3() duplicates the caller's descriptor oldd onto descriptor newd of
the target process tpid; newd is interpreted in the context of that
process.  If newd is currently assigned to a valid file there, then that
file is returned as a new file descriptor in the current process
context; otherwise -1 is returned.  If the returned file descriptor is
not needed then it should be closed.
The primary purpose of dup3() is to allow "splicing" of I/O in
already-running processes.  Yes, I know many will look at the name and
go "yuck!" - it's half in jest, OK? :)

So what's the use of it?  To use in things like shells, so that you can
do stuff like this:

	# make world
	^Z%1 Suspended
	# bg > make.out
	#

This also works with fg, so you can foreground and redirect stdin,
stdout and stderr at the same time just as easily.  The patches here to
sh implement this extra behavior, conditionalized on HAVE_DUP3.

Now please note:  This really is just proof-of-concept material here for
3 big reasons:

1. Thwapping over another process's file descriptors is rude, and it
   generally confuses things like the stdio library to do this.  It
   seems to mostly work OK in my implementation, but I'm sure some sort
   of "invalidate current buffered fd contents" hack would have to be
   added to stdio if you wanted to make it all work correctly (try
   redirecting stdin, for example, and see the slightly odd behavior it
   has now).

2. The hacks to the shell are exceedingly minimal, and don't implement
   the more complicated and useful cases like:

	fg | more
   or
	yes | fg

   Note that it should be perfectly possible, but you'd have to change
   the way the shell handles these builtins pretty substantially to make
   it work.  Making redirection happen was easy. :-)  The changes also
   only cover /bin/sh, which of course almost nobody uses.  If we
   actually manage to make a useful facility out of this, someone would
   also have to beat on bash and tcsh.

3. I'm sure there is at least one blatant security hole opened by this
   mechanism, and I do *ONLY THE MOST MINIMAL* checks for security.
   More specifically, I compare the euids of from and to, refusing the
   dup3() if they don't match (unless the current euid is 0).  This is a
   very minimal test, and I don't even test for proper parent/child
   relationship in the non-root case.  So use this stuff at your own
   risk!
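
The return-value and permission rules described above can be modelled in
user space.  What follows is only a minimal sketch under stated
assumptions, not the kernel code from the patch: the names model_file,
model_proc, model_fdalloc and model_dup3 are hypothetical, the descriptor
table is a fixed size (the patch grows the target's table instead), and
the model simply moves the old reference to the caller rather than
reproducing the patch's f_count bookkeeping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define MODEL_NFILES 8	/* assumption: fixed-size table, unlike the kernel's */

struct model_file { int refs; };	/* hypothetical stand-in for struct file */
struct model_proc {			/* hypothetical stand-in for struct proc */
	unsigned euid;
	struct model_file *ofiles[MODEL_NFILES];
};

/* Lowest free slot in p's table, or -1 if the table is full. */
static int
model_fdalloc(const struct model_proc *p)
{
	int i;

	for (i = 0; i < MODEL_NFILES; i++)
		if (p->ofiles[i] == NULL)
			return (i);
	return (-1);
}

/*
 * Install self's descriptor `from` as descriptor `to` in target.
 * Returns 0 or an errno value.  On success *retval is the descriptor
 * in self that now names whatever `to` used to refer to in target,
 * or -1 if that slot was empty.
 */
static int
model_dup3(struct model_proc *self, int from, struct model_proc *target,
    int to, int *retval)
{
	int i;

	assert(self != NULL && retval != NULL);
	if (target == NULL)
		return (ESRCH);		/* no such target process */
	if (self->euid != 0 && self->euid != target->euid)
		return (EPERM);		/* only root may cross euids */
	if (from < 0 || from >= MODEL_NFILES || self->ofiles[from] == NULL)
		return (EBADF);
	if (to < 0 || to >= MODEL_NFILES)
		return (EBADF);		/* the patch grows the table instead */

	*retval = -1;
	if (target->ofiles[to] != NULL) {
		if ((i = model_fdalloc(self)) < 0)
			return (EMFILE);
		/* The reference moves from target's slot to ours. */
		self->ofiles[i] = target->ofiles[to];
		*retval = i;
	}
	target->ofiles[to] = self->ofiles[from];
	target->ofiles[to]->refs++;	/* target now shares the file */
	return (0);
}

#ifdef MODEL_DUP3_MAIN	/* build with -DMODEL_DUP3_MAIN for a small demo */
int
main(void)
{
	struct model_file oldout = { 1 }, newout = { 1 };
	struct model_proc sh = { 1001, { NULL } }, job = { 1001, { NULL } };
	int fd, error;

	sh.ofiles[1] = &newout;		/* shell's stdout: make.out */
	job.ofiles[1] = &oldout;	/* job's stdout: the terminal */
	error = model_dup3(&sh, 1, &job, 1, &fd);
	printf("error %d, job's old stdout handed back as fd %d\n", error, fd);
	return (0);
}
#endif
```

In this model, what the shell does for "bg > make.out" corresponds to
calling model_dup3(&sh, i, &job, i, &fd) for each redirected descriptor i
and closing fd whenever it is not -1.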
I'm mostly just releasing it for comments at this point, to find out if
I'm really just smoking crack with this whole idea.

Patches are relative to 2.2-current, though they should work just as
well in 3.0.  I also included patches to all the "derived" files from
syscalls.master - while not strictly necessary, it saves everyone from
having to do anything more than apply this patch from the top of
/usr/src and build a new libc, new kernel and new /bin/sh.

Feedback most welcome.

					Jordan

Index: bin/sh/Makefile
===================================================================
RCS file: /home/ncvs/src/bin/sh/Makefile,v
retrieving revision 1.15
diff -u -r1.15 Makefile
--- Makefile 1996/10/25 14:49:24 1.15
+++ Makefile 1997/03/18 06:00:58
@@ -15,7 +15,7 @@
 LDADD+= -ll -ledit -ltermcap
 LFLAGS= -8	# 8-bit lex scanner for arithmetic
-CFLAGS+=-DSHELL -I. -I${.CURDIR}
+CFLAGS+=-DSHELL -DHAVE_DUP3 -I. -I${.CURDIR}
 # for debug:
 # CFLAGS+= -g -DDEBUG=2
Index: bin/sh/jobs.c
===================================================================
RCS file: /home/ncvs/src/bin/sh/jobs.c,v
retrieving revision 1.8.2.1
diff -u -r1.8.2.1 jobs.c
--- jobs.c 1997/01/12 21:58:49 1.8.2.1
+++ jobs.c 1997/03/18 06:00:45
@@ -213,11 +213,20 @@
 	struct job *jp;
 {
 	struct procstat *ps;
-	int i;
+	int i, fd;
 	if (jp->state == JOBDONE)
 		return;
 	INTOFF;
+#ifdef HAVE_DUP3
+	for (i = 0; i < 3; i++) {
+		if (fd_redirected_p(i)) {
+			fd = dup3(i, jp->ps[0].pid, i);
+			if (fd != -1)
+				close(fd);
+		}
+	}
+#endif
 	killpg(jp->ps[0].pid, SIGCONT);
 	for (ps = jp->ps, i = jp->nprocs ; --i >= 0 ; ps++) {
 		if ((ps->status & 0377) == 0177) {
@@ -591,7 +600,7 @@
 		ignoresig(SIGINT);
 		ignoresig(SIGQUIT);
 		if ((jp == NULL || jp->nprocs == 0) &&
-		    ! fd0_redirected_p ()) {
+		    ! fd_redirected_p (0)) {
 			close(0);
 			if (open("/dev/null", O_RDONLY) != 0)
 				error("Can't open /dev/null");
@@ -602,7 +611,7 @@
 		ignoresig(SIGINT);
 		ignoresig(SIGQUIT);
 		if ((jp == NULL || jp->nprocs == 0) &&
-		    ! fd0_redirected_p ()) {
+		    !
+		    fd_redirected_p (0)) {
 			close(0);
 			if (open("/dev/null", O_RDONLY) != 0)
 				error("Can't open /dev/null");
Index: bin/sh/redir.c
===================================================================
RCS file: /home/ncvs/src/bin/sh/redir.c,v
retrieving revision 1.5
diff -u -r1.5 redir.c
--- redir.c 1996/09/01 10:21:36 1.5
+++ redir.c 1997/03/18 05:50:46
@@ -76,11 +76,11 @@
 MKINIT struct redirtab *redirlist;
 /*
- * We keep track of whether or not fd0 has been redirected.  This is for
+ * We keep track of whether or not fds 0-2 have been redirected.  This is for
  * background commands, where we want to redirect fd0 to /dev/null only
- * if it hasn't already been redirected.
+ * if it hasn't already been redirected, and for fg/bg redirection to files.
  */
-int fd0_redirected = 0;
+int fd_redirected[3];
 STATIC void openredirect __P((union node *, char[10 ]));
 STATIC int openhere __P((union node *));
@@ -132,8 +132,8 @@
 		} else {
 			close(fd);
 		}
-		if (fd == 0)
-			fd0_redirected++;
+		if (fd >= 0 && fd <= 2)
+			fd_redirected[fd]++;
 		openredirect(n, memory);
 	}
 	if (memory[1])
@@ -267,8 +267,8 @@
 	for (i = 0 ; i < 10 ; i++) {
 		if (rp->renamed[i] != EMPTY) {
-			if (i == 0)
-				fd0_redirected--;
+			if (i >= 0 && i <= 2)
+				fd_redirected[i]--;
 			close(i);
 			if (rp->renamed[i] >= 0) {
 				copyfd(rp->renamed[i], i);
@@ -303,8 +303,11 @@
 /* Return true if fd 0 has already been redirected at least once.
    */
 int
-fd0_redirected_p () {
-	return fd0_redirected != 0;
+fd_redirected_p (int fd) {
+	if (fd >= 0 && fd <= 2)
+		return fd_redirected[fd] != 0;
+	else
+		return 0;
 }
 /*
Index: bin/sh/redir.h
===================================================================
RCS file: /home/ncvs/src/bin/sh/redir.h,v
retrieving revision 1.3
diff -u -r1.3 redir.h
--- redir.h 1996/09/01 10:21:37 1.3
+++ redir.h 1997/03/18 05:51:43
@@ -44,7 +44,7 @@
 union node;
 void redirect __P((union node *, int));
 void popredir __P((void));
-int fd0_redirected_p __P((void));
+int fd_redirected_p __P((int));
 void clearredir __P((void));
 int copyfd __P((int, int));
Index: lib/libc/sys/Makefile.inc
===================================================================
RCS file: /home/ncvs/src/lib/libc/sys/Makefile.inc,v
retrieving revision 1.20
diff -u -r1.20 Makefile.inc
--- Makefile.inc 1996/09/20 13:55:25 1.20
+++ Makefile.inc 1997/03/17 18:13:35
@@ -14,7 +14,7 @@
 # modules with default implementations on all architectures:
 ASM=	accept.o access.o acct.o adjtime.o bind.o chdir.o chflags.o chmod.o \
-	chown.o chroot.o close.o connect.o dup.o dup2.o execve.o fchdir.o \
+	chown.o chroot.o close.o connect.o dup.o dup2.o dup3.o execve.o fchdir.o \
 	fchflags.o fchmod.o fchown.o fcntl.o flock.o fpathconf.o fstat.o \
 	fstatfs.o fsync.o getdirentries.o getdtablesize.o getegid.o \
 	geteuid.o getfh.o getfsstat.o getgid.o getgroups.o getitimer.o \
@@ -109,6 +109,7 @@
 MLINKS+=brk.2 sbrk.2
 MLINKS+=dup.2 dup2.2
+MLINKS+=dup.2 dup3.2
 MLINKS+=chdir.2 fchdir.2
 MLINKS+=chflags.2 fchflags.2
 MLINKS+=chmod.2 fchmod.2
Index: lib/libc/sys/dup.2
===================================================================
RCS file: /home/ncvs/src/lib/libc/sys/dup.2,v
retrieving revision 1.3.2.3
diff -u -r1.3.2.3 dup.2
--- dup.2 1997/03/09 22:16:51 1.3.2.3
+++ dup.2 1997/03/18 06:13:05
@@ -37,7 +37,8 @@
 .Os BSD 4
 .Sh NAME
 .Nm dup ,
-.Nm dup2
+.Nm dup2 ,
+.Nm dup3
 .Nd duplicate an existing file descriptor
 .Sh SYNOPSIS
 .Fd
 #include
@@ -45,6 +46,8 @@
 .Fn dup "int oldd"
 .Ft int
 .Fn dup2 "int oldd" "int newd"
+.Ft int
+.Fn dup3 "int oldd" "pid_t tpid" "int newd"
 .Sh DESCRIPTION
 .Fn Dup
 duplicates an existing object descriptor and returns its value to
@@ -113,6 +116,18 @@
 is a valid descriptor, then
 .Fn dup2
 is successful, and does nothing.
+.Pp
+In
+.Fn dup3 ,
+a target process ID and the value of the new descriptor
+.Fa newd
+is specified in the context of that process.  If this descriptor
+is currently assigned to a valid file, then it will be returned
+as a new file descriptor in the current process context, otherwise
+-1 is returned.  If the returned file descriptor is not needed then
+it should be closed.  The primary purpose of
+.Fn dup3
+is to allow "splicing" of I/O in already-running processes.
 .Sh IMPLEMENTATION NOTES
 .Pp
 In the non-threaded library
@@ -166,9 +181,10 @@
 .Va errno
 indicates the cause of the error.
 .Sh ERRORS
-.Fn Dup
-and
+.Fn Dup ,
 .Fn dup2
+and
+.Fn dup3
 fail if:
 .Bl -tag -width Er
 .It Bq Er EBADF
@@ -178,6 +194,18 @@
 is not a valid active descriptor
 .It Bq Er EMFILE
 Too many descriptors are active.
+.Pp
+.Fn dup3
+will additionally fail if:
+.Bl -tag -width Er
+.It Bq Er ESRCH
+The
+.Fa tpid
+is not found.
+.It Bq Er EPERM
+The effective uid of the current process does not match that of
+the target process.  Only the super user can modify the file descriptor
+table of processes with a different euid.
 .El
 .Sh SEE ALSO
 .Xr accept 2 ,
@@ -202,3 +230,6 @@
 .Fn dup2
 function call appeared in
 .At v7 .
+The
+.Fn dup3
+function call appeared in FreeBSD 3.0 .
Index: sys/kern/init_sysent.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/init_sysent.c,v
retrieving revision 1.36
diff -u -r1.36 init_sysent.c
--- init_sysent.c 1996/09/19 19:48:31 1.36
+++ init_sysent.c 1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call switch table.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp
+ * created from Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp
  */
 #include
@@ -266,7 +266,7 @@
 	{ 3, (sy_call_t *)shmctl },		/* 229 = shmctl */
 	{ 1, (sy_call_t *)shmdt },		/* 230 = shmdt */
 	{ 3, (sy_call_t *)shmget },		/* 231 = shmget */
-	{ 0, (sy_call_t *)nosys },		/* 232 = nosys */
+	{ 3, (sy_call_t *)dup3 },		/* 232 = dup3 */
 	{ 0, (sy_call_t *)nosys },		/* 233 = nosys */
 	{ 0, (sy_call_t *)nosys },		/* 234 = nosys */
 	{ 0, (sy_call_t *)nosys },		/* 235 = nosys */
Index: sys/kern/kern_descrip.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_descrip.c,v
retrieving revision 1.32.2.2
diff -u -r1.32.2.2 kern_descrip.c
--- kern_descrip.c 1996/12/21 19:04:24 1.32.2.2
+++ kern_descrip.c 1997/03/18 05:17:28
@@ -149,8 +149,70 @@
 }
 /*
- * Duplicate a file descriptor.
+ * Duplicate a file descriptor to a particular value in another process.
  */
+#ifndef _SYS_SYSPROTO_H_
+struct dup3_args {
+	u_int	from;
+	pid_t	target;
+	u_int	to;
+};
+#endif
+/* ARGSUSED */
+int
+dup3(p, uap, retval)
+	struct proc *p;
+	struct dup3_args *uap;
+	int *retval;
+{
+	struct filedesc *tdp, *fdp;
+	struct proc *t;
+	struct file *fp, *nfp;
+	int i, error;
+	u_int from = uap->from, to = uap->to;
+
+	/* Look up target process and make sure it exists, then set up both tables */
+	t = pfind(uap->target);
+	if (!t)
+		return (ESRCH);
+	tdp = t->p_fd;
+	fdp = p->p_fd;
+
+	/* Don't let non-root procs stomp other procs unless euid is the same */
+	/* XXX should also put in a check for parentage here in the non-root case XXX */
+	if (p->p_ucred->cr_uid && p->p_ucred->cr_uid != t->p_ucred->cr_uid)
+		return (EPERM);
+
+	if (from >= fdp->fd_nfiles || fdp->fd_ofiles[from] == NULL)
+		return (EBADF);
+	*retval = -1;	/* default: target slot was empty, nothing to hand back */
+	if (to >= tdp->fd_nfiles) {
+		if ((error = fdalloc(t, to, &i)))
+			return (error);
+		if (to != i)
+			panic("dup3: fdalloc");
+		*retval = -1;
+	}
+	else if (tdp->fd_ofiles[to]) {
+		if ((error
+		    = fdalloc(p, 0, &i)))
+			return (error);
+		fdp->fd_ofiles[i] = tdp->fd_ofiles[to];
+		fdp->fd_ofileflags[i] = tdp->fd_ofileflags[to] &~ UF_EXCLOSE;
+		tdp->fd_ofiles[to]->f_count++;
+		if (i > fdp->fd_lastfile)
+			fdp->fd_lastfile = i;
+		*retval = i;
+	}
+	tdp->fd_ofiles[to] = fdp->fd_ofiles[from];
+	tdp->fd_ofileflags[to] = fdp->fd_ofileflags[from] &~ UF_EXCLOSE;
+	/* Count the file just installed; tdp->fd_ofiles[from] may be NULL. */
+	tdp->fd_ofiles[to]->f_count++;
+	if (to > tdp->fd_lastfile)
+		tdp->fd_lastfile = to;
+	return (0);
+}
+
+/*
+ * Duplicate a file descriptor.
  */
 #ifndef _SYS_SYSPROTO_H_
 struct dup_args {
 	u_int	fd;
Index: sys/kern/syscalls.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/syscalls.c,v
retrieving revision 1.31
diff -u -r1.31 syscalls.c
--- syscalls.c 1996/09/19 19:48:34 1.31
+++ syscalls.c 1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call names.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp
+ * created from Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp
  */
 char *syscallnames[] = {
@@ -253,7 +253,7 @@
 	"shmctl",			/* 229 = shmctl */
 	"shmdt",			/* 230 = shmdt */
 	"shmget",			/* 231 = shmget */
-	"#232",			/* 232 = nosys */
+	"dup3",			/* 232 = dup3 */
 	"#233",			/* 233 = nosys */
 	"#234",			/* 234 = nosys */
 	"#235",			/* 235 = nosys */
Index: sys/kern/syscalls.master
===================================================================
RCS file: /home/ncvs/src/sys/kern/syscalls.master,v
retrieving revision 1.29
diff -u -r1.29 syscalls.master
--- syscalls.master 1996/09/19 19:48:38 1.29
+++ syscalls.master 1997/03/17 18:06:01
@@ -364,7 +364,7 @@
 230	STD	BSD	{ int shmdt(void *shmaddr); }
 231	STD	BSD	{ int shmget(key_t key, int size, int shmflg); }
 ;
-232	UNIMPL	NOHIDE	nosys
+232	STD	BSD	{ int dup3(u_int from, pid_t target, u_int to); }
 233	UNIMPL	NOHIDE	nosys
 234	UNIMPL	NOHIDE	nosys
 235	UNIMPL	NOHIDE	nosys
Index: sys/sys/syscall-hide.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/syscall-hide.h,v
retrieving revision 1.25
diff -u -r1.25 syscall-hide.h
--- syscall-hide.h 1996/09/19 19:49:10 1.25
+++ syscall-hide.h 1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call hiders.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp
+ * created from Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp
  */
 HIDE_POSIX(fork)
@@ -209,5 +209,6 @@
 HIDE_BSD(shmctl)
 HIDE_BSD(shmdt)
 HIDE_BSD(shmget)
+HIDE_BSD(dup3)
 HIDE_BSD(minherit)
 HIDE_BSD(rfork)
Index: sys/sys/syscall.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/syscall.h,v
retrieving revision 1.29
diff -u -r1.29 syscall.h
--- syscall.h 1996/09/19 19:49:12 1.29
+++ syscall.h 1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call numbers.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp
+ * created from Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp
  */
 #define	SYS_syscall	0
@@ -203,6 +203,7 @@
 #define	SYS_shmctl	229
 #define	SYS_shmdt	230
 #define	SYS_shmget	231
+#define	SYS_dup3	232
 #define	SYS_minherit	250
 #define	SYS_rfork	251
 #define	SYS_MAXSYSCALL	252
Index: sys/sys/sysproto.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/sysproto.h,v
retrieving revision 1.15
diff -u -r1.15 sysproto.h
--- sysproto.h 1996/09/19 19:49:13 1.15
+++ sysproto.h 1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call prototypes.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp
+ * created from Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp
  */
 #ifndef _SYS_SYSPROTO_H_
@@ -716,6 +716,11 @@
 	int size;
 	int shmflg;
 };
+struct	dup3_args {
+	u_int from;
+	pid_t target;
+	u_int to;
+};
 struct	minherit_args {
 	caddr_t addr;
 	size_t len;
@@ -891,6 +896,7 @@
 int	shmctl __P((struct proc *, struct shmctl_args *, int []));
 int	shmdt __P((struct proc *, struct shmdt_args *, int []));
 int	shmget __P((struct proc *, struct shmget_args *, int []));
+int	dup3 __P((struct proc *, struct dup3_args *, int []));
 int	minherit __P((struct proc *, struct minherit_args *, int []));
 int	rfork __P((struct proc *, struct rfork_args *, int []));

------------------------------

End of hackers-digest V3 #109
*****************************