From owner-freebsd-hackers@FreeBSD.ORG Fri Apr 19 22:48:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D36CD6A4 for ; Fri, 19 Apr 2013 22:48:56 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (unknown [IPv6:2001:610:1108:5012::107]) by mx1.freebsd.org (Postfix) with ESMTP id 4118A2D8 for ; Fri, 19 Apr 2013 22:48:56 +0000 (UTC) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id A777012013B for ; Sat, 20 Apr 2013 00:48:39 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id 900482848C; Sat, 20 Apr 2013 00:48:39 +0200 (CEST) Date: Sat, 20 Apr 2013 00:48:39 +0200 From: Jilles Tjoelker To: freebsd-hackers@freebsd.org Subject: [patch] accept4 Message-ID: <20130419224839.GA69212@stack.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 19 Apr 2013 22:48:56 -0000 The accept4() function, compared to accept(), allows setting the new file descriptor atomically close-on-exec and explicitly controlling the non-blocking status on the new socket. (Note that the latter point means that accept() is not equivalent to any form of accept4().) The linuxulator's accept4 implementation leaves a race window where the new file descriptor is not close-on-exec because it calls sys_accept(). This implementation leaves no such race window (by using falloc() flags). The linuxulator could be fixed and simplified by using the new code. The comms/openobex port now compiles again without the patch-lib_cloexec.h patch. More information about API extensions to set new file descriptors atomically close-on-exec is at https://wiki.freebsd.org/AtomicCloseOnExec I'm also working on pipe2() (using linuxulator work) and dup3() (patch from Jukka A. Ukkonen). commit 2e410d9260d12328b689b0938e09d09649b0460a Author: Jilles Tjoelker Date: Sat Mar 30 21:44:14 2013 +0100 Add accept4() system call. diff --git a/lib/libc/sys/Makefile.inc b/lib/libc/sys/Makefile.inc index fef0f3c..105f469 100644 --- a/lib/libc/sys/Makefile.inc +++ b/lib/libc/sys/Makefile.inc @@ -270,6 +270,7 @@ MAN+= sctp_generic_recvmsg.2 \ wait.2 \ write.2 +MLINKS+=accept.2 accept4.2 MLINKS+=access.2 eaccess.2 \ access.2 faccessat.2 MLINKS+=brk.2 sbrk.2 diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map index 6faa0af..149fa41 100644 --- a/lib/libc/sys/Symbol.map +++ b/lib/libc/sys/Symbol.map @@ -378,6 +378,7 @@ FBSD_1.2 { }; FBSD_1.3 { + accept4; bindat; cap_fcntls_get; cap_fcntls_limit; @@ -461,6 +462,8 @@ FBSDprivate_1.0 { __sys_abort2; _accept; __sys_accept; + _accept4; + __sys_accept4; _access; __sys_access; _acct; diff --git a/lib/libc/sys/accept.2 b/lib/libc/sys/accept.2 index 978b948..e8bccaf 100644 --- a/lib/libc/sys/accept.2 +++ b/lib/libc/sys/accept.2 @@ -41,6 +41,8 @@ .In sys/socket.h .Ft int .Fn accept "int s" "struct sockaddr * restrict addr" "socklen_t * restrict addrlen" +.Ft int +.Fn accept4 "int s" "struct sockaddr * restrict addr" "socklen_t * restrict addrlen" "int flags" .Sh DESCRIPTION The argument .Fa s @@ -66,6 +68,26 @@ and signals from the original socket .Fa s . .Pp +The +.Fn accept4 +system call is similar, +but the +.Dv O_NONBLOCK +property of the new socket is instead determined by the +.Dv SOCK_NONBLOCK +flag in the +.Fa flags +argument, +the +.Dv O_ASYNC +property is cleared, +the signal destination is cleared +and the close-on-exec flag on the new file descriptor can be set via the +.Dv SOCK_CLOEXEC +flag in the +.Fa flags +argument. +.Pp If no pending connections are present on the queue, and the original socket is not marked as non-blocking, @@ -141,13 +163,15 @@ properties and the signal destination being inherited, but should set them explicitly using .Xr fcntl 2 . .Sh RETURN VALUES -The call returns \-1 on error. -If it succeeds, it returns a non-negative +These calls return \-1 on error. +If they succeed, they return a non-negative integer that is a descriptor for the accepted socket. .Sh ERRORS The .Fn accept -system call will fail if: +and +.Fn accept4 +system calls will fail if: .Bl -tag -width Er .It Bq Er EBADF The descriptor is invalid. @@ -180,6 +204,16 @@ are present to be accepted. A connection arrived, but it was closed while waiting on the listen queue. .El +.Pp +The +.Fn accept4 +system call will also fail if: +.Bl -tag -width Er +.It Bq Er EINVAL +The +.Fa flags +argument is invalid. +.El .Sh SEE ALSO .Xr bind 2 , .Xr connect 2 , @@ -194,3 +228,8 @@ The .Fn accept system call appeared in .Bx 4.2 . +.Pp +The +.Fn accept4 +system call appeared in +.Fx 10.0 . diff --git a/lib/libthr/pthread.map b/lib/libthr/pthread.map index 355edea..bbbd930e 100644 --- a/lib/libthr/pthread.map +++ b/lib/libthr/pthread.map @@ -181,6 +181,7 @@ FBSDprivate_1.0 { ___wait; ___waitpid; __accept; + __accept4; __aio_suspend; __close; __connect; @@ -408,3 +409,7 @@ FBSD_1.2 { setcontext; swapcontext; }; + +FBSD_1.3 { + accept4; +}; diff --git a/lib/libthr/thread/thr_syscalls.c b/lib/libthr/thread/thr_syscalls.c index 2327d74..7a08302 100644 --- a/lib/libthr/thread/thr_syscalls.c +++ b/lib/libthr/thread/thr_syscalls.c @@ -101,6 +101,7 @@ extern pid_t __waitpid(pid_t, int *, int); extern int __sys_aio_suspend(const struct aiocb * const[], int, const struct timespec *); extern int __sys_accept(int, struct sockaddr *, socklen_t *); +extern int __sys_accept4(int, struct sockaddr *, socklen_t *, int); extern int __sys_connect(int, const struct sockaddr *, socklen_t); extern int __sys_fsync(int); extern int __sys_msync(void *, size_t, int); @@ -129,6 +130,7 @@ int ___usleep(useconds_t useconds); pid_t ___wait(int *); pid_t ___waitpid(pid_t, int *, int); int __accept(int, struct sockaddr *, socklen_t *); +int __accept4(int, struct sockaddr *, socklen_t *, int); int __aio_suspend(const struct aiocb * const iocbs[], int, const struct timespec *); int __close(int); @@ -176,6 +178,26 @@ __accept(int s, struct sockaddr *addr, socklen_t *addrlen) return (ret); } +__weak_reference(__accept4, accept4); + +/* + * Cancellation behavior: + * If thread is canceled, no socket is created. + */ +int +__accept4(int s, struct sockaddr *addr, socklen_t *addrlen, int flags) +{ + struct pthread *curthread; + int ret; + + curthread = _get_curthread(); + _thr_cancel_enter(curthread); + ret = __sys_accept4(s, addr, addrlen, flags); + _thr_cancel_leave(curthread, ret == -1); + + return (ret); +} + __weak_reference(__aio_suspend, aio_suspend); int diff --git a/sys/compat/freebsd32/syscalls.master b/sys/compat/freebsd32/syscalls.master index 0a40ab2..2cbdf31 100644 --- a/sys/compat/freebsd32/syscalls.master +++ b/sys/compat/freebsd32/syscalls.master @@ -1022,3 +1022,7 @@ int namelen); } 540 AUE_CHFLAGSAT NOPROTO { int chflagsat(int fd, const char *path, \ u_long flags, int atflag); } +541 AUE_ACCEPT NOPROTO { int accept4(int s, \ + struct sockaddr * __restrict name, \ + __socklen_t * __restrict anamelen, \ + int flags); } diff --git a/sys/kern/capabilities.conf b/sys/kern/capabilities.conf index f54c25d..0b64503 100644 --- a/sys/kern/capabilities.conf +++ b/sys/kern/capabilities.conf @@ -78,6 +78,7 @@ abort2 ## relies on existing bindings on a socket, subject to capability rights. ## accept +accept4 ## ## Allow AIO operations by file descriptor, subject to capability rights. diff --git a/sys/kern/syscalls.master b/sys/kern/syscalls.master index 4b3348f..922db30 100644 --- a/sys/kern/syscalls.master +++ b/sys/kern/syscalls.master @@ -972,5 +972,9 @@ int namelen); } 540 AUE_CHFLAGSAT STD { int chflagsat(int fd, const char *path, \ u_long flags, int atflag); } +541 AUE_ACCEPT STD { int accept4(int s, \ + struct sockaddr * __restrict name, \ + __socklen_t * __restrict anamelen, \ + int flags); } ; Please copy any additions and changes to the following compatability tables: ; sys/compat/freebsd32/syscalls.master diff --git a/sys/kern/uipc_syscalls.c b/sys/kern/uipc_syscalls.c index ac2fd42..4bbd595 100644 --- a/sys/kern/uipc_syscalls.c +++ b/sys/kern/uipc_syscalls.c @@ -97,10 +97,18 @@ __FBSDID("$FreeBSD$"); #endif /* SCTP */ #endif /* INET || INET6 */ +/* + * Flags for accept1() and kern_accept4(), in addition to SOCK_CLOEXEC + * and SOCK_NONBLOCK. + */ +#define ACCEPT4_INHERIT 0x1 +#define ACCEPT4_COMPAT 0x2 + static int sendit(struct thread *td, int s, struct msghdr *mp, int flags); static int recvit(struct thread *td, int s, struct msghdr *mp, void *namelenp); -static int accept1(struct thread *td, struct accept_args *uap, int compat); +static int accept1(struct thread *td, int s, struct sockaddr *uname, + socklen_t *anamelen, int flags); static int do_sendfile(struct thread *td, struct sendfile_args *uap, int compat); static int getsockname1(struct thread *td, struct getsockname_args *uap, int compat); @@ -317,49 +325,46 @@ sys_listen(td, uap) * accept1() */ static int -accept1(td, uap, compat) +accept1(td, s, uname, anamelen, flags) struct thread *td; - struct accept_args /* { - int s; - struct sockaddr * __restrict name; - socklen_t * __restrict anamelen; - } */ *uap; - int compat; + int s; + struct sockaddr *uname; + socklen_t *anamelen; + int flags; { struct sockaddr *name; socklen_t namelen; struct file *fp; int error; - if (uap->name == NULL) - return (kern_accept(td, uap->s, NULL, NULL, NULL)); + if (uname == NULL) + return (kern_accept4(td, s, NULL, NULL, flags, NULL)); - error = copyin(uap->anamelen, &namelen, sizeof (namelen)); + error = copyin(anamelen, &namelen, sizeof (namelen)); if (error) return (error); - error = kern_accept(td, uap->s, &name, &namelen, &fp); + error = kern_accept4(td, s, &name, &namelen, flags, &fp); /* * return a namelen of zero for older code which might * ignore the return value from accept. */ if (error) { - (void) copyout(&namelen, - uap->anamelen, sizeof(*uap->anamelen)); + (void) copyout(&namelen, anamelen, sizeof(*anamelen)); return (error); } - if (error == 0 && name != NULL) { + if (error == 0 && uname != NULL) { #ifdef COMPAT_OLDSOCK - if (compat) + if (flags & ACCEPT4_COMPAT) ((struct osockaddr *)name)->sa_family = name->sa_family; #endif - error = copyout(name, uap->name, namelen); + error = copyout(name, uname, namelen); } if (error == 0) - error = copyout(&namelen, uap->anamelen, + error = copyout(&namelen, anamelen, sizeof(namelen)); if (error) fdclose(td->td_proc->p_fd, fp, td->td_retval[0], td); @@ -372,6 +377,13 @@ int kern_accept(struct thread *td, int s, struct sockaddr **name, socklen_t *namelen, struct file **fp) { + return (kern_accept4(td, s, name, namelen, ACCEPT4_INHERIT, fp)); +} + +int +kern_accept4(struct thread *td, int s, struct sockaddr **name, + socklen_t *namelen, int flags, struct file **fp) +{ struct filedesc *fdp; struct file *headfp, *nfp = NULL; struct sockaddr *sa = NULL; @@ -400,7 +412,7 @@ kern_accept(struct thread *td, int s, struct sockaddr **name, if (error != 0) goto done; #endif - error = falloc(td, &nfp, &fd, 0); + error = falloc(td, &nfp, &fd, (flags & SOCK_CLOEXEC) ? O_CLOEXEC : 0); if (error) goto done; ACCEPT_LOCK(); @@ -441,7 +453,10 @@ kern_accept(struct thread *td, int s, struct sockaddr **name, TAILQ_REMOVE(&head->so_comp, so, so_list); head->so_qlen--; - so->so_state |= (head->so_state & SS_NBIO); + if (flags & ACCEPT4_INHERIT) + so->so_state |= (head->so_state & SS_NBIO); + else + so->so_state |= (flags & SOCK_NONBLOCK) ? SS_NBIO : 0; so->so_qstate &= ~SQ_COMP; so->so_head = NULL; @@ -454,9 +469,15 @@ kern_accept(struct thread *td, int s, struct sockaddr **name, /* connection has been removed from the listen queue */ KNOTE_UNLOCKED(&head->so_rcv.sb_sel.si_note, 0); - pgid = fgetown(&head->so_sigio); - if (pgid != 0) - fsetown(pgid, &so->so_sigio); + if (flags & ACCEPT4_INHERIT) { + pgid = fgetown(&head->so_sigio); + if (pgid != 0) + fsetown(pgid, &so->so_sigio); + } else { + fflag &= ~(FNONBLOCK | FASYNC); + if (flags & SOCK_NONBLOCK) + fflag |= FNONBLOCK; + } finit(nfp, fflag, DTYPE_SOCKET, so, &socketops); /* Sync socket nonblocking/async state with file flags */ @@ -527,7 +548,18 @@ sys_accept(td, uap) struct accept_args *uap; { - return (accept1(td, uap, 0)); + return (accept1(td, uap->s, uap->name, uap->anamelen, ACCEPT4_INHERIT)); +} + +int +sys_accept4(td, uap) + struct thread *td; + struct accept4_args *uap; +{ + if (uap->flags & ~(SOCK_CLOEXEC | SOCK_NONBLOCK)) + return (EINVAL); + + return (accept1(td, uap->s, uap->name, uap->anamelen, uap->flags)); } #ifdef COMPAT_OLDSOCK @@ -537,7 +569,8 @@ oaccept(td, uap) struct accept_args *uap; { - return (accept1(td, uap, 1)); + return (accept1(td, uap->s, uap->name, uap->anamelen, + ACCEPT4_INHERIT | ACCEPT4_COMPAT)); } #endif /* COMPAT_OLDSOCK */ diff --git a/sys/sys/socket.h b/sys/sys/socket.h index 41c85b6..90411d7 100644 --- a/sys/sys/socket.h +++ b/sys/sys/socket.h @@ -634,6 +634,7 @@ int accept(int, struct sockaddr * __restrict, socklen_t * __restrict); int bind(int, const struct sockaddr *, socklen_t); int connect(int, const struct sockaddr *, socklen_t); #if __BSD_VISIBLE +int accept4(int, struct sockaddr * __restrict, socklen_t * __restrict, int); int bindat(int, int, const struct sockaddr *, socklen_t); int connectat(int, int, const struct sockaddr *, socklen_t); #endif diff --git a/sys/sys/syscallsubr.h b/sys/sys/syscallsubr.h index fa0d351..49e8be1 100644 --- a/sys/sys/syscallsubr.h +++ b/sys/sys/syscallsubr.h @@ -60,6 +60,8 @@ int kern___getcwd(struct thread *td, u_char *buf, enum uio_seg bufseg, u_int buflen); int kern_accept(struct thread *td, int s, struct sockaddr **name, socklen_t *namelen, struct file **fp); +int kern_accept4(struct thread *td, int s, struct sockaddr **name, + socklen_t *namelen, int flags, struct file **fp); int kern_access(struct thread *td, char *path, enum uio_seg pathseg, int flags); int kern_accessat(struct thread *td, int fd, char *path, -- Jilles Tjoelker