From owner-freebsd-net@freebsd.org Thu Jan 7 10:54:24 2016 Return-Path: Delivered-To: freebsd-net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 8606FA65EC8 for ; Thu, 7 Jan 2016 10:54:24 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 15CF71DA0 for ; Thu, 7 Jan 2016 10:54:23 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u07AsIoB087955 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 7 Jan 2016 12:54:19 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u07AsIoB087955 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u07AsIO5087954; Thu, 7 Jan 2016 12:54:18 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 7 Jan 2016 12:54:18 +0200 From: Konstantin Belousov To: Boris Astardzhiev Cc: Mark Delany , freebsd-net@freebsd.org Subject: Re: Does FreeBSD have sendmmsg or recvmmsg system calls? Message-ID: <20160107105418.GS3625@kib.kiev.ua> References: <20160103214720.72014.qmail@f5-external.bushwire.net> <20160104085415.GS3625@kib.kiev.ua> <20160104091108.50654.qmail@f5-external.bushwire.net> <20160104093859.GV3625@kib.kiev.ua> <20160104101747.58347.qmail@f5-external.bushwire.net> <20160104194044.GD3625@kib.kiev.ua> <20160104210741.32812.qmail@f5-external.bushwire.net> <20160107101351.GR3625@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jan 2016 10:54:24 -0000 On Thu, Jan 07, 2016 at 12:28:32PM +0200, Boris Astardzhiev wrote: See inline comments and final notes at the end. > diff --git a/lib/libc/include/libc_private.h b/lib/libc/include/libc_private.h > index 5caf9a3..7e2a902 100644 > --- a/lib/libc/include/libc_private.h > +++ b/lib/libc/include/libc_private.h > @@ -200,8 +200,10 @@ enum { > INTERPOS_pselect, > INTERPOS_recvfrom, > INTERPOS_recvmsg, > + INTERPOS_recvmmsg, > INTERPOS_select, > INTERPOS_sendmsg, > + INTERPOS_sendmmsg, > INTERPOS_sendto, > INTERPOS_setcontext, > INTERPOS_sigaction, The interposing table must be extended at the end, and not in the middle. otherwise you introduce too large inconsistence with older libthr. That said, you changed libc table, but did not updated libthr. The result is random segfaults in the multithreaded processes. > diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map > index 7b3257c..6cc3c6e 100644 > --- a/lib/libc/sys/Symbol.map > +++ b/lib/libc/sys/Symbol.map > @@ -220,6 +220,7 @@ FBSD_1.0 { > reboot; > recvfrom; > recvmsg; > + recvmmsg; > rename; > revoke; > rfork; > @@ -240,6 +241,7 @@ FBSD_1.0 { > semsys; > sendfile; > sendmsg; > + sendmmsg; > sendto; > setaudit; > setaudit_addr; The versioning is wrong, new non-private symbols for 11.0 go into _1.4. > @@ -851,6 +853,8 @@ FBSDprivate_1.0 { > __sys_recvfrom; > _recvmsg; > __sys_recvmsg; > + _recvmmsg; > + __sys_recvmmsg; > _rename; > __sys_rename; > _revoke; > @@ -891,6 +895,8 @@ FBSDprivate_1.0 { > __sys_sendfile; > _sendmsg; > __sys_sendmsg; > + _sendmmsg; > + __sys_sendmmsg; > _sendto; > __sys_sendto; > _setaudit; > diff --git a/sys/kern/uipc_syscalls.c b/sys/kern/uipc_syscalls.c > index c33a2cf..7354b4f 100644 > --- a/sys/kern/uipc_syscalls.c > +++ b/sys/kern/uipc_syscalls.c > +int > +sys_recvmmsg(td, uap) > + struct thread *td; > + struct recvmmsg_args /* { > + int s; > + struct mmsghdr *msgvec; > + unsigned int vlen; > + int flags; > + } */ *uap; Use ANSI C definitions for new code, please. I personally do not inline uap declaration, the time has gone. > +{ > + struct mmsghdr *vec, *mmp; > + struct msghdr *mp, msg; > + struct iovec *uiov, *iov; > + unsigned int vlen, len; > + int i, rcvd = 0, error; > + struct socket *so = NULL; Style, no initialization in declaration. > + struct file *fp; > + cap_rights_t rights; > + > + if (fget(td, uap->s, cap_rights_init(&rights, CAP_RECV), &fp) != 0) > + return (EBADF); > + > + so = fp->f_data; What if uap->s filedescriptor does not reference a socket ? > + > + vlen = uap->vlen; > + if (vlen > UIO_MAXIOV) > + vlen = UIO_MAXIOV; > + > + len = vlen * sizeof(*uap->msgvec); > + vec = malloc(len, M_TEMP, M_WAITOK); This is a user-controlled malloc(9), i.e. an easy kernel panic due to impossible to satisfy request, or KVA and memory exhaustion. > + > + error = copyin(uap->msgvec, vec, len); > + if (error != 0) > + goto out; > + > + for (i = 0; i < vlen; i++) { > + mmp = &vec[i]; > + mp = &mmp->msg_hdr; > + msg = *mp; > + > + error = copyiniov(msg.msg_iov, msg.msg_iovlen, &iov, EMSGSIZE); > + if (error != 0) > + goto out; > + > + msg.msg_flags = uap->flags; > +#ifdef COMPAT_OLDSOCK > + msg.msg_flags &= ~MSG_COMPAT; > +#endif > + uiov = msg.msg_iov; > + msg.msg_iov = iov; > + > + error = recvit(td, uap->s, &msg, NULL); > + if (error == 0) { > + msg.msg_iov = uiov; > + error = copyout(&msg, &uap->msgvec[i].msg_hdr, sizeof(msg)); > + if (error != 0) { > + free(iov, M_IOV); > + goto out; > + } > + error = copyout(&td->td_retval[0], &uap->msgvec[i].msg_len, > + sizeof(td->td_retval[0])); > + if (error != 0) { > + free(iov, M_IOV); > + goto out; > + } > + } > + free(iov, M_IOV); > + rcvd++; > + } > + > + td->td_retval[0] = rcvd; > + > +out: > + if (error != 0 && rcvd != 0 && rcvd != vlen) { > + so->so_error = error; > + error = 0; > + td->td_retval[0] = rcvd; > + } > + > + fdrop(fp, td); > + > + free(vec, M_TEMP); > + return (error); > +} > + > /* ARGSUSED */ > int > sys_shutdown(td, uap) > diff --git a/sys/sys/socket.h b/sys/sys/socket.h > index 18e2de1..d352cd2 100644 > --- a/sys/sys/socket.h > +++ b/sys/sys/socket.h > @@ -414,6 +414,11 @@ struct msghdr { > int msg_flags; /* flags on received message */ > }; > > +struct mmsghdr { > + struct msghdr msg_hdr; /* message header */ > + unsigned int msg_len; /* message length */ > +}; > + > #define MSG_OOB 0x1 /* process out-of-band data */ > #define MSG_PEEK 0x2 /* peek at incoming message */ > #define MSG_DONTROUTE 0x4 /* send without using routing tables */ > @@ -615,10 +620,12 @@ int listen(int, int); > ssize_t recv(int, void *, size_t, int); > ssize_t recvfrom(int, void *, size_t, int, struct sockaddr * __restrict, socklen_t * __restrict); > ssize_t recvmsg(int, struct msghdr *, int); > +ssize_t recvmmsg(int, struct mmsghdr *, unsigned int, int); > ssize_t send(int, const void *, size_t, int); > ssize_t sendto(int, const void *, > size_t, int, const struct sockaddr *, socklen_t); > ssize_t sendmsg(int, const struct msghdr *, int); > +ssize_t sendmmsg(int, const struct mmsghdr *, unsigned int, int); Why did you put new functions into POSIX namespace ? They are GNU (and now BSD) extensions, at least I do not see then in SUS. > #if __BSD_VISIBLE > int sendfile(int, int, off_t, size_t, struct sf_hdtr *, off_t *, int); > int setfib(int); The patch lacks man page update, and lacks the freebsd32 compat shims. What are the reasons to go with the trivial kernel implementation right now, instead of trivial and simpler libc wrapper ? I think the speed would be same, but the code _much_ simpler.