Date: Sun, 17 Jan 2016 22:38:10 +0100
From: Jilles Tjoelker
To: Konstantin Belousov
Cc: Boris Astardzhiev, net@freebsd.org
Subject: Re: Does FreeBSD have sendmmsg or recvmmsg system calls?

On Sun, Jan 17, 2016 at 10:18:53PM +0100, Jilles Tjoelker wrote:
> On Sat, Jan 16, 2016 at 10:25:34PM +0200, Konstantin Belousov wrote:
> > On Sat, Jan 16, 2016 at 09:56:57PM +0200, Konstantin Belousov wrote:
> > > After thinking some more, I believe I managed to construct a possible
> > > way to implement this, in libc, with some libthr extensions. Basically,
> > > the idea is to have function pthread_cancel_is_pending_np(), which
> > > would return the state of pending cancel. For some time I thought that
> > > this cannot work, because cancellation could happen between call to
> > > the cancel_is_pending() and next recvmmsg(). But, libc has a privilege
> > > of having access to the syscalls without libthr interposing, just
> > > call __sys_recvmmsg(), which would give EINTR on the cancel attempt.
> > > This is an implementation detail, but we can rely on it in implementation.
> > > In other words, the structure of the code would be like this
> > > 	for (i = 0; i < vlen; i++) {
> > > 		if (pthread_cancel_is_pending_np())
> > > 			goto out;
> >
> > Right after writing the text and hitting send, I realized that the
> > pthread_cancel_is_pending_np() is not needed at all. You get EINTR
> > from __sys_recvmsg() on the cancel attempt, so everything would just
> > work without the function.
> >
> > The crucial part is to use __sys_recvmsg instead of interposable
> > _recvmsg().
>
> This will typically work (if the cancellation occurs while blocked
> inside __sys_recvmsg()) but has the usual problem of relying on [EINTR]:
> lost wakeups. This is certainly less bad than using the interposable
> recvmsg(), though, which would discard the already received data.
>
> As a slight modification, the first recvmsg could use the interposable
> version, since there is no pending data at that point. This avoids
> needing to call pthread_testcancel() manually.
>
> The regular cancellation code closes this race window using the
> undocumented thr_wake() system call, on the thread itself, in the signal
> handler for the cancellation signal. This causes the next attempt to
> sleep(9) to fail with [EINTR]. (On another note, it appears to be
> possible for user code (cleanup handlers and pthread_key_create()
> destructors) to be called with thr_wake() still active, if the
> cancellation signal handler is called immediately after the cancellation
> point system call returns.)
>
> The race in recvmmsg could be removed using this mechanism but it
> requires either a separate version of recvmmsg in libthr or a new
> interface in libthr. I imagine the new interface as a new cancellation
> type which causes cancellation point functions such as recvmsg() to fail
> with a new errno when cancelled while leaving cancellation pending. This
> seems conceptually possible but adds some code to the common path for
> cancellation points. A new version of pthread_testcancel() with a return
> value would be needed.

I realized that the above may be interpreted as requiring cancellation
to be completely fixed before recvmmsg/sendmmsg are "done". That is not
my intention.
Given that cancellation is not commonly used in applications, I think an
approach based on __sys_recvmsg() is good enough.

-- 
Jilles Tjoelker
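
For illustration, the loop structure discussed in this message might be
sketched roughly as below. This is only a sketch under stated assumptions,
not the code that was committed to FreeBSD. The names mmsg_sketch,
recvmmsg_sketch, RAW_RECVMSG and demo_roundtrip are hypothetical. Inside
libc the later iterations would call the non-interposed __sys_recvmsg();
since that symbol is private to libc, the sketch substitutes plain
recvmsg() so that it builds as ordinary user code.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for struct mmsghdr, renamed so it cannot clash
 * with a definition supplied by the system headers. */
struct mmsg_sketch {
	struct msghdr	msg_hdr;	/* per-message header */
	ssize_t		msg_len;	/* bytes received for this message */
};

/*
 * ASSUMPTION: inside libc this would be __sys_recvmsg(), which is not a
 * cancellation point. Plain recvmsg() is used here only as a placeholder.
 */
#define	RAW_RECVMSG(s, m, f)	recvmsg((s), (m), (f))

static ssize_t
recvmmsg_sketch(int s, struct mmsg_sketch *vec, size_t vlen, int flags)
{
	size_t i;
	ssize_t n;

	assert(vec != NULL || vlen == 0);
	for (i = 0; i < vlen; i++) {
		if (i == 0) {
			/* No data is held yet, so the interposable call
			 * (a cancellation point) is safe to use here. */
			n = recvmsg(s, &vec[i].msg_hdr, flags);
		} else {
			/* Data is already held: avoid being cancelled. */
			n = RAW_RECVMSG(s, &vec[i].msg_hdr, flags);
		}
		if (n == -1) {
			if (i == 0)
				return (-1);	/* nothing received: report error */
			break;	/* e.g. EINTR: keep what was received;
				 * the error itself is dropped here */
		}
		vec[i].msg_len = n;
	}
	return ((ssize_t)i);
}

/* Sends two datagrams over a socketpair, reads both back in one call and
 * returns the total byte count, or -1 if anything fails. */
static int
demo_roundtrip(void)
{
	int sv[2];
	char b0[8], b1[8];
	struct iovec iov[2] = {
		{ .iov_base = b0, .iov_len = sizeof(b0) },
		{ .iov_base = b1, .iov_len = sizeof(b1) },
	};
	struct mmsg_sketch vec[2];
	ssize_t got = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
		return (-1);
	memset(vec, 0, sizeof(vec));
	vec[0].msg_hdr.msg_iov = &iov[0];
	vec[0].msg_hdr.msg_iovlen = 1;
	vec[1].msg_hdr.msg_iov = &iov[1];
	vec[1].msg_hdr.msg_iovlen = 1;
	if (send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cde", 3, 0) == 3)
		got = recvmmsg_sketch(sv[1], vec, 2, 0);
	close(sv[0]);
	close(sv[1]);
	if (got != 2)
		return (-1);
	return ((int)(vec[0].msg_len + vec[1].msg_len));
}
```

The point of the split is that a cancellation or signal arriving after the
first message only shortens the returned count; the messages already
stored in the vector are still delivered to the caller. The lost-wakeup
window described above remains in this sketch.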