From owner-freebsd-arch@freebsd.org Wed Feb 10 22:58:48 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 547E5AA52F3 for ; Wed, 10 Feb 2016 22:58:48 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mailman.ysv.freebsd.org (mailman.ysv.freebsd.org [IPv6:2001:1900:2254:206a::50:5]) by mx1.freebsd.org (Postfix) with ESMTP id 3A5671F99 for ; Wed, 10 Feb 2016 22:58:48 +0000 (UTC) (envelope-from jilles@stack.nl) Received: by mailman.ysv.freebsd.org (Postfix) id 3877EAA52F0; Wed, 10 Feb 2016 22:58:48 +0000 (UTC) Delivered-To: arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 1E286AA52EF for ; Wed, 10 Feb 2016 22:58:48 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client CN "mailhost.stack.nl", Issuer "CA Cert Signing Authority" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id AACDA1F98; Wed, 10 Feb 2016 22:58:47 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from toad2.stack.nl (toad2.stack.nl [IPv6:2001:610:1108:5010::161]) by mx1.stack.nl (Postfix) with ESMTP id 8F3EEB80A0; Wed, 10 Feb 2016 23:58:44 +0100 (CET) Received: by toad2.stack.nl (Postfix, from userid 1677) id 949F9892F8; Wed, 10 Feb 2016 23:58:44 +0100 (CET) Date: Wed, 10 Feb 2016 23:58:44 +0100 From: Jilles Tjoelker To: John Baldwin Cc: arch@freebsd.org Subject: Re: Refactoring asynchronous I/O Message-ID: <20160210225844.GA89743@stack.nl> References: <2793494.0Z1kBV82mT@ralph.baldwin.cx> <9227739.EqUaAQ57pU@ralph.baldwin.cx> <20160205213212.GA97435@stack.nl> <2544692.uejYAA75dx@ralph.baldwin.cx> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2544692.uejYAA75dx@ralph.baldwin.cx> User-Agent: Mutt/1.5.23 (2014-03-12) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Feb 2016 22:58:48 -0000 On Mon, Feb 08, 2016 at 11:22:25AM -0800, John Baldwin wrote: > On Friday, February 05, 2016 10:32:12 PM Jilles Tjoelker wrote: > > On Fri, Feb 05, 2016 at 12:21:52PM -0800, John Baldwin wrote: > > > On Monday, February 01, 2016 12:02:14 AM Jilles Tjoelker wrote: > > > > On Tue, Jan 26, 2016 at 05:39:03PM -0800, John Baldwin wrote: > > > > > Note that binding the AIO support to a new fileop does mean that > > > > > the AIO code now becomes mandatory (rather than optional). We > > > > > could perhaps make the system calls continue to be optional if > > > > > people really need that, but the guts of the code will now need to > > > > > always be in the kernel. > > > > Enabling this by default is OK with me as long as the easy ways to get a > > > > stuck process are at least disabled by default. Currently, a process > > > > gets stuck forever if it has an AIO request from or to a pipe that will > > > > never complete. An AIO daemon should not be allowed to perform an > > > > unbounded sleep such as for a pipe (NFS server should be OK). > > > One thing I could do is split vfs_aio.c into two files: kern_aio.c that > > > holds the "library" such as aio_aqueue() / aio_complete(), etc. and a > > > sys_aio.c that is just the system calls. kern_aio.c would be standard, > > > but sys_aio.c could still be optional and aio.ko would contain it. > > > This would still make AIO optional, though aio.ko would be fairly small, > > > so not having it probably wouldn't save much in terms of size. > > > Does this seem reasonable or is a trivial aio.ko not worth it? > > It is one possible option. Another option is to refuse AIO for kinds of > > file that have not been vetted to avoid blocking an AIO daemon for too > > long. "Kinds of file" is about the code that will be executed to do I/O, > > so that, for example, /dev/klog and /dev/ttyv0 are different kinds. > > Depending on whether this restriction breaks existing code, it may need > > to be a sysctl. > This doesn't seem to be very easy to implement, especially if it should > vary by character device. It sounds like what you would do today is > to allow mlock() and AIO on sockets and maybe the fast physio path > and disable everything else by default for now? This would be slightly > more feasible than something more fine-grained. A sysctl would allow > "full" AIO use that would disable these checks? It does not seem that complicated to me. A check can be inserted before placing a job into aio_jobs (in aio_queue_file() in the patched code). If the code can detect some safe cases, they could be allowed; otherwise, the AIO daemons will not be used. I'm assuming that the code in the new fo_aio_queue file ops behaves properly and will not cause AIO daemons to get stuck. Note that this means there is no easy way to do kernel AIO on a kind of file without specific support in the code for that kind of file. I think the risk of a stuck process (that may even cause shutdown to hang indefinitely) is bad enough to forbid it by default. > > All of this would not be a problem if PCATCH worked in AIO daemons > > (treat aio_cancel like a pending signal in a user process) but the last > > time I looked this appeared quite hard to fix. > The cancel routines approach is what I'm doing to try to resolve this, > but it means handling the case explicitly in the various backends. OK. -- Jilles Tjoelker