From: John Baldwin
To: Slawa Olhovchenkov
Cc: arch@freebsd.org
Subject: Re: Refactoring asynchronous I/O
Date: Wed, 27 Jan 2016 16:44:28 -0800

On Thursday, January 28, 2016 02:18:17 AM Slawa Olhovchenkov wrote:
> On Wed, Jan 27, 2016 at 01:16:37PM -0800, John Baldwin wrote:
>
> > On Thursday, January 28, 2016 12:04:20 AM Slawa Olhovchenkov wrote:
> > > On Wed, Jan 27, 2016 at 09:52:12AM -0800, John Baldwin wrote:
> > > >
> > > > On Wednesday, January 27, 2016 01:52:05 PM Slawa Olhovchenkov wrote:
> > > > > On Tue, Jan 26, 2016 at 05:39:03PM -0800, John Baldwin wrote:
> > > > > >
> > > > > > The original motivation for my changes is to support efficient zero-copy
> > > > > > receive for TOE using Chelsio T4/T5 adapters. However, read() is ill
> > > > >
> > > > > I understand that this is not your work, but: what about (theoretical) async
> > > > > open/close/unlink/etc?
> > > >
> > > > Implementing more asynchronous operations is orthogonal to this. It
> > > > would perhaps be a bit simpler to implement these in the new model
> > > > since most of the logic would live in a vnode-specific aio_queue
> > > > method in vfs_vnops.c. However, the current AIO approach is to add a
> > > > new system call for each async system call (e.g. aio_open()). You
> > > > would then create an internal LIO opcode (e.g. LIO_OPEN). The vnode
> > > > aio hook would then have to support LIO_OPEN requests and return the
> > > > opened fd via aio_complete(). Async stat / open might be nice for
> > > > network filesystems in particular. I've known of programs forking
> > > > separate threads just to do open/fstat of NFS files to achieve the
> > > > equivalent of aio_open() / aio_stat().
> > >
> > > Some problems exist for open()/unlink/rename/etc -- you can't use
> > > fd-related semantics.
> >
> > Mmmm. We have an aio_mlock(). aio_open() would require more of a special
> > case like aio_mlock(). It's still doable, but it would not go via the
> > fileop, yes. fstat could go via the fileop, but a path-based stat would
> > be akin to aio_open().
>
> aio_rename requires yet more special handling.
> As I see it, this can't be packed into the current structures (aiocb and
> perhaps sigevent). I don't see space for multiple paths. I don't
> see space for an fd return.
>
> Some semantics would need to change (disallow some notifications; for
> example, would only SIGEV_THREAD be allowed? How would information
> about the called aio operation be passed?).
>
> Also, might there be some problems inside the kernel for fd-less operations?

The kernel side of aiocb is free to hold additional information, and you
could always pass additional information via arguments, e.g.
aio_rename(aiocb, from, to).  (Alternatively, you could define aio_buf as
pointing to a structure that holds arguments.)

-- 
John Baldwin
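
To make the aio_buf alternative concrete, here is a minimal userland sketch
of how a request could carry two paths in the existing struct aiocb.  None
of this exists in FreeBSD or POSIX: LIO_RENAME_SKETCH, struct
aio_rename_args, aio_rename_prep() and aio_rename_run() are hypothetical
names invented for illustration, and aio_rename_run() simply calls rename()
synchronously as a stand-in for whatever the kernel side would do.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical opcode; not defined by POSIX or FreeBSD. */
#define	LIO_RENAME_SKETCH	1000

/* Hypothetical argument block that aio_buf would point to. */
struct aio_rename_args {
	const char *from;
	const char *to;
};

/* Fill in an aiocb describing the hypothetical rename request. */
static void
aio_rename_prep(struct aiocb *cb, struct aio_rename_args *args)
{
	assert(cb != NULL && args != NULL);
	memset(cb, 0, sizeof(*cb));
	cb->aio_fildes = -1;		/* fd-less operation */
	cb->aio_lio_opcode = LIO_RENAME_SKETCH;
	cb->aio_buf = args;		/* arguments travel via aio_buf */
	cb->aio_nbytes = sizeof(*args);
}

/*
 * Stand-in for the kernel side.  Validates the request, then performs
 * the rename synchronously.  Returns 0 on success or an errno value.
 */
static int
aio_rename_run(const struct aiocb *cb)
{
	const struct aio_rename_args *args;

	if (cb->aio_lio_opcode != LIO_RENAME_SKETCH)
		return (EINVAL);
	args = (const struct aio_rename_args *)cb->aio_buf;
	if (args == NULL || cb->aio_nbytes != sizeof(*args))
		return (EINVAL);
	return (rename(args->from, args->to) == 0 ? 0 : errno);
}
```

A caller would fill in a struct aio_rename_args, pass it to
aio_rename_prep(), and hand the resulting aiocb to the submission path.
Using aio_nbytes as a size check is just one way to let the receiving side
reject a mismatched argument block; the sketch says nothing about how
completion would be reported.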