From: Slawa Olhovchenkov
To: John Baldwin
Cc: arch@freebsd.org
Date: Thu, 28 Jan 2016 02:18:17 +0300
Subject: Re: Refactoring asynchronous I/O
Message-ID: <20160127231817.GA88527@zxy.spb.ru>
References: <2793494.0Z1kBV82mT@ralph.baldwin.cx> <1723457.HAUy43H1XN@ralph.baldwin.cx> <20160127210420.GZ88527@zxy.spb.ru> <3640314.Du7Q0QmH0W@ralph.baldwin.cx>
In-Reply-To: <3640314.Du7Q0QmH0W@ralph.baldwin.cx>

On Wed, Jan 27, 2016 at 01:16:37PM -0800, John Baldwin wrote:

> On Thursday, January 28, 2016 12:04:20 AM Slawa Olhovchenkov wrote:
> > On Wed, Jan 27, 2016 at 09:52:12AM -0800, John Baldwin wrote:
> >
> > > On Wednesday, January 27, 2016 01:52:05 PM Slawa Olhovchenkov wrote:
> > > > On Tue, Jan 26, 2016 at 05:39:03PM -0800, John Baldwin wrote:
> > > >
> > > > > The original motivation for my changes is to support efficient zero-copy
> > > > > receive for TOE using Chelsio T4/T5 adapters.  However, read() is ill
> > > >
> > > > I understand that this is not your work, but: what about (theoretical) async
> > > > open/close/unlink/etc?
> > >
> > > Implementing more asynchronous operations is orthogonal to this.  It
> > > would perhaps be a bit simpler to implement these in the new model
> > > since most of the logic would live in a vnode-specific aio_queue
> > > method in vfs_vnops.c.  However, the current AIO approach is to add a
> > > new system call for each async system call (e.g. aio_open()).  You
> > > would then create an internal LIO opcode (e.g. LIO_OPEN).  The vnode
> > > aio hook would then have to support LIO_OPEN requests and return the
> > > opened fd via aio_complete().  Async stat / open might be nice for
> > > network filesystems in particular.  I've known of programs forking
> > > separate threads just to do open/fstat of NFS files to achieve the
> > > equivalent of aio_open() / aio_stat().
> >
> > Some problems exist for open()/unlink/rename/etc -- you can't use
> > fd-related semantics.
>
> Mmmm.  We have an aio_mlock().  aio_open() would require more of a special
> case like aio_mlock().  It's still doable, but it would not go via the
> fileop, yes.
> fstat could go via the fileop, but a path-based stat would
> be akin to aio_open().

aio_rename() would require yet more special handling.

As I see it, this can't be packed into the current structures (aiocb
and perhaps sigevent). I don't see space for multiple paths. I don't
see space for returning an fd.

Some semantics would need to change. For example, would some
notifications have to be disallowed, so that only SIGEV_THREAD is
allowed? How would information about the called aio operation be
passed?

Also, might there be some problems inside the kernel for fd-less
operations?
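
To make the aiocb point concrete, here is a rough userland sketch in C
of what a path-based request would have to carry. Everything in it is
an assumption for illustration only: struct aio_path_req, the
AIO_X_OPEN / AIO_X_RENAME opcodes and aio_path_run() are hypothetical
names, not an existing FreeBSD or POSIX interface. The request is run
synchronously here, just to show where two paths and the returned fd
would have to live.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical opcodes; real AIO uses LIO_* values in aio_lio_opcode. */
enum aio_path_op { AIO_X_OPEN, AIO_X_RENAME };

/*
 * Hypothetical request block for fd-less operations.  struct aiocb is
 * built around one descriptor (aio_fildes) plus a buffer; it has no
 * member for a path, let alone two, and no slot for a new fd.
 */
struct aio_path_req {
	enum aio_path_op op;
	const char *path;	/* open target, or rename source */
	const char *path2;	/* rename destination; unused for open */
	int flags;		/* open(2) flags */
	mode_t mode;		/* open(2) mode, used with O_CREAT */
	int result;		/* new fd for open, 0 for rename, -1 on error */
	int error;		/* errno value recorded at completion */
	int done;		/* set to 1 once the request has completed */
};

/*
 * Stand-in for the kernel side: perform the operation synchronously
 * and record the completion status in the request itself.
 */
static void
aio_path_run(struct aio_path_req *r)
{
	int rv;

	assert(r != NULL);
	switch (r->op) {
	case AIO_X_OPEN:
		rv = open(r->path, r->flags, r->mode);
		break;
	case AIO_X_RENAME:
		rv = rename(r->path, r->path2);
		break;
	default:
		errno = EINVAL;
		rv = -1;
		break;
	}
	r->result = rv;
	r->error = (rv == -1) ? errno : 0;
	r->done = 1;
}
```

A caller would fill in op, path and flags, hand the request over, and
read the new fd from result once done is set. How the notification
side (sigevent) would tell the caller which operation completed is
exactly the open question above; the sketch does not answer it.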