From: John Baldwin
To: Slawa Olhovchenkov
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject: Re: svn commit: r306661 - in stable/11/sys/dev/cxgbe: . tom
Date: Mon, 10 Oct 2016 11:39:24 -0700
Message-ID: <5243602.cilUCEM5cP@ralph.baldwin.cx>
In-Reply-To: <20161010182821.GZ54003@zxy.spb.ru>
References: <201610032315.u93NFiHE057529@repo.freebsd.org> <1660024.uzJn2AtV1k@ralph.baldwin.cx> <20161010182821.GZ54003@zxy.spb.ru>

On Monday, October 10, 2016 09:28:21 PM Slawa Olhovchenkov wrote:
> On Mon, Oct 10, 2016 at 10:46:27AM -0700, John Baldwin wrote:
>
> > On Monday, October 10, 2016 02:09:01 PM Slawa Olhovchenkov wrote:
> > > On Mon, Oct 03, 2016 at 11:15:44PM +0000, John Baldwin wrote:
> > >
> > > > Author: jhb
> > > > Date: Mon Oct 3 23:15:44 2016
> > > > New Revision: 306661
> > > > URL: https://svnweb.freebsd.org/changeset/base/306661
> > > >
> > > > Log:
> > > >   MFC 303405: Add support for zero-copy aio_write() on TOE sockets.
> > > >
> > > >   AIO write requests for a TOE socket on a Chelsio T4+ adapter can now
> > > >   DMA directly from the user-supplied buffer. This is implemented by
> > > >   wiring the pages backing the user-supplied buffer and queueing special
> > > >   mbufs backed by raw VM pages to the socket buffer. The TOE code
> > > >   recognizes these special mbufs and builds a sglist from the VM page
> > > >   array associated with the mbuf when queueing a work request to the TOE.
> > > >
> > > >   Because these mbufs do not have an associated virtual address, m_data
> > > >   is not valid. Thus, the AIO handler does not invoke sosend() directly
> > > >   for these mbufs but instead inlines portions of sosend_generic() and
> > > >   tcp_usr_send().
> > > >
> > > >   An aiotx_buffer structure is used to describe the user buffer (e.g.
> > > >   it holds the array of VM pages and a reference to the AIO job). The
> > > >   special mbufs reference this structure via m_ext. Note that a single
> > > >   job might be split across multiple mbufs (e.g. if it is larger than
> > > >   the socket buffer size). The 'ext_arg2' member of each mbuf gives an
> > > >   offset relative to the backing aiotx_buffer. The AIO job associated
> > > >   with an aiotx_buffer structure is completed when the last reference to
> > > >   the structure is released.
> > > >
> > > >   Zero-copy aio_write()'s for connections associated with a given
> > > >   adapter can be enabled/disabled at runtime via the
> > > >   'dev.t[45]nex.N.toe.tx_zcopy' sysctl.
> > > >
> > > >   Sponsored by: Chelsio Communications
> > >
> > > Do you have any public available application patches for support this?
> > > May be nginx?
> >
> > Applications need to use aio_read(), ideally with at least 2 buffers (so
> > queue two reads, then when a read completes, consume the data and do the
> > next read). I'm not sure nginx will find this but so useful as web servers
> > tend to send a lot more data than they receive. The only software I have
> > patched explicitly for this is netperf.
>
> Hm, this is like only aio_read() on sokets give performance boost, not
> aio_write()?

Sorry, I was confused about the commit; this one does affect aio_write()
(earlier changes also permit zero-copy receive via aio_read()). However,
as you noted in your reply to Navdeep, nginx seems to support AIO only on
the backing files for static content. It would need changes to support
using aio_write() on sockets (similar to using sendfile).

-- 
John Baldwin