From: John Baldwin <jhb@freebsd.org>
To: src-committers@freebsd.org
Cc: svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r299210 - in head/sys/dev/cxgbe: . tom
Date: Fri, 06 May 2016 17:52:15 -0700
Message-ID: <3138889.ZBJ52FyIMB@ralph.baldwin.cx>
In-Reply-To: <201605070033.u470XZCs075568@repo.freebsd.org>
References: <201605070033.u470XZCs075568@repo.freebsd.org>

On Saturday, May 07, 2016 12:33:35 AM John Baldwin wrote:
> Author: jhb
> Date: Sat May  7 00:33:35 2016
> New Revision: 299210
> URL: https://svnweb.freebsd.org/changeset/base/299210
>
> Log:
>   Use DDP to implement zerocopy TCP receive with aio_read().
>
>   Chelsio's TCP offload engine supports direct DMA of received TCP payload
>   into wired user buffers.  This feature is known as Direct-Data Placement.
>   However, to scale well the adapter needs to prepare buffers for DDP
>   before data arrives.  aio_read() is more amenable to this requirement than
>   read() as applications often call read() only after data is available in
>   the socket buffer.
>
>   When DDP is enabled, TOE sockets use the recently added pru_aio_queue
>   protocol hook to claim aio_read(2) requests instead of letting them use
>   the default AIO socket logic.  The DDP feature supports scheduling DMA
>   to two buffers at a time so that the second buffer is ready for use
>   after the first buffer is filled.  The aio/DDP code optimizes the case
>   of an application ping-ponging between two buffers (similar to the
>   zero-copy bpf(4) code) by keeping the two most recently used AIO buffers
>   wired.  If a buffer is reused, the aio/DDP code is able to reuse the
>   vm_page_t array as well as page pod mappings (a kind of MMU mapping the
>   Chelsio NIC uses to describe user buffers).  The generation of the
>   vmspace of the calling process is used in conjunction with the user
>   buffer's address and length to determine if a user buffer matches a
>   previously used buffer.  If an application queues a buffer for AIO that
>   does not match a previously used buffer then the least recently used
>   buffer is unwired before the new buffer is wired.
>   This ensures that no more than two user buffers per socket are ever wired.
>
>   Note that this feature is best suited to applications sending a steady
>   stream of data vs short bursts of traffic.
>
>   Discussed with:	np
>   Relnotes:	yes
>   Sponsored by:	Chelsio Communications

The primary tool I used for evaluating performance was netperf's TCP stream
test.  It is a best case for this feature (a constant stream of traffic), but
that is also the intended use case.

Using two 64K buffers in a ping-pong via aio_read() to receive a 40Gbps
stream used about two full CPUs (~190% CPU usage) on a single-package Intel
E5-1620 v3 @ 3.50GHz with the stock TCP stack.  Enabling TOE brings the usage
down to about 110% of a CPU.  With DDP, the usage is around 30% of a single
CPU.  With two 1MB buffers the stock and TOE numbers are about the same, but
the DDP usage drops to about 5% of a single CPU.

Note that these numbers are with aio_read().  read() fares a bit better (180%
for stock and 70% for TOE).  Before the AIO rework, trying to use aio_read()
with two buffers in a ping-pong used twice as much CPU as bare read(), but
aio_read() in general is now fairly comparable to read(), at least in terms
of CPU overhead.

-- 
John Baldwin
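
For illustration, a minimal userland sketch of the two-buffer ping-pong
receive pattern described above might look like the following.  It uses
portable POSIX AIO calls plus FreeBSD's aio_waitcomplete(2); the function and
buffer names are illustrative rather than taken from the commit, error
handling is abbreviated, and "sock" is assumed to be a connected TCP socket
on a TOE-capable cxgbe interface.

/*
 * Sketch of the ping-pong receive pattern: two fixed buffers are kept
 * queued with aio_read() so the kernel always has a second buffer ready
 * (and wired, with its page pods intact) while the first one fills.
 */
#include <aio.h>
#include <err.h>
#include <string.h>

#define	BUFSIZE	(64 * 1024)

static void
pingpong_receive(int sock)
{
	static char buf[2][BUFSIZE];	/* static: reused, never reallocated */
	struct aiocb cb[2], *done;
	ssize_t n;
	int i;

	/* Queue both buffers up front so the second is ready immediately. */
	memset(cb, 0, sizeof(cb));
	for (i = 0; i < 2; i++) {
		cb[i].aio_fildes = sock;
		cb[i].aio_buf = buf[i];
		cb[i].aio_nbytes = BUFSIZE;
		if (aio_read(&cb[i]) != 0)
			err(1, "aio_read");
	}

	for (;;) {
		/* Reap whichever request finishes first (FreeBSD-specific). */
		n = aio_waitcomplete(&done, NULL);
		if (n == -1)
			err(1, "aio_waitcomplete");
		if (n == 0)
			break;		/* connection closed */

		/* ... consume 'n' bytes from (char *)done->aio_buf here ... */

		/* Requeue the same buffer so its wiring can be reused. */
		done->aio_nbytes = BUFSIZE;
		if (aio_read(done) != 0)
			err(1, "aio_read requeue");
	}
}

Because the same two addresses are resubmitted over and over, each request
after the first two should hit the driver's cache of previously used buffers
rather than wiring new pages.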
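The buffer-reuse check the log describes keys a cached DDP buffer on the
calling process's vmspace generation together with the user address and
length of the buffer.  A rough sketch of that idea follows; the structure and
field names here are invented for illustration and are not the driver's
actual data structures.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative only -- not the cxgbe/tom code.  A previously wired buffer
 * is identified by the vmspace generation at the time it was wired plus
 * its user address and length, so a repeat aio_read() with the same buffer
 * can reuse the wired page array and page pod mapping.
 */
struct cached_ddp_buf {
	uint64_t	vm_generation;	/* vmspace generation when wired */
	uintptr_t	uaddr;		/* user virtual address */
	size_t		len;		/* buffer length */
	/* ... wired page array and page pod mapping would live here ... */
};

static bool
cached_buf_matches(const struct cached_ddp_buf *cb, uint64_t vm_generation,
    uintptr_t uaddr, size_t len)
{
	return (cb->vm_generation == vm_generation &&
	    cb->uaddr == uaddr && cb->len == len);
}

If no cached buffer matches an incoming request, the least recently used of
the two cached buffers is unwired and replaced, as described in the log.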