Date: Sat, 7 May 2016 10:56:58 -0700
From: Navdeep Parhar
To: John Baldwin
Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r299210 - in head/sys/dev/cxgbe: . tom

On Fri, May 06, 2016 at 05:52:15PM -0700, John Baldwin wrote:
> On Saturday, May 07, 2016 12:33:35 AM John Baldwin wrote:
> > Author: jhb
> > Date: Sat May 7 00:33:35 2016
> > New Revision: 299210
> > URL: https://svnweb.freebsd.org/changeset/base/299210
> >
> > Log:
> >   Use DDP to implement zerocopy TCP receive with aio_read().
> >
> >   Chelsio's TCP offload engine supports direct DMA of received TCP payload
> >   into wired user buffers.  This feature is known as Direct-Data Placement.
> >   However, to scale well the adapter needs to prepare buffers for DDP
> >   before data arrives.  aio_read() is more amenable to this requirement than
> >   read() as applications often call read() only after data is available in
> >   the socket buffer.
> >
> >   When DDP is enabled, TOE sockets use the recently added pru_aio_queue
> >   protocol hook to claim aio_read(2) requests instead of letting them use
> >   the default AIO socket logic.  The DDP feature supports scheduling DMA
> >   to two buffers at a time so that the second buffer is ready for use
> >   after the first buffer is filled.  The aio/DDP code optimizes the case
> >   of an application ping-ponging between two buffers (similar to the
> >   zero-copy bpf(4) code) by keeping the two most recently used AIO buffers
> >   wired.  If a buffer is reused, the aio/DDP code is able to reuse the
> >   vm_page_t array as well as page pod mappings (a kind of MMU mapping the
> >   Chelsio NIC uses to describe user buffers).  The generation of the
> >   vmspace of the calling process is used in conjunction with the user
> >   buffer's address and length to determine if a user buffer matches a
> >   previously used buffer.  If an application queues a buffer for AIO that
> >   does not match a previously used buffer then the least recently used
> >   buffer is unwired before the new buffer is wired.  This ensures that no
> >   more than two user buffers per socket are ever wired.
> >
> >   Note that this feature is best suited to applications sending a steady
> >   stream of data vs short bursts of traffic.
> >
> >   Discussed with:  np
> >   Relnotes:        yes
> >   Sponsored by:    Chelsio Communications
>
> The primary tool I used for evaluating performance was netperf's TCP stream
> test.  It is a best case for this (constant stream of traffic), but that is
> also the intended use case for this feature.
>
> Using 2 64K buffers in a ping-pong via aio_read() to receive a 40Gbps stream
> used about two full CPUs (~190% CPU usage) on a single-package
> Intel E5-1620 v3 @ 3.50GHz with the stock TCP stack.  Enabling TOE brings the
> usage down to about 110% CPU.  With DDP, the usage is around 30% of a single
> CPU.  With two 1MB buffers the stock and TOE numbers are about the same,
> but the DDP usage is about 5% of a single CPU.
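(For illustration, the two-buffer ping-pong pattern described above would look
roughly like the sketch below in application code.  This is a simplified,
hypothetical example rather than the actual test code: the receive_stream()
helper, the 64K buffer size, and the plain POSIX aio_suspend()/aio_return()
completion handling are assumptions about how an application would typically
drive aio_read(2) on a connected TCP socket.)

/*
 * Hypothetical sketch of the two-buffer "ping-pong" receive pattern:
 * keep two aio_read(2) requests queued on a connected TCP socket so the
 * adapter always has a buffer ready for placement.  Error handling and
 * connection setup are simplified; this is not code from the commit.
 */
#include <sys/types.h>
#include <aio.h>
#include <err.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define	BUFSIZE	(64 * 1024)		/* 64K buffers, as in the test above */

static void
receive_stream(int sock)
{
	struct aiocb cb[2];
	const struct aiocb *list[1];
	char *buf[2];
	ssize_t n;
	int i;

	/* Queue an aio_read(2) on each of the two buffers up front. */
	for (i = 0; i < 2; i++) {
		buf[i] = malloc(BUFSIZE);
		if (buf[i] == NULL)
			err(1, "malloc");
		memset(&cb[i], 0, sizeof(cb[i]));
		cb[i].aio_fildes = sock;
		cb[i].aio_buf = buf[i];
		cb[i].aio_nbytes = BUFSIZE;
		if (aio_read(&cb[i]) != 0)
			err(1, "aio_read");
	}

	/* Ping-pong: wait for the oldest request, consume it, requeue it. */
	for (i = 0;; i = 1 - i) {
		list[0] = &cb[i];
		while (aio_error(&cb[i]) == EINPROGRESS) {
			if (aio_suspend(list, 1, NULL) != 0 && errno != EINTR)
				err(1, "aio_suspend");
		}
		n = aio_return(&cb[i]);
		if (n <= 0)
			break;		/* EOF or error. */

		/* ... process n bytes of payload in buf[i] here ... */

		/* Requeue the same buffer on the same aiocb. */
		if (aio_read(&cb[i]) != 0)
			err(1, "aio_read");
	}
	free(buf[0]);
	free(buf[1]);
}

Requeueing the same two buffers on every iteration is what lets the driver
keep them wired and reuse the vm_page_t arrays and page pod mappings
described in the log above.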
5% of a single core on modern systems (with 4+ cores) means top/vmstat will
report around 1% aggregate CPU use or less while receiving a full 40Gbps
line rate @ 1500 MTU.

The idea here is to let applications written against the standard BSD sockets
and POSIX AIO APIs make full use of hardware TCP zero-copy features when
available.  Zero copy on the transmit side will also be implemented (it's
simpler than the receive side) in time for FreeBSD 11.

Regards,
Navdeep