From owner-freebsd-net@freebsd.org Fri Jan  8 07:24:37 2016
Date: Fri, 8 Jan 2016 18:24:26 +1100 (EST)
From: Bruce Evans <brde@optusnet.com.au>
To: Konstantin Belousov
Cc: Luigi Rizzo, Mark Delany, Boris Astardzhiev, freebsd-net@freebsd.org
Subject: Re: Does FreeBSD have sendmmsg or recvmmsg system calls?
In-Reply-To: <20160107192840.GF3625@kib.kiev.ua>
Message-ID: <20160108172323.W1815@besplex.bde.org>

On Thu, 7 Jan 2016, Konstantin Belousov wrote:

> On Thu, Jan 07, 2016 at 10:31:13AM -0800, Luigi Rizzo wrote:
>> ...
>> What we need first is experimental data that shows a performance benefit
>> compared to looping around the single-packet syscall.  Then we can decide
>> how to proceed.
>
> This is about performance.

I don't really need the performance, but I have done a lot of measurements
of the sendto() loop in an old version of ttcp and know how it works.  It
works poorly.  It is very slow, and good error handling for ENOBUFS is
impossible: select() doesn't work for this case, so the only ways to handle
ENOBUFS are to busy-wait, or to use a timeout that is neither too long nor
so short that the kernel doesn't support it (at which point it reduces to
busy-waiting anyway).

All of the netrate utilities and newer versions of ttcp have similar
problems.  netblast produces interesting statistics by dropping packets
differently.
netrate tends to reach the limits of timeout granularity before producing
anything interesting.

Syscall overheads and other per-packet costs are enormous.  I get a 10%
speedup just by changing the malloc() in sendit() + getsockaddr() to a
stack variable.  The full 10% only shows up in the non-useful case of
_dropped_ packets (ones that are passed to the driver but dropped there
due to ENOBUFS).  Counting dropped (output) packets is not completely
useless for packet-blasting benchmarks: if the NIC can't reach line rate,
then dropped packets are your main measure of spare capacity in the
network stack.

Network stack overheads are also enormous.  They seem to have
approximately doubled since FreeBSD-5.  FreeBSD-5 drops packets better by
peeking at the ifq early and not sending the packets down to the driver
if they would be dropped there.  This frees resources for doing more
useful things.  Another 10-20%.  Batching won't help much here, except
that it almost requires better ENOBUFS handling, which can be done much
more easily in the kernel.

sm@ pointed me to kttcp in /usr/src/tools.  I didn't get around to trying
it, and it doesn't seem to have been maintained there.  I thought that it
tests at the driver level, but it actually loops doing sosend/receive(),
so it is too high-level for driver testing but at about the level of a
kernel sendmmsg().  Anyway:

- there must be an easy 10-30% to be gained by doing sendmmsg() in the
  kernel for just a few more than 1 message at a time
- someone should update and test kttcp.  This gives a quick test for the
  batch performance that can be obtained without changes deep in the
  stack.

Bruce