Date: Tue, 6 Jan 2015 19:42:20 +0100
From: Luigi Rizzo
To: Adrian Chadd
Cc: Vincenzo Maffione, "freebsd-net@freebsd.org"
Subject: Re: netmap over virtio giving packets with extra 12 bytes
Message-ID: <20150106184220.GA35485@onelab2.iet.unipi.it>

On Tue, Jan 06, 2015 at 10:15:02AM -0800, Adrian Chadd wrote:
...
> This won't be the first time that there'll be useful data at the front
> end of an RX mbuf that isn't related to the mbuf payload.
>
> It'd be nice if there were something in each rx ring slot saying how
> far to skip into the buffer to get the beginning of the packet.

I am not opposed in principle; this is something we have been
looking at since day one. The blocking issue is that incompatible
hw constraints make it hard to make a decent choice. Examples:

1. the rx buffer size you pass to ixgbe must be a power of two.
   If you want to write at some offset into the netmap buffer,
   you need to allocate one twice the size you pass to the driver.

2. some NICs may want buffers aligned to 4, 8, or 16 bytes, so input
   offsets for headers cannot be arbitrary (12 is almost as bad as 14!)

3. irrespective of functionality, performance drops badly
   with small packets (where it matters the most) when
   buffers are not aligned to 64-byte boundaries.

The above makes me think that for small packets, copying is the only
reasonable way to go, and for large packets I have no idea how to deal
with #1 and #2 without having to do scatter-gather.

If you have a good suggestion please speak up.

cheers
luigi
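
To make constraints #1 and #2 above concrete, here is a minimal C sketch.
It is illustrative only: the helper names (next_pow2, alloc_size_for_offset,
offset_ok) are hypothetical and not part of netmap or any driver, and it
assumes buffers are allocated in power-of-two sizes, which is one reading
of constraint #1.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of the netmap API. */

/* Smallest power of two that is >= v (for v > 0). */
static size_t next_pow2(size_t v)
{
    size_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/*
 * Constraint #1: the driver is given a power-of-two rx size (hw_size).
 * ASSUMPTION: buffers are themselves allocated in power-of-two sizes.
 * Writing at a nonzero offset then pushes the allocation up to the
 * next power of two, i.e. twice hw_size for any small offset.
 */
static size_t alloc_size_for_offset(size_t hw_size, size_t offset)
{
    return next_pow2(hw_size + offset);
}

/*
 * Constraint #2: the offset must keep the DMA address aligned.
 * align must be a power of two. Returns nonzero if offset is usable.
 */
static int offset_ok(size_t offset, size_t align)
{
    return (offset & (align - 1)) == 0;
}
```

Under these assumptions, alloc_size_for_offset(2048, 12) yields 4096, and
an offset of 12 passes offset_ok() for 4-byte alignment but fails for 8
and 16, while 14 already fails at 4.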