From owner-svn-src-head@FreeBSD.ORG Mon Nov 28 20:13:19 2011 Return-Path: Delivered-To: svn-src-head@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 69D8A1065678; Mon, 28 Nov 2011 20:13:19 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 2F7798FC14; Mon, 28 Nov 2011 20:13:19 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [96.47.65.170]) by cyrus.watson.org (Postfix) with ESMTPSA id A373F46B09; Mon, 28 Nov 2011 15:13:18 -0500 (EST) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 1CE16B96B; Mon, 28 Nov 2011 15:13:18 -0500 (EST) From: John Baldwin To: Lawrence Stewart Date: Mon, 28 Nov 2011 14:20:43 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p8; KDE/4.5.5; amd64; ; ) References: <201111210417.pAL4HOdi023556@svn.freebsd.org> <8805563E-60F9-4BF4-A292-1A43C4FA368A@cubinlab.ee.unimelb.edu.au> <4ECCA2A1.4020408@freebsd.org> In-Reply-To: <4ECCA2A1.4020408@freebsd.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201111281420.43310.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Mon, 28 Nov 2011 15:13:18 -0500 (EST) Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Julien Ridoux , Ben Kaduk Subject: Re: svn commit: r227778 - head/sys/net X-BeenThere: svn-src-head@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: SVN commit messages for the src tree for head/-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 20:13:19 -0000 On Wednesday, November 23, 2011 2:37:05 am Lawrence Stewart wrote: > On 11/23/11 17:42, Julien Ridoux wrote: > > > > On 23/11/2011, at 1:00 AM, Lawrence Stewart wrote: > > > >> On 11/23/11 00:30, John Baldwin wrote: > >>> Think of standalone modules that are not built as part of a > >>> kernel (e.g. 3rd party device drivers). In general we should > >>> avoid having structures change size for kernel options, > >>> especially common structures. It just adds lots of pain and > >>> suffering and complexity. We are stuck with it for PAE on i386 > >>> (which causes pain), and for LOCK_PROFILING (but that is > >>> sufficiently rare and expensive it seems to be ok). I think 8 > >>> bytes for bpf packet is not sufficiently expensive to justify the > >>> extra headache. Just always leave the new field in. > >> > >> hmm... Julien almost has a patch finished which accomplishes what > >> my most recent email in this thread describes. Julien, I suggest we > >> get it finished and follow up to this thread with a pointer to the > >> patch for people to look at. If there's still a strong feeling that > >> what it does is going to bring pain we can do away with the new > >> BPF_FFCOUNTER config option and have the bpf header struct just > >> grow by 8 bytes. > >> > >> Stay tuned... > > > > Thanks all for the feedback. With some delay, I have a patch against > > r227871 that implements what Lawrence proposed. You can find it > > here: > > http://www.cubinlab.ee.unimelb.edu.au/~jrid/patches/ffclock-bpf-header- r227871.patch > > There are a few nits, but the patch implements what I envisaged, thanks > Julien. > > > I have tested this under a few typical scenario, it works as > > expected but already brings some headaches (hence the long delay > > mentioned above :-)). > > > > I thought a bit more of user cases. I believe many of them call for > > having both feed-forward counter and its conversion in second be > > present in the BPF header. For example, this allows to have absolute > > packet departure/arrival times (as per usual), but also provides the > > opportunity to compute inter-arrival times accurately using the > > difference clock. There are other examples I can think of, and if one > > believe the feed-forward clock approach becomes more popular, such > > usages will be more and more common. > > > > Assuming the BPF header grows by 8 bytes independent of any kernel > > option, I admit that the current implementation is a bit ugly. The > > BPF structure is not nicely packed and looks clunky. Ideally, the > > feed-forward counter should be placed just below the bh_tstamp > > member, but this would require libpcap and all ports depending on it > > to be recompiled after this change. > > Even though it looks a bit gross, we would still add it at the end to > avoid gratuitously breaking binaries. We would then also add some > explicit padding in the struct to soak up the redundant space left in > between it and the second last struct member. If this is not something you expect to MFC, then I would just add it where it makes the most sense. One question I have is if this affects the file format of what tcpdump -w writes out and tcpdump -r reads in? If that is affected then changing this will need much more thought. If not, then I think it should be fixed "right" in 10. If you need to MFC it then you may need to do some gymnatics to preserve the ABI in stable branches, but we prefer to keep HEAD clean so we don't build up layer upon layer of compat hacks. > > What is your favourite option? > > FreeBSD parlance is to ask what colour you would like to paint the > bikeshed ;) > > As I've never experienced the pain John refers to, I'll defer to the > wisdom of others on whether the proposed patch will create pain down the > road. I think it's ok, but if consensus is 8bytes per packet isn't going > to break the bank, I guess we just go for it - but I guess I am cautious > about this route as we can push a lot of packets per second through the > stack. In my limited experience the limit on pushing pps through bpf(4) isn't due to the size of the bpf packet header itself but has more to do with disk I/O transactions, etc. -- John Baldwin