Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 23 Nov 2011 18:37:05 +1100
From:      Lawrence Stewart <lstewart@freebsd.org>
To:        Julien Ridoux <jrid@cubinlab.ee.unimelb.edu.au>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Ben Kaduk <minimarmot@gmail.com>, John Baldwin <jhb@FreeBSD.org>
Subject:   Re: svn commit: r227778 - head/sys/net
Message-ID:  <4ECCA2A1.4020408@freebsd.org>
In-Reply-To: <8805563E-60F9-4BF4-A292-1A43C4FA368A@cubinlab.ee.unimelb.edu.au>
References:  <201111210417.pAL4HOdi023556@svn.freebsd.org> <648D11A8-3636-49E5-BF20-83E4EA87242C@cubinlab.ee.unimelb.edu.au> <4EC9FD8A.5040401@freebsd.org> <201111220830.05029.jhb@freebsd.org> <4ECBAB19.4010907@freebsd.org> <8805563E-60F9-4BF4-A292-1A43C4FA368A@cubinlab.ee.unimelb.edu.au>

next in thread | previous in thread | raw e-mail | index | archive | help
On 11/23/11 17:42, Julien Ridoux wrote:
>
> On 23/11/2011, at 1:00 AM, Lawrence Stewart wrote:
>
>> On 11/23/11 00:30, John Baldwin wrote:
>>> On Monday, November 21, 2011 2:28:10 am Lawrence Stewart wrote:
>>>> On 11/21/11 17:18, Julien Ridoux wrote:
>>>>>
>>>>> On 21/11/2011, at 4:39 PM, Lawrence Stewart wrote:
>>>>>
>>>>>> On 11/21/11 16:12, Ben Kaduk wrote:
>>>>>>> On Sun, Nov 20, 2011 at 11:17 PM, Lawrence
>>>>>>> Stewart<lstewart@freebsd.org>     wrote:
>>>>>>>> Author: lstewart Date: Mon Nov 21 04:17:24 2011 New
>>>>>>>> Revision: 227778 URL:
>>>>>>>> http://svn.freebsd.org/changeset/base/227778
>>>>>>>>
>>>>>>>> Log: - When feed-forward clock support is compiled in,
>>>>>>>> change the BPF header to contain both a regular
>>>>>>>> timestamp obtained from the system clock and the
>>>>>>>> current feed-forward ffcounter value. This enables new
>>>>>>>> possibilities including
>>>>>>>
>>>>>>> Is it really necessary to make the ABI dependent on a
>>>>>>> kernel configuration option?  This causes all sorts of
>>>>>>> headaches if loadable modules ever want to use that ABI,
>>>>>>> something that we just ran into with vm_page_t and
>>>>>>> friends and had a long thread on -current about.
>>>>>>
>>>>>> Fair question. Julien, if pcap and other consumers will
>>>>>> happily ignore the new ffcount_stamp member in the bpf
>>>>>> header, is there any reason to conditionally add the
>>>>>> ffcounter into the header struct?
>>>>>
>>>>> It is a valid question indeed. The feedback I have received
>>>>> so far was to not have the feed-forward clock support be a
>>>>> default kernel configuration option. What follows is based on
>>>>> this assumption.
>>>>>
>>>>> The commit (r227747) introduces sysctl that are conditioned
>>>>> by the same "FFCLOCK" kernel configuration option. If a
>>>>> loadable module tests for the presence of this sysctl, it
>>>>> will know if the ffcount_stamp member is available. Is it too
>>>>> much of a hack?
>>>>>
>>>>> Alternatively, if the ffcounter is added to the bpf header
>>>>> unconditionally, the ffcount_stamp member can be set to 0.
>>>>> Loadable modules will then see a consistent ABI but will
>>>>> retrieve a meaningless value.
>>>>>
>>>>> I am not sure which option makes more sense, any preference?
>>>>
>>>> If I understand the issues correctly, I think the appropriate
>>>> path forward is to remove the conditional change to the bpf
>>>> header and have ffcount_stamp become a permanent member of the
>>>> struct. We'll just leave the member uninitialised in the
>>>> !FFCLOCK case. This change will make the patch un-MFCable, but
>>>> I think that's ok.
>>>>
>>>> As to the issue of how a kernel module would detect if it's
>>>> being loaded into a FFCLOCK enabled kernel, why wouldn't we
>>>> expect modules to "#include opt_ffclock.h" and conditionally
>>>> compile code based on FFCLOCK being defined? Is there a use
>>>> case for run-time (as opposed to compile-time) module detection
>>>> of feed-forward clock capabilities?
>>>
>>> Think of standalone modules that are not built as part of a
>>> kernel (e.g. 3rd party device drivers).  In general we should
>>> avoid having structures change size for kernel options,
>>> especially common structures.  It just adds lots of pain and
>>> suffering and complexity.  We are stuck with it for PAE on i386
>>> (which causes pain), and for LOCK_PROFILING (but that is
>>> sufficiently rare and expensive it seems to be ok).  I think 8
>>> bytes for bpf packet is not sufficiently expensive to justify the
>>> extra headache.  Just always leave the new field in.
>>
>> hmm... Julien almost has a patch finished which accomplishes what
>> my most recent email in this thread describes. Julien, I suggest we
>> get it finished and follow up to this thread with a pointer to the
>> patch for people to look at. If there's still a strong feeling that
>> what it does is going to bring pain we can do away with the new
>> BPF_FFCOUNTER config option and have the bpf header struct just
>> grow by 8 bytes.
>>
>> Stay tuned...
>
> Thanks all for the feedback. With some delay, I have a patch against
> r227871 that implements what Lawrence proposed. You can find it
> here:
> http://www.cubinlab.ee.unimelb.edu.au/~jrid/patches/ffclock-bpf-header-r227871.patch

There are a few nits, but the patch implements what I envisaged, thanks 
Julien.

>  I have tested this under a few typical scenario, it works as
> expected but already brings some headaches (hence the long delay
> mentioned above :-)).
>
> I thought a bit more of user cases. I believe many of them call for
> having both feed-forward counter and its conversion in second be
> present in the BPF header. For example, this allows to have absolute
> packet departure/arrival times (as per usual), but also provides the
> opportunity to compute inter-arrival times accurately using the
> difference clock. There are other examples I can think of, and if one
> believe the feed-forward clock approach becomes more popular, such
> usages will be more and more common.
>
> Assuming the BPF header grows by 8 bytes independent of any kernel
> option, I admit that the current implementation is a bit ugly. The
> BPF structure is not nicely packed and looks clunky. Ideally, the
> feed-forward counter should be placed just below the bh_tstamp
> member, but this would require libpcap and all ports depending on it
> to be recompiled after this change.

Even though it looks a bit gross, we would still add it at the end to 
avoid gratuitously breaking binaries. We would then also add some 
explicit padding in the struct to soak up the redundant space left in 
between it and the second last struct member.

> What is your favourite option?

FreeBSD parlance is to ask what colour you would like to paint the 
bikeshed ;)

As I've never experienced the pain John refers to, I'll defer to the 
wisdom of others on whether the proposed patch will create pain down the 
road. I think it's ok, but if consensus is 8bytes per packet isn't going 
to break the bank, I guess we just go for it - but I guess I am cautious 
about this route as we can push a lot of packets per second through the 
stack.

Cheers,
Lawrence



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4ECCA2A1.4020408>