Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 06 Nov 2012 10:25:06 +0100
From:      Andre Oppermann <oppermann@networx.ch>
To:        Adrian Chadd <adrian@freebsd.org>
Cc:        net@freebsd.org
Subject:   Re: splitting m_flags to pkthdr.flags + m_flags
Message-ID:  <5098D772.60002@networx.ch>
In-Reply-To: <CAJ-Vmo=gqOLSo5mTwe=27EbPV71cCLiOZDNVOTttDfxT24pDQw@mail.gmail.com>
References:  <20121102123817.GP70741@FreeBSD.org> <CAJ-Vmo=gqOLSo5mTwe=27EbPV71cCLiOZDNVOTttDfxT24pDQw@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On 06.11.2012 09:27, Adrian Chadd wrote:
> Hiya,
>
> On 2 November 2012 05:38, Gleb Smirnoff <glebius@freebsd.org> wrote:
>>    Hello networkers,
>>
>>    some (most) of m_flags bits are describing features that can
>> be present on a packet only, not on a single mbuf. Since we are
>> close to exhaustion of available bits, and many subsystems prefer
>> to use one of M_PROTO bits instead of grabbing a flag, I propose
>> to split m_flags to two parts:
>
> The idea of splitting things up is nice and it does make sense.
>
>> In the m_flags remain:
>> #define M_EXT           0x00000001 /* has associated external storage */
>> #define M_PKTHDR        0x00000002 /* start of record */
>> #define M_EOR           0x00000004 /* end of record */
>> #define M_RDONLY        0x00000008 /* associated data is marked read-only */
>> and all M_PROTO flags.
>>
>> struct pkthdr grows by one word and its new flag word now
>> posesses:
>>
>> #define M_BCAST         0x00000200 /* send/received as link-level broadcast */
>> #define M_MCAST         0x00000400 /* send/received as link-level multicast */
>> #define M_FRAG          0x00000800 /* packet is a fragment of a larger packet */
>> #define M_FIRSTFRAG     0x00001000 /* packet is first fragment */
>> #define M_LASTFRAG      0x00002000 /* packet is last fragment */
>> #define M_SKIP_FIREWALL 0x00004000 /* skip firewall processing */
>> #define M_VLANTAG       0x00010000 /* ether_vtag is valid */
>> #define M_PROMISC       0x00020000 /* packet was not for us */
>> #define M_FLOWID        0x00400000 /* deprecated: flowid is valid */
>> #define M_HASHTYPEBITS  0x0F000000 /* mask of bits holding flowid hash type */
>>
>> Some M_PROTO abusements like M_FASTFWD_OURS and recently added M_IP_NEXTHOP
>> also go to the pkthdr mbuf flags.
>>
>> P.S.
>> An attentive reader may have noticed that I missed M_NOFREE and M_FREELIST.
>> Yep, these flags coming from historical mbuf allocator from FreeBSD 4.x times
>> are about to be deleted, we carefully examine them, but never set. Patch
>> for review attached.
>
> My comments:
>
> * We don't know if any downstream clients are using M_NOFREE for
> whatever reason. M_FREELIST may be easy to kill but again, who knows
> who's reused it in their own code. So I'd prefer that this sort of
> thing was more widely communicated to the major (known) FreeBSD
> companies before you go and change this stuff.

It's unlikely anybody is using it instead of our native UMA allocator.
If they do, they have a heavily modified stack anyway.  If they do,
they have just to reintroduce that flag again possibly as an overlay
or with a separate bit.

> * .. I'm not sure whether you want to churn the mbuf API through some
> accessor API at this stage - ie, instead of just separating things
> out, what about also at the same time creating accessor macros that
> right now just go to the mbuf flags, but later on break it out to the
> hdr and non-hdr flags?

This is probably not worth it and really complicates things much
more than not.

> * There's been a lot of network stack churn lately. I personally think
> we should wait a while longer and let the fallout .. well, finish
> falling out, before you continue churning things. :-)

We've been stagnating in some areas far too long.  If there is a reasonable
justification we must not be afraid to move/break things.  To make a tasty
omelette some eggs have to be broken.  If you're afraid of making changes
and wait for signoff from unknown parties nothing is ever going to happen.

Rather than breaking things all the time we'd rather make all the sweeping
changes for 10.0 in one swoop and then have to mostly stable for some time
again.  Instead of breaking half of the things with every major release.

In fact most changes make it eventually easier for vendors to track FreeBSD
than before.  A couple of those changes are even driven by the vendors.

Also we are not are not working for the vendors (well, most of us).  The
vendor is working with us.  We want to drive FreeBSD into the future and
make it even more attractive to them to work with us.  Most of them have
got by now that one-off branching is not a good long-term option and started
to feed back their non-core IP changes.  Also driver vendors have understood
that maintaining their driver in the tree is far easier than outside of it.

Note that I'm not saying that we should break and move things nilly willy
just for the sake of it as it seems to happen in other OS's.

-- 
Andre




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5098D772.60002>