Date: Mon, 7 Oct 2002 09:46:53 -0700 From: "Sam Leffler" <sam@errno.com> To: "Julian Elischer" <julian@elischer.org> Cc: "Nate Lawson" <nate@root.org>, <freebsd-arch@FreeBSD.ORG>, <freebsd-net@FreeBSD.ORG> Subject: Re: CFR: m_tag patch Message-ID: <151b01c26e21$26adb690$52557f42@errno.com> References: <Pine.BSF.4.21.0210062343510.22932-100000@InterJet.elischer.org>
next in thread | previous in thread | raw e-mail | index | archive | help
> On Sun, 6 Oct 2002, Sam Leffler wrote: > > > > 1. Is ordering important or is an SLIST sufficient for all cases? > > > > Order is not important. > > I do similar in Netgraph but could move to whatever scheme becomes > standard in the rest of the system. However I have some serious > comments. > > In netgraph I have a separate structure for the packet that contains > a pointer to the mbuf chain as well a pointer to a malloc'd memory > buffer that can contain metadata such as is kept here. > the metadata is of the form [header][field][field][field].... > > where each field was defined as: > struct meta_field_header { > u_long cookie; /* cookie for the field. Skip fields you don't > * know about (same cookie as in messgaes) */ > u_short type; /* field ID within this cookie scheme */ > u_short len; /* total len of this field including extra > * data */ > char data[0]; /* data starts here */ > }; > > the header for metadata is: > struct ng_meta { > char priority; /* -ve is less priority, 0 is default */ > char discardability; /* higher is less valuable.. discard first */ > u_short allocated_len; /* amount malloc'd */ > u_short used_len; /* sum of all fields, options etc. */ > u_short flags; /* see below.. generic flags */ > struct meta_field_header options[0]; /* add as (if) needed */ > }; > > /* Flags for meta-data */ > #define NGMF_TEST 0x01 /* discard at the last moment before sending */ > #define NGMF_TRACE 0x02 /* trace when handing this data to a node */ > > One metadata collection is associated with each packet. similar to what > you do except that in 4.x Ipass it as an arguent and in 5.0 > I have a packet header that is passed around that identifies both > data and metadata. ( I need the header for other reasons anyway) > > I show this only to show that I have tackled the same problem with a > similar but different scheme. My thought was that a packet usually only > has at most a couple of tags, and the tags are usually short, so that it > was ok to malloc a bigger chunk and using the length fields walk them. > If you didn't have room you could malloc a bigger one and copy the > intitial fields, and then add your new ones at the end. It would > probably almost never happen.. > I'm not sure I follow exactly how the above works, but the tag data structures are very simple and would appear to accomodate your data structures. An m_tag is simply a variable-length malloc'd block of memory that has a type field (and linked list pointer). What you store in the space that follows the fixed-length struct m_tag header is up to you. The type field is used simply to locate tags once attached to a packet. You can certainly allocate a tag with larger chunk of memory than you initially need and store the bookkeeping info in the tag data block. This would appear to permit implementation of the above within the m_tag framework. > I did however find a need to delete a tag once the packet passed out of > the scope of the module in question. (the "cookie" represents a > particular "ABI" the "type" is only valid withing the ABI. > If you do not support a particular cookie type you cannot interpret the > contents aof that field) Protocols and such have their own cookies. > Yes, this is exactly what the m_tag_id item is for in the m_tag data structure (what I called a type field above). > I point this out as a feature because the TAG values need to only be > defined within their own ABI/API include files. > > If a module that supports a particular cookie passes a packet out to > the rest of the world, it probably should invalidate some fields that > are set with that particular cookie, in case that packet should in some > way be redirected back towards a module that does support it, and that > might cause unexpected results. For this reason I needed to 'remove' > fields. I ended leaving them in place and setting the cookie to a > special "invalid" cookie value that no-one would match. In a similar > vein, I can imagine wanting to remove one of the 'tags' in your lists. > You can either remove tags from the chain attached to an mbuf or invalidate them as you described--by setting the m_tag_id field to something "invalid". > Each module has a cookie field that is pretty much guaranteed > to be unique (time since epoch when written) so in your terms, > the IP code would only know and recognise TAGS prepended > with the IP cookie and would handle other tags as opaque data. > Similarly IPSEC code would only see into tags with the IPSEC cookie > prepended.. I don't think you should be > defining the TAG values in a global file, but rather > each module should identify its own tags and ignode others. > If two modules need to share data, a .h file can contain the > defineitions for that API and the cookie that identifies such tags, > and both modules would include that shared include. That way > the base code need not know about future improvements, which > has always been "difficult" :-) > Changing the way m_tag_id definitions are done is fine with me. It's done statically in mbuf.h for compatibility with code I've bringing over from openbsd. > > > > > 2. Is it possible to attach the aux argument to the mbuf chain instead of > > > adding it as a new parameter to ip_output? > > > > > > > The "aux argument" _was_ originally attached to the mbuf chain. The change > > to add an extra arg to ip*_output was done to eliminate one of the biggest > > uses of the aux mbuf; the socket to use to get IPsec policy. This is a > > performance win and worth doing independent of the aux->m_tag switch. > > > > One could split the ip_output change out but doing it together avoids > > converting code that would just eventually be eliminated (unless you did the > > ip_output change first and then the m_tag switch). > > I'm not sure but it's possible m_aux was invented so that ip_output() > would not change interfaces to match with was defined in some interface > method table. I think NetBSD added entries in their > interface method tables with varargs (yech) to get around some of this.. > > > So in summary: > is it worth making it a linked list? > how many tags do you see on packets, and how large are they? > CAn you use a sche,e suc as that outlined so that each module can > define its own tags privatly? > You can certainly use a scheme as you suggest. I don't believe doing something like that is precluded by what I've offered, you'd just layer the additional logic on top w/o any changes--I think. Tags tend to be very small (e.g. a pointer or two ints) and there tend to be only 1 or 2 when there are any. In my performance tuning work with IPsec their manipulation has never shown up as significant in kernel profiles. Sam To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?151b01c26e21$26adb690$52557f42>