Date:      Mon, 29 Jun 2015 19:22:13 +0300
From:      Slawa Olhovchenkov <slw@zxy.spb.ru>
To:        Luigi Rizzo <rizzo@iet.unipi.it>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: netmap custom RSS and custom packet info
Message-ID:  <20150629162213.GG1647@zxy.spb.ru>
In-Reply-To: <CA%2BhQ2%2BjhNkhLnxHQKeoEgbs2479hdnLd7mRR3XPmQLZyS1=1sw@mail.gmail.com>
References:  <20150629151750.GD1647@zxy.spb.ru> <CA%2BhQ2%2BjhNkhLnxHQKeoEgbs2479hdnLd7mRR3XPmQLZyS1=1sw@mail.gmail.com>

On Mon, Jun 29, 2015 at 06:05:41PM +0200, Luigi Rizzo wrote:

> On Mon, Jun 29, 2015 at 5:17 PM, Slawa Olhovchenkov <slw@zxy.spb.ru> wrote:
> 
> > Working with netmap and modern hardware, I am missing some features:
> >
> > a) Some spare space before the packet (64/128/192/256 bytes) for
> > application data. For example: the application does some pre-analysis
> > of a packet, fills a structure in this space, and routes the packet
> > (via a NETMAP pipe) to another thread. The receiving thread gets the
> > packet together with the information linked to it and can process it
> > without additional overhead.
> >
> >
> 
> spare space in front of the packet is something we have
> been considering for a different purpose, namely better
> support for encapsulation/decapsulation and things like
> vhost-net header.

Adding more space (controlled by sysctl or ioctl) may satisfy both
needs: 4-8-20 bytes for encapsulation and the rest for the application.

> Note though that the annotation is transferred for free
> only in the case of pipes or ports sharing the same memory
> region; vale ports would have to explicitly copy the
> extra bytes which is (moderately) expensive.

I think these bytes need not be transferred through VALE.
They are only packet-processing information, like tags, whereas
passing through VALE is like transferring the packet over a wire.
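
To make the idea of application data in front of the packet concrete, here is a minimal sketch in C. Everything in it is an assumption for illustration: the headroom size PKT_HEADROOM, the layout of struct pkt_note, and the helper names are hypothetical, and netmap does not provide them. The sketch only shows how one thread could fill a structure in the spare space and how the receiving thread could read it back from the same buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical headroom size; in the proposal it would be set via sysctl/ioctl. */
#define PKT_HEADROOM 64

/* Hypothetical annotation written by the pre-analysis thread. */
struct pkt_note {
    uint32_t flow_hash;    /* hash computed during pre-analysis */
    uint16_t l3_offset;    /* offset of the network header in the packet */
    uint16_t l4_offset;    /* offset of the transport header in the packet */
    uint8_t  tunnel_type;  /* application-defined tunnel tag */
    uint8_t  pad[3];
};

/* Compile-time check that the annotation fits in the spare space. */
typedef char pkt_note_fits[(sizeof(struct pkt_note) <= PKT_HEADROOM) ? 1 : -1];

/* buf is the start of the buffer; packet bytes begin after the headroom. */
static inline uint8_t *pkt_data(uint8_t *buf)
{
    return buf + PKT_HEADROOM;
}

/* Producer side: store the annotation in front of the packet. */
static inline void note_store(uint8_t *buf, const struct pkt_note *n)
{
    assert(buf != NULL && n != NULL);
    memcpy(buf, n, sizeof(*n));
}

/* Consumer side: read the annotation back from the same buffer. */
static inline void note_load(const uint8_t *buf, struct pkt_note *n)
{
    assert(buf != NULL && n != NULL);
    memcpy(n, buf, sizeof(*n));
}
```

Because the annotation lives in the same buffer as the packet, handing the buffer to another thread over a pipe would carry the annotation along without a copy, which is the "for free" case mentioned above.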

> A quick and dirty way to support what you want is the following:
> - in the kernel code, modify NMB(), PNMB() and the offset between
>   the netmap_ring and the first buffer to add the extra space
>   you want in front of the packet. You can possibly make this
>   offset a sysctl-controlled value
> 
> - in netmap_vale.c, make a small change to the code that copies
>   buffers so that it includes also the space before the actual packet.
> 
> That should be all.

Do you plan to do this?
I don't like maintaining a permanent private branch/patches.

> > b) Custom RSS. RSS in modern NICs interoperates poorly with packet
> > analysis: packets from the same flow but in different directions are
> > placed in different queues, PPPoE-encapsulated packets are placed in
> > queue 0, various tunneling protocols are not recognised, etc. Maybe
> > NETMAP could use custom RSS hashing from a loadable kernel module
> > provided by the user? A function from this module could do packet
> > analysis, tunnel removal, and custom RSS hashing in a
> > direction-independent manner, fill some structure prepended to the
> > buffer (see above), and pass this information to the application.
> >
> 
> RSS is completely orthogonal to netmap and I strongly
> suggest keeping it this way, using either the NIC-specific
> tools to control RSS or some generic mechanism
> (on linux there is ethtool, and we should implement something
> similar also on freebsd).

This is not true RSS. It is only a trick for reassigning RX packets to
different netmap rings. All RSS mechanisms available in hardware are
completely unacceptable for this:

- they don't support different encapsulations (PPPoE, GRE, GTP, etc.)
- they assign different rings to packets 1.2.3.4->5.6.7.8 and 5.6.7.8->1.2.3.4
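
The second point can be illustrated with a minimal sketch in C of a direction-independent hash. The struct flow4 layout and the function names are hypothetical and not part of netmap or of any NIC driver; the mixing step is the 32-bit finalizer from MurmurHash3, chosen only as a convenient integer mixer. The idea is to order the two (address, port) endpoints before hashing, so that both directions of a flow produce the same value. Removing PPPoE/GRE/GTP encapsulation before filling the structure is outside the scope of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key, filled in after any tunnel headers are stripped. */
struct flow4 {
    uint32_t saddr, daddr;  /* IPv4 addresses */
    uint16_t sport, dport;  /* transport ports */
    uint8_t  proto;         /* IP protocol number */
};

/* 32-bit finalizer from MurmurHash3, used here as a generic mixer. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}

/* Same hash for A->B and B->A: the endpoints are sorted before mixing. */
uint32_t flow4_symmetric_hash(const struct flow4 *f)
{
    uint64_t a = ((uint64_t)f->saddr << 16) | f->sport;
    uint64_t b = ((uint64_t)f->daddr << 16) | f->dport;
    uint64_t lo = a < b ? a : b;
    uint64_t hi = a < b ? b : a;

    uint32_t h = mix32((uint32_t)lo ^ (uint32_t)(lo >> 32));
    h = mix32(h ^ (uint32_t)hi ^ (uint32_t)(hi >> 32));
    return mix32(h ^ f->proto);
}

/* Map a flow to one of nrings application-visible rings. */
unsigned flow4_ring(const struct flow4 *f, unsigned nrings)
{
    assert(nrings > 0);
    return flow4_symmetric_hash(f) % nrings;
}
```

With this ordering, the packets 1.2.3.4->5.6.7.8 and 5.6.7.8->1.2.3.4 of one connection land on the same ring, whatever the number of rings.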

Producing a universal hashing/distribution mechanism is too complex. But
might a user-provided kernel module (kept in sync with the application)
be acceptable?

This is like an ephemeral permanent NETMAP pipe between the real hardware
RX rings/driver and the application-visible rings.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150629162213.GG1647>