Date:      Tue, 18 Sep 2012 08:10:52 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-net@freebsd.org
Cc:        Adrian Chadd <adrian@freebsd.org>, Ryan Stone <rysto32@gmail.com>
Subject:   Re: What's the latest on fixing IFF_DRV_OACTIVE/if_start/etc?
Message-ID:  <201209180810.52409.jhb@freebsd.org>
In-Reply-To: <CAJ-VmoksUJBX507rOWD8%2B1ZRD1xjEBQvRZFmdnHNxpGeVRJF3w@mail.gmail.com>
References:  <CAFMmRNzkwbQpUZ3OOoMKVdrz=dePc5fkeX3m-5vXtiWJ7qXwVA@mail.gmail.com> <201209171503.12517.jhb@freebsd.org> <CAJ-VmoksUJBX507rOWD8%2B1ZRD1xjEBQvRZFmdnHNxpGeVRJF3w@mail.gmail.com>

On Monday, September 17, 2012 7:16:27 pm Adrian Chadd wrote:
> There's a lot less cache in these boards. Going through the stack
> trace all the way and back for each packet is actually quite
> expensive.
> 
> Then there's the overhead of having if_start() be called multiple
> times, concurrently, from multiple senders. It's fine for a wifi AP
> setup where the if_start() is only called once or twice in an
> overlapping fashion, but sucks with lots of concurrent TCP/UDP
> contexts all potentially calling if_start() and having them have to
> clash with each other.
> 
> I've not sat down and instrumented it all that much, so I'm not going to
> spend much time harping on about it until I have some hard numbers
> either way. I'm just going by what I see people do to various network
> stacks when it comes time to try and squeeze high packet rates out of
> smaller platforms, especially with NAT/bridging/PPPoE in the way. And
> that tended not to be "complete packet to completion on each frame."

I (mostly) don't get where you are coming from at all.  The "old" code would 
do this given a packet bound for if_vlan(4) or if_bridge(4):

- queue the packet to the virtual interface's if_snd, including
  locking, etc.
- call the virtual interface's if_start() which dequeues the
  previously queued packet (again, using locking) and then passes it
  down to the underlying physical interface via if_transmit().  For
  most NICs this entails queueing the packet to if_snd and then
  optionally calling the NIC's if_start().

The "new" code does:

- Pass the packet down to the underlying physical interface via
  if_transmit().  For most NICs this entails queueing the packet
  to if_snd and then optionally calling the NIC's if_start().
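
To make the comparison concrete, here is a toy user-space model of the two
paths, not kernel code.  Every name in it (struct toy_pkt, struct toy_ifq,
toy_old_path(), toy_new_path(), and the lock_ops counter) is made up for
illustration.  lock_ops simply counts how often a queue's lock would be
taken, and the NIC's own if_start() is left out since both paths share it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a packet; the real stack uses struct mbuf. */
struct toy_pkt { struct toy_pkt *next; };

/* Toy stand-in for an interface send queue (if_snd). */
struct toy_ifq {
    struct toy_pkt *head, *tail;
    unsigned lock_ops;          /* times the queue lock would be taken */
};

void toy_enqueue(struct toy_ifq *q, struct toy_pkt *p)
{
    assert(p != NULL);
    q->lock_ops++;              /* models lock + unlock around the enqueue */
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

struct toy_pkt *toy_dequeue(struct toy_ifq *q)
{
    struct toy_pkt *p;

    q->lock_ops++;              /* models lock + unlock around the dequeue */
    p = q->head;
    if (p != NULL) {
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    return p;
}

/* Hypothetical default transmit of a physical NIC: queue to its if_snd. */
void toy_nic_transmit(struct toy_ifq *nic_q, struct toy_pkt *p)
{
    toy_enqueue(nic_q, p);
}

/* "Old" path: queue on the virtual interface, then its start routine
 * dequeues the same packet and hands it to the physical interface. */
void toy_old_path(struct toy_ifq *virt_q, struct toy_ifq *nic_q,
    struct toy_pkt *p)
{
    struct toy_pkt *m;

    toy_enqueue(virt_q, p);
    m = toy_dequeue(virt_q);
    if (m != NULL)
        toy_nic_transmit(nic_q, m);
}

/* "New" path: hand the packet straight to the physical interface. */
void toy_new_path(struct toy_ifq *nic_q, struct toy_pkt *p)
{
    toy_nic_transmit(nic_q, p);
}
```

For a single packet, toy_old_path() takes the virtual queue's lock twice and
the NIC queue's lock once, while toy_new_path() only takes the NIC queue's
lock once.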

This is _less_ work.  Also, consider vlan(4).  vlan(4)'s job is to serve
(largely) as a protocol layer that prepends (or strips) a vlan header from
the packet.  We don't use interface queues for other protocols such as
Ethernet or IP.  We shouldn't use one for vlan(4) either.  The packet
should have the transform applied and then be dispatched to the actual
NIC.
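
As an illustration of how small the vlan transform is, here is a user-space
sketch that inserts an 802.1Q tag into a flat Ethernet frame buffer.  The
helper name toy_vlan_encap() and the TOY_* constants are hypothetical; the
real vlan(4) code operates on mbufs and may leave tag insertion to the NIC,
so treat this only as a picture of the transform itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDRS_LEN 12      /* destination MAC + source MAC */
#define TOY_VLAN_TAG_LEN    4       /* TPID (2 bytes) + TCI (2 bytes) */
#define TOY_VLAN_TPID       0x8100  /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper: copy the untagged frame 'in' to 'out', inserting an
 * 802.1Q tag right after the two MAC addresses.  Returns the tagged length,
 * or 0 if the input is too short, 'out' is too small, or vid/pcp are out
 * of range.
 */
size_t toy_vlan_encap(const uint8_t *in, size_t in_len,
    uint8_t *out, size_t out_cap, uint16_t vid, uint8_t pcp)
{
    uint16_t tci;

    assert(in != NULL && out != NULL);
    if (in_len < TOY_ETHER_ADDRS_LEN + 2 ||
        out_cap < in_len + TOY_VLAN_TAG_LEN)
        return 0;
    if (vid > 4095 || pcp > 7)
        return 0;

    /* TCI: 3 bits of priority, 1 DEI bit (left at 0), 12 bits of VLAN ID. */
    tci = (uint16_t)(((unsigned)pcp << 13) | vid);

    memcpy(out, in, TOY_ETHER_ADDRS_LEN);
    out[12] = (uint8_t)(TOY_VLAN_TPID >> 8);
    out[13] = (uint8_t)(TOY_VLAN_TPID & 0xff);
    out[14] = (uint8_t)(tci >> 8);
    out[15] = (uint8_t)(tci & 0xff);
    /* The original EtherType and payload follow the tag unchanged. */
    memcpy(out + 16, in + TOY_ETHER_ADDRS_LEN, in_len - TOY_ETHER_ADDRS_LEN);
    return in_len + TOY_VLAN_TAG_LEN;
}
```

Nothing in this step needs a queue: the frame goes in, the tagged frame comes
out, and the result can be handed directly to the parent interface.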

As for direct dispatch, there is a toggle to not use direct dispatch in the 
network stack and to use netisr for certain tasks.  However, that is generally 
used for receive, not transmit AFAIK.

As for concurrent calls to if_start(), it certainly might be nice to provide a
way, under the IFQ lock, for the default if_transmit() to know whether
if_start() is already in progress and hence whether it needs to be called.
I think this would mean calling if_start() with the IFQ lock held, which
should work nicely for drivers that use if_start(), especially if OACTIVE
were moved into the IFQ.  That is orthogonal to bypassing queueing for things
like vlan and bridge, however.
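
A rough user-space sketch of that idea, under my own assumptions: the
OACTIVE-like state lives in the queue as a 'busy' flag protected by the queue
lock, the start routine runs with that lock held, and transmit skips the
start call while the flag is set.  All of the names (struct tx_ifq,
tx_transmit(), tx_start_locked(), tx_complete(), ring_free) are invented for
the example and a pthread mutex stands in for the IFQ lock; this is not an
existing kernel interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct tx_pkt { struct tx_pkt *next; };

struct tx_ifq {
    pthread_mutex_t lock;       /* stands in for the IFQ lock */
    struct tx_pkt *head, *tail;
    unsigned qlen;              /* packets waiting in the software queue */
    bool busy;                  /* OACTIVE-like flag, protected by lock */
    unsigned ring_free;         /* free slots in the toy hardware ring */
    unsigned start_calls;       /* how often the start routine ran */
};

void tx_ifq_init(struct tx_ifq *q, unsigned ring_slots)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = q->tail = NULL;
    q->qlen = 0;
    q->busy = false;
    q->ring_free = ring_slots;
    q->start_calls = 0;
}

/* Start routine, called with q->lock held: move packets to the ring until
 * the queue is empty or the ring is full, and mark the queue busy if full. */
void tx_start_locked(struct tx_ifq *q)
{
    struct tx_pkt *p;

    q->start_calls++;
    while ((p = q->head) != NULL) {
        if (q->ring_free == 0) {
            q->busy = true;
            return;
        }
        q->head = p->next;
        if (q->head == NULL)
            q->tail = NULL;
        q->qlen--;
        q->ring_free--;
    }
}

/* Default-transmit analogue: enqueue, then decide under the same lock
 * whether the start routine needs to run at all. */
void tx_transmit(struct tx_ifq *q, struct tx_pkt *p)
{
    assert(p != NULL);
    pthread_mutex_lock(&q->lock);
    p->next = NULL;
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->qlen++;
    if (!q->busy)
        tx_start_locked(q);
    pthread_mutex_unlock(&q->lock);
}

/* Transmit-completion analogue: free ring slots, clear busy, restart. */
void tx_complete(struct tx_ifq *q, unsigned freed)
{
    pthread_mutex_lock(&q->lock);
    q->ring_free += freed;
    q->busy = false;
    tx_start_locked(q);
    pthread_mutex_unlock(&q->lock);
}
```

Because the flag and the queue share one lock, a sender that finds the queue
busy only pays for the enqueue, and the completion path picks up whatever
accumulated in the meantime.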

-- 
John Baldwin


