Date:      Thu, 08 Nov 2012 12:13:14 +0100
From:      Andre Oppermann <andre@freebsd.org>
To:        pyunyh@gmail.com
Cc:        FreeBSD Net <freebsd-net@freebsd.org>, Adrian Chadd <adrian@freebsd.org>, Pyun YongHyeon <yongari@freebsd.org>
Subject:   Re: svn commit: r242739 - stable/9/sys/dev/ti
Message-ID:  <509B93CA.90609@freebsd.org>
In-Reply-To: <20121108023858.GA3127@michelle.cdnetworks.com>
References:  <201211080206.qA826RiN054539@svn.freebsd.org> <CAJ-VmomEOPGbLwmOmL0EdenZA7QKbV5P-hAYsTRcwLao2LbAqg@mail.gmail.com> <20121108023858.GA3127@michelle.cdnetworks.com>

On 08.11.2012 03:38, YongHyeon PYUN wrote:
> On Wed, Nov 07, 2012 at 06:15:30PM -0800, Adrian Chadd wrote:
>> If so, may I suggest we perhaps accelerate discussing if_transmit() of
>> multiple frames per call?
>
> Hmm, actually I'm still not a fan of if_transmit() at this moment.
> Honestly I don't have good queuing code in driver to handle queue
> full condition. Interactions with altq(9) is also one of my
> concern as well as packet reordering issue of drbr(9) interface.

The whole interface packet handoff needs some serious reconsideration.

These days we have two queues/buffers at the interfaces.  One is the
DMA ring, which can hold a considerable number of packets, and the
second is the ifq (if enabled).  The DMA ring already adds so much
depth and latency that a packet scheduler like altq(9) becomes almost
useless.  Modern queue management algorithms like CoDel don't work
with the current framework either, and bufferbloat is a major concern.
See the ACM Queue article by Jim Gettys.

What we need to make this functionality available again is a well
specified and reasonably simple interface handoff.  It should include
information on the maximum tx DMA ring depth and the current depth.
There should also be a function to limit the current depth to a certain
value.
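
As a rough illustration of the depth information described above, here is
a minimal userspace sketch.  Everything in it is hypothetical: struct
txring_info, txring_set_limit() and txring_free_slots() are names made up
for this example and do not exist in FreeBSD.  It only shows the three
pieces mentioned: the maximum depth, the current depth, and a way to cap
the usable depth.

```c
#include <assert.h>

/*
 * Hypothetical per-ring handoff information.  These names are
 * assumptions for illustration only, not an existing kernel API.
 */
struct txring_info {
	unsigned int max_depth;	/* hardware ring size in descriptors */
	unsigned int cur_depth;	/* descriptors currently occupied */
	unsigned int limit;	/* software cap, 1 <= limit <= max_depth */
};

/*
 * Limit the usable ring depth.  The request is clamped to the range
 * [1, max_depth]; the value actually applied is returned.
 */
static unsigned int
txring_set_limit(struct txring_info *ri, unsigned int limit)
{
	if (limit == 0)
		limit = 1;
	if (limit > ri->max_depth)
		limit = ri->max_depth;
	ri->limit = limit;
	return (ri->limit);
}

/* Number of packets that may still be added under the current limit. */
static unsigned int
txring_free_slots(const struct txring_info *ri)
{
	assert(ri->cur_depth <= ri->max_depth);
	if (ri->cur_depth >= ri->limit)
		return (0);
	return (ri->limit - ri->cur_depth);
}
```

A queue manager could, under these assumptions, lower the limit to keep
most of the backlog in its own queue instead of in the DMA ring, and use
txring_free_slots() to decide how many packets to hand down.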

What I'd like to see is this (names are not fixed):

  if_send() as the main entry point for the stack.  It's a function
   pointer within struct ifnet.  In normal operation it is the same
   as if_transmit() and directly adds a packet to the tx DMA ring.
   Locking of the DMA ring is done in this function and is a property
   of the driver.  The stack always calls unlocked.  Obviously the
   tx DMA ring lock must not be a sleep lock.
   When altq(9) or equivalent is active this function pointer is
   replaced with a call to the alternative queuing function that
   does its magic.  Again locking of the queuing mechanism is the
   property of that mechanism.  When a NIC has multiple queues that
   it can bind to CPUs, locking may not be necessary.  We gain this
   flexibility in the driver to do that.

  if_transmit() is a function pointer for a function that directly
   adds a packet to the tx DMA ring (if a free slot is available).
   It is never called by the stack directly except in special
   circumstances.  The altq(9), if active, uses if_transmit() to
   add packets to the tx DMA ring.  If not active, if_send() is
   this function pointer.

  if_txeof() is a function pointer for a callback from the driver
   to an altq(9) dequeue function, if active.  It is called when
   new free slots on the tx DMA ring are available.
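
The interplay of the three pointers can be modelled in userspace roughly
as follows.  This is only a sketch under stated assumptions: struct
ifnet_sketch, struct pkt, the drv_*() and q_*() functions and the tiny
ring and queue sizes are all invented for the example, a plain FIFO
stands in for altq(9), and locking is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RING_SLOTS	2	/* tiny tx DMA ring, for illustration */
#define SWQ_SLOTS	8	/* software queue of the queuing discipline */

struct pkt { int id; };

/* Hypothetical stand-in for struct ifnet; not the real layout. */
struct ifnet_sketch {
	int  (*if_send)(struct ifnet_sketch *, struct pkt *);
	int  (*if_transmit)(struct ifnet_sketch *, struct pkt *);
	void (*if_txeof)(struct ifnet_sketch *);
	struct pkt *ring[RING_SLOTS];
	unsigned int ring_used;
	struct pkt *swq[SWQ_SLOTS];
	unsigned int swq_head, swq_len;
};

/* Driver: add a packet to the tx DMA ring if a slot is free. */
static int
drv_transmit(struct ifnet_sketch *ifp, struct pkt *p)
{
	if (ifp->ring_used == RING_SLOTS)
		return (ENOBUFS);
	ifp->ring[ifp->ring_used++] = p;
	return (0);
}

/* Queuing discipline entry point (FIFO standing in for altq(9)). */
static int
q_send(struct ifnet_sketch *ifp, struct pkt *p)
{
	unsigned int tail;

	/* Bypass the queue only if it is empty, to preserve ordering. */
	if (ifp->swq_len == 0 && ifp->if_transmit(ifp, p) == 0)
		return (0);
	if (ifp->swq_len == SWQ_SLOTS)
		return (ENOBUFS);
	tail = (ifp->swq_head + ifp->swq_len) % SWQ_SLOTS;
	ifp->swq[tail] = p;
	ifp->swq_len++;
	return (0);
}

/* Queuing discipline dequeue, called when ring slots become free. */
static void
q_txeof(struct ifnet_sketch *ifp)
{
	while (ifp->swq_len > 0) {
		if (ifp->if_transmit(ifp, ifp->swq[ifp->swq_head]) != 0)
			break;
		ifp->swq_head = (ifp->swq_head + 1) % SWQ_SLOTS;
		ifp->swq_len--;
	}
}

/* Driver: the hardware finished n packets; free slots and notify. */
static void
drv_tx_complete(struct ifnet_sketch *ifp, unsigned int n)
{
	unsigned int i;

	assert(n <= ifp->ring_used);
	for (i = n; i < ifp->ring_used; i++)
		ifp->ring[i - n] = ifp->ring[i];
	ifp->ring_used -= n;
	if (ifp->if_txeof != NULL)
		ifp->if_txeof(ifp);
}

/* Normal operation: if_send() is the same function as if_transmit(). */
static void
if_attach_plain(struct ifnet_sketch *ifp)
{
	ifp->if_transmit = drv_transmit;
	ifp->if_send = drv_transmit;
	ifp->if_txeof = NULL;
}

/* Queuing active: only if_send() and if_txeof() are redirected. */
static void
if_enable_queueing(struct ifnet_sketch *ifp)
{
	ifp->if_send = q_send;
	ifp->if_txeof = q_txeof;
}
```

In this model the stack only ever calls if_send().  Enabling the queuing
discipline swaps two pointers and leaves the driver's if_transmit()
untouched, which is the separation the list above is after.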

When a driver needs a software interface queue because the tx DMA
ring is too small, then the stack should provide a generic queuing
implementation the driver can use.

I've begun to explore this area while hacking on bge(4) and em(4)
in tcp_workqueue branch.

It's a very interesting path and we are going to have a couple more
discussions before we arrive at the optimal solution. :)

-- 
Andre



