Date:      Tue, 26 Mar 2002 22:39:48 -0800
From:      Luigi Rizzo <rizzo@icir.org>
To:        Lars Eggert <larse@ISI.EDU>
Cc:        Matthew Luckie <mjl@nlanr.net>, freebsd-net@FreeBSD.ORG
Subject:   Re: ip_output and ENOBUFS
Message-ID:  <20020326223947.B16450@iguana.icir.org>
In-Reply-To: <3CA0AB3D.5000300@isi.edu>
References:  <Pine.BSF.4.21.0203260819020.91970-100000@mave.nlanr.net> <3CA0AB3D.5000300@isi.edu>

ENOBUFS is very typical with UDP applications that try to
send as fast as possible (e.g. the various network test utilities
in ports), and as i said in a previous message, putting up a mechanism to
pass around queue full/queue not full events is expensive because
it might trigger on every single packet, and possibly have to wake up
multiple processes each time (with only one being able to succeed).
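
For a UDP sender that does hit ENOBUFS, one cheap option is to back off and
retry in the application rather than wait for a kernel notification. The
following is a minimal userland sketch of that idea, not code from this
thread: the names backoff_usec() and send_retry_enobufs(), and the 1 ms to
64 ms delay range, are assumptions chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/* Hypothetical helper: delay in microseconds before retry `attempt`
 * (0-based), doubling from 1 ms and capped at 64 ms. */
unsigned int
backoff_usec(int attempt)
{
	unsigned int d = 1000;

	while (attempt-- > 0 && d < 64000)
		d *= 2;
	return (d);
}

/* Hypothetical wrapper: sendto() that sleeps and retries on ENOBUFS only.
 * Any other result, success or error, is returned to the caller at once. */
ssize_t
send_retry_enobufs(int fd, const void *buf, size_t len,
    const struct sockaddr *to, socklen_t tolen, int max_retries)
{
	struct timespec ts;
	ssize_t n;
	int attempt;

	assert(max_retries >= 0);
	for (attempt = 0; ; attempt++) {
		n = sendto(fd, buf, len, 0, to, tolen);
		if (n >= 0 || errno != ENOBUFS || attempt >= max_retries)
			return (n);
		ts.tv_sec = 0;
		/* At most 64 ms, so tv_nsec stays below one second. */
		ts.tv_nsec = (long)backoff_usec(attempt) * 1000L;
		nanosleep(&ts, NULL);
	}
}
```

A caller would use send_retry_enobufs() in place of sendto() and treat a
final -1 with errno set to ENOBUFS as "the interface queue is still full".
The sleep only paces the sender; it does not make the queue drain faster.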

The TCP handling of ENOBUFS is much cheaper.
TCP is not woken up by the device, but by acks coming from the other
side, or by timeouts. So there is no per-packet overhead just to
implement this mechanism.

As a matter of fact, i even implemented a similar thing in dummynet,
and if device drivers call if_tx_rdy() when they complete a
transmission, then the tx interrupt can be used to clock
packets out of the dummynet pipes. A patch for if_tun.c is below,
and if_tx_rdy() is in netinet/ip_dummynet.c. You could replace
the call to if_tx_rdy with a wakeup() using some appropriate
argument to wake up threads waiting for devices to become ready.
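
The pattern used in the patch (signal only when the send queue has drained
to empty, not on every packet) can be modelled in userland. This is a
simplified sketch under stated assumptions, not kernel code: struct txq,
txq_init(), txq_enqueue(), txq_tx_done() and count_ready() are all
hypothetical names, and the callback stands in for if_tx_rdy() or a
wakeup() call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userland model of a driver send queue; not a kernel API. */
struct txq {
	int len;                    /* packets currently queued */
	int maxlen;                 /* queue limit */
	void (*tx_rdy)(void *);     /* called when the queue drains */
	void *arg;                  /* opaque argument for tx_rdy */
};

void
txq_init(struct txq *q, int maxlen, void (*tx_rdy)(void *), void *arg)
{
	assert(maxlen > 0);
	q->len = 0;
	q->maxlen = maxlen;
	q->tx_rdy = tx_rdy;
	q->arg = arg;
}

/* Queue one packet; a full queue is reported as ENOBUFS. */
int
txq_enqueue(struct txq *q)
{
	if (q->len >= q->maxlen)
		return (ENOBUFS);
	q->len++;
	return (0);
}

/* One transmission completed: signal only if the queue is now empty. */
void
txq_tx_done(struct txq *q)
{
	if (q->len == 0)
		return;
	if (--q->len == 0 && q->tx_rdy != NULL)
		q->tx_rdy(q->arg);
}

/* Example callback: count how many times the queue reported ready. */
void
count_ready(void *arg)
{
	++*(int *)arg;
}
```

With a queue limit of 2, two enqueues succeed, a third returns ENOBUFS, and
the callback fires once, after the second completed transmission. Signalling
only on the transition to empty is what keeps the cost below one event per
packet.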

	cheers
	luigi

> lcvs diff -u if_tun.c
Index: if_tun.c
===================================================================
RCS file: /home/ncvs/src/sys/net/if_tun.c,v
retrieving revision 1.51.2.2
diff -u -r1.51.2.2 if_tun.c
--- if_tun.c    28 Jul 1999 15:08:06 -0000      1.51.2.2
+++ if_tun.c    19 Jun 2000 12:07:17 -0000
@@ -19,6 +19,7 @@
 
 #include "opt_devfs.h"
 #include "opt_inet.h"
+#include "opt_ipdn.h"
 
 #include <sys/param.h>
 #include <sys/proc.h>
@@ -162,6 +163,10 @@
        ifp = &tp->tun_if;
        tp->tun_flags |= TUN_OPEN;
        TUNDEBUG("%s%d: open\n", ifp->if_name, ifp->if_unit);
+#ifdef DUMMYNET
+       if (ifp->if_snd.ifq_len == 0) /* better be! */
+           if_tx_rdy(ifp);
+#endif
        return (0);
 }
 
@@ -487,6 +492,10 @@
                        }
                }
        } while (m0 == 0);
+#ifdef DUMMYNET
+       if (ifp->if_snd.ifq_len == 0)
+           if_tx_rdy(ifp);
+#endif
        splx(s);
 
        while (m0 && uio->uio_resid > 0 && error == 0) {

On Tue, Mar 26, 2002 at 09:09:17AM -0800, Lars Eggert wrote:
> Matthew Luckie wrote:
> > hmm, we looked at how other protocols handled the ENOBUFS case from
> > ip_output.
> >
> > tcp_output calls tcp_quench on this error.
> >
> > while the interface may not be able to send any more packets than it
> > does currently, closing the congestion window back to 1 segment
> > seems a severe way to handle this error, knowing that the network
> > did not drop the packet due to congestion.  Ideally, there might be
> > some form of blocking until such time as an mbuf becomes available.
> > This sounds as if it will be much easier come FreeBSD 5.0
> 
> TCP will almost never encounter this scenario, since it's self-clocking.
> The NIC is very rarely the bottleneck resource for a given network
> connection. Have you looked at mean queue lengths for NICs? They are
> typically zero or one. The NIC will only be the bottleneck if you are
> sending at a higher rate than line speed and your burst time is too long
> to be absorbed by the queue.
> 
> > I'm aware that if people are hitting this condition, they need to
> > increase the number of mbufs to get maximum performance.
> 
> No. ENOBUFS in ip_output almost always means that your NIC queue is
> full, which isn't controlled through mbufs. You can make the queue 
> longer, but that won't help if you're sending too fast.
> 
> > This section of code has previously been discussed here:
> > http://docs.freebsd.org/cgi/getmsg.cgi?fetch=119188+0+archive/2000/freebsd-net/20000730.freebsd-net
> > and has been in use for many years (a
> 
> This is a slightly different problem than you describe. What Archie saw
> was an ENOBUFS being handled like a loss inside the network, even though
> the sender has information locally that can allow it to make smarter
> retransmission decisions.
> 
> Lars
> -- 
> Lars Eggert <larse@isi.edu>               Information Sciences Institute
> http://www.isi.edu/larse/              University of Southern California
