Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 23 Oct 2001 11:03:07 -0700
From:      Luigi Rizzo <rizzo@aciri.org>
To:        net@freebsd.org
Subject:   performance issues with M_PREPEND on clusters
Message-ID:  <20011023110307.A34494@iguana.aciri.org>

next in thread | raw e-mail | index | archive | help

Hi,
don't know what is the right forum to discuss this, but this is
certainly one. I have also Bcc-ed some people who might need to be
involved.

In the mbuf code, M_LEADINGSPACE always returns 0 when M_EXT is
set, instead of calling the second part of M_WRITABLE to check
whether there is a chance of writing into the cluster. This means
that both M_PREPEND() and m_pullup() will have to allocate a second
mbuf when they are invoked on a cluster, which these days is
basically true for most of the incoming traffic (several network
drivers pass up the full packet in a cluster).

Why do we care ? Well, the additional work is not much (~2us per
packet on a 750MHz box, say 10us on a slow box), but on boxes acting
as routers this might in turn trigger further slowdown with various
NICs which have performance or functional problems in doing
scatter-gather I/O (the 21143 is one of those, and some of the NICs
supported by the "dc" driver also require an additional copy when
the packet being output is split across multiple buffers).

A quick fix to this problem is to change M_LEADINGSPACE
as shown in the attached patch (for -stable), so that M_PREPEND
can write into the cluster if this is not shared.
This change alone improved the peak forwarding rate (in packets
per second) by 15-20% on a system we have here using the "dc"
driver.

Similar things could be done in m_pullup() to avoid the
extra allocation.

I have quickly discussed this with Bosko, and he suggests that
M_LEADINGSPACE should not care if the buffer is writable or not,
and only report how much room is available in front of the buffer,
doing somewhere else the test for writability.

While I agree with this view, it is also true that as it is now,
M_LEADINGSPACE always returns 0 if the buffer is not writable, and
some of the places where it is used (e.g. M_PREPEND) depend on this
feature. So i would feel a bit uncomfortable in removing the
writability test from M_LEADINGSPACE, whereas the change I
suggest below seems to be reasonably safe.

Any objections in importing this in -stable after adequate testing
by some volunteers ? I do not think that testing in current would
help for this change, because the mbuf handling code is slightly
different there (as a matter of fact, the mbuf code in -stable is
quite ugly, e.g. there is no locking when accessing the refcounts
for mbufs).

	cheers
	luigi

Index: sys/mbuf.h
===================================================================
RCS file: /home/xorpc/u2/freebsd/src/sys/sys/mbuf.h,v
retrieving revision 1.44.2.10
diff -u -r1.44.2.10 mbuf.h
--- sys/mbuf.h  2001/07/03 11:02:01     1.44.2.10
+++ sys/mbuf.h  2001/10/20 21:18:04
@@ -473,8 +473,11 @@
 /*
  * Check if we can write to an mbuf.
  */
+#define M_EXT_WRITABLE(m)              \
+    ((m)->m_ext.ext_free == NULL && mclrefcnt[mtocl((m)->m_ext.ext_buf)] == 1)
+
 #define M_WRITABLE(m) (!((m)->m_flags & M_EXT) || \
-    ((m)->m_ext.ext_free == NULL && mclrefcnt[mtocl((m)->m_ext.ext_buf)] == 1))
+    M_EXT_WRITABLE(m) )
 
 /*
  * Compute the amount of space available
@@ -482,7 +485,7 @@
  */
 #define        M_LEADINGSPACE(m)                                       \
        ((m)->m_flags & M_EXT ?                                         \
-           /* (m)->m_data - (m)->m_ext.ext_buf */ 0 :                  \
+           (M_EXT_WRITABLE(m) ? (m)->m_data - (m)->m_ext.ext_buf : 0): \
            (m)->m_flags & M_PKTHDR ? (m)->m_data - (m)->m_pktdat :     \
            (m)->m_data - (m)->m_dat)
 

===================================================================
----------------------------------+-----------------------------------------
 Luigi RIZZO, luigi@iet.unipi.it  . ACIRI/ICSI (on leave from Univ. di Pisa)
 http://www.iet.unipi.it/~luigi/  . 1947 Center St, Berkeley CA 94704
 Phone: (510) 666 2927
----------------------------------+-----------------------------------------

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20011023110307.A34494>