Date: Thu, 17 Dec 1998 02:13:10 GMT
From: Michael Robinson <robinson@netrinsics.com>
To: dot@dotat.at
Cc: fenner@parc.xerox.com, freebsd-net@FreeBSD.ORG
Subject: Re: MLEN < write length < MINCLSIZE "bug"
Message-ID: <199812170213.CAA00532@netrinsics.com>
In-Reply-To: <E0zqHzA-0000Le-00@fanf.noc.demon.net>
Tony Finch <dot@dotat.at> writes:
>Having read this bit of the red demon book recently (although I can't
>find the precise reference again at the moment), ISTR that the
>heuristic is that since allocating an mbuf with a cluster takes two
>allocations, MINCLSIZE is just bigger than two mbufs.
So it is as I suspected. MINCLSIZE is a parameter for a classic time/space
performance tradeoff. A small MINCLSIZE gives you fewer mbuf allocations,
but with lots of unused space in mbuf clusters. A big MINCLSIZE gives you
more mbuf allocations, and more copy operations, but with more efficient
memory use.
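
The tradeoff can be made concrete with a rough userland model. This is only a sketch: the sizes are assumed values typical of i386 kernels of this period, the sk_ names are hypothetical, and the "use a cluster when the remaining length is at least MINCLSIZE" rule is a simplification of what sosend does.

```c
#include <stddef.h>

/* Assumed sizes (typical i386 values; not taken from the patch). */
enum {
    SK_MSIZE    = 128,   /* bytes occupied by one mbuf */
    SK_MLEN     = 108,   /* data bytes in a plain mbuf */
    SK_MHLEN    = 100,   /* data bytes in a packet-header mbuf */
    SK_MCLBYTES = 2048   /* data bytes in one cluster */
};

struct sk_cost {
    unsigned allocs;     /* allocator calls: mbufs plus clusters */
    size_t   reserved;   /* bytes of buffer memory reserved */
};

/*
 * Model the cost of storing len bytes in an mbuf chain.
 * The first mbuf carries a packet header; any mbuf gets a cluster
 * when the data still to be stored is at least minclsize.
 */
static struct sk_cost sk_chain_cost(size_t len, size_t minclsize)
{
    struct sk_cost c = {0, 0};
    size_t left = len;
    int first = 1;

    do {
        size_t room = first ? SK_MHLEN : SK_MLEN;

        c.allocs++;                    /* the mbuf itself */
        c.reserved += SK_MSIZE;
        if (left >= minclsize) {       /* attach a cluster */
            room = SK_MCLBYTES;
            c.allocs++;
            c.reserved += SK_MCLBYTES;
        }
        left -= (left < room) ? left : room;
        first = 0;
    } while (left > 0);

    return c;
}
```

Under these assumptions, sk_chain_cost(300, 208) reports 2 allocations and 2176 reserved bytes, while sk_chain_cost(300, 512) reports 3 allocations and 384 reserved bytes: the smaller threshold saves an allocation and costs memory.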
As such, MINCLSIZE seems like a good candidate for a sysctl (a patch for
which can be found at the end of this message). People running heavily-used
dedicated network servers may find it useful to be able to tune this
parameter.
It seems to me that this is largely orthogonal, though, to the issue of
segmenting writes in sosend before sending them to the protocol. That is
more an issue of hardware speed vs. kernel speed. For example, on a dialup
PPP connection, the additional packet header overhead vastly outweighs
the mostly non-existent parallelism of the serial interface. However, a
100MHz 64-bit PCI gigabit Ethernet controller can process buffers faster than
the CPU can spit them out, so segmenting the writes could result in significant
improvements in throughput and latency.
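
To put a number on the dialup case, here is a back-of-the-envelope sketch. It assumes a 40-byte TCP/IPv4 header with no options and ignores PPP framing, header compression, and ACK traffic; the sk_ names are hypothetical.

```c
#include <stddef.h>

/* Assumed per-packet header cost: IPv4 (20) + TCP (20), no options. */
enum { SK_HDR_BYTES = 40 };

/* Bytes on the wire when payload bytes are carried in npackets packets. */
static size_t sk_wire_bytes(size_t payload, unsigned npackets)
{
    return payload + (size_t)npackets * SK_HDR_BYTES;
}

/* Serialization time in milliseconds at the given line rate. */
static double sk_tx_ms(size_t bytes, double bits_per_sec)
{
    return (double)bytes * 8.0 * 1000.0 / bits_per_sec;
}
```

For a 150-byte write, sk_wire_bytes gives 190 bytes as one packet and 230 bytes as two. At 28800 bit/s the extra 40 bytes add roughly 11 ms of serialization time per write, which a serial line cannot hide; on a fast Ethernet link the same 40 bytes are negligible.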
So I think this behavior is something that one should be able to turn on and
off. The question is with what granularity: kernel, interface, or socket?
A socket option would be trivial to implement, but wouldn't work for existing
code until it was retrofitted in.
A sysctl would also be trivial to implement and would work with existing code,
but its granularity is probably too coarse.
A new option for ifconfig would work at the interface level, but I don't
know if that's what people want or will accept.
Comments?
-Michael Robinson
Index: sys/mbuf.h
===================================================================
RCS file: /cdrom/CVSROOT/src/sys/sys/mbuf.h,v
retrieving revision 1.18
diff -u -r1.18 mbuf.h
--- mbuf.h 1996/08/19 18:30:15 1.18
+++ mbuf.h 1998/12/17 01:39:44
@@ -52,7 +52,8 @@
#define MLEN (MSIZE - sizeof(struct m_hdr)) /* normal data len */
#define MHLEN (MLEN - sizeof(struct pkthdr)) /* data len w/pkthdr */
-#define MINCLSIZE (MHLEN + MLEN) /* smallest amount to put in cluster */
+extern int minclsize;
+#define MINCLSIZE minclsize /* smallest amount to put in cluster */
#define M_MAXCOMPRESS (MHLEN / 2) /* max amount to copy for compression */
/*
Index: sys/sysctl.h
===================================================================
RCS file: /cdrom/CVSROOT/src/sys/sys/sysctl.h,v
retrieving revision 1.48.2.2
diff -u -r1.48.2.2 sysctl.h
--- sysctl.h 1997/08/30 14:08:56 1.48.2.2
+++ sysctl.h 1998/12/17 01:39:58
@@ -231,6 +231,7 @@
#define KERN_PS_STRINGS 32 /* int: address of PS_STRINGS */
#define KERN_USRSTACK 33 /* int: address of USRSTACK */
#define KERN_MAXID 34 /* number of valid kern ids */
+#define KERN_MINCLSIZE 35 /* minimum size for mbuf cluster */
#define CTL_KERN_NAMES { \
{ 0, 0 }, \
@@ -267,6 +268,7 @@
{ "maxsockbuf", CTLTYPE_INT }, \
{ "ps_strings", CTLTYPE_INT }, \
{ "usrstack", CTLTYPE_INT }, \
+ { "minclsize", CTLTYPE_INT }, \
}
/*
Index: kern/uipc_socket.c
===================================================================
RCS file: /cdrom/CVSROOT/src/sys/kern/uipc_socket.c,v
retrieving revision 1.20.2.5
diff -u -r1.20.2.5 uipc_socket.c
--- uipc_socket.c 1998/03/02 07:58:12 1.20.2.5
+++ uipc_socket.c 1998/12/17 01:40:26
@@ -53,6 +53,9 @@
static int somaxconn = SOMAXCONN;
SYSCTL_INT(_kern, KERN_SOMAXCONN, somaxconn, CTLFLAG_RW, &somaxconn, 0, "");
+int minclsize = (MHLEN + MLEN);
+SYSCTL_INT(_kern, KERN_MINCLSIZE, minclsize, CTLFLAG_RW, &minclsize, 0, "");
+
/*
* Socket operation routines.
* These routines are called by the routines in
