From owner-freebsd-bugs  Sun Jun 22 06:40:04 1997
Return-Path: <owner-bugs>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id GAA02023
          for bugs-outgoing; Sun, 22 Jun 1997 06:40:04 -0700 (PDT)
Received: (from gnats@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id GAA02000;
          Sun, 22 Jun 1997 06:40:02 -0700 (PDT)
Resent-Date: Sun, 22 Jun 1997 06:40:02 -0700 (PDT)
Resent-Message-Id: <199706221340.GAA02000@hub.freebsd.org>
Resent-From: gnats (GNATS Management)
Resent-To: freebsd-bugs
Resent-Reply-To: FreeBSD-gnats@FreeBSD.ORG, sthaug@nethelp.no
Received: from verdi.nethelp.no (verdi.nethelp.no [195.1.171.130])
          by hub.freebsd.org (8.8.5/8.8.5) with SMTP id GAA01872
          for <FreeBSD-gnats-submit@freebsd.org>; Sun, 22 Jun 1997 06:37:03 -0700 (PDT)
Received: (qmail 29392 invoked by uid 1001); 22 Jun 1997 13:36:54 +0000 (GMT)
Message-Id: <19970622133654.29391.qmail@verdi.nethelp.no>
Date: 22 Jun 1997 13:36:54 +0000 (GMT)
From: sthaug@nethelp.no
Reply-To: sthaug@nethelp.no
To: FreeBSD-gnats-submit@FreeBSD.ORG
Cc: sthaug@nethelp.no
X-Send-Pr-Version: 3.2
Subject: kern/3925: SO_SNDLOWAT of 0 causes kernel to use 99% of CPU time on TCP send
Sender: owner-bugs@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk


>Number:         3925
>Category:       kern
>Synopsis:       SO_SNDLOWAT of 0 causes kernel to use 99% of CPU time on TCP send
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jun 22 06:40:01 PDT 1997
>Last-Modified:
>Originator:     Steinar Haug
>Organization:
Nethelp Consulting
>Release:        FreeBSD 2.2-961014-SNAP i386
>Environment:

This may be a generic 4.4BSD networking bug. It applies to NetBSD-1.2 and to
FreeBSD (all versions I've checked, e.g. 2.2-BETA and 3.0-970124-SNAP). It was
discovered because BIND-8.1.1-T2B named sets SO_SNDLOWAT to 0 if SO_SNDLOWAT is
defined.

>Description:

Setting SO_SNDLOWAT to 0 and then sending data over TCP to a site that is more
than a few milliseconds away (e.g. connected over a 28,800 bps link) causes the
kernel to use 99% of the CPU time, as reported by vmstat or top. Using an
SO_SNDLOWAT > 0 makes the problem disappear.

It's quite possible that this effect *always* occurs, but it's much more visible
against a slow site.

>How-To-Repeat:

Compile the following program, which sends a number of buffers to the discard
port at a given address. Run the program with

% tstlowat 0 50 ip-address

i.e. send 50 buffers of 64 KB each to the given address with SO_SNDLOWAT set
to 0. Observe with vmstat how the kernel uses all of the CPU.

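#include <stdio.h>
#include <stdlib.h>	/* atoi, exit */
#include <string.h>	/* memset */
#include <unistd.h>	/* write, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int
main(int argc, char *argv[])
{
	struct sockaddr_in sin;
	int i, n, s, sndlowat;
	static char buf[65536];	/* zero-filled payload; contents are irrelevant */

	if (argc != 4) {
		fprintf(stderr, "usage: %s sndlowat nbufs ip-address\n", argv[0]);
		exit(1);
	}
	sndlowat = atoi(argv[1]);
	n = atoi(argv[2]);

	if ((s = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
		perror("socket"); exit(1);
	}
	/* Set the send low-water mark; a value of 0 triggers the problem. */
	if (setsockopt(s, SOL_SOCKET, SO_SNDLOWAT, (char *)&sndlowat,
		       sizeof sndlowat) < 0) {
		perror("setsockopt"); exit(1);
	}
	memset(&sin, 0, sizeof sin);	/* clear sin_zero and any other fields */
	sin.sin_family = AF_INET;
	sin.sin_port = htons(9);	/* Discard port */
	if (inet_aton(argv[3], &sin.sin_addr) == 0) {
		fprintf(stderr, "inet_aton: invalid address %s\n", argv[3]); exit(1);
	}
	if (connect(s, (struct sockaddr *)&sin, sizeof sin) < 0) {
		perror("connect"); exit(1);
	}

	for (i = 0; i < n; i++) {
		if (write(s, buf, sizeof buf) < 0) {
			perror("write"); exit(1);
		}
	}
	close(s);
	return 0;
}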


>Fix:
	
Sorry, no known fix. An easy workaround is to use an SO_SNDLOWAT value > 0.
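
The workaround can be applied in application code along the following lines.
This is only a sketch: the helper names clamp_sndlowat and set_sndlowat_clamped
are made up for illustration and are not part of BIND or of any system library,
and the choice of 1 as the minimum is an assumption based solely on the
observation that values > 0 avoid the problem.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: never return a low-water mark below 1. */
int
clamp_sndlowat(int requested)
{
	return requested < 1 ? 1 : requested;
}

/*
 * Hypothetical wrapper: set SO_SNDLOWAT on socket s, substituting 1 for
 * any requested value <= 0.  Returns the setsockopt() result unchanged
 * (0 on success, -1 with errno set on failure).
 */
int
set_sndlowat_clamped(int s, int requested)
{
	int lowat = clamp_sndlowat(requested);

	return setsockopt(s, SOL_SOCKET, SO_SNDLOWAT, (char *)&lowat,
	    sizeof lowat);
}
```

A program that would otherwise set SO_SNDLOWAT to 0 could call
set_sndlowat_clamped(s, 0) instead, which asks the kernel for a low-water
mark of 1.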

>Audit-Trail:
>Unformatted: