From owner-freebsd-hackers  Mon Jan  1 09:10:21 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id JAA27739
          for hackers-outgoing; Mon, 1 Jan 1996 09:10:21 -0800 (PST)
Received: from asstdc.scgt.oz.au (root@asstdc.scgt.oz.au [202.14.234.65])
          by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id JAA27731
          for <hackers@freebsd.org>; Mon, 1 Jan 1996 09:10:16 -0800 (PST)
Received: (from imb@localhost)
	by asstdc.scgt.oz.au (8.6.12/BSD4.4)
	id EAA15041; Tue, 2 Jan 1996 04:10:07 +1100
From: michael butler <imb@scgt.oz.au>
Message-Id: <199601011710.EAA15041@asstdc.scgt.oz.au>
Subject: Re: 2.1 instabilities
To: bde@zeta.org.au (Bruce Evans)
Date: Tue, 2 Jan 1996 04:10:03 +1100 (EST)
Cc: hackers@freebsd.org
In-Reply-To: <199512310731.SAA22598@godzilla.zeta.org.au> from "Bruce Evans" at Dec 31, 95 06:31:34 pm
X-Mailer: ELM [version 2.4 PL24beta]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@freebsd.org
Precedence: bulk

> >Two of them occasionally stop dead whilst under heavy ppp load. Both are
> >using kernel-based ppp. One of them simply stops blinking his cursor and
> >simply goes to sleep. No keyboard response, nothing :-(. Very rarely, it
> >will just spontaneously reboot (which I'd actually prefer as it's 4km
> >away).
 
> Try the following fix from -current:

 [ .. /sys/i386/include/spl.h patch .. ]

This seems to have done the trick. It's been up and running since I
recompiled it and hasn't missed a beat. If it stays up for the remainder of
the week (which would be a minor miracle :-)), I'd guess that this patch is
a fairly conclusive fix.

> >The other, the only other under such a heavy load, stops forwarding IP
> >packets and a ping (from the host itself) to any one of the remote users
> >returns a "cannot write, no buffers available" error. The mbuf cluster
> >count is <100 although there are usually somewhere around 100-300 mbufs
> >allocated to data (load dependent). Killing any pppd will solve the
> >problem until the next recurrence.
 
> The fix is less likely to help here.
 
 [ .. /sys/kern/tty.c patch .. ]

As it turns out, the problem appears to be one of those nasty modem
incompatibilities with V.34. For whatever reason, the receiving end (an
Avtek) decides that it doesn't want to talk any more, and a whole bunch of
packets destined for that system rapidly queue up and fill all the
available mbufs. That kills almost all network activity from the sender,
not just the stalled PPP connection. Switching the modem that serves it
(from Microcom to Hayes) seems to avoid the problem.

Perhaps there should be some defensive code to prevent one PPP link from
monopolising mbufs like this? As soon as a link reaches a point where no
more mbufs can be allocated (failure on queue attempts), it should start
simply dropping the oldest packets (and freeing mbufs) to ensure some
availability for other activity.
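
To make the idea concrete, here is a rough user-space sketch of a
drop-oldest policy. It is not taken from the kernel ppp code: struct pkt,
struct link_queue and the lq_* functions are hypothetical names invented
for illustration, and malloc/free stand in for mbuf allocation. It also
assumes a fixed per-link cap as the trigger, rather than an actual mbuf
allocation failure as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued packet; not the kernel's struct mbuf. */
struct pkt {
    struct pkt *next;
    int id;
};

/* Hypothetical per-link output queue with a hard cap on queued packets. */
struct link_queue {
    struct pkt *head;       /* oldest packet */
    struct pkt *tail;       /* newest packet */
    size_t len;             /* packets currently queued */
    size_t maxlen;          /* cap for this link */
    unsigned long drops;    /* packets discarded so far */
};

void lq_init(struct link_queue *q, size_t maxlen)
{
    q->head = q->tail = NULL;
    q->len = 0;
    q->maxlen = maxlen;
    q->drops = 0;
}

/* Allocate a packet; returns NULL if memory is exhausted. */
struct pkt *pkt_new(int id)
{
    struct pkt *p = malloc(sizeof(*p));
    if (p != NULL) {
        p->next = NULL;
        p->id = id;
    }
    return p;
}

/* Remove and return the oldest packet, or NULL if the queue is empty. */
struct pkt *lq_dequeue(struct link_queue *q)
{
    struct pkt *p = q->head;
    if (p == NULL)
        return NULL;
    q->head = p->next;
    if (q->head == NULL)
        q->tail = NULL;
    p->next = NULL;
    q->len--;
    return p;
}

/*
 * Queue p on the link. If the link is already at its cap, free the
 * oldest packets first so that a stalled link cannot grow without bound.
 * Takes ownership of p. Returns the number of packets dropped.
 */
int lq_enqueue_drop_oldest(struct link_queue *q, struct pkt *p)
{
    int dropped = 0;

    if (q->maxlen == 0) {           /* link may not queue anything */
        free(p);
        q->drops++;
        return 1;
    }
    while (q->len >= q->maxlen) {   /* make room by discarding the oldest */
        free(lq_dequeue(q));
        q->drops++;
        dropped++;
    }
    if (q->tail != NULL)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->len++;
    assert(q->len <= q->maxlen);
    return dropped;
}
```

With a cap of 3, queueing five packets on a link that never drains leaves
only the three newest on the queue and counts two drops, so the stalled
link holds a bounded amount of memory whatever the sender does.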

	michael