From owner-freebsd-hackers Mon Jan 1 09:10:21 1996
Return-Path: owner-hackers
Received: (from root@localhost)
          by freefall.freebsd.org (8.7.3/8.7.3) id JAA27739
          for hackers-outgoing; Mon, 1 Jan 1996 09:10:21 -0800 (PST)
Received: from asstdc.scgt.oz.au (root@asstdc.scgt.oz.au [202.14.234.65])
          by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id JAA27731
          for ; Mon, 1 Jan 1996 09:10:16 -0800 (PST)
Received: (from imb@localhost)
          by asstdc.scgt.oz.au (8.6.12/BSD4.4) id EAA15041;
          Tue, 2 Jan 1996 04:10:07 +1100
From: michael butler
Message-Id: <199601011710.EAA15041@asstdc.scgt.oz.au>
Subject: Re: 2.1 instabilities
To: bde@zeta.org.au (Bruce Evans)
Date: Tue, 2 Jan 1996 04:10:03 +1100 (EST)
Cc: hackers@freebsd.org
In-Reply-To: <199512310731.SAA22598@godzilla.zeta.org.au>
             from "Bruce Evans" at Dec 31, 95 06:31:34 pm
X-Mailer: ELM [version 2.4 PL24beta]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-hackers@freebsd.org
Precedence: bulk

> >Two of them occasionally stop dead whilst under heavy ppp load. Both are
> >using kernel-based ppp. One of them simply stops blinking his cursor and
> >simply goes to sleep. No keyboard response, nothing :-(. Very rarely, it
> >will just spontaneously reboot (which I'd actually prefer as it's 4km
> >away).

> Try the following fix from -current:

[ .. /sys/i386/include/spl.h patch .. ]

This seems to have done the trick. It's been up and running since I
recompiled it and hasn't missed a beat. If it stays up for the remainder of
the week (which would be a minor miracle :-)), I'd guess that this patch is
a fairly conclusive fix.

> >The other, the only other under such a heavy load, stops forwarding IP
> >packets and a ping (from the host itself) to any one of the remote users
> >returns a "cannot write, no buffers available" error. The mbuf cluster
> >count is <100 although there are usually somewhere around 100-300 mbufs
> >allocated to data (load dependent). Killing any pppd will solve the
> >problem until the next recurrence.

> The fix is less likely to help here.

[ .. /sys/kern/tty.c patch .. ]

As it turns out, the problem appears to be one of those nasty modem
incompatibilities with V34. For whatever reason, the receiving end (an
Avtek) decides that it doesn't want to talk any more and a whole bunch of
packets destined for that system rapidly queue up and fill all the
available mbufs thereby killing almost all network activity from the sender
not just the stalled PPP connection. Switching the modem that serves it
(from Microcom to Hayes) seems to avoid the problem.

Perhaps there should be some defensive code to prevent one PPP link from
monopolising mbufs like this ? As soon as a link reaches a point where no
more mbufs can be allocated (failure on queue attempts), it should start
simply dropping the oldest packets (and freeing mbufs) to ensure some
availability for other activity,

	michael
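
To make the drop-oldest suggestion concrete, here is a rough user-space
sketch of the policy. It is a toy simulation, not kernel code and not a
patch against the ppp driver. All names (BufferPool, Link, shed, enqueue)
are invented for illustration, and it assumes one packet consumes exactly
one buffer, which is not how real mbuf chains behave.

```python
from collections import deque


class BufferPool:
    """Fixed-size pool shared by every link (a stand-in for mbufs)."""

    def __init__(self, size):
        self.free = size

    def alloc(self):
        # Return True and take one buffer, or False if the pool is empty.
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def release(self):
        self.free += 1


class Link:
    """Outbound packet queue for one hypothetical PPP link."""

    def __init__(self, pool, shed=4):
        self.pool = pool
        self.shed = shed          # assumed: how many old packets to drop at once
        self.queue = deque()
        self.dropped = 0

    def shed_oldest(self, count):
        # Drop up to `count` of this link's oldest packets, freeing buffers.
        for _ in range(min(count, len(self.queue))):
            self.queue.popleft()
            self.pool.release()
            self.dropped += 1

    def enqueue(self, packet):
        # On allocation failure, give back some of our own buffers and retry.
        if not self.pool.alloc():
            self.shed_oldest(self.shed)
            if not self.pool.alloc():
                self.dropped += 1   # nothing of ours to free; drop new packet
                return False
        self.queue.append(packet)
        return True


if __name__ == "__main__":
    pool = BufferPool(8)
    stalled = Link(pool)          # remote modem has stopped talking
    other = Link(pool)            # everything else on the box
    for n in range(18):
        stalled.enqueue(n)        # never transmitted, so the queue only grows
    print("stalled queue:", list(stalled.queue), "dropped:", stalled.dropped)
    print("other link can still queue:", other.enqueue("ping"))
```

Shedding several packets at a time, instead of one, is what leaves spare
buffers in the pool for other links. It only improves the odds, though: a
link with an empty queue can still be refused at the instant the stalled
link has just refilled the pool, so a per-link cap on queued buffers would
probably be needed as well.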