From owner-freebsd-stable@FreeBSD.ORG  Fri Feb  6 23:10:20 2004
Date: Fri, 6 Feb 2004 23:10:11 -0800 (PST)
From: Doug White <dwhite@gumbysoft.com>
To: Elliot Moore <elliot@devnull.org.uk>
In-Reply-To: <481C8DB1-591D-11D8-8420-000A95765552@devnull.org.uk>
Message-ID: <20040206230049.W20729@carver.gumbysoft.com>
References: <481C8DB1-591D-11D8-8420-000A95765552@devnull.org.uk>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-stable@freebsd.org
Subject: Re: FreeBSD4.9 - panic: timeout table full

On Sat, 7 Feb 2004, Elliot Moore wrote:

> Hello all,
>
> I have a repetitive kernel panic on FreeBSD-4.9 [fresh installed from
> CD - no CVS upgrades]
>
> =========================
> panic: timeout table full

Hm, haven't seen this one.

Looking at your config, you may be overtuning by cranking maxusers up that
high. I suggest leaving it at 0 and letting the system autotune.  I'd also
suggest not changing NMBCLUSTERS unless you have a specific reason to do
so.
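
For what it's worth, here is a rough sketch of how I understand 4.x sizes
the timeout table from maxusers. The formulas are from my memory of
sys/kern/subr_param.c and are an assumption, so check them against your own
source tree. The struct and function names are made up for illustration,
and the maxusers 0 autotuning case is not modeled.

```c
#include <stdio.h>

/*
 * ASSUMED 4.x defaults (recalled from sys/kern/subr_param.c, not verified):
 *   maxproc  = 20 + 16 * maxusers
 *   maxfiles = 2 * maxproc
 *   ncallout = 16 + maxproc + maxfiles
 */
struct limits {
	int maxproc;
	int maxfiles;
	int ncallout;	/* size of the timeout (callout) table */
};

/* Hypothetical helper: derive the limits for an explicit maxusers value. */
struct limits
derive_limits(int maxusers)
{
	struct limits l;

	l.maxproc = 20 + 16 * maxusers;
	l.maxfiles = 2 * l.maxproc;
	l.ncallout = 16 + l.maxproc + l.maxfiles;
	return (l);
}

/* Print one line per maxusers value for easy comparison. */
void
print_limits(int maxusers)
{
	struct limits l = derive_limits(maxusers);

	printf("maxusers=%d maxproc=%d maxfiles=%d ncallout=%d\n",
	    maxusers, l.maxproc, l.maxfiles, l.ncallout);
}
```

If those formulas hold, derive_limits(32) gives ncallout = 1612 and
derive_limits(512) gives ncallout = 24652.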

> * [Q] ??: either the number of free ncallouts is depleting over time
> or something has stopped responding, causing a rapid increase in the
> number of timeouts called or something has stopped clearing its timeout
> handles - a bad driver?

Could be, or a stuck loop somewhere.  Unfortunately, you'd need to be
watching the console when the panic hits to see whether there are any other
kernel messages, a disk flipping out, or something like that.
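
To illustrate the "something stopped clearing its timeouts" theory, here is
a user-space toy model, not kernel code. It assumes a fixed pool of entries
handed out from a free list, which is roughly how I understand the 4.x
callout table to work. NCALLOUT, the struct names and the functions are all
hypothetical.

```c
#include <stddef.h>

#define NCALLOUT 8		/* toy table size; the real one is far larger */

struct entry {
	struct entry *next;	/* link on the free list */
};

struct pool {
	struct entry slots[NCALLOUT];
	struct entry *free_head;
	int nfree;
};

/* Put every slot on the free list. */
void
pool_init(struct pool *p)
{
	int i;

	p->free_head = NULL;
	for (i = 0; i < NCALLOUT; i++) {
		p->slots[i].next = p->free_head;
		p->free_head = &p->slots[i];
	}
	p->nfree = NCALLOUT;
}

/* Take one entry; NULL here is where the kernel would panic instead. */
struct entry *
pool_get(struct pool *p)
{
	struct entry *e = p->free_head;

	if (e == NULL)
		return (NULL);
	p->free_head = e->next;
	p->nfree--;
	return (e);
}

/* Return an entry, as a well-behaved caller does when its timeout ends. */
void
pool_put(struct pool *p, struct entry *e)
{
	e->next = p->free_head;
	p->free_head = e;
	p->nfree++;
}

/*
 * Simulate a caller that takes leaked_per_tick entries every tick and never
 * returns them. Returns the tick on which the first request fails, or -1 if
 * nothing leaks.
 */
int
ticks_until_full(int leaked_per_tick)
{
	struct pool p;
	int tick, i;

	if (leaked_per_tick <= 0)
		return (-1);
	pool_init(&p);
	for (tick = 0; ; tick++)
		for (i = 0; i < leaked_per_tick; i++)
			if (pool_get(&p) == NULL)
				return (tick);
}
```

With 8 slots and 2 leaked entries per tick, ticks_until_full(2) returns 4.
A caller that always returns what it takes never exhausts the pool.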

> * [Q] Does somebody know of a method to ask the kernel how many
> timeouts are assigned and what called them?

You could attach gdb to /dev/kmem and poke around, although that gets
tricky, and unless you know your way around the kernel's data structures
you won't have much luck.

>        To be able to find out how many are left/being used and therefore
> work out the rate of depletion would be helpful in debugging - AND to
> 'throw in the towel' and reboot safely before it dies!
> Can this be done? [some inquiry code or a kernel patch]
> Is there something already in FreeBSD that can do this?

In 5.x there is the KTR mechanism, which can record various kernel events.
This isn't available in 4.x, however.
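
If you did manage to sample the number of free entries somehow (on 4.x that
means digging it out of kernel memory yourself), estimating the time left
is simple arithmetic. A minimal sketch, assuming the depletion is roughly
linear; the function name and parameters are hypothetical:

```c
/*
 * Estimate seconds until the free count reaches zero, given two samples
 * taken elapsed_sec apart. Returns -1.0 if the count is not shrinking or
 * the interval is not positive.
 */
double
seconds_to_exhaustion(long free_then, long free_now, double elapsed_sec)
{
	double rate;

	if (elapsed_sec <= 0.0 || free_now >= free_then)
		return (-1.0);
	/* Entries lost per second over the sampling interval. */
	rate = (double)(free_then - free_now) / elapsed_sec;
	return ((double)free_now / rate);
}
```

For example, going from 1000 free entries to 900 over 60 seconds leaves
about 540 seconds at that rate, which would be the cue to schedule a clean
reboot.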

> The only quirk i see at boot is this in dmesg:
>   pci0: <unknown card> (vendor=0x8086, dev=0x24c3) at 31.3 irq 7

This is an SMBus controller; if you compile in the intpm driver it should
get picked up. It's not critical to system operation, however.

> And sometimes (note: not all the time) this message after boot or
> midway thru the day:
>   stray irq 7
>
> * [Q] This unknown card at irq7 I imagine from vendor this is the
> onboard Intel SMBus/I2C bridge. Could this play a part in this timeout
> panic?

Doubtful; irq 7 is a junk irq that various things can trigger. Stuck
interrupts don't schedule callouts.

> * [Q] is my kernel config at fault? (though GENERIC still panicked)

Good to know that GENERIC also had the problem. I'd stick with GENERIC for
now unless you have need of a custom driver or configuration; easier for
the rest of us to debug against :)

It's possible that your disk is flaking out and not accepting commands, or
has some other sort of failure that causes the ata driver to malfunction.
Have you tried replacing the disk?

> * [Q] I have a 70 gig UFS+S filesystem (27067418 used inodes) is it
> normal for it to take an hour to fsck after the panic?

An hour would be a very long time.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org