From owner-freebsd-stable@FreeBSD.ORG Fri Feb 6 23:10:20 2004
Date: Fri, 6 Feb 2004 23:10:11 -0800 (PST)
From: Doug White
To: Elliot Moore
cc: freebsd-stable@freebsd.org
Subject: Re: FreeBSD4.9 - panic: timeout table full
In-Reply-To: <481C8DB1-591D-11D8-8420-000A95765552@devnull.org.uk>
Message-ID: <20040206230049.W20729@carver.gumbysoft.com>
References: <481C8DB1-591D-11D8-8420-000A95765552@devnull.org.uk>
List-Id: Production branch of FreeBSD source code

On Sat, 7 Feb 2004, Elliot Moore wrote:

> Hello all,
>
> I have a repetitive kernel panic on FreeBSD-4.9 [fresh installed from
> CD - no CVS upgrades]
>
> =========================
> panic: timeout table full

Hm, haven't seen this one. Looking at your config, you may be overtuning
by cranking up maxusers that high. I suggest leaving it at 0 and letting
the system autotune. I'd also suggest not changing NMBCLUSTERS unless you
have a specific reason to do so.
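For illustration only, the relevant part of a kernel config that follows
this advice might look like the excerpt below. It is a sketch in 4.x
config syntax, not a complete config; the commented-out options line and
its placeholder value are hypothetical and only mark where an override
would otherwise go.

```
# Kernel config excerpt (sketch only, not a complete config)
maxusers	0		# 0 = let the system autotune

# No NMBCLUSTERS override here; add one only if you have a
# specific reason to do so.
#options 	NMBCLUSTERS=<value>
```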

> * [Q] ??: either the number of free ncallouts is depleating over time
> or something has stopped responding, causing a rapid increase in the
> number of timeouts called or something has stopped clearing its timeout
> handles - a bad driver?

Could be, or a stuck loop somewhere. Unfortunately, you'd need to be
watching things when it goes off to see if there are any more kernel
messages, or if a disk is flipping out, or something like that.

> * [Q] Does somebody know of a method to ask the kernel how many
> timeouts are assigned and what called them?

You could attach gdb to /dev/kmem and poke around, although that gets
tricky, and unless you know your way around you won't have much luck.

> To be able to find out how many are left/being used and therefore
> workout the rate of depletion would be helpful in debugging - AND to
> 'throw in the towel' and reboot safely before it dies!
> Can this be done? [some inquiry code or a kernel patch]
> Is there something already in FreeBSD that can do this?

In 5.x there is the KTR mechanism, which can record various kernel
events. This isn't available in 4.x, however.

> The only quirk i see at boot is this in dmesg:
> pci0: (vendor=0x8086, dev=0x24c3) at 31.3 irq 7

This is an SMBus controller; if you compile in the intpm driver it should
get picked up. Not critical to system operation, however.

> And sometimes (note: not all the time) this message after boot or
> midway thru the day:
> stray irq 7
>
> * [Q] This unknown card at irq7 I imagine from vendor this is the
> onboard Intel SMBus/I2C bridge. Could this play a part in this timeout
> panic?

Doubtful; irq 7 is a junk irq that various things can trigger. Stuck
interrupts don't schedule callouts.

> * [Q] is my kernel config at fault? (though GENERIC still paniced)

Good to know that GENERIC also had the problem.
I'd stick with GENERIC for now unless you have need of a custom driver or
configuration; easier for the rest of us to debug against :)

It's possible that your disk is flaking out and not accepting commands, or
has some other sort of failure that causes the ata driver to malfunction.
Have you tried replacing the disk?

> * [Q] I have a 70 gig UFS+S filesystem (27067418 used inodes) is it
> normal for it to take an hour to fsck after the panic?

An hour would be a very long time.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org