Date:      Wed, 11 Feb 2004 01:48:03 +0000
From:      Elliot Moore <elliot@devnull.org.uk>
To:        freebsd-stable@freebsd.org
Subject:   Re: FreeBSD4.9 - panic: timeout table full - UPDATE
Message-ID:  <54D68324-5C34-11D8-8420-000A95765552@devnull.org.uk>
In-Reply-To: <20040206230049.W20729@carver.gumbysoft.com>
References:  <481C8DB1-591D-11D8-8420-000A95765552@devnull.org.uk> <20040206230049.W20729@carver.gumbysoft.com>

Well, so far so good.
The node is performing well at its job and has been up for 3.5 days.

On Saturday:
  replaced the IDE cables.
  replaced the memory.
  loaded BIOS fail-safe defaults.

OK, because the node is miles away in a colo, I broke the golden rule
by changing three things at the same time, but I can test the cables and
memory in a test FreeBSD node at my leisure.

Good, it's not a FreeBSD problem (well, apart from it not warning me; I
will investigate 5.x KTR).
Somebody else may still benefit from reading this update in the
archives.
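
For the archives, here is a rough sketch of the "rate of depletion" idea
from my original questions. It is only the arithmetic: it assumes you
already have some way of sampling how many callouts are in use, which 4.x
does not provide out of the box. The function name and the sample format
are made up for illustration.

```python
def seconds_until_exhaustion(samples, table_size):
    """Estimate how long until a fixed-size table fills up.

    samples: list of (timestamp_seconds, entries_in_use), oldest first.
             Hypothetical input; how the counts are obtained is not
             covered here.
    table_size: total number of entries in the table.

    Returns the estimated seconds remaining after the newest sample, or
    None if usage is flat or falling.
    """
    (t0, used0), (t1, used1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")

    # Average growth in entries per second between first and last sample.
    rate = (used1 - used0) / (t1 - t0)
    if rate <= 0:
        return None

    return (table_size - used1) / rate
```

A monitoring script could call this periodically and schedule a clean
reboot once the estimate drops below some safety margin.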

Thanks, Doug, for your help, and a big shout-out to the FreeBSD community.

:wq ells





On 7 Feb 2004, at 07:10, Doug White wrote:

> On Sat, 7 Feb 2004, Elliot Moore wrote:
>
>> Hello all,
>>
>> I have a repetitive kernel panic on FreeBSD-4.9 [fresh installed from
>> CD - no CVS upgrades]
>>
>> =========================
>> panic: timeout table full
>
> Hm, haven't seen this one.
>
> Looking at your config, you may be overtuning by cranking up maxusers that
> high. I suggest leaving it at 0 and letting the system autotune. I'd also
> suggest not changing NMBCLUSTERS unless you have a specific reason to do so.
>
>> * [Q] ??: either the number of free ncallouts is depleting over time,
>> or something has stopped responding, causing a rapid increase in the
>> number of timeouts called, or something has stopped clearing its timeout
>> handles - a bad driver?
>
> Could be, or a stuck loop somewhere.  Unfortunately, you'd need to be
> watching things when it goes off to see if there are any more kernel
> messages, or if a disk is flipping out, or something like that.
>
>> * [Q] Does somebody know of a method to ask the kernel how many
>> timeouts are assigned and what called them?
>
> You could attach gdb to /dev/kmem and poke around, although that gets
> tricky, and unless you know your way around you won't have much luck.
>
>>        To be able to find out how many are left/being used, and therefore
>> work out the rate of depletion, would be helpful in debugging - AND to
>> 'throw in the towel' and reboot safely before it dies!
>> Can this be done? [some inquiry code or a kernel patch]
>> Is there something already in FreeBSD that can do this?
>
> In 5.x there is the KTR mechanism, which can record various kernel events.
> This isn't available in 4.x, however.
>
>> The only quirk I see at boot is this in dmesg:
>>   pci0: <unknown card> (vendor=0x8086, dev=0x24c3) at 31.3 irq 7
>
> This is an SMBus controller; if you compile in the intpm driver it should
> get picked up. Not critical to system operation, however.
>
>> And sometimes (note: not all the time) this message after boot or
>> midway through the day:
>>   stray irq 7
>>
>> * [Q] From the vendor ID, I imagine this unknown card at irq 7 is the
>> onboard Intel SMBus/I2C bridge. Could this play a part in this timeout
>> panic?
>
> Doubtful; irq 7 is a junk irq that various things can trigger. Stuck
> interrupts don't schedule callouts.
>
>> * [Q] Is my kernel config at fault? (though GENERIC still panicked)
>
> Good to know that GENERIC also had the problem. I'd stick with GENERIC for
> now unless you have need of a custom driver or configuration; easier for
> the rest of us to debug against :)
>
> It's possible that your disk is flaking out and not accepting commands, or
> has some other sort of failure that causes the ata driver to malfunction.
> Have you tried replacing the disk?
>
>> * [Q] I have a 70 gig UFS+S filesystem (27067418 used inodes); is it
>> normal for it to take an hour to fsck after the panic?
>
> An hour would be a very long time.
>
> -- 
> Doug White                    |  FreeBSD: The Power to Serve
> dwhite@gumbysoft.com          |  www.FreeBSD.org


