Date: Mon, 24 Jan 2005 18:20:25 +0800
From: Ganbold <ganbold@micom.mng.net>
To: Robert Watson <rwatson@freebsd.org>
Cc: freebsd-current@freebsd.org
Subject: Re: fxp0: device timed out problem
Message-ID: <6.2.0.14.2.20050124181259.03419040@202.179.0.80>
In-Reply-To: <Pine.NEB.3.96L.1050124093419.63183A-100000@fledge.watson.org>
References: <6.2.0.14.2.20050124113106.03402770@202.179.0.80> <Pine.NEB.3.96L.1050124093419.63183A-100000@fledge.watson.org>

At 05:53 PM 1/24/2005, you wrote:
>On Mon, 24 Jan 2005, Ganbold wrote:
>
>Any luck with disabling ACPI? In particular, are the interrupt
>assignments substantially different between booting with ACPI and without?
>You can probably just diff -u the old dmesg.boot and the new one...

I didn't try disabling ACPI.

> > >Usually a device timed out error is related to interrupts from the device
> > >not being delivered, being delivered improperly, etc. Does your dmesg
> > >contain any references to interrupt storms? Once the above message has
> > >printed, do you see any further interrupts on the fxp interrupt source
> > >when checking intermittently with "systat -vmstat 1" or "vmstat -i"?
> >
> > I couldn't check the system by issuing those commands. Following is the
> > dmesg output with debug.mpsafenet disabled:
>
>Couldn't as in, not possible for administrative reasons, because you
>couldn't log in once the failure occurred so couldn't get the output, or
>because they don't work, or...? Just want to make sure I understand if
>this is an administrative issue or symptomatic.

Sorry for my poor explanation. Actually I didn't try these commands.

> > I didn't do much investigation on those servers that time. However
> > without debug.mpsafenet, servers are working fine for more than 3 weeks.
>
>That is certainly suggestive -- I wonder if we're looking at a locking bug
>in fxp0 involving serialization with the hardware. However, it's not
>conclusive, I think -- when running MPSAFE, the timing is quite different
>on UP as well as SMP hardware, which could trigger other existing bugs.
>The big open question, I think, is whether an interrupt delivery problem
>is involved.

Probably I have to enable debug.mpsafenet in one of the servers and
experiment disabling ACPI and checking interrupt source when device
times out. I will let you know.

thanks,
Ganbold

>Robert N M Watson
>
>
>_______________________________________________
>freebsd-current@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-current
>To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"