Date: Wed, 21 Jul 2010 18:44:49 +0200
From: Markus Gebert <markus.gebert@hostpoint.ch>
To: Andriy Gapon <avg@icyb.net.ua>
Cc: freebsd-stable@freebsd.org, John Baldwin <jhb@freebsd.org>
Subject: Re: 8.1-RC2 MCE caused by some LAPIC/clock changes?
Message-ID: <BB90561D-87E3-4732-BC94-E702C64A1B32@hostpoint.ch>
In-Reply-To: <4C46E9E5.8000204@icyb.net.ua>
References: <6B57591F-9FA2-45EB-825F-1DB025C0635D@hostpoint.ch>
    <9DCFE2F6-D7CB-49CB-8EBC-06C1E5EBB727@hostpoint.ch>
    <F744F475-3D2B-4BC6-856A-A5D302AA8681@hostpoint.ch>
    <201007201559.45081.jhb@freebsd.org>
    <6781BC8B-51E0-4F8B-9307-9C062DE70C21@hostpoint.ch>
    <4C46B0C6.4020400@icyb.net.ua>
    <5CABE3EC-1EE7-4B6B-85EA-70AA2A107948@hostpoint.ch>
    <4C46E9E5.8000204@icyb.net.ua>
On 21.07.2010, at 14:36, Andriy Gapon wrote:

> on 21/07/2010 15:25 Markus Gebert said the following:
>> On 21.07.2010, at 10:33, Andriy Gapon wrote:
>>
>>> on 21/07/2010 03:57 Markus Gebert said the following:
>>>> Another thing though: Today I compared verbose boot output from 8-stable
>>>> and the current box. I saw that the ioapic sets up IRQ routing differently
>>>> on these two systems although the hardware is the same. This seemed not so
>>>> interesting at first, but then I noticed that 8-stable sets up two routes
>>>> (to lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current
>>>> only uses one route (to lapic0).
>>> My understanding is that it's not "two routes", but re-routing. During early
>>> boot all interrupts are bound to BSP; later, when APs become online, the
>>> interrupts are re-distributed among available CPUs.
>>
>> I guess you're right, misinterpretation on my side. Thanks for clarifying this.
>>
>>
>> Now being aware of this, it seems to me that in the machdep.lapic_allclocks=0
>> case, there might just be more interrupts to be assigned/routed due to "more
>> clocks being used". If that's true, maybe it's just "luck" that in this case
>> the mpt interrupt gets assigned to lapic0/cpu0 and the box runs fine. I'm just
>> guessing though, since I have no clue how interrupts are assigned to lapics
>> exactly (round-robin? some logic?).
>
> Yes, round-robin, for interrupts that are not explicitly bound to specific CPUs.
> The process is deterministic, but hard to predict indeed.

I see.

>>>> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave
>>>> like the one running current. Indeed, this seems to have changed IRQ58 to
>>>> be routed to lapic0 only. And the box was running for hours without showing
>>>> the symptoms.
>>>>
>>>> I just checked boot verbose output of my 8-stable box again (booted with
>>>> machdep.lapic_allclocks=0 as mentioned above). And now it seems to have set
>>>> up IRQ routes just like the current box (one route for IRQ58 to lapic0).
>>> Not sure how to interpret this properly. One possibility is a hardware
>>> problem where interrupt message route between ioapic2 and CPU to which lapic3
>>> belongs is flaky. Perhaps, this might be a FreeBSD problem: it could be that
>>> the system somehow tells to not set up such routes, but we don't listen. But
>>> this is far fetched.
>>
>>
>> I'm not sure either. If my "theory" above proved to be true, it would have been
>> just luck, that 6.x and 7.x (and current) run just fine on the X4100M2. A
>> (short) test on Ubuntu didn't trigger the problem, so the Linux kernel is
>> either lucky too by selecting an interrupt route that is "not flaky", or
>> there's indeed some way to figure out not to use some lapics for some
>> interrupts. Or we didn't test Linux thoroughly enough.
>
> Yep, it would be interesting to see how interrupts were distributed among CPUs on
> that Linux.

Well, I can't provide this kind of information about _that_ Ubuntu Linux
right now, because it was wiped from the second test machine to test
current. But we have a few production X4100M2 machines running Debian,
and there it looks like this:

----
# uname -a
Linux XX 2.6.26-2-amd64 #1 SMP Tue Mar 9 22:29:32 UTC 2010 x86_64 GNU/Linux

# cat /proc/interrupts
             CPU0        CPU1        CPU2        CPU3
   0:          36           0           0           1   IO-APIC-edge      timer
   1:           0           0           0           2   IO-APIC-edge      i8042
   7:           1           0           0           0   IO-APIC-edge
   8:           0           0           0           1   IO-APIC-edge      rtc0
   9:           0           0           0           0   IO-APIC-fasteoi   acpi
  12:           0           0           0           4   IO-APIC-edge      i8042
  14:           0           0           0          74   IO-APIC-edge      ide0
  21:           0           0           0           2   IO-APIC-fasteoi   ehci_hcd:usb2
  22:           0           0           1          31   IO-APIC-fasteoi   ohci_hcd:usb1
  56:       52836   302759221         129       50868   IO-APIC-fasteoi   eth2
  57:      288921  1070387307         225       98210   IO-APIC-fasteoi   eth3
1271:       92146    45282139           9        4885   PCI-MSI-edge      ioc0
 NMI:           0           0           0           0   Non-maskable interrupts
 LOC:   258132347   312890202   166484456   147070084   Local timer interrupts
 RES:   118623017    84540907   100591028   107693244   Rescheduling interrupts
 CAL:      108384       89281      110429      104206   function call interrupts
 TLB:    14719843    24105630    12456528    18955140   TLB shootdowns
 TRM:           0           0           0           0   Thermal event interrupts
 THR:           0           0           0           0   Threshold APIC interrupts
 SPU:           0           0           0           0   Spurious interrupts
 ERR:           1
----

Not sure how to interpret this. At first sight no IRQ58, but I guess
they might be using MSI for mpt, which might avoid the problem entirely.


Markus
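
A footnote on the round-robin point discussed above: the toy model below is only a sketch of why a deterministic round-robin can still look like "luck". It is not the FreeBSD kernel's actual assignment code. The four-CPU layout and every IRQ number other than 58 are invented for illustration. It shows that registering two extra interrupt sources ahead of IRQ58 is enough to move it to a different CPU.

```python
from itertools import cycle


def assign_round_robin(irqs, cpus):
    """Hand each IRQ the next CPU in a fixed rotation (toy model only)."""
    rotation = cycle(cpus)
    return {irq: next(rotation) for irq in irqs}


cpus = [0, 1, 2, 3]  # assumption: four CPUs, one lapic each

# Hypothetical registration orders; only IRQ 58 comes from the thread.
fewer_sources = [9, 14, 21, 22, 56, 57, 58]
more_sources = [0, 8, 9, 14, 21, 22, 56, 57, 58]  # two extra clock IRQs first

print("IRQ58 with fewer sources -> cpu", assign_round_robin(fewer_sources, cpus)[58])
print("IRQ58 with more sources  -> cpu", assign_round_robin(more_sources, cpus)[58])
```

In this model IRQ58 lands on cpu2 in the first case and on cpu0 in the second, purely because of how many sources were handed out before it. Whether that is what happens on the X4100M2 remains a guess.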