Date:      Fri, 19 May 2006 17:27:55 +0300
From:      Andriy Gapon <avg@icyb.net.ua>
To:        freebsd-stable@freebsd.org, freebsd-acpi@freebsd.org, freebsd-hardware@freebsd.org
Subject:   5.4=>6.1 regression: nforce2 vs. APIC [+fix]
Message-ID:  <446DD5EB.6030300@icyb.net.ua>

[Disclaimer, just in case: I do mean APIC, not ACPI]

This is a good lesson for me about not trying any RCs or BETAs in due time.

Short description of my system: nforce2 based motherboard NF-7 v2 with
the latest BIOS (v2.7), CPU is Athlon XP.

After upgrading from 5.4 to 6.1 I started to experience complete system
freezes a short time after each boot. This was 100% reproducible; the time
before lockup varied from several seconds to several minutes.

I had already seen freezes on this system, with different symptoms, under
5.2.1 with APIC enabled:
http://lists.freebsd.org/pipermail/freebsd-questions/2004-September/058392.html
These freezes were fixed in either 5.3 or 5.4 (I don't remember
precisely now), and I had APIC enabled in the kernel and BIOS for a long
time after that.
(Just in case: I did have interrupts > 15 all that time.)

So I went and disabled APIC in the BIOS and the freezes went away.
I am not sure exactly why, but I wanted my APIC back. So I googled up a
lot of information about nforce2+APIC, nforce2+Linux and APIC+FreeBSD.
Here's a brief summary of my findings:

1. apparently on FreeBSD 5.4 APIC works in mixed mode, system uses IRQ0
timer and everything is OK (for reasons not clear to me).

2. apparently linux 2.4.* works similarly but had or has some problems
with nforce2 because almost all BIOSes (MADTs) on almost all
nforce2-based MBs (save for some Shuttles) have bogus IRQ0->PIN2
override and that screwed something in linux. This might be (have been)
causing problems for some FreeBSD users, but not for me, not my MB.

3. apparently linux 2.6.* uses the LAPIC timer similarly to FreeBSD 6.1, but
people still experienced or experience hard freezes when they have all
of the following 3 enabled: LAPIC timer, APIC and the "Disconnect CPU on C1"
chipset feature. The latter is enabled either through a BIOS setting or
through programs like fvcool.

I verified that 6.1 with APIC enabled indeed works well if I disable
"C1 disconnect". But the CPU temperature went up as well, so I wanted
my "C1 disconnect" back :)

After fruitlessly trying to hack the sources to disable the LAPIC timer, go
back to the IRQ0 timer and make this portion of the kernel behave as it did
in 5.4 (this is a long and uninteresting story), I finally found a very
useful piece of information from within nVidia itself:
http://lkml.org/lkml/2004/5/3/157

Based on that info and the linux patch in that thread I came up with the
following PCI fixup. I have now been running 6.1 with both APIC and "C1
disconnect" enabled for 2 days without any problems.

--- sys/dev/pci/fixup_pci.c.orig	Wed May 17 21:08:47 2006
+++ sys/dev/pci/fixup_pci.c	Thu May 18 16:42:53 2006
@@ -51,6 +51,7 @@

 static int	fixup_pci_probe(device_t dev);
 static void	fixwsc_natoma(device_t dev);
+static void	fixc1_nforce2(device_t dev);

 static device_method_t fixup_pci_methods[] = {
     /* Device interface */
@@ -76,6 +77,9 @@
     case 0x12378086:		/* Intel 82440FX (Natoma) */
 	fixwsc_natoma(dev);
 	break;
+    case 0x01e010de:		/* nVidia nforce2 */
+	fixc1_nforce2(dev);
+	break;
     }
     return(ENXIO);
 }
@@ -99,4 +103,18 @@
 	pci_write_config(dev, 0x50, pmccfg, 2);
     }
 #endif
+}
+
+/*
+ * See: http://lkml.org/lkml/2004/5/3/157
+ */
+static void
+fixc1_nforce2(device_t dev)
+{
+    uint32_t	val;
+
+    val = pci_read_config(dev, 0x6c, 4);
+    val &= 0xfff1ffff;
+    pci_write_config(dev, 0x6c, val, 4);
+    printf("fixup from nforce2 C1 CPU disconnect hangs\n");
 }
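
For anyone who wants to check the two constants in the patch above, here
is a minimal sketch in plain userland C. It only illustrates the
arithmetic and does not touch hardware. The helper names are hypothetical
(they are not part of the patch or of the kernel), and it assumes the
switch in fixup_pci_probe() is on the combined 32-bit ID as returned by
pci_get_devid(), i.e. device ID in the upper 16 bits and vendor ID in the
lower 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mask taken from the patch: AND-ing with it clears bits 17..19. */
#define NFORCE2_C1_MASK 0xfff1ffffu

/* Hypothetical helper: apply the patch's mask to a 32-bit register value. */
uint32_t
nforce2_c1_fix(uint32_t val)
{
    return (val & NFORCE2_C1_MASK);
}

/* Hypothetical helper: confirm the mask is the complement of bits 17..19. */
void
nforce2_c1_check_mask(void)
{
    assert(NFORCE2_C1_MASK == (uint32_t)~(UINT32_C(7) << 17));
}

/* Hypothetical helper: vendor ID is the low 16 bits of the combined ID. */
uint16_t
pci_id_vendor(uint32_t devid)
{
    return ((uint16_t)(devid & 0xffffu));
}

/* Hypothetical helper: device ID is the high 16 bits of the combined ID. */
uint16_t
pci_id_device(uint32_t devid)
{
    return ((uint16_t)(devid >> 16));
}
```

With these, pci_id_vendor(0x01e010de) gives 0x10de and
pci_id_device(0x01e010de) gives 0x01e0. For a made-up sample register
value of 0x1f0fff01 (chosen for illustration, not read from my hardware),
nforce2_c1_fix() returns 0x1f01ff01: only bits 17..19 change and every
other bit is preserved.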

-- 
Andriy Gapon


