Date: Fri, 30 Mar 2007 14:42:08 +0100
From: Deomid Ryabkov <myself@rojer.pp.ru>
To: freebsd-hackers@freebsd.org
Subject: 6.2: reproducible hang on amd64, traced to 24h of commits
Message-ID: <460D13B0.5070500@rojer.pp.ru>
ok, now that the machine has been up for 10 days, i am reasonably sure i've closed in on this one.

back in january i cvsupped to -STABLE and the box (dual head opteron box) started hanging. and i mean it dies completely. i have all debug options and a working serial console, but still it just dies, and both the serial and system consoles are unresponsive. no panic message on either, nothing. pretty sad.

the kernel config is vanilla SMP GENERIC, with all the debug options i could think of enabled (after it started hanging).

so the first thing i did after rebooting the box a couple of times was fall back to kernel.old (6.1-STABLE circa august '06). no hangs. i then started incrementally updating, gradually getting closer to jan 22.

long story short, i seem to have isolated the problem to commits made between date=2006.12.28.00.00.00 and date=2006.12.29.00.00.00. the last hang i had was when running the 12/29 kernel; now it's 12/28 and the box has been up for 2 weeks already. based on previous experience i'm pretty certain that this is it: with a bad kernel the box would never stay up more than a few days, never more than 5.

between 12/28 and 12/29 i see some changes to /sys/amd64/ and /sys/pci/, which might've been the cause. i will probably start looking into individual changes, but if anyone more experienced than me could take a look, it'd be appreciated. i am willing to try patches.

i confirmed that recent (as of 3 weeks or so) -STABLE still has this problem.

thanks in advance.
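for anyone wanting to repeat the narrowing-down, here is a rough sketch of the date search as a bisection. it is not the exact procedure used above (the message only says the updates were incremental). the names `cvsup_date_tag`, `bisect_dates` and `is_good` are made up for illustration, the august 1 start date is an assumption standing in for "circa august '06", and `is_good` stands for the manual step of building a kernel from sources at that date and running it long enough to trust it (at least 5 days here, since a bad kernel never lasted longer than that).

```python
from datetime import date, timedelta


def cvsup_date_tag(d):
    """Format a day in the date=YYYY.MM.DD.00.00.00 form used in the message."""
    return d.strftime("date=%Y.%m.%d.00.00.00")


def bisect_dates(good, bad, is_good):
    """Narrow a (good, bad) pair of source dates down to two adjacent days.

    good    -- a date whose kernel is known not to hang
    bad     -- a later date whose kernel is known to hang
    is_good -- hypothetical callback: True if the kernel built from sources
               at that date survives the soak period, False if it hangs
    """
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_good(mid):
            good = mid   # hang was introduced after mid
        else:
            bad = mid    # hang was introduced at or before mid
    return good, bad


if __name__ == "__main__":
    # Simulated run: pretend the first hanging kernel is the 12/29 one.
    first_bad = date(2006, 12, 29)
    good, bad = bisect_dates(
        date(2006, 8, 1),    # assumed stand-in for the august '06 kernel.old
        date(2007, 1, 22),   # the -STABLE update that started hanging
        lambda d: d < first_bad,
    )
    print(cvsup_date_tag(good), cvsup_date_tag(bad))
```

with a soak time of several days per step, halving the range each time matters: roughly 170 days between august and jan 22 takes about 8 test kernels instead of dozens.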
==== files under /sys that were changed between 12/28 and 12/29:

Edit src/sys/amd64/amd64/mptable_pci.c
Edit src/sys/amd64/pci/pci_bus.c
Edit src/sys/contrib/dev/ath/public/wackelf.c
Edit src/sys/dev/acpica/acpi_pci.c
Edit src/sys/dev/acpica/acpi_pcib_acpi.c
Edit src/sys/dev/acpica/acpi_pcib_pci.c
Checkout src/sys/dev/ath/if_ath.c
Edit src/sys/dev/cardbus/cardbus.c
Edit src/sys/dev/drm/drm_agpsupport.c
Edit src/sys/dev/pci/pci.c
Edit src/sys/dev/pci/pci_if.m
Edit src/sys/dev/pci/pci_pci.c
Edit src/sys/dev/pci/pci_private.h
Edit src/sys/dev/pci/pcib_private.h
Edit src/sys/dev/pci/pcivar.h
Edit src/sys/i386/i386/mptable_pci.c
Edit src/sys/i386/pci/pci_bus.c
Edit src/sys/kern/subr_bus.c
Checkout src/sys/netgraph/ng_deflate.h
Edit src/sys/pci/agp.c
Edit src/sys/pci/agpreg.h
Edit src/sys/powerpc/ofw/ofw_pcib_pci.c
Edit src/sys/sparc64/pci/apb.c
Edit src/sys/sparc64/pci/ofw_pcib.c
Edit src/sys/sparc64/pci/ofw_pcibus.c
Edit src/sys/sys/param.h

==== kernel configuration used:

include GENERIC
options SMP
options KDB
options DDB
makeoptions DEBUG=-g
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
options DEBUG_LOCKS
options DEBUG_VFS_LOCKS
options DIAGNOSTIC

====

--
Deomid Ryabkov aka Rojer
myself@rojer.pp.ru
rojer@sysadmins.ru
ICQ: 8025844