Date: Thu, 18 Mar 2004 12:56:32 -0600
From: "Douglas K. Rand" <rand@meridian-enviro.com>
To: freebsd-hardware@freebsd.org
Subject: System Freezing
Message-ID: <87fzc6gf1b.wl@delta.meridian-enviro.com>

I'm having what is probably a hardware problem on a system that hangs every 6-36 hours, and I'm wondering if anybody has any ideas for things I could try.

It's a RELENG_4_8 system with the DDB, DDB_UNATTENDED, and ALT_BREAK_TO_DEBUGGER kernel options set. (It's on a serial console; that's why the ALT_BREAK_TO_DEBUGGER option.) It's an Athlon 3200+ on a Gigabyte GA-7N400-L mobo, with two 512MB PC3200 DDR DIMMs, a 2-port 3ware controller, and 2 Deskstar 180 GXP disks. The power supply is an Antec TruePower 380W.

The system ran perfectly for about 60 days, and then started having this problem. In almost all cases the system simply hangs: there is no response from the console or network, and the CR ~ ^B sequence will not get me to the kernel debugger. (I've tested this when the system is running fine and I do get the kernel debugger.) The only solution is to reset or power cycle the system. It has crashed 3 times with a "Fatal trap 12: page fault while in kernel mode" panic, and one time it simply rebooted as if someone had pressed the reset button. But it has simply hung 18 times.

I've tried running with only one DIMM, and when the system died 3 times with that DIMM, I tried running with only the other DIMM, and it still dies. I've replaced the power supply with an Antec 400W, and the system still dies. I even replaced the power cord. I've tried both the stock 4.8 twe driver and 3ware's beta driver; both still die. I replaced the onboard NIC with an Intel EtherExpress Pro, and the system still dies.

I don't think it's temperature related: I've run the system with the case open and on its side, and a continuous mbmon output shows no temperature increases just before the system hangs. A representative output from mbmon is:

  Temp.= 75.2, 113.0, 86.0; Rot.= 4821, 2636, 0
  Vcore = 1.70, 2.74; Volt. = 3.31, 4.14, 11.55, -5.29, -2.05

I've got a ThermalTake Volcano 11+ cooler on the CPU.
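
The "no temperature increase just before the hang" check could be run against a saved mbmon log with something like the sketch below. It is only an illustration: it assumes the log contains `Temp.=` lines in the format quoted above and that the readings are in Fahrenheit (the sample values convert to whole degrees Celsius, which suggests this but does not confirm it). The function names are hypothetical and not part of mbmon.

```python
import re

# Matches the "Temp.=" line as quoted in the message: three comma-separated readings.
TEMP_RE = re.compile(r"Temp\.=\s*([-\d.]+),\s*([-\d.]+),\s*([-\d.]+)")


def parse_temps(line):
    """Return the three temperature readings from one mbmon line, or None."""
    m = TEMP_RE.search(line)
    if m is None:
        return None
    return tuple(float(x) for x in m.groups())


def f_to_c(f):
    """Convert Fahrenheit to Celsius (ASSUMPTION: the log is in Fahrenheit)."""
    return (f - 32.0) * 5.0 / 9.0


def max_rise(samples):
    """Largest per-sensor increase between the first and last sample."""
    first, last = samples[0], samples[-1]
    return max(b - a for a, b in zip(first, last))


# The representative line from the message.
sample = "Temp.= 75.2, 113.0, 86.0; Rot.= 4821, 2636, 0"
temps_f = parse_temps(sample)
temps_c = tuple(round(f_to_c(t), 1) for t in temps_f)
print(temps_c)
```

Feeding `max_rise` the last few parsed samples before a hang gives a single number to compare across incidents. A value near zero each time would be consistent with the observation that temperature is not the trigger.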

I don't think the problems are load related, as it carries very high loads without hanging, and I've had it hang with fairly light loads. I've attached the dmesg and kernel config files.

If anybody has any suggestions I'd be thrilled. I'm up to replacing either the CPU or the mobo, neither of which I'm looking forward to.

[Attachment: dmesg]

Copyright (c) 1992-2003 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.8-RELEASE-p16 #6: Wed Mar 17 14:46:41 CST 2004
    rand@snow.meridian-enviro.com:/usr/obj/usr/src/sys/SNOW
Timecounter "i8254" frequency 1193182 Hz
Timecounter "TSC" frequency 2191242163 Hz
CPU: AMD Athlon(tm) XP 3200+ (2191.24-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x6a0 Stepping = 0
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc0400000<AMIE,DSP,3DNow!>
real memory = 536805376 (524224K bytes)
avail memory = 519462912 (507288K bytes)
Preloaded elf kernel "kernel" at 0xc02db000.
Pentium Pro MTRR support enabled
Using $PIR table, 11 entries at 0xc00fcda0
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pci0: <unknown card> (vendor=0x10de, dev=0x01eb) at 0.1
pci0: <unknown card> (vendor=0x10de, dev=0x01ee) at 0.2
pci0: <unknown card> (vendor=0x10de, dev=0x01ed) at 0.3
pci0: <unknown card> (vendor=0x10de, dev=0x01ec) at 0.4
pci0: <unknown card> (vendor=0x10de, dev=0x01ef) at 0.5
isab0: <PCI to ISA bridge (vendor=10de device=0060)> at device 1.0 on pci0
isa0: <ISA bus> on isab0
pci0: <unknown card> (vendor=0x10de, dev=0x0064) at 1.1 irq 11
pcib1: <PCI to PCI bridge (vendor=10de device=006c)> at device 8.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <3Dfx Voodoo 3 graphics accelerator> at 6.0 irq 12
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xd400-0xd43f mem 0xe7800000-0xe781ffff,0xe7821000-0xe7821fff irq 10 at device 7.0 on pci1
fxp0: Ethernet address 00:02:b3:e7:ab:6e
inphy0: <i82555 10/100 media interface> on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
twe0: <3ware Storage Controller> port 0xd800-0xd80f mem 0xe7000000-0xe77fffff,0xe7820000-0xe782000f irq 11 at device 9.0 on pci1
twe0: 2 ports, Firmware FE7X 1.05.00.050, BIOS BE7X 1.08.00.046
atapci0: <Generic PCI ATA controller> port 0xf000-0xf00f at device 9.0 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pcib2: <PCI to PCI bridge (vendor=10de device=01e8)> at device 30.0 on pci0
pci2: <PCI bus> on pcib2
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff,0xca000-0xcafff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
acd0: CDROM <TEAC CD-552E> at ata0-slave PIO4
twed0: <TwinStor, Normal> on twe0
twed0: 176699MB (361880032 sectors)
twe0: command interrupt
Mounting root from ufs:/dev/twed0s1a
WARNING: / was not properly dismounted

[Attachment: SNOW]

machine i386
cpu I686_CPU
ident SNOW
maxusers 0

options INET
options FFS
options FFS_ROOT
options SOFTUPDATES
options UFS_DIRHASH
options NFS
options COMPAT_43
options INCLUDE_CONFIG_FILE
options ICMP_BANDLIM
options MAXDSIZ="(1024*1024*1024)"
options DFLDSIZ="(1024*1024*1024)"
options DDB
options DDB_UNATTENDED
options ALT_BREAK_TO_DEBUGGER

device isa
device pci
device fdc0 at isa? port IO_FD1 irq 6 drq 2
device fd0 at fdc0 drive 0
device ata
device atadisk
device atapicd
device twe
device atkbdc0 at isa? port IO_KBD
device atkbd0 at atkbdc? irq 1 flags 0x1
device vga0 at isa?
device sc0 at isa? flags 0x100
device npx0 at nexus? port IO_NPX irq 13
device sio0 at isa? port IO_COM1 flags 0x10 irq 4
device sio1 at isa? port IO_COM2 irq 3
device miibus
device fxp
device rl

pseudo-device loop
pseudo-device ether
pseudo-device pty