From owner-freebsd-stable@FreeBSD.ORG Mon May 16 12:40:07 2005
Message-ID: <001801c55a14$609720d0$37cba1cd@emerytelcom.com>
From: "Elliot Finley"
Date: Mon, 16 May 2005 06:40:01 -0600
Organization: Emery Telcom
cc: sos@freebsd.org
Subject: 5.4-RC2 freezing - ATA related?
Reply-To: Elliot Finley
List-Id: Production branch of FreeBSD source code

This has been happening since 5.3-R. I've been tuning different parameters to no avail. I've taken the disks off the onboard ICH5 controller and put them on a Promise TX4 S150 controller, but the same thing still happens.

The system freezes, but isn't totally dead. It still responds to pings and the screensaver still functions, but it won't respond to Ctrl-Alt-Del (CAD) at the console. If I press 'Enter' at the console, it gives me a 'login:' prompt, but after I enter the username, it never comes back with the 'password:' prompt.

After manually resetting the system, it boots, says 'Automatic file system check failed; help!' and drops into single-user mode. Running fsck manually corrects errors on all volumes. After that, the system boots.

This seems to be triggered by the daily periodic run, as it happens at 3:02-3:03 AM each time. But it doesn't happen *every* morning.

I suspect a bug in FreeBSD because this mode of failure happens on 3 different machines, all configured similarly:

ASUS P4P800
2G RAM (though the other affected systems only have 1G)
80G Seagate Barracuda SATA drives (one system now on Promise TX4 S150 controller, others on onboard ICH5)

On my lightly loaded systems, it happens rarely. On my mailserver (fairly heavy disk load), it happens quite frequently.

How can I troubleshoot this?

dmesg follows:

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.4-RC2 #2: Wed Apr 13 17:35:20 MDT 2005
    root@postmaster.etv.net:/usr/obj/usr/src/sys/Postmaster
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.60GHz (2605.92-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory  = 2146631680 (2047 MB)
avail memory = 2095153152 (1998 MB)
ACPI APIC Table:
ioapic0 irqs 0-23 on motherboard
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xf8000000-0xfbffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
uhci0: port 0xef00-0xef1f irq 16 at device 29.0 on pci0
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xef20-0xef3f irq 19 at device 29.1 on pci0
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xef40-0xef5f irq 18 at device 29.2 on pci0
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: port 0xef80-0xef9f irq 16 at device 29.3 on pci0
usb3: on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: at device 29.7 (no driver attached)
pcib2: at device 30.0 on pci0
pci2: on pcib2
skc0: <3Com 3C940 Gigabit Ethernet> port 0xd800-0xd8ff mem 0xfeafc000-0xfeafffff irq 22 at device 5.0 on pci2
skc0: 3Com Gigabit LOM (3C940) rev. (0x1)
sk0: on skc0
sk0: Ethernet address: 00:0c:6e:54:4b:19
miibus0: on sk0
e1000phy0: on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
atapci0: port 0xdc00-0xdc7f,0xdfa0-0xdfaf,0xdf00-0xdf3f mem 0xfeac0000-0xfeadffff,0xfeafb000-0xfeafbfff irq 21 at device 9.0 on pci2
atapci0: failed: rid 0x20 is memory, requested 4
ata2: channel #0 on atapci0
ata3: channel #1 on atapci0
ata4: channel #2 on atapci0
ata5: channel #3 on atapci0
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xd480-0xd4ff mem 0xfeaf9c00-0xfeaf9c7f irq 20 at device 12.0 on pci2
miibus1: on xl0
ukphy0: on miibus1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:04:75:f1:1c:7e
isab0: at device 31.0 on pci0
isa0: on isab0
atapci1: port 0xfc00-0xfc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci1
ata1: channel #1 on atapci1
pci0: at device 31.3 (no driver attached)
pci0: at device 31.5 (no driver attached)
acpi_button0: on acpi0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
psm0: irq 12 on atkbdc0
psm0: model IntelliMouse, device ID 3
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
ppc0: port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
orm0: at iomem 0xcf000-0xcf7ff,0xc8000-0xcefff,0xc0000-0xc7fff on isa0
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2605917008 Hz quality 800
Timecounters tick every 10.000 msec
acd0: CDROM at ata1-master PIO4
ad4: 76319MB [155061/16/63] at ata2-master SATA150
ad6: 76319MB [155061/16/63] at ata3-master SATA150
ad8: 117246MB [238216/16/63] at ata4-master SATA150
ad10: 76319MB [155061/16/63] at ata5-master SATA150
ar0: 76319MB [9729/255/63] status: READY subdisks:
 disk0 READY on ad4 at ata2-master
 disk1 READY on ad6 at ata3-master
ar1: 76319MB [9729/255/63] status: READY subdisks:
 disk0 READY on ad10 at ata5-master
 disk1 READY on ad8 at ata4-master
Mounting root from ufs:/dev/ar0s1a
WARNING: / was not properly dismounted