Date: Mon, 15 Mar 2004 23:31:47 +0200
From: Evren Yurtesen <yurtesen@ispro.net.tr>
To: Elliot Finley <lists@efinley.com>
Cc: freebsd-current@freebsd.org
Subject: Re: reliable disk FAILURE
Message-ID: <405620C3.2060900@ispro.net.tr>
In-Reply-To: <019001c40abb$ac2a9350$32cba1cd@science1>
References: <019001c40abb$ac2a9350$32cba1cd@science1>

I am using the same board (I think), and I had a similar problem. I
disabled the SATA enhanced mode in the BIOS, and that fixed all my SATA
problems. The only thing you lose is the secondary PATA controller. You
don't have any PATA drives anyway, so the CD-ROM can manage on the
primary PATA controller.

Of course there should be a permanent fix for this problem in the
source, but I am not a programmer :) I can only suggest the temporary
fix that I used. Can you please reply if this helps you?

By the way, when you change the SATA mode, the disk names also change.
I see my SATA drives as ad2 and ad3 now!

Evren

Elliot Finley wrote:
> when doing a disk-to-disk backup using dump/restore, I reliably get a disk
> failure. I don't think it's the disk because it happens on six different
> machines. All six machines are using SATA drives. They all have an ASUS
> P4P800 MB.
>
> None of the six machines had any problems until after the last security
> patch to 5.2.1. After the patch, they all fail. If I remember correctly,
> the last security patch only touched some TCP files, so the disk failures
> don't make any sense to me.
>
> commands causing the failure, console output and dmesg are below. This is
> on a test machine that I can take down or modify at any time, so if there is
> anything further that I can do to help debug this - please let me know.
>
> sequence of commands issued to cause failure
> ----------------------------------------------
> Executing command: /bin/dd if=/dev/zero of=/dev/ad14 bs=1k count=1
> Executing command: /sbin/fdisk -BI ad14
> Executing command: /sbin/bsdlabel -w -B ad14s1 auto
> Executing command: /sbin/bsdlabel ad14s1 > /tmp/backup.disk.label
> Executing command: /bin/echo 'a: 2097152 0 4.2BSD' >> /tmp/backup.disk.label
> Executing command: /bin/echo 'b: 4194304 * swap' >> /tmp/backup.disk.label
> Executing command: /bin/echo 'd: 125829120 * 4.2BSD' >> /tmp/backup.disk.label
> Executing command: /bin/echo 'e: * * 4.2BSD' >> /tmp/backup.disk.label
> Executing command: /sbin/bsdlabel -R -B ad14s1 /tmp/backup.disk.label
> Executing command: /sbin/newfs -U /dev/ad14s1a
> Executing command: /sbin/newfs -U /dev/ad14s1d
> Executing command: /sbin/newfs -U /dev/ad14s1e
> Executing command: /sbin/mount -rw /dev/ad14s1a /mnt
> Executing command: /sbin/dump -0Lf - / | (cd /mnt; /sbin/restore -rf -)
> Executing command: /sbin/umount /mnt
> Executing command: /sbin/mount -rw /dev/ad14s1d /mnt
> Executing command: /sbin/dump -0Lf - /usr | (cd /mnt; /sbin/restore -rf -)
> DUMP: Date of this level 0 dump: Mon Mar 15 10:32:02 2004
> DUMP: Date of last level 0 dump: the epoch
> DUMP: Dumping snapshot of /dev/ad12s1d (/usr) to standard output
> DUMP: mapping (Pass I) [regular files]
> DUMP: mapping (Pass II) [directories]
> DUMP: estimated 1621128 tape blocks.
> DUMP: dumping (Pass III) [directories]
> DUMP: dumping (Pass IV) [regular files]
> warning: ./.snap: File exists
> (dump/restore dies here - this time (doesn't die in same place every time) -
> causing the following output on the console)
>
>
> console output
> ---------------
> ad12: TIMEOUT - READ_DMA retrying (2 retries left) LBA=28166139
> ad12: timeout sending command=c8
> ad12: error issuing DMA command
> GEOM: create disk ad12 dp=0xc6ded160
> ad12: 76319MB <ST380013AS> [155061/16/63] at ata6-master UDMA100
> ad12: FAILURE - SETFEATURES SET TRANSFER MODE timed out
>
> dmesg
> ------
> Copyright (c) 1992-2004 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD 5.2.1-RELEASE-p1 #5: Fri Mar 5 17:54:52 MST 2004
> root@oregon.etv.net:/usr/obj/usr/src/sys/GENERIC
> Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a35000.
> Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0a3521c.
> ACPI APIC Table: <A M I OEMAPIC >
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Pentium(R) 4 CPU 2.60GHz (2598.76-MHz 686-class CPU)
> Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
>
> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA
> ,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> Hyperthreading: 2 logical CPUs
> real memory = 1072889856 (1023 MB)
> avail memory = 1032749056 (984 MB)
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> cpu0 (BSP): APIC ID: 0
> cpu1 (AP): APIC ID: 1
> ioapic0 <Version 2.0> irqs 0-23 on motherboard
> Pentium Pro MTRR support enabled
> npx0: [FAST]
> npx0: <math processor> on motherboard
> npx0: INT 16 interface
> acpi0: <A M I OEMXSDT > on motherboard
> pcibios: BIOS version 2.10
> Using $PIR table, 14 entries at 0xc00f5410
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> acpi_cpu0: <CPU> on acpi0
> acpi_cpu1: <CPU> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> agp0: <Intel 82865 host to AGP bridge> mem 0xf8000000-0xfbffffff at device
> 0.0 on pci0
> pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
> pcib1: could not get PCI interrupt routing table for \\_SB_.PCI0.P0P1 -
> AE_NOT_FOUND
> pci1: <ACPI PCI bus> on pcib1
> pci1: <display, VGA> at device 0.0 (no driver attached)
> uhci0: <Intel 82801EB (ICH5) USB controller USB-A> port 0xef00-0xef1f irq 16
> at device 29.0 on pci0
> usb0: <Intel 82801EB (ICH5) USB controller USB-A> on uhci0
> usb0: USB revision 1.0
> uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub0: 2 ports with 2 removable, self powered
> uhci1: <Intel 82801EB (ICH5) USB controller USB-B> port 0xef20-0xef3f irq 19
> at device 29.1 on pci0
> usb1: <Intel 82801EB (ICH5) USB controller USB-B> on uhci1
> usb1: USB revision 1.0
> uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub1: 2 ports with 2 removable, self powered
> uhci2: <Intel 82801EB (ICH5) USB controller USB-C> port 0xef40-0xef5f irq 18
> at device 29.2 on pci0
> usb2: <Intel 82801EB (ICH5) USB controller USB-C> on uhci2
> usb2: USB revision 1.0
> uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub2: 2 ports with 2 removable, self powered
> uhci3: <Intel 82801EB (ICH5) USB controller USB-D> port 0xef80-0xef9f irq 16
> at device 29.3 on pci0
> usb3: <Intel 82801EB (ICH5) USB controller USB-D> on uhci3
> usb3: USB revision 1.0
> uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> uhub3: 2 ports with 2 removable, self powered
> pci0: <serial bus, USB> at device 29.7 (no driver attached)
> pcib2: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> skc0: <3Com 3C940 Gigabit Ethernet> port 0xd800-0xd8ff mem
> 0xfeafc000-0xfeafffff irq 22 at device 5.0 on pci2
> skc0: 3Com Gigabit LOM (3C940)
> sk0: <Marvell Semiconductor, Inc. Yukon> on skc0
> sk0: Ethernet address: 00:0c:6e:54:4b:25
> miibus0: <MII bus> on sk0
> e1000phy0: <Marvell 88E1000 Gigabit PHY> on miibus0
> e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX,
> auto
> atapci0: <Promise PDC20319 SATA150 controller> port
> 0xdc00-0xdc7f,0xdfa0-0xdfaf,0xdf00-0xdf3f mem
> 0xfeac0000-0xfeadffff,0xfeafb000-0xfeafbfff irq 21 at device 9.0 on pci2
> atapci0: [MPSAFE]
> ata2: at 0xfeafb000 on atapci0
> ata2: [MPSAFE]
> ata3: at 0xfeafb000 on atapci0
> ata3: [MPSAFE]
> ata4: at 0xfeafb000 on atapci0
> ata4: [MPSAFE]
> ata5: at 0xfeafb000 on atapci0
> ata5: [MPSAFE]
> fxp0: <Intel 82550 Pro/100 Ethernet> port 0xde80-0xdebf mem
> 0xfeaa0000-0xfeabffff,0xfeafa000-0xfeafafff irq 23 at device 11.0 on pci2
> fxp0: Ethernet address 00:02:b3:d1:f7:ad
> miibus1: <MII bus> on fxp0
> inphy0: <i82555 10/100 media interface> on miibus1
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> isab0: <PCI-ISA bridge> at device 31.0 on pci0
> isa0: <ISA bus> on isab0
> atapci1: <Intel ICH5 UDMA100 controller> port
> 0xfc00-0xfc0f,0-0x3,0-0x7,0-0x3,0-0x7 at device 31.1 on pci0
> ata0: at 0x1f0 irq 14 on atapci1
> ata0: [MPSAFE]
> ata1: at 0x170 irq 15 on atapci1
> ata1: [MPSAFE]
> atapci2: <Intel ICH5 SATA150 controller> port
> 0xef60-0xef6f,0xefa8-0xefab,0xefa0-0xefa7,0xefac-0xefaf,0xefe0-0xefe7 irq 18
> at device 31.2 on pci0
> atapci2: [MPSAFE]
> ata6: at 0xefe0 on atapci2
> ata6: [MPSAFE]
> ata7: at 0xefa0 on atapci2
> ata7: [MPSAFE]
> pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
> pci0: <multimedia, audio> at device 31.5 (no driver attached)
> acpi_button0: <Power Button> on acpi0
> atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
> atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
> kbd0 at atkbd0
> psm0: <PS/2 Mouse> irq 12 on atkbdc0
> psm0: model IntelliMouse, device ID 3
> sio0 port 0x3f8-0x3ff irq 4 on acpi0
> sio0: type 16550A
> sio1 port 0x2e8-0x2ef irq 3 on acpi0
> sio1: type 16550A
> ppc0 port 0x378-0x37f irq 7 on acpi0
> ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
> ppbus0: <Parallel port bus> on ppc0
> plip0: <PLIP network interface> on ppbus0
> lpt0: <Printer> on ppbus0
> lpt0: Interrupt-driven port
> ppi0: <Parallel I/O> on ppbus0
> orm0: <Option ROMs> at iomem 0xc8000-0xc97ff,0xc0000-0xc7fff on isa0
> pmtimer0 on isa0
> fdc0: ready for input in output
> fdc0: cmd 3 failed at out byte 1 of 3
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> Timecounters tick every 10.000 msec
> acd0: CDROM <FX54++M> at ata0-master PIO4
> GEOM: create disk ad12 dp=0xc6b91060
> ad12: 76319MB <ST380013AS> [155061/16/63] at ata6-master UDMA100
> GEOM: create disk ad14 dp=0xc69d1d60
> ad14: 76319MB <ST380013AS> [155061/16/63] at ata7-master UDMA100
> SMP: AP CPU #1 Launched!
> Mounting root from ufs:/dev/ad12s1a
> WARNING: / was not properly dismounted
> WARNING: /usr was not properly dismounted
> WARNING: /var was not properly dismounted
>
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
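
On the point that the disk names change after switching the SATA mode: the root filesystem above is mounted from /dev/ad12s1a, so /etc/fstab would presumably need the new names before the next boot. What follows is only a rough sketch, not something from the thread. The helper name rename_disk and the ad12 -> ad2 mapping are assumptions for illustration; the real new names have to be read from dmesg after the BIOS change. It prints the rewritten file to standard output and does not modify anything.

```shell
#!/bin/sh
# Hypothetical helper: print an fstab-style file with one ATA disk name
# swapped for another. Usage: rename_disk OLD NEW FILE
# OLD and NEW (e.g. ad12 and ad2) are assumptions; check dmesg for the
# names the kernel actually assigns after the SATA mode is changed.
rename_disk() {
    old=$1
    new=$2
    file=$3
    # Match /dev/OLD only when it is followed by an optional slice and
    # partition suffix (s1, s1a, ...) and then whitespace, so that "ad1"
    # does not accidentally match "ad12".
    sed -E "s|^/dev/${old}(s[0-9]+[a-h]?)?([[:space:]])|/dev/${new}\1\2|" "$file"
}
```

For example, `rename_disk ad12 ad2 /etc/fstab > /tmp/fstab.new` writes a candidate file that can be compared with the original before anything is replaced.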