From: Adam K Kirchhoff
To: freebsd-stable@freebsd.org
Date: Mon, 8 Jun 2009 10:43:34 -0400
Subject: Re: Problems with PATA disk
Message-ID: <20090608104334.59a4718c@voicenet.com>
In-Reply-To: <20090608101837.79c3b7d7@voicenet.com>

On Mon, 8 Jun 2009 10:18:37 -0400 Adam K Kirchhoff wrote:

>
> My old workstation finally died and was replaced by a Dell Vostro 420.
> Since the hard drives on the old machine were fine, I decided to throw
> them into the new machine.
> The new machine only had SATA onboard, so I added a Promise
> controller to the mix:
>
> atapci1@pci0:5:3:0: class=0x018000 card=0x3375105a chip=0x3375105a rev=0x020
>     vendor     = 'Promise Technology Inc'
>     device     = 'PDC20375(??) FastTrak SATA150 TX2plus Controller'
>     class      = mass storage
>
> It has two SATA connectors and a single PATA connector. I had two PATA
> drives, so that worked out fine, and I hooked them up. The master was
> the master in the old machine and the slave was the slave in the old
> machine. No need to change anything around.
>
> At first everything was fine. I booted up (using GENERIC, as I nearly
> always do) and ran for a while. The machine locked up and I decided to
> bring the machine up in single user mode and run an fsck. It ran just
> fine on /, /tmp, /var, and /usr (all on the master drive, ad14). I then
> ran the fsck on ad15s1a (/home). Unfortunately, I almost immediately
> started receiving 'WARNING - SETFEATURES SET TRANSFER MODE taskqueue
> timeout' messages (along with various other SETFEATURES messages).
> They were preceded by both ad14 and ad15 (though, as I said, ad14
> fsck'ed fine).
>
> This continued for 30 minutes before I gave up and rebooted. When the
> machine came back up, ad15 had no partition table or disklabel. And
> fdisk refused to create a partition.
>
> Assuming that the drive had gone bad, I swapped it out with another
> drive. Created a new partition, and labelled it. Restored /home from
> backups. It ran for about a week, but locked up on me today (as
> before, when doing something 3D, so I do not believe the lockups are
> related to disk activity), and I decided to manually run an fsck on the
> system. Unfortunately, I received the same SETFEATURES messages as
> before when fsck'ing /home. Instead of letting it run for 30 minutes, I
> stopped after the messages flashed by the screen. I rebooted. The
> partition table is hosed and there is no disklabel.
>
> When I go to create a new partition (per the directions
> in /usr/share/doc/handbook/disks-adding.html, which is what I used
> without any problems when I threw the new drive into the system), this
> is what happens:
>
> [ root@memory - ~ ]: dd if=/dev/zero of=/dev/ad15 bs=1k count=1
> 1+0 records in
> 1+0 records out
> 1024 bytes transferred in 0.000118 secs (8676702 bytes/sec)
> [ root@memory - ~ ]: fdisk -BI ad15
> ******* Working on device /dev/ad15 *******
> fdisk: invalid fdisk partition table found
> fdisk: Geom not found: "ad15"
> [ root@memory - ~ ]: bsdlabel -B -w ad15s1 auto
> bsdlabel: /dev/ad15s1: No such file or directory
>
> And, indeed, there is still only /dev/ad15.
>
> So I have a few questions...
>
> Why do I keep losing my data?
> How can I partition and label either one of these drives?
>
> Some system information:
>
> [ root@memory - ~ ]: uname -a
> FreeBSD memory.visualtech.com 7.2-STABLE FreeBSD 7.2-STABLE #5: Fri May 8 14:02:01 EDT 2009 root@memory.visualtech.com:/usr/obj/usr/src/sys/GENERIC i386
> [ root@memory - ~ ]: pciconf -vl
> hostb0@pci0:0:0:0: class=0x060000 card=0x02821028 chip=0x2e208086 rev=0x03 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = bridge
>     subclass   = HOST-PCI
> pcib1@pci0:0:1:0: class=0x060400 card=0x02821028 chip=0x2e218086 rev=0x03 hdr=0x01
>     vendor     = 'Intel Corporation'
>     class      = bridge
>     subclass   = PCI-PCI
> uhci0@pci0:0:26:0: class=0x0c0300 card=0x02821028 chip=0x3a378086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> uhci1@pci0:0:26:1: class=0x0c0300 card=0x02821028 chip=0x3a388086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> uhci2@pci0:0:26:2: class=0x0c0300 card=0x02821028 chip=0x3a398086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> ehci0@pci0:0:26:7: class=0x0c0320 card=0x02821028 chip=0x3a3c8086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> pcib2@pci0:0:28:0: class=0x060400 card=0x02821028 chip=0x3a408086 rev=0x00 hdr=0x01
>     vendor     = 'Intel Corporation'
>     class      = bridge
>     subclass   = PCI-PCI
> pcib3@pci0:0:28:1: class=0x060400 card=0x02821028 chip=0x3a428086 rev=0x00 hdr=0x01
>     vendor     = 'Intel Corporation'
>     class      = bridge
>     subclass   = PCI-PCI
> pcib4@pci0:0:28:2: class=0x060400 card=0x02821028 chip=0x3a448086 rev=0x00 hdr=0x01
>     vendor     = 'Intel Corporation'
>     class      = bridge
>     subclass   = PCI-PCI
> uhci3@pci0:0:29:0: class=0x0c0300 card=0x02821028 chip=0x3a348086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> uhci4@pci0:0:29:1: class=0x0c0300 card=0x02821028 chip=0x3a358086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> uhci5@pci0:0:29:2: class=0x0c0300 card=0x02821028 chip=0x3a368086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> ehci1@pci0:0:29:7: class=0x0c0320 card=0x02821028 chip=0x3a3a8086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = USB
> pcib5@pci0:0:30:0: class=0x060401 card=0x02821028 chip=0x244e8086 rev=0x90 hdr=0x01
>     vendor     = 'Intel Corporation'
>     device     = '82801 Family (ICH2/3/4/4/5/5/6/7/8/9,63xxESB) Hub Interface to PCI Bridge'
>     class      = bridge
>     subclass   = PCI-PCI
> isab0@pci0:0:31:0: class=0x060100 card=0x02821028 chip=0x3a168086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = bridge
>     subclass   = PCI-ISA
> atapci2@pci0:0:31:2: class=0x010601 card=0x02821028 chip=0x3a228086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = mass storage
>     subclass   = SATA
> none0@pci0:0:31:3: class=0x0c0500 card=0x02821028 chip=0x3a308086 rev=0x00 hdr=0x00
>     vendor     = 'Intel Corporation'
>     class      = serial bus
>     subclass   = SMBus
> vgapci0@pci0:1:0:0: class=0x030000 card=0x30001002 chip=0x5b631002 rev=0x00 hdr=0x00
>     vendor     = 'ATI Technologies Inc'
>     device     = 'Radeon X550 Series'
>     class      = display
>     subclass   = VGA
> vgapci1@pci0:1:0:1: class=0x038000 card=0x30011002 chip=0x5b731002 rev=0x00 hdr=0x00
>     vendor     = 'ATI Technologies Inc'
>     device     = 'Radeon X550 Series - Secondary'
>     class      = display
> atapci0@pci0:3:0:0: class=0x010185 card=0x02821028 chip=0x2363197b rev=0x03 hdr=0x00
>     vendor     = 'JMicron Technology Corp'
>     device     = 'JMB36X PCIe-to-SATA-300/IDE RAID Controller'
>     class      = mass storage
>     subclass   = ATA
> re0@pci0:4:0:0: class=0x020000 card=0x02821028 chip=0x816810ec rev=0x02 hdr=0x00
>     vendor     = 'Realtek Semiconductor'
>     device     = 'RTL8168/8111 PCI-E Gigabit Ethernet NIC'
>     class      = network
>     subclass   = ethernet
> fxp0@pci0:5:0:0: class=0x020000 card=0x000c8086 chip=0x12298086 rev=0x08 hdr=0x00
>     vendor     = 'Intel Corporation'
>     device     = '82550/1/7/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
>     class      = network
>     subclass   = ethernet
> emu10kx0@pci0:5:1:0: class=0x040100 card=0x80641102 chip=0x00021102 rev=0x0a hdr=0x00
>     vendor     = 'Creative Technology LTD.'
>     device     = 't4780010004541 Sound Blaster Live! (Also Live! 5.1) - OEM from DELL - CT4780'
>     class      = multimedia
>     subclass   = audio
> none1@pci0:5:1:1: class=0x098000 card=0x00201102 chip=0x70021102 rev=0x0a hdr=0x00
>     vendor     = 'Creative Technology LTD.'
>     device     = 'EMU10000 Game Port'
>     class      = input device
> atapci1@pci0:5:3:0: class=0x018000 card=0x3375105a chip=0x3375105a rev=0x02 hdr=0x00
>     vendor     = 'Promise Technology Inc'
>     device     = 'PDC20375(??) FastTrak SATA150 TX2plus Controller'
>     class      = mass storage
> [ root@memory - ~ ]: vmstat
>  procs      memory      page                     disks      faults         cpu
>  r b w     avm    fre   flt  re  pi  po    fr  sr ad14 ad15   in   sy   cs us sy  id
>  0 0 0    194M  2916M   110   0   1   0    91   0    0    0  119 2024  952  0  0 100
>

My apologies for replying to my first e-mail. I'm not sure why this
didn't occur to me the first time this happened, but I completely
powered off my machine, and then powered it back on. It could see the
partition table and disklabel on ad15 again.
I attempted an fsck, and I received the same errors as before, but this
time I hit a kernel panic, too:

GEOM_LABEL: Label ufsid/4a296b573007b5f2 removed.
Jun 8 14:35:42 memory last message repeated 7 times
ad14: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad14: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad14: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad14: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
acd0: WARNING - TEST_UNIT_READY taskqueue timeout - completing request directly
ad14: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad15: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad15: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad15: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad15: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad15: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad15: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=470440143

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x188
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc07d4d94
stack pointer           = 0x28:0xc62f9c00
frame pointer           = 0x28:0xc62f9c18
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 23 (swi6: task queue)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1m56s
Physical memory: 3058 MB
Dumping 113 MB: 98 82 66 50 34 18 2
Dump complete
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset: Stopping other CPUs

Unfortunately, nothing showed up in /var/crash, which I think is odd.
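
For anyone who wants to compare notes, the taskqueue timeout warnings
above can be tallied per device with a few lines of Python. This is only
a rough sketch: the regular expression is inferred from the messages
quoted in this mail, and the function names (tally_timeouts, per_device)
are made up for illustration. It is not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Pattern inferred from the console warnings quoted in this mail, e.g.
# "ad15: WARNING - SET_MULTI taskqueue timeout - completing request directly"
# (assumption: other kernels may word these messages differently).
WARNING_RE = re.compile(
    r"\b(?P<dev>\w+): WARNING - (?P<cmd>.+?) taskqueue timeout"
)


def tally_timeouts(log_text):
    """Count taskqueue-timeout warnings per (device, command) pair."""
    counts = Counter()
    for match in WARNING_RE.finditer(log_text):
        counts[(match.group("dev"), match.group("cmd"))] += 1
    return counts


def per_device(counts):
    """Collapse the (device, command) counts into a total per device."""
    totals = Counter()
    for (dev, _cmd), n in counts.items():
        totals[dev] += n
    return totals


if __name__ == "__main__":
    # Two warning lines copied from the log above, plus one non-warning line.
    sample = (
        "ad14: WARNING - SET_MULTI taskqueue timeout - completing request directly\n"
        "ad15: WARNING - SET_MULTI taskqueue timeout - completing request directly\n"
        "ad15: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=470440143\n"
    )
    for dev, n in sorted(per_device(tally_timeouts(sample)).items()):
        print(dev, n)
```

Lines that are not taskqueue timeout warnings, such as the READ_DMA48
retry, are simply skipped, so the totals only reflect the SETFEATURES,
SET_MULTI and TEST_UNIT_READY style messages.
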
I'll update my -STABLE, rebuild my kernel with debugging, and hope to
catch something next time. In the meantime, I'd appreciate any help I
could get on resolving this problem.

Adam