Date: Thu, 26 Feb 2009 12:22:12 +0100
From: Gary Jennejohn <gary.jennejohn@freenet.de>
To: Alexander Motin <mav@FreeBSD.org>
Cc: FreeBSD-Current <freebsd-current@freebsd.org>
Subject: Re: SATA disks suddenly stop working
Message-ID: <20090226122212.76077ed0@ernst.jennejohn.org>
In-Reply-To: <49A5A276.9080401@FreeBSD.org>
References: <go44ht$2i6a$1@FreeBSD.cs.nctu.edu.tw> <49A5A276.9080401@FreeBSD.org>

On Wed, 25 Feb 2009 21:56:38 +0200
Alexander Motin <mav@FreeBSD.org> wrote:

> Gary Jennejohn wrote:
> > I've been having lots of problems with SATA drives attached to higher
> > port numbers, namely ata5 and ata6.
> >
> > I was installing Linux under qemu today and it had been running for
> > several hours and had installed multi-gigabytes of data when qemu
> > just stopped.
> >
> > I noticed that all I/O to the disk had ceased.
> >
> > Doing "atacontrol reinit" on the port (ata5) resulted in a message
> > that the device was not configured, which was patently false since
> > qemu had just been merrily writing to it.
> >
> > This with a kernel made from sources updated today at about 2 PM (GMT+1).
> >
> > I've also seen problems with a disk attached to ata6. It just sort
> > of disappears after a while.
> >
> > Disks attached to ata2, ata3 and ata4 don't exhibit any problems.
>
> You have told much and same time gave nothing that can be used.
>

I was only interested in whether others have seen this problem. I was
not looking for a solution.

> What controller do you have? What drives on what channels? Is there any
> kernel messages about the problem? Have you tried to enable verbose
> messages to get additional details?
>

atapci0@pci0:0:17:0: class=0x010601 card=0xb0021458 chip=0x43911002 rev=0x00 hdr=0x00
    vendor     = 'ATI Technologies Inc'
    class      = mass storage
    subclass   = SATA

There were no kernel messages at all, the drive simply hung.

I'll do a verbose boot and try to reproduce the disk hang later.

> Reinit could return ENXIO if it already was in progress. Disappearing
> drives are also can be related to that reinit. Can't it be just a real
> hardware problem?
>

I should have mentioned that the error returned was about some IOCTL.
Can't remember which one right now, but the error message did include
that the device was not configured.

I've also noticed several times in the past when the problem occurred
that the BIOS could not enumerate the AHCI disks anymore. I had to do
a POR. Seems that the controller was completely hosed such that a
simple reset didn't reinitialize it sufficiently for it to work.

This morning I booted the box and started a cvsup. My repository is
on a ZFS mirror with the disks on ata3 and ata4. The system hung after
the data from the server were received, although all the data were
successfully written to the disks. I couldn't do anything at all - it
looked like the root disk was not responding and the disk light was on
solid red. I had to do a hard reset. This is the first time I've seen
a problem with this port. The root disk is on ata2.

I rebooted and turned off MSI. I'll monitor the situation to see whether
that helps.

---
Gary Jennejohn