Date: Fri, 24 Oct 2008 18:15:04 -0700
From: Jeremy Chadwick <koitsu@FreeBSD.org>
To: mark.jacobs@custserv.com
Cc: freebsd-questions@freebsd.org
Subject: Re: Drive Disconnection
Message-ID: <20081025011504.GA47577@icarus.home.lan>
In-Reply-To: <DAA3726915FC4F419AA4E18E3D587EDA0684C9@TMPMSGMB03.enterprise.corpad.timeinc.com>
References: <49020DC1.4060205@custserv.com> <20081024230911.GA33151@icarus.home.lan> <DAA3726915FC4F419AA4E18E3D587EDA0684C9@TMPMSGMB03.enterprise.corpad.timeinc.com>

On Fri, Oct 24, 2008 at 07:44:41PM -0400, mark.jacobs@custserv.com wrote:
> It is a LaCie d2 Quadra drive, but FreeBSD reports this:
> server kernel: ad4: 953869MB <Hitachi HDS721010KLA330 GKAOA70M> at ata2-master SATA150
>
> When I perform the rsync I receive these errors:
>
> Oct 24 12:47:13 server kernel: ad4: FAILURE - device detached
> Oct 24 12:47:13 server kernel: subdisk4: detached
> Oct 24 12:47:13 server kernel: ad4: detached
> Oct 24 12:47:13 server kernel: g_vfs_done():ad4s1a[WRITE(offset=144332767232, length=131072)]error = 6
> Oct 24 12:47:13 server kernel: g_vfs_done():ad4s1a[WRITE(offset=144332898304, length=131072)]error = 6
>
> The write failure messages keep on being issued until the server
> reboots. It isn't in the log, but I receive a dirty buffer panic.

It appears the disk is literally falling off of the SATA bus. The
g_vfs_done errors you see are a result of that. I'll explain the reboot
in a moment.

There could be tons of reasons for the disk disappearing. I'll list off
some of the possibilities that come to mind:

* Drive losing power
  - Shoddiness inside of the d2 Quadra enclosure, such as bad internal
    cabling or manufacturing defects,
  - AC adapter for d2 Quadra is faulty,
  - d2 Quadra could offer some kind of "sleep mode" where the unit goes
    into a low-power-save state, and the disk ends up falling off the
    bus during this time.

* SATA300 vs. SATA150 compatibility issues
  - VIA and SiS chipsets are known to experience data corruption, disks
    falling off the bus, or other insanity when SATA300 disks are
    connected to those chipsets. The chipsets support SATA300, but are
    downright buggy. Workaround is to force the drive to SATA150 speed
    using jumpers on the disk (only *some* manufacturers offer this),
  - The Hitachi disk in your d2 Quadra is spec'd at SATA300, while it's
    obvious your Silicon Image SATA controller is only detecting SATA150
    (yet LaCie claims this enclosure does SATA300).
    The 7K1000 series drives *do not* have a force-SATA150 jumper (I've
    checked), which is too bad, since forcing SATA150 might fix the
    problem.

* d2 Quadra USB/FW/eSATA controller bug
  - I have no idea what chip is inside of that enclosure, but many of
    them are "bridges", e.g. they're USB/FW controllers that have a
    horribly shoddy "SATA emulation" interface on top of them,
  - Could be a firmware bug with the controller used in the enclosure,
  - Controller may not be 100% compatible with Silicon Image devices.

* Silicon Image SATA controller bugs

As for why the system reboots: what you're experiencing is probably a
kernel panic. On FreeBSD, when you have a filesystem that's mounted and
the underlying device (disk, etc.) is yanked out from underneath, the
kernel will panic; this is by design. I've been told by lower-level
folks that CURRENT supposedly addresses this issue, but I haven't
personally confirmed it.

I would still like to see SMART stats on the drive. Why? Because SMART
stats will show me if the drive is actually losing power or not (the
Power_Cycle_Count attribute should increment if it is).

You'll need to install ports/sysutils/smartmontools, then run "smartctl
-a /dev/ad4". Save that data somewhere, then run your rsync. Your
machine will reboot (a soft reset, hopefully!), and once it's back up,
run the same smartctl command again, and save that data. Then you can
compare the adjusted attributes and RAW_VALUEs; I can help you with
reading this data if need be (people often misread it).

> I don't have easy access to a 7.1 system with an ESATA port.

That's disappointing, as it would be useful to know if 7.1-PRERELEASE
behaves the same way for you. Based on the above I'd say it probably
does, but it's always good to check.

> I'm currently redoing the entire process (wipe, build filesystem, mount,
> rsync) using the USB port. If that works I'm going to junk the idea of
> using the ESATA card for the drive.

I would _highly_ recommend you reconsider this.
USB on FreeBSD is in an even worse state (and I am not exaggerating)
than ATA/SATA is. If your disk is falling off the bus with SATA, the
same will likely happen with USB, and you'll experience the same
problem.

> Can you recommend an ESATA card that fits in a PCI slot since my
> server doesn't have a PCI-E slot?

Promise makes the SATA300 TX4302 controller, which is PCI, and provides
two eSATA ports, plus two internal SATA ports. I believe this card goes
for US$70-100. Promise's website (for me) appears to be malfunctioning
(webserver answers, but stalls indefinitely), so I can't easily check
their products list.

I don't think HighPoint makes any eSATA-capable controllers that are
standard PCI or PCI-X; all appear to be PCI Express.

If your motherboard has on-board SATA support that *does not* use a
Silicon Image, VIA, or SiS chip and instead uses something like an Intel
ICH or nVidia nForce controller, I would recommend buying something like
one of these and using it instead:

http://www.icydock.com/product/MB559power_bracket.html
http://www.cooldrives.com/essaii3gbexp.html
http://www.newegg.com/Product/Product.aspx?Item=N82E16812119021

Finally, and I don't know if you're doing this, but -- be aware you
can't "hot-swap" disks via eSATA without having a controller that fully
supports hot-swapping. Meaning: you can't yank that d2 Quadra enclosure
off the eSATA port whenever you feel like it. You'll need to use
"atacontrol detach" to properly detach it first, and that's assuming the
SATA controller you're using supports hot-swapping (things with AHCI
behave fairly well in this regard).

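For the before/after SMART comparison suggested above, here is a
minimal Python sketch. It is not part of the original advice: the names
parse_attributes and changed_attributes are made up for illustration,
and the parsing assumes the usual attribute table printed by
"smartctl -a" (columns ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH,
TYPE, UPDATED, WHEN_FAILED, RAW_VALUE). Output that deviates from that
layout will simply be skipped.

```python
import re

# One row of the SMART attribute table, as assumed above:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(.*)$"
)


def parse_attributes(text):
    """Map attribute name -> (normalized VALUE, RAW_VALUE string)."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name = match.group(2)
            attrs[name] = (int(match.group(3)), match.group(6).strip())
    return attrs


def changed_attributes(before_text, after_text):
    """Return attributes present in both dumps whose values differ."""
    before = parse_attributes(before_text)
    after = parse_attributes(after_text)
    return {
        name: (before[name], after[name])
        for name in sorted(before)
        if name in after and before[name] != after[name]
    }
```

With the two saved smartctl outputs read into strings, calling
changed_attributes(before, after) returns only the attributes that
differ. A Power_Cycle_Count whose raw value went up across the rsync
run would suggest the drive really did lose power.
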
> -----Original Message-----
> From: Jeremy Chadwick [mailto:koitsu@FreeBSD.org]
> Sent: Fri 10/24/2008 7:09 PM
> To: Jacobs, Mark - Data Center Operations <mark.jacobs@custserv.com>
> Cc: freebsd-questions@freebsd.org
> Subject: Re: Drive Disconnection
>
> On Fri, Oct 24, 2008 at 02:02:41PM -0400, Mark Jacobs wrote:
> > I have an external LaCie 1TB drive attached to a FreeBSD 6.4-PRERELEASE
> > system via an ESATA connection.
> >
> > atapci0: <SiI SiI 3512 SATA150 controller>
> >
> > I cleaned off the drive by writing random data to it. The write took
> > overnight and didn't experience any problems. I then added a filesystem
> > to the drive and mounted it on the system.
> >
> > However when I perform an rsync backup from a FreeBSD 7.1 PRERELEASE
> > system to the drive over an NFS connection the drive disconnects and the
> > server reboots.
>
> You've not provided enough information to help track this down. What
> model/brand of disk is attached to that controller? What does smartctl
> -a have to say about the disk? What gets printed on the console before
> it reboots? Do you have the same problem if you run
> 7.1-PRERELEASE/BETA2?
>
> > Does anyone have an idea where to go from here?
>
> The only generic advice I can give you (at this point) is to avoid
> Silicon Image controllers, particularly their SATA controllers. They
> have a history of causing data corruption on Linux, FreeBSD, and
> Windows, and some have reported other miscellaneous problems with them
> as well. There's not enough evidence in this thread so far to blame the
> SiI controller, but when I see them, I become immediately suspicious.
>
> --
> | Jeremy Chadwick                                jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                  Mountain View, CA, USA |
> | Making life hard for others since 1977.              PGP: 4BD6C0CB |
>
>
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"

--
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |