Date: Wed, 9 Jun 2010 01:47:53 -0500
From: Scott Lambert <lambert@lambertfam.org>
To: freebsd-stable@freebsd.org
Subject: Re: gmirror refused to connect second disk after a reboot
Message-ID: <20100609064753.GA46148@sysmon.tcworks.net>
In-Reply-To: <20100606194515.GA29230@icarus.home.lan>
References: <20100606052509.GA4744@mavetju.org> <20100606185551.GA267@sysmon.tcworks.net> <20100606194515.GA29230@icarus.home.lan>

On Sun, Jun 06, 2010 at 12:45:15PM -0700, Jeremy Chadwick wrote:
> On Sun, Jun 06, 2010 at 01:55:51PM -0500, Scott Lambert wrote:
> > I have one dual PIII machine doing the same to me.  I've been assuming
> > my issue is with the ATA controller.

... <snip>

> I agree -- these look like you have either a bad PATA cable, a PATA
> controller port which has gone bad, or a PATA controller which is
> behaving *very* badly (internal IC problems).  ICRC errors indicate data
> transmission failures between the controller and the disk.
>
> Since these are classic PATA disks, ad0 is probably the master and ad2
> is the slave -- but both are probably on the same physical cable.
>
> The LBAs for both ad0 and ad2 are very close (ad0=242235039,
> ad2=242234911), which makes sense since they're in a mirror config.  But
> two disks going kaput at the same time, around the same LBA?  I have my
> doubts.

I think I actually made sure that ad0 and ad2 are on their own cables.
ad0 may be sharing with acd0 though.  Yeah, looks like it.

01:16:24 Wed Jun 09 $ sudo atacontrol list
ATA channel 0:
    Master:  ad0 <WDC WD2500JB-57REA0/20.00K20> ATA/ATAPI revision 7
    Slave:  acd0 <LG CD-ROM CRD-8521B/1.04> ATA/ATAPI revision 0
ATA channel 1:
    Master:  ad2 <WDC WD2500JB-57REA0/20.00K20> ATA/ATAPI revision 7
    Slave:       no device present

> SMART statistics for both of the disks themselves would help determine
> if the disks are seeing issues or if the disks are also seeing problems
> communicating with the PATA controller.  (Depends on the age of the disks
> though; some older PATA disks don't have the SMART attribute that
> describes this).

The drives are only a couple of years old.  The box itself is ancient. :-)

The ICRC errors only seem to have occurred right after boot.  I'll jerk
the box apart to check/change the cabling when I get a chance.  Maybe
I'll just dump the CD drive.

> What you should be worried about -- FreeBSD sees problems on both ad0
> and ad2.  ad2 is offline cuz of the problem, but ad0 isn't.  Chances are
> ad0 is going to fall off the bus eventually because of this problem.  I
> really hope you do backups regularly (daily) if you plan on just
> ignoring this problem.

AMANDA takes care of things.  Also, this box is not terribly important.
I rebuilt the array Sunday.  I don't see anything terribly scary in the
smartctl output.

Anyway, I do hope I haven't hijacked the thread for the OP.  I actually
just wanted to offer a possible matching datapoint.

-- 
Scott Lambert                    KC5MLE                       Unix SysAdmin
lambert@lambertfam.org
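
For anyone wanting to check the cabling question without opening the case, here is a rough Python sketch that reads `atacontrol list` output like the listing above and reports which channels carry more than one device. The function names `parse_channels` and `shared_channels` are made up for illustration, and the exact whitespace in `SAMPLE` is an assumption, since only the text of the listing is known from this message.

```python
import re

# Sample text modelled on the `atacontrol list` output quoted in this message.
# The indentation is an assumption; the parser below does not depend on it.
SAMPLE = """\
ATA channel 0:
    Master:  ad0 <WDC WD2500JB-57REA0/20.00K20> ATA/ATAPI revision 7
    Slave:  acd0 <LG CD-ROM CRD-8521B/1.04> ATA/ATAPI revision 0
ATA channel 1:
    Master:  ad2 <WDC WD2500JB-57REA0/20.00K20> ATA/ATAPI revision 7
    Slave:       no device present
"""


def parse_channels(text):
    """Map channel number -> {"Master"/"Slave": device name}."""
    channels = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"ATA channel (\d+):", line)
        if header:
            current = int(header.group(1))
            channels[current] = {}
            continue
        # A populated slot looks like "Master:  ad0 <model/firmware> ...".
        # "no device present" has no "<...>" part, so it is skipped.
        slot = re.match(r"(Master|Slave):\s+(\S+)\s+<", line)
        if slot and current is not None:
            channels[current][slot.group(1)] = slot.group(2)
    return channels


def shared_channels(channels):
    """Return only the channels where two devices share one cable."""
    return {num: devs for num, devs in channels.items() if len(devs) > 1}


if __name__ == "__main__":
    for num, devs in shared_channels(parse_channels(SAMPLE)).items():
        print(f"channel {num} is shared: {devs}")
```

With the listing from this box, only channel 0 is reported, which matches the observation that ad0 shares a cable with acd0 while ad2 sits alone on channel 1.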