Date: Sat, 02 Dec 2000 23:49:24 +0000
From: Peter Gradwell <peter@gradwell.com>
To: Mike Smith <msmith@freebsd.org>
Cc: freebsd-scsi@freebsd.org
Subject: Re: Mylex DAC960 Driver "online/offline"
Message-ID: <5.0.0.25.0.20001202233356.0366b2d8@pop3.gradwell.net>
In-Reply-To: <200012022339.eB2NdWF21371@mass.osd.bsdi.com>
References: <Your message of "Fri, 01 Dec 2000 21:32:54 GMT." <5.0.0.25.0.20001201212649.03798548@pop3.gradwell.net>

Hi Mike,

At 15:39 02/12/2000 -0800, Mike Smith wrote:
> > What does this message really mean?
>
>It means that the controller is telling us that the drive is offline.
>Then that it's online. Then that it's offline again.
>
>You don't say what the time intervals between these messages are; you can
>get the 'drive offline' message from either the status poll (once per
>second) or if an I/O operation is sent to a drive that the controller
>reports as offline. The 'drive online' message only comes from the
>status poll though.

It was occurring without any apparent activity, about once per second,
so I would guess it was from the status poll.

>Can you describe your configuration? I can try to reproduce the
>situation here and see if it's not possible that there's a bug in the
>driver confusing the status between your two drives. I have to say,
>though, that the fact that the controller thinks that one of your system
>drives is offline when you claim it's a mirror is a bit troubling.

Ok, as an update to the situation, I was able to get to the Mylex
BIOS (there are 250 miles between me and the machine, you see!) via a
serial console and discovered that it had marked two drives offline.

We have:

  3 x 18 gig disks, of which two are bonded in a raid 1 pack
    and one is a hot spare
  2 x 36 gig disks, bonded in a raid 0 pack.

Everything apart from /var/spool/news is on the raid 1 pack.
(Yeah, it's a news server.)

One of the 18 gig disks and one of the 36 gig disks were marked offline.

I believe that when the 18 gig disk was marked offline the RAID card
rebuilt its redundancy data onto the hot spare disk and carried on.
- cos the 18 gig which is offline was part of the raid 1 pack and there
is now no hot spare. *So, that's good.*

So, we hard reset the machine and it booted. However, the symptoms
described previously prevailed. We couldn't log in via ssh or on the
console as it was unresponsive.

* This worries me. I would hope the machine would take the loss of
/v/s/news gracefully, and carry on.

So, when I accessed the BIOS this morning, I tried, as an "experiment",
to put the 36 gig disk back online and rebooted. After running fsck
a bit (is there a journaling file system for FreeBSD?!) the machine is
now running ok.

I have yet to schedule a reboot to mark the currently offline 18 gig
disk as the hot spare. I think I will be able to do this.

I am worried that the controller randomly marks the drives offline.
Mylex tell me this happens when it loses contact with the drives. They
are internal drives, well screwed into a big case, nicely racked into a
locked cabinet in Telehouse Europe. From what I can gather, no one
accessed the rack. It appears they aren't disconnected anyway, because
I can mark them online and we're go again.

I'd be happy to help with more information if it helps. Directed
questions work best!

thanks
peter

--
peter gradwell; online @ http://www.gradwell.com/peter/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message