Date: Thu, 25 Jan 2001 20:39:47 -0600
From: Hank Marquardt <hmarq@yerpso.net>
To: "Kenneth D. Merry" <ken@kdm.org>
Cc: Hank Marquardt <hmarq@oscar2.yerpso.net>, freebsd-stable@FreeBSD.ORG
Subject: Re: Dying disk ...
Message-ID: <20010125203946.A53543@hermes.yerpso.net>
In-Reply-To: <20010125174624.A37036@panzer.kdm.org>; from ken@kdm.org on Thu, Jan 25, 2001 at 05:46:24PM -0700
References: <20010125150722.A288@oscar2.yerpso.net> <20010125174624.A37036@panzer.kdm.org>

Thanks, heat is an interesting thought ... it's a dual P133 machine in
fairly close quarters (inside the case), though it was running solid
for close to 30 days before the first lockup (4.2 Stable update/boot
around 12/15) and another couple of months before that with no issues.
Further, the dmesg I originally sent was from less than 15 minutes of
running after having been powered off for over 24 hours -- I don't
think I'd get a heat problem that quick unless something was seriously
screwed up, and there was no evidence of that when I opened the case.

The drive is a Seagate ST19171W attached to a 2940 card. The CD (also
SCSI) has functioned and been recognized throughout all the episodes.

Hank

On Thu, Jan 25, 2001 at 05:46:24PM -0700, Kenneth D. Merry wrote:
> [ Try to hit return every 70 characters or so. If you're using vi,
> type ESC and then :se wm=10 ]
>
> On Thu, Jan 25, 2001 at 15:07:23 -0600, Hank Marquardt wrote:
> > If borne out, this would be my first hardware failure under BSD so
> > I'm just looking for another set of eyes to look at this dmesg:
> >
> >
> > WARNING: / was not properly dismounted
> > (da0:ahc0:0:0:0): SCB 0x31 - timed out while idle, SEQADDR == 0x3
>
> Timed out while idle, for a disk, generally means that we sent a
> command to the disk and it still hadn't returned it after 60 seconds.
> So we assume that the disk is out to lunch and hasn't come back, and
> therefore we reset it to wake it up.
>
> > STACK == 0x1, 0xe8, 0x143, 0x16b
> > SXFRCTL0 == 0x80
> > SCB count = 50
> > QINFIFO entries:
> > Waiting Queue entries:
> > Disconnected Queue entries: 2:20 0:49
> > QOUTFIFO entries:
> > Sequencer Free SCB List: 13 14 15 3 9 1 10 12 5 6 11 4 7 8
> > Pending list: 20 49
> > Kernel Free SCB list: 33 11 24 35 34 48 17 18 14 47 45 31 22 21 27 36 16 4 5 9 12 15 25 26 28 46 2 10 29 23 7 38 13 0 8 6 32 30 3 19 37 1 44 43 42 41 40
> > sg[0] - Addr 0x1134000 : Length 1024
> > (da0:ahc0:0:0:0): Queuing a BDR SCB
> > (da0:ahc0:0:0:0): Bus Device Reset Message Sent
> > (da0:ahc0:0:0:0): no longer in timeout, status = 34b
> > ahc0: Bus Device Reset on A:0. 2 SCBs aborted
> >
> > The machine has locked solid under X a couple of times and then
> > refused to boot, not seeing the disk at all even under the SCSI ROM
> > startup, so I popped the case thinking it might be a loose cable or
> > something, and sure enough it booted ... though you see it fsck'd
> > the disk from the crash ... I left it be for a while, came back and
> > saw the errors on the screen ... a quick look at syslog showed
> > similar entries at each of the crashes.
> >
> > It seems to be running now (I'm writing this on it) ... but if the
> > disk is flaked, I may as well go buy a new one now rather than
> > wonder when it's going to die.
>
> You should make sure your disk is properly cooled (is it hot to the
> touch?). You should also check your cabling and termination.
>
> If your disk wasn't showing up during your SCSI controller's BIOS
> probe, that could indicate a cabling or termination problem. It could
> also mean the disk is on its way out.
>
> What sort of disk is it? Some disks are known to have firmware issues
> that cause them to go "out to lunch" and not come back.
>
> Ken
> --
> Kenneth Merry
> ken@kdm.org
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message