Date:      Thu, 25 Jan 2001 17:46:24 -0700
From:      "Kenneth D. Merry" <ken@kdm.org>
To:        Hank Marquardt <hmarq@oscar2.yerpso.net>
Cc:        freebsd-stable@FreeBSD.ORG
Subject:   Re: Dying disk ...
Message-ID:  <20010125174624.A37036@panzer.kdm.org>
In-Reply-To: <20010125150722.A288@oscar2.yerpso.net>; from hmarq@oscar2.yerpso.net on Thu, Jan 25, 2001 at 03:07:23PM -0600
References:  <20010125150722.A288@oscar2.yerpso.net>

[ Try to hit return every 70 characters or so.  If you're using vi,
  type ESC and then :se wm=10 ]

On Thu, Jan 25, 2001 at 15:07:23 -0600, Hank Marquardt wrote:
> If borne out, this would be my first hardware failure under BSD, so I'm just looking for another set of eyes to look at this dmesg:
> 
> 
> WARNING: / was not properly dismounted
> (da0:ahc0:0:0:0): SCB 0x31 - timed out while idle, SEQADDR == 0x3

Timed out while idle, for a disk, generally means that we sent a
command to the disk and it still hadn't returned it after 60 seconds.
So we assume that the disk is out to lunch and hasn't come back, and
therefore we reset it to wake it up.
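
To make that concrete, here is a rough sketch of the logic in
Python.  It is only a toy model, not the actual ahc driver code: the
Device class and the check() helper are made up for illustration, and
the only things taken from this thread are the 60 second timeout and
the two pending tags (20 and 49) from the dmesg.

```python
from dataclasses import dataclass, field

TIMEOUT_SECONDS = 60.0  # the timeout described above


@dataclass
class Device:
    """Hypothetical stand-in for a disk; not a real driver structure."""
    name: str
    pending: dict = field(default_factory=dict)  # tag -> time sent

    def send(self, tag, now):
        # Record when the command was handed to the disk.
        self.pending[tag] = now

    def complete(self, tag):
        # The disk returned the command; stop tracking it.
        self.pending.pop(tag, None)

    def timed_out(self, now):
        # Tags that have been outstanding longer than the timeout.
        return sorted(tag for tag, sent in self.pending.items()
                      if now - sent > TIMEOUT_SECONDS)

    def reset(self):
        # A device reset aborts everything still outstanding.
        aborted = sorted(self.pending)
        self.pending.clear()
        return aborted


def check(dev, now):
    """Reset the device if any command has timed out."""
    if dev.timed_out(now):
        return dev.reset()
    return []


da0 = Device("da0")
da0.send(20, now=0.0)
da0.send(49, now=5.0)
aborted = check(da0, now=61.0)
print(f"{da0.name}: reset, {len(aborted)} commands aborted: {aborted}")
```

Only tag 20 has actually exceeded the timeout at that point, but the
reset takes out both outstanding commands, which is the same shape as
the "2 SCBs aborted" line in the dmesg.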

> STACK == 0x1, 0xe8, 0x143, 0x16b
> SXFRCTL0 == 0x80
> SCB count = 50
> QINFIFO entries: 
> Waiting Queue entries: 
> Disconnected Queue entries: 2:20 0:49 
> QOUTFIFO entries: 
> Sequencer Free SCB List: 13 14 15 3 9 1 10 12 5 6 11 4 7 8 
> Pending list: 20 49 
> Kernel Free SCB list: 33 11 24 35 34 48 17 18 14 47 45 31 22 21 27 36 16 4 5 9 12 15 25 26 28 46 2 10 29 23 7 38 13 0 8 6 32 30 3 19 37 1 44 43 42 41 40 
> sg[0] - Addr 0x1134000 : Length 1024
> (da0:ahc0:0:0:0): Queuing a BDR SCB
> (da0:ahc0:0:0:0): Bus Device Reset Message Sent
> (da0:ahc0:0:0:0): no longer in timeout, status = 34b
> ahc0: Bus Device Reset on A:0. 2 SCBs aborted
> 
> The machine has locked solid under X a couple of times and then refused to boot, not seeing the disk at all even under the SCSI ROM startup, so I popped the case thinking it might be a loose cable or something and sure enough it booted ... though you see it fsck'd the disk from the crash ... I left it be for a while, came back and saw the errors on the screen ... a quick look at syslog showed similar entries at each of the crashes.
> 
> It seems to be running now (I'm writing this on it) ... but if the disk is flaked, I may as well go buy a new one now rather than wonder when it's going to die.

You should make sure your disk is properly cooled (is it hot to the
touch?).  You should also check your cabling and termination.

If your disk wasn't showing up during your SCSI controller's BIOS
probe, that could indicate a cabling or termination problem.  It could
also mean the disk is on its way out.

What sort of disk is it?  Some disks are known to have firmware issues
that cause them to go "out to lunch" and not come back.

Ken
-- 
Kenneth Merry
ken@kdm.org





