Date:      Thu, 10 Oct 1996 18:08:30 +0100 (BST)
From:      Gordon Henderson <gordon@drogon.net>
To:        freebsd-scsi@freebsd.org
Subject:   Buslogic controller, Sync mode & a SCSI disk error
Message-ID:  <Pine.LNX.3.91.961010172220.1551L-100000@unicorn>


I have a bizarre set of problems...

Here's the setup: ASUS P120, 128MB RAM, 2 Buslogic 946C controllers, each
with 3 identical 2GB drives. Running FreeBSD 2.1.5R.

Firstly, some boot messages which I find rather odd:

[I've cut some of the verbiage]

  /kernel: bt0 <Buslogic 946 SCSI host adapter> rev 0 int a irq 10 on pci0:11
  /kernel: bt0: Bt946C/ 0-(32bit) bus
  /kernel: bt0: reading board settings, busmastering, int=10
  /kernel: bt0: version 4.25J, fast sync, parity, 32 mbxs, 32 ccbs
  /kernel: bt0: targ 0 sync rate=10.00MB/s(100ns), offset=15
  /kernel: bt0: targ 1 sync rate=10.00MB/s(100ns), offset=15
  /kernel: bt0: targ 2 sync rate=10.00MB/s(100ns), offset=15
  /kernel: bt0: Using Strict Round robin scheme
  /kernel: bt0 waiting for scsi devices to settle

  [device probing snipped]

  /kernel: bt1 <Buslogic 946 SCSI host adapter> rev 0 int a irq 11 on pci0:12
  /kernel: bt1: Bt946C/ 0-(32bit) bus
  /kernel: bt1: reading board settings, busmastering, int=11
  /kernel: bt1: version 4.28D, async only, parity, 32 mbxs, 32 ccbs
  /kernel: bt1: targ 0 async
  /kernel: bt1: targ 1 async
  /kernel: bt1: targ 2 async
  /kernel: bt1: Using Strict Round robin scheme
  /kernel: bt1 waiting for scsi devices to settle

So - 2 different firmware versions of the Buslogic board, and the more
recent one (4.28D) doesn't come up in sync mode... Any reason why?
(According to Buslogic, that version of the board firmware is good and I
should use Linux instead of FreeBSD. I did boot Linux on it once while
testing it, and Linux correctly enabled all devices in 10MB/sec sync mode.
Why doesn't FreeBSD?)
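
For reference, the "10.00MB/s(100ns)" figure in the bt0 lines is just the
synchronous transfer period turned into a rate. A minimal sketch of that
arithmetic, assuming a narrow (8-bit, one byte per transfer) bus; the
function name is made up for illustration:

```python
def sync_rate_mb_per_s(period_ns, bus_width_bytes=1):
    """Convert a SCSI synchronous transfer period into MB/s.

    Assumption: one transfer per period, bus_width_bytes bytes per
    transfer (1 for a narrow 8-bit bus).
    """
    transfers_per_second = 1e9 / period_ns
    return transfers_per_second * bus_width_bytes / 1e6


# The period reported for bt0 targets 0-2 in the boot messages.
print(f"{sync_rate_mb_per_s(100):.2f}MB/s")
```

An async target has no negotiated period at all, so there is no
equivalent figure for the bt1 lines.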

2nd problem: One of the disks seems to have a fault. Here are the errors
from the messages file:

  /kernel: sd1(bt0:1:0): MEDIUM ERROR info:272133 asc:11,0
	 Unrecovered read error
  /kernel: , retries:4
  /kernel: sd1(bt0:1:0): MEDIUM ERROR info:272133 asc:11,0
	Unrecovered read error
  /kernel: , retries:3
  /kernel: sd1(bt0:1:0): MEDIUM ERROR info:272133 asc:11,0
	Unrecovered read error 
  /kernel: , retries:2
  /kernel: bt0: Try to abort
  /kernel: bt0: not taking commands!
  /kernel: Debugger("bt742a") called.
  /kernel: bt0: Abort Operation has timed out  

At this stage the machine rebooted itself, fsck'd OK and carried on.
(It's a news server). That disk isn't used for swap so it was a file read
that caused the failure. Why should reading a duff sector cause the system
to crash? 
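
To check whether it is always the same spot on the disk, the messages
file can be summarised with something like the sketch below. The regular
expression is written against the three sample lines above only; other
drivers or releases may format the line differently. The "info" value is
kept as a string because I am not certain which radix the kernel prints
it in.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines; an assumption, not a
# documented format.
MEDIUM_ERROR = re.compile(
    r"(?P<dev>\w+)\((?P<ctrl>\w+):(?P<target>\d+):(?P<lun>\d+)\): "
    r"MEDIUM ERROR info:(?P<info>\w+) asc:(?P<asc>\w+),(?P<ascq>\w+)"
)


def count_medium_errors(lines):
    """Count MEDIUM ERROR reports per (device, info) pair."""
    counts = Counter()
    for line in lines:
        match = MEDIUM_ERROR.search(line)
        if match:
            counts[(match.group("dev"), match.group("info"))] += 1
    return counts


sample = [
    "  /kernel: sd1(bt0:1:0): MEDIUM ERROR info:272133 asc:11,0",
    "  /kernel: , retries:4",
    "  /kernel: sd1(bt0:1:0): MEDIUM ERROR info:272133 asc:11,0",
    "  /kernel: , retries:3",
    "  /kernel: sd1(bt0:1:0): MEDIUM ERROR info:272133 asc:11,0",
    "  /kernel: , retries:2",
]

for (dev, info), n in count_medium_errors(sample).items():
    print(f"{dev}: info {info} reported {n} time(s)")
```

On the excerpt above, all three reports land on the same device and the
same info value, i.e. it is one location being retried rather than
several different ones.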

So, does anyone have any ideas how to fix the error, why the bt driver
crashes, and why my 2nd controller doesn't come up in sync mode?

And exactly what are you supposed to do on a SCSI error anyway? How do I
mark the block bad? I know about the scsi command, and used:

  scsi -f /dev/sd1 -m 1

and got (amongst other things):

  AWRE (Auto Write Reallocation Enbld):  1 
  ARRE (Auto Read Reallocation Enbld):  1 

So the disk controller is supposed to automagically re-allocate bad blocks
- if that's the case, why am I seeing faults and why is it crashing the
machine?

If that isn't the case, how do I scan the disk for bad blocks and mark
them bad? I've been told that bad144 is only for IDE drives; is that
right? If so, what's the equivalent for SCSI drives?
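
The scanning half of that question could at least be approximated with a
plain read pass over the device. A rough sketch, assuming 512-byte
sectors and that the caller supplies the device path and block count
(both hypothetical parameters here); it only reports unreadable blocks,
it does not mark or reallocate anything, and on this machine a failing
read may well trigger the same driver crash described above:

```python
import os


def find_unreadable_blocks(path, total_blocks, block_size=512):
    """Read every block once and return the numbers of those that fail.

    Assumption: a failed read surfaces as OSError from os.pread.
    """
    bad_blocks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for block in range(total_blocks):
            try:
                os.pread(fd, block_size, block * block_size)
            except OSError:
                bad_blocks.append(block)
    finally:
        os.close(fd)
    return bad_blocks
```

Reading one sector per call is slow on a 2GB drive; it is written that
way only so that each failure maps to exactly one block number.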

Any help would be greatly appreciated - I'm sure I'm not the only one
in the world with a duff sector or 2 on a FreeBSD SCSI drive!

Gordon


