Date:      Sun, 13 Apr 1997 19:01:06 -0700 (PDT)
From:      Simon Shapiro <Shimon@i-Connect.Net>
To:        freebsd-hackers@freebsd.org, freebsd-scsi@freebsd.org
Subject:   SCSI Problems Since 2.2-BETA_A of 13-Feb-1997
Message-ID:  <XFMail.970413224310.Shimon@i-Connect.Net>

I apologize if I am repeating known truth, jumping into the middle of a
conversation, or missing the solution (I was very busy the last few
weeks), but this is serious.

We are having a severe problem with SCSI devices (wide) connected to
AHA-2940W or 2940UW (PCI).  The problem manifests itself as (more or less):

  sd10(ahc1:13:0): timed out in dataout phase SCSISIGI = 0x0
  SEQADDR = {0x6,0xf,0xa....}
  ahc1: Issued channel A Bus reset
  spec_getpage I/O read error
  vm_fault:  pager input ...

With kernels up to and including the one mentioned in the Subject: line, we
get this error once, it clears, and the system continues for a time.  Any
later kernel we have goes into an endless loop of these errors, from which
only a POWER CYCLE will save us.

Now, the problem is probably real.  Getting the SCSI error is not what I am
asking for help with.  The endless loop is.

The system has two controllers, a 2940W and a 2940UW.  I cannot see any
difference in behavior between them (but could be wrong).

To these controllers are hooked up:

ahc0 (the 2940W): Iomega Jaz, Yamaha CDR-100, Sony SDT-7000/BM (DAT) and 8
Seagate 2713 (2GB Barracuda).  The Barracudas are in an external box, with a
good cable, and terminated via the internal termination on the last drive.

ahc1 (the 2940UW):  8 Quantum 4GB drives in a ``raid'' box.

Each group is arranged as a CCD RAID-0 array, 4 drives per HBA.  The
arrangement is suboptimal in that drives are accessed sequentially on the
same bus.  It is also suboptimal in that there must be noise on the bus.

Close examination of the bus reveals that when an error occurs, a drive
stays selected and the bus stays locked up.  It stays locked up during a
power cycle of the drives (as soon as the drive is spinning again, it is
selected again).  It survives reboots and motherboard resets.  It will only
release if the motherboard is powered off and on again.

This hard-core stuckiness is only evident with newer revisions of the
aic7xxx driver.  Up to the 2.2-BETA_A, the errors happen, rarely, but the
driver resets the bus and recovers nicely.

I am not sure if this is a ccd problem (timing, etc.;  I do not think so),
or a scsi layer problem (again, I do not think so), or an aic7xxx problem.

I also realize that this is more hardware than most of us use, but it
represents about 1/3 of what we want done (48 drives per system as a base
configuration).

Any help I can get/give will be appreciated!

Thanx,

Simon
