Date: Sun, 13 Apr 1997 19:01:06 -0700 (PDT)
From: Simon Shapiro <Shimon@i-Connect.Net>
To: freebsd-hackers@freebsd.org, freebsd-scsi@freebsd.org
Subject: SCSI Problems Since 2.2-BETA_A of 13-Feb-1997
Message-ID: <XFMail.970413224310.Shimon@i-Connect.Net>
I apologize if I am repeating known truth, jumping into the middle of a conversation, or missing the solution (I was very busy the last few weeks), but this is serious.

We are having a severe problem with wide SCSI devices connected to AHA-2940W or 2940UW (PCI) controllers. The problem manifests itself as (more or less):

    sd10(ahc1:13:0): timed out in dataout phase SCSISIGI = 0x0 SEQADDR = {0x6,0xf,0xa....}
    ahc1: Issued channel A Bus reset
    spec_getpage I/O read error
    vm_fault: pager input ...

With kernels up to and including the one mentioned in the Subject: line, we get this error once, it clears, and the system continues for a time. Anything later that we have goes into an endless loop of these errors, from which only a POWER CYCLE will save us.

Now, the problem is probably real. Getting the SCSI error is not what I am asking for help with; the endless loop is.

The system has two controllers, a 2940W and a 2940UW. I cannot see any difference in behavior between them (but could be wrong). Hooked up to these controllers are:

ahc0 (the 2940W): Iomega Jaz, Yamaha CDR-100, Sony SDT-7000/BM (DAT), and 8 Seagate 2713 (2GB Barracuda) drives. The Barracudas are in an external box, with a good cable, and terminated via the internal termination on the last drive.

ahc1 (the 2940UW): 8 Quantum 4GB drives in a ``raid'' box.

Each group is arranged as a CCD RAID-0 array, 4 drives per HBA. The arrangement is suboptimal in that drives are accessed sequentially on the same bus. It is also suboptimal in that there must be noise on the bus.

Close examination of the bus reveals that when an error occurs, a drive stays selected and the bus stays locked up. It stays locked up through a power cycle of the drives (as soon as the drive is spinning again, it is selected again). It survives reboots and motherboard resets. It will only release if the motherboard is powered off and on again.

This hard-core stuckiness is only evident with newer revisions of the aic7xxx driver. Up to 2.2-BETA_A, the errors happen, rarely, but the driver resets the bus and recovers nicely.
I am not sure if this is a ccd problem (timing, etc.; I do not think so), a SCSI layer problem (again, I do not think so), or an aic7xxx problem. I also realize that this is more hardware than most of us use, but it represents about 1/3 of what we want done (48 drives per system as a base configuration).

Any help I can get/give will be appreciated!

Thanx,
Simon