Date:      Wed, 24 Nov 1999 13:12:32 +0100
From:      "Ben C. O. Grimm" <Ben.Grimm@wirehub.net>
To:        freebsd-questions@freebsd.org
Subject:   Re: AHC parity errors, timeouts - -STABLE
Message-ID:  <95ln3skv9r10s0f6sp0lhd8nhslh1hgrh7@smtp.wirehub.nl>
In-Reply-To: <19991124135732.E23235@mincom.com.newsgate.clinet.fi>
References:  <19991124135732.E23235@mincom.com.newsgate.clinet.fi>

On 24 Nov 1999 07:46:42 +0200, Phil Homewood <philh@mincom.com> wrote:

> Anyone know of any changes to the SCSI/CAM/AHC code in -STABLE
> in the last two weeks that may cause devices or the bus to go
> out to lunch?
> 
> I'm in the middle of deploying a swag of near-identical boxes,
> and the latest one died last night with console displaying
> 
> ahc0: Data Parity Error Detected during address or write data phase
> (da0:ahc0:0:0:0): SCB 0x3 - timed out while idle, LASTPHASE == 0x1,
> SEQADDR == 0x8
> (da0:ahc0:0:0:0): SCB 3: Immediate reset.  Flags = 0x4040
> (da0:ahc0:0:0:0): no longer in timeout, status = 34b
> ahc0: Issued Channel A Bus Reset. 64 SCBs aborted

Ooooooo, yes ....

(da0:ahc0:0:0:0): parity error during Data-In phase.
SEQADDR == 0x5d
SCSIRATE == 0x95
Unexpected busfree.  LASTPHASE == 0xa0
SEQADDR == 0x15d
ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SAVED_TCL == 0x0, ARG_1 == 0x35, SEQ_FLAGS == 0x0
ahc0: Bus Device Reset on A:0. 19 SCBs aborted
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096

This happens regularly on two machines under heavy load (both are news servers).

Sometimes 'panic: free vnode isn't' messages appear as well, but these
may be a consequence of the SCSI errors rather than a cause.

ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> rev 0x00 int a irq 10 on pci0.6.0
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
da0 at ahc0 bus 0 target 0 lun 0
da0: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 8748MB (17916096 512 byte sectors: 255H 63S/T 1115C)
da1 at ahc0 bus 0 target 0 lun 1
da1: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da1: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 34863MB (71399424 512 byte sectors: 255H 63S/T 4444C)
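
As a sanity check on the probe output above, the reported capacities and
transfer rates can be reproduced from the sector counts and bus parameters.
This is only a sketch: it assumes the driver reports MB as 2**20 bytes
truncated to an integer, and that the peak rate is the sync clock times the
bus width in bytes. The helper names are made up for illustration.

```python
SECTOR_BYTES = 512  # sector size shown in the probe lines


def capacity_mb(sectors: int) -> int:
    """Capacity in MB, assuming 1 MB = 2**20 bytes and truncation."""
    return sectors * SECTOR_BYTES // (1 << 20)


def transfer_mb_per_s(clock_mhz: float, width_bits: int) -> float:
    """Peak synchronous rate: transfers per microsecond times bytes per transfer."""
    return clock_mhz * width_bits / 8


if __name__ == "__main__":
    # Sector counts and bus parameters copied from the da0/da1 probe lines.
    print("da0:", capacity_mb(17916096), "MB")
    print("da1:", capacity_mb(71399424), "MB")
    print("rate:", transfer_mb_per_s(20.0, 16), "MB/s")
```

Under those assumptions the figures agree with the 8748MB, 34863MB and
40.000MB/s values the kernel printed.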

Problems ONLY appear when da0 and da1 are busy at the same time. When
the entire load is switched to da1, nothing's wrong.
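
One way to check whether the errors really track simultaneous load would be
to read from both disks at once and watch the console. The following is a
rough sketch, not something that was run on these machines: the device paths
in the example call are assumptions, the function names are hypothetical, and
it only reads, never writes.

```python
import threading


def read_all(path: str, limit: int, chunk: int = 64 * 1024) -> int:
    """Sequentially read up to `limit` bytes from `path`; return bytes read.

    For raw disk devices, `limit` and `chunk` should be multiples of the
    sector size.
    """
    total = 0
    with open(path, "rb", buffering=0) as f:
        while total < limit:
            data = f.read(min(chunk, limit - total))
            if not data:  # end of file or device
                break
            total += len(data)
    return total


def read_concurrently(paths, limit: int):
    """Read from every path at the same time, one thread per path."""
    results = {}

    def worker(p):
        results[p] = read_all(p, limit)

    threads = [threading.Thread(target=worker, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Assumed raw device names; adjust for the system at hand.
    print(read_concurrently(["/dev/rda0", "/dev/rda1"], limit=1 << 30))
```

Running it against a single path first and then against both gives a
comparison between the one-disk and two-disk cases.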

The other machine:

ahc0: <Adaptec 2940 Ultra2 SCSI adapter> rev 0x00 int a irq 12 on pci0.10.0
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: <Adaptec 2940 Ultra2 SCSI adapter> rev 0x00 int a irq 10 on pci0.11.0
ahc1: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
da2 at ahc1 bus 0 target 0 lun 0
da2: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da2: 105010MB (215061120 512 byte sectors: 255H 63S/T 13386C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da1: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 105010MB (215061120 512 byte sectors: 255H 63S/T 13386C)
da0 at ahc0 bus 0 target 0 lun 0
da0: <IBM DNES-309170W SA30> Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)

Problems are less frequent on this one, but also mainly when both disk
systems are under heavy load at the same time.

Do I smell an Adaptec rat in here? Is it the hardware, firmware,
software, BSD code?

-- 
- Ben C. O. Grimm ----------------- Ben.Grimm@wirehub.net -
- Wirehub! Internet Engineering - http://www.wirehub.net/ -
- Wirehub! Backbone --- http://doema.wirehub.net/wirehub/ -
- Private Ponderings ------- http://libertas.wirehub.net/ -
