Date: Wed, 24 Nov 1999 13:12:32 +0100
From: "Ben C. O. Grimm" <Ben.Grimm@wirehub.net>
To: freebsd-questions@freebsd.org
Subject: Re: AHC parity errors, timeouts - -STABLE
Message-ID: <95ln3skv9r10s0f6sp0lhd8nhslh1hgrh7@smtp.wirehub.nl>
In-Reply-To: <19991124135732.E23235@mincom.com.newsgate.clinet.fi>
References: <19991124135732.E23235@mincom.com.newsgate.clinet.fi>
On 24 Nov 1999 07:46:42 +0200, Phil Homewood <philh@mincom.com> wrote:

> Anyone know of any changes to the SCSI/CAM/AHC code in -STABLE
> in the last two weeks that may cause devices or the bus to go
> out to lunch?
>
> I'm in the middle of deploying a swag of near-identical boxes,
> and the latest one died last night with console displaying
>
> ahc0: Data Parity Error Detected during address or write data phase
> (da0:ahc0:0:0:0): SCB 0x3 - timed out while idle, LASTPHASE == 0x1,
> SEQADDR == 0x8
> (da0:ahc0:0:0:0): SCB 3: Immediate reset. Flags = 0x4040
> (da0:ahc0:0:0:0): no longer in timeout, status = 34b
> ahc0: Issued Channel A Bus Reset. 64 SCBs aborted

Ooooooo, yes ....

(da0:ahc0:0:0:0): parity error during Data-In phase.
SEQADDR == 0x5d
SCSIRATE == 0x95
Unexpected busfree. LASTPHASE == 0xa0
SEQADDR == 0x15d
ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SAVED_TCL == 0x0, ARG_1 == 0x35, SEQ_FLAGS == 0x0
ahc0: Bus Device Reset on A:0. 19 SCBs aborted
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096
swap_pager: indefinite wait buffer: device: 0x20401, blkno: 768, size: 4096

Happens regularly at two machines under heavy load (news servers).
Sometimes 'panic: free vnode isn't' messages appear as well, but these
may be 'after the fact'.

ahc0: <Adaptec aic7890/91 Ultra2 SCSI adapter> rev 0x00 int a irq 10 on pci0.6.0
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
da0 at ahc0 bus 0 target 0 lun 0
da0: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 8748MB (17916096 512 byte sectors: 255H 63S/T 1115C)
da1 at ahc0 bus 0 target 0 lun 1
da1: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da1: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 34863MB (71399424 512 byte sectors: 255H 63S/T 4444C)

Problems ONLY appear when da0 and da1 are busy at the same time. When
the entire load is switched to da1, nothing's wrong.

The other machine:

ahc0: <Adaptec 2940 Ultra2 SCSI adapter> rev 0x00 int a irq 12 on pci0.10.0
ahc0: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
ahc1: <Adaptec 2940 Ultra2 SCSI adapter> rev 0x00 int a irq 10 on pci0.11.0
ahc1: aic7890/91 Wide Channel A, SCSI Id=7, 16/255 SCBs
da2 at ahc1 bus 0 target 0 lun 0
da2: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da2: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da2: 105010MB (215061120 512 byte sectors: 255H 63S/T 13386C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <IFT 3102 0212> Fixed Direct Access SCSI-2 device
da1: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da1: 105010MB (215061120 512 byte sectors: 255H 63S/T 13386C)
da0 at ahc0 bus 0 target 0 lun 0
da0: <IBM DNES-309170W SA30> Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 8748MB (17916240 512 byte sectors: 255H 63S/T 1115C)

Problems are less frequent on this one, but also mainly when both disk
systems are under heavy load at the same time.
Do I smell an Adaptec rat in here? Is it the hardware, firmware,
software, BSD code?

-- 
- Ben C. O. Grimm ----------------- Ben.Grimm@wirehub.net -
- Wirehub! Internet Engineering - http://www.wirehub.net/ -
- Wirehub! Backbone --- http://doema.wirehub.net/wirehub/ -
- Private Ponderings ------- http://libertas.wirehub.net/ -

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message