Date:      Sun, 7 Nov 1999 12:59:30 +0100
From:      Shaun Jurrens <shamz@powertech.no>
To:        "Justin T. Gibbs" <gibbs@narnia.plutotech.com>
Cc:        scsi@freebsd.org
Subject:   Re: scsi bus errors
Message-ID:  <19991107125930.A20165@shamz.net>
In-Reply-To: <199911051943.MAA69413@narnia.plutotech.com>; from Justin T. Gibbs on Fri, Nov 05, 1999 at 12:43:02PM -0700
References:  <19991105120951.C1083@shamz.net> <199911051943.MAA69413@narnia.plutotech.com>

On Fri, Nov 05, 1999 at 12:43:02PM -0700, Justin T. Gibbs wrote:
#> In article <19991105120951.C1083@shamz.net> you wrote:
#> 
#> [
#>   I've reformatted your mail so it fits in 80 columns.  This makes
#>   it much easier to read.
#> ]
	Sorry, I forgot to import my nexrc to this account.
#> 
#> > Hi,
#> > 
#> > After reading the lists and trying about everything under the sun
#> > to get the errors to abate, I am finally writing.  The setup is
#> > about the same as all the others with SCB timeout errors.
#> 
#> ...
#> 
#> > I left out the logs because they don't seem to have been more than
#> > grounds for speculation about termination and such up until now.
#> 
#> As far as I know, we've resolved all other "timeout" type errors
#> with the ahc driver.  This was only possible because the people
#> having the problems gave detailed information about their setup
#> and the errors that occurred.  In other words, provide the output
#> from 'dmesg' for the system having problems as well as the messages
#> output by the driver when the errors occur and we'll see what we
#> can do.  Leave the determination of what is valuable information
#> to the experts.
#> 
#> --
#> Justin

Well then let's begin with dmesg:
Copyright (c) 1992-1999 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California. All rights reserved.
FreeBSD 3.3-STABLE #0: Sun Oct 24 17:54:31 CEST 1999
    root@dakota.shamz.net:/usr/src/sys/compile/DAKOTA
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 350797513 Hz
CPU: AMD-K6(tm) 3D processor (350.80-MHz 586-class CPU)
  Origin = "AuthenticAMD"  Id = 0x580  Stepping = 0
  Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
  AMD Features=0x80000800<SYSCALL,3DNow!>
real memory  = 67108864 (65536K bytes)
avail memory = 61997056 (60544K bytes)
Preloaded elf kernel "kernel" at 0xc0312000.
Probing for devices on PCI bus 0:
chip0: <VIA 82C597 (Apollo VP3) system controller> rev 0x04 on pci0.0.0
chip1: <VIA 82C598MVP (Apollo MVP3) PCI-PCI bridge> rev 0x00 on pci0.1.0
chip2: <VIA 82C586 PCI-ISA bridge> rev 0x41 on pci0.7.0
chip3: <VIA 82C586B ACPI interface> rev 0x10 on pci0.7.3
ahc0: <Adaptec 2940 SCSI adapter> rev 0x03 int a irq 9 on pci0.8.0
ahc0: aic7870 Wide Channel A, SCSI Id=7, 16/255 SCBs
vga0: <Matrox MGA 1024SG/1064SG/1164SG graphics accelerator> rev 0x03 int a irq 10 on pci0.9.0
rl0: <RealTek 8139 10/100BaseTX> rev 0x10 int a irq 11 on pci0.10.0
rl0: Ethernet address: 00:e0:7d:01:00:99
rl0: autoneg complete, link status good (half-duplex, 10Mbps)
Probing for devices on PCI bus 1:
Probing for PnP devices:
CSN 1 Vendor ID: CTL00f0 [0xf0008c0e] Serial 0xffffffff Comp ID: PNPb02f [0x2fb0d041]
This is a Vibra16X, but LDN 0 is disabled
Probing for devices on the ISA bus:
sc0 on isa
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A
sio1 at 0x2f8-0x2ff irq 3 flags 0x10 on isa
sio1: type 16550A
atkbdc0 at 0x60-0x6f on motherboard
atkbd0 irq 1 on isa
psm0 irq 12 on isa
psm0: model Generic PS/2 mouse, device ID 0
ppc0 at 0x378 irq 7 on isa
ppc0: Winbond chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
ppb0: IEEE1284 device found /ECP
Probing for PnP devices on ppbus0:
ppbus0: <Connectix QuickCam VC> MEDIA
lpt0: <generic printer> on ppbus 0
lpt0: Interrupt-driven port
ppi0: <generic parallel i/o> on ppbus 0
pcm0 not found
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
fd1: 1.2MB 5.25in
npx0 on motherboard
npx0: INT 16 interface
vga0 at 0x3b0-0x3df maddr 0xa0000 msize 131072 on isa
IP packet filtering initialized, divert enabled, rule-based forwarding disabled, default to accept, logging limited to 100 packets/entry by default
IP Filter: initialized.  Default = pass all, Logging = enabled
Waiting 8 seconds for SCSI devices to settle
da1 at ahc0 bus 0 target 1 lun 0
da1: <Quantum XP34300W L912> Fixed Direct Access SCSI-2 device 
da1: 16.128MB/s transfers (8.064MHz, offset 8, 16bit), Tagged Queueing Enabled
da1: 4101MB (8399520 512 byte sectors: 255H 63S/T 522C)
da2 at ahc0 bus 0 target 9 lun 0
da2: <Quantum XP31070W L912> Fixed Direct Access SCSI-2 device 
da2: 16.128MB/s transfers (8.064MHz, offset 8, 16bit), Tagged Queueing Enabled
da2: 1075MB (2203480 512 byte sectors: 255H 63S/T 137C)
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST31230W 0300> Fixed Direct Access SCSI-2 device 
da0: 20.000MB/s transfers (10.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 1010MB (2069860 512 byte sectors: 64H 32S/T 1010C)
cd0 at ahc0 bus 0 target 2 lun 0
cd0: <PIONEER CD-ROM DR-124X 1.00> Removable CD-ROM SCSI-2 device 
cd0: 4.629MB/s transfers (4.629MHz, offset 8)
cd0: cd present [105372 x 2048 byte records]
cd9660: Joliet Extension
rl0: selecting MII, 10Mbps, half duplex
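
A quick sanity check on the da/cd lines above: the reported rate is the
negotiated sync frequency times the bus width in bytes.  A minimal sketch
in Python; the function name is made up for illustration, and the 8-bit
width for cd0 is an assumption (its line does not say "16bit").

```python
def sync_rate_mb_s(freq_mhz, width_bits):
    """Peak sync transfer rate: one transfer of width_bits per clock."""
    return freq_mhz * (width_bits // 8)

# (sync frequency in MHz, bus width in bits) as printed by dmesg above.
# cd0's width is assumed to be 8 bits (narrow); dmesg does not state it.
devices = {
    "da0": (10.000, 16),
    "da1": (8.064, 16),
    "da2": (8.064, 16),
    "cd0": (4.629, 8),
}

for name, (mhz, bits) in devices.items():
    print(f"{name}: {sync_rate_mb_s(mhz, bits):.3f}MB/s")
```

The two Quantum drives come out below the Seagate because their sync
frequency is lower, not because the bus is any narrower.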


As you can see, I have tried lowering the transfer speed between the
controller and the Quantum drives (as suggested in the man page), but that
did not noticeably reduce the errors.  I am currently trying to retrieve
the drive specs so I can check the jumpers on all the drives once more,
just to be sure.  I have also retrieved a new BIOS for my FIC board, after
noticing that at least one other person has the same board as I do.  BTW,
I am still not on the list.  I'm working on that too, but the machine
crashed this morning, so I have to get this hardware problem taken care of
first.  Here are a few console errors too, for completeness:

Oct  3 19:03:34 dakota /kernel: (da2:ahc0:0:9:0): Queuing a BDR SCB
Oct  3 19:03:34 dakota /kernel: (da2:ahc0:0:9:0): Bus Device Reset Message Sent
Oct  3 19:03:34 dakota /kernel: (da2:ahc0:0:9:0): no longer in timeout, status = 34b
Oct  3 19:03:34 dakota /kernel: ahc0: Bus Device Reset on A:9. 3 SCBs aborted
Oct  4 21:40:03 dakota /kernel: (da2:ahc0:0:9:0): data overrun detected in Data-In phase.  Tag == 0x45.
Oct  4 21:40:03 dakota /kernel: (da2:ahc0:0:9:0): Have seen Data Phase.  Length = 40960.  NumSGs = 10.
Oct  4 21:40:03 dakota /kernel: sg[0] - Addr 0x2735000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[1] - Addr 0x1d36000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[2] - Addr 0x3d77000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[3] - Addr 0x2878000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[4] - Addr 0x2179000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[5] - Addr 0x2eba000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[6] - Addr 0x113b000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[7] - Addr 0x243c000 : Length 4096
Oct  4 21:40:03 dakota /kernel: sg[8] - Addr 0x39fd000 : Length 4096
Oct  4 21:47:34 dakota /kernel: pid 808 (navigator-4.61.b), uid 1002: exited on signal 10

	here, it obviously killed netscape, but that's not hard


Oct  8 12:56:35 dakota /kernel: (da2:ahc0:0:9:0): data overrun detected in Data-Out phase.  Tag == 0x1a.
Oct  8 12:56:35 dakota /kernel: (da2:ahc0:0:9:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct  8 12:56:35 dakota /kernel: sg[0] - Addr 0x3075000 : Length 4096
Oct  9 21:03:04 dakota mountd[143]: mount request succeeded from 192.168.0.17 for /usr/local/public/root/midge
Oct  9 21:28:07 dakota /kernel: (da1:ahc0:0:1:0): data overrun detected in Data-Out phase.  Tag == 0x21.
Oct  9 21:28:07 dakota /kernel: (da1:ahc0:0:1:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct  9 21:28:07 dakota /kernel: sg[0] - Addr 0x1d89000 : Length 4096
Oct  9 21:29:03 dakota /kernel: (da1:ahc0:0:1:0): SCB 0x2c - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0xb
Oct  9 21:29:03 dakota /kernel: (da1:ahc0:0:1:0): Queuing a BDR SCB
Oct  9 21:29:03 dakota /kernel: (da1:ahc0:0:1:0): Bus Device Reset Message Sent
Oct  9 21:29:03 dakota /kernel: (da1:ahc0:0:1:0): no longer in timeout, status = 34b
Oct  9 21:29:03 dakota /kernel: ahc0: Bus Device Reset on A:1. 10 SCBs aborted
Oct  9 21:29:31 dakota /kernel: (da1:ahc0:0:1:0): data overrun detected in Data-Out phase.  Tag == 0x21.
Oct  9 21:29:31 dakota /kernel: (da1:ahc0:0:1:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct  9 21:29:31 dakota /kernel: sg[0] - Addr 0xeb5000 : Length 4096
Oct  9 21:31:22 dakota /kernel: (da1:ahc0:0:1:0): data overrun detected in Data-Out phase.  Tag == 0xf0.
Oct  9 21:31:22 dakota /kernel: (da1:ahc0:0:1:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct  9 21:31:22 dakota /kernel: sg[0] - Addr 0x2cdd000 : Length 4096
Oct  9 21:34:43 dakota /kernel: (da2:ahc0:0:9:0): data overrun detected in Data-In phase.  Tag == 0x23.
Oct  9 21:34:43 dakota /kernel: (da2:ahc0:0:9:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct  9 21:34:43 dakota /kernel: sg[0] - Addr 0x3e86000 : Length 4096
Oct  9 21:42:59 dakota /kernel: pid 45161 (ld), uid 0: exited on signal 11 (core dumped)

	this was not nice.

Oct 12 02:06:20 dakota /kernel: (da1:ahc0:0:1:0): data overrun detected in Data-In phase.  Tag == 0x26.
Oct 12 02:06:20 dakota /kernel: (da1:ahc0:0:1:0): Have seen Data Phase.  Length = 1024.  NumSGs = 1.
Oct 12 07:43:27 dakota /kernel: (da2:ahc0:0:9:0): SCB 0xa - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0xc
Oct 12 07:43:35 dakota /kernel: (da2:ahc0:0:9:0): Queuing a BDR SCB
Oct 12 07:43:35 dakota /kernel: (da2:ahc0:0:9:0): Bus Device Reset Message Sent
Oct 12 07:43:35 dakota /kernel: (da2:ahc0:0:9:0): no longer in timeout, status = 34b
Oct 12 07:43:35 dakota /kernel: ahc0: Bus Device Reset on A:9. 1 SCBs aborted
Oct 12 07:44:27 dakota /kernel: (da1:ahc0:0:1:0): SCB 0x9 - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0x9
Oct 12 07:44:27 dakota /kernel: (da1:ahc0:0:1:0): Queuing a BDR SCB
Oct 12 07:44:27 dakota /kernel: (da1:ahc0:0:1:0): Bus Device Reset Message Sent
Oct 12 07:44:27 dakota /kernel: (da1:ahc0:0:1:0): no longer in timeout, status = 34b
Oct 12 07:44:27 dakota /kernel: ahc0: Bus Device Reset on A:1. 7 SCBs aborted
Oct 12 07:45:00 dakota /kernel: ahc0:A:1: no active SCB for reconnecting target - issuing BUS DEVICE RESET
Oct 12 07:45:00 dakota /kernel: SAVED_TCL == 0x10, ARG_1 == 0x6, SEQ_FLAGS == 0x40
Oct 12 07:45:00 dakota /kernel: ahc0: Bus Device Reset on A:1. 13 SCBs aborted
Oct 12 07:45:27 dakota /kernel: (da2:ahc0:0:9:0): SCB 0x11 - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0x9
Oct 12 07:45:35 dakota /kernel: (da2:ahc0:0:9:0): Queuing a BDR SCB
Oct 12 07:45:35 dakota /kernel: (da2:ahc0:0:9:0): Bus Device Reset Message Sent
Oct 12 07:45:35 dakota /kernel: (da2:ahc0:0:9:0): no longer in timeout, status = 34b
Oct 12 07:45:35 dakota /kernel: ahc0: Bus Device Reset on A:9. 3 SCBs aborted
Oct 12 07:49:58 dakota /kernel: pid 27230 (cc), uid 0: exited on signal 11 (core dumped)
	again, not very nice when you're compiling

Oct 18 21:31:25 dakota /kernel: (da2:ahc0:0:9:0): data overrun detected in Data-Out phase.  Tag == 0x2.
Oct 18 21:31:25 dakota /kernel: (da2:ahc0:0:9:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct 18 21:31:25 dakota /kernel: sg[0] - Addr 0x287b000 : Length 4096

Oct 21 22:02:16 dakota /kernel: (da2:ahc0:0:9:0): data overrun detected in Data-Out phase.  Tag == 0x3e.
Oct 21 22:02:16 dakota /kernel: (da2:ahc0:0:9:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.
Oct 21 22:02:16 dakota /kernel: sg[0] - Addr 0x2a01000 : Length 4096


Oct 22 02:08:04 dakota /kernel: ahc0:A:9: ahc_intr - referenced scb not valid during seqint 0x71 scb(36)
Oct 22 02:09:12 dakota /kernel: ahc0: WARNING no command for scb 36 (cmdcmplt)
Oct 22 02:09:12 dakota /kernel: QOUTPOS = 102
Oct 22 02:09:12 dakota /kernel: (da2:ahc0:0:9:0): SCB 0x14 - timed out while idle, LASTPHASE == 0x1, SEQADDR == 0x9
Oct 22 02:09:12 dakota /kernel: (da2:ahc0:0:9:0): Queuing a BDR SCB
Oct 22 02:09:12 dakota /kernel: (da2:ahc0:0:9:0): Bus Device Reset Message Sent
Oct 22 02:09:12 dakota /kernel: (da2:ahc0:0:9:0): no longer in timeout, status = 34b
Oct 22 02:09:12 dakota /kernel: ahc0: Bus Device Reset on A:9. 2 SCBs aborted


Oct 22 09:40:29 dakota /kernel: (da2:ahc0:0:9:0): data overrun detected in Data-Out phase.  Tag == 0x6.
Oct 22 09:40:29 dakota /kernel: (da2:ahc0:0:9:0): Have seen Data Phase.  Length = 8192.  NumSGs = 2.


And so on, ad infinitum: an average of two per day under minimal load.
Sorry this got so long; I hope the formatting worked.
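
To put a number on "how often, and on which drive", messages like the ones
above can be tallied per device.  The following is only a rough sketch in
Python, not anything the ahc driver or FreeBSD ships: the function names
and the "bus" label are made up for illustration, and the matched
substrings are copied from the log lines quoted in this mail.  It rejoins
lines that were cut at 80 columns when pasted, and skips tab-indented
commentary.

```python
import re
from collections import Counter

# A syslog line starts with "Mon DD HH:MM:SS "; anything else that directly
# follows one is treated as the tail of a line cut at 80 columns.
STAMP = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d\d:\d\d:\d\d ")
DEVICE = re.compile(r"\((\w+):ahc\d+:")

# Substrings taken from the driver messages quoted in this mail.
EVENTS = {
    "overrun": "data overrun detected",
    "timeout": "timed out while idle",
    "reset": "Bus Device Reset on",
}

def unwrap(lines):
    """Rejoin hard-wrapped continuation lines onto their syslog line."""
    joined = []
    in_record = False
    for raw in lines:
        line = raw.rstrip("\n")
        if STAMP.match(line):
            joined.append(line)
            in_record = True
        elif in_record and line.strip() and not line.startswith("\t"):
            joined[-1] += line  # hard wrap: no separator was lost
        else:
            in_record = False   # blank line or tab-indented commentary
    return joined

def tally(lines):
    """Count events keyed by (device, event name)."""
    counts = Counter()
    for line in unwrap(lines):
        dev = DEVICE.search(line)
        # "bus" is a placeholder label for messages with no (daN:...) prefix.
        name = dev.group(1) if dev else "bus"
        for event, needle in EVENTS.items():
            if needle in line:
                counts[(name, event)] += 1
    return counts
```

Something like `tally(open("/var/log/messages"))` would then give counts
such as overruns per drive, which makes it easier to see whether one drive
or the whole bus is misbehaving.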



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message



