Date:      Fri, 19 Jan 2001 15:36:27 -0800
From:      "Brett G. Lemoine" <bl@incyte.com>
To:        FreeBSD-gnats-submit@freebsd.org
Cc:        bl@incyte.com
Subject:   i386/24469: system hangs on scsi disk access error
Message-ID:  <200101192336.PAA158459@blah.incyte.com>

>Number:         24469
>Category:       i386
>Synopsis:       system hangs on scsi disk access error
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jan 19 15:40:01 PST 2001
>Closed-Date:
>Last-Modified:
>Originator:     Brett G Lemoine
>Release:        FreeBSD 4.2-RELEASE i386
>Organization:
Incyte Genomics, Inc
>Environment:

	TYAN Thunderbolt S1837 motherboard w/ onboard Adaptec
	AIC-7896 dual channel Ultra2 LVD SCSI


	FreeBSD 4.2-RELEASE #1: Fri Jan 12 19:52:23 CST 2001
	    root@blur.unixshaman.com:/usr/src/sys/compile/SHAMAN
	Timecounter "i8254"  frequency 1193182 Hz
	CPU: Pentium III/Pentium III Xeon/Celeron (751.71-MHz 686-class CPU)
	  Origin = "GenuineIntel"  Id = 0x681  Stepping = 1
	  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
	real memory  = 1073741824 (1048576K bytes)
	config> di aha0
	config> q
	avail memory = 1042231296 (1017804K bytes)
	Programming 24 pins in IOAPIC #0
	IOAPIC #0 intpin 2 -> irq 0
	FreeBSD/SMP: Multiprocessor motherboard
	 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee00000
	 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee00000
	 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
	Preloaded elf kernel "kernel" at 0xc0392000.
	Preloaded userconfig_script "/boot/kernel.conf" at 0xc039209c.
	Pentium Pro MTRR support enabled
	md0: Malloc disk
	npx0: <math processor> on motherboard
	npx0: INT 16 interface
	pcib0: <Intel 82443GX host to PCI bridge> on motherboard
	pci0: <PCI bus> on pcib0
	pcib2: <Intel 82443GX (440 GX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
	pci1: <PCI bus> on pcib2
	pcib3: <PCI to PCI bridge (vendor=1011 device=0023)> at device 1.0 on pci1
	pci2: <PCI bus> on pcib3
	pci2: <S3 Savage 4 graphics accelerator> at 1.0
	pci2: <S3 Savage 4 graphics accelerator> at 2.0
	pci2: <S3 Savage 4 graphics accelerator> at 3.0
	pci2: <S3 Savage 4 graphics accelerator> at 4.0
	isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
	isa0: <ISA bus> on isab0
	atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device 7.1 on pci0
	ata0: at 0x1f0 irq 14 on atapci0
	ata1: at 0x170 irq 15 on atapci0
	uhci0: <Intel 82371AB/EB (PIIX4) USB controller> at device 7.2 on pci0
	uhci0: Invalid irq 255
	uhci0: Please switch on USB support and switch PNP-OS to 'No' in BIOS
	device_probe_and_attach: uhci0 attach returned 6
	Timecounter "PIIX"  frequency 3579545 Hz
	chip1: <Intel 82371AB Power management controller> port 0x440-0x44f at device 7.3 on pci0
	ahc0: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0xe400-0xe4ff mem 0xfebfe000-0xfebfefff irq 16 at device 11.0 on pci0
	aic7896/97: Wide Channel A, SCSI Id=7, 32/255 SCBs
	ahc1: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 16 at device 11.1 on pci0
	aic7896/97: Wide Channel B, SCSI Id=7, 32/255 SCBs
	pcm0: <AudioPCI ES1371> port 0xef00-0xef3f irq 18 at device 12.0 on pci0
	fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xee80-0xeebf mem 0xfea00000-0xfeafffff,0xfebfd000-0xfebfdfff irq 19 at device 13.0 on pci0
	fxp0: Ethernet address 00:e0:81:10:c9:0e
	fxp1: <Intel Pro 10/100B/100+ Ethernet> port 0xed80-0xedbf mem 0xfe800000-0xfe8fffff,0xfebfc000-0xfebfcfff irq 17 at device 17.0 on pci0
	fxp1: Ethernet address 00:d0:b7:73:39:03
	pcib1: <Intel 82443GX host to AGP bridge> on motherboard
	pci3: <PCI bus> on pcib1
	fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
	fdc0: FIFO enabled, 8 bytes threshold
	fd0: <1440-KB 3.5" drive> on fdc0 drive 0
	atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
	atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
	kbd0 at atkbd0
	psm0: <PS/2 Mouse> irq 12 on atkbdc0
	psm0: model IntelliMouse, device ID 3
	vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
	sc0: <System console> at flags 0x100 on isa0
	sc0: VGA <16 virtual consoles, flags=0x300>
	sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
	sio0: type 16550A
	sio1 at port 0x2f8-0x2ff irq 3 on isa0
	sio1: type 16550A
	ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
	ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
	ppi0: <Parallel I/O> on ppbus0
	plip0: <PLIP network interface> on ppbus0
	lpt0: <Printer> on ppbus0
	lpt0: Interrupt-driven port
	APIC_IO: Testing 8254 interrupt delivery
	APIC_IO: routing 8254 via IOAPIC #0 intpin 2
	SMP: AP CPU #1 Launched!
	acd0: CDROM <TOSHIBA CD-ROM XM-6702B> at ata1-master using PIO4
	Waiting 5 seconds for SCSI devices to settle
	Mounting root from ufs:/dev/da0s1a
	da0 at ahc0 bus 0 target 0 lun 0
	da0: <SEAGATE ST318404LW 0002> Fixed Direct Access SCSI-3 device 
	da0: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
	da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
	da1 at ahc0 bus 0 target 1 lun 0
	da1: <SEAGATE ST318404LW 0002> Fixed Direct Access SCSI-3 device 
	da1: 80.000MB/s transfers (40.000MHz, offset 63, 16bit), Tagged Queueing Enabled
	da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
	WARNING: / was not properly dismounted
	cd0 at ahc1 bus 0 target 5 lun 0
	cd0: <YAMAHA CRW8424S 1.0j> Removable CD-ROM SCSI-2 device 
	cd0: 20.000MB/s transfers (20.000MHz, offset 15)
	cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed
	da2 at ahc1 bus 0 target 6 lun 0
	da2: <iomega jaz 2GB E.17> Removable Direct Access SCSI-2 device 
	da2: 20.000MB/s transfers (20.000MHz, offset 15)
	da2: Attempt to query device size failed: NOT READY, Medium not present
	pid 217 (Xaccel): trap 12 with interrupts disabled
	pid 217 (Xaccel): trap 7 with interrupts disabled
	cd9660: RockRidge Extension

>Description:

	Sporadically (5 times in the last two weeks, including 3 times
	on one day), I get the errors below on one of my two disks.

(da1:ahc0:0:1:0): SCB 0x1d - timed out while idle, SEQADDR == 0x5
STACK == 0x13, 0x174, 0x15e, 0x174
SXFRCTL0 == 0x80
SCB count = 110
QINFIFO entries: 34 18 46 1 19 31 52 20 33 9 3 67 57 45 0 30 54 22 50 40 23 8 36 2 32 44 35 5 17 11 28 10 101 15 51 26 6
Waiting Queue entries: 11:66
Disconnected Queue entries: 17:39 27:29
QOUTFIFO entries:
Sequencer Free SCB List: 20 2 0 28 14 10 29 31 15 24 7 19 6 23 18 21 12 26 13 22 4 30 9 3 16 8 25 1 5
Pending list: 6 26 51 15 101 10 28 11 17 5 35 44 32 2 36 8 23 40 50 22 54 30 0 45 57 67 3 9 33 20 52 31 19 1 46 18 34 66 39 29
Kernel Free SCB list: 24 58 25 47 59 55 27 42 4 49 3 8 37 43 21 41 53 48 16 12 69 56 68 13 83 14 82 81 80 99 98 97 96 95 94 93 92 91 90 109 108 107 106 105 104 103 102 65 84 85 86 87 88 89 70 71 72 73 74 75 76 77 78 79 60 61 62 63 64 100
sg[0] - Addr 0x1a608800 : Length 1024
(da1:ahc0:0:1:0): SCB 29: Immediate reset.  Flags = 0x4040
(da1:ahc0:0:1:0): no longer in timeout, status = 34b
ahc0: Issued Channel A Bus Reset. 40 SCBs aborted

	After looking for similar problems in the GNATS database, I saw
	suggestions to disable tagged queueing, which I then did on
	both disks (using camcontrol).

	I then didn't see the problem for a while and thought it had
	been taken care of, but today I got the following:


(da0:ahc0:0:0:0): SCB 0x8 - timed out while idle, SEQADDR == 0x3e
STACK == 0x1, 0x1, 0x1, 0x1
SXFRCTL0 == 0x80
SCB count = 20
QINFIFO entries: 8 14
Waiting Queue entries:
Disconnected Queue entrties:
QOUTFIFO entries:
Sequencer Free SCB List: 1 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Pending list: 14 8
Kernel Free SCB list: 15 16 17 18 18 0 1 2 3 4 5 6 7 13 12 11 10
Untagged Q(0): 8
Untagged Q(1): 14
sg[0] - Addr 0x3c381000 : Length 4096
sg[1] - Addr 0x35ce2000 : Length 2048
(da0:ahc0:0:0:0): SCB 8: Immediate reset.  Flags = 0x6040
(da0:ahc0:0:0:0): no longer in timeout, status = 34b
ahc0: Issued Channel A Bus Reset. 2 SCBs aborted

(da0:ahc0:0:0:0): SCB 0x9 - timed out while idle, SEQADDR == 0x3e
STACK == 0x1, 0x1, 0x1, 0x1
SXFRCTL0 == 0x80
SCB count = 20
QINFIFO entries: 9 14
Waiting Queue entries:
Disconnected Queue entrties:
QOUTFIFO entries:
Sequencer Free SCB List: 1 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Pending list: 14 9
Kernel Free SCB list: 15 16 17 18 18 0 1 2 3 4 5 6 7 13 12 11 10
Untagged Q(0): 9
Untagged Q(1): 14
sg[0] - Addr 0x3c381000 : Length 4096
sg[1] - Addr 0x35ce2000 : Length 2048
(da0:ahc0:0:0:0): SCB 8: Immediate reset.  Flags = 0x6040
(da0:ahc0:0:0:0): no longer in timeout, status = 34b
ahc0: Issued Channel A Bus Reset. 2 SCBs aborted
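
	Since the timeouts are sporadic, it may help to tally them per
	device from saved console or syslog output.  The following is
	only a rough sketch: the function name summarize and the
	regular expressions are assumptions made for illustration,
	written against the message format quoted in this report, and
	the sample text is copied from the dumps above.

```python
import re
from collections import Counter

# Matches lines such as:
#   (da1:ahc0:0:1:0): SCB 0x1d - timed out while idle, SEQADDR == 0x5
TIMEOUT_RE = re.compile(
    r"\((?P<dev>\w+):(?P<ctrl>\w+):\d+:\d+:\d+\): "
    r"SCB (?P<scb>0x[0-9a-fA-F]+) - timed out"
)

# Matches lines such as:
#   ahc0: Issued Channel A Bus Reset. 40 SCBs aborted
RESET_RE = re.compile(
    r"(?P<ctrl>\w+): Issued Channel (?P<chan>\w) Bus Reset\. "
    r"(?P<count>\d+) SCBs aborted"
)


def summarize(log_text):
    """Return (timeouts per device, aborted SCBs per controller channel)."""
    timeouts = Counter()
    aborted = Counter()
    for line in log_text.splitlines():
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group("dev")] += 1
            continue
        m = RESET_RE.search(line)
        if m:
            key = (m.group("ctrl"), m.group("chan"))
            aborted[key] += int(m.group("count"))
    return timeouts, aborted


# Lines copied from the three dumps in this report.
SAMPLE = """\
(da1:ahc0:0:1:0): SCB 0x1d - timed out while idle, SEQADDR == 0x5
ahc0: Issued Channel A Bus Reset. 40 SCBs aborted
(da0:ahc0:0:0:0): SCB 0x8 - timed out while idle, SEQADDR == 0x3e
ahc0: Issued Channel A Bus Reset. 2 SCBs aborted
(da0:ahc0:0:0:0): SCB 0x9 - timed out while idle, SEQADDR == 0x3e
ahc0: Issued Channel A Bus Reset. 2 SCBs aborted
"""

if __name__ == "__main__":
    timeouts, aborted = summarize(SAMPLE)
    for dev, n in sorted(timeouts.items()):
        print(dev, "timeouts:", n)
    for (ctrl, chan), n in sorted(aborted.items()):
        print(ctrl, "channel", chan, "aborted SCBs:", n)
```

	Run over a longer log, a tally like this would show whether the
	timeouts follow one disk or the whole channel.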

	I'm somewhat new to PC-type hardware, so this may be nothing,
	but are the two channels on the ahc's _supposed_ to have the
	same IRQ?  I couldn't find a way to alter either ahc's IRQ
	from either the system or SCSI BIOS, so I'm assuming they're
	set up correctly.  Given that there was no activity on the
	other bus (nothing in either the CD writer or the Jaz drive) at
	the time of the problems, I don't believe it's likely to be
	simply an IRQ issue.
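
	To see at a glance which devices report the same IRQ in the
	boot messages, a small script can group probe lines by their
	"irq N" field.  This is a minimal sketch, not a diagnostic
	tool: the function names devices_by_irq and shared_irqs are
	hypothetical, and the pattern assumes the probe-line layout
	shown in the Environment section.  The sample lines are copied
	from that section.

```python
import re
from collections import defaultdict

# A probe line starts with a device name (letters plus unit number) and
# carries an "irq N" field somewhere after it, for example:
#   ata0: at 0x1f0 irq 14 on atapci0
IRQ_RE = re.compile(r"^\s*(?P<dev>[a-z]+\d+)\b.*?\birq (?P<irq>\d+)\b")


def devices_by_irq(dmesg_text):
    """Map each IRQ number to the devices whose probe line reports it."""
    groups = defaultdict(list)
    for line in dmesg_text.splitlines():
        # Skip complaints such as "uhci0: Invalid irq 255", which do not
        # describe an assigned interrupt.
        if "Invalid irq" in line:
            continue
        m = IRQ_RE.match(line)
        if m:
            groups[int(m.group("irq"))].append(m.group("dev"))
    return dict(groups)


def shared_irqs(dmesg_text):
    """Return only the IRQs reported by more than one device."""
    return {
        irq: devs
        for irq, devs in devices_by_irq(dmesg_text).items()
        if len(devs) > 1
    }


# Lines copied from the boot messages in this report.
SAMPLE_DMESG = """\
uhci0: Invalid irq 255
ahc0: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0xe400-0xe4ff mem 0xfebfe000-0xfebfefff irq 16 at device 11.0 on pci0
ahc1: <Adaptec aic7896/97 Ultra2 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 16 at device 11.1 on pci0
pcm0: <AudioPCI ES1371> port 0xef00-0xef3f irq 18 at device 12.0 on pci0
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xee80-0xeebf mem 0xfea00000-0xfeafffff,0xfebfd000-0xfebfdfff irq 19 at device 13.0 on pci0
fxp1: <Intel Pro 10/100B/100+ Ethernet> port 0xed80-0xedbf mem 0xfe800000-0xfe8fffff,0xfebfc000-0xfebfcfff irq 17 at device 17.0 on pci0
"""

if __name__ == "__main__":
    for irq, devs in sorted(shared_irqs(SAMPLE_DMESG).items()):
        print("irq", irq, "shared by", ", ".join(devs))
```

	In the sample, ahc0 and ahc1 are the two functions (11.0 and
	11.1) of the same PCI device.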

>How-To-Repeat:

	The problems seem to occur most frequently when there's heavy
	disk activity, but I can't seem to reproduce it on demand.


>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
