Date:      Sun, 4 Oct 1998 18:01:45 +0200 (SAT)
From:      John Hay <jhay@mikom.csir.co.za>
To:        freebsd-scsi@FreeBSD.ORG
Subject:   cam panic... probably tag related
Message-ID:  <199810041601.SAA02050@zibbi.mikom.csir.co.za>

The system is a dual 266MHz PII on an Asus motherboard with an onboard
Adaptec 7880, a Seagate ST34572N as drive 0 and a Conner CFP4207S as
drive 2. Everything is on the Seagate except /usr/obj, which is symlinked
to the Conner. It is running a very up-to-date -current and using
softupdates on all partitions.

The machine sometimes panics during a "make world", especially with a
high -j value, but it panics in such a way that it does not leave a dump.

What is strange to me is that the cam code seems to catch the problem
and tries to recover from it, but the machine still panics. Should cam
be able to recover from it? The panic is in acquire_lock() in the
softupdates code (I have added a piece of "nm -aout kernel" at the end of
this email), but I don't really think it is to blame for the panic. I
have built a quirk entry for the Conner drive to limit the tags to a
maximum of 24 and have now successfully done more than 20
"make world -j24"s without a panic, where previously it would panic
within about 3. I have added a diff for the quirk entry at the end. For
the minimum number of tags I just took a random number smaller than the
max. :-) I'm not sure what it should be.

Here is the output on the serial console preceding the panic, and also
the probe info during the reboot afterwards, with unrelated things
removed here and there, just in case it is useful to someone. The first
3 lines come fairly quickly after I start a make world. I understand
them and just left them in for completeness.

---------------------------------------------------------------------------
(da1:ahc0:0:2:0): tagged openings now 32
(da1:ahc0:0:2:0): tagged openings now 31
(da0:ahc0:0:0:0): tagged openings now 63

... Long delay depending on how long it takes to get it to panic ...

(da1:ahc0:0:2:0): SCB 0x2d - timed out while idle, LASTPHASE == 0x1, SCSISIGI == 0x0
SEQADDR == 0x9
SSTAT1 == 0xa
(da1:ahc0:0:2:0): Queuing a BDR SCB
(da1:ahc0:0:2:0): Bus Device Reset Message Sent
(da1:ahc0:0:2:0): no longer in timeout, status = 34b
ahc0: Bus Device Reset on A:2. 31 SCBs aborted
(da1:ahc0:0:2:0): tagged openings now 32


Fatal trap 12: page fault while in kernel mode
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
fault virtual address	= 0x30
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xf0192468
stack pointer	        = 0x10:0xff804ca8
frame pointer	        = 0x10:0xff804cac
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		= bio  <- SMP: XXX
trap number		= 12
panic: page fault
mp_lock = 01000002; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

syncing disks... 

Fatal trap 12: page fault while in kernel mode
mp_lock = 01000003; cpuid = 1; lapic.id = 00000000
fault virtual address	= 0x30
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xf0192468
stack pointer	        = 0x10:0xff804a8c
frame pointer	        = 0x10:0xff804a90
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		= bio  <- SMP: XXX
trap number		= 12
panic: page fault
mp_lock = 01000003; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

dumping to dev 20401, offset 557056
dump 

Fatal trap 12: page fault while in kernel mode
mp_lock = 01000004; cpuid = 1; lapic.id = 00000000
fault virtual address	= 0x30
fault code		= supervisor read, page not present
instruction pointer	= 0x8:0xf0192468
stack pointer	        = 0x10:0xff804594
frame pointer	        = 0x10:0xff804598
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= Idle
interrupt mask		= net tty bio cam  <- SMP: XXX
trap number		= 12
panic: page fault
mp_lock = 01000004; cpuid = 1; lapic.id = 00000000
boot() called on cpu#1

dumping to dev 20401, offset 557056
dump device not ready
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
cpu_reset called on cpu#1
cpu_reset: Stopping other CPUs
cpu_reset: Restarting BSP
cpu_reset_proxy: Grabbed mp lock for BSP
cpu_reset_proxy: Stopped CPU 1
...
>> FreeBSD BOOT @ 0x10000: 640/65472 k of memory, serial console
Boot default: 0:sd(0,a)kernel
...
total=0x23a0d4 entry point=0x100000
Copyright (c) 1992-1998 FreeBSD Inc.
Copyright (c) 1982, 1986, 1989, 1991, 1993
	The Regents of the University of California. All rights reserved.
FreeBSD 3.0-BETA #9: Fri Oct  2 16:16:08 SAST 1998
    jhay@beast.mikom.csir.co.za:/usr/src/sys/compile/BEAST
Timecounter "i8254"  frequency 1193561 Hz  cost 3246 ns
CPU: Pentium II (686-class CPU)
  Origin = "GenuineIntel"  Id = 0x633  Stepping=3
  Features=0x80fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,MMX>
real memory  = 134217728 (131072K bytes)
avail memory = 128409600 (125400K bytes)
Programming 24 pins in IOAPIC #0
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  1, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
Probing for devices on PCI bus 0:
chip0: <Host to PCI bridge (vendor=8086 device=7180)> rev 0x03 on pci0.0.0
chip1: <PCI to PCI bridge (vendor=8086 device=7181)> rev 0x03 on pci0.1.0
chip2: <Intel 82371AB PCI to ISA bridge> rev 0x01 on pci0.4.0
chip3: <Intel 82371AB USB host controller> rev 0x01 int d irq 9 on pci0.4.2
chip4: <Intel 82371AB Power management controller> rev 0x01 on pci0.4.3
ahc0: <Adaptec aic7880 Ultra SCSI adapter> rev 0x00 int a irq 19 on pci0.6.0
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
fxp0: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x04 int a irq 18 on pci0.10.0
fxp0: Ethernet address 00:a0:c9:8d:7c:5f
fxp1: <Intel EtherExpress Pro 10/100B Ethernet> rev 0x04 int a irq 17 on pci0.11.0
fxp1: Ethernet address 00:a0:c9:8d:74:dd
vga0: <S3 ViRGE DX/GX graphics accelerator> rev 0x01 int a irq 16 on pci0.12.0
Probing for devices on PCI bus 1:
Probing for devices on the ISA bus:
sc0 at 0x60-0x6f irq 1 on motherboard
sc0: VGA color <16 virtual consoles, flags=0x0>
sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
sio0: type 16550A, console
sio1 at 0x2f8-0x2ff irq 3 on isa
sio1: type 16550A
lpt0 at 0x378-0x37f irq 7 on isa
lpt0: Interrupt-driven port
lp0: TCP/IP capable interface
fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1.44MB 3.5in
npx0 on motherboard
npx0: INT 16 interface
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via pin 2
SMP: AP CPU #1 Launched!
sa0 at ahc0 bus 0 target 5 lun 0
sa0: <HP HP35470A 7 09> Removable Sequential Access SCSI2 device 
sa0: 5.0MB/s transfers (5.0MHz, offset 8)
da1 at ahc0 bus 0 target 2 lun 0
da1: <CONNER CFP4207S  4.28GB 1420> Fixed Direct Access SCSI2 device 
da1: 10.0MB/s transfers (10.0MHz, offset 15), Tagged Queueing Enabled
da1: 4096MB (8388608 512 byte sectors: 255H 63S/T 522C)
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST34572N 0784> Fixed Direct Access SCSI2 device 
da0: 20.0MB/s transfers (20.0MHz, offset 15), Tagged Queueing Enabled
da0: 4340MB (8888924 512 byte sectors: 255H 63S/T 553C)
changing root device to da0s1a
...
(da0:ahc0:0:0:0): tagged openings now 64
(da0:ahc0:0:0:0): tagged openings now 63

---------------------------------------------------------------------------

Here is the "nm -aout kernel" part around the panic.

---------------------------------------------------------------------------
f0192370 F ffs_softdep.o
f0192370 F ffs_softdep_stub.o
f0192424 t _acquire_lock
f0192498 t _free_lock
f019251c t _acquire_lock_interlocked
f0192590 t _free_lock_interlocked
---------------------------------------------------------------------------
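
The way the faulting instruction pointer (0xf0192468 in the trap output
above) is mapped to acquire_lock() can be sketched in a few lines of C.
This is only an illustration: the struct and function names (sym,
sym_lookup, ffs_softdep_syms) are made up for the example, and only the
addresses and symbol names come from the nm listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of symbol-table output: start address and name. */
struct sym {
	uint32_t addr;
	const char *name;
};

/* Text symbols copied from the nm listing, sorted by ascending address. */
static const struct sym ffs_softdep_syms[] = {
	{ 0xf0192424u, "_acquire_lock" },
	{ 0xf0192498u, "_free_lock" },
	{ 0xf019251cu, "_acquire_lock_interlocked" },
	{ 0xf0192590u, "_free_lock_interlocked" },
};

/*
 * Return the last symbol whose start address is <= pc, or NULL if pc
 * lies below the first symbol. The table must be sorted by address.
 */
static const struct sym *
sym_lookup(const struct sym *tab, size_t n, uint32_t pc)
{
	const struct sym *best = NULL;
	size_t i;

	assert(tab != NULL);
	for (i = 0; i < n && tab[i].addr <= pc; i++)
		best = &tab[i];
	return best;
}
```

Calling sym_lookup(ffs_softdep_syms, 4, 0xf0192468) selects the
_acquire_lock entry, which puts the fault 0x44 bytes into that function
(0xf0192468 - 0xf0192424) and below the start of _free_lock.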

---------------------------------------------------------------------------
Index: sys/cam/cam_xpt.c
===================================================================
RCS file: /home/ncvs/src/sys/cam/cam_xpt.c,v
retrieving revision 1.15
diff -u -r1.15 cam_xpt.c
--- cam_xpt.c	1998/10/02 21:00:50	1.15
+++ cam_xpt.c	1998/10/04 06:47:44
@@ -233,12 +233,18 @@
 #endif
 };
 
+static const char conner[] = "CONNER";
 static const char quantum[] = "QUANTUM";
 static const char sony[] = "SONY";
 static const char west_digital[] = "WDIGTL";
 
 static struct xpt_quirk_entry xpt_quirk_table[] = 
 {
+	{
+		/* Sometimes gets stuck */
+		{ T_DIRECT, SIP_MEDIA_FIXED, conner, "CFP4207S*", "*" },
+		/*quirks*/0, /*mintags*/8, /*maxtags*/24
+	},
 	{
 		/* Reports QUEUE FULL for temporary resource shortages */
 		{ T_DIRECT, SIP_MEDIA_FIXED, quantum, "XP39100*", "*" },
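
As a rough user-space illustration of what the new quirk entry is meant
to do, the sketch below matches inquiry strings against wildcard
patterns and clamps a tag count into the mintags/maxtags range. It is
not the kernel code: struct quirk, quirk_find() and quirk_clamp_tags()
are hypothetical names, and POSIX fnmatch(3) is used here as a stand-in
for whatever pattern matcher cam_xpt.c actually uses.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical, simplified record; not the kernel's xpt_quirk_entry. */
struct quirk {
	const char *vendor;	/* wildcard pattern for the vendor string */
	const char *product;	/* wildcard pattern for the product string */
	const char *revision;	/* wildcard pattern for the revision string */
	int mintags;
	int maxtags;
};

/* The values mirror the entry added in the diff. */
static const struct quirk quirks[] = {
	{ "CONNER", "CFP4207S*", "*", 8, 24 },
};

/* Return the first quirk whose three patterns all match, or NULL. */
static const struct quirk *
quirk_find(const char *vendor, const char *product, const char *revision)
{
	size_t i;

	assert(vendor != NULL && product != NULL && revision != NULL);
	for (i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		const struct quirk *q = &quirks[i];

		if (fnmatch(q->vendor, vendor, 0) == 0 &&
		    fnmatch(q->product, product, 0) == 0 &&
		    fnmatch(q->revision, revision, 0) == 0)
			return q;
	}
	return NULL;
}

/* Clamp a requested tag count into the quirk's [mintags, maxtags] range. */
static int
quirk_clamp_tags(const struct quirk *q, int wanted)
{
	if (wanted < q->mintags)
		return q->mintags;
	if (wanted > q->maxtags)
		return q->maxtags;
	return wanted;
}
```

With the probe strings shown earlier, quirk_find("CONNER",
"CFP4207S  4.28GB", "1420") finds the entry, and clamping the 32
openings the drive was running with gives 24; the Seagate strings match
no entry in this table.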

John
-- 
John Hay -- John.Hay@mikom.csir.co.za
