Date: Fri, 18 Mar 2005 11:26:32 +0100
From: peter@bgnett.no (Peter N. M. Hansteen)
To: freebsd-questions@freebsd.org
Cc: peter@datadok.no
Subject: sym driver broken in 5.3?
Message-ID: <86ekedntbb.fsf@amidala.datadok.no>

Is anybody else having trouble with the sym SCSI driver on 5.3-stable
systems?

I have a machine here where a tar to SCSI tape
(tar cf /dev/nsa0 /home/data) will pretty reliably crash the machine.
This being our file server, it's a tad inconvenient.

I was suspecting that the tape drive was bad, but today's crash gave me
some new data - the console was full of repeated

camq_init: - cannot malloc array!

followed by the uptime figures. According to grep -c, the dmesg output
immediately after reboot had 676 of them before the expected boot time
messages:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-SECURITY #0: Fri Jan 7 04:09:28 UTC 2005
root@builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 Processor 3000+ (2000.09-MHz 686-class CPU)
Origin = "AuthenticAMD" Id = 0xfc0 Stepping = 0
Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
AMD Features=0xe0500000<NX,AMIE,LM,DSP,3DNow!>
real memory = 1006567424 (959 MB)
avail memory = 975384576 (930 MB)
ACPI APIC Table: <AMIINT VIA_K8 >
ioapic0 <Version 0.3> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <AMIINT VIA_K8> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA 8380 host to PCI bridge> mem 0xd0000000-0xd7ffffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
sym0: <895> port 0xe800-0xe8ff mem 0xcfffe000-0xcfffefff,0xcfffff00-0xcfffffff irq 16 at device 8.0 on pci0
sym0: Tekram NVRAM, ID 7, Fast-40, LVD, parity checking
sym0: [GIANT-LOCKED]
xl0: <3Com 3c905C-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xcffffe80-0xcffffeff irq 19 at device 11.0 on pci0
miibus0: <MII bus> on xl0
xlphy0: <3c905C 10/100 internal PHY> on miibus0
xlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl0: Ethernet address: 00:01:02:df:39:9a
atapci0: <VIA 6420 SATA150 controller> port 0xd000-0xd0ff,0xd400-0xd40f,0xd800-0xd803,0xdc00-0xdc07,0xe000-0xe003,0xe400-0xe407 irq 20 at device 15.0 on pci0
ata2: channel #0 on atapci0
ata3: channel #1 on atapci0
atapci1: <VIA 8237 UDMA133 controller> port 0xfc00-0xfc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 15.1 on pci0
ata0: channel #0 on atapci1
ata1: channel #1 on atapci1
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
fdc0: <floppy drive controller> port 0x3f7,0x3f4-0x3f5,0x3f2-0x3f3 irq 6 drq 2 on acpi0
fdc0: [FAST]
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: <ECP parallel printer port> port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
orm0: <ISA Option ROMs> at iomem 0xe0000-0xe0fff,0xcd800-0xcf7ff,0xc8800-0xc8fff on isa0
pmtimer0 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2000087768 Hz quality 800
Timecounters tick every 10.000 msec
acpi_cpu: throttling enabled, 16 steps (100% to 6.2%), currently 100.0%
acd0: CDROM <CD-950E/TKU/A4E> at ata0-master UDMA33
ad4: 38204MB <SAMSUNG SP0411C/UU100-05> [77622/16/63] at ata2-master SATA150
Waiting 15 seconds for SCSI devices to settle
sa0 at sym0 bus 0 target 6 lun 0
sa0: <SEAGATE DAT DAT72-000 A060> Removable Sequential Access SCSI-3 device
sa0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit)
da0 at sym0 bus 0 target 2 lun 0
da0: <SEAGATE ST336753LW 0006> Fixed Direct Access SCSI-3 device
da0: 80.000MB/s transfers (40.000MHz, offset 31, 16bit), Tagged Queueing Enabled
da0: 35003MB (71687372 512 byte sectors: 255H 63S/T 4462C)
Mounting root from ufs:/dev/ad4s1a
WARNING: / was not properly dismounted
WARNING: /home was not properly dismounted
/home: mount pending error: blocks 1092 files 2
WARNING: /home/data/merplass was not properly dismounted
xl0: transmission error: 90
xl0: tx underrun, increasing tx start threshold to 120 bytes
xl0: transmission error: 90
xl0: tx underrun, increasing tx start threshold to 180 bytes
xl0: transmission error: 90
xl0: tx underrun, increasing tx start threshold to 240 bytes
xl0: transmission error: 90
xl0: tx underrun, increasing tx start threshold to 300 bytes

I've been debugging this on and off for a while now. Tar to tape worked
on the first couple of attempts, and as far as I can tell from mt output,
compression is enabled in the drive (meaning there should be space for
the data). However, "excessive write errors" messages have been turning
up in syslog, as in

Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): WRITE FILEMARKS. CDB: 10 0 0 0 2 0
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): CAM Status: SCSI Status Error
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): SCSI Status: Check Condition
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): MEDIUM ERROR asc:3,2
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): Excessive write errors
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): Retries Exhausted
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): failed to write terminating filemark(s)
Mar 18 02:41:49 filehut kernel: (sa0:sym0:0:6:0): tape is now frozen- use an OFFLINE, REWIND or MTEOM command to clear this state.

I was beginning to think I'd need to replace the tape drive, but the
camq_init message made me think this could be a driver problem (the
driver is afaik not supported in FreeBSD/amd64 at all, for example).

The question is, what's the next reasonable debugging step here?

(and I know you're dying to ask - we do rsync to an off-site location
twice a day)

- P

--
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://www.blug.linux.no/rfc1149/ http://www.datadok.no/ http://www.nuug.no/
"First, we kill all the spammers" The Usenet Bard, "Twice-forwarded tales"