Date: Tue, 14 Nov 2006 13:18:47 -0800
From: Mark Dotson <mark@dmglobal.net>
To: Atanas <atanas@asd.aplus.net>
Cc: freebsd-stable@freebsd.org
Subject: Re: twa: Passthru request timed out! Resetting controller...
Message-ID: <455A32B7.9080304@dmglobal.net>
In-Reply-To: <455A1DEA.20304@asd.aplus.net>
References: <455A1DEA.20304@asd.aplus.net>

I've had continued problems with the 3ware series SATA cards and the Tyan boards. Specifically, I have a "Tyan S5360-1U" and both a 9500S-4LP and an 8506-series 3ware card. In my case the first error is different, but the 'resetting' over and over is VERY familiar. It could be triggered by a simple file copy from one part of a container to another, which degraded the unit and set off the resetting crap. Note that the drives are fine; I tested that first thing.

Sep 8 11:59:23 localhost kernel: 3w-9xxx: scsi0: WARNING: (0x06:0x002C): Unit #1: Command (0x2a) timed out, resetting card.
Sep 8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E): Cache synchronized after power fail:unit=0.
Sep 8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E): Cache synchronized after power fail:unit=1.

I also found this problem to exist across platforms, not just FreeBSD. For example, the excerpt above is from a CentOS box. All tests were done with the newest firmware for both card and mobo, and with the newest drivers provided by 3ware.

Once I removed the card and drives from the Tyan system and stuck them in pretty much ANY other system, they worked fantastically.

I don't have an answer for the "resetting problem" as of yet... 3ware and Tyan (and my system vendor, "Appro") are still trying to find my specific problem and solve it. I believe they are currently doing the "replace everything" method of troubleshooting.

-Mark

Atanas wrote:
> Has anyone experiencing this:
>
> twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request =
> 0xca839d20
> twa0: INFO: (0x16: 0x1108): Resetting controller...:
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
> ...
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7
> twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
> twa0: INFO: (0x16: 0x1107): Controller reset done!:
>
> This happens on 6.2-PRERELEASE i386 (and on 6.1 since its release) on a
> number of machines with the following hardware configuration:
>
> - Tyan K8SE 2892, 2 AMD Opteron 270 CPUs, 4GB RAM
> - 3ware 9550SX-8LP, 8 500GB Seagate ST3500641AS SATA drives
>   (configured as 8 SINGLE DISK units, aka JBOD)
>
> All hardware components, including the server chassis, are listed in the
> 3ware hardware compatibility lists. It doesn't seem to be a cabling or
> power issue. The controller and hard drives are already flashed to the
> latest firmware revisions. I tried turning off NCQ, but it didn't make
> any difference. I tried also switching the kernel from PAE to non-PAE
> (reducing the usable memory to 3GB), but it didn't help either.
>
> I have another machines with similar I/O configurations (3ware), but
> with Intel motherboards and running FreeBSD-5.5, and these run fine for
> about a year already. Now I'm thinking about swapping the drives between
> a working Intel and AMD based box, to see where controller timeouts will
> follow.
>
> The problem happens sporadically once in a month or so and is very hard
> to reproduce. Sometimes it takes several weeks until the next crash
> happens, sometimes it crashes again in just a few hours.
>
> When the thing happens, the kernel sometimes panics (most likely due to
> the inconsistent filesystem state caused by the controller reset),
> sometimes just hangs. It can be interrupted (I have a serial console),
> but the only usable thing after that seems to be "call cpu_reset()",
> followed by full (and sometimes painfully long) filesystem check.
>
> Here are the diffs against the default GENERIC and PAE kernel
> configurations:
>
> < cpu I486_CPU
> < ident GENERIC
> < options INET6 # IPv6 communications protocols
> < options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>
> > options QUOTA
> > options SMP # Symmetric MultiProcessor Kernel
> > options BREAK_TO_DEBUGGER
> > options DDB
> > options KDB
> > options KDB_UNATTENDED
>
> > options IPFIREWALL
> > options DUMMYNET
>
> I'm attaching the dmesg.boot following the latest crash.
>
> Regards,
> Atanas
>
>
> ------------------------------------------------------------------------
>
> Copyright (c) 1992-2006 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>   The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 6.2-PRERELEASE #0: Mon Nov 13 17:47:40 PST 2006
>   root@xyz:/var/obj/usr/src/sys/XYZ-PAE
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Dual Core AMD Opteron(tm) Processor 270 (2009.27-MHz 686-class CPU)
>   Origin = "AuthenticAMD" Id = 0x20f12 Stepping = 2
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x1<SSE3>
>   AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
>   AMD Features2=0x3<LAHF,CMP>
>   Cores per package: 2
> real memory = 5368709120 (5120 MB)
> avail memory = 4182241280 (3988 MB)
> ACPI APIC Table: <PTLTD APIC >
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>   cpu0 (BSP): APIC ID: 0
>   cpu1 (AP): APIC ID: 1
>   cpu2 (AP): APIC ID: 2
>   cpu3 (AP): APIC ID: 3
> ioapic0 <Version 1.1> irqs 0-23 on motherboard
> ioapic1 <Version 1.1> irqs 24-27 on motherboard
> ioapic2 <Version 1.1> irqs 28-31 on motherboard
> kbd1 at kbdmux0
> acpi0: <PTLTD RSDT> on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> acpi_button0: <Power Button> on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> pci0: <memory> at device 0.0 (no driver attached)
> isab0: <PCI-ISA bridge> at device 1.0 on pci0
> isa0: <ISA bus> on isab0
> pci0: <serial bus, SMBus> at device 1.1 (no driver attached)
> pci0: <serial bus, USB> at device 2.0 (no driver attached)
> atapci0: <nVidia nForce CK804 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1400-0x140f at device 6.0 on pci0
> ata0: <ATA channel 0> on atapci0
> ata1: <ATA channel 1> on atapci0
> pcib1: <ACPI PCI-PCI bridge> at device 9.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pci1: <display, VGA> at device 6.0 (no driver attached)
> fxp0: <Intel 82551 Pro/100 Ethernet> port 0x2400-0x243f mem 0xda101000-0xda101fff,0xda120000-0xda13ffff irq 16 at device 8.0 on pci1
> miibus0: <MII bus> on fxp0
> inphy0: <i82555 10/100 media interface> on miibus0
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> fxp0: Ethernet address: 00:e0:81:33:b5:f1
> pcib2: <ACPI PCI-PCI bridge> at device 13.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> pcib3: <ACPI PCI-PCI bridge> at device 14.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pcib4: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci24: <ACPI PCI bus> on pcib4
> pcib5: <ACPI PCI-PCI bridge> at device 10.0 on pci24
> pci25: <ACPI PCI bus> on pcib5
> 3ware device driver for 9000 series storage controllers, version: 3.60.02.012
> twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem 0xde000000-0xdfffffff,0xdc300000-0xdc300fff irq 27 at device 3.0 on pci25
> twa0: [FAST]
> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, 8 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
> pci24: <base peripheral, interrupt controller> at device 10.1 (no driver attached)
> pcib6: <ACPI PCI-PCI bridge> at device 11.0 on pci24
> pci26: <ACPI PCI bus> on pcib6
> bge0: <Broadcom BCM5704 A3, ASIC rev. 0x2003> mem 0xdc410000-0xdc41ffff,0xdc400000-0xdc40ffff irq 28 at device 9.0 on pci26
> miibus1: <MII bus> on bge0
> brgphy0: <BCM5704 10/100/1000baseTX PHY> on miibus1
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:e0:81:33:b6:f4
> bge1: <Broadcom BCM5704 A3, ASIC rev. 0x2003> mem 0xdc430000-0xdc43ffff,0xdc420000-0xdc42ffff irq 29 at device 9.1 on pci26
> miibus2: <MII bus> on bge1
> brgphy1: <BCM5704 10/100/1000baseTX PHY> on miibus2
> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:e0:81:33:b6:f5
> pci24: <base peripheral, interrupt controller> at device 11.1 (no driver attached)
> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
> atkbd0: <AT Keyboard> irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A, console
> fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
> fdc0: [FAST]
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> pmtimer0 on isa0
> orm0: <ISA Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
> ppc0: parallel port not found.
> sc0: <System console> at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> Timecounters tick every 1.000 msec
> ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging disabled
> da0 at twa0 bus 0 target 0 lun 0
> da0: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da0: 100.000MB/s transfers
> da0: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da1 at twa0 bus 0 target 1 lun 0
> da1: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da1: 100.000MB/s transfers
> da1: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da2 at twa0 bus 0 target 2 lun 0
> da2: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da2: 100.000MB/s transfers
> da2: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da3 at twa0 bus 0 target 3 lun 0
> da3: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da3: 100.000MB/s transfers
> da3: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da4 at twa0 bus 0 target 4 lun 0
> da4: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da4: 100.000MB/s transfers
> da4: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da5 at twa0 bus 0 target 5 lun 0
> da5: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da5: 100.000MB/s transfers
> da5: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da6 at twa0 bus 0 target 6 lun 0
> da6: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da6: 100.000MB/s transfers
> da6: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da7 at twa0 bus 0 target 7 lun 0
> da7: <AMCC 9550SX-8LP DISK 3.04> Fixed Direct Access SCSI-3 device
> da7: 100.000MB/s transfers
> da7: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!
> Trying to mount root from ufs:/dev/da0s1a
> WARNING: / was not properly dismounted
> /: mount pending error: blocks 208 files 5
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
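
Since the resets reportedly show up only every few weeks, it may help to tally the controller messages from saved logs. What follows is only a minimal sketch, not anything taken from 3ware or the FreeBSD driver: the pattern is inferred solely from the twa0 lines quoted above, and the names TWA_LINE, parse_twa_line, count_events and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred only from the twa0 lines quoted in this thread:
#   twaN: SEVERITY: (0xSOURCE: 0xCODE): message text
TWA_LINE = re.compile(
    r"(?P<dev>twa\d+): (?P<severity>[A-Z]+): "
    r"\((?P<source>0x[0-9A-Fa-f]+): ?(?P<code>0x[0-9A-Fa-f]+)\): "
    r"(?P<text>.*)"
)


def parse_twa_line(line):
    """Return the fields of one twa message as a dict, or None if no match."""
    match = TWA_LINE.search(line)
    return match.groupdict() if match else None


def count_events(lines):
    """Tally messages by (severity, code), skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        event = parse_twa_line(line)
        if event is not None:
            counts[(event["severity"], event["code"])] += 1
    return counts


# Sample lines copied from the quoted report (wrapped first line rejoined).
SAMPLE = [
    "twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20",
    "twa0: INFO: (0x16: 0x1108): Resetting controller...:",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7",
    "twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1",
    "twa0: INFO: (0x16: 0x1107): Controller reset done!:",
    "twa0: [FAST]",  # dmesg attach line, not an event; it is skipped
]

if __name__ == "__main__":
    for (severity, code), n in sorted(count_events(SAMPLE).items()):
        print(severity, code, n)
```

To use it on real data, one could feed count_events the lines of a saved messages file instead of SAMPLE. The Linux 3w-9xxx lines shown earlier use a different layout and would need their own pattern.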