From owner-freebsd-stable@FreeBSD.ORG Tue Nov 14 20:52:43 2006
Return-Path:
X-Original-To: freebsd-stable@freebsd.org
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
 by hub.freebsd.org (Postfix) with ESMTP id 63CFE16A560
 for ; Tue, 14 Nov 2006 20:52:43 +0000 (UTC)
 (envelope-from atanas@asd.aplus.net)
Received: from pro20.abac.com (pro20.abac.com [66.226.64.21])
 by mx1.FreeBSD.org (Postfix) with ESMTP id 42DE743E6D
 for ; Tue, 14 Nov 2006 20:47:38 +0000 (GMT)
 (envelope-from atanas@asd.aplus.net)
Received: from [216.55.129.232] ([216.55.129.232]) (authenticated bits=0)
 by pro20.abac.com (8.13.8/8.13.8) with ESMTP id kAEKlZGR086891
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
 Tue, 14 Nov 2006 12:47:35 -0800 (PST)
 (envelope-from atanas@asd.aplus.net)
Message-ID: <455A2B6E.20705@asd.aplus.net>
Date: Tue, 14 Nov 2006 12:47:42 -0800
From: Atanas
User-Agent: Thunderbird 1.5.0.8 (Macintosh/20061025)
MIME-Version: 1.0
To: adam radford
References: <455A1DEA.20304@asd.aplus.net>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Score: 1.47 (SPF_SOFTFAIL)
Cc: freebsd-stable@freebsd.org
Subject: Re: twa: Passthru request timed out! Resetting controller...
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 14 Nov 2006 20:52:43 -0000

adam radford said the following on 11/14/06 11:56 AM:
>
> Are you running the latest 3ware firmware on that controller?
>

Yep. It's in dmesg.boot:

twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, 8
ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002

That's the latest one, released as 9.3.0.7 on the 3ware website.
Yesterday I flashed and rebooted them all, and this morning I got the
next crash.
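
For anyone who wants to compare firmware levels across several boxes, here is a minimal sketch that pulls the model, firmware and BIOS strings out of the "Controller details" line. The regular expression and the function name parse_controller_details are my own assumptions, inferred only from the single line quoted in this message; other driver or firmware versions may format the line differently. In the real dmesg.boot the message is one line; it only wraps in this mail.

```python
import re

# Line copied from the dmesg.boot output quoted in this message
# (joined back into a single line, as the kernel prints it).
LINE = ("twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, "
        "8 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002")

# Hypothetical pattern, inferred from the one sample line above.
DETAILS_RE = re.compile(
    r"Controller details:: Model (?P<model>[^,\s]+), (?P<ports>\d+) ports, "
    r"Firmware (?P<firmware>[^,\s]+ [^,\s]+), BIOS (?P<bios>[^,\s]+ [^,\s]+)"
)


def parse_controller_details(line):
    """Return a dict with model, ports, firmware and BIOS, or None if no match."""
    match = DETAILS_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["ports"] = int(info["ports"])
    return info


if __name__ == "__main__":
    print(parse_controller_details(LINE))
```

Running it over the dmesg.boot of each machine would make it easy to spot a box that missed the firmware update.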

Regards,
Atanas

>
> On 11/14/06, Atanas wrote:
>> Has anyone experienced this:
>>
>> twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request =
>> 0xca839d20
>> twa0: INFO: (0x16: 0x1108): Resetting controller...:
>>
>> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
>>
>> ...
>> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7
>>
>> twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
>>
>> twa0: INFO: (0x16: 0x1107): Controller reset done!:
>>
>>
>> This happens on 6.2-PRERELEASE i386 (and on 6.1 since its release) on a
>> number of machines with the following hardware configuration:
>>
>> - Tyan K8SE 2892, 2 AMD Opteron 270 CPUs, 4GB RAM
>> - 3ware 9550SX-8LP, 8 500GB Seagate ST3500641AS SATA drives
>> (configured as 8 SINGLE DISK units, aka JBOD)
>>
>> All hardware components, including the server chassis, are listed in the
>> 3ware hardware compatibility lists. It doesn't seem to be a cabling or
>> power issue. The controller and hard drives are already flashed to the
>> latest firmware revisions. I tried turning off NCQ, but it didn't make
>> any difference. I also tried switching the kernel from PAE to non-PAE
>> (reducing the usable memory to 3GB), but it didn't help either.
>>
>> I have other machines with similar I/O configurations (3ware), but
>> with Intel motherboards and running FreeBSD-5.5, and these have run fine
>> for about a year already. Now I'm thinking about swapping the drives
>> between a working Intel and an AMD based box, to see where the controller
>> timeouts will follow.
>>
>> The problem happens sporadically, about once a month or so, and is very
>> hard to reproduce. Sometimes it takes several weeks until the next crash
>> happens, sometimes it crashes again in just a few hours.
>>
>> When the thing happens, the kernel sometimes panics (most likely due to
>> the inconsistent filesystem state caused by the controller reset),
>> sometimes just hangs.
>> It can be interrupted (I have a serial console),
>> but the only usable thing after that seems to be "call cpu_reset()",
>> followed by a full (and sometimes painfully long) filesystem check.
>>
>> Here are the diffs against the default GENERIC and PAE kernel
>> configurations:
>>
>> < cpu I486_CPU
>> < ident GENERIC
>> < options INET6 # IPv6 communications protocols
>> < options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>>
>> > options QUOTA
>> > options SMP # Symmetric MultiProcessor Kernel
>> > options BREAK_TO_DEBUGGER
>> > options DDB
>> > options KDB
>> > options KDB_UNATTENDED
>>
>> > options IPFIREWALL
>> > options DUMMYNET
>>
>> I'm attaching the dmesg.boot following the latest crash.
>>
>> Regards,
>> Atanas
>>
>>
>>
>> Copyright (c) 1992-2006 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>> The Regents of the University of California. All rights reserved.
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 6.2-PRERELEASE #0: Mon Nov 13 17:47:40 PST 2006
>> root@xyz:/var/obj/usr/src/sys/XYZ-PAE
>> Timecounter "i8254" frequency 1193182 Hz quality 0
>> CPU: Dual Core AMD Opteron(tm) Processor 270 (2009.27-MHz 686-class CPU)
>> Origin = "AuthenticAMD" Id = 0x20f12 Stepping = 2
>>
>> Features=0x178bfbff
>>
>> Features2=0x1
>> AMD Features=0xe2500800
>> AMD Features2=0x3
>> Cores per package: 2
>> real memory = 5368709120 (5120 MB)
>> avail memory = 4182241280 (3988 MB)
>> ACPI APIC Table:
>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> cpu0 (BSP): APIC ID: 0
>> cpu1 (AP): APIC ID: 1
>> cpu2 (AP): APIC ID: 2
>> cpu3 (AP): APIC ID: 3
>> ioapic0 irqs 0-23 on motherboard
>> ioapic1 irqs 24-27 on motherboard
>> ioapic2 irqs 28-31 on motherboard
>> kbd1 at kbdmux0
>> acpi0: on motherboard
>> acpi0: Power Button (fixed)
>> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
>> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
>> cpu0: on acpi0
>> cpu1: on acpi0
>> cpu2: on acpi0
>> cpu3: on acpi0
>> acpi_button0: on acpi0
>> pcib0: port 0xcf8-0xcff on acpi0
>> pci0: on pcib0
>> pci0: at device 0.0 (no driver attached)
>> isab0: at device 1.0 on pci0
>> isa0: on isab0
>> pci0: at device 1.1 (no driver attached)
>> pci0: at device 2.0 (no driver attached)
>> atapci0: port
>> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1400-0x140f at device 6.0 on pci0
>> ata0: on atapci0
>> ata1: on atapci0
>> pcib1: at device 9.0 on pci0
>> pci1: on pcib1
>> pci1: at device 6.0 (no driver attached)
>> fxp0: port 0x2400-0x243f mem
>> 0xda101000-0xda101fff,0xda120000-0xda13ffff irq 16 at device 8.0 on pci1
>> miibus0: on fxp0
>> inphy0: on miibus0
>> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>> fxp0: Ethernet address: 00:e0:81:33:b5:f1
>> pcib2: at device 13.0 on pci0
>> pci2: on pcib2
>> pcib3: at device 14.0 on pci0
>> pci3: on pcib3
>> pcib4: port 0xcf8-0xcff on acpi0
>> pci24: on pcib4
>> pcib5: at device 10.0 on pci24
>> pci25: on pcib5
>> 3ware device driver for 9000 series storage controllers, version:
>> 3.60.02.012
>> twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem
>> 0xde000000-0xdfffffff,0xdc300000-0xdc300fff irq 27 at device 3.0 on pci25
>> twa0: [FAST]
>> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, 8
>> ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
>> pci24: at device 10.1 (no
>> driver attached)
>> pcib6: at device 11.0 on pci24
>> pci26: on pcib6
>> bge0: mem
>> 0xdc410000-0xdc41ffff,0xdc400000-0xdc40ffff irq 28 at device 9.0 on pci26
>> miibus1: on bge0
>> brgphy0: on miibus1
>> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
>> 1000baseTX-FDX, auto
>> bge0: Ethernet address: 00:e0:81:33:b6:f4
>> bge1: mem
>> 0xdc430000-0xdc43ffff,0xdc420000-0xdc42ffff irq 29 at device 9.1 on pci26
>> miibus2: on bge1
>> brgphy1: on miibus2
>> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX,
>> 1000baseTX-FDX, auto
>> bge1: Ethernet address: 00:e0:81:33:b6:f5
>> pci24: at device 11.1 (no
>> driver attached)
>> atkbdc0: port 0x60,0x64 irq 1 on acpi0
>> atkbd0: irq 1 on atkbdc0
>> kbd0 at atkbd0
>> atkbd0: [GIANT-LOCKED]
>> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10
>> on acpi0
>> sio0: type 16550A, console
>> fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on
>> acpi0
>> fdc0: [FAST]
>> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
>> pmtimer0 on isa0
>> orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
>> ppc0: parallel port not found.
>> sc0: at flags 0x100 on isa0
>> sc0: VGA <16 virtual consoles, flags=0x300>
>> sio1: configured irq 3 not in bitmap of probed irqs 0
>> sio1: port may not be enabled
>> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
>> Timecounters tick every 1.000 msec
>> ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding
>> disabled, default to deny, logging disabled
>> da0 at twa0 bus 0 target 0 lun 0
>> da0: Fixed Direct Access SCSI-3 device
>> da0: 100.000MB/s transfers
>> da0: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da1 at twa0 bus 0 target 1 lun 0
>> da1: Fixed Direct Access SCSI-3 device
>> da1: 100.000MB/s transfers
>> da1: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da2 at twa0 bus 0 target 2 lun 0
>> da2: Fixed Direct Access SCSI-3 device
>> da2: 100.000MB/s transfers
>> da2: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da3 at twa0 bus 0 target 3 lun 0
>> da3: Fixed Direct Access SCSI-3 device
>> da3: 100.000MB/s transfers
>> da3: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da4 at twa0 bus 0 target 4 lun 0
>> da4: Fixed Direct Access SCSI-3 device
>> da4: 100.000MB/s transfers
>> da4: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da5 at twa0 bus 0 target 5 lun 0
>> da5: Fixed Direct Access SCSI-3 device
>> da5: 100.000MB/s transfers
>> da5: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da6 at twa0 bus 0 target 6 lun 0
>> da6: Fixed Direct Access SCSI-3 device
>> da6: 100.000MB/s transfers
>> da6: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> da7 at twa0 bus 0 target 7 lun 0
>> da7: Fixed Direct Access SCSI-3 device
>> da7: 100.000MB/s transfers
>> da7: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
>> SMP: AP CPU #1 Launched!
>> SMP: AP CPU #2 Launched!
>> SMP: AP CPU #3 Launched!
>> Trying to mount root from ufs:/dev/da0s1a
>> WARNING: / was not properly dismounted
>> /: mount pending error: blocks 208 files 5
>>
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
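
Since the timeouts show up only once in a while, it may help to count the twa messages in saved console or syslog output and see how often each one occurs. Below is a rough sketch, not a definitive tool: the regular expression and the name count_twa_events are assumptions of mine, based only on the six driver messages quoted in this thread (with the wrapped "request =" line joined back together).

```python
import re
from collections import Counter

# Hypothetical pattern inferred from the messages quoted in this thread, e.g.
#   twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20
TWA_RE = re.compile(
    r"(?P<dev>twa\d+): (?P<level>[A-Z]+): "
    r"\((?P<source>0x[0-9A-Fa-f]+): (?P<code>0x[0-9A-Fa-f]+)\): "
    r"(?P<text>[^:]+)"
)

# Sample input: the driver messages from the original report.
SAMPLE = [
    "twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20",
    "twa0: INFO: (0x16: 0x1108): Resetting controller...:",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7",
    "twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1",
    "twa0: INFO: (0x16: 0x1107): Controller reset done!:",
]


def count_twa_events(lines):
    """Count twa driver messages, keyed by (severity, message text)."""
    counts = Counter()
    for line in lines:
        match = TWA_RE.search(line)
        if match:
            counts[(match.group("level"), match.group("text").strip())] += 1
    return counts


if __name__ == "__main__":
    for (level, text), n in sorted(count_twa_events(SAMPLE).items()):
        print(f"{n:3d}  {level:5s}  {text}")
```

The function takes any iterable of lines, so an open log file can be passed in directly; which file the messages end up in depends on the local syslog setup.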