From owner-freebsd-stable@FreeBSD.ORG Tue Nov 14 19:59:14 2006
Date: Tue, 14 Nov 2006 11:56:25 -0800
From: "adam radford"
To: Atanas
Cc: freebsd-stable@freebsd.org
Subject: Re: twa: Passthru request timed out! Resetting controller...
In-Reply-To: <455A1DEA.20304@asd.aplus.net>
References: <455A1DEA.20304@asd.aplus.net>
List-Id: Production branch of FreeBSD source code

Atanas,

Are you running the latest 3ware firmware on that controller?

-Adam

On 11/14/06, Atanas wrote:
> Has anyone experienced this:
>
> twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20
> twa0: INFO: (0x16: 0x1108): Resetting controller...:
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
> ...
> twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7
> twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
> twa0: INFO: (0x16: 0x1107): Controller reset done!:
>
> This happens on 6.2-PRERELEASE i386 (and on 6.1 since its release) on a
> number of machines with the following hardware configuration:
>
> - Tyan K8SE 2892, 2 AMD Opteron 270 CPUs, 4GB RAM
> - 3ware 9550SX-8LP, 8 500GB Seagate ST3500641AS SATA drives
>   (configured as 8 SINGLE DISK units, aka JBOD)
>
> All hardware components, including the server chassis, are listed in the
> 3ware hardware compatibility lists. It doesn't seem to be a cabling or
> power issue. The controller and hard drives are already flashed to the
> latest firmware revisions. I tried turning off NCQ, but it didn't make
> any difference. I also tried switching the kernel from PAE to non-PAE
> (reducing the usable memory to 3GB), but it didn't help either.
>
> I have other machines with similar I/O configurations (3ware), but
> with Intel motherboards and running FreeBSD-5.5, and these have been
> running fine for about a year. Now I'm thinking about swapping the
> drives between a working Intel box and an AMD box, to see where the
> controller timeouts follow.
>
> The problem happens sporadically, about once a month, and is very hard
> to reproduce. Sometimes it takes several weeks until the next crash
> happens, sometimes it crashes again in just a few hours.
>
> When it happens, the kernel sometimes panics (most likely due to
> the inconsistent filesystem state caused by the controller reset) and
> sometimes just hangs. It can be interrupted (I have a serial console),
> but the only usable thing after that seems to be "call cpu_reset()",
> followed by a full (and sometimes painfully long) filesystem check.
>
> Here are the diffs against the default GENERIC and PAE kernel
> configurations:
>
> < cpu I486_CPU
> < ident GENERIC
> < options INET6 # IPv6 communications protocols
> < options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>
> > options QUOTA
> > options SMP # Symmetric MultiProcessor Kernel
> > options BREAK_TO_DEBUGGER
> > options DDB
> > options KDB
> > options KDB_UNATTENDED
>
> > options IPFIREWALL
> > options DUMMYNET
>
> I'm attaching the dmesg.boot following the latest crash.
>
> Regards,
> Atanas
>
>
> Copyright (c) 1992-2006 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>   The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 6.2-PRERELEASE #0: Mon Nov 13 17:47:40 PST 2006
>   root@xyz:/var/obj/usr/src/sys/XYZ-PAE
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Dual Core AMD Opteron(tm) Processor 270 (2009.27-MHz 686-class CPU)
>   Origin = "AuthenticAMD" Id = 0x20f12 Stepping = 2
>   Features=0x178bfbff
>   Features2=0x1
>   AMD Features=0xe2500800
>   AMD Features2=0x3
>   Cores per package: 2
> real memory = 5368709120 (5120 MB)
> avail memory = 4182241280 (3988 MB)
> ACPI APIC Table:
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>  cpu0 (BSP): APIC ID: 0
>  cpu1 (AP): APIC ID: 1
>  cpu2 (AP): APIC ID: 2
>  cpu3 (AP): APIC ID: 3
> ioapic0 irqs 0-23 on motherboard
> ioapic1 irqs 24-27 on motherboard
> ioapic2 irqs 28-31 on motherboard
> kbd1 at kbdmux0
> acpi0: on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
> cpu0: on acpi0
> cpu1: on acpi0
> cpu2: on acpi0
> cpu3: on acpi0
> acpi_button0: on acpi0
> pcib0: port 0xcf8-0xcff on acpi0
> pci0: on pcib0
> pci0: at device 0.0 (no driver attached)
> isab0: at device 1.0 on pci0
> isa0: on isab0
> pci0: at device 1.1 (no driver attached)
> pci0: at device 2.0 (no driver attached)
> atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x1400-0x140f at device 6.0 on pci0
> ata0: on atapci0
> ata1: on atapci0
> pcib1: at device 9.0 on pci0
> pci1: on pcib1
> pci1: at device 6.0 (no driver attached)
> fxp0: port 0x2400-0x243f mem 0xda101000-0xda101fff,0xda120000-0xda13ffff irq 16 at device 8.0 on pci1
> miibus0: on fxp0
> inphy0: on miibus0
> inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> fxp0: Ethernet address: 00:e0:81:33:b5:f1
> pcib2: at device 13.0 on pci0
> pci2: on pcib2
> pcib3: at device 14.0 on pci0
> pci3: on pcib3
> pcib4: port 0xcf8-0xcff on acpi0
> pci24: on pcib4
> pcib5: at device 10.0 on pci24
> pci25: on pcib5
> 3ware device driver for 9000 series storage controllers, version: 3.60.02.012
> twa0: <3ware 9000 series Storage Controller> port 0x3000-0x303f mem 0xde000000-0xdfffffff,0xdc300000-0xdc300fff irq 27 at device 3.0 on pci25
> twa0: [FAST]
> twa0: INFO: (0x15: 0x1300): Controller details:: Model 9550SX-8LP, 8 ports, Firmware FE9X 3.04.01.011, BIOS BE9X 3.04.00.002
> pci24: at device 10.1 (no driver attached)
> pcib6: at device 11.0 on pci24
> pci26: on pcib6
> bge0: mem 0xdc410000-0xdc41ffff,0xdc400000-0xdc40ffff irq 28 at device 9.0 on pci26
> miibus1: on bge0
> brgphy0: on miibus1
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge0: Ethernet address: 00:e0:81:33:b6:f4
> bge1: mem 0xdc430000-0xdc43ffff,0xdc420000-0xdc42ffff irq 29 at device 9.1 on pci26
> miibus2: on bge1
> brgphy1: on miibus2
> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
> bge1: Ethernet address: 00:e0:81:33:b6:f5
> pci24: at device 11.1 (no driver attached)
> atkbdc0: port 0x60,0x64 irq 1 on acpi0
> atkbd0: irq 1 on atkbdc0
> kbd0 at atkbd0
> atkbd0: [GIANT-LOCKED]
> sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
> sio0: type 16550A, console
> fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
> fdc0: [FAST]
> fd0: <1440-KB 3.5" drive> on fdc0 drive 0
> pmtimer0 on isa0
> orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc97ff on isa0
> ppc0: parallel port not found.
> sc0: at flags 0x100 on isa0
> sc0: VGA <16 virtual consoles, flags=0x300>
> sio1: configured irq 3 not in bitmap of probed irqs 0
> sio1: port may not be enabled
> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> Timecounters tick every 1.000 msec
> ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging disabled
> da0 at twa0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 100.000MB/s transfers
> da0: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da1 at twa0 bus 0 target 1 lun 0
> da1: Fixed Direct Access SCSI-3 device
> da1: 100.000MB/s transfers
> da1: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da2 at twa0 bus 0 target 2 lun 0
> da2: Fixed Direct Access SCSI-3 device
> da2: 100.000MB/s transfers
> da2: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da3 at twa0 bus 0 target 3 lun 0
> da3: Fixed Direct Access SCSI-3 device
> da3: 100.000MB/s transfers
> da3: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da4 at twa0 bus 0 target 4 lun 0
> da4: Fixed Direct Access SCSI-3 device
> da4: 100.000MB/s transfers
> da4: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da5 at twa0 bus 0 target 5 lun 0
> da5: Fixed Direct Access SCSI-3 device
> da5: 100.000MB/s transfers
> da5: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da6 at twa0 bus 0 target 6 lun 0
> da6: Fixed Direct Access SCSI-3 device
> da6: 100.000MB/s transfers
> da6: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> da7 at twa0 bus 0 target 7 lun 0
> da7: Fixed Direct Access SCSI-3 device
> da7: 100.000MB/s transfers
> da7: 476827MB (976541696 512 byte sectors: 255H 63S/T 60786C)
> SMP: AP CPU #1 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #3 Launched!
> Trying to mount root from ufs:/dev/da0s1a
> WARNING: / was not properly dismounted
> /: mount pending error: blocks 208 files 5
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
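
Since the resets are sporadic and spread over several machines, it may help to pull the twa messages out of the saved console or messages logs and count them per event code. Below is a minimal sketch in Python, not anything shipped with the driver. The line pattern is only inferred from the messages quoted above (device, severity, two hex codes, free text), and the names parse_twa_line, count_events and SAMPLE are made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the quoted twa0 messages; this is an assumption,
# not a documented log format.
TWA_RE = re.compile(
    r"^(?P<device>twa\d+): (?P<severity>[A-Z]+): "
    r"\((?P<source>0x[0-9A-Fa-f]+): (?P<code>0x[0-9A-Fa-f]+)\): "
    r"(?P<text>.*)$"
)

# Sample lines copied from the report above.
SAMPLE = [
    "twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20",
    "twa0: INFO: (0x16: 0x1108): Resetting controller...:",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7",
    "twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1",
    "twa0: INFO: (0x16: 0x1107): Controller reset done!:",
    "da0 at twa0 bus 0 target 0 lun 0",
]


def parse_twa_line(line):
    """Return a dict of fields for a twa event line, or None if it is not one."""
    match = TWA_RE.match(line.strip())
    return match.groupdict() if match else None


def count_events(lines):
    """Count twa events keyed by (severity, code); other lines are skipped."""
    counts = Counter()
    for line in lines:
        event = parse_twa_line(line)
        if event is not None:
            counts[(event["severity"], event["code"])] += 1
    return counts


if __name__ == "__main__":
    for (severity, code), n in sorted(count_events(SAMPLE).items()):
        print(severity, code, n)
```

Feeding it the lines of a log file instead of SAMPLE (for example, open(path) on whatever file the console output is saved to) would give a per-machine tally that can be compared between the Intel and AMD boxes.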