From owner-freebsd-bugs@FreeBSD.ORG Tue Oct 18 16:10:13 2011
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id AFA0B1065670;
    Tue, 18 Oct 2011 16:10:13 +0000 (UTC)
    (envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28])
    by mx1.freebsd.org (Postfix) with ESMTP id 9491C8FC12;
    Tue, 18 Oct 2011 16:10:13 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
    by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p9IGADVw072610;
    Tue, 18 Oct 2011 16:10:13 GMT
    (envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
    by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p9IGADhI072609;
    Tue, 18 Oct 2011 16:10:13 GMT
    (envelope-from gnats)
Date: Tue, 18 Oct 2011 16:10:13 GMT
Message-Id: <201110181610.p9IGADhI072609@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Armin Pirkovitsch
Subject: Re: kern/161768: Panics after AHCI timeouts
X-BeenThere: freebsd-bugs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Armin Pirkovitsch
List-Id: Bug reports
X-List-Received-Date: Tue, 18 Oct 2011 16:10:13 -0000

The following reply was made to PR kern/161768; it has been noted by GNATS.

From: Armin Pirkovitsch
To: Alexander Motin
Cc: Alexey Shuvaev, bug-followup@FreeBSD.org
Subject: Re: kern/161768: Panics after AHCI timeouts
Date: Tue, 18 Oct 2011 18:03:03 +0200

On 10/18/11 17:48, Alexander Motin wrote:
> Armin Pirkovitsch wrote:
>> same problem here:
>> machine 1:
>> ahci0: mem 0xfbcfe000-0xfbcfffff
>> irq 16 at device 0.0 on pci4
>> ahci0: AHCI v1.00 with 2 3Gbps ports, Port Multiplier supported
>> ahcich0: at channel 0 on ahci0
>> ahcich1: at channel 1 on ahci0
>> ahci1: port
>> 0x9c00-0x9c07,0x9880-0x9883,0x9800-0x9807,0x9480-0x9483,0x9400-0x941f
>> mem 0xf7ffc000-0xf7ffc7ff irq 20 at device 31.2 on pci0
>> ahci1: AHCI v1.20 with 6 3Gbps ports, Port Multiplier supported
>> ahcich2: at channel 0 on ahci1
>> ahcich3: at channel 1 on ahci1
>> ahcich4: at channel 2 on ahci1
>> ahcich5: at channel 3 on ahci1
>> ahcich6: at channel 4 on ahci1
>> ahcich7: at channel 5 on ahci1
>> ada0 at ahcich2 bus 0 scbus4 target 0 lun 0
>> ada0: ATA-8 SATA 3.x device
>> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada0: Command Queueing enabled
>> ada0: 114473MB (234441648 512 byte sectors: 16H 63S/T 16383C)
>> ada0: Previously was known as ad10
>> ada1 at ahcich3 bus 0 scbus5 target 0 lun 0
>> ada1: ATA-7 SATA 2.x device
>> ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada1: Command Queueing enabled
>> ada1: 1430799MB (2930277168 512 byte sectors: 16H 63S/T 16383C)
>> ada1: Previously was known as ad12
>> ada2 at ahcich4 bus 0 scbus6 target 0 lun 0
>> ada2: ATA-7 SATA 2.x device
>> ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada2: Command Queueing enabled
>> ada2: 1430799MB (2930277168 512 byte sectors: 16H 63S/T 16383C)
>> ada2: Previously was known as ad14
>>
>> machine 2:
>> ahci0: port
>> 0x5058-0x505f,0x5084-0x5087,0x5050-0x5057,0x5080-0x5083,0x5020-0x503f
>> mem 0xb7806000-0xb78067ff irq 19 at device 31.2 on pci0
>> ahci0: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
>> ahcich0: at channel 0 on ahci0
>> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>> ada0: ATA-8 SATA 3.x device
>> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada0: Command Queueing enabled
>> ada0: 114473MB (234441648 512 byte sectors: 16H 63S/T 16383C)
>> ada0: Previously was known as ad4
>>
>> I doubt it is a Samsung problem (same problem on my Corsair SSDs) and my
>> assumption that it is an ata-intel driver problem seems to be wrong as
>> well (as you have a non-intel sata controller)
>
> What do you mean by the "same problem"?
>
> Command timeouts -- they could have a million different reasons
> (controllers, disks, firmwares, cables, power, radio interference, ...)
> and just yesterday I promised you to experiment more with error recovery.
>
> Panics -- your backtraces look completely different from those reported in
> this PR, and I have already commented to you that the panic happens in file
> system code. Ask file system people please.
>

In both cases the disk first stops responding and then the system dies
somehow - sounds similar to me.
IMHO the real problem is not the panic but what happens before it -
the panic is just the result of some operation that was unable to
finish cleanly and therefore panics (in my case the fs, which was unable
to finish writing).
And the same ahcich0: ... messages appear prior to the panic in this
PR - that's why I saw some relevance.
(See Alexey's mail to current@ "Re: Panics after AHCI timeouts" at 15:13
(+2) 18.Oct.2011)