Date: Thu, 4 Apr 2013 07:50:01 GMT
Message-Id: <201304040750.r347o1u6085611@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Alexander Motin
Subject: Re: kern/157397: [ada] ahci/ada/cam NCQ timeouts on Samsung and non-disable-ability

The following reply was made to PR kern/157397; it has been noted by GNATS.

From: Alexander Motin
To: Matthias Andree
Cc: bug-followup@FreeBSD.org
Subject: Re: kern/157397: [ada] ahci/ada/cam NCQ timeouts on Samsung and non-disable-ability
Date: Thu, 04 Apr 2013 10:49:43 +0300

On 04.04.2013 01:08, Matthias Andree wrote:
> - I am running with kern.cam.ada.default_timeout=5 which makes the
> computer recover faster

There is no specific timeout value in the ATA specification. 30 seconds is
probably something of a tradition. Drives without TLER (desktop models) may
perform an unexpectedly high number of error recovery retries.
But 5 seconds may not be enough to spin up in some cases, even for a
perfectly healthy drive.

> - write/read status for stalls is unclear to me, but the kernel only
> ever logs WRITE_FPDMA_QUEUED, so I guess the answer is "write".
>
> "rm -rf /usr/obj" or "log in to GNOME and try starting gnome-terminal"
> are sufficient to trigger it.
>
>
> - reducing the number of tags to 31 does not appear to help. Linux's
> libata does that only to distinguish the bit mask 0xffffffff it might
> get with 32 tags from "fatal errors".

From the ATA/AHCI specs alone, I have no explanation for why 31 tags could
be better than 32. For siis(4) and mvs(4) that limitation is part of the
hardware design. My guess is that it can be useful for AHCI during
controller hot-plug, when a missing controller will return 0xffffffff on
any read. But so far that is irrelevant for us, since PCI hot-plug support
is still mostly missing. It is not the case in the logs provided.

> Logs through "egrep ahcich1\|ada1\|pass1\|ahci0" available from
> , with Serial
> numbers removed.
>
> OBSERVE that this only ever affects odd-numbered slots, never
> even-numbered slots.

Interesting observation, but I don't have an explanation for it. All slots
are equal from the specs' point of view.

-- 
Alexander Motin
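
The bit-mask reasoning in this thread can be illustrated with a small sketch. It assumes one bit per command slot (tag) in a 32-bit mask and treats an all-ones value as what a read from a missing controller would return. All function names here are hypothetical and are not taken from any FreeBSD or Linux driver.

```python
# Illustrative only: hypothetical helpers, not driver code.
ALL_ONES = 0xFFFFFFFF   # value assumed to be read back from a missing controller
ODD_SLOTS = 0xAAAAAAAA  # bits 1, 3, 5, ..., 31 set


def full_mask(num_tags):
    """Return the mask with one bit set for each of num_tags slots."""
    return (1 << num_tags) - 1


def is_ambiguous(num_tags):
    """True if 'every slot busy' is indistinguishable from an all-ones read."""
    return full_mask(num_tags) == ALL_ONES


def slots_in_mask(mask):
    """List the slot numbers whose bits are set in a 32-bit mask."""
    return [slot for slot in range(32) if mask & (1 << slot)]


def only_odd_slots(mask):
    """True if the mask is non-empty and contains no even-numbered slot."""
    return mask != 0 and (mask & ~ODD_SLOTS & ALL_ONES) == 0
```

With 32 tags the full mask equals 0xffffffff, so it cannot be told apart from an all-ones read; with 31 tags the largest possible mask is 0x7fffffff. The `only_odd_slots` helper is one way to check the odd-slot observation against a mask of timed-out slots, if such a mask were extracted from the logs.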