From: "Vladislav Prodan"
To: "Steven Hartland"
Cc: current@freebsd.org, fs@freebsd.org
Subject: Re[2]: Re[2]: AHCI timeout when using ZFS + AIO + NCQ
Date: Sun, 27 Jan 2013 17:13:25 +0200
Message-Id: <70362.1359299605.3196836531757973504@ffe11.ukr.net>
In-Reply-To: <16B555759C2041ED8185DF478193A59D@multiplay.co.uk>
List-Id: Filesystems (freebsd-fs@freebsd.org)

> ----- Original Message -----
> From: "Vladislav Prodan"
>
> >> Is it always the same disk? If so, replace it. SMART helps identify issues
> >> but doesn't tell you 100% there's no problem.
> >
> > Now a different HDD has fallen off - ada0.
> > I'm 99% sure that MHDD will not find problems in HDD - ada0 and ada2.
> > I still have three servers with similar chipsets that have similar problems
> > with blade ahci times out.
>
> I notice your disks are connecting at SATA 3.x, which rings bells. We had
> a very similar issue on a new Supermicro machine here and after much
> testing we proved to our satisfaction that the problem was the HW.

I have an ASUS M5A97 PRO motherboard:
http://www.asus.com/Motherboard/M5A97_PRO/#specifications
The SATA data cables have already been replaced.
Adding a hardware RAID controller does not guarantee that the data can be
recovered if the controller dies.

> Essentially the combination of SATA 3 speeds and the midplane / backplane
> degraded the connection between the MB and HDD enough to cause
> the disks to randomly drop when under load.
>
> If we connected the disks directly to the MB with SATA cables the
> problem went away. In the end we had the midplanes changed from an
> AHCI pass-through to an active LSI controller.
>
> So if you have any sort of midplane / backplane connecting your disks,
> try connecting them directly to the MB / controller via known SATA 3.x
> compliant cables and see if that stops the drops.
>
> Another test you can do is to force the disks to connect at SATA 2.x.
> This also fixed it in our case, but wasn't something we wanted to
> put into production, hence the controller swap.
>
> To force SATA 2 speeds you can use the following in /boot/loader.conf
> where 'X' is the disk identifier, e.g. for ada0 X = 0:
> hint.ahcich.X.sata_rev=2
>
> Hope this helps.
>
> Regards
> Steve

-- 
Vladislav V. Prodan
System & Network Administrator
http://support.od.ua
+380 67 4584408, +380 99 4060508
VVP88-RIPE
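
A rough sketch of how the loader.conf lines for the SATA 2 test quoted above
could be produced. The X in hint.ahcich.X.sata_rev is the AHCI channel number,
which may not match the adaN number when some ports are empty, so this sketch
reads the channel from dmesg attach lines of the form "adaN at ahcichX ...".
The function name sata_rev_hints and the sample dmesg text are hypothetical,
made up for illustration; real input would come from running dmesg on the
affected host.

```python
import re

# Matches dmesg attach lines such as "ada0 at ahcich0 bus 0 scbus0 target 0 lun 0".
ATTACH_RE = re.compile(r"^(ada\d+) at ahcich(\d+)\b", re.MULTILINE)


def sata_rev_hints(dmesg_text, disks, rev=2):
    """Return loader.conf lines limiting the AHCI channels of `disks` to SATA `rev`.

    Disks that do not appear in `dmesg_text` are skipped silently.
    """
    channel_of = {disk: int(ch) for disk, ch in ATTACH_RE.findall(dmesg_text)}
    return [
        f"hint.ahcich.{channel_of[disk]}.sata_rev={rev}"
        for disk in disks
        if disk in channel_of
    ]


# Hypothetical sample (assumption): ada2 sits on channel 3 because one port is empty.
SAMPLE_DMESG = """\
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada1 at ahcich1 bus 0 scbus1 target 0 lun 0
ada2 at ahcich3 bus 0 scbus3 target 0 lun 0
"""

if __name__ == "__main__":
    # Print the lines to append to /boot/loader.conf for the two disks that dropped.
    for line in sata_rev_hints(SAMPLE_DMESG, ["ada0", "ada2"]):
        print(line)
```

The hints only take effect after a reboot. Whether the links really negotiated
the lower speed can then be checked in the boot messages, where each adaN
device reports its transfer rate.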