From: Andriy Gapon
Date: Mon, 14 Sep 2009 13:44:19 +0300
To: Kris Kennaway
Cc: Alexander Motin, FreeBSD Current
Subject: Re: ata timeouts under load
Message-ID: <4AAE1E83.2010807@icyb.net.ua>
References: <4AAD4E51.5060908@FreeBSD.org> <4AAD5365.5000902@FreeBSD.org> <4AAD5DD2.4030104@FreeBSD.org>
In-Reply-To: <4AAD5DD2.4030104@FreeBSD.org>

on 14/09/2009 00:02 Kris Kennaway said the following:
>
> It's always that sequence (with setfeatures timing out first, then the
> dma later)...and the block number varies widely, also whether it's
> read/write. The disk itself & the data it contains appears to be OK as
> far as I have been able to determine so far.

I also sometimes see something similar when I put a very high load with a
specific pattern on my two-disk mirrored zpool. The pattern is a zpool scrub
plus additional load, such as untarring large archives.
Example:
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815
kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568159071
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568158815
kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568159071
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568158815
kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568159071
root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284457041920 size=131072 error=5
root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284456910848 size=131072 error=5

But I also see cases where the dma timeout message appears first:
ad10: FAILURE - READ_DMA48 status=51 error=40 LBA=568157535
ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158559
ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815

No errors happen whatsoever if I run scrub without any additional load, or if I
do any 'typical' disk loads without parallel scrubbing.

-- 
Andriy Gapon
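
To compare the two sequences above (setfeatures warnings first versus the
READ_DMA48 message first), it may help to tally the ata messages by severity
and by LBA. What follows is only a minimal sketch: the function name
summarize, the SAMPLE excerpt, and the regular expressions are illustrative
assumptions fitted to the log lines quoted in this message, not a general
parser for ata(4) output.

```python
import re
from collections import Counter

# Assumed line shape, based only on the messages quoted above, e.g.
#   kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815
#   ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
LINE_RE = re.compile(
    r"^(?:kernel: )?(?P<dev>ad\d+): "
    r"(?P<level>WARNING|TIMEOUT|FAILURE) - (?P<rest>.*)$"
)
LBA_RE = re.compile(r"LBA=(\d+)")


def summarize(lines):
    """Count messages per severity and record the severities seen per LBA."""
    levels = Counter()
    by_lba = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # e.g. the "root: ZFS: vdev I/O failure" lines
        levels[m.group("level")] += 1
        lba = LBA_RE.search(m.group("rest"))
        if lba:
            by_lba.setdefault(int(lba.group(1)), []).append(m.group("level"))
    return levels, by_lba


# Short excerpt copied from the first log above.
SAMPLE = """\
kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly
kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815
kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568158815
kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568158815
root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284457041920 size=131072 error=5
"""

if __name__ == "__main__":
    levels, by_lba = summarize(SAMPLE.splitlines())
    print(dict(levels))
    for lba, seen in sorted(by_lba.items()):
        print(lba, seen)
```

Feeding it a whole /var/log/messages excerpt instead of SAMPLE should show
whether the same LBAs go through the full retry/failure cycle each time.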