Date: Mon, 14 Sep 2009 13:44:19 +0300 From: Andriy Gapon <avg@icyb.net.ua> To: Kris Kennaway <kris@FreeBSD.org> Cc: Alexander Motin <mav@FreeBSD.org>, FreeBSD Current <current@FreeBSD.org> Subject: Re: ata timeouts under load Message-ID: <4AAE1E83.2010807@icyb.net.ua> In-Reply-To: <4AAD5DD2.4030104@FreeBSD.org> References: <4AAD4E51.5060908@FreeBSD.org> <4AAD5365.5000902@FreeBSD.org> <4AAD5DD2.4030104@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
on 14/09/2009 00:02 Kris Kennaway said the following: > > It's always that sequence (with setfeatures timing out first, then the > dma later)...and the block number varies widely, also whether it's > read/write. The disk itself & the data it contains appears to be OK as > far as I have been able to determine so far. I also sometimes see something similar when I put very high load with a specific pattern on my two-disk mirrored zpool. The pattern is zpool scrub plus additional load like untarring large archives. Example: kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815 kernel: ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568159071 kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568158815 kernel: ad10: TIMEOUT - READ_DMA48 retrying (0 retries left) LBA=568159071 kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly kernel: ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly kernel: ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568158815 kernel: ad10: FAILURE - READ_DMA48 timed out LBA=568159071 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284457041920 size=131072 error=5 root: ZFS: vdev I/O failure, zpool=tank path=/dev/ad10s2d offset=284456910848 size=131072 error=5 But I also see cases where dma timeout message appears first: ad10: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=568157535 ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly ad10: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly ad10: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly ad10: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly ad10: WARNING - SET_MULTI taskqueue timeout - completing request directly ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158559 ad10: TIMEOUT - READ_DMA48 retrying (1 retry left) LBA=568158815 No errors happen whatsoever if I run scrub without any additional load, or if I do any 'typical' disk loads without parallel scrubbing. -- Andriy Gapon
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4AAE1E83.2010807>