From owner-freebsd-current@FreeBSD.ORG  Thu Aug 30 23:07:32 2007
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id E707F16A41B
	for <freebsd-current@freebsd.org>; Thu, 30 Aug 2007 23:07:32 +0000 (UTC)
	(envelope-from prvs=176275eba5=killing@multiplay.co.uk)
Received: from multiplay.co.uk (core6.multiplay.co.uk [85.236.96.23])
	by mx1.freebsd.org (Postfix) with ESMTP id 54FD413C46A
	for <freebsd-current@freebsd.org>; Thu, 30 Aug 2007 23:07:31 +0000 (UTC)
	(envelope-from prvs=176275eba5=killing@multiplay.co.uk)
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on
	core6.multiplay.co.uk
X-Spam-Level: 
X-Spam-Status: No, score=-14.7 required=6.0 tests=BAYES_00, USER_IN_WHITELIST, 
	USER_IN_WHITELIST_TO autolearn=ham version=3.1.8
Received: from r2d2 ([212.135.219.182])
	by multiplay.co.uk (multiplay.co.uk [85.236.96.23])
	(MDaemon PRO v9.6.0) with ESMTP id md50004157642.msg
	for <freebsd-current@freebsd.org>; Thu, 30 Aug 2007 23:49:27 +0100
Message-ID: <03b401c7eb57$ee714030$b6db87d4@multiplay.co.uk>
From: "Steven Hartland" <killing@multiplay.co.uk>
To: "Mark Powell" <M.S.Powell@salford.ac.uk>,
	<freebsd-current@freebsd.org>
References: <20070830183305.X60345@rust.salford.ac.uk>
Date: Thu, 30 Aug 2007 23:48:49 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1";
	reply-type=response
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.3138
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
X-MDRemoteIP: 212.135.219.182
X-Return-Path: prvs=176275eba5=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
X-MDaemon-Deliver-To: freebsd-current@freebsd.org
X-Spam-Processed: multiplay.co.uk, Thu, 30 Aug 2007 23:49:27 +0100
X-MDAV-Processed: multiplay.co.uk, Thu, 30 Aug 2007 23:49:27 +0100
Cc: 
Subject: Re: Another ZFS kernel panic on same block on every drive in raidz
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>, 
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Aug 2007 23:07:33 -0000

That sounds very much like an overflow error on the controller / drive.

We had a very similar issue with the Highpoint 1820a drivers, which turned out
to be a compatibility issue between the drive firmware and the controller.

The controller was using standard (28-bit) LBA to access the drive up until
the point where 48-bit LBA was required. This caused problems with some
drives, in our case Seagate, which would report an error when accessed this
way beyond a specific point. The fix was for the controller to always use
48-bit addressing for the drives which supported it.
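
As a rough illustration of the kind of boundary I mean (a sketch only, not
the driver's actual logic): 28-bit LBA can address sectors 0 to 2^28 - 1, so
a transfer that starts just below that limit and runs past it is where this
sort of problem would show up. The snippet assumes 512-byte sectors and
assumes the 65536-byte size in the ZFS messages is the size of the failing
write; the function and constant names are made up for the example.

```python
# Sketch: check whether a transfer runs past the 28-bit LBA limit.
# Assumptions (not taken from any driver source): 512-byte sectors,
# and hypothetical names chosen purely for illustration.

SECTOR_BYTES = 512                 # assumed logical sector size
LBA28_SECTORS = 1 << 28            # 28-bit LBA covers sectors 0 .. 2**28 - 1
LBA28_MAX = LBA28_SECTORS - 1      # highest sector reachable with 28-bit LBA


def last_sector(lba: int, size_bytes: int) -> int:
    """Return the last sector touched by a transfer starting at `lba`."""
    sectors = -(-size_bytes // SECTOR_BYTES)   # round up to whole sectors
    return lba + sectors - 1


def exceeds_lba28(lba: int, size_bytes: int) -> bool:
    """True if any sector of the transfer lies beyond the 28-bit limit."""
    return last_sector(lba, size_bytes) > LBA28_MAX


if __name__ == "__main__":
    lba = 268435340        # LBA reported in the WRITE_DMA timeouts
    size = 65536           # size reported in the ZFS vdev I/O failures
    print("sectors below the 28-bit limit:", LBA28_SECTORS - lba)
    print("last sector of the write:", last_sector(lba, size))
    print("runs past 28-bit LBA:", exceeds_lba28(lba, size))
```

With the numbers from the log, the write starts 116 sectors below 2^28 and is
128 sectors long, so under these assumptions it would straddle the 28-bit
boundary. That is consistent with an addressing problem, though not proof of
one.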

Hope this helps.

    Regards
    Steve

----- Original Message ----- 
From: "Mark Powell" <M.S.Powell@salford.ac.uk>
To: <freebsd-current@freebsd.org>
Sent: Thursday, August 30, 2007 6:47 PM
Subject: Another ZFS kernel panic on same block on every drive in raidz


> Hi,
>   I am testing a 3 drive raidz1 array which has been built with 3 new WD 
> 500GB SATA drives /dev/ad1[468], bought from 2 different sources.
>   I am being told that a DMA error is occurring on the same block on all 3 
> drives at the same time:
> 
> Aug 30 18:13:15 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:15 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:15 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: FAILURE - WRITE_DMA timed out LBA=268435340
> Aug 30 18:13:46 echo kernel: ad14: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad18: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:46 echo kernel: ad16: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435340
> Aug 30 18:13:25 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad14s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:25 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad16s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:25 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad18s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad18s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad14s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path=/dev/ad16s2 offset=132076011520 size=65536 error=5
> Aug 30 18:13:41 echo root: ZFS: vdev I/O failure, zpool=pool path= offset=396215451648 size=131072 error=5
> 
> And then the kernel panics:
> 
> panic: ZFS: I/O failure (write on <unknown> off 0: zio 0xffffff0013b0d000 
> [L0 ZFS plain file] 20000L/20000P DVA[0]=<5:5c40480000:30000> fletcher2 
> uncompressed LE contiguous birth=20167 fill=1 cksum=cfcfcfcfcfcfce00:cfcfcfcfcfcfce00:8a8a8a8a8a56e700:8a8a8a8a8a56e
> cpuid = 0
> 
>   I think I saw someone else have a similar problem to this. They were 
> told their hardware was probably flaky or to look for errors with geli.
>   Just performing a scrub now to see what happens.
>   Let me know if you need any further info.
>   Cheers.
> 
> -- 
> Mark Powell - UNIX System Administrator - The University of Salford
> Information Services Division, Clifford Whitworth Building,
> Salford University, Manchester, M5 4WT, UK.
> Tel: +44 161 295 4837  Fax: +44 161 295 5888  www.pgp.com for PGP key
