From owner-freebsd-stable@FreeBSD.ORG  Sun Oct 14 14:09:02 2007
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 92BDA16A417;
	Sun, 14 Oct 2007 14:09:02 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from ecngs.de (mail.ecngs.de [217.73.144.50])
	by mx1.freebsd.org (Postfix) with ESMTP id B409413C48E;
	Sun, 14 Oct 2007 14:09:01 +0000 (UTC)
	(envelope-from d_elbracht@ecngs.de)
Received: from EC1a (ec1.elbracht.net [217.73.144.99]) 
	by ecngs.de (SurgeMail 3.8f2) with ESMTP id 1773204-1922481 
	for multiple; Sun, 14 Oct 2007 16:09:11 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: "'Scott Long'" <scottl@samsco.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <47121F9F.7050900@samsco.org>
Date: Sun, 14 Oct 2007 16:08:44 +0200
Message-ID: <008d01c80e6b$bb95b7e0$639049d9@EC1a>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 11
Thread-Index: AcgOa5ciuX2g50BFT+K6MBnLZJ0DxQAASbtQ
In-Reply-To: <47121F9F.7050900@samsco.org>
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138
Cc: freebsd-stable@freebsd.org, freebsd-geom@freebsd.org
Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400,
	length=8192)]error = 5
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 14 Oct 2007 14:09:02 -0000

> > we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed as of
> > 2007-10-09
> > 
> > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x 
> > Opteron 2216, da3 is on a 3ware 9550-12
> > 
> > we are seeing this error:
> > g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> > on a 12 GB Hyperdrive
> > 
> > the offset changes sometimes, but it is always 81064794xxxxxxxxx and
> > well outside the 12 GB range.
> > 
> > We did have the Hyperdrive connected directly to the mainboard's SATA0
> > (ad4) with similar errors.
> > We used to have a md instead of the hyperdrive before, coming up with
> > similar errors.
> > 
> > Blocksize on the partition is 8192 (newfs -b 8192 ..).
> > We did have a blocksize of 65536 before, but after some hours
> > (sometimes days), the machine becomes unresponsive with "newbuf" as the
> > wait message in top and has to be hard-reset.
> > Regarding "newbuf", as well as nbufkv and nbufbs, I will write a
> > separate message to the list.
> > 
> > According to systat -vm, da3 does tps > 500 (yes, that's a lot)
> > 
> > This leads us to assume that the error has to do with very high IOs
> > per second on a SMP machine.
> > The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
> > gstripe'd partitions.
> > 
> > Any hint to diagnose / fix the problem is well appreciated.
> > 
> > Cheers,
> > 
> > Dieter
> > 
> 
> I can generate 30,000 I/O's per second for hours on end on 
> several types of storage hardware on FreeBSD SMP, and have no 
> problems.  Since you're seeing this problem both when 
> connected to a 3ware controller and when connected to a 
> simple ATA/SATA controller (both of which have also been 
> observed to do high amounts of I/O with no problems), I 
> suspect that the problem is with your disk device, not with 
> FreeBSD.  I don't know anything about a "hyperdrive" though, 
> so more information might help.
> 
> Scott

Well, how about this:
> > We used to have a md instead of the hyperdrive before, coming up with
> > similar errors.

Here is some info about the hyperdrive:
http://www.hyperossystems.co.uk/

We could go back to the md (memory disk) and try again.

What exactly does the "offset" in the error message mean? Isn't that like a
seek position on the disk? And what does "error = 5" mean?
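
To make the two questions concrete, here is a minimal sketch that just
restates the numbers from the log line. It assumes the offset is a byte
offset into da3s1a and that the device holds 12 GiB; the names describe and
device_bytes are made up for this example and are not part of any FreeBSD
tool.

```python
import errno
import os

# Values copied from the kernel message quoted above.
offset = 81064794762854400
length = 8192

# Assumption: the Hyperdrive holds 12 GiB; the real size of da3s1a may differ.
device_bytes = 12 * 1024**3


def describe(offset, device_bytes):
    """Split the logged offset into parts and check it against the device size."""
    return {
        "hex": f"{offset:#018x}",          # full 64-bit value in hex
        "high32": offset >> 32,            # upper 32 bits
        "low32": offset & 0xFFFFFFFF,      # lower 32 bits
        "block_aligned": offset % length == 0,
        "in_range": 0 <= offset < device_bytes,
    }


info = describe(offset, device_bytes)
for key, value in info.items():
    print(key, value)

# errno 5 is EIO ("Input/output error") on FreeBSD.
print(errno.errorcode[5], os.strerror(5))
```

Looking at the value in hex separates the upper and lower 32 bits, which may
help to tell whether the whole offset is garbage or only part of it. It does
not by itself say where the bad value comes from.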

Sure, the whole thing could be a problem with the application that is
running. It's diablo 5. The history file (dhistory), about 2 GB in size,
resides on the hyperdrive.

Dieter