From: Eric Anderson
Date: Mon, 15 Oct 2007 09:16:20 -0500
To: d_elbracht
Cc: 'Ivan Voras', freebsd-geom@freebsd.org
Subject: Re: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
Message-ID: <47137634.1010703@freebsd.org>
In-Reply-To: <00cb01c80f04$50b11ed0$639049d9@EC1a>
References: <008801c80e65$47cbe650$639049d9@EC1a> <00cb01c80f04$50b11ed0$639049d9@EC1a>

d_elbracht wrote:
>>> we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
>>> 2007-10-09
>>>
>>> Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
>>> Opteron 2216, da3 is on a 3ware
>>> 9550-12
>>>
>>> we are seeing this error:
>>> g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
>>> on a 12 GB Hyperdrive
>>>
>>> the offset changes sometimes, but it is always 81064794xxxxxxxxx and
>>> well outside the 12GB range.
>> Yes.
>>
>>> According to systat -vm, da3 does tps > 500 (yes, that's a lot)
>> That's not a lot :) That's actually low for a modern solid
>> state drive.
>>
>>> This leads to an assumption, the error has to do with very high IOs
>>> per second on a SMP machine.
>> Either that or file system errors. Does fsck run ok or does
>> it say anything unusual?
>>
>> There are several theoretical reasons for such errors that
>> are connected with the fact you use solid state drives, but
>> all are tricky to diagnose if you don't have a certain
>> repeatable test you can try. For example:
>> some SSDs optimize writes to "spread out" the IO on the
>> chips, but some do it by looking into file system structures
>> to determine where it's safe to relocate the write -
>> obviously this works only with a known and supported file
>> system. This is a really wild guess, but maybe the SSD
>> firmware has an error somewhere in this area, trying to
>> interpret UFS as if it were FAT? If you manage to get a
>> repeatable failure test, you can try formatting the drive as
>> FAT32 and trying it on that.

Solid state drives don't behave much differently than a regular drive
from FreeBSD's point of view. The huge difference most people notice is
that they perform best at their page size (or what the SSD manufacturer
might call a block size, which is not a sector size), which is often
128K or 256K. IO smaller than the page size suffers a big penalty since
most SSD devices do not have a cache onboard (although some do now).

>> Or maybe it's just a bad drive...

I doubt it's a bad device.

>>> The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
>>> gstripe'd partitions.
>> 51 drives and 20 partitions?
>>
> According to the manufacturer, the drive handles any filesystem. In other
> words, it's as transparent as any harddisk would be.
> Also, as written before, we have seen the error=5 with weird offsets on an
> md (memory disk) before too.
> fsck on the disk does NOT show any error.
>
> yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann). That's
> for hashfeed from diablo.
>
> One basic question to ask: where does the value for offset= in g_vfs_done()
> come from?
> From the time the error shows up in syslog I believe, the error only
> happens when a file gets appended.

I wonder if (wild guess follows) there's a 32/64-bit conversion problem
somewhere, like a 32-bit number cast to 64-bit or something. I'd like to
see a full trace to see what path it takes. Maybe putting a panic in the
error path would be worth doing.

Eric
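
One cheap way to sanity-check the 32/64-bit guess is to split the reported offset into its upper and lower 32-bit halves and see whether the lower half alone would be a plausible offset on the device. This is only a rough sketch in Python: the offset and the 8192-byte length come from the subject line, the 12 GB size comes from the report (the exact capacity is an assumption), and treating the low half as the "intended" offset is an illustration, not a diagnosis.

```python
# Offset reported by g_vfs_done() in the subject line.
offset = 81064794762854400

# Device size taken from the report ("12 GB Hyperdrive").
# Assumption: 12 * 2**30 bytes; the exact capacity is not given.
device_size = 12 * 1024**3

# Split the 64-bit offset into two 32-bit halves.
high = offset >> 32
low = offset & 0xFFFFFFFF

print(f"offset = {offset:#018x}")
print(f"high   = {high:#010x}")
print(f"low    = {low:#010x} ({low} bytes)")
print("low half fits on device:", low < device_size)
print("low half is 8192-aligned:", low % 8192 == 0)
```

For the offset in the subject line this gives high = 0x01200000 and low = 0x57a14000 (about 1.4 GB). The low half on its own would be a valid, 8192-aligned offset on a 12 GB device, which is at least consistent with stray upper bits, but it says nothing about where in the I/O path they would be introduced.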