Date: Mon, 15 Oct 2007 10:20:57 +0200
From: "d_elbracht" <d_elbracht@ecngs.de>
To: "'Ivan Voras'" <ivoras@freebsd.org>, <freebsd-stable@freebsd.org>
Cc: freebsd-geom@freebsd.org
Subject: AW: g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
Message-ID: <00cb01c80f04$50b11ed0$639049d9@EC1a>
In-Reply-To: <feu58o$5uo$1@ger.gmane.org>
References: <008801c80e65$47cbe650$639049d9@EC1a> <feu58o$5uo$1@ger.gmane.org>
> > we are trying to diagnose errors seen on 6.2, SMP, amd64, cvsup'ed of
> > 2007-10-09
> >
> > Mainboard is a Tyan Thunder h2000M (S3992-E) with 16 GB RAM and 2 x
> > Opteron 2216, da3 is on a 3ware 9550-12
> >
> > we are seeing this error:
> > g_vfs_done():da3s1a[READ(offset=81064794762854400, length=8192)]error = 5
> > on a 12 GB Hyperdrive
> >
> > the offset changes sometimes, but it is always 81064794xxxxxxxxx and
> > well out the 12GB range.
>
> Yes.
>
> > According to systat -vm, da3 does tps > 500 (yes, that's a lot)
>
> That's not a lot :) That's actually low for a modern solid state drive.
>
> > This leads to an assumption, the error has to do with very high IOs
> > per second on a SMP machine.
>
> Either that or file system errors. Does fsck run ok or does it say
> anything unusual?
>
> There are several theoretical reasons for such errors that are
> connected with the fact you use solid state drives, but all are tricky
> to diagnose if you don't have a certain repeatable test you can try.
> For example: some SSDs optimize writes to "spread out" the IO on the
> chips, but some do it by looking into file system structures to
> determine where it's safe to relocate the write - obviously this works
> only with a known and supported file system. This is a really wild
> guess, but maybe the SSD firmware has error somewhere in this area,
> trying to interpret UFS as it was FAT? If you manage to get a
> repeatable failure test, you can try formatting the drive as FAT32 and
> trying it on that.
>
> Or maybe it's just a bad drive...
>
> > The system-disk is a RAID1 on an ICP 5805. All other disks (51) are 20
> > gstripe'd partitions.
>
> 51 drives and 20 partitions?

According to the manufacturer, the drive handles any filesystem. In other
words, it is as transparent as any hard disk would be. Also, as written
before, we have already seen error=5 with weird offsets on an md (memory
disk) as well. fsck on the disk does NOT show any errors.
Yes, 20 partitions on the other 51 disks (/dev/stripe/data ..datann).
That's for hashfeed from diablo.

One basic question to ask: where does the value for offset= in
g_vfs_done() come from?

Judging from the times at which the error shows up in syslog, I believe
the error only happens when a file gets appended.

Dieter
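
P.S. As a quick sanity check on the reported offset, the arithmetic can be sketched in a few lines of Python. This is only an illustration, not a diagnosis: it assumes "12 GB" means 12 GiB (the exact media size is not known here), and the helper name split_offset is made up for this example. It splits the offset into the bits that could address a device of that size and the bits above them.

```python
# Values copied from the syslog line quoted in this message.
OFFSET = 81064794762854400
LENGTH = 8192

# Assumption: "12 GB" is taken as 12 GiB; the real media size may differ.
DEVICE_BYTES = 12 * 1024**3


def split_offset(offset, device_bytes):
    """Split an offset into (bits above the device range, bits within it).

    Hypothetical helper for illustration only.
    """
    # Smallest all-ones mask that covers every valid byte offset on the device.
    mask = (1 << (device_bytes - 1).bit_length()) - 1
    return offset & ~mask, offset & mask


high, low = split_offset(OFFSET, DEVICE_BYTES)

print(f"offset    = {OFFSET:#018x}")
print(f"high bits = {high:#018x}")
print(f"low bits  = {low:#018x} ({low} bytes)")
print("low bits fit on the device:", low + LENGTH <= DEVICE_BYTES)
print("offset is a multiple of length:", OFFSET % LENGTH == 0)
```

Under these assumptions the offset is 0x0120000057a14000: the low part, 0x57a14000 (about 1.37 GiB), would be a valid, block-aligned position on the drive, and everything out of range sits in the two topmost bytes (0x0120). That at least narrows the question of where the offset= value gets its upper bits from.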