Date: Fri, 16 May 2008 14:14:14 +0200
From: Willy Offermans <Willy@Offermans.Rompen.nl>
To: Roland Smith <rsmith@xs4all.nl>
Cc: freebsd-stable@FreeBSD.ORG
Subject: Re: g_vfs_done error third part--PLEASE HELP!
Message-ID: <20080516121414.GD4618@wiz.vpn.offrom.nl>
In-Reply-To: <20080421201047.GB6884@slackbox.xs4all.nl>
References: <20080421190403.GA4625@wiz.vpn.offrom.nl> <20080421201047.GB6884@slackbox.xs4all.nl>

Hello Roland and FreeBSD friends,

I'm sorry to have been so quiet for a while; I was away on vacation.
Now that I'm back, I would like to solve this issue.

On Mon, Apr 21, 2008 at 10:10:47PM +0200, Roland Smith wrote:
> On Mon, Apr 21, 2008 at 09:04:03PM +0200, Willy Offermans wrote:
> > Dear FreeBSD friends,
> >
> > It is already the third time that I report this error. Can someone help
> > me in solving this issue?
>
> Probably the reason that you hear so little is that you provide so
> little information. Most of us are not clairvoyant.
>
> > Over and over again and always after heavy disk I/O I see the following
> > errors in the log files. If I force ar0s1g to unmount the machine
> > spontaneously reboots. Nothing seriously seems to be damaged by this
> > act, but anyway I cannot afford something bad happening to this
> > production machine.
>
> Why would you force an unmount?

Otherwise the device keeps reporting that it is unavailable and cannot
be unmounted:

sun# umount /share/
umount: unmount of /share failed: Resource temporarily unavailable

>
> > Apr 18 20:02:19 sun kernel: g_vfs_done():ar0s1g[WRITE(offset=290725068800, length=4096)]error = 5
> >
> > I have no clue what the errors mean, since offsets of 290725068800,
> > 290725072896, and 290725074944 seem to be ridiculous. Does anybody
> > have a clue what is going on?
>
> For starters, how big is ar0s1g? If the offset is in bytes, it is around
> 270 GB, which is not that unusual in this day and age.

I have to admit that I was a bit confused by an offset value of
290725068800. There is no indication of a unit, so I assumed that it was
in sectors, but it is probably simply in bytes, and then the number does
indeed make sense.

>
> > I'm using FreeBSD 7.0, but found the error being reported before with
> > previous versions of FreeBSD. I can and will provide more details on
> > demand.
>
> What does 'df' say?

Filesystem   1K-blocks     Used     Avail Capacity  Mounted on
/dev/ar0s1a   20308398   230438  18453290     1%    /
devfs                1        1         0   100%    /dev
/dev/ar0s1d   21321454  3814482  15801256    19%    /usr
/dev/ar0s1e   50777034  5331686  41383186    11%    /var
/dev/ar0s1f  101554150 18813760  74616058    20%    /home
/dev/ar0s1g  274977824 34564876 218414724    14%    /share

Pretty normal, I would say.

>
> Did you notice any file corruption in the filesystem on ar0s1g?

No, the two disks are brand new and I did not encounter any noticeable
file corruption. However, I assume that nowadays bad sectors on hard
disks are handled by the hardware and do not need any user interaction
to correct. But maybe I'm totally wrong.

>
> Unmount the filesystem and run fsck(8) on it. Does it report any errors?

sun# fsck /dev/ar0s1g
** /dev/ar0s1g
** Last Mounted on /share
** Phase 1 - Check Blocks and Sizes
INCORRECT BLOCK COUNT I=34788357 (272 should be 264)
CORRECT? [yn] y

INCORRECT BLOCK COUNT I=34789217 (296 should be 288)
CORRECT? [yn] y

** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? [yn] y

SUMMARY INFORMATION BAD
SALVAGE? [yn] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

182863 files, 17282440 used, 120206472 free (12448 frags, 15024253 blocks, 0.0% fragmentation)

***** FILE SYSTEM MARKED CLEAN *****

***** FILE SYSTEM WAS MODIFIED *****

The usual stuff, I would say.

>
> > Any hints are very much appreciated.
>
> Did you manage to create a partition larger than the disk is (using
> newfs's -s switch)? In that case it could be that you're trying to write
> past the end of the device.

No, look at the following output:

sun# bsdlabel -A /dev/ar0s1
# /dev/ar0s1:
type: unknown
disk: amnesiac
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 60799
sectors/unit: 976751937
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # milliseconds
track-to-track seek: 0  # milliseconds
drivedata: 0

8 partitions:
#          size     offset    fstype   [fsize bsize bps/cpg]
  a:   41943040          0    4.2BSD        0     0     0
  b:    8388608   41943040      swap
  c:  976751937          0    unused        0     0         # "raw" part, don't edit
  d:   44040192   50331648    4.2BSD     2048 16384 28552
  e:  104857600   94371840    4.2BSD     2048 16384 28552
  f:  209715200  199229440    4.2BSD     2048 16384 28552
  g:  567807297  408944640    4.2BSD     2048 16384 28552

/dev/ar0s1g starts after 408944640*512/1024/1024=199680MB

So I have to conclude that the write error message does make sense and
that something seems to be wrong with the disks. The next question is:
what can I do about it? Should I return the disks to the shop and ask
for new ones? However, other people whom I have contacted and who had a
similar problem before solved it by using a software RAID setup instead
of a hardware RAID setup. This seems to indicate that there is some bug
in the FreeBSD code.

Another peculiarity that I have to mention is the following. If I use
sysinstall and try to ``Label allocated disk partitions'', I cannot see
the partitions on ar0. However, the partitions can be displayed with
bsdlabel, as shown above.

What is going on and what should I do?

>
> Roland
> --
> R.F.Smith                                   http://www.xs4all.nl/~rsmith/
> [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
> pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)

--
Met vriendelijke groeten,
With kind regards,
Mit freundlichen Gruessen,
De jrus wah,

Willy

*************************************
W.K. Offermans
Home:   +31 45 544 49 44
Mobile: +31 653 27 16 23
e-mail: Willy@Offermans.Rompen.nl

               Powered by ....
 (__)
 \\\'',)
   \/  \ ^
   .\._/_)

 www.FreeBSD.org
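
The sector and byte arithmetic in the message above can be checked mechanically. What follows is only a minimal sketch in Python, not part of the original mail. It assumes that the offsets printed by g_vfs_done() are in bytes and are relative to the start of ar0s1g, which the log line itself does not state. The helper names `sectors_to_bytes` and `bytes_past_end` are made up for this illustration.

```python
# Values copied from the bsdlabel output and the kernel log in the message.
SECTOR_BYTES = 512            # "bytes/sector"
SLICE_SECTORS = 976751937     # partition c, the whole slice ar0s1
G_START_SECTORS = 408944640   # offset of partition g within the slice
G_SIZE_SECTORS = 567807297    # size of partition g

# Offsets reported by g_vfs_done(). ASSUMPTION: they are in bytes and are
# relative to the start of ar0s1g; the log line itself does not say so.
WRITE_OFFSETS = [290725068800, 290725072896, 290725074944]


def sectors_to_bytes(sectors: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Convert a sector count from the label into bytes."""
    return sectors * sector_bytes


def bytes_past_end(offset: int, size_bytes: int) -> int:
    """Distance of offset beyond size_bytes; negative if offset is inside."""
    return offset - size_bytes


g_bytes = sectors_to_bytes(G_SIZE_SECTORS)
g_start_mib = sectors_to_bytes(G_START_SECTORS) // (1024 * 1024)

print(f"g starts {g_start_mib} MiB into the slice")
print(f"g ends at sector {G_START_SECTORS + G_SIZE_SECTORS} of {SLICE_SECTORS}")
print(f"g is {g_bytes} bytes long")
for off in WRITE_OFFSETS:
    delta = bytes_past_end(off, g_bytes)
    where = "past the end of" if delta >= 0 else "inside"
    print(f"offset {off}: {where} g by {abs(delta)} bytes")
```

With the figures from the label, g comes out at 290717336064 bytes and ends exactly at the end of the slice. Under the stated assumption, all three logged offsets therefore fall roughly 7.4 MiB beyond the end of the partition as labelled. This is only arithmetic on the quoted numbers; it does not by itself identify the cause of the errors.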