Date: Sat, 20 Mar 2004 08:57:38 +1030
From: Greg 'groggy' Lehey <grog@FreeBSD.org>
To: Lewis Thompson <lewiz@fajita.org>
Cc: questions@freebsd.org
Subject: Re: Vinum, replaced disk -- fsck error.
Message-ID: <20040319222738.GQ21807@wantadilla.lemis.com>
In-Reply-To: <20040319030334.GA1985@lewiz.org>
References: <20040316020000.GA846@lewiz.org> <20040316111325.GB742@adelaide.lemis.com> <20040316172526.GB1236@lewiz.org> <20040318025602.GZ58155@wantadilla.lemis.com> <20040319030334.GA1985@lewiz.org>

On Friday, 19 March 2004 at 3:03:34 +0000, Lewis Thompson wrote:
> On Thu, Mar 18, 2004 at 01:26:02PM +1030, Greg 'groggy' Lehey wrote:
>> On Tuesday, 16 March 2004 at 17:25:26 +0000, Lewis Thompson wrote:
>>> I can't think of anything else. Originally I ran dd without the
>>> conv=noerror and it stopped at around 25GB (the disk is a 100GB). The
>>> destination disk is 123GB but to my knowledge that is acceptable for dd.
>>>
>>> During the process a number (maybe eight to ten) I/O errors were
>>> reported.
>>
>> But not to me.
>
> I've included more detailed errors neared to the end of this email :)
>
>> I was really thinking of "What to do if you have problems with Vinum"
>> at http://www.vinumvm.org/vinum/how-to-debug.html.
>
> Okay, I did actually do my best to follow this but maybe got
> sidetracked. I'm just going to bullet point these now so I don't miss
> any of them out.
>
> * Problems: ``dd'' cloned disk ``does not work'' (i.e. gstat shows no
>   activity on the cloned disk during reading of files).

What command did you enter? What happened on the command line?

>   Also see previous emails.

You can't really expect me to go digging for them.

> * Changes to system: Originally vinum ran on 4.9-STABLE. This worked
>   but had periodic ``disk crashes'' (i.e. vinum states disk as offline).
>   I don't think this is the problem as the same behaviour happens with
>   5.2.1-p1 using the original dodgy disk (only GEOM removes it instead
>   of vinum).

It looks like part of the problem to me. It seems that you have a
flaky disk. Is that correct?

> * Vinum list (excuse lack of wrapping).

On the contrary, it shouldn't be wrapped.

I don't see anything in this list that hasn't started. Is it correct
that volume "data" only has one plex?

> * /var/log/messages extract. I originally started vinum a long while
>   before, I included this entry too (excuse wrapping):
>
>   Mar 17 23:33:57 amnesia kernel: vinum: loaded
>   Mar 17 23:34:00 amnesia kernel: vinum: reading configuration from /dev/ad1s1h
>   Mar 17 23:34:00 amnesia kernel: vinum: updating configuration from /dev/ad2s1h
>   Mar 17 23:34:00 amnesia kernel: vinum: updating configuration from /dev/ad3s1h
>   Mar 19 02:49:26 amnesia kernel: WARNING: /mnt/data was not properly dismounted
>   Mar 19 02:52:15 amnesia kernel: vinum: null rqg
>
>   This seems a little odd to me -- previously I had not had a null rqg
>   error.

This is certainly an interesting one.

>   I think maybe I didn't test it enough. Since these are mostly avi
>   files I can tell if they are broken on not by seeing if they have an
>   index -- last time they all played but many without indexes.
>   Nothing has changed since then; maybe I wasn't being thorough
>   enough?

I'm wondering if the problem isn't at least partially due to the flaky
disk. The "null rqg" message indicates that a request couldn't be
mapped. I'd really need a dump from this point, if the problem is
repeatable. Let me know and I'll send you a patch.

>>> During the process a number (maybe eight to ten) I/O errors were
>>> reported.
>
> These were dd errors. I didn't write these down at the time (silly of
> me) and I'm not sure they even go into any log files. However, I have
> found the exact error messages I got (although the offsets are wrong).
> If required I will re-run dd and provide the full errors.
>
> The messages were:
>
>   dd: reading `/dev/ad3': Input/output error

Hmm. Why are you running against /dev/ad3? Why are you using dd at
all?

In any case, I would expect error messages in /var/log/messages at
this point.

> In a reply to my original question you stated that ``dd if=ad3 of=ad1
> bs=8192 conv=noerror'' ``may or may not work, depending on details you
> haven't reported.'' Do these detailed errors help at all?

A little. They tell me that the drive is flaky. I'd expect to see
the error messages in /var/log/messages, though.

> I just read a thread[1] about dd that makes me wonder whether it
> would have been.

The only reference I see there is to the current thread. At least it
gave me the background.

> I think that's everything. I'm just going to include some other stuff
> from earlier emails that has been chopped earlier. Maybe it has some
> relevance:
>
> = fsck_ufs /dev/vinum/data gives the following message:
> =   ** /dev/vinum/data
> =   cannot alloc 4316869296 bytes for inphead

Yes, this is almost certainly due to incorrect copying. Probably the
conv=noerror is to blame for that.

I suspect that, unless you can read the sections of the volume that
appear to be causing the fsck problems, you may be out of luck. About
the only thing you could try is to mount the volume read-only without
fsck, and then copy what data you can elsewhere.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply or reply to the original recipients.
For more information, see http://www.lemis.com/questions.html
Note: I discard all HTML mail unseen.
Finger grog@FreeBSD.org for PGP public key.
See complete headers for address and phone numbers.
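
Greg's suspicion about conv=noerror fits how dd(1) documents that option: without the sync conversion, an input block that cannot be read is omitted from the output, so everything after it lands one block early on the destination. The sketch below is a toy model of that behaviour, not dd itself, and it is only an assumption that this is what damaged the copy discussed in this thread. The 4-byte block size, the block contents and the position of the bad block are made up for illustration, and copy_blocks is a hypothetical helper.

```python
# Toy model only: shows how an unreadable input block affects output offsets.
# BLOCK, the sample data and the bad-block index are invented for illustration
# (the command quoted in the thread used bs=8192 on real disks).

BLOCK = 4  # bytes per block in this model


def copy_blocks(blocks, bad, sync):
    """Copy blocks in order, omitting or zero-padding unreadable ones.

    blocks: list of bytes objects, one per input block
    bad:    set of block indexes that fail with an I/O error
    sync:   True models conv=noerror,sync; False models conv=noerror alone
    """
    out = bytearray()
    for index, block in enumerate(blocks):
        if index in bad:
            if sync:
                # Pad with NUL bytes so later blocks keep their offsets.
                out += b"\0" * BLOCK
            # Without sync the unreadable block is left out entirely.
            continue
        out += block
    return bytes(out)


source = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
bad = {1}  # pretend the second block cannot be read

shifted = copy_blocks(source, bad, sync=False)
aligned = copy_blocks(source, bad, sync=True)

print(shifted)
print(aligned)
print(shifted.find(b"CCCC"), aligned.find(b"CCCC"))
```

In this model the block CCCC ends up at offset 4 when sync is off and at offset 8 when it is on. On a real disk, a shift like the first case would move filesystem metadata away from where fsck expects it, which is consistent with the implausible allocation size fsck_ufs reported.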