Date: Thu, 25 Oct 2001 10:30:00 +0930
From: Greg Lehey
To: Ben Eisenbraun
Cc: freebsd-questions@FreeBSD.org
Subject: Re: recovery of corrupt vinum plexes?
Message-ID: <20011025103000.A25441@wantadilla.lemis.com>
References: <20011023044950.A43848@nitrogen.nexthop.net> <20011023183023.M27668@wantadilla.lemis.com> <20011023055005.A44324@nitrogen.nexthop.net>
In-Reply-To: <20011023055005.A44324@nitrogen.nexthop.net>; from bene@nitrogen.nexthop.net on Tue, Oct 23, 2001 at 05:50:05AM -0400
Organization: The FreeBSD Project
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.FreeBSD.org/
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF

On Tuesday, 23 October 2001 at 5:50:05 -0400, Ben Eisenbraun wrote:
> On Tue, Oct 23, 2001 at 06:30:23PM +0930, Greg Lehey wrote:
>> On Tuesday, 23 October 2001 at 4:49:50 -0400, Ben Eisenbraun wrote:
>>> vinum -> list
>>> 8 drives:
>>> D max1 State: up Device /dev/ad0e Avail: 19529/19529 MB (100%)
>>> D max2 State: up Device /dev/ad2e Avail: 19529/19529 MB (100%)
>
>> Is this information correct (i.e. you have six drives)?
>
> Yes, 6 IDE drives across three controllers.
>
>>> $*8*^O^U0*9s^Ou^VhZ#h$#4#4*u^P4*E^LE^LPue[^_^LU^PWVS}u^L%^O^OGst%^?
>>>
>>> (I have a feeling this bodes ill.)
>>
>> Yes.
>> This drive contains no Vinum label. Are ad4s1e and ad6s1e the
>> correct device names?
>
> I believe so. Those are the FreeBSD partitions I used in the vinum
> config file and disklabel shows this:
>
> ad0 and ad2, the mirrored volume, fsck'ed cleanly and appear to be fine.
>
> ad4 and ad6 are identical drives and are partitioned exactly the same.
> ad8 and ad10 are identical drives and are partitioned exactly the same.

This looks bad. I don't know how, but it's fairly evident that your
Vinum label has got clobbered. I've never seen that before.

> I haven't made any local changes to system or kernel sources. It
> didn't write a crash dump to disk on any of the following crashes.
> I didn't catch the console messages on the first boot before the
> first panic. Here's the trace from that original panic:
>
> db> trace
> devsw(0,c126e1cc,c134ee00,c64f75b0,1) at devsw+0x6
> launch_requests(c134ee00,0,c64f75b0,ccf5e240,c1355000) at launch_requests+0x308
> vinumstart(c64f75b0,0,c64f75b0,ce2e8da0,c0199a51) at vinumstart+0x19a
> vinumstrategy(c64f75b0,c134ce80,c64f75b0,1,ce2e8dac) at vinumstrategy+0x92
> spec_strategy(ce2e8dd0,ce2e8db8,c021de25,ce2e8dd0,ce2e8dec) at spec_strategy+0x8d
> spec_vnoperate(ce2e8dd0,ce2e8dec,c021d6e1,ce2e8dd0,c64f75b0) at spec_vnoperate+0x15
> ufs_vnoperatespec(ce2e8dd0,c64f75b0,1,6800c444,c02acc60) at ufs_vnoperatespec+0x15
> ufs_strategy(ce2e8e14,ce2e8e20,c0185eb8,ce2e8e14,1c00) at ufs_strategy+0xc5
> ufs_vnoperate(ce2e8e14) at ufs_vnoperate+0x15
> bwrite(c64f75b0,ce2e8e38,c018b5b5,ce2e8e78,ce2e8e44) at bwrite+0x20c
> vop_stdbwrite(ce2e8e78,ce2e8e44,c021dded,ce2e8e78,ce2e8e84) at vop_stdbwrite+0xf
> vop_defaultop(ce2e8e78,ce2e8e84,c0186e64,ce2e8e78,c64f75b0) at vop_defaultop+0x15
> ufs_vnoperate(ce2e8e78,c64f75b0,6800c444,ccf5d580,10) at ufs_vnoperate+0x15
> vfs_bio_awrite(c64f75b0) at vfs_bio_awrite+0x24c
> ffs_fsync(ce2e8ee8,c1355e00,0,cc01e2a0,ce2e8ee8) at ffs_fsync+0x28b
> ffs_sync(c1355e00,2,c0a35900,cc01e2a0,c1355e00) at ffs_sync+0x126
> sync(cc01e2a0,ce2e8f80,bfbffdd4,bfbffdd4,2) at sync+0x6f
> syscall2(2f,2f,2f,2,bfbffdd4) at syscall2+0x23d
> Xint0x80_syscall() at Xint0x80_syscall+0x2b

Hmm. That could have been just about anything, probably a corrupt
request structure. Without a dump it's difficult to say very much,
but in view of the fact that the drives have gone away, it's possible
that it was trying to talk to them anyway. I'd like to see a dump of
this.

> It hung from there, so I had someone reset it. I'm accessing it
> via serial console. The next messages I had were from midway
> through the boot:
>
> vinum: reading configuration from /dev/ad8s1e
> vinum: stripe-mirror.p0 is faulty
> vinum: stripe-mirror.p1 is faulty
> vinum: stripe-mirror is down
> vinum: updating configuration from /dev/ad10s1e
> vinum: updating configuration from /dev/ad2s1e
> vinum: updating configuration from /dev/ad0s1e
> vinum: stripe-mirror.p0 is corrupt
> vinum: stripe-mirror is up
> vinum: stripe-mirror.p1 is corrupt
> vinum: /dev is mounted read-only, not rebuilding /dev/vinum
> Warning: defective objects

Note that ad4 and ad6 are already gone. What comes next is probably
not so important.

> D max3 State: referenced Device Avail: 0/0 MB
> D max4 State: referenced Device Avail: 0/0 MB
> P stripe-mirror.p0 S State: corrupt Subdisks: 2 Size: 111 GB
> P stripe-mirror.p1 S State: corrupt Subdisks: 2 Size: 111 GB
> S stripe-mirror.p0.s0 State: crashed PO: 0 B Size: 55 GB
> S stripe-mirror.p1.s0 State: crashed PO: 0 B Sdize: 55 aGB
> 0s1: type 0xa5, start 63, end = 17767889, size 17767827 : OK
> swapon: adding /dev/da0s1b as swap device
> ad4s1: type 0xa5, start 63, end = 120053744, size 120053682 : OK
> swapon: adding /dev/ad4s1b as swap device
> ad6s1: type 0xa5, start 63, end = 120053744, size 120053682 : OK
> swapon: adding /dev/ad6s1b as swap device
> Automatic boot in progress...
> /dev/da0s1a: 2331 files, 44030 used, 79985 free (793 frags, 9899 blocks, 0.6% fragmentation)
> /dev/vinum/strippe-mirror: iCANNOT READ: BLKd 16
> /dev/vinum/str2ipe-mirror: UNEX2PECTED SOFT UPDA TE INCONSISTENCY(; RUN fsck MANUAfLLY.
> fsck in frsee(): warning: pcage is already fkree.
> fsck in fr)ee(): warning: c,hunk is already free.
> fsck in furee(): warning: ipointer to wrongd page.
> 0: exited on signal 11 (core dumped)
> /dev/vinum/stripe-mirror (/usr/home): EXITED WITH SIGNAL 11
>
> I'm not sure if the text corruption here is due to the serial console
> being flaky (although it hasn't been before).

That's not corruption: it's a second message coming out more slowly
and interleaved with the first:

  pid 22 (fsck), uid 0: exited on signal 11 (core dumped)

I haven't seen that before except on a -CURRENT machine.

> Here's the dumpconfig -v.
>
> sd name stripe-mirror.p0.s0 drive max3 plex stripe-mirror.p0 len 117225472s driveoffset 265s state crashed plexoffset 0s
> sd name stripe-mirror.p1.s0 drive max4 plex stripe-mirror.p1 len 117225472s driveoffset 265s state crashed plexoffset 0s

These are the objects that interest you.

>
> Drive /dev/ad2e: 19 GB (20478108160 bytes)
>
> More info that may or may not be useful:

You've truncated the dumpconfig output. Did ad4 or ad6 show up? I'm
assuming they didn't.

OK, let's hope that only the Vinum labels are corrupted. You have a
fair chance that the data section hasn't been overwritten, since
there's a copy of the config information (128 kB) between the label
and the data. In that case, you should be able to recreate the
objects with this config file:

  drive max3 device /dev/ad4s1e
  drive max4 device /dev/ad6s1e

That's right, just the drives (check that I have the names right!).
Stop vinum if it's running, then do:

  # vinum
  vinum -> create newconfig

(assuming you've called the new file newconfig). You should end up
with exactly these two objects, and they should be up.
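
As a rough sketch of the step above, the drives-only config file could
be generated from the shell instead of being typed by hand. The helper
name make_drive_config is hypothetical (it is not part of vinum), and
the "drive NAME device PATH" line format is assumed from vinum(8).
Check the device names against your disklabel output before feeding
the result to create.

```shell
#!/bin/sh
# Hypothetical helper (not part of vinum): print one
# "drive NAME device PATH" line for each NAME=PATH argument.
make_drive_config() {
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first '='
        dev=${pair#*=}      # text after the first '='
        printf 'drive %s device %s\n' "$name" "$dev"
    done
}

# Drive names and partitions are the ones discussed in this thread.
make_drive_config max3=/dev/ad4s1e max4=/dev/ad6s1e
```

Redirecting the output to a file such as newconfig gives the file to
pass to create. The helper only formats text and does not touch the
disks.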
Next, do:

  vinum -> start

After that, all objects should be there, but they almost certainly
won't be the way you want them to be. Send me the output of the
'vinum list' and 'vinum list -v' commands, and I'll tell you what to
do next.

Greg
--
When replying to this message, please copy the original recipients.
If you don't, I may ignore the reply.
For more information, see http://www.lemis.com/questions.html
See complete headers for address and phone numbers