Date:      Thu, 25 Oct 2001 02:33:59 -0400
From:      Ben Eisenbraun <bene@nitrogen.nexthop.net>
To:        Greg Lehey <grog@FreeBSD.org>
Cc:        freebsd-questions@FreeBSD.org
Subject:   Re: recovery of corrupt vinum plexes?
Message-ID:  <20011025023359.D64298@nitrogen.nexthop.net>
In-Reply-To: <20011025103000.A25441@wantadilla.lemis.com>; from grog@FreeBSD.org on Thu, Oct 25, 2001 at 10:30:00AM +0930
References:  <20011023044950.A43848@nitrogen.nexthop.net> <20011023183023.M27668@wantadilla.lemis.com> <20011023055005.A44324@nitrogen.nexthop.net> <20011025103000.A25441@wantadilla.lemis.com>

On Thu, Oct 25, 2001 at 10:30:00AM +0930, Greg Lehey wrote:
> On Tuesday, 23 October 2001 at  5:50:05 -0400, Ben Eisenbraun wrote:
<snip a db> trace> 

> Hmm.  That could have been just about anything, probably a corrupt
> request structure.  Without a dump it's difficult to say very much,
> but in view of the fact that the drives have gone away, it's possible
> that it was trying to talk to them anyway.  I'd like to see a dump of
> this.

I think I can reproduce this if I frob with it enough, but this
kernel wasn't built with debug symbols, and the system sources have
been updated recently, so I'm not sure a dump would be useful.  Do
you have any suggestions?

> You've truncated the dumpconfig output.  Did ad4 or ad6 show up?  I'm
> assuming they didn't.

They didn't show up.  What I pasted is what I've got.

> OK, let's hope that only the Vinum labels are corrupted.  You have a
> fair chance that the data section hasn't been overwritten, since
> there's a copy of the config information (128 kB) between the label
> and the data.  In that case, you should be able to recreate the
> objects with this config file:
> 
> device max3 device /dev/ad4s1e
> device max4 device /dev/ad6s1e

"device" in the first column produced an error.  Looking at the config, 
I figured you meant "drive max# device /dev/etc", so I swapped "drive" 
for the first "device" in each line.  Here's the output of the 'create',
'start', 'list' and 'list -v' commands.

vinum -> create /root/newconfig
ad4s1: type 0xa5, start 63, end = 120053744, size 120053682 : OK
vinum: drive max3 is up
ad6s1: type 0xa5, start 63, end = 120053744, size 120053682 : OK
vinum: drive max4 is up
2 drives:
D max3                  State: up       Device /dev/ad4s1e      Avail: 57239/57239 MB (100%)
D max4                  State: up       Device /dev/ad6s1e      Avail: 57239/57239 MB (100%)

0 volumes:
0 plexes:
0 subdisks:

vinum -> start
ad0s1: type 0xa5, start 63, end = 40000463, size 40000401 : OK
<snip 6 repeats>
ad4s1: type 0xa5, start 63, end = 120053744, size 120053682 : OK
<snip 26 repeats>
ad6s1: type 0xa5, start 63, end = 120053744, size 120053682 : OK
<snip 27 repeats>
ad8s1: type 0xa5, start 63, end = 117226304, size 117226242 : OK
<snip 3 repeats>
ad10s1: type 0xa5, start 63, end = 117226304, size 117226242 : OK
<snip 3 repeats>
da0s1: type 0xa5, start 63, end = 17767889, size 17767827 : OK
<snip 29 repeats>
md0: invalid primary partition table: no magic
<snip 34 repeats>
vinum: updating configuration from /dev/ad4e
vinum: updating configuration from /dev/ad6e
vinum: updating configuration from /dev/ad8s1e
vinum: updating configuration from /dev/ad10s1e
vinum: updating configuration from /dev/ad2s1e
vinum: updating configuration from /dev/ad0s1e
Warning: defective objects

P stripe-mirror.p0    S State: corrupt  Subdisks:     2 Size:        111 GB
P stripe-mirror.p1    S State: corrupt  Subdisks:     2 Size:        111 GB
S stripe-mirror.p0.s0   State: crashed  PO:        0  B Size:         55 GB
S stripe-mirror.p1.s0   State: crashed  PO:        0  B Size:         55 GB

vinum -> list
8 drives:
D max3                  State: up       Device /dev/ad4e        Avail: 0/57239 MB (0%)
D max4                  State: up       Device /dev/ad6e        Avail: 0/57239 MB (0%)
D max1                  State: up       Device /dev/ad0s1e      Avail: 0/19529 MB (0%)
D max2                  State: up       Device /dev/ad2s1e      Avail: 0/19529 MB (0%)
D wd1                   State: up       Device /dev/ad8s1e      Avail: 0/57239 MB (0%)
D wd2                   State: up       Device /dev/ad10s1e     Avail: 0/57239 MB (0%)

2 volumes:
V stripe-mirror         State: up       Plexes:       2 Size:        111 GB
V var-mirror            State: up       Plexes:       2 Size:         19 GB

4 plexes:
P stripe-mirror.p0    S State: corrupt  Subdisks:     2 Size:        111 GB
P stripe-mirror.p1    S State: corrupt  Subdisks:     2 Size:        111 GB
P var-mirror.p0       C State: up       Subdisks:     1 Size:         19 GB
P var-mirror.p1       C State: up       Subdisks:     1 Size:         19 GB

6 subdisks:
S stripe-mirror.p0.s0   State: crashed  PO:        0  B Size:         55 GB
S stripe-mirror.p0.s1   State: up       PO:      512 kB Size:         55 GB
S stripe-mirror.p1.s0   State: crashed  PO:        0  B Size:         55 GB
S stripe-mirror.p1.s1   State: up       PO:      512 kB Size:         55 GB
S var-mirror.p0.s0      State: up       PO:        0  B Size:         19 GB
S var-mirror.p1.s0      State: up       PO:        0  B Size:         19 GB

vinum -> list -v
8 drives:
Drive max3:     Device /dev/ad4e
                Created on  at Wed Oct 24 21:54:10 2001
                Config last updated Wed Oct 24 21:54:52 2001
                Size:      60019835904 bytes (57239 MB)
                Used:      60019577344 bytes (57239 MB)
                Available:      258560 bytes (0 MB)
                State: up
                Last error: none
                Active requests:        0
                Maximum active:         0

Drive max4:     Device /dev/ad6e
                Created on  at Wed Oct 24 21:54:10 2001
                Config last updated Wed Oct 24 21:54:52 2001
                Size:      60019835904 bytes (57239 MB)
                Used:      60019577344 bytes (57239 MB)
                Available:      258560 bytes (0 MB)
                State: up
                Last error: none
                Active requests:        0
                Maximum active:         0

Drive max1:     Device /dev/ad0s1e
                Created on whiskey.klatsch.org at Tue May  1 02:38:57 2001
                Config last updated Wed Oct 24 21:54:52 2001
                Size:      20478108160 bytes (19529 MB)
                Used:      20478108160 bytes (19529 MB)
                Available:           0 bytes (0 MB)
                State: up
                Last error: none
                Active requests:        0
                Maximum active:         0

Drive max2:     Device /dev/ad2s1e
                Created on whiskey.klatsch.org at Tue Jul 31 17:47:06 2001
                Config last updated Wed Oct 24 21:54:52 2001
                Size:      20478108160 bytes (19529 MB)
                Used:      20478108160 bytes (19529 MB)
                Available:           0 bytes (0 MB)
                State: up
                Last error: none
                Active requests:        0
                Maximum active:         0

Drive wd1:      Device /dev/ad8s1e
                Created on whiskey.klatsch.org at Tue Jul 31 13:00:55 2001
                Config last updated Wed Oct 24 21:54:52 2001
                Size:      60019835904 bytes (57239 MB)
                Used:      60019577344 bytes (57239 MB)
                Available:      258560 bytes (0 MB)
                State: up
                Last error: none
                Active requests:        0
                Maximum active:         0

Drive wd2:      Device /dev/ad10s1e
                Created on whiskey.klatsch.org at Mon Jul 30 15:30:41 2001
                Config last updated Wed Oct 24 21:54:52 2001
                Size:      60019835904 bytes (57239 MB)
                Used:      60019577344 bytes (57239 MB)
                Available:      258560 bytes (0 MB)
                State: up
                Last error: none
                Active requests:        0
                Maximum active:         0


2 volumes:
Volume stripe-mirror:   Size: 120038883328 bytes (114478 MB)
                State: up
                Flags: 
                2 plexes
                Read policy: round robin
Volume var-mirror:      Size: 20477972480 bytes (19529 MB)
                State: up
                Flags: 
                2 plexes
                Read policy: round robin

4 plexes:
Plex stripe-mirror.p0:  Size:   120038883328 bytes (114478 MB)
                Subdisks:        2
                State: corrupt
                Organization: striped   Stripe size: 512 kB
                Part of volume stripe-mirror

Plex stripe-mirror.p1:  Size:   120038883328 bytes (114478 MB)
                Subdisks:        2
                State: corrupt
                Organization: striped   Stripe size: 512 kB
                Part of volume stripe-mirror

Plex var-mirror.p0:     Size:   20477972480 bytes (19529 MB)
                Subdisks:        1
                State: up
                Organization: concat
                Part of volume var-mirror

Plex var-mirror.p1:     Size:   20477972480 bytes (19529 MB)
                Subdisks:        1
                State: up
                Organization: concat
                Part of volume var-mirror


6 subdisks:
Subdisk stripe-mirror.p0.s0:
                Size:      60019441664 bytes (57239 MB)
                State: crashed
                Plex stripe-mirror.p0 at offset 0 (0  B)
                Drive max3 (/dev/ad4e) at offset 135680 (132 kB)

Subdisk stripe-mirror.p0.s1:
                Size:      60019441664 bytes (57239 MB)
                State: up
                Plex stripe-mirror.p0 at offset 524288 (512 kB)
                Drive wd1 (/dev/ad8s1e) at offset 135680 (132 kB)

Subdisk stripe-mirror.p1.s0:
                Size:      60019441664 bytes (57239 MB)
                State: crashed
                Plex stripe-mirror.p1 at offset 0 (0  B)
                Drive max4 (/dev/ad6e) at offset 135680 (132 kB)

Subdisk stripe-mirror.p1.s1:
                Size:      60019441664 bytes (57239 MB)
                State: up
                Plex stripe-mirror.p1 at offset 524288 (512 kB)
                Drive wd2 (/dev/ad10s1e) at offset 135680 (132 kB)

Subdisk var-mirror.p0.s0:
                Size:      20477972480 bytes (19529 MB)
                State: up
                Plex var-mirror.p0 at offset 0 (0  B)
                Drive max1 (/dev/ad0s1e) at offset 135680 (132 kB)

Subdisk var-mirror.p1.s0:
                Size:      20477972480 bytes (19529 MB)
                State: up
                Plex var-mirror.p1 at offset 0 (0  B)
                Drive max2 (/dev/ad2s1e) at offset 135680 (132 kB)


For some reason, the drives came back as /dev/ad[46]e rather than
/dev/ad[46]s1e.
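
One way to spot this kind of mismatch is to scan the 'list' output for
device paths that have no slice component.  What follows is only a rough
Python sketch, not anything vinum provides: the function name
drives_without_slice and both regular expressions are assumptions made
for illustration, and the sample lines are copied from the output above.

```python
import re

# Matches the drive lines printed by 'vinum list', e.g.
# "D max3   State: up   Device /dev/ad4e   Avail: 0/57239 MB (0%)"
DRIVE_LINE = re.compile(r"^D\s+(\S+)\s+State:\s+\S+\s+Device\s+(\S+)")

# A device path with a slice component looks like /dev/ad4s1e;
# one without looks like /dev/ad4e.  (Assumed naming pattern.)
HAS_SLICE = re.compile(r"^/dev/[a-z]+\d+s\d+[a-h]$")


def drives_without_slice(list_output):
    """Return (drive, device) pairs whose device path has no slice part."""
    flagged = []
    for line in list_output.splitlines():
        match = DRIVE_LINE.match(line.strip())
        if match and not HAS_SLICE.match(match.group(2)):
            flagged.append((match.group(1), match.group(2)))
    return flagged


SAMPLE = """\
D max3                  State: up       Device /dev/ad4e        Avail: 0/57239 MB (0%)
D max4                  State: up       Device /dev/ad6e        Avail: 0/57239 MB (0%)
D max1                  State: up       Device /dev/ad0s1e      Avail: 0/19529 MB (0%)
D wd2                   State: up       Device /dev/ad10s1e     Avail: 0/57239 MB (0%)
"""

for name, device in drives_without_slice(SAMPLE):
    print(name, device)
```

On the sample lines this flags only max3 and max4, the two drives that
were recreated.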

I was thinking about recent changes to the system config, since
this setup had run reliably for several months off the same
sources, and it was rebooted about a week before this crash
happened.  Some new sysctls took effect after that reboot:

kern.ipc.somaxconn=2048
net.inet.icmp.drop_redirect=1
net.inet.icmp.log_redirect=1
net.inet.tcp.sendspace=32768
net.inet.tcp.recvspace=32768
vfs.vmiodirenable=1

Also, 2-3 days before the problems started occurring, I had
created some additional swap space on both ad4 and ad6.  Those
are the only low-level changes made to the system since it was built
from the running sources.

FreeBSD  4.4-RC FreeBSD 4.4-RC #0: Tue Aug 21 20:53:12 EDT 2001
root@whiskey.klatsch.org:/usr/obj/usr/src/sys/WHISKEY  i386

root@ [10:21pm][~]>>disklabel /dev/ad4s1
<snip>
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  b:  2097152 117226242      swap                       # (Cyl. 7296*- 7427*)
  c: 120053682        0    unused        0     0        # (Cyl.    0 - 7472*)
  e: 117226242        0     vinum                       # (Cyl.    0 - 7296*)

root@ [10:23pm][~]>>disklabel /dev/ad6s1
<snip>
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  b:  2097152 117226242      swap                       # (Cyl. 7296*- 7427*)
  c: 120053682        0    unused        0     0        # (Cyl.    0 - 7472*)
  e: 117226242        0     vinum                       # (Cyl.    0 - 7296*)
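
As a sanity check on the two labels above, the numbers can be compared
with what vinum reports.  This is only an illustrative Python sketch: it
assumes 512-byte sectors (an assumption, though it does reproduce the
drive size vinum printed), and the names in it are made up for the
example.  It checks whether the sector ranges of b and e intersect,
whether b stays inside c, and what size e works out to in bytes.

```python
SECTOR_BYTES = 512  # assumed sector size

# (size, offset) in sectors, copied from the disklabel output above
partitions = {
    "b": (2097152, 117226242),   # swap
    "c": (120053682, 0),         # whole slice
    "e": (117226242, 0),         # vinum
}


def end(part):
    """Return the first sector past the end of the partition."""
    size, offset = partitions[part]
    return offset + size


def overlaps(p, q):
    """True if the sector ranges of partitions p and q intersect."""
    return partitions[p][1] < end(q) and partitions[q][1] < end(p)


print("b overlaps e:", overlaps("b", "e"))
print("b fits inside c:", end("b") <= end("c"))
print("e size in bytes:", partitions["e"][0] * SECTOR_BYTES)
```

Going by these labels, b starts at the sector where e ends, so the two
ranges do not intersect, and e comes to 60019835904 bytes, the same drive
size that 'list -v' shows for max3 and max4.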

I'm happy to try anything that would assist you in tracking down 
the problem, and I can arrange for console access if it would be 
helpful.

Thanks, Greg.

-ben
