From owner-freebsd-questions  Tue Nov  5 12:16: 2 2002
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C23D237B401
	for <freebsd-questions@freebsd.org>; Tue,  5 Nov 2002 12:15:59 -0800 (PST)
Received: from palanthas.neverending.org (palanthas.neverending.org [167.206.208.232])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 16A8C43E6E
	for <freebsd-questions@freebsd.org>; Tue,  5 Nov 2002 12:15:59 -0800 (PST)
	(envelope-from ftobin@neverending.org)
Received: by palanthas.neverending.org (Postfix, from userid 1000)
	id 5406720527; Tue,  5 Nov 2002 15:16:02 -0500 (EST)
Received: from localhost (localhost [127.0.0.1])
	by palanthas.neverending.org (Postfix) with ESMTP id 5257920526
	for <freebsd-questions@freebsd.org>; Tue,  5 Nov 2002 15:16:02 -0500 (EST)
Date: Tue, 5 Nov 2002 15:16:02 -0500 (EST)
From: Frank Tobin <ftobin@neverending.org>
To: freebsd-questions@freebsd.org
Subject: vinum: recovering from hd read errors
Message-ID: <Pine.LNX.4.44.0211051436450.9987-100000@palanthas.neverending.org>
X-Bogus: aaron7@neverending.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-questions.FreeBSD.ORG>

Note: This post is a long shot.  I'm hoping someone else has encountered a
similar issue and can point out any tricks/traps that I should avoid.  It
takes a while to get to the actual question, but a lot of preparatory info
is needed.

On a legacy machine running FreeBSD 4.0-CURRENT from about Aug 26 1999, I
have a vinum volume with two plexes, each with a single subdisk.  (There
are other volumes, but they are 'okay' and probably not relevant.)  One of
the subdisks is on a hd with read errors.

The volume's name is 'data', and its subdisk data.p0.s0 is on /dev/da1,
the physical drive with read errors.  To be exact, the relevant pieces of
vinum 'l -v' output (with some notes) are:

Drive drive2:	Device /dev/da1s1e
*** this is the drive with read errors ***
		Created on gouda.netmonger.net at Sat Aug 28 02:32:14 1999
		Config last updated Mon Nov  4 14:11:15 2002
		Size:       9105023488 bytes (8683 MB)
		Used:       9104921088 bytes (8683 MB)
		Available:      102400 bytes (0 MB)
		State: up
		Last error: none
		Free list contains 1 entries:
		   Offset	     Size
		 17783049	      200

Volume data:	Size: 9104785408 bytes (8683 MB)
*** this is the volume that has data I want to recover ***
		State: down
		Flags: open 
		2 plexes
		Read policy: round robin

Plex data.p0:	Size:	9104785408 bytes (8683 MB)
		Subdisks:        1
		State: faulty
		Organization: concat
		Part of volume data

Plex ex-data.p1:	Size:	9104785408 bytes (8683 MB)
*** unknown how long this has been initializing, maybe months; ***
*** possibly a config screwup ***
		Subdisks:        1
		State: initializing
		Organization: concat
		Part of volume data

Subdisk data.p0.s0:
*** this is the crashed subdisk, on top of a hd with read errors ***
		Size:       9104785408 bytes (8683 MB)
		State: crashed
		Plex data.p0 at offset 0 (0  B)
		Drive drive2 (/dev/da1s1e) at offset 135680 (132 kB)

Subdisk data.p1.s0:
*** because the plex has likely been initializing forever, ***
*** this data is likely worthless ***
		Size:       9104785408 bytes (8683 MB)
		State: obsolete
		Plex ex-data.p1 at offset 0 (0  B)
		Drive drive3 (/dev/da3e) at offset 135680 (132 kB)
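
As a sanity check on the listing above, the figures for drive2 and
data.p0.s0 can be cross-checked with a few lines of Python.  This is only
a sketch: it assumes the free list is expressed in 512-byte sectors, which
the listing does not state outright (though 200 sectors would account for
the 102400 bytes shown as available).  The variable names are mine.

```python
# Figures copied from the 'vinum l -v' output above.
SECTOR = 512                     # assumption: free-list units are 512-byte sectors
drive_used = 9104921088          # drive2 "Used", in bytes
drive_available = 102400         # drive2 "Available", in bytes
free_offset_sectors = 17783049   # drive2 free list: Offset
free_size_sectors = 200          # drive2 free list: Size
sd_offset = 135680               # data.p0.s0 offset on drive2, in bytes
sd_size = 9104785408             # data.p0.s0 size, in bytes

# Where the subdisk starts and ends on the underlying partition.
sd_first_sector = sd_offset // SECTOR
sd_end = sd_offset + sd_size

print("subdisk starts at sector", sd_first_sector)
print("subdisk ends at byte", sd_end)
print("end matches 'Used':", sd_end == drive_used)
print("free list begins where subdisk ends:",
      free_offset_sectors * SECTOR == sd_end)
print("free list size matches 'Available':",
      free_size_sectors * SECTOR == drive_available)
```

If these checks hold, the subdisk occupies one contiguous range of
/dev/da1s1e that begins 135680 bytes in and runs up to the free space at
the end of the drive.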


Let me be the first to point out that I fully acknowledge that this setup
and circumstances are quite weird.  Inherited problems are always the most
interesting/trying.

The underlying device for the subdisk data.p0.s0, /dev/da1, has suffered
read errors.  Furthermore, it's quite possible that the other plex for the
data volume, ex-data.p1, has been 'initializing' for an extremely long
time (the person who set this up tried to do mirroring, but gave up, and
left it in a 'working' state; hence the ex-data name).  It's quite unclear
how long the ex-data.p1 plex has been 'initializing'; it's possibly a
config screwup.

Anyway, as you can see, subdisk data.p0.s0 is 'crashed'.  I've copied off
as much as possible by dd'ing its underlying device, /dev/da1s1e.

There are two approaches I can currently take to try to recover data.
The first, described in another mail to this list, is to find the offset
into the vinum partition at which the 'real' data starts, so that it might
possibly be mounted as a FreeBSD filesystem.

The second approach, and the real question of this message, is to run
'vinum start' and try to bring the filesystem back up, so that I can glean
as much as possible through a filesystem interface.  I'm worried, though,
that vinum's auto-config will automagically wipe something critical, so
that this data cannot be recovered.  Still, this seems to be one of my few
options.

So, is vinum's 'start' the thing to do in my circumstance?

-- 
Frank Tobin			http://www.neverending.org/~ftobin/

