Date:      Wed, 26 Aug 2009 20:07:41 +0200
From:      Roland Smith <rsmith@xs4all.nl>
To:        Kelly Martin <kellymartin@gmail.com>
Cc:        FreeBSD Questions <freebsd-questions@freebsd.org>
Subject:   Re: hard disk failure - now what?
Message-ID:  <20090826180741.GA23120@slackbox.xs4all.nl>
In-Reply-To: <1338880b0908252246s21191e83k7c251366b706532@mail.gmail.com>
References:  <1338880b0908241129p75b6845cg26d21804e118364@mail.gmail.com> <20090824223247.GD43410@slackbox.xs4all.nl> <1338880b0908252246s21191e83k7c251366b706532@mail.gmail.com>

On Tue, Aug 25, 2009 at 11:46:50PM -0600, Kelly Martin wrote:
> plugging the drive in and accessing it, I heard those tell-tale signs
> of hard drive failure: clicks and pops and other unusual noises, so I
> know that it has some damage. I hate those sounds, having heard them
> on failing drives too many times before.

If the drive is that bad, it is doubtful whether dd or ddrescue will be able
to get a good copy.

> >> My question: what kind of checks and/or repair tools should I run on
> >> the damaged drive after it's mounted?
> >
> > As others have mentioned, first make a copy (with the disk unmounted) of the
> > partitions on that disk with dd, saving them to another drive. That way you
> > can experiment with the data without further deterioration of the
> > original.
>
> I ran dd and it took over 20 hours to complete. In fact it just
> finished this evening, after running all day. Lots of FAILURE errors
> were reported along the way, enough to fill two console screens or
> more. And of course to complicate things I didn't have a spare drive
> as an output device that was the *same size*, so I used a smaller
> drive thinking that it wouldn't matter since the source drive wasn't
> full anyway. I have no idea if data is scattered around on the FFS
> filesystem such that cloning a mostly empty, larger drive onto
> something smaller might lose data... I searched Google and couldn't
> find the answer, so I proceeded anyway. It doesn't matter now though,
> as I have a new drive now and another plan.

Using dd you make a block-for-block copy; dd doesn't know about filesystems,
so it copies unused blocks as well, and anything stored beyond the size of a
smaller target drive is lost. You could pipe the output from dd through a
compression program like gzip or bzip2. That could yield a smaller image. But
you'd have to uncompress it in order to use it.
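
A minimal sketch of such a pipeline, assuming a POSIX shell with dd and gzip
available. The function names and the paths in the comments are only
placeholders for illustration; the source would be the unmounted device.

```shell
# image_compressed SRC DEST
# Block-for-block copy of SRC, compressed with gzip on the way to DEST.
# SRC would be e.g. /dev/ad4; DEST a hypothetical path like /backup/ad4.img.gz.
image_compressed() {
    dd if="$1" bs=65536 | gzip -c > "$2"
}

# restore_image IMG DEST
# Uncompress IMG again so that the raw image can be used.
restore_image() {
    gzip -dc "$1" > "$2"
}

# Example (placeholder names):
#   image_compressed /dev/ad4 /backup/ad4.img.gz
#   restore_image /backup/ad4.img.gz /backup/ad4.img
```

How much smaller the image gets depends on the contents; blocks that were
never written or that hold text compress well, already compressed data does
not.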

Or you could try just copying the filesystems separately. E.g. copy from
ad4s1f instead of the whole ad4. That way you can split the data over several
files which you can store in different places.
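
As a rough sketch of copying the partitions one by one: the function name,
the output directory and all partition names except ad4s1f are made up for
illustration, and error-handling options for dd are left out, so this stops
at the first partition that cannot be read completely.

```shell
# image_partitions OUTDIR DEV...
# Copy every listed partition device into its own image file in OUTDIR,
# named after the device (e.g. ad4s1f.img).
image_partitions() {
    outdir=$1
    shift
    for dev in "$@"; do
        dd if="$dev" of="$outdir/$(basename "$dev").img" bs=65536 || return 1
    done
}

# Example (placeholder names):
#   image_partitions /backup /dev/ad4s1a /dev/ad4s1d /dev/ad4s1f
```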

> I'm going to try dd a second time, but this time I'll use ddrescue as
> some people suggested and I'll make the target drive an
> identical-sized 500 Gbyte drive, which I purchased today. I imagine it
> will take a long time to create this cloned disk... hopefully with
> fewer errors than dd gave me, though we'll see.

I hope you get a good copy, but it doesn't sound too likely. I'm not a hardware
expert, but if the disk is really breaking down in the hardware or
electronics, it is not inconceivable that even reading might further
deteriorate it. If you do not get a good 1:1 copy, you'll have extra errors in
your data! Depending on the options you give dd, it will either skip blocks
with errors or fill them with zeroes or other characters. See the part of the
dd manual page that describes the 'noerror' conversion.
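
A small sketch of the combination usually used for this, assuming a dd that
supports the 'noerror' and 'sync' conversions (both the BSD and GNU versions
do). The function name and the device names in the comment are placeholders.

```shell
# rescue_copy SRC DEST
# noerror: keep going after a read error instead of stopping.
# sync:    pad every short or failed input block to the full block size
#          (with zero bytes here), so later data keeps its original offset.
rescue_copy() {
    dd if="$1" of="$2" bs=512 conv=noerror,sync
}

# Example (placeholder names):
#   rescue_copy /dev/ad4s1f /backup/ad4s1f.img
```

With 'sync' the padding is a whole block, so a small block size loses less
data around each bad spot, at the cost of a slower copy. Without 'sync' the
bad blocks are simply missing from the output and everything after them
shifts.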

> Indeed some of the partitions seem to be beyond repair. In particular
> my /var partition is totally fubar'ed. When using fsck_ffs I got all
> sorts of errors when trying to repair the partition, things like:
>
> BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE
> So I used the -b option suggested in the man page, "fsck_ffs -y -b 160
> /dev/ad0s1d" and it ran and fixed a few things, but then stopped with
> the following error:
>
> fsck_ufs: cannot alloc 4294967292 bytes for inoinfo

The meaning of these error messages is explained in Appendix A of "Fsck - The
UNIX File System Check Program". You can find it as
/usr/share/doc/smm/03.fsck/paper.ascii.gz

> MySQL databases are normally stored in /var/db/mysql. But then I
> remembered my MySQL server was actually running in a Jail environment,
> and therefore it was located at /usr/jails/myjail/var/db/mysql instead
> of /var/db/mysql, and therefore the jailed MySQL database was on a
> totally different partition. Lucky! And I was also very lucky that I
> could mount the large /usr partition in read-only mode and copy off
> the most critical files I needed, starting with the database. No
> errors on that part of the disk so far, at least with the few critical
> files I've copied over. Whew!

Congratulations!

> Until just a few minutes ago I didn't think there'd be a happy ending.
> But I've got the most critical data copied over now, the rest can
> wait. I'm going to go run dd a second time (well, ddrescue) now and
> then start work on the copy once it finishes, in a day or two.

Time to start thinking about a solid backup strategy as well. :-)


Roland
-- 
R.F.Smith                                   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)
