Date: Mon, 27 Oct 2003 08:42:25 -0800
From: David Wolfskill <david@egation.com>
To: freebsd-isp@freebsd.org
Subject: Re: restoring dumps from crashed drive
Message-ID: <20031027164225.GA361@frecnocpc2.noc.egation.com>
In-Reply-To: <DBEIKNMKGOBGNDHAAKGNCEDOGFAC.dave@nexusinternetsolutions.net>
References: <DBEIKNMKGOBGNDHAAKGNCEDOGFAC.dave@nexusinternetsolutions.net>
On Mon, Oct 27, 2003 at 08:26:49AM -0500, Dave [Nexus] wrote:
> recently had an unfortunate incident where a hard drive crashed after
> only 8 months of service. Total loss of data on the drive even after
> sending it in to a recovery company.
>...
> We had boot disks, and the thought was to build a base installation,
> mount the backup drive (secondary hard drive), then simply run restore
> over the various partitions. Some of the problems we ran into were:
> - unable to copy various system files, kernel, etc...
> - restore being unable to find files and trees referred to by symbolic
>   link (which at first I figured would be solved by simply running it
>   twice once the files were there to be linked to)
> - and other peculiarities.
> Bottom line is we ended up ditching it, installing 4.8, cvsup to 4.9,
> then rebuilding the server by hand, and copying user data over. We are
> still trying to get database files restored, which are problematic
> because of the massive changes in MySQL and PostgreSQL since the
> previous versions.

The above list of modes of failure strikes me as unexpected, at best.

> Aside from the nice dump/restore examples, does anyone have a
> real-world situation where they could discuss the procedures they used
> to restore a server from backup, assuming total loss of the primary
> drive.

Certainly.

By its nature, dump requires a nearly incestuous relationship with the
type of file system it's reading; on the other hand, if the file system
has capabilities that more general utilities (e.g. tar or cpio) may not
be aware of -- such as "flags" (cf. "man chflags") -- a more
file-system-specific tool is the appropriate one to use.

My backups at home are done with dump (transported via ssh); I have
recovered from failed boot drives on a couple of FreeBSD systems and a
Solaris (2.6) system via those backups.

For the FreeBSD systems, I set them up to boot from either slice 1 or
slice 2 (so I have both / and /usr on those slices, and /var and
"everything else" -- including swap -- on the 3rd slice). In these
cases, I do a minimal install on slice 2, boot from slice 2, then
restore to slice 1.

In the case of the Solaris system, I still had a flaky, but
marginally-serviceable, disk drive from which I could boot; I put the
new drive in the other position (this was on a SPARCstation 5),
partitioned the new drive, created the file systems, then restored the
data.

The reason for setting up the FreeBSD systems to boot from either of 2
slices, however, is not to facilitate such recovery (though it does do
that); rather, it is to make fairly frequent upgrades (while preserving
an ability to fall back to a reasonably well-known system). I use a
"dump | restore" pipeline to copy the file systems from the active slice
to the inactive one, then boot from the newly-written slice. I then do
the "make installkernel && mergemaster -p && make installworld &&
mergemaster" sequence in place on the (now-active) slice -- I use a
different (and faster) machine to do the builds, both for the world
(including the sendmail configs) and the kernels.

(I note, too, that I typically have /usr mounted read-only except during
upgrades. I tried mounting / read-only a few years ago, but seem to
recall ssh having significant problems with that ... and since a couple
of the boxes I care about run headless, breaking the ability to use ssh
to access them wasn't exactly high on my list of "fun things to do."
Despite that, / doesn't tend to be a very active file system on boxes I
run -- except during upgrades, of course.)
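As a rough sketch of the slice-copy and in-place upgrade described
above -- the device names (ad0s2a, ad0s2e), mount points, and kernel
config name below are illustrative assumptions, not details taken from
the message -- the sequence might look something like this:

  # Prepare the inactive slice's partitions and mount them under /mnt.
  # (Device names are examples only.)
  newfs /dev/ad0s2a
  newfs /dev/ad0s2e
  mount /dev/ad0s2a /mnt
  mkdir -p /mnt/usr
  mount /dev/ad0s2e /mnt/usr

  # Copy the active / and /usr onto the inactive slice.
  dump -0af - /    | ( cd /mnt     && restore -rf - )
  dump -0af - /usr | ( cd /mnt/usr && restore -rf - )

  # After booting from the newly-written slice, install the bits built
  # elsewhere (here /usr/src and /usr/obj are assumed to be available,
  # e.g. NFS-mounted from the build machine).
  cd /usr/src
  make installkernel KERNCONF=MYKERNEL   # kernel config name is an example
  mergemaster -p
  make installworld
  mergemaster

The point of doing the install steps on the newly-written slice is that
the previously-active slice is left untouched as the fallback.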
Since I track -STABLE on my laptop (thus getting a "feel" for just how
"stable" it is for my usage), I tend to do these upgrades -- at home --
about every couple of weeks. (If there are circumstances that justify a
more frequent schedule, such as problems with SSL, I'll do that; if it
is my perception that -STABLE isn't suitably "stable" for my use, I'll
hold off for a week or so.)

I confess that I have yet to implement that (or a similar) scheme here
at work, though the new machines I've put into production do get set up
to support it. But I just got started here.... :-}

So I'm sorry to read of your "tale of woe," but find myself puzzled as
to how it happened. I cannot help but recommend, though, that anyone
doing (or planning) backups actually *test* the ability to use those
backups from time to time.
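On that note, a minimal sketch of the sort of ssh-transported dump
mentioned earlier, plus a periodic test restore into a scratch
directory; the host name, file names, and block size here are
assumptions for illustration only:

  # Level-0 dump of /usr, piped over ssh to a file on the backup host.
  # ("backuphost" and the destination path are made-up examples.)
  dump -0af - /usr | ssh backuphost dd of=/backups/host.usr.dump0 bs=64k

  # Periodically prove the backup is actually usable: list its table of
  # contents, then extract it into an empty scratch directory.
  ssh backuphost dd if=/backups/host.usr.dump0 bs=64k | restore -tf -
  mkdir /tmp/restore-test && cd /tmp/restore-test
  ssh backuphost dd if=/backups/host.usr.dump0 bs=64k | restore -rf -

The exact commands matter less than making the test restore a routine
habit, so that the first full-scale restore after a dead drive isn't
also the first time the backups have ever been read back.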
Peace,
david
--
David H. Wolfskill                              david@egation.com