Date:        Fri, 31 Jul 1998 21:16:23 +0000 (GMT)
From:        Terry Lambert <tlambert@primenet.com>
To:          mystify@wkstn4-208.lxr.georgetown.edu (Patrick Hartling)
Cc:          freebsd-fs@FreeBSD.ORG
Subject:     Re: Trying to recover lost file
Message-ID:  <199807312116.OAA29689@usr09.primenet.com>
In-Reply-To: <199807311235.IAA01340@wkstn4-208.lxr.georgetown.edu> from "Patrick Hartling" at Jul 31, 98 08:35:18 am
> Last night, I did something phenomenally dumb and, to make a long story
> short, lost everything changed in my home directory between June 19 and
> last night.  It was all in a tar file on a Jaz disk, and instead of
> restoring it, I removed it.  :(  I realize this is a very tall order,
> but is there anything I can do to get the file back?  I realize that
> the general principle is that "once it's gone, it's gone," but I am
> very willing to spend hours, days, weeks, etc. trying to get this tar
> file back.  It contains, among many other things, work I've done for
> my graduate research that I'd really rather not try to do over again
> if I can avoid it (even if it means spending more time trying to get
> it back than it took to do it in the first place).
>
> So, here's my situation.  The file system on the Jaz disk has not been
> modified since I removed the tar file.  I dd'd the entire file system
> to a file just to be safe.  Running more(1) on that file shows that at
> least the file name of the deleted file is still in the file system in
> some form.  A friend pointed me at fsdb(8), and I did an experiment
> with /usr/obj wherein I dd'd /dev/zero to a file for a couple of
> seconds, figured out which inode that file was at, removed it, then
> went to that inode to see what information was there.  Everything
> looked the same, so now I am wondering what, if anything, can be done
> to "restore" that file?  My file system skills and knowledge are poor
> at best, and some of what I've said here may sound ridiculous, but I
> am desperate enough to go through all 126,000+ inodes until I find
> something that looks vaguely like what I'm looking for (thank goodness
> for libedit(3)!).

Please tell me you were mounted sync, and tell me you didn't create any
files on the drive after you did this!

The first thing to do is to find the inode of the file that was deleted.

To do this, you need to read-only mount a copy of the disk image, and go
to the directory.  If it was in /, you won't need the mount, since you
will know that the root inode is inode #2.

Go through the raw directory blocks.  So long as you did not create any
new files, the record will still be there.  If the file was not the
first file in a directory block, then the inode number of the file will
still be there.  If it was the first, then you will need to go
"fishing".

If you need to go fishing, then you need to go through the raw data
blocks looking for a .tgz file signature:

00000000  1f 8b 08 00 61 67 d6 34  00 03 ed 5a 4b 53 e3 38  |....ag.4...ZKS.8|

(this one is for a "gzip compressed data, deflated" file; see the file
/usr/share/misc/magic for the range of possible numbers).

Given the way tar and gzip operate on the input and output, this will
probably be the first of a set of contiguous blocks of data.  At a
minimum, the size of an FS block.

Knowing your block size at this point would be a good thing.
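If you do end up fishing, a little scanner like the following will turn
up candidate fragments.  This is just a sketch written for this message,
not one of the old tools I mention below; the 1024 byte FSIZE is the
"fsize" from the dumpfs output further down, so substitute your own, and
run it against the dd image you made:

#include <stdio.h>
#include <stdlib.h>

#define FSIZE	1024		/* frag size; "fsize" in dumpfs output */

int
main(int argc, char **argv)
{
	unsigned char buf[FSIZE];
	unsigned long frag = 0;
	FILE *fp;

	if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL) {
		fprintf(stderr, "usage: %s fs-image\n", argv[0]);
		exit(1);
	}
	while (fread(buf, 1, FSIZE, fp) == FSIZE) {
		/* gzip "deflated" signature; see /usr/share/misc/magic */
		if (buf[0] == 0x1f && buf[1] == 0x8b && buf[2] == 0x08)
			printf("candidate at frag %lu (byte offset %lu)\n",
			    frag, frag * (unsigned long)FSIZE);
		frag++;
	}
	fclose(fp);
	return (0);
}

Each hit's byte offset divided by 512 gives you a skip= value for the dd
command below (512 is dd's default block size).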
Use the "tunefs" and "dumpfs" programs on the raw device, e.g.:

# tunefs -p /dev/rsd0a
tunefs: maximum contiguous block count: (-a)                 7
tunefs: rotational delay between contiguous blocks: (-d)     0 ms
tunefs: maximum blocks per file in a cylinder group: (-e)    2048
tunefs: minimum percentage of free space: (-m)               8%
tunefs: optimization preference: (-o)                        time

# dumpfs /dev/rsd0a
magic   11954   time    Fri Jul 31 20:35:34 1998
cylgrp  dynamic inodes  4.4BSD
nbfree  2220    ndir    63      nifree  6796    nffree  57
ncg     2       ncyl    32      size    32768   blocks  31759
bsize   8192    shift   13      mask    0xffffe000
fsize   1024    shift   10      mask    0xfffffc00
frag    8       shift   3       fsbtodb 1
cpg     16      bpg     2048    fpg     16384   ipg     3840
minfree 8%      optim   time    maxcontig 7     maxbpg  2048
rotdelay 0ms    rps     60
ntrak   1       nsect   2048    npsect  2048    spc     2048
symlinklen 60   trackskew 0     interleave 1    contigsumsize 7
nindir  2048    inopb   64      nspf    2       maxfilesize 549755813888
sblkno  16      cblkno  24      iblkno  32      dblkno  512
sbsize  2048    cgsize  4096    cgoffset 1024   cgmask  0xffffffff
csaddr  512     cssize  1024    shift   9       mask    0xfffffe00
cgrotor 0       fmod    0       ronly   0       clean   0
(no rotational position table)
cs[].cs_(nbfree,ndir,nifree,nffree):
        (1179,53,3572,6) (1041,10,3224,51)
[ ... ]

For my example, my block size is 8192 -- 8k.  This means that after you
find the signature, you will have ~16 disk blocks of information (one 8k
FS block), minimally, to start recovery.

8k is enough to partially decompress:

# dd if=/dev/rsd0a skip={block offset} count=16 of=foo
16+0 records in
16+0 records out
8192 bytes transferred in 0.009735 secs (841501 bytes/sec)
# cat foo | gunzip -f > foofoo
gunzip: stdin: unexpected end of file
# file foofoo
foofoo: GNU tar archive
# tar tvf foofoo
[ ... verify this is your data ... ]

Once you find the data, then you need to find the data blocks that
reference it, and the inode that references them.  Knowing the size the
file was, so you know whether or not it used indirect blocks, would be
useful.

Of course, if it wasn't the first entry in the directory, this is what
happened when you deleted the file:

Before:

[          ][          ][          ]
 `--------->`---------->`--------------------------->

After:

[          ][          ][          ]
 `--------------------->`--------------------------->
             ^
             |

In other words, the inode number is still there, and it's easier to
work down than up...

Unfortunately, all the tools I cobbled together to do this the last
time I shot my foot off are on 6525 QIC tape, and they apply to the UFS
in SunOS 4.1.3u1, and don't know anything about indirect blocks using
negative offsets, etc..

One thing to note: free space is coalesced on creates.  This means that
if you created any files in the directory, you may have destroyed the
directory entry identification of the inode number.  If so, you get to
grovel the disk (whee!).  This is probably what fsdb should be capable
of, but it's not.

In any case, once you get a set of block lists, you are home free.

The use of compression makes things harder on you; it is easy to
identify the next block of data in a tar archive by using tar to test.
It is harder to test using gzip for the next block of gzip data.  You
can either brute-force it, or you can hack up gzip.

The moral of this story: keep important things unzipped if your backup
storage is not a linear archival format (i.e., an FS is not linear).
If you must use compression, use a deterministic algorithm.  Either
specify it, or use UNIX compress, not gzip, so that if this happens,
you don't have to grovel 20+ magic numbers.

In any case, that should get you started.
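To make the directory picture above concrete, here is a rough sketch of
walking a raw directory block and printing anything hiding in the slack
behind the live entries.  I wrote it for this message (it is not one of
the tools on that QIC tape); the struct is a hand copy of the 4.4BSD
directory entry header rather than an include of <ufs/ufs/dir.h>, and
the names are my own:

#include <stdio.h>
#include <stdlib.h>

#define DIRBLKSIZ	512
#define MAXNAMLEN	255

struct dhdr {			/* header of a 4.4BSD directory entry */
	unsigned int   d_ino;
	unsigned short d_reclen;
	unsigned char  d_type;
	unsigned char  d_namlen;
	/* name follows, NUL terminated, padded to a 4 byte boundary */
};

/* bytes actually used by an entry with a name of the given length */
#define ENTSIZE(namlen)	(sizeof(struct dhdr) + (((namlen) + 1 + 3) & ~3))

int
main(int argc, char **argv)
{
	unsigned char blk[DIRBLKSIZ];
	FILE *fp;

	if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL) {
		fprintf(stderr, "usage: %s raw-directory-data\n", argv[0]);
		exit(1);
	}
	while (fread(blk, 1, DIRBLKSIZ, fp) == DIRBLKSIZ) {
		unsigned int off = 0;

		while (off < DIRBLKSIZ) {
			struct dhdr *dp = (struct dhdr *)(blk + off);
			unsigned int used, slack;

			if (dp->d_reclen == 0 ||
			    off + dp->d_reclen > DIRBLKSIZ)
				break;		/* garbage; give up on block */
			used = ENTSIZE(dp->d_namlen);
			/*
			 * A deleted entry hides in the slack between the
			 * end of the live entry and the end of its (now
			 * enlarged) d_reclen.
			 */
			for (slack = used;
			    slack + sizeof(struct dhdr) <= dp->d_reclen;
			    slack += 4) {
				struct dhdr *dead =
				    (struct dhdr *)(blk + off + slack);

				if (dead->d_ino != 0 &&
				    dead->d_namlen > 0 &&
				    dead->d_namlen <= MAXNAMLEN &&
				    slack + sizeof(struct dhdr) +
				    dead->d_namlen <= dp->d_reclen)
					printf("candidate: ino %u  name %.*s\n",
					    dead->d_ino, dead->d_namlen,
					    (char *)(dead + 1));
			}
			off += dp->d_reclen;
		}
	}
	fclose(fp);
	return (0);
}

dd the directory's data blocks out of the image and feed them to it;
any hit with a sane name and inode number is a candidate to chase.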
If you end up hacking tools out of this, the tools should be smart
enough to avoid all allocated space.  For a partially filled disk, this
cuts down the work immensely.

You may also want to make an offset-on-device device; either that, or
hack the magic program to be able to start it at non-zero offsets (a
throwaway reader along these lines is sketched below).  Part of the
problem with the magic (file) program is that its data format in the
file is rather inflexible to arbitrary application of the information,
and the license on the source code makes it a rewrite to add the
capability in, if you want to distribute the resulting code.

Let me know if you have any FS layout specific questions that you can't
get answers on from the header files.
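For the offset-on-device idea, a throwaway reader is only a few lines
of C (again, just a sketch for this message, not an existing tool).  It
copies an arbitrary byte range of an image or device to stdout so
file(1), gunzip, or tar can be pointed at any spot on the disk; on the
raw character device, keep the offset and count multiples of the sector
size, or just run it against your dd image:

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	char buf[8192];
	off_t offset;
	long count, done;
	ssize_t got;
	int fd;

	if (argc != 4) {
		fprintf(stderr,
		    "usage: %s image-or-device byte-offset byte-count\n",
		    argv[0]);
		exit(1);
	}
	offset = (off_t)strtoul(argv[2], NULL, 0);	/* 0x... is ok */
	count = atol(argv[3]);
	if ((fd = open(argv[1], O_RDONLY)) < 0) {
		perror(argv[1]);
		exit(1);
	}
	if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
		perror("lseek");
		exit(1);
	}
	/* copy the requested range to stdout for file(1), gunzip, tar... */
	for (done = 0; done < count; done += got) {
		long want = count - done;

		if (want > (long)sizeof(buf))
			want = (long)sizeof(buf);
		if ((got = read(fd, buf, (size_t)want)) <= 0)
			break;
		(void)write(STDOUT_FILENO, buf, (size_t)got);
	}
	close(fd);
	return (0);
}

# ./offread jaz.image {byte offset} 8192 | gunzip -f > foo

("offread" is just whatever you name the compiled sketch.)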
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message