Date: Tue, 30 Apr 2002 09:27:17 -0700
From: Terry Lambert <tlambert2@mindspring.com>
To: Joshua Steele <jsteele@CodefusionIS.com>
Cc: Michael Sierchio <kudzu@tenebras.com>, freebsd-fs@freebsd.org
Subject: Re: newfs overwrite...
Message-ID: <3CCEC5E5.FED0CBF@mindspring.com>
References: <20020429121106.V97112-100000@lilly>
Joshua Steele wrote:
> Well..this was the backup/storage server. I contacted drivesavers, and
> its going to be about 7,000.00US to get it fixed by them...which is not an
> option because i do not have that much in resources to get the drive fixed
> (i am a small business)
>
> Are there any other tools, etc. for freebsd that aide in rebuilding the fs
> table? Or am i basically not going to be able to repair the drive, and
> might as well move on and start salvaging what financial data i do have at
> the current time before the tax quarter is up....

Buy a much-larger-than-60G disk (preferably, more than twice as
large), and:

1)  dd the image of the 60G disk into a single file

    (Note: not really necessary, but it prevents you from screwing
    up your "live" disk.)

2)  Start copying out chunks of data based on cylinder groups and
    identification of secondary indirect blocks

The data better be *really* valuable, as this is a manual, labor
intensive operation.  If it's recognizable to a human, then you are
going to be doing a lot of looking; if it's not, you are going to be
using the remainder of the disk space to write some programs to
recover particular types of file contents.

It's always easiest if the drive is human readable.  I recovered a
good 250,000 lines of source code from a spammed drive this way, at
one time in my misspent youth, so that the project, due a couple of
days after the fact, would not be turned in late.

The main problem is that when you delete a file, the physical analogy
is to take the contents out of the file folder, rip the label off the
file folder, and then shuffle the pages that were in the folder into
your blank printer paper (knowing that the printer will erase them
before it prints on them), after which you throw the file folder back
into the supply cabinet.

You've basically done this with all your files.

If the papers don't contain binary information (e.g. the moral
equivalent of encrypted data, in terms of being able to identify
which piece of paper goes in which file folder, or which piece of
paper goes in what order), then it's just a big sorting job.

If it's binary data, you can basically perform an iterative search
based on your knowledge of the contents, in order to recover the
data.  For an executable, this is probably not worthwhile (you can
always replace it), but identifying "magic numbers" for things like
PostScript, ELF executables, etc., is actually very easy; the
remainder of the file, less so.

The other hint you have is that every set of 9 pages in large file
folders is "stapled together" -- members of the same cylinder group.

If you have a rough idea of the FS size (which you do), then
examining the post-newfs disk read-only will tell you where all the
FS layout information lives.  From this, you can probably recover
directory information pretty easily, which can give you inode and
relative cylinder group information; doing this requires a fairly
deep understanding of the FS in question.

The drive recovery place might be a deal.  Basically, they copy the
normally readable data off the disk, and then read the disk, taking
head hysteresis into account, to recover the misaligned track writes,
if any, and so recover the data (which is why MILSPEC erasure
requires the writing of patterned data to the disk, from both seek
directions, to achieve erasure of "secret" data).

From a theoretical standpoint:

o   Everything above is predicated on the idea you are using FFS.
    If you use another FS, the recovery details become very much
    easier or very much harder, depending on the FS.

o   It's pretty trivial to change the process to lazy-bind the
    contents of deleted information, so that instead of writing
    zero'ed inodes to the disk, you leave the index information
    intact, and only zero it on reallocation; this makes undeleting
    files a lot easier, because it doesn't put the unlabelled file
    folder back into the file cabinet.
    It also leaves the papers in the folder, though they are
    available for the printer to grab and clear at random, if it's
    asked to print (saving new files to the FS may overwrite
    "deleted" data).  This would be a rather simple operation for
    FFS, actually.

o   It's also pretty trivial to change it so that formatting
    actually scrubs the disk, and then deletion also scrubs the
    disk.  In combination, this would be a bad thing, but
    separately, it would allow you to recover a lot of data much
    more quickly, by being able to rule out large amounts of disk
    space from consideration.

o   It's pretty trivial to change the formatting process to resemble
    the Windows formatting process, which means that the newfs can
    be made largely reversible.  This is probably a pretty good
    idea for general small businesses like yours, actually.  No one
    has seriously attempted to productize UNIX, yet... not even
    Univel, back in the day.

    Anecdote time: One thing we often did at the local university
    any time a machine was donated was to first undelete everything,
    and see if there were games on the disks.  The FS layout helped
    us considerably.  This was before doing such things was
    considered illegal.

o   If you are depending on the data being unrecoverable merely
    because you format the disk... it's not going to happen...

o   The data is always recoverable.  The speed and time are a matter
    of the effort you are willing to expend.  Depending on the
    unrecoverability of the data is a losing proposition, unless
    it's encrypted, and if it's something like DES, using "the
    crypt breaker's workbench" makes it pretty trivial to recover
    the data, as well.

o   Having some of the financial data on hand in a format that
    allows recreation of partial data gives you enough information
    that you can probably eliminate the data.

There are some things that can make a disk unrecoverable, but they
all require the use of cryptographic mechanisms.  If you have used a
good one on your financial data on the disk... it's time to start
over entering the data.

If you have time pressure on you right now, spend the money.  If you
have some leeway, then recover the data the slow way, and if it's not
panning out, then spend the money before the time-to-recover window
closes.

You might also look at this as an opportunity to build the tools
needed to recover the data more quickly.  It's actually not that
difficult to build such tools, and you have a test image (now) that
is a relatively expensive thing to create.  8-(.

Frankly, any time I've done this to a disk, I've always been most
concerned with a small subset of the data, not the whole disk, so the
recovery was simultaneously much easier and much less labor intensive
in total; like a linear search, I could stop after only having
examined about 50% of my total data set.  It also means that all the
tools I wrote for the job were so small that I just threw them away
when I was done with them (i.e. I didn't archive them for posterity,
but I also didn't actively seek to get rid of them; they just got
backed up on tape and ignored, over time).

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message
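
The "magic number" identification described in the message can be
sketched roughly as follows.  This is not one of the tools Terry
mentions (those were not kept); it is a minimal Python illustration
that assumes the disk has already been copied into a single image
file with dd, and that file contents begin on 512-byte boundaries.
The function name scan_image, the signature table, and the block
size are all hypothetical choices made for this example.  It only
finds candidate starting offsets; working out the rest of each file
is the hard part, as the message says.

```python
# Signatures for the two formats named in the message. Both are the
# well-known leading bytes of those file types.
MAGIC = {
    b"%!PS": "PostScript",
    b"\x7fELF": "ELF executable",
}


def scan_image(path, magics=MAGIC, block_size=512, blocks_per_read=2048):
    """Return [(offset, label), ...] for each block-aligned position in
    the image file whose bytes start with a known signature.

    block_size is an assumption: 512 bytes divides any FFS fragment
    size, so it is a safe (if slow) alignment to test.
    """
    hits = []
    base = 0  # byte offset of the current chunk within the image
    with open(path, "rb") as image:
        while True:
            chunk = image.read(block_size * blocks_per_read)
            if not chunk:
                break
            # Test only block-aligned positions inside this chunk.
            for pos in range(0, len(chunk), block_size):
                for magic, label in magics.items():
                    if chunk.startswith(magic, pos):
                        hits.append((base + pos, label))
            base += len(chunk)
    return hits
```

Calling scan_image("disk.img"), where disk.img is a placeholder name
for the dd output, returns a list of byte offsets worth examining by
hand.  The image is opened read-only, so the copy is never modified.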
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3CCEC5E5.FED0CBF>