Date: Wed, 9 Jul 2008 04:10:51 +0200
From: Polytropon <freebsd@edvax.de>
To: "freebsd-questions@freebsd.org" <freebsd-questions@freebsd.org>
Subject: Data loss after power out - fsck: bad inode number to nextinode
Message-ID: <20080709041051.bad001ab.freebsd@edvax.de>

Hi,

since last week I'm in big trouble: After a power outage my main
system didn't boot up anymore, so I checked its hard disk (FreeBSD 5.4)
in my new system (FreeBSD 7.0). I booted the system in single user
mode (SUM) and ran fsck on the partitions.

/ on /dev/ad1s1a could be repaired, /var on 1d too. /usr on 1e lost
many directory entries (X11R6, for example), but all files and
directory entry points got restored to lost+found. Okay, that's as I
know it should be. But it doesn't matter, because everything there
could be reinstalled.

Problems occurred when checking /home on /dev/ad1s1f. After a lot of

    1101472 DUP I=260035
    UNEXPECTED SOFT UPDATE INCONSISTENCY

and

    EXCESSIVE DUP BLKS I=260039
    CONTINUE? yes

and

    7310315658325879925 BAD I=260051
    UNEXPECTED SOFT UPDATE INCONSISTENCY

fsck ended up this way:

    INCORRECT BLOCK COUNT I=290557 (3104 should be 736)
    CORRECT? yes

    fsck_4.2bsd: bad inode number 306176 to nextinode

The result: The home directories of all other users were present, but
mine (!) - /home/adec - was missing.

To explain this a bit more precisely: When looking at the files using
the Midnight Commander, the name of my home directory was displayed,
preceded by "?", in red colour, and with a strange date (the epoch?).

    |?adec                |      0|Jan  1  1970|

So I could not change into this directory and get my files out of
there.

In order not to damage the system more, I made a ddrescue dump of the
partition:

    % ddrescue -d -r 3 -n /dev/ad1s1f home.ddrescue logfile

The data could be read without problems. The resulting file seemed to
be a 1:1 copy of the partition.

    % file home.ddrescue
    home.ddrescue: Unix Fast File system [v2] (little-endian)
      last mounted on /mnt, last written at Wed Jul 2 18:51:06 2008,
      clean flag 0, readonly flag 0, number of blocks 44322272,
      number of data blocks 42925108, number of cylinder groups 472,
      block size 16384, fragment size 2048, average file size 16384,
      average number of files in dir 64, pending blocks to free 0,
      pending inodes to free 0, system-wide uuid 0,
      minimum percentage of free blocks 8, TIME optimization

When checking it with

    % fsck -t ufs -yf /dev/md10

fsck gives the same error message as above. Then I mounted the image:

    % sudo mdconfig -a -t vnode -u 10 -f home.ddrescue
    % mount -t ufs -o ro /dev/md10 mnt

And guess what? Same problem: The directory name is shown, but I
cannot change into the directory. But then, I noticed something
interesting:

    % df -h
    Filesystem    Size    Used   Avail Capacity  Mounted on
    /dev/md10      82G     75G    716M    99%    /export/home/adec/rescue/mnt

See the size differences? Something seems to be missing. I hope it is
the content of my home directory that's still on the disk. Some
checking:

    % sudo du -sch mnt
    du: mnt/adec: Bad file descriptor
    du: mnt/archiv/cr/clips.w32/s01.wmv: Bad file descriptor
    du: mnt/archiv/cr/clips.w32/s02.wmv: Bad file descriptor
     52G    mnt
     52G    total

This suggests that roughly 23 GB (the 75G that df reports as used,
minus the 52G that du can still reach) are allocated, i.e. not marked
as free, although they are no longer reachable through the directory
tree.

    % file mnt/adec
    mnt/adec: cannot open `mnt/adec' (Bad file descriptor)
    % cd mnt/adec
    mnt/adec: Not a directory.

Before bothering anyone here on this list, I searched the net and
found that only one (!!!) person besides me seemed to have this
problem. And he got no help. Do I? =^_^=

Of course I took the time to read about the FFS architecture. If I
understood it correctly, fsck stops working, showing the informative
error message "bad inode number 306176 to nextinode", because it
cannot get the next inode from a linked list that represents the file
and directory hierarchy, so there must be a "bad pointer".
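
The size figures quoted from file(1), df and du above can be
cross-checked with a little arithmetic. The following is only a
sketch: it assumes that the "number of data blocks" printed by file(1)
is counted in fragments of 2048 bytes (an assumption, but one that
matches the 82G shown by df), and it takes the rounded df and du
values at face value.

```sh
#!/bin/sh
# Figures copied from the file(1), df and du output in this message.
frag_size=2048        # "fragment size 2048"
data_frags=42925108   # "number of data blocks" (assumed to be in fragments)
df_used_g=75          # df -h: Used
du_total_g=52         # du -sch: total reachable through the directory tree

# Capacity implied by the superblock, in GiB with one decimal place.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
fs_gib=$(LC_ALL=C awk -v n="$data_frags" -v s="$frag_size" \
    'BEGIN { printf "%.1f", n * s / (1024 * 1024 * 1024) }')

# Space that is allocated but that du cannot reach.
gap_g=$((df_used_g - du_total_g))

echo "filesystem size from superblock: ${fs_gib} GiB"
echo "allocated but unreachable: about ${gap_g}G"
```

With the numbers above this works out to about 81.9 GiB for the
filesystem, which agrees with df, and to a gap of about 23G between
what is allocated and what can be reached.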

While the names of the next things represented by inodes reside
within a data structure at level N, the corresponding data entries
reside at level N + 1, where a pointer should lead to. This may be an
explanation why the name "adec" is still in ad1s1f's root directory,
but the data that says "I'm a directory, this is my content" is not
referenced anymore. So fsck cannot continue. The missing inodes need
to get reconnected. In most cases, that's what lost+found contains:
unreferenced inodes that are not marked free. Their names are gone
(N), but their content is still there (N + 1), and the new file name
is "#" plus their inode number.

What should I do? Help is VERY welcome! If you have any ideas what to
do, I'd be glad to save the money I would have to spend when sending
the disk to a data recovery service - 1000 Euro and more are nothing I
can afford. And when you're low on money, adequate tape backup systems
are too expensive (although such a device would be my first choice).

By the way, this must be the revenge of a higher power. I always talk
about backups, but because everything worked fine for years, I got
lazy... I'm a long-time happy FreeBSD user and I never saw this kind
of problem. My whole existence is connected to my home directory. Yes,
it is that hard for me... please help!

Thanks!
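
As a footnote to the lost+found remark above: since the reconnected
entries are named "#" plus the inode number, a small loop can turn
such a directory into an inode/type listing. This is a minimal sketch
only. lost_found_report is a hypothetical helper name, not an existing
tool, and stripping leading zeros is a precaution in case the names
are zero-padded (an assumption, the message does not show any).

```sh
#!/bin/sh
# Print "inode<TAB>type<TAB>path" for every "#<number>" entry in a
# lost+found directory. Hypothetical helper, for illustration only.
lost_found_report() {
    dir=$1
    for entry in "$dir"/\#*; do
        [ -e "$entry" ] || continue       # pattern matched nothing
        name=${entry##*/}                 # strip the directory part
        digits=${name#\#}                 # strip the leading "#"
        case $digits in
            ''|*[!0-9]*) continue ;;      # not "#<number>", skip it
        esac
        # Drop zero padding, if any, but keep a single "0".
        inode=$(printf '%s\n' "$digits" | sed 's/^0*\([0-9]\)/\1/')
        if [ -d "$entry" ]; then kind=dir; else kind=file; fi
        printf '%s\t%s\t%s\n' "$inode" "$kind" "$entry"
    done
}
```

It would be called as "lost_found_report mnt/lost+found" (the path is
an assumption). Entries reported as "dir" are the interesting ones
when looking for a lost home directory.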