Date: Fri, 23 Nov 2018 02:17:20 +0100
From: "Julian H. Stacey" <jhs@berklix.com>
To: soralx@cydem.org
Cc: freebsd-hackers@freebsd.org, freebsd-fs@freebsd.org
Subject: Re: [bug] fsck refuses to repair damaged UFS using backup superblock
Message-ID: <201811230117.wAN1HKAT037185@fire.js.berklix.net>
In-Reply-To: Your message "Tue, 20 Nov 2018 05:30:00 -0800." <20181120053000.56fbee6b@mscad14>

Hi soralx@cydem.org,
Added cc: <freebsd-fs@freebsd.org> to ensure file system specialists see this.

Reference:
> From: <soralx@cydem.org>
> Date: Tue, 20 Nov 2018 05:30:00 -0800

soralx@cydem.org wrote:
>
> Howdy!
>
> Since send-pr(1) is now gone, I guess the next option is to send a
> message directly to the developers...
>
> Yesterday, I ran into a bug in fsck_ffs that gave me a little scare.
>
> Short story: on -CURRENT, fsck refuses to check a FS with a corrupted
> superblock, even when an alternate (backup) SB location is given.
>
> Long story. I've been testing a newly-built system based on an X399
> platform with a 2950X CPU and an Optane 905P 480GB U.2 drive. The
> system ran a ~2-day old -CURRENT; when compiling newest world and
> kernel, I found the machine in a locked-up state. After a hard reset,
> boot failed because the root FS became corrupted & was not available:
> kernel: Superblock check-hash failed: recorded check-hash XXX != computed check-hash YYY
>
> I have not yet figured out why the corruption happened... bad hardware?
> bug in the NVMe driver?
>
> "OK", I thought, "No worries. We'll just boot using another disk, fsck
> the corrupted FS with a backup superblock, and be up in a moment".
> The machine was doing nothing but compiling, so no valuable data loss.
>
> So I did `dumpfs -m /dev/ada0p3` on the spare disk (which was the
> source for the new disk image, thus had almost identical partitions
> and filesystems) to get the FS details, then did `newfs -N [...]
> /dev/ada0p3` to find locations of superblock backups, then finally
> ran `fsck_ffs -b 192 /dev/nvd0p3` -- only to get the same
> "check-hash failed" message, plus another strange message: "Can't open
> /dev/nvd0p3: [...]". Then fsck quits.
> Note that `fsck_ffs -b ...` on a FS with good superblock works OK.
>
> After fiddling with a debugger for a bit, I commented out the line
> "return (0);" in /usr/src/sbin/fsck_ffs/setup.c:136, recompiled fsck,
> and the FS was recovered successfully.
>
> What was actually happening: fsck's setup.c calls ufs_disk_fillout()
> from libufs' type.c, which in turn calls sbread() from the same
> library, which then calls sbget(disk->d_fd, &fs, -1) [[where '-1'
> is hard-coded to indicate the primary superblock]] that then simply
> invokes ffs_sbget from ffs kernel driver -- and this returns ENOENT,
> which eventually causes fsck to give up before even looking at the
> specified backup superblock.
>
> I don't know what exactly ufs_disk_fillout() does, but fortunately
> for me, fsck worked without the "sbread(disk)" part of that function
> having much luck on a disk with corrupted superblock. Also, I have a
> feeling that calling a kernel's ffs driver function when using fsck
> to fix a broken filesystem is not the best thing to do...
>
> Please CC, as I am not subscribed.
>
> --
> [SorAlx] ridin' VN2000 Classic LT
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>

Cheers,
Julian
--
Julian Stacey, Computer Consultant, Systems Engineer, BSD Linux Unix, Munich.
 Brexit referendum stole 3,700,000 votes from Brits abroad, inc. 700,000 in EU
 UK PM lied it's democratic in Article 50 http://exitbrexit.uk/brexit/#lie
 Campaign lies, criminal funded; Markets, jobs & pound down; New Referendum!
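
[Illustration, not part of the original message.] The control flow described
in the quoted report (setup always reads the primary superblock first, and a
failure there ends the run before the -b location is consulted) can be
modelled with a small mock. This is only a sketch under stated assumptions: it
is not the fsck_ffs or libufs source, and struct mock_disk, mock_sbget(),
open_reported() and open_backup_first() are hypothetical names invented for
the illustration. The "backup first" ordering is one possible alternative,
not necessarily the fix FreeBSD adopted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a disk: records which superblock copies are intact. */
struct mock_disk {
	bool primary_ok;	/* primary superblock passes its check-hash */
	bool backup_ok;		/* copy at the user-supplied -b location is readable */
};

/* Mirrors the hard-coded -1 that the report says selects the primary superblock. */
#define SB_PRIMARY	(-1)

/* Mock superblock read: returns 0 on success, ENOENT on failure. */
int
mock_sbget(const struct mock_disk *d, int64_t sblockloc)
{
	assert(d != NULL);
	if (sblockloc == SB_PRIMARY)
		return (d->primary_ok ? 0 : ENOENT);
	return (d->backup_ok ? 0 : ENOENT);
}

/*
 * Reported behaviour: the primary is always read first, and an error there
 * returns 0 ("cannot set up") before backup_loc is ever looked at.
 */
int
open_reported(const struct mock_disk *d, int64_t backup_loc)
{
	if (mock_sbget(d, SB_PRIMARY) != 0)
		return (0);	/* models the early "return (0);" */
	if (backup_loc != SB_PRIMARY)
		return (mock_sbget(d, backup_loc) == 0);
	return (1);
}

/*
 * Assumed alternative: when a backup location was requested, read it
 * directly and do not depend on the primary being intact.
 */
int
open_backup_first(const struct mock_disk *d, int64_t backup_loc)
{
	if (backup_loc != SB_PRIMARY)
		return (mock_sbget(d, backup_loc) == 0);
	return (mock_sbget(d, SB_PRIMARY) == 0);
}
```

With a damaged primary and an intact backup, open_reported(&d, 192) yields 0
while open_backup_first(&d, 192) yields 1. That matches the symptom in the
report: `fsck_ffs -b ...` works on a filesystem whose primary superblock is
good, and fails on exactly the filesystems where the option is needed.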