From owner-freebsd-hackers@freebsd.org Fri Nov 23 01:20:34 2018
Message-Id: <201811230117.wAN1HKAT037185@fire.js.berklix.net>
To: soralx@cydem.org
cc: freebsd-hackers@freebsd.org, freebsd-fs@freebsd.org
Subject: Re: [bug] fsck refuses to repair damaged UFS using backup superblock
From: "Julian H. Stacey"
Organization: http://berklix.eu BSD Unix Linux Consultants, Munich Germany
In-reply-to: Your message "Tue, 20 Nov 2018 05:30:00 -0800."
             <20181120053000.56fbee6b@mscad14>
Date: Fri, 23 Nov 2018 02:17:20 +0100

Hi soralx@cydem.org,

Added cc: to ensure file system specialists see this.

Reference:
> From:
> Date: Tue, 20 Nov 2018 05:30:00 -0800

soralx@cydem.org wrote:
>
> Howdy!
>
> Since send-pr(1) is now gone, I guess the next option is to send a
> message directly to the developers...
>
> Yesterday, I ran into a bug in fsck_ffs that gave me a little scare.
>
> Short story: on -CURRENT, fsck refuses to check a FS with a corrupted
> superblock, even when an alternate (backup) SB location is given.
>
> Long story. I've been testing a newly-built system based on an X399
> platform with a 2950X CPU and an Optane 905P 480GB U.2 drive. The
> system ran a ~2-day old -CURRENT; when compiling newest world and
> kernel, I found the machine in a locked-up state.
> After a hard reset, boot failed because the root FS became corrupted &
> was not available:
>   kernel: Superblock check-hash failed: recorded check-hash XXX != computed check-hash YYY
>
> I have not yet figured out why the corruption happened... bad hardware?
> bug in the NVMe driver?
>
> "OK", I thought, "No worries. We'll just boot using another disk, fsck
> the corrupted FS with a backup superblock, and be up in a moment".
> The machine was doing nothing but compiling, so no valuable data loss.
>
> So I did `dumpfs -m /dev/ada0p3` on the spare disk (which was the
> source for the new disk image, thus had almost identical partitions
> and filesystems) to get the FS details, then did `newfs -N [...]
> /dev/ada0p3` to find locations of superblock backups, then finally
> ran `fsck_ffs -b 192 /dev/nvd0p3` -- only to get the same
> "check-hash failed" message, plus another strange message: "Can't open
> /dev/nvd0p3: [...]". Then fsck quits.
> Note that `fsck_ffs -b ...` on a FS with good superblock works OK.
>
> After fiddling with a debugger for a bit, I commented out the line
> "return (0);" in /usr/src/sbin/fsck_ffs/setup.c:136, recompiled fsck,
> and the FS was recovered successfully.
>
> What was actually happening: fsck's setup.c calls ufs_disk_fillout()
> from libufs' type.c, which in turn calls sbread() from the same
> library, which then calls sbget(disk->d_fd, &fs, -1) [[where '-1'
> is hard-coded to indicate the primary superblock]] that then simply
> invokes ffs_sbget from ffs kernel driver -- and this returns ENOENT,
> which eventually causes fsck to give up before even looking at the
> specified backup superblock.
>
> I don't know what exactly ufs_disk_fillout() does, but fortunately
> for me, fsck worked without the "sbread(disk)" part of that function
> having much luck on a disk with corrupted superblock. Also, I have a
> feeling that calling a kernel's ffs driver function when using fsck
> to fix a broken filesystem is not the best thing to do...
>
> Please CC, as I am not subscribed.
>
> --
> [SorAlx] ridin' VN2000 Classic LT
>

Cheers,
Julian
--
Julian Stacey, Computer Consultant, Systems Engineer, BSD Linux Unix, Munich.
 Brexit referendum stole 3,700,000 votes from Brits abroad, inc. 700,000 in EU
 UK PM lied it's democratic in Article 50 http://exitbrexit.uk/brexit/#lie
 Campaign lies, criminal funded; Markets, jobs & pound down; New Referendum!