Date: Tue, 27 Dec 2011 21:14:28 -0800
From: David Thiel
To: d@delphij.net
Message-ID: <20111228051404.GL45484@redundancy.redundancy.org>
In-Reply-To: <4EFA4B4E.201@delphij.net>

On Tue, Dec 27, 2011 at 02:48:22PM -0800, Xin Li wrote:
> >> - use journalled fsck;
> >> - use normal fsck to check if the
> >>   journalled fsck did the right thing.

Ok, here is the log of fsck with and without journal.

http://redundancy.redundancy.org/fscklog3

That was done the very next boot, after a clean shutdown. The errors from the previous live fsck aren't there (oddly), but there are still apparently some corrections made. The next fsck still complains, but doesn't give any salvage prompts. Here is jsa@'s, done on a live FS with SU+J:

http://redundancy.redundancy.org/fscklog4

I'm not actually looking to solve my particular problem per se.
The issue is that almost everyone I've checked with who's running SU+J gets unref'd file and other errors when they check their filesystem (with the fs live). As far as I can tell, a running FS should never have those kinds of errors unless you deliberately disabled fsck. This leaves only a few options:

- SU+J and fsck do not work correctly together to fix corruption on boot, i.e. bgfsck isn't getting run when it should
- Stuff is getting completely screwed up after boot
- fsck is giving incorrect results
- I'm completely clueless about how SU+J is supposed to behave or be deployed

I'm pretty certain that the first is the issue here. It would be great if others could check their own SU+J filesystems so we could get a few more data points.
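
For anyone wanting to contribute a data point, a read-only check could look roughly like the sketch below. It is only an illustration: the device name /dev/ada0p2, the log file name, and the helper count_fsck_errors are placeholders that do not come from this thread, and matching on the string "UNREF" is an assumption about which fsck messages are of interest.

```shell
#!/bin/sh
# Hypothetical helper: count lines in a saved fsck log that mention
# unreferenced files. The "UNREF" pattern is an assumption; adjust it
# to whatever messages you care about.
count_fsck_errors() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when nothing matches, so "|| true" keeps the function's status clean.
    grep -c 'UNREF' "$1" || true
}

# Example session on a live filesystem (placeholder device name):
#   tunefs -p /dev/ada0p2                     # print the current tunables, including journaling
#   fsck_ffs -n /dev/ada0p2 > fsck.log 2>&1   # -n answers "no" to every prompt, so nothing is written
#   count_fsck_errors fsck.log
```

Posting the full log alongside the count is more useful than the count alone, since the surrounding phase messages show where the errors were found.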