From: Doug Barton
Date: Mon, 28 Mar 2011 16:22:20 -0700
To: Jeff Roberson
Cc: src-committers@freebsd.org, kvedulv@kvedulv.de, Jeff Roberson,
    Kirk McKusick, Gavin Atkinson, Nathan Whitehorn, Marius Strobl,
    svn-src-head@freebsd.org, svn-src-all@freebsd.org
Subject: Re: svn commit: r219667 - head/usr.sbin/bsdinstall/partedit

On 03/21/2011 00:33, Jeff Roberson wrote:
> On Sun, 20 Mar 2011, Doug Barton wrote:
>
>> On 03/20/2011 09:22, Marius Strobl wrote:
>>
>>> I fear it's still a bit premature to enable SU+J by default.
>>> Rather recently I was told about an SU+J filesystem lost after a panic
>>> that happened after snapshotting it (report CC'ed, maybe he can
>>> provide some more details) and I'm pretty sure I've seen the problem
>>> described in PR 149022 also after the potential fix mentioned in its
>>> feedback.
>>
>> +1
>>
>> I tried enabling SU+J on my /var (after backing up of course) and
>> after a panic random files were missing entirely. Not the last updates
>> to those files, the whole file, and many of them had not been written
>> to in days/weeks/months.
>>
>
> So you're saying the directory entry was missing?

I'm saying that the file wasn't visible to 'ls /var/db/pkg/foo/'. I
didn't debug it past determining that the files were missing.

> Can you tell me how big the directory was?

Most of the damage was in /var/db/pkg/, so the individual directories
that were missing files were small, no more than 10 files each. I
imagine there was probably other damage scattered throughout /var, but
once I learned how many files were missing I just nuked it and restored
from backup.

> Number of files?

I stopped counting around 20 or so.

> Approximate directory size when
> you consider file names? When you fsck'd were inodes recovered and
> linked into lost and found?

No.

> What was the actual path?

To the lost files? The ones that I actually noticed missing were all
/var/db/pkg/*/+CONTENTS. There were probably a lot of other files
missing, but those were noticeable because the ports tree was throwing
errors, and a missing +CONTENTS file can't be recovered from without
re-installing the port.

> I'm trying to wrap my head around how this would be possible and where
> the error could be and whether it could be caused by SUJ.

It never happened before enabling SUJ, happened shortly after I did, and
has never happened since I disabled it. It's probably worth reiterating
that the damage happened after an actual panic, as opposed to during
"regular" operation.

> The number of
> interactions with disk writes is minimal. Corruption if it occurs would
> most likely be caused by a bad journal recovery.

Unlikely in this case, since the damage was not confined to
recently-written files.


hth,

Doug

PS, my primary concern was that we not enable this by default until it
can be demonstrated to be more robust. However, Nathan has already
enabled it in the new installer, so now perhaps it would be fitting to
send a message to -current letting people know that the plan is to have
it on by default in 9.0, and asking people to resume more rigorous
testing.

-- 

	Nothin' ever doesn't change, but nothin' changes much.
			-- OK Go

	Breadth of IT experience, and depth of knowledge in the DNS.
	Yours for the right price.  :)  http://SupersetSolutions.com/