Date: Sat, 28 Apr 2012 12:51:10 -0400
From: Alejandro Imass <aimass@yabarana.com>
To: Robert Bonomi <bonomi@mail.r-bonomi.com>
Cc: wojtek@wojtek.tensor.gdynia.pl, freebsd-questions@freebsd.org
Subject: Re: UFS Crash and directories now missing
Message-ID: <CAHieY7S-o-iFG0Z9SW08puMagDnHQnznLkWYJOR_6LQdHr70dw@mail.gmail.com>
In-Reply-To: <CAHieY7Sip7LePPnt7S6Yqt=nuAoytG%2B5EqfH4t5kVnqFFZtRkg@mail.gmail.com>
References: <CAHieY7ToprF89C7yoeWkX8Pqom-=PY9tk2raNuNGHsbnhukXmg@mail.gmail.com>
 <201204281539.q3SFdtir061045@mail.r-bonomi.com>
 <CAHieY7Sip7LePPnt7S6Yqt=nuAoytG%2B5EqfH4t5kVnqFFZtRkg@mail.gmail.com>
On Sat, Apr 28, 2012 at 12:36 PM, Alejandro Imass <aimass@yabarana.com> wrote:
> On Sat, Apr 28, 2012 at 11:39 AM, Robert Bonomi
> <bonomi@mail.r-bonomi.com> wrote:
>>
>> Alejandro Imass <aimass@yabarana.com> wrote:
>>> On Sat, Apr 28, 2012 at 3:22 AM, Wojciech Puchar
>>> <wojtek@wojtek.tensor.gdynia.pl> wrote:
>>> >> I somewhat agree, but it wasn't a person. I am the only administrator,
>>> >> the only one with root access. The jails were effectively moved to the
>>> >> /usr/local/etc/apache22 of the single jail that survived at the top
>>> >> level. I'm thinking something between mount, EzJail, the journal and
>>> >> the way MySQL created a great deal of head contention, so something
>>> >> must have gotten corrupted at the directory level like you state, but
>>> >> the strange part is no _data_ corruption as such, because I was able to
>>> >> physically archive the jails, move them to the correct directory and
>>> >
>>> >
>>> > no matter what you do FreeBSD DOES NOT randomly move directories. if you
>>> > are sure you didn't move it yourself then it must be a machine hardware
>>> > problem, but still unlikely.
>>>
>>> After a little more research, ___it is NOT unlikely at all___ that
>>> under high distress and a hard boot, UFS could have somehow corrupted
>>> the directory structure, whilst maintaining the data intact.
>>
>> This is technically accurate, *BUT* the specifics of the quote "corruption"
>> unquote in the case under discussion make it *EXTREMELY* unlikely that this
>> is what happened.
>>
>> 99.99+++% of all UFS filesystem 'corruption' issues are the result of a
>> system crash _between_ the time cached 'meta-data' is updated in memory
>> and that data is flushed to disk (a deferred write).
>>
>> The second most common (and vanishingly rare) failure mode is a power
>> failure _as_ a sector of disk is being written -- resulting in 'garbage
>> data' being written to disk.
>>
>> The next possibility is 'cosmic rays'. If running on 'cheap' hardware
>> (i.e., without 'ECC' memory), this can cause a *SINGLE-BIT* error in data
>> being output.
>>
>> The fact that the 'corrupted' filesystem passed fsck -without- any reported
>> errors shows that everything in the filesystem meta-data was consistent.
>>
>> Given *that*, there are precisely *TWO* ways that the 'results' that have
>> been reported could have happened.
>>
>>  1) "Something" did a mv(2) of the various jail directories 'from' their
>>     original location to the 'apache' directory.  This involves simply
>>     *copying* the directory entry from the jail's 'parent directory' to
>>     the apache directory, and then marking the entry in the original
>>     parent as 'unused'.  Nothing other than the directory where the jail
>>     'used to live', and the directory 'where it was found' are touched.
>>     This occurred _through_ the system 'mv' function, so all the normal
>>     'housekeeping' was done properly.
>>
>>  2) it was -not- done through mv(2) -- but that requires a whole
>>     *series* of "corruptions" of the filesystem, _ALL_ of which had to
>>     occur in 'exactly' the right way.  They are:
>
> [...]
>
>> I think it is safe to conclude that the probabilities -greatly- favor
>> alternative #1.
>>
>
> OK. So after your comments and further research I concur with you on
> the mv, but if it wasn't a human, then this might be exposing a serious
> security flaw in the jail system or the way EzJail implements it. The
> whole point of using jails is to protect things like this from
> happening. Given that the only jail that survived was the front-end
> Apache Web server/reverse proxy, it is also safe to suspect that the
> apache (or other) process running on it was able to perform a mv of
> the rest of the jails to its own /usr/local/etc/apache22 directory.
>
> Is there no possibility that after the system crash, the journal
> recovery process and/or fsck could have moved these directories?
>

Also note that even the EzJail basejail was moved, so it could be a
security hole in the way nullfs is used or in nullfs itself. But the
curious thing is that the basejail is supposed to be mounted read-only,
so how did that get moved to the http-proxy jail? That is why I suspect
it could have been something in the boot process, like the journal
recovery, fsck, or something else with that kind of privilege, running
when the EzJail filesystems were unmounted.

-- 
Alejandro
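
For reference, the behaviour described in alternative #1 quoted above (a move within one filesystem only rewrites directory entries and leaves the data where it is) can be checked with a small script. This is only a sketch under stated assumptions: the directory names `jails`, `apache22` and `jail1` and the helper `rename_keeps_inode` are made up for illustration, it runs in a scratch directory rather than on real jails, and it says nothing about what actually triggered the move on the affected machine. It uses `os.rename`, which wraps rename(2), the call that mv(1) normally relies on when source and destination are on the same filesystem.

```python
import os
import tempfile


def rename_keeps_inode(base):
    """Move a directory inside one filesystem and report what changed.

    Returns (inode_before, inode_after, source_still_exists, file_content).
    All paths below are hypothetical stand-ins for the jail layout.
    """
    jails = os.path.join(base, "jails")        # stand-in for the original parent
    apache = os.path.join(base, "apache22")    # stand-in for the destination
    src = os.path.join(jails, "jail1")
    dst = os.path.join(apache, "jail1")

    os.makedirs(src)
    os.makedirs(apache)

    # Put a file inside so we can confirm the contents travel with the entry.
    with open(os.path.join(src, "data.txt"), "w") as fh:
        fh.write("payload")

    inode_before = os.stat(src).st_ino
    os.rename(src, dst)                        # only directory entries change
    inode_after = os.stat(dst).st_ino

    with open(os.path.join(dst, "data.txt")) as fh:
        content = fh.read()

    return inode_before, inode_after, os.path.exists(src), content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        before, after, src_exists, content = rename_keeps_inode(base)
        print("same inode:", before == after)
        print("source still present:", src_exists)
        print("content intact:", content == "payload")
```

If the inode number is unchanged and the file is still readable at the new path, the directory was relinked rather than copied, which would be consistent with fsck finding nothing to repair afterwards.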