From: Alejandro Imass
To: Robert Bonomi
Cc: wojtek@wojtek.tensor.gdynia.pl, freebsd-questions@freebsd.org
Date: Sat, 28 Apr 2012 12:51:10 -0400
Subject: Re: UFS Crash and directories now missing
References: <201204281539.q3SFdtir061045@mail.r-bonomi.com>

On Sat, Apr 28, 2012 at 12:36 PM, Alejandro Imass wrote:
> On Sat, Apr 28, 2012 at 11:39 AM, Robert Bonomi wrote:
>>
>>  Alejandro Imass wrote:
>>> On Sat, Apr 28, 2012 at 3:22 AM, Wojciech Puchar wrote:
>>> >> I somewhat agree, but it wasn't a person. I am the only administrator,
>>> >> the only one with root access. The jails were effectively moved to the
>>> >> /usr/local/etc/apache22 of the single jail that survived at the top level.
>>> >> I'm thinking something between mount, EzJail, the journal and the way
>>> >> MySQL created a great deal of head contention, so something must have
>>> >> gotten corrupted at the directory level like you state, but the
>>> >> strange part is no _data_ corruption as such, because I was able to
>>> >> physically archive the jails, move them to the correct directory and
>>> >
>>> >
>>> > no matter what you do FreeBSD DOES NOT randomly move directories. if you are
>>> > sure you didn't move it yourself then it must be machine hardware problem
>>> > but still unlikely.
>>>
>>> After a little more research, ___it is NOT unlikely at all___ that
>>> under high distress and a hard boot, UFS could have somehow corrupted
>>> the directory structure, whilst maintaining the data intact.
>>
>> This is technically accurate, *BUT* the specifics of the quote "corruption"
>> unquote in the case under discussion make it *EXTREMELY* unlikely that this
>> is what happened.
>>
>> 99.99+++% of all UFS filesystem "corruption" issues are the result of a
>> system crash _between_ the time cached 'meta-data' is updated in memory
>> and that data is flushed to disk (a deferred write).
>>
>> The second most common (and vanishingly rare) failure mode is a powerfail
>> _as_ a sector of disk is being written -- resulting in 'garbage data'
>> being written to disk.
>>
>> The next possibility is 'cosmic rays'.  If running on 'cheap' hardware (i.e.,
>> without 'ECC' memory), this can cause a *SINGLE-BIT* error in data being
>> output.
>>
>> The fact that the 'corrupted' filesystem passed fsck -without- any reported
>> errors shows that everything in the filesystem meta-data was consistent.
>>
>> Given *that*, there are precisely *TWO* ways that the 'results' that have
>> been reported could have happened.
>>
>>  1) "Something" did a mv(2) of the various jail directories 'from' their
>>     original location to the 'apache' directory.  This involves simply
>>     *copying* the directory entry from the jail's 'parent directory' to
>>     the apache directory, and then marking the entry in the original
>>     parent as 'unused'.  Nothing other than the directory where the jail
>>     'used to live', and the directory 'where it was found' are touched.
>>     This occurred _through_ the system 'mv' function, so all the normal
>>     'housekeeping' was done properly.
>>
>>  2) it was -not- done through mv(2) -- but that requires that a whole
>>     *series* of "corruptions" of the filesystem, _ALL_ of which had to
>>     occur in 'exactly' the right way.  They are:
>
> [...]
>
>> I think it is safe to conclude that the probabilities -greatly- favor
>> alternative #1.
>>
>
> OK. So after your comments and further research I concur with you on
> the mv, but if it wasn't a human, then this might be exposing a serious
> security flaw in the jail system or the way EzJail implements it. The
> whole point of using jails is to protect against things like this
> happening. Given that the only jail that survived was the front-end
> Apache Web server/reverse proxy, it is also safe to suspect that the
> apache (or other) process running on it was able to perform a mv of
> the rest of the jails to its own /usr/local/etc/apache22 directory.
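
As an aside on the mv point: a rename within one filesystem only rewrites directory entries, which is why the data itself would come through untouched. Here is a minimal sketch in Python, assuming a POSIX-like filesystem. The directory names (jails, www, apache22) are made up for illustration and live inside a temporary directory; they are not the real layout on my box. os.rename() is the standard-library wrapper around rename(2).

```python
import os
import tempfile


def move_and_compare(root):
    """Rename a directory between two parents and report inode numbers.

    Hypothetical layout: root/jails/www/index.html is moved to
    root/apache22/www. Both parents sit on the same filesystem.
    """
    old_parent = os.path.join(root, "jails")
    new_parent = os.path.join(root, "apache22")
    src = os.path.join(old_parent, "www")
    dst = os.path.join(new_parent, "www")

    os.makedirs(src)
    os.makedirs(new_parent)
    with open(os.path.join(src, "index.html"), "w") as f:
        f.write("hello\n")

    # Inode numbers of the directory and of the file inside it, before the move.
    before = (os.stat(src).st_ino,
              os.stat(os.path.join(src, "index.html")).st_ino)

    # Same-filesystem rename: only the two parent directories are modified.
    os.rename(src, dst)

    moved_file = os.path.join(dst, "index.html")
    after = (os.stat(dst).st_ino, os.stat(moved_file).st_ino)
    with open(moved_file) as f:
        content = f.read()

    # What is left in the old parent after the move.
    remaining = os.listdir(old_parent)
    return before, after, remaining, content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        before, after, remaining, content = move_and_compare(root)
        print("inodes before:", before)
        print("inodes after: ", after)
        print("left in old parent:", remaining)
```

If the inode numbers match before and after, nothing but the two parent directories changed, which would be consistent with fsck finding nothing to complain about.
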
>
> Is there no possibility that, after the system crash, the journal
> recovery process and/or fsck could have moved these directories?
>

Also note that even the EzJail basejail was moved, so it could be a
security hole in the way nullfs is used or in nullfs itself. But the
curious thing is that the basejail is supposed to be mounted read-only,
so how did it get moved to the http-proxy jail? That is why I suspect
it could have been something in the boot process, such as the journal
recovery, fsck, or something else with that kind of privilege, running
while the EzJail filesystems were unmounted.

-- 
Alejandro