From: Alejandro Imass
To: Polytropon
Cc: freebsd-questions@freebsd.org
Subject: Re: UFS Crash and directories now missing
Date: Sat, 28 Apr 2012 14:16:40 -0400
In-Reply-To: <20120428200116.b2f5820e.freebsd@edvax.de>
References: <201204281731.q3SHVaiM061997@mail.r-bonomi.com> <20120428200116.b2f5820e.freebsd@edvax.de>
List-Id: User questions

On Sat, Apr 28, 2012 at 2:01 PM, Polytropon wrote:
> On Sat, 28 Apr 2012 13:52:02 -0400, Alejandro Imass wrote:
>> On Sat, Apr 28, 2012 at 1:31 PM, Robert Bonomi wrote:
>> >
>> > Alejandro Imass wrote:
>> >> On Sat, Apr 28, 2012 at 11:39 AM, Robert Bonomi wrote:
>> >> > Alejandro Imass wrote:
>> >> >> After a little more research, ___it it NOT unlikely at all___ that
>> >> >> under high distress and a hard boot, UFS could have somehow corrupted
>> >> >> the directory structure, whilst maintaining the data intact.
>> >> >
>> >> > This is techically accurate, *BUT* the specifics of the quote "corruption"
>> >> > unquote in the case under discussion make it *EXTREMELY* unlikely that this
>> >> > is what happened.
>> >> >
>> >> > 99.99+++% of all UFS filesystem "corruption' issues are the result of a
>> >> > system crash _between_ the time cached 'meta-data' is updated in memory
>> >> > and that data is flushed to disk (a deferred write).
>> >> >
>> >> > The second most common (and vanishingly rare) failure mode is a powerfail
>> >> > _as_ a sector of disk is being written -- resulting in 'garbage data'
>> >> > being written to disk.
>> >> >
>> >> > The next possibility is 'cosmic rays'. If running on 'cheap' hardware
>> >> > (i.e., without 'ECC' memory), this can cause a *SINGLE-BIT* error in
>> >> > data being output.
>> >> >
>> >> > The fact that the 'corrupted' filesystem passed fsck -without- any reported
>> >> > errors shows that everything in the filesystem meta-data was consistent
>> >> >
>> >> [...]
>> >>
>> >> > I think it is safe to conclude that the probabilities -greatly- favor
>> >> > alternative #1.
>> >> >
>> >>
>> >> OK. So after your comments and further research I concur with you on
>> >> the mv but if it wasn't a human, then this might be exposing a serious
>> >> security flaw in the jail system or the way EzJail implements it.
>> >
>> > BOGON ALERT!!!
>> >
>>
>> I admit my ignorance on how the filesystem works but I don't think
>> your condescending remarks add a lot of value. The issue here is this
>> actually happened and there is a flaw somewhere other than "the stupid
>> administrator did it".
>
> If you search the archives of this list, you'll find my _first_
> post to that list: I've had a similar problem, df shows data
> must be there after crash (panic -> reboot -> fsck trouble), but
> files aren't there (even _not_ in lost+found). It's quite possible
> that in _exceptional_ moments this can happen. The fsck program
> is intended to repair the most typical file system faults, but
> nothing "complicated" will be done without interaction: Altering
> data on disk will _always_ involve the responsible (!) admin to
> check if it is really intended "to do so".
>
> There can be many reasons. I've never found out what was the
[...]
> that might help locate "lost" data (quotes intended as long as
> the data is still on the disk). The more complex your setting
> is (e. g. striped disks, or ZFS), this can be nearly impossible.
> "Plain old UFS" can sometimes be your saviour (but BACKUP should
> be your real friend).
>

Thanks for your reply. I can't figure out how there was no data loss
and yet the directories moved just like that. We have nightly backups;
that is one of the features we love about EzJail and its archive
feature. The base system sits on another disk entirely and is pristine:
we don't install anything except the basic system on the system disk,
and the other disk is exclusively divided into jails, so the
possibility of an outside process doing the mv is unlikely.
Everything points to something or someone having executed a mv, but how
was this done? And is there a potential problem that could happen
again?

Contrary to other comments here, and despite my admitted ignorance, I
believe there are actually 3 possibilities:

1) something inside a jail was able to move the other jails into itself
2) something outside the jails moved the jails
3) the directories were moved at reboot by journal recovery, fsck or
   something else

What worries me is not some random bit flip or cosmic ray, but the
potential for this to happen again. I am not so sure that it is
*impossible* that a jail could affect other jails with EzJail.

-- 
Alejandro