Date:      Mon, 30 Apr 2012 06:36:08 -0500 (CDT)
From:      Robert Bonomi <bonomi@mail.r-bonomi.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: UFS Crash and directories now missing
Message-ID:  <201204301136.q3UBa8fj083478@mail.r-bonomi.com>

Alejandro Imass <ait@p2ee.org> wrote:
> On Sun, Apr 29, 2012 at 11:49 PM, Erich Dollansky wrote:
> > On Monday 30 April 2012 02:02:41 jb wrote:
> >> Alejandro Imass <ait <at> p2ee.org> writes:
> >> > ...
>
> Back to theory on how the http-proxy jail 'swallowed' all the other
> jails including the basejail. 

A "theory" that contains assumptions which are, unfortunately, unsupported
by any factual evidence. 

Just like _every_other_ "theory" you have advanced to date.

FACT: It is a virtual certainty that something operating -outside- any jail
environment is what did the deed.

Available evidence to date is that you 'fixate' on a particular _remote_
possibility -- *without* knowledge of what it would take for that scenario
to come to pass -- making a sh*tload of 'assumptions' along the way (many
of which are contrary to reality), and offer that as 'the explanation' for
events.



> Given that EzJail uses a single basejail and links/mounts stuff in the
> child jails it would seem plausible (regression?) that somehow any
> jail could access other jails' files,

Demonstrating, yet again, that you do not understand how jails work. :((

>                                       or that _maybe_ in an event of
> crash the nullsfs mounts confuse the system somehow when fsck restores
> or the journal is recovered.

Demonstrating, yet again, that you do not understand what nullfs is, how
it works, or that it is totally -irrelevant- to fsck and/or journaling.

Hint: nullfs is merely a 'path translation' mechanism -- it affects _only_ 
'file open' syscalls.  fsck doesn't _touch_ nullfs.

Hint: journaling is an add-on to the UFS filesystem.  nullfs doesn't know
what journaling is.   "Journal recovery" doesn't _touch_ a nullfs.
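
To make the fsck point concrete, here is a minimal Python sketch. It assumes
the usual fstab(5) convention that an entry whose pass number is zero (or
missing) is skipped by fsck. The sample table, the device name, and the jail
paths are invented for illustration; they are not taken from the system under
discussion, and the function name is hypothetical.

```python
# Hypothetical fstab contents in the style of a basejail setup. The device
# name and jail paths below are made up for illustration only.
SAMPLE_FSTAB = """\
# Device            Mountpoint                   FStype  Options  Dump  Pass#
/dev/ada0p2         /                            ufs     rw       1     1
/usr/jails/basejail /usr/jails/www/basejail      nullfs  ro       0     0
/usr/jails/basejail /usr/jails/proxy/basejail    nullfs  ro       0     0
"""


def fsck_candidates(fstab_text):
    """Return (device, mountpoint, fstype) for entries with a nonzero pass number.

    Assumption: a zero or absent sixth field means fsck does not check the entry.
    """
    selected = []
    for raw_line in fstab_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a usable entry
        # The pass number is the sixth field; treat a missing one as zero.
        passno = int(fields[5]) if len(fields) > 5 else 0
        if passno > 0:
            selected.append((fields[0], fields[1], fields[2]))
    return selected


if __name__ == "__main__":
    for device, mountpoint, fstype in fsck_candidates(SAMPLE_FSTAB):
        print(device, mountpoint, fstype)
```

With a table like this one, only the device-backed UFS entry is selected; the
nullfs lines name a directory, not a device, and carry a pass number of zero.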

A competent, "not stupid", sysadmin would know these things.  And not
'remove all doubt' (in the words of Abraham Lincoln), by raising such
nonsense questions.

> Whatever the cause, it actually happened and I have already ruled out
> just about anything. It doesn't seem to have been an attack, it surely
> wasn't me, and EzJail author agrees it was not the EzJail scripts. So
> maybe nullfs and journaling, or crash + nullfs + journaling, could
> cause something like this to happen?

Postulating the "right" combination of _unrelated_ failures, virtually
*anything* can happen.   cf. "Nasal Monkeys".

It has already been demonstrated how the (im-)probability of such an event
relates to the age of the universe.

>                                     Maybe journal has some confusion
> on restoring the nullfs view of the directories or something after bad
> crash like this one??

Short answer: "No chance."  Again, if you had any understanding of how
UFS, and nullfs for that matter, works -- not to mention how disk I/O works
inside the kernel, you wouldn't be embarrassing yourself by your _continued_
raising of what are, to put it charitably, such 'patently ridiculous'
questions.

You can engage in all the 'unfounded speculation' you want to, but you are
simply -not- going to determine "what happened".  

IF there was a systemic fault, you have already destroyed the forensic 
evidence trail that _might_ have allowed a qualified expert to run it down,
*if* you could afford to have such an analysis done.  (middle five figures
is a starting point for such an analysis.)

Absent _multiple_ reports of like events, *WITH* enough detail in the reports
to have a reasonable chance of identifying a 'pattern' of events leading to 
the failure, *OR* the existence of a -reliable-, =repeatable=, method of
inducing the failure, this simply isn't going to go anywhere.  Absent any
of those things, it is a 'freak' event, *PROBABLY* (read 'virtually certain')
caused by human error (despite your claim of the 'impossibility' of that 
factor) in some form.

If you insist on 'knowing' what happened in any future instance of single
putatively 'abnormal' events, you will need to change to a MIL-SPEC 'B2' 
(or higher) rated O/S, with active mandatory access controls, 'security 
labels' with multi-level, non-hierarchical,  security enabled, audit 
logging of -every- system call, etc.  This also requires a staff position 
of 'security officer', which is _separate_and_distinct_ from 'system 
administrator'.   I strongly suspect that you cannot afford the required
hardware and software for this type of 'solution'.

The 'underlying cause' almost certainly falls into the class known as PEBKAC.
(The current admin has demonstrated an inability to accurately report the 
 state of his system -- that at least one thing he previously asserted to be
 true was _not_, in fact, the case.  It is *HIGHLY*LIKELY* that _that_ 
 'exception' to the claimed state is =not= the only such violation on that
 system.)

That is, there was an action where there was a difference between 'that which
was intended' and 'what it really did'.  Such things are almost -impossible-
for the perpetrator of the action to identify -- they 'know' what they did,
and "read" the act as 'doing what they meant it to do', even though it 
actually did 'something else'.  I cannot count the number of times _I_ have
fallen into that particular trap.

You insist on speculating about 'failure modes' in the way that THINGS YOU
DO NOT UNDERSTAND THE FUNCTIONING OF work.  You are wasting your time, and
that of those whom you inflict those 'nonsensical' speculations on.


You are 'convinced' it could not have been human error, and have concluded
that it therefore *must* have been machine error.  You are looking for 
someone to 'validate' that conclusion. 

That simply *ISN'T* going to happen -- not without a -lot- more evidence 
than any individual can provide from a single =unrepeatable= incident.





