From owner-freebsd-hackers  Tue Mar 10 16:40:38 1998
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199803110040.RAA13485@usr08.primenet.com>
Subject: Re: Fault tolerance issues
To: shimon@simon-shapiro.org
Date: Wed, 11 Mar 1998 00:40:23 +0000 (GMT)
Cc: tlambert@primenet.com, hackers@FreeBSD.ORG
In-Reply-To: <XFMail.980310120648.shimon@simon-shapiro.org> from "Simon Shapiro" at Mar 10, 98 12:06:48 pm

> >> I always wondered why this is not so.  Not even after sync(2).
> > 
> > With the old sync process (updated, not syncer), it wasn't very
> > cost effective.  It would happen on every sync.
> 
> ``Cost Effective'' in what way?  Losing a critical file, or corrupting an
> on-line database is a lot less effective than n% loss of speed.  N can be
> pretty large here, if you ask users who are in the know.
> 
> Again, a switch will be the best solution.  Dial in the level of security or
> reliability you desire.

You misunderstand.  A soft read-only marking is only instituted if there
is no dirty data to be written.  The difference is in reboot time, since
the *only* thing wrong with the disk is the clean flag isn't set and the
superblock information that isn't automagically replicated is out of sync.

It saves you fsck time after an ungraceful shutdown from a quiescent
state, nothing more.

It wasn't cost effective in the sense that if you marked and unmarked
the thing after every sync, you were marking it frequently enough
that the unmarking represented a significant start latency in the
median to high load case (where the sync wrote all outstanding
dirty data, but there would immediately be more dirty data that
needed to be written the next time).

With the syncer process, the sync clock puts a delay between the
last data in and the last data out -- a sliding window in which
you would not "unship the heads", so to speak.

The entire window would need to be emptied for you to mark the
volume soft read-only, and you would have an entire sync clock in which
to "unship the heads".

Basically, you'd implement this by saying "at the next sync interval,
immediately write the superblock as dirty, and on completion, mark
the FS non-soft-RO".
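
A toy sketch of that state machine in C may make the window clearer.
Every name in it (struct softro_fs, softro_write, softro_sync_tick) is
made up for illustration; none of this is FreeBSD code, and the counters
only stand in for real superblock and buffer I/O.  It assumes one tick
flushes the whole window.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-filesystem state; not the real FFS superblock. */
struct softro_fs {
	bool soft_ro;     /* true: clean flag is set on disk */
	int  dirty_bufs;  /* buffers waiting in the sync window */
	int  sb_writes;   /* superblock writes issued (for illustration) */
};

/* Start out quiescent and marked soft read-only. */
void softro_init(struct softro_fs *fs)
{
	fs->soft_ro = true;
	fs->dirty_bufs = 0;
	fs->sb_writes = 0;
}

/* A write only dirties a buffer in memory; nothing hits the disk yet. */
void softro_write(struct softro_fs *fs)
{
	fs->dirty_bufs++;
}

/* One tick of the sync clock. */
void softro_sync_tick(struct softro_fs *fs)
{
	if (fs->dirty_bufs > 0) {
		if (fs->soft_ro) {
			fs->sb_writes++;      /* superblock goes out dirty first */
			fs->soft_ro = false;  /* on completion: non-soft-RO */
		}
		fs->dirty_bufs = 0;           /* then flush the window */
	} else if (!fs->soft_ro) {
		/* A whole interval passed with an empty window: mark clean. */
		fs->sb_writes++;
		fs->soft_ro = true;
	}
	assert(fs->dirty_bufs == 0);          /* window is empty after a tick */
}
```

Under steady load every tick finds dirty data, so the superblock is
written once on the way out of soft read-only and not again until a
full interval goes by idle.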


> > The difference is SFT (*Software* Fault Tolerance); that's why Novell
> > is still making money in the server market (or at least one of the
> > reasons).
> 
> How many Novell servers have you seen without a UPS behind them?

Generally, or at Novell?  Generally, quite a few.

Novell has this luxury because their threading is cooperative tasking
with explicit yield (unless you yield, all operations run to completion,
so you are never more than one operation away from ground state; like
running with sync mounts on the pre-soft updates FFS).
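
As a rough illustration of "run to completion, then yield", here is a
toy round-robin loop in C.  It is not how NetWare is written; the
ledger, the two operations and coop_run are all hypothetical.  The
point is only that an operation may break an invariant in the middle,
but since nothing can preempt it, the invariant holds again at every
yield point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared state; the invariant is that a + b never changes. */
struct ledger {
	int a;
	int b;
};

/* A task: some number of operations, each run to completion. */
struct coop_task {
	int remaining;                 /* operations left to run */
	void (*op)(struct ledger *);   /* returning from op is the yield */
};

void a_to_b(struct ledger *l)
{
	l->a--;   /* invariant is broken here ... */
	l->b++;   /* ... and restored before we yield */
}

void b_to_a(struct ledger *l)
{
	l->b--;
	l->a++;
}

/* Round-robin over the tasks; returns how many operations were run. */
int coop_run(struct ledger *l, struct coop_task *tasks, size_t ntasks)
{
	int total = l->a + l->b;
	int ran = 0;
	int progress = 1;
	size_t i;

	while (progress) {
		progress = 0;
		for (i = 0; i < ntasks; i++) {
			if (tasks[i].remaining <= 0)
				continue;
			tasks[i].op(l);   /* no preemption inside the op */
			tasks[i].remaining--;
			ran++;
			progress = 1;
			/* Ground state: the invariant holds at each yield. */
			assert(l->a + l->b == total);
		}
	}
	return ran;
}
```

If the power goes away between two iterations of that loop, the state
is consistent; the worst case is losing the one operation in flight.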


> Again, MHO is that software should protect against abrupt termination as
> well as it can.  But, it is OK to clearly define the constraints, and say
> ``For this I need at least n seconds of continued processing time''.

"Seconds" is a *long* time.  I was thinking no more than 30uS or even
25uS in those countries too poor to afford 60 sine waves per duty cycle ;-).

If you are thinking about checkpoint/restart -- well, that's a whole
different ballgame.  You will need to either revisit memory overcommit,
or have a checkpoint reserve equal to the amount of kernel memory plus
a startup reserve to let you restore state (or reserve a set amount
of main RAM for the job, but that's wasteful).


> > If I have to write all the code myself, it'll be a long time before
> > it gets done.  But if I'm serious, I'll write it in vanilla K&R so
> > I can run the C++ branch path analysis tool from the comp.unix.sources
> > archives on it.
> 
> I may be blind, and behind the times, but, aside from formal prototypes, I
> fail to see what really improved in the C language since K&R.

That's not the point.  The point is that I can automatically generate
code coverage tests for K&R C but not for ANSI C.  8-(.


> > One component that's being overlooked here is QA as opposed to QC.
> 
> QC is management measurable (almost).  QA is more of a moral issue.

This is where the people who hate ISO 9000 begin to hate it.  The
management involvement always seems to take the form of "what would
we like to have measurements on" rather than "what is measurable".
I dread this type of QC management.  QA is more "how can I guarantee
that the code matches the intent of the code".

This is a discussion we should take offline, unless there is a separate
list where it's appropriate.


> The weakness of this environment is that, at times, we bite off more than we
> can chew, and that, in FreeBSD in particular, our efforts are diffused;
> we work on a lot of different big things.  Instead, we should try to form
> task forces which work on specific things, broadening our expertise level,
> and ensuring maturity of features, rather than count.

Well, organize away!  ...8-)

> > I would definitely like to see someone produce a PrestoServ card for
> > FreeBSD (for example).  This would get it into a hell of a lot of
> > traditionally "big iron" shops.
> 
> And what is a PrestoServ card?

Battery backed RAM for stable storage of NFS writes.

Uh... "hard updates"... 8-).


> > The whole fault tolerance issue is "how do I make small iron look like
> > big iron without getting Tony Overfield to redesign the PC?".
> > 
> > 8-).
> 
> And who is Tony Overfield?  You are talking to some ignorant audience here

Engineer at Dell.  Argues hardware and BIOS, occasionally.  Good source
for feedback about "how do I talk to PC hardware instead of non-perverse
hardware".


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
