Date:      Sun, 12 May 1996 16:05:39 -0700 (MST)
From:      Terry Lambert <terry@lambert.org>
To:        perry@alpha.jpunix.com (John A. Perry)
Cc:        terry@lambert.org, questions@freebsd.org
Subject:   Re: async filesystems
Message-ID:  <199605122305.QAA08690@phaeton.artisoft.com>
In-Reply-To: <Pine.NEB.3.93.960511212342.221A-100000@alpha.jpunix.com> from "John A. Perry" at May 11, 96 09:28:24 pm

> 	OK. I read the man page as you suggested and now I know it's a
> dangerous option. The question still remains, what is it, what does it do
> for you, why use it, etc? I'm not asking the question because I feel a
> need to waste bandwidth. I'm asking because I want to know more about it.
> A little more verbosity other than "read the man page" might even educate
> others if you're not careful.


The option causes metadata writes to return after they have been
scheduled but before they have completed.

This speeds up things like the POSIX access time updates on
directories and files which are being statted: the updates can
be aggregated instead of each one waiting to be committed to
disk.

This, in itself, is not a bad thing, because I don't totally agree
with how FFS interprets POSIX.

It also speeds up things like directory entry modification,
inode allocation, block allocation in inodes, file size updates
in inodes following block allocation, and so on.

This is evil, because it means that a system failure of any kind
will no longer be deterministically recoverable.  You could get
to a "working state" from the "crash state", but whether this
is "the right state" is another question.


So, in order:

What is it:

	Disabling of the ordering guarantees that ensure that
	a file system can be correctly recovered following a
	failure (crash, kernel panic, power failure, or hardware
	failure).
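
To make "ordering guarantees" concrete, here is a toy model in
Python.  It is only a sketch under stated assumptions, not how FFS
is implemented: it pretends that creating a file takes exactly two
metadata writes (the names init_inode and add_dirent are made up
for illustration) and asks whether every possible crash point
leaves a state in which no directory entry refers to an
uninitialized inode.

```python
import itertools

# Hypothetical toy model: creating a file needs two metadata writes.
WRITES = ("init_inode", "add_dirent")


def disk_after_crash(order, completed):
    """Writes that reached disk if the crash hits after `completed` of them."""
    return set(order[:completed])


def is_consistent(on_disk):
    """A directory entry must never point at an uninitialized inode."""
    return not ("add_dirent" in on_disk and "init_inode" not in on_disk)


def crash_states(orders):
    """Enumerate every (order, crash point) pair and check consistency."""
    results = []
    for order in orders:
        for completed in range(len(order) + 1):
            on_disk = disk_after_crash(order, completed)
            results.append((order, completed, is_consistent(on_disk)))
    return results


# Ordered updates: the inode always reaches disk before the entry.
ordered = [WRITES]
# Unordered (async) updates: the writes may reach disk in any order.
unordered = list(itertools.permutations(WRITES))

ordered_ok = all(ok for _, _, ok in crash_states(ordered))
unordered_ok = all(ok for _, _, ok in crash_states(unordered))

print("ordered always consistent:  ", ordered_ok)
print("unordered always consistent:", unordered_ok)
```

With the order fixed, every crash point in this model is
recoverable; once the order is free, at least one crash point
leaves a directory entry naming garbage.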

What does it do for you:

	It makes certain types of worst case file system accesses
	faster than if you were using sync writes to make your
	ordering guarantees.  The most notable cases are:

	o	Installation -- where the trade is between
		starting over and it taking twice as long,
		or probably not having a failure during install
		and it taking ~20% less time.

	o	Restore from backup -- same tradeoff.

	o	Temporary FS's that you plan to recreate at
		boot time anyway.

	o	News spools which are not used for locally
		created articles (since they would be lost in
		event of a crash).

	o	The lmbench 1000 file create/delete test, which
		was designed to make Linux look good compared
		to BSD, taking advantage of sync vs. async
		speed, and incorrectly stating that this was
		representative of a typical compilation, with
		large numbers of small temporary files (typical
		compilation uses pipes, not temp files).

	It also lets you be lazy regarding the need to implement
	faster ordering mechanisms than sync writes.  Programmers
	use async instead of implementing Delayed Ordered Writes
	(used by SVR4 UFS), or soft updates.
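
For a feel of the kind of workload the create/delete case above
stresses, here is a rough Python sketch.  It is not the lmbench
code; the function name create_delete and the file naming scheme
are invented for illustration, and the number it prints depends
entirely on the file system and mount options under the temporary
directory.

```python
import os
import tempfile
import time


def create_delete(directory, count=1000):
    """Create `count` empty files, unlink them, return elapsed seconds."""
    names = [os.path.join(directory, f"f{i:04d}") for i in range(count)]
    start = time.perf_counter()
    for name in names:
        # Each create touches a directory entry and an inode.
        with open(name, "w"):
            pass
    for name in names:
        # Each unlink modifies the directory and frees the inode.
        os.unlink(name)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as scratch:
    elapsed = create_delete(scratch)
    leftover = os.listdir(scratch)

print(f"1000 creates + 1000 deletes: {elapsed:.3f}s")
```

Nearly all of the work here is metadata updates, which is why the
result swings so much between sync and async metadata writes.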

Why use it:

	o	You want to gain the advantages listed above

	o	You don't care if you have to reinstall your
		machine in order to get a small performance
		improvement in certain atypical usage cases

	o	You need one or both of the previous 2 reasons,
		but you are too lazy to implement a technically
		correct solution to the problem, so you fly by
		the seat of your pants instead.

In general, there are *well known* methods for implementing ordering
guarantees that get near async speeds (soft updates, for instance,
are within 5% of memory speeds, which is, in some cases of buffer
contention, even faster than vanilla async scheduling).


Correspondingly, it has *NOTHING* to do with whether or not async
I/O operations on files as container objects will work.  It has
*nothing* to do with sync(), fsync(), O_SYNC, O_WRITESYNC, or
the functions aioread/aiowrite/aiowait/aiocancel.
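
For contrast, this is roughly what the per-file interface looks
like from a program, sketched in Python.  The helper name
write_durably is made up for illustration.  It asks the kernel to
flush one file's data with fsync(); that is a request made by the
application about a file it has open, and is a separate matter
from how the file system orders its own metadata writes.

```python
import os
import tempfile


def write_durably(path, data):
    """Write `data` (bytes) to `path`, then ask the kernel to flush it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested; loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Per-file flush requested by the application.
        os.fsync(fd)
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.dat")
    write_durably(demo_path, b"hello")
    with open(demo_path, "rb") as fh:
        demo_contents = fh.read()

print(demo_contents)
```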


Think of it as being similar to enabling interrupts during IDE
DMA's: sure, it's faster, but it puts you at great risk of error
in some situations.  By default, you shouldn't assume that Intel
has fixed their RZ1000 chip.  The same is true of async metadata
updates (which is what the option really means): it should be
there for people who know they won't crash and burn if they
turn it on.  But it should be off by default because of the
"anything that works is better than anything that doesn't" rule.

PS: this was covered in great gory detail in Usenet news regarding
advanced FS design.  The real question is "ordered vs. unordered
metadata updates and the effect on FS failure recovery".  Sync
writes just happen to be one of several methods (and the only
one implemented in Linux and BSD, at least in public sources)
to guarantee ordered metadata updates.


It's expected that the "async" option will go away after the
integration of soft updates, since it will be just as fast (or
faster, for some cases) as async (unordered) updates, and won't
leave your butt hanging out in the wind like async does.


Hope this answers your questions...


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.


