Date:      Sun, 14 Apr 2002 00:15:12 +0200
From:      "Patrick O'Reilly" <bsd@perimeter.co.za>
To:        "Gregory Keefe" <keefeg@keefeg.com>, <freebsd-questions@FreeBSD.ORG>
Subject:   Re: Softupdates FileSystem
Message-ID:  <200204132258.2568@.perimeter.co.za>
In-Reply-To: <001401c1e31d$bf0cd310$9865fea9@GPC>
References:  <001401c1e31d$bf0cd310$9865fea9@GPC>

On Sat 13 Apr 02 21:02, Gregory Keefe wrote:
> FreeBSD Claim:
> http://www.freebsd.org/features.html
> Soft Updates allows improved file system performance without
> sacrificing safety and reliability
>
> A Unix Expert's Claim:
> http://cr.yp.to/qmail/faq/reliability.html
> ``Do not use async or softupdates filesystems. If you do, and if your
> system crashes at the wrong moment, you will lose [data].''
>
> http://cr.yp.to/daemontools/multilog.html
> ``Beware that NFS, async filesystems, and softupdates filesystems may
> discard files that were not safely written to disk before an
> outage.''
>
> Which should I believe?
>

Gregory,

I respond to this question as, shall we say, a layman.  So, take my 
comments from whence they come.  I welcome comments from any REAL 
EXPERTS on this subject.

I am taking the time to do this because I find it 
incredibly irritating that authors of supposedly reputable sources 
insist on tainting facts with insinuations which the average reader 
will just swallow because it's there in black and white.  The fact that 
you are taking the effort to pose this question here indicates that you 
are not the average reader - good for you.

First, all of the above statements are true.  However, the tone in 
which they are written may imply more than the truth, and it certainly 
attempts to undermine the reader's faith in the named filesystems.  
And, if a list of filesystems is painted with one brush, then surely 
the list should be complete - otherwise (and obviously intentionally) 
the reader is led to assume that any filesystems NOT listed are 
materially different. BUT they might not be!

In terms of file system integrity, let's work through the most common 
options:

1) Synchronous fs (filesystems) write data in a blocking mode.  This 
means that when data is to be written, the process doing the writing 
will wait till the hardware confirms that the write of both data and 
meta-data has physically taken place.
Evaluation?:
Good: Once the process completes the write, it is SAFE.
Bad:  This is slow because we wait for physical disk access.
Risk: Low.  Failure during physical writing will cause corruption.
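
As a rough illustration of what option 1 looks like from the writing 
process's side, here is a minimal Python sketch.  The helper name 
write_durably is made up for this example; os.fsync is the standard 
call that blocks until the kernel reports the file flushed.  This is 
only a sketch: a drive with its own write cache may still be holding 
the data, and on some systems the directory entry needs its own fsync.

```python
import os
import tempfile


def write_durably(path: str, payload: bytes) -> int:
    """Write payload and block until the kernel reports it flushed.

    Hypothetical helper for illustration; returns the bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(payload):
            # os.write may write fewer bytes than asked, so loop.
            written += os.write(fd, payload[written:])
        # The process waits here, which is the "slow but safe" part.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "message.txt")
        print(write_durably(target, b"queued mail\n"))
```

The cost described under "Bad" is the time spent inside os.fsync: the 
process does nothing else until that call returns.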

2) Asynchronous fs write data in non-blocking mode, as and when it 
seems convenient.  Depending on the implementation, that physical 
transfer of data and meta-data to the disk may happen at any time and 
in any sequence AFTER the process has already proceeded with other 
processing.
Evaluation?:
Good: Fast - very Fast. The process does not wait on the physical disk.
Bad:  Window of opportunity for corruption becomes very large. Failure 
at any time between logical write and physical write may (almost 
certainly will) cause corruption.
Risk: High. It is directly proportional to the latency between logical 
and physical writes.
This is a killer combination - High risk of Corruption!
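
The gap between the logical write and the physical write can be seen 
in miniature with an ordinary buffered file in Python.  This is only 
an analogy: the buffer here lives in the process, not in the kernel's 
cache, and the function name sizes_around_flush is invented for the 
example.  The point is simply that the write call returns before 
anything has reached the file.

```python
import os
import tempfile


def sizes_around_flush(path: str, payload: bytes):
    """Return the on-disk file size before and after flushing a buffer.

    Assumes payload is smaller than the 64 KiB buffer requested below.
    """
    with open(path, "wb", buffering=1 << 16) as f:
        f.write(payload)                  # "logical" write returns at once
        before = os.path.getsize(path)    # payload is still in the buffer
        f.flush()                         # hand the bytes to the OS
        after = os.path.getsize(path)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(sizes_around_flush(os.path.join(tmp, "demo.bin"), b"x" * 10))
```

If the process died between the write and the flush, the payload would 
be gone.  An async fs has the same kind of window, one level down.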

So far, so good.  Very few people will debate the above issues too 
much.  The fun starts when we start to compare various methods of 
trying to achieve the best of both the above worlds.  In other words, 
we would like to have the following:
Evaluation?:
Good: Once the process completes the write, it is SAFE.
Good: Fast - very Fast. The process does not wait on the physical disk.
Risk: Low, even None.

There are two main schools of thought in addressing this paradoxical 
requirement:
3) Journaling File Systems
4) BSD's Softupdates

3) Journaling fs implement a "journal" to co-ordinate disk activity.  
The methods for doing this are as varied as the authors of the systems, 
but essentially they all do something like this:
a) open a journal which records every request for disk access.  Since 
this journal is permanently open, access to it is very fast.  Some 
journals, I believe, even reside in non-volatile memory specifically to 
ensure speedy access.
b) writes are asynchronous - the process can carry on while the fs 
takes care of writing to disk.
c) data and meta-data are transferred to physical disk in any convenient 
sequence, and once the hardware "signs off" on all the parts of the 
job, the journal entries are either deleted or marked as complete.

So, what is the evaluation?
Good: Fast - very Fast. The process does not wait on the physical disk.
Good: The journal will ensure that the data and meta-data are written 
safely to disk.  If failure should occur during this process, the 
journal can be used to restore the fs to a stable state.  On restart of 
the fs, any incomplete, or outstanding, journal entries are examined, 
and the partially updated (corrupt) data can be cleared out.
Bad:  Although there is no risk of corruption at the file system level, 
there is risk of data loss between logical write and physical write.
Risk: High. It is directly proportional to the latency between logical 
and physical writes.  But, the risk is of data loss only, NOT of fs 
corruption.
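
A toy model may make the recovery step more concrete.  The sketch 
below assumes a redo-style journal, which is one common design and not 
necessarily what any particular journaling fs does: changes are logged 
first, a "commit" record marks a complete set, and in-place writes only 
happen after the commit.  The record layout, keys and the name recover 
are all invented for illustration.

```python
from typing import Dict, List, Tuple

# (op, transaction id, key, value); op is "write" or "commit".
Record = Tuple[str, int, str, str]


def recover(journal: List[Record], disk: Dict[str, str]) -> Dict[str, str]:
    """Replay committed transactions; ignore ones that never committed."""
    committed = {txn for op, txn, _, _ in journal if op == "commit"}
    recovered = dict(disk)
    for op, txn, key, value in journal:
        if op == "write" and txn in committed:
            recovered[key] = value
    return recovered


if __name__ == "__main__":
    journal = [
        ("write", 1, "inode:7", "size=12"),
        ("write", 1, "block:40", "queued mail"),
        ("commit", 1, "", ""),
        ("write", 2, "inode:8", "size=5"),  # crash before this committed
    ]
    print(recover(journal, {"inode:7": "size=0"}))
```

Transaction 2 never committed, so nothing of it survives.  That is the 
data loss described above: the fs is consistent afterwards, but the 
last write is gone.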

4) Softupdates fs implements a "sequence manager" (my terminology:) to 
co-ordinate disk activity.  Essentially it does something like this:
a) open a queue (in memory I think) which records every request for 
disk access.  Since this queue is permanently open, and memory based, 
access to it is very fast.
b) writes are asynchronous - the process can carry on while the fs 
takes care of writing to disk.
c) data and meta-data are transferred to physical disk. Here is where 
the "sequence manager" is important. It will sequence all the physical 
activity such that all new data is written to the disk first, and the 
meta-data updates are only completed thereafter.  Therefore, once the 
meta-data write completes, we can be sure that the entire set of data 
is safely on disk, and the queue entries are then deleted.

So, what is the evaluation?
Good: Fast - very Fast. The process does not wait on the physical disk.
Good: The "sequence manager" will ensure that the data and meta-data 
are written safely to disk.  If failure should occur during this 
process, the fs is actually in a stable state, logically, though not 
physically.  On restart of the fs, a clean-up process may be run in the 
background which will search the disk for data blocks which do not have 
corresponding meta-data, and these blocks can be scrubbed.  Or it can 
simply be ignored (I think), as these data blocks will be re-used in 
due course anyway.
Bad:  Although there is no risk of corruption at the file system level, 
there is risk of data loss between logical write and physical write.
Risk: High. It is proportional to the latency between logical and 
physical writes.  But, the risk is of data loss only, NOT of fs 
corruption.
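
The ordering idea can be checked with a small simulation.  This is a 
deliberately simplified model of the description above (all data 
before all meta-data), not the real Softupdates dependency tracking, 
and every name in it is hypothetical.  A "crash" is modelled as only 
the first few queued writes reaching the disk.

```python
from typing import Dict, List, Set, Tuple

# (kind, key, value); kind is "data" (block) or "meta" (name -> block).
Write = Tuple[str, str, str]
Disk = Dict[str, Dict[str, str]]


def order_writes(pending: List[Write]) -> List[Write]:
    """Put every data write ahead of every meta-data write (stable sort)."""
    return sorted(pending, key=lambda w: 0 if w[0] == "data" else 1)


def disk_after(writes: List[Write], completed: int) -> Disk:
    """State of the disk if power fails after `completed` writes."""
    disk: Disk = {"data": {}, "meta": {}}
    for kind, key, value in writes[:completed]:
        disk[kind][key] = value
    return disk


def dangling(disk: Disk) -> Set[str]:
    """Blocks that meta-data points at but that never reached the disk."""
    return set(disk["meta"].values()) - set(disk["data"])


def orphans(disk: Disk) -> Set[str]:
    """Blocks on disk that no meta-data refers to (clean-up candidates)."""
    return set(disk["data"]) - set(disk["meta"].values())


if __name__ == "__main__":
    pending = [("meta", "mail.1", "b1"), ("data", "b1", "hello"),
               ("meta", "mail.2", "b2"), ("data", "b2", "world")]
    ordered = order_writes(pending)
    for n in range(len(ordered) + 1):
        state = disk_after(ordered, n)
        print(n, sorted(dangling(state)), sorted(orphans(state)))
```

With the ordered queue no crash point leaves meta-data pointing at a 
missing block; the worst case is orphaned blocks for the clean-up 
pass.  Feeding the unordered queue to disk_after shows the async case, 
where a crash after the first write leaves "mail.1" pointing at nothing.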

---------

Enough already!

As you can see, there really is not much to choose between the two 
methods.  Both are trying to achieve the same result, and enjoying 
similar results, though the method is somewhat different.  The 
Softupdates method does provide a little more scope for efficiency: 
since disk writes are being resequenced anyway, the opportunity can be 
used to group writes per disk track, etc., thereby minimising the 
physical effort involved in the transfer of data.

I recall seeing a set of benchmarks about six months ago, though I do 
not recall where :( , which compared synchronous, asynchronous, 
Softupdates, and about 4 different journaling implementations.  
Asynchronous was quickest at virtually everything, hands-down.  
Synchronous was a dog.  In between were Softupdates and the various 
journaling file systems.  Softupdates was fastest (excluding async, 
obviously) in many cases, but not all, and not by very large margins 
either.

So back to your quotes:

> FreeBSD Claim:
> http://www.freebsd.org/features.html
> Soft Updates allows improved file system performance without
> sacrificing safety and reliability

In the context of preventing fs corruption, this is absolutely true.

> A Unix Expert's Claim:
> http://cr.yp.to/qmail/faq/reliability.html
> ``Do not use async or softupdates filesystems. If you do, and if your
> system crashes at the wrong moment, you will lose [data].''

In the context of ensuring your data is on disk instantaneously, this 
is true.  But then the author should also have listed journaling file 
systems!!!!  I also find it very biased that softupdates and async are 
mentioned in one sentence, as if they are the same type of technology.  
Async fs is known to be fast but risky, with no attempt to guarantee 
integrity.  Softupdates goes to great lengths to ensure integrity.

> http://cr.yp.to/daemontools/multilog.html
> ``Beware that NFS, async filesystems, and softupdates filesystems may
> discard files that were not safely written to disk before an
> outage.''

Ditto the above.  And NFS - what the heck does that have to do with a 
discussion of REAL fs integrity?  I say it again, some authors will 
stoop to any level to cast aspersions on something they do not like, or 
perhaps do not understand.

<disclaimer> This is my opinion, as a self-taught "lay person".  Take 
it from whence it comes. </disclaimer>

I look forward to hearing more comments on this subject. :)

-- 
Regards,
Patrick O'Reilly.
    ___        _            __
   / _ )__ __ (_)_ __ ___ _/ /____ __
  / __/ -_) _) /  ~  ) -_), ,-/ -_) _)
 /_/  \__/_//_/_/~/_/\__/ \__/\__/_/
    http://www.perimeter.co.za




