Date: Sun, 14 Apr 2002 00:15:12 +0200
From: "Patrick O'Reilly" <bsd@perimeter.co.za>
To: "Gregory Keefe" <keefeg@keefeg.com>, <freebsd-questions@FreeBSD.ORG>
Subject: Re: Softupdates FileSystem
Message-ID: <200204132258.2568@.perimeter.co.za>
In-Reply-To: <001401c1e31d$bf0cd310$9865fea9@GPC>
References: <001401c1e31d$bf0cd310$9865fea9@GPC>
On Sat 13 Apr 02 21:02, Gregory Keefe wrote:
> FreeBSD Claim:
> http://www.freebsd.org/features.html
> Soft Updates allows improved file system performance without
> sacrificing safety and reliability
>
> A Unix Expert's Claim:
> http://cr.yp.to/qmail/faq/reliability.html
> ``Do not use async or softupdates filesystems. If you do, and if your
> system crashes at the wrong moment, you will lose [data].''
>
> http://cr.yp.to/daemontools/multilog.html
> ``Beware that NFS, async filesystems, and softupdates filesystems may
> discard files that were not safely written to disk before an
> outage.''
>
> Which should I believe?
>

Gregory,

I respond to this question as, shall we say, a layman. So, take my
comments from whence they come. I welcome comments from any REAL EXPERTS
on this subject.

I am taking the time to do this because I find it incredibly irritating
that authors for supposedly reputable sources insist on tainting facts
with insinuations which the average reader will just swallow because it
is there in black and white. The fact that you are taking the effort to
pose this question here indicates that you are not the average reader -
good for you.

First, all of the above statements are true. However, the tone in which
they are written may imply more than the truth, and it certainly
attempts to undermine the reader's faith in the named filesystems. And,
if a list of filesystems is painted with one brush, then surely the list
should be complete - otherwise (and obviously intentionally) the reader
is led to assume that any filesystems NOT listed are materially
different. BUT they might not be!

In terms of file system integrity, let's work through the most common
options:

1) Synchronous fs (filesystems) write data in a blocking mode. This
means that when data is to be written, the process doing the writing
will wait until the hardware confirms that the write of both data and
meta-data has physically taken place.
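The blocking behaviour described in 1) has an application-level
analogue that may help make it concrete. The sketch below is only an
illustration, not what a synchronous mount does internally: the helper
name write_durably is hypothetical, and it assumes that os.fsync() on
the platform really does wait for the device. It flushes the file
itself; whether the directory entry is also on disk depends on the
filesystem.

```python
import os


def write_durably(path: str, payload: bytes) -> int:
    """Write payload to path and block until the OS reports it flushed.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write may write fewer bytes than requested, so loop.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # The caller waits here until the kernel says the file is on disk.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

The process pays for the wait inside os.fsync(), which is exactly the
cost weighed in the evaluation that follows.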
Evaluation?:
Good: Once the process completes the write, it is SAFE.
Bad: This is slow because we wait for physical disk access.
Risk: Low. Only a failure during the physical write itself will cause
corruption.

2) Asynchronous fs write data in non-blocking mode, as and when it seems
convenient. Depending on the implementation, the physical transfer of
data and meta-data to the disk may happen at any time and in any
sequence AFTER the process has already proceeded with other processing.

Evaluation?:
Good: Fast - very Fast. The process does not wait on the physical disk.
Bad: Window of opportunity for corruption becomes very large. Failure
at any time between logical write and physical write may (almost
certainly will) cause corruption.
Risk: High. It is directly proportional to the latency between logical
and physical writes. This is a killer combination - High risk of
Corruption!

So far, so good. Very few people will debate the above issues too much.
The fun starts when we start to compare various methods of trying to
achieve the best of both the above worlds. In other words, we would like
to have the following:

Evaluation?:
Good: Once the process completes the write, it is SAFE.
Good: Fast - very Fast. The process does not wait on the physical disk.
Risk: Low, even None.

There are two main schools of thought in addressing this paradoxical
requirement:
3) Journaling File Systems
4) BSD's Softupdates

3) Journaling fs implement a "journal" to co-ordinate disk activity. The
methods for doing this are as varied as the authors of the systems, but
essentially they all do something like this:
a) open a journal which records every request for disk access. Since
this journal is permanently open, access to it is very fast. Some
journals, I believe, even reside in non-volatile memory specifically to
ensure speedy access.
b) writes are asynchronous - the process can carry on while the fs takes
care of writing to disk.
c) data and meta-data are transferred to physical disk in any convenient
sequence, and once the hardware "signs off" on all the parts of the job,
the journal entries are either deleted or marked as complete.

So, what is the evaluation?
Good: Fast - very Fast. The process does not wait on the physical disk.
Good: The journal will ensure that the data and meta-data are written
safely to disk. If failure should occur during this process, the journal
can be used to restore the fs to a stable state. On restart of the fs,
any incomplete, or outstanding, journal entries are examined, and the
partially updated (corrupt) data can be cleared out.
Bad: Although there is no risk of corruption at the file system level,
there is risk of data loss between logical write and physical write.
Risk: High. It is directly proportional to the latency between logical
and physical writes. But, the risk is of data loss only, NOT of fs
corruption.

4) Softupdates fs implements a "sequence manager" (my terminology :) to
co-ordinate disk activity. Essentially it does something like this:
a) open a queue (in memory I think) which records every request for disk
access. Since this queue is permanently open, and memory based, access
to it is very fast.
b) writes are asynchronous - the process can carry on while the fs takes
care of writing to disk.
c) data and meta-data are transferred to physical disk. Here is where
the "sequence manager" is important. It will sequence all the physical
activity such that all new data is written to the disk first, and the
meta-data updates are only completed thereafter. Therefore, once the
meta-data write completes, we can be sure that the entire set of data is
safely on disk, and the queue entries are then deleted.

So, what is the evaluation?
Good: Fast - very Fast. The process does not wait on the physical disk.
Good: The "sequence manager" will ensure that the data and meta-data are
written safely to disk.
If failure should occur during this process, the fs is actually in a
stable state, logically, though not physically. On restart of the fs, a
clean-up process may be run in the background which will search the disk
for data blocks which do not have corresponding meta-data, and these
blocks can be scrubbed. Or they can simply be ignored (I think), as
these data blocks will be re-used in due course anyway.
Bad: Although there is no risk of corruption at the file system level,
there is risk of data loss between logical write and physical write.
Risk: High. It is proportional to the latency between logical and
physical writes. But, the risk is of data loss only, NOT of fs
corruption.

---------

Enough already!

As you can see, there really is not much to choose between the two
methods. Both are trying to achieve the same goal, and enjoying similar
results, though the method is somewhat different. The Softupdates method
does provide a little more scope for efficiency: since disk writes are
being resequenced anyway, the opportunity can be used to group writes
per disk track, etc, thereby minimising the physical effort involved in
the transfer of data.

I recall seeing a set of benchmarks about six months ago, though I do
not recall where :( , which compared synchronous, asynchronous,
Softupdates, and about 4 different journaling implementations.
Asynchronous was quickest at virtually everything, hands-down.
Synchronous was a dog. In between were Softupdates and the various
journaling file systems. Softupdates was fastest (excluding async,
obviously) in many cases, but not all, and not by very large margins
either.

So back to your quotes:

> FreeBSD Claim:
> http://www.freebsd.org/features.html
> Soft Updates allows improved file system performance without
> sacrificing safety and reliability

In the context of preventing fs corruption, this is absolutely true.

> A Unix Expert's Claim:
> http://cr.yp.to/qmail/faq/reliability.html
> ``Do not use async or softupdates filesystems.
> If you do, and if your system crashes at the wrong moment, you will
> lose [data].''

In the context of ensuring your data is on disk instantaneously, this is
true. But then the author should also have listed journaling file
systems!!!! I also find it very biased that softupdates and async are
mentioned in one sentence, as if they are the same type of technology.
Async fs is known to be fast, but risky, with no attempt to guarantee
integrity. Softupdates goes to great lengths to ensure integrity.

> http://cr.yp.to/daemontools/multilog.html
> ``Beware that NFS, async filesystems, and softupdates filesystems may
> discard files that were not safely written to disk before an
> outage.''

Ditto the above. And NFS - what the heck does that have to do with a
discussion of REAL fs integrity?

I say it again, some authors will stoop to any level to cast aspersions
on something they do not like, or perhaps do not understand.

<disclaimer> This is my opinion, as a self-taught "lay person". Take it
from whence it comes </disclaimer>

I look forward to hearing more comments on this subject. :)

--
Regards,
Patrick O'Reilly.
http://www.perimeter.co.za

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
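The ordering argument in 4) above can be checked with a toy model. This
is only a sketch under loud assumptions: the "disk" is a pair of Python
dicts, every name here (apply_until_crash, is_consistent, ordered, the
file "mail.txt", the blocks "b1" and "b2") is made up for illustration,
and real Soft Updates tracks far finer-grained dependencies than "data
before meta-data". It simply replays a list of pending writes, crashes
after every possible prefix, and asks whether meta-data ever points at a
block that never reached the disk.

```python
def apply_until_crash(ops, crash_after):
    """Apply the first crash_after operations to an empty toy disk."""
    disk = {"data": {}, "meta": {}}
    for kind, key, value in ops[:crash_after]:
        disk[kind][key] = value
    return disk


def is_consistent(disk):
    """Meta-data must never reference a block that is not on disk."""
    return all(
        block in disk["data"]
        for blocks in disk["meta"].values()
        for block in blocks
    )


def ordered(ops):
    """Resequence so all data writes come before any meta-data write."""
    # sorted() is stable, so the relative order within each kind is kept.
    return sorted(ops, key=lambda op: op[0] != "data")


def survives_every_crash(ops):
    """True if the disk is consistent after a crash at any point."""
    return all(
        is_consistent(apply_until_crash(ops, n))
        for n in range(len(ops) + 1)
    )


# Pending writes in an arbitrary ("async") order: meta-data happens first.
pending = [
    ("meta", "mail.txt", ["b1", "b2"]),
    ("data", "b1", b"hello "),
    ("data", "b2", b"world"),
]
```

In this model survives_every_crash(pending) is False, because a crash
right after the first write leaves "mail.txt" pointing at blocks that do
not exist. survives_every_crash(ordered(pending)) is True, yet a crash
after the two data writes leaves no "mail.txt" at all: data loss without
corruption, which is the distinction drawn in the message.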