Date:      Thu, 24 Jul 2003 01:23:51 -0700
From:      Terry Lambert <tlambert2@mindspring.com>
To:        Brad Knowles <brad.knowles@skynet.be>
Cc:        FreeBSD Chat Mailing List <freebsd-chat@FreeBSD.ORG>
Subject:   Re: maildir with softupdates
Message-ID:  <3F1F9797.770751C8@mindspring.com>
References:  <3F1E6456.9090400@fsn.hu> <a0600120abb447b7be0fb@[10.0.1.2]>

Brad Knowles wrote:
>         Moreover, the software not only needs to issue an fsync() on the
> file, it also needs to issue an fsync() on the directory, in order to
> have reasonable guarantees that the data has been safely written.  My
> recollection is that, with fsync() on the file and fsync() on the
> directory, softupdates is actually safe for these kinds of
> applications (at least, the filesystem won't be left in an
> inconsistent state), whereas ext3fs or other filesystems might not be.

This is incorrect.  Technically, you can comply with POSIX with
an implementation for directories which does not involve the
directory data being stored in a file abstraction, and therefore
making it impossible to "fsync() a directory" on such a system;
VMS is one example of such a system that does not implement the
directories as normal files, and for which it is therefore not
possible to obtain an fd for the directory on which an fsync()
may operate.

POSIX semantics from both Section 2 ("Compliance") and from the
"Rationale" and "Corrigenda" sections specifically state the
metadata semantics for files, when it comes to "SHALL be updated"
vs. "SHALL be marked for update" vs. no stated semantics.

A system which does not guarantee metadata integrity, and ordering
of metadata vs. data when a file in a directory is itself fsync()'ed
while there is an incomplete operation in progress, is *NOT*
compliant with the POSIX or IEEE-1003.1-2001 or Single UNIX
Specification standards.

There is no other interpretation possible:

    http://www.opengroup.org/onlinepubs/007904975/functions/fchdir.html

    "A conforming application can obtain a file descriptor for
     a file of type directory using open(), provided that the
     file status flags and access modes do not contain O_WRONLY
     or O_RDWR."
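
As an illustration of what that wording leaves an application able
to do, here is a minimal sketch in C.  The helper name
sync_directory() is hypothetical and is not taken from any source in
this thread.  It opens a directory read-only, which is the only
access mode the text above allows, and then attempts fsync() on the
descriptor; whether that call succeeds is up to the implementation,
so the sketch simply reports the errno value it gets back.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for POSIX.1-2001 declarations */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to fsync() a directory through a read-only
 * descriptor.  Returns 0 on success, or the errno value from whichever
 * call failed.
 */
int sync_directory(const char *dirpath)
{
    int fd, saved;

    assert(dirpath != NULL);

    /* O_RDONLY is the only access mode available for a directory. */
    fd = open(dirpath, O_RDONLY);
    if (fd == -1)
        return errno;

    /* The outcome is system dependent: success on some systems,
     * an error such as EBADF, EINVAL or EIO on others. */
    saved = (fsync(fd) == -1) ? errno : 0;

    close(fd);
    return saved;
}
```

A portable caller cannot treat a nonzero return here as proof that
the data is unsafe, only as a sign that this particular system does
not support the operation on a directory descriptor.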


>         I know that sendmail is safe on softupdates (indeed, softupdates
> is recommended), but I also recall that some source modifications
> were required to have it do an fsync() on both the file and the
> directory, before it was safe.

If you examine the Sendmail sources, you will see that it calls
fsync() in queue.c and collect.c.  In both these cases, the call
takes place on "tfp" -- a temporary file pointer.  In no case does
it call fsync on an fd that references a directory.

Sendmail, in other words, expects a POSIX compliant system, in
terms of metadata update ordering semantic guarantees.
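
The pattern being described can be sketched roughly as follows.
This is not Sendmail's code: the function commit_queue_file() and
its parameters are made up for illustration, and only the variable
name "tfp" echoes the sources mentioned above.  The sketch writes a
temporary file, calls fsync() on that file only, and then renames it
into place, relying on the system for the ordering of the directory
update.

```c
#define _POSIX_C_SOURCE 200112L  /* for fileno() and fsync() */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write body to tmppath, force it to stable
 * storage, then rename it to finalpath.  Returns 0 on success and
 * -1 on failure.  Note that no directory is ever fsync()'ed.
 */
int commit_queue_file(const char *tmppath, const char *finalpath,
                      const char *body)
{
    FILE *tfp;

    assert(tmppath != NULL && finalpath != NULL && body != NULL);

    tfp = fopen(tmppath, "w");
    if (tfp == NULL)
        return -1;

    /* Flush stdio buffers first, then fsync() the underlying fd. */
    if (fputs(body, tfp) == EOF ||
        fflush(tfp) == EOF ||
        fsync(fileno(tfp)) == -1) {
        fclose(tfp);
        unlink(tmppath);
        return -1;
    }

    if (fclose(tfp) == EOF) {
        unlink(tmppath);
        return -1;
    }

    /* The rename is a metadata operation; the application leaves
     * its ordering relative to the file data to the system. */
    if (rename(tmppath, finalpath) == -1) {
        unlink(tmppath);
        return -1;
    }
    return 0;
}
```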


>         Unfortunately, I don't recall if the fsync()-on-file-and-directory
> trick is enough to make sendmail sufficiently safe on ext3fs.  You'd
> have to ask people who are more knowledgeable with that configuration
> than I am.

It would not be, unless it did an fsync-to-root on intermediate
directories, as well as the queue directory, to ensure all the
extents referred to their committed data from the inferior
directories, instead of referring to the previously committed
data of an intermediate directory, which did not refer to the
newly committed extents.
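
To make "fsync-to-root" concrete, here is a rough sketch of what an
application would have to do.  The names fsync_to_root(),
sync_one_dir() and SKETCH_PATH_MAX are hypothetical, and the sketch
assumes a system on which fsync() on a read-only directory
descriptor is accepted at all, which is not guaranteed.  It starts
at the innermost directory and works upward so that each parent is
flushed after the directory it refers to.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for POSIX.1-2001 declarations */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_PATH_MAX 1024  /* arbitrary limit for this sketch */

/* Open one directory read-only, fsync() it, and close it again. */
static int sync_one_dir(const char *path)
{
    int fd, rc, saved;

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    rc = fsync(fd);
    saved = errno;
    close(fd);
    errno = saved;  /* preserve fsync()'s errno across close() */
    return rc;
}

/*
 * Hypothetical helper: fsync() abs_dir and then every ancestor up to
 * and including "/".  Returns 0 on success, -1 with errno set on the
 * first failure.  Only absolute paths are accepted.
 */
int fsync_to_root(const char *abs_dir)
{
    char buf[SKETCH_PATH_MAX];
    size_t len;

    assert(abs_dir != NULL);

    len = strlen(abs_dir);
    if (abs_dir[0] != '/' || len >= sizeof buf) {
        errno = EINVAL;
        return -1;
    }
    memcpy(buf, abs_dir, len + 1);

    for (;;) {
        char *slash;

        if (sync_one_dir(buf) == -1)
            return -1;
        if (strcmp(buf, "/") == 0)
            return 0;           /* reached the root */

        /* Strip the last component to get the parent directory. */
        slash = strrchr(buf, '/');
        if (slash == buf)
            buf[1] = '\0';      /* parent is the root itself */
        else
            *slash = '\0';
    }
}
```

For a queue directory several levels deep, this means one open(),
fsync() and close() per path component on every delivery, in
addition to the fsync() on the file itself.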


>         In the long run, it all comes down to how much danger you're
> willing to live with, and how much safety you believe is required
> before you are in proper compliance with the protocol specifications.

This is just rationalizing.  In the long run, it comes down to
compliance or noncompliance with international standards; ext3fs
fails to comply with these standards, because it assumes
by default that metadata integrity is the responsibility of the
application, not the OS, and it assumes an implementation detail
that the standards in question are not prepared to allow it to
assume, to wit: that directories are implemented in terms of file
primitives, at the lowest level, and that it is therefore possible
to both (a) get a writeable file descriptor for a directory and
(b) to utilize such a file descriptor in an fsync() call.

In fact, a strict reading of the relevant standards permits an
fsync() on any fd that is not open for write to return EBADF or EIO.


>         If you want to run your e-mail system on a pure RAM disk that has
> no battery backup or UPS, and you're willing to lose all that e-mail
> if the power goes out, then you should be able to do that.  However,
> if you have any customers, you should make operational decisions like
> this known to them, so that they can make their own determination as
> to whether or not you are conforming to the level of service that
> they require.

This is what Hotmail does, in fact, and it makes it clear that
the service is not intended for business use, and that business
use is in fact prohibited, in its service agreement in the Terms
Of Service section.

The fact is, though, that this renders their service non-compliant
with both the international standard RFC-821, and the international
standard RFC-2821 which supersedes the former.

Again, it comes down to compliance with international standards.

-- Terry