From owner-freebsd-chat@FreeBSD.ORG Thu Jul 24 01:25:16 2003
Delivered-To: freebsd-chat@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5394D37B401;
	Thu, 24 Jul 2003 01:25:16 -0700 (PDT)
Received: from heron.mail.pas.earthlink.net
	(heron.mail.pas.earthlink.net [207.217.120.189])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 74C2F43FA3;
	Thu, 24 Jul 2003 01:25:13 -0700 (PDT)
	(envelope-from tlambert2@mindspring.com)
Received: from user-2ivfnj3.dialup.mindspring.com
	([165.247.222.99] helo=mindspring.com)
	by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1) id 19fbPR-0002Io-00; Thu, 24 Jul 2003 01:25:10 -0700
Message-ID: <3F1F9797.770751C8@mindspring.com>
Date: Thu, 24 Jul 2003 01:23:51 -0700
From: Terry Lambert
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Brad Knowles
References: <3F1E6456.9090400@fsn.hu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a445ffeb4013304e48a4c064263d053f04350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c
cc: Attila Nagy
cc: David Schultz
cc: FreeBSD Chat Mailing List
Subject: Re: maildir with softupdates
X-BeenThere: freebsd-chat@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Non technical items related to the community
X-List-Received-Date: Thu, 24 Jul 2003 08:25:16 -0000

Brad Knowles wrote:
> Moreover, the software not only needs to issue an fsync() on the
> file, it also needs to issue an fsync() on the directory, in order to
> have reasonable guarantees that the data has been safely written.
> My recollection is that, with fsync() on the file and fsync() on the
> directory, softupdates is actually safe for these kinds of
> applications (at least, the filesystem won't be left in an
> inconsistent state), whereas ext3fs or other filesystems might not be.

This is incorrect.

Technically, you can comply with POSIX with an implementation of
directories that does not store the directory data in a file
abstraction, which makes it impossible to "fsync() a directory" on
such a system. VMS is one example: it does not implement directories
as normal files, so it is not possible to obtain an fd for the
directory on which an fsync() may operate.

POSIX semantics from both Section 2 ("Compliance") and from the
"Rationale" and "Corrigenda" sections specifically state the metadata
semantics for files, when it comes to "SHALL be updated" vs. "SHALL be
marked for update" vs. no stated semantics.

A system which does not guarantee metadata integrity, and ordering of
metadata vs. data when a file in a directory is itself fsync()'ed
while there is an incomplete operation in progress, is *NOT* compliant
with the POSIX or IEEE-1003.1-2001 or Single UNIX Specification
standards.

There is no other interpretation possible:

http://www.opengroup.org/onlinepubs/007904975/functions/fchdir.html

	"A conforming application can obtain a file descriptor for a
	 file of type directory using open(), provided that the file
	 status flags and access modes do not contain O_WRONLY or
	 O_RDWR."

> I know that sendmail is safe on softupdates (indeed, softupdates
> is recommended), but I also recall that some source modifications
> were required to have it do an fsync() on both the file and the
> directory, before it was safe.

If you examine the Sendmail sources, you will see that it calls
fsync() in queue.c and collect.c. In both these cases, the call
takes place on "tfp" -- a temporary file pointer.
In no case does it call fsync() on an fd that references a directory.

Sendmail, in other words, expects a POSIX compliant system, in terms
of metadata update ordering semantic guarantees.

> Unfortunately, I don't recall if the fsync()-on-file-and-directory
> trick is enough to make sendmail sufficiently safe on ext3fs. You'd
> have to ask people who are more knowledgeable with that configuration
> than I am.

It would not be, unless it did an fsync-to-root on intermediate
directories, as well as the queue directory, to ensure all the extents
referred to their committed data from the inferior directories,
instead of referring to the previously committed data of an
intermediate directory, which did not refer to the newly committed
extents.

> In the long run, it all comes down to how much danger you're
> willing to live with, and how much safety you believe is required
> before you are in proper compliance with the protocol specifications.

This is just rationalizing.

In the long run, it comes down to compliance or non-compliance with
international standards. The ext3fs fails to comply with these
standards, because it assumes by default that metadata integrity is
the responsibility of the application, not the OS, and it assumes an
implementation detail that the standards in question are not prepared
to allow it to assume, to wit: that directories are implemented in
terms of file primitives, at the lowest level, and that it is
therefore possible both (a) to get a writeable file descriptor for a
directory and (b) to utilize such a file descriptor in an fsync()
call.

In fact, a strict reading of the relevant standards permits an fsync()
on any fd that is not open for write to return EBADF or EIO.

> If you want to run your e-mail system on a pure RAM disk that has
> no battery backup or UPS, and you're willing to lose all that e-mail
> if the power goes out, then you should be able to do that.
> However, if you have any customers, you should make operational
> decisions like this known to them, so that they can make their own
> determination as to whether or not you are conforming to the level of
> service that they require.

This is what Hotmail does, in fact, and it makes it clear that the
service is not intended for business use, and that business use is in
fact prohibited, in its service agreement in the Terms Of Service
section.

The fact is, though, that this renders their service non-compliant
with both the international standard RFC-821, and the international
standard RFC-2821 which supersedes the former.

Again, it comes down to compliance with international standards.

-- 
Terry
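
For reference, the fsync()-on-file-and-directory approach discussed
in this thread can be sketched roughly as follows. This is only an
illustration in Python, not code from Sendmail or any MTA; the
function names durable_write() and sync_directory() are hypothetical.
It assumes a system where a directory can be opened read-only and
passed to fsync(), which, as argued above, is not guaranteed
everywhere, so a refusal is reported rather than treated as fatal.

```python
import os
import tempfile


def sync_directory(directory):
    """Try to fsync a directory; return True only if the call succeeded."""
    try:
        # Directories may only be opened without O_WRONLY / O_RDWR.
        dir_fd = os.open(directory, os.O_RDONLY)
    except OSError:
        return False  # assumption: the platform gives no fd for directories
    try:
        os.fsync(dir_fd)
        return True
    except OSError:
        return False  # e.g. EBADF or EIO on an fd not open for writing
    finally:
        os.close(dir_fd)


def durable_write(directory, name, data):
    """Write data to directory/name, fsync the file, then the directory."""
    path = os.path.join(directory, name)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # os.write may write partially
            view = view[written:]
        os.fsync(fd)  # file data and the file's own metadata
    finally:
        os.close(fd)
    # Second step: ask for the directory entry itself to be committed.
    return path, sync_directory(directory)


if __name__ == "__main__":
    spool = tempfile.mkdtemp()
    path, dir_synced = durable_write(spool, "msg", b"queued message\n")
    print(path, "directory fsync ok" if dir_synced else "directory fsync refused")
```

Whether the second step is needed at all is exactly the point in
dispute: on a system with the ordering guarantees described above, the
fsync() on the file alone would be sufficient.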