From: Marko Zec
Date: Thu, 17 Apr 2003 14:03:51 +0200
To: Ian Dowse
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Kirk McKusick
Subject: Re: PATCH: Forcible delaying of UFS (soft)updates
Message-ID: <3E9E9827.4BB19197@tel.fer.hr>
References: <200304162310.aa96829@salmon.maths.tcd.ie>

Ian Dowse wrote:

> In message <3E9C517B.6039679A@tel.fer.hr>, Marko Zec writes:
> >Tempted by a lot of opposition to the concept of (optionally) ignoring
> >fsync() calls when running on battery power, I wonder what effect the
> >concept of unconditional delaying of _all_ disk updates by ATA-disk
> >firmware will make on FS consistency in case of system crash or power
> >failure? I do not want to imply such a concept is a priori bad, however
> >I fail to realize its advantages over OS-controlled delaying of disk
> >synching.
>
> Note that the ATA "delayed write" mechanism only delays writes while
> the disk is spun down; at other times there is no change in behaviour.
> Since the disk only spins down after it has been idle for a time,
> it is very unlikely that the disk is left in an inconsistent state
> while it is stopped.
>
> Just after the disk spins up there is a small window where the
> cached writes get written out in a burst. Due to the amount of
> cached data and the probable re-ordering of writes, the disk is
> quite likely to be in an inconsistent state during this flurry of
> writes, but the window is short so it is probably not a big issue
> in practice.
>
> The main advantage of using the ATA delayed write mechanism is that
> the disk itself can take advantage of knowing whether or not it is
> spinning, whereas the OS does not have that information.

The OS _does_ know (approximately) when the disk is spinning and when
it is not. For example, if the disk is configured to stop spinning
immediately after the last I/O operation, the OS can safely assume
that 10 or more seconds later the disk will have stopped spinning. The
OS only has to keep a record (in the form of a timestamp or something
similar) of when it issued the last I/O request to the disk. In my
patch this is accomplished using the stratcalls marker, which is
incremented every time the strategy routine of the ATA disk driver is
invoked. Therefore the OS can also successfully coalesce the pending
disk updates with other outstanding disk I/O operations, which are
typically reads of uncached sectors or VM swapping.

> The downside
> is that it is not guaranteed that fsync'd data gets written to disk
> immediately, though in practice the disk tends to be spinning when
> the fsync is performed due to the previous accesses.
> I've been using ATA delayed writes on a few laptops for over a year
> and it has never caused me any problems - it generally works just
> right in the sense that the disk remains spun down when the machine
> is mostly idle, and spins up when you save files from an editor etc.

I agree that ATA delayed writes are a great feature that can help save
battery power. I just want to point out that they can suffer from the
same consistency problems as the model of OS-controlled delayed
synching combined with null fsync() processing. However, if the OS
controls the delaying of updates, you can turn normal fsync() semantics
on or off as desired. With delaying writes in ATA firmware, you simply
do not have the choice :)

Cheers,

Marko

> Doing the write delaying in the OS is always going to be a tradeoff
> between excessively delaying writes when the machine is busy and
> maximising the time between spin-ups when idle, though obviously
> there is more control possible over which writes get delayed and
> which don't.
>
> Ian
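
For readers who want the spin-state estimate from the reply above in
concrete form, here is a minimal user-space sketch in C. It is not code
from the patch: only the name stratcalls comes from the message, and
the struct, the function names, the seen_calls field and the
spindown_sec parameter are all hypothetical, as is the guess that a
periodic poll compares the counter against its previous value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping; only "stratcalls" is named in the message. */
struct disk_activity {
    uint64_t stratcalls;   /* incremented on every strategy-routine call */
    uint64_t seen_calls;   /* value of stratcalls at the previous poll */
    int64_t  last_io_sec;  /* time of the poll that last saw new I/O */
};

/* Assumed hook: called once per I/O request from the strategy routine. */
static void note_strategy_call(struct disk_activity *d)
{
    d->stratcalls++;
}

/*
 * Assumed periodic poll. If the counter moved since the previous poll,
 * the disk was just used, so it is treated as spinning and the
 * timestamp is refreshed. Otherwise the disk is assumed to be spun
 * down once it has been idle for at least spindown_sec seconds.
 */
static bool probably_spun_down(struct disk_activity *d, int64_t now_sec,
                               int64_t spindown_sec)
{
    assert(spindown_sec >= 0);

    if (d->stratcalls != d->seen_calls) {
        d->seen_calls = d->stratcalls;
        d->last_io_sec = now_sec;
        return false;
    }
    return (now_sec - d->last_io_sec) >= spindown_sec;
}
```

Under these assumptions, a syncer could push out delayed updates
whenever probably_spun_down() returns false, so that they ride along
with I/O that has already woken the disk, and hold them back otherwise.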