From: Marko Zec
To: Terry Lambert
cc: freebsd-fs@freebsd.org
cc: Ian Dowse
cc: freebsd-stable@freebsd.org
cc: Kirk McKusick
Date: Thu, 17 Apr 2003 21:43:26 +0200
Subject: Re: PATCH: Forcible delaying of UFS (soft)updates
Message-Id: <200304172143.26387.zec@tel.fer.hr>

On Thursday 17 April 2003 18:54, Terry Lambert wrote:
> Marko Zec wrote:
> > Ian Dowse wrote:
> > > Note that the ATA "delayed write" mechanism only delays writes while
> > > the disk is spun down; at other times there is no change in behaviour.
> > > Since the disk only spins down after it has been idle for a time,
> > > it is very unlikely that the disk is left in an inconsistent state
> > > while it is stopped.
>
> I'm wondering if the ATA "delayed write" actually does this, or if
> it merely relaxes the cache restrictions, without retaining the
> ordering enforcement.
>
> I suspect that it does not retain the ordering enforcement, as
> there is no way to disconnect on a tagged queue write, because
> you must issue a request for status, and it can't be done as a
> separate ATA operation (see the posts by the Maxtor employee, on
> and around January 20th of this year to the -FS list for details).
>
> You are much better off accumulating requests in the kernel in
> buffers, and then using the normal write mechanism to push them
> out to the drive ordered (IMO).

That is precisely what the original OS-controlled delayed synching
patch does :)

> This implies a barrier and new
> code above the bwrite interface, to keep the buffers from getting
> locked, and stalling your applications in user space.
>
> A problem I see here is that swap is on a totally different path,
> and in a different area of the disk (practically guaranteeing a
> seek, and a track buffer invalidation on the disk), even if you
> could cause swapping to be delayed (I don't think you can; FreeBSD
> aggressively uses memory, and so when you need to swap, you *need*
> to swap).
>
> > The OS _does_ know (approximately) when the disk is spinning and when
> > not. For example, if the disk is configured to stop spinning immediately
> > after the last I/O operation, the OS can safely assume that 10 or more
> > seconds afterwards the spinning will have stopped. The OS only has to
> > keep a record (in the form of a timestamp or something similar) of when
> > it issued the last I/O request to the disk. In my patch this is
> > accomplished using the stratcalls marker, which is increased every time
> > the strategy routine of the ATA disk driver is invoked.
> > Therefore the OS can also successfully
> > coalesce the pending disk updates with other outstanding I/O disk
> > operations, which are typically reads of uncached sectors or VM swapping.
>
> This is useful, but not enough. You need to actually communicate
> the information above the block I/O layer, to the soft updates. I
> think, effectively, what you actually want to do is to stop the
> soft updates clock

Hey man, that's exactly what I have done in my patch ("stopping the
soft updates clock" as you call it). On the block I/O layer I'm only
checking if the disk is active or not... Are you sure you have checked
out the patch / code?

> , rather than trying to play stupid disk tricks
> with timers, etc., above and beyond what you have to do. I can see
> it being useful on SCSI disks, as well, particularly where there are
> temperature issues. Though in that case, you probably are more
> memory starved than anything, and it will end up doing you no good.
>
> > I agree the ATA delayed writes is a great functionality that can help
> > save battery power.
>
> I don't; only if the write order is maintained is it "great".
>
> > I just want to point out that it can suffer from the same
> > consistency problems as the model of OS controlled delayed synching
> > combined with null fsync() processing. However, if the OS controls the
> > delaying of updates, you can turn on or off normal fsync() semantics as
> > desired. With delaying writes in ATA firmware, you simply do not have the
> > choice :)
>
> I think people are confusing fsync() with syncd at this point. 8-(.
>
> -- Terry
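
For readers following the thread, the activity check described above (a
counter bumped by the strategy routine, which the update clock consults
before pushing delayed writes) might be sketched roughly as follows. This
is only an illustration and not the code from the patch: apart from the
name stratcalls, everything here (struct disk_activity,
disk_strategy_hook(), syncer_tick_should_flush(), the one-second tick and
the 10-tick spin-down guess) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the syncer ticks once per second and the disk is taken
 * to be spun down after this many ticks without any I/O. */
#define SPINDOWN_GUESS_TICKS 10u

/* Hypothetical per-disk record; only the name "stratcalls" comes from
 * the thread. */
struct disk_activity {
    uint64_t stratcalls;  /* bumped on every strategy-routine call */
    uint64_t last_seen;   /* stratcalls value at the previous tick */
    unsigned idle_ticks;  /* consecutive ticks with no new I/O */
};

/* Would be called from the disk driver's strategy routine. */
static void disk_strategy_hook(struct disk_activity *da)
{
    assert(da != NULL);
    da->stratcalls++;
}

/* Called once per syncer tick. Returns true if pending updates should
 * be pushed out now, false if the update clock should stay stopped. */
static bool syncer_tick_should_flush(struct disk_activity *da,
                                     bool have_pending)
{
    assert(da != NULL);
    if (da->stratcalls != da->last_seen) {
        /* The disk was used since the last tick, so it is spinning. */
        da->last_seen = da->stratcalls;
        da->idle_ticks = 0;
    } else if (da->idle_ticks < SPINDOWN_GUESS_TICKS) {
        da->idle_ticks++;
    }
    if (!have_pending)
        return false;
    /* Hold the updates once the disk has probably stopped; they go
     * out together with whatever I/O wakes the disk up next. */
    return da->idle_ticks < SPINDOWN_GUESS_TICKS;
}
```

In this sketch the flush itself passes through the strategy routine and so
counts as activity; a real implementation would have to decide whether its
own delayed writes should reset the idle counter.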