From: Terry Lambert
Reply-To: tlambert2@mindspring.com
Date: Fri, 01 Jun 2001 09:58:15 -0700
To: Kris Kennaway
Cc: Mike Silbersack, Terry Lambert, hackers@FreeBSD.ORG
Subject: Re: general speed differences between 4.1.1-RELEASE and 4.3-RELEASE

Kris Kennaway wrote:
> > > 1. Have the ata driver leave the write cache setting
> > > alone by default, providing a sysctl which can cause
> > > disabled or enabled if requested. When the default is
> > > allowed, put something in dmesg which says "Note: Write
> > > caching may be enabled. See ata(4) for the reliability
> > > implications of this."
> >
> > You need to look at the code; it would be relatively hard
> > to make this runtime tunable instead of boot-time tunable.
>
> Until recently it *was* a sysctl.

Wrong code. Look at the soft updates code. It is not happy
about when you yank its assumptions out from under it.
It is particularly not happy when it gets enabled while there is
cached data for which it has not created dependencies, because it
was formerly disabled. This can be fixed by flushing all vnode
data for the FS to disk, so that there is no cached data, and
every subsequent operation that caches data has dependencies
created at the time the data is cached.

The primary problem with doing exactly this, and making soft
updates a runtime tunable as a mount option, is that there is
also the data associated with the device vnode on which the FS
is mounted, which needs to be flushed (it only needs to be
tracked via the dependencies of the FS data, since no data that
has been modified will end up cached without a dependency).
Doing that is a little harder, but not impossible (once soft
updates is running, the data from the FS is write-through
cached, so it can be correctly noted when it has been committed
to stable storage).

Now if you want the write caching on the drive to be a runtime
tunable, what you need is to either:

(1) Provide a "flush write cache" function in the driver, so
    that it can be flushed the same as the dirty vnode data.
    Then you need to get the disk to quit lying to the OS so
    that the in-core dependency graph remains valid (currently,
    the only way to do this is to replace your IDE disks with
    SCSI disks, or to find one of the nonexistent IDE
    manufacturers whose drives support tagged command queues,
    AND whose drives do not lie, as with tagged command queues
    they do not need to, AND implement a cache-to-disk
    notification to the host).

(2) Make the tunable turn off soft updates at the same time,
    since at that point you are effectively doing async I/O
    anyway. Turning off soft updates at any time other than
    boot has some other nasty ramifications; they are (mostly)
    already handled in the unmount code, however.

Really, if you are going to run with IDE write cache enabled,
you might as well mount sync.
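
To make the ordering problem in (1) concrete, here is a toy sketch
in Python. It is only a model, not FreeBSD code and not the ata
driver: ToyDisk, flush_cache, destage_one, ordered_update, in_order
and run are all made-up names, and the "inode before dirent" pair
merely stands in for any two writes that a dependency would order.
It assumes a drive that acknowledges a write as soon as it lands in
its volatile cache and later writes cached blocks back in whatever
order it likes.

```python
class ToyDisk:
    """Toy drive with a volatile write cache that acknowledges writes early."""

    def __init__(self, write_cache_enabled=True):
        self.write_cache_enabled = write_cache_enabled
        self.platter = {}  # blocks that are really on stable storage
        self.cache = {}    # blocks acknowledged but still volatile

    def write(self, block, data):
        """Store a block and report 'committed', true or not."""
        if self.write_cache_enabled:
            self.cache[block] = data  # the lie: acknowledged, not stable
        else:
            self.platter[block] = data
        return True

    def flush_cache(self):
        """Hypothetical 'flush write cache' entry point in the driver."""
        self.platter.update(self.cache)
        self.cache.clear()

    def destage_one(self, block):
        """The drive writes back one cached block, in an order it picks."""
        self.platter[block] = self.cache.pop(block)

    def power_loss(self):
        """Drop everything volatile; return what survived."""
        self.cache.clear()
        return dict(self.platter)


def ordered_update(disk, flush_between):
    """Issue 'inode' before 'dirent', as a dependency would require."""
    if disk.write("inode", "initialized"):
        if flush_between:
            disk.flush_cache()  # make the acknowledgement true
        disk.write("dirent", "points at inode")


def in_order(platter):
    """False if the dirent survived without the inode it depends on."""
    return not ("dirent" in platter and "inode" not in platter)


def run(flush_between):
    disk = ToyDisk(write_cache_enabled=True)
    ordered_update(disk, flush_between)
    disk.destage_one("dirent")  # drive chooses to write the later block first
    return disk.power_loss()


if __name__ == "__main__":
    for flush_between in (False, True):
        survived = run(flush_between)
        print(flush_between, in_order(survived), survived)
```

Without the flush, what survives the power loss in this model is a
dirent pointing at an inode that never reached the platter, even
though the host issued the writes in the right order and was told
both had been committed. With the flush between the two writes, the
inode is stable before the dirent is even issued. A real drive is
of course far more complicated; the sketch only shows why an
acknowledgement that means "in cache" is not enough to keep a
dependency graph valid.
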
I'm not positive that a sync mount will actually cause the
metadata to be written sync, or whether it will continue to
allow the IDE drive to lie to the OS about commits to stable
storage, when it was really only a commit to the write cache.

I'm also not sure enough about the switching mechanism between
sync and async to know whether, if you switch from async to sync
while there is dirty data in the write cache, that data gets
written immediately instead of orphaned; if it is orphaned, it
may be impossible to switch from async to sync without a huge
latency.

If it orphans data, or can't support both sync and async
operations simultaneously, well, then enabling write caching
should imply an async mount, for all the good it is going to
do you.

-- Terry