Date: Fri, 4 Mar 2005 16:37:05 -0600 (CST)
From: Jason Young
To: Peter Jeremy
Cc: ALeine, phk@phk.freebsd.dk, hackers@freebsd.org
Subject: Re: FUD about CGD and GBDE
Message-ID: <20050304161201.B87252@ziggy.evelocity.net>
In-Reply-To: <20050304183747.GS57256@cirb503493.alcatel.com.au>
References: <200503022115.j22LFnWk083926@marlena.vvi.at> <20050304183747.GS57256@cirb503493.alcatel.com.au>
List-Id: Technical Discussions relating to FreeBSD (freebsd-hackers@freebsd.org)

On Sat, 5 Mar 2005, Peter Jeremy wrote:

> [CC list pruned]
>
> On Wed, 2005-Mar-02 13:15:49 -0800, ALeine wrote:
>> If only hardware manufacturers were to equip hard drives with
>> a mechanism to ensure atomic writes. A capacitor large enough
>> to hold enough energy to flush the cache upon detecting the
>> power supply was cut would be sufficient.
>
> I'm not sure this is readily practical at the drive level. Based
> on some back-of-envelope calculations using figures in a Seagate
> Barracuda manual (which I happened to have handy):
> - Random seek is 9.5 msec.
> - Rotational period is 8.3 msec.
> - Power consumption at 50% R/W, 50% seek is 0.82A @ 12V + 0.68A @ 5V
>
> A single random seek + track write will take 17.8 msec. This translates
> to an electrical charge of 0.015C @ 12V and 0.012C @ 5V.
>
> Assuming the drive is designed to allow the supply rails to droop 20%
> whilst functioning correctly during this shutdown phase (which is a
> significantly bigger drop than the standard specifications), a single
> random seek + track write would require the drive to include a 6000µF
> capacitor on the 12V rail and a 12000µF capacitor on the 5V rail.
>
> As a first order approximation all Unix disk operations are writes
> (reads can be satisfied from the buffer cache). Given the size of
> current generation drive caches, it's more likely that there are
> around 50 writes cached - which requires capacitors 50 times as large
> - which would make them significantly larger than the drive itself.
>
>> They could even use
>> a battery the status of which could be monitored via S.M.A.R.T.,
>> I don't see how implementing something like that could possibly
>> make the cost noticeably higher.
>
> Batteries (and standard supercaps) are generally designed to release
> their energy over an extended period - several minutes is the lower
> realistic limit. It would make sense to build a battery backup system
> which maintained power for (say) 5 minutes. It's not realistic to
> build a battery backup system that can release all its available
> energy in 1 second.
>
> If you're going to the effort of building a battery backup system,
> you might as well back up the entire computer. These are available
> off the shelf and have the added advantage of allowing users and
> the system to clean up before the power goes away.
>
> --
> Peter Jeremy

I must be missing something, but I'll succumb to the temptation and ask:

Why not put a flash chip into the drive's onboard electronics, of the
same size as the drive's cache, or the max possible size of all
outstanding cached writes? If power dies, park the heads immediately.
Use your last-gasp energy source of choice to commit the write cache
contents into nonvolatile storage. Next time it's powered up, the drive
firmware could flush the outstanding write requests to "real" storage
before coming ready to the operating system.

At least some modern drives (I've seen this on HP/Compaq servers, etc.)
already have flash-upgradeable firmware. It's just a matter of adding a
little more. You would use it only when power fails, so it's not like
you would wear it out.

Surely this would require far less power than spinning and seeking the
disk for the required amount of time? It might take longer based on the
flash chip's write speed, but that's probably a feature, since a small
battery would be able to supply much less current over time. Cost should
be reasonably low since this stuff is in all sorts of consumer devices.

Jason Young, CCIE #8607 (R&S, Voice), MCSE
Consulting Engineer
e-velocity technical consulting, llc.
(513)677-6223 x108
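
For anyone who wants to check the arithmetic, the back-of-envelope
capacitor sizing in Peter Jeremy's quoted message can be reproduced
roughly as follows. This is only a sketch: the function and variable
names are made up for illustration, and it assumes the drive draws a
constant current for the whole hold-up period, so that C = I * t / dV.
The input figures are the ones quoted from the Seagate Barracuda manual.

```python
# Sketch of the hold-up capacitor estimate from the quoted message.
# Assumption (not from the original): constant current draw while the
# rail droops, so required capacitance is C = I * t / dV.

SEEK_S = 9.5e-3       # random seek time, seconds
ROTATION_S = 8.3e-3   # one rotation (track write), seconds
WRITE_S = SEEK_S + ROTATION_S  # 17.8 ms per seek + track write

# rail name -> (nominal voltage in V, current draw in A)
RAILS = {"12V": (12.0, 0.82), "5V": (5.0, 0.68)}


def hold_up_capacitance(current_a, duration_s, rail_v, droop_fraction):
    """Return the capacitance in farads needed to supply current_a for
    duration_s while the rail drops by at most droop_fraction of rail_v."""
    charge_c = current_a * duration_s          # Q = I * t
    allowed_drop_v = rail_v * droop_fraction   # dV
    return charge_c / allowed_drop_v           # C = Q / dV


def capacitors_uf(n_writes=1, droop=0.20):
    """Capacitance per rail, in microfarads, for n_writes cached writes."""
    return {
        name: hold_up_capacitance(amps, n_writes * WRITE_S, volts, droop) * 1e6
        for name, (volts, amps) in RAILS.items()
    }


if __name__ == "__main__":
    for n in (1, 50):
        for rail, uf in capacitors_uf(n).items():
            print(f"{n:2d} write(s), {rail} rail: {uf:,.0f} uF")
```

With one write this lands close to the 6000µF and 12000µF figures in
the quoted text, and the 50-write case simply scales linearly.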