Date: Wed, 26 Dec 2001 16:09:12 -0700 (MST)
From: "Forrest W. Christian"
To: Andrew Hannam
Cc: John Hanley, freebsd-small@FreeBSD.ORG
Subject: Re: Disk Writes
In-Reply-To: <001901c18dba$83dfcbc0$0104010a@famzon.com.au>

Are these IDE drives? If so, you probably need to turn off write caching.
I *THINK* this is hw.ata.wc, but it needs to be turned off in a special way.
See man tuning.

On Wed, 26 Dec 2001, Andrew Hannam wrote:

> Date: Wed, 26 Dec 2001 13:07:46 +1000
> From: Andrew Hannam
> To: John Hanley
> Cc: freebsd-small@FreeBSD.ORG
> Subject: Re: Disk Writes
>
> Thanks for your help...
>
> Just a short note on the power-fail condition: with our equipment a power
> failure is more likely to come just after a transaction that requires
> writing to the disk. The power fail (if it occurs) is most likely to occur
> 5 -> 10 seconds after the transaction and is the result of action by a
> serviceman at the machine (about once or twice a week). I therefore believe
> it should be possible to achieve a safe write 100% of the time.
> This however has not been borne out in practice, with a failure rate of
> about 0.5% in these conditions. With 1000 machines in the field this would
> equate to a failure of about 1 to 2 machines a day. This is not acceptable
> in practice so I must find a solution.
>
> The hard-links idea is a useful bit - I'll add it to my toolkit for FreeBSD.
> I had tried this technique before on a Linux box but in Linux the rename(2)
> call is not atomic where an existing file exists.
>
> Using fsync() or even sync is not generally an option for me as I am using
> Java for a large part of the application. Where C has been used - it is
> liberally sprinkled with fsync and sync. Special care has been taken with
> the Java to ensure that files are being closed properly. Without the files
> being closed after each write operation I found that, even on unbuffered
> writes, there was a high probability of file corruption on power-down.
> Now that files are being closed after each write, I never seem to lose
> information during an fsck auto-repair (a great improvement); however,
> occasionally fsck is not able to repair it at all, effectively causing the
> device to be inoperable with a return to depot for repair. The return to
> depot is expensive and requires special low level data extraction to try and
> get the information off the now badly corrupted disk.
>
> I have tried both with and without soft updates. The largest problem of
> using soft updates is the latency - using the standard parameters it can
> take up to 30 seconds for data to actually be written out to disk, thus
> introducing a good probability of losing information on a power down.
>
> The 0, 1 & 2 second delays are much better (using my kernel parameters) but
> 2 seconds is still a long time in my application. Looking through the code -
> 0, 1 & 2 second delays appear to be the smallest periods available without
> affecting the way soft updates work. With soft updates I seem to be more
> likely to lose information but less likely to kill the disk. Given this
> compromise I have turned off soft-updates.
>
> Is there an alternative file-system that would be more tolerant to
> power-down issues?
> For example, with the original DOS operating system I can't
> remember ever having this sort of problem until Windows started adding write
> caching.
>
> Is there some option to make file-system calls totally synchronous (turning
> off all write caching), thus significantly reducing the risk? Write speed
> performance is not a critical criterion. Integrity and completeness are far
> more important.
>
> It might be true that voltage sag on write is toasting extra super-block
> copies or alternatively that the super-blocks are not being written
> synchronously. I have two variants of the motherboard hardware using
> different chipsets with two different sized hard-drives (3" and 2") so it
> doesn't appear to be a hardware specific problem.
>
> If this is what is happening then having more than one super-block is an
> integrity risk rather than an integrity improver because I cannot manually
> fsck after a super-block corruption. How then would I turn off the extra
> super-block copies? I presume this would be done at file-system creation
> time.
>
> An example of redundant information being useless is in the original FAT
> file-system. The second copy of the FAT is only ever used to detect that the
> two copies of the FAT are out of sync. I have never seen a DOS or Windows
> utility that takes any notice of the information written in the second copy
> of the FAT. For example, scandisk (equivalent to fsck) detects that they are
> different but the only repair option is to write the first copy of the FAT
> on top of the second copy of the FAT.
>
> Are the UFS file-system and fsck different in this regard?
>
> ----- Original Message -----
> From: "John Hanley"
> To: "Andrew Hannam"
> Sent: Monday, December 24, 2001 5:06 PM
> Subject: Re: Disk Writes
>
> > --- Andrew Hannam wrote:
> > > The application has been
> > > written to only append to files or to replace the file by creating a new
> > > file and then writing a single byte to redirect which file is in use.
> >
> > BTW, using hard links might be pretty slick, here. Watch me atomically
> > delete "file":
> >
> >   $ date > file
> >   $ TMP=file.$$
> >   $ date > $TMP
> >   $ mv $TMP file
> >
> > At worst, file.${pid} is left around.
> > At all times, we have either the old or new contents available.
> > The rename(2) does the "delete" and the "make new contents available"
> > operations as a single atomic operation, safe in the face of power fails.
> >
> >
> > > My current kernel settings are:
> > > sysctl -w vfs.write_behind=0 kern.filedelay=2 kern.dirdelay=1
> > > kern.metadelay=0
> >
> > I'll bet that combination of parameters has received less testing
> > than the default parameters. I feel that the default params with
> > soft updates should be working just fine for you.
> >
> > Is power fail pretty straightforward? Does the CPU go down before
> > the device that /app is on goes down? Maybe voltage sag at the time
> > of a write is toasting one or more superblock replicas?
> >
> >
> > Cheers,
> > JH
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-small" in the body of the message
>

- Forrest W. Christian (forrestc@imach.com) AC7DE
----------------------------------------------------------------------
The Innovation Machine Ltd.                              P.O. Box 5749
http://www.imach.com/                                 Helena, MT 59604
Home of PacketFlux Technologies and BackupDNS.com       (406)-442-6648
----------------------------------------------------------------------
      Protect your personal freedoms - visit http://www.lp.org/
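
For readers of the archive, the write-then-rename idea discussed in this thread can be sketched roughly as below. This is only an illustration, not code from anyone in the thread: the function name atomic_write is hypothetical, the sketch assumes a POSIX system where os.replace maps onto rename(2), and it assumes the drive honours fsync. If the drive's own write cache is on, as suspected above, the data may still not be on the platter when fsync returns.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data`; readers see the old or the new contents.

    Hypothetical helper for illustration only. Assumes a POSIX system.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename never
    # crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(
        prefix=os.path.basename(path) + ".", dir=directory
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the OS to push the file data out before the rename.
            os.fsync(tmp.fileno())
        # On POSIX this is a rename(2); an existing target is replaced.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself is recorded (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Calling atomic_write("file", b"new contents") a second time on the same path replaces the earlier contents; at worst a stray temp file is left behind after a crash, which matches the "file.${pid} is left around" case described above.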