Date: Wed, 22 Mar 2000 16:48:40 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: FREENIX IS OVERRATED <pedophile@INT.TELE.DK>
Cc: current@FreeBSD.ORG, fs@FreeBSD.ORG
Subject: Re: FreeBSD random I/O performance issues
Message-ID: <200003230048.QAA94830@apollo.backplane.com>
References: <Pine.BSF.3.96.1000323001655.85607G-100000@fLuFFy.iNt.tElE.dK>
:> out.
:>
:> What the write-behind code tries to do is to prevent the buffer cache
:> from being saturated with dirty buffers and to smooth out disk write
:> I/O. It makes the assumption that write-behind data is not typically
:> accessed by the program immediately after being written -- an assumption
:> that winds up being incorrect in the DBM case you tested and resulting
:> in stalls due to the buffer / VM pages being locked during the write I/O.
:> The stalls are *not* due to the I/O itself but instead are due to side
:> effects of the I/O being in-progress.
:
:And that sounds a heck of a lot like what those of us who have been
:running INN news swervers with 1,1GB size text history files on 2.whazzit
:(now dead, may it rest in pieces widely-scattered) and later have seen.
:
:You should have forgotten that a couple months or so ago, I wrote to
:one of these lists to ask why I was getting only about 50-70%
:availability as my 1.5+MD5-based-dbz innd was stuck in ufslck2 during
:these every-30-seconds syncs. The .hash and .index files from this,
:which are comparable to the dbm (dbz) files being typically 125MB and
:85MB or so, this under 3.4-STABLE.
:
:Well, I've meant to get around to trying 4.0 on it, and Real Soon Now
:I will, but I wanted to relate my experiences in turning traitor, a
:heretic who has left the fold, deserving to be ridden out of town on
:a rail and stuff, which sounds like a lot of fun. I tried NetBSD.
:
:NetBSD (at least the development now 1.4V version) has trickle
:syncing, which seems to work quite well when having to cope with
:these rather large database files, keeping a full 14 days of message
:IDs from a full news feed.

Personally speaking I agree with you in regards to the syncer code. I don't have time to fix it, though I suspect it would not be difficult. Trickle syncing is an inherently easy thing to do.
Kirk and I have both had serious trouble with the syncer daemon not being able to smooth out write I/O's due to it fsync'ing whole files all in one go. The buffer daemon does a much better job, which is why the speedup_syncer stuff is being slowly deprecated in favor of bd_speedup().

For INN there are several things you can tune in 4.0. First and foremost you can try turning off the write-behind code, sysctl -w vfs.write_behind=0. Secondly you can mess around with the vfs.hidirtybuffers sysctl (generally lower it) in order to force out dirty pages earlier and thus reduce the number that fsync has to deal with.

I believe that INN also messes around with shared/R+W mmap()'s - it may be possible to add MAP_NOSYNC to those maps to turn off the 30 second fsync on pages dirtied through the VM system (for those maps), though this may increase the amount of stale (unwritten) data after a crash.

:There. I've confessed. It feels really good. Now have at me.
:
:Naturally, since I haven't followed this discussion closely, you may
:be talking about something completely different, but I did want to
:mention generally improved (yet not totally perfect) performance
:with huge INN database files and NetBSD's trickle syncing. Now,
:go out and steal some k0deZ, okay?
:
:
:barry bouwsma, tele danMerika internet

					-Matt

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message
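To illustrate the MAP_NOSYNC suggestion above, here is a minimal sketch in C of a shared read/write file mapping that passes the flag. The helper name map_shared_nosync() is hypothetical and is not taken from INN's source; how INN actually sets up its maps is not shown in this message. MAP_NOSYNC is FreeBSD-specific, so the fallback #define is an assumption added only so the sketch compiles on systems without it, where the flag becomes a no-op.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * MAP_NOSYNC is a FreeBSD mmap(2) flag. This fallback is an assumption
 * for portability of the sketch only: elsewhere the flag does nothing.
 */
#ifndef MAP_NOSYNC
#define MAP_NOSYNC 0
#endif

/*
 * Hypothetical helper (not INN code): map an existing, non-empty file
 * shared and read/write, asking the kernel not to flush pages dirtied
 * through this mapping on the periodic sync.
 * Returns the mapping and stores its length in *len_out, or returns
 * NULL on any failure.
 */
void *
map_shared_nosync(const char *path, size_t *len_out)
{
	struct stat st;
	void *p;
	int fd;

	assert(path != NULL && len_out != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) < 0 || st.st_size <= 0) {
		close(fd);
		return NULL;
	}

	p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
	    MAP_SHARED | MAP_NOSYNC, fd, 0);
	close(fd);	/* the mapping remains valid after close() */
	if (p == MAP_FAILED)
		return NULL;

	*len_out = (size_t)st.st_size;
	return p;
}
```

A program using such a map can still call msync() on the range when it wants the data written at a known point, which limits the stale-data window mentioned above.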