Date: Mon, 7 Feb 2000 15:49:25 -0800
From: Alfred Perlstein
To: Wes Peters
Cc: Matthew Dillon, hackers@freebsd.org
Subject: Re: Syncing a vector of fileoffsets and lengths?
Message-ID: <20000207154925.A17536@fw.wintelcom.net>
References: <20000207114042.E25520@fw.wintelcom.net> <200002071938.LAA50114@apollo.backplane.com> <20000207125636.G25520@fw.wintelcom.net> <389F49F7.7290B179@softweyr.com>
In-Reply-To: <389F49F7.7290B179@softweyr.com>; from wes@softweyr.com on Mon, Feb 07, 2000 at 03:40:55PM -0700

* Wes Peters [000207 15:02] wrote:
> Alfred Perlstein wrote:
> >
> > I asked this question because of a problem that PostgreSQL has,
> > basically multiple processes will be updating a file, they may do
> > scattered IO to multiple offsets into the file, at the end of a
> > transaction they want to sync the data... fsync(). ow. This causes
> > buffers dirtied from multiple processes to be pushed to disk where
> > they really only want their own. The order doesn't really matter,
> > just that all of the IO is on stable storage.
>
> So, what you're looking for is something like writev, only having
> the vector entries consist of (fd, pos, nbytes) triples? And
> perhaps a sync vs. async flag on the call?

Yes, and a later callback to use to poll/wait for all the queued
IO to complete. But the write interface shouldn't have to be tied
to the sync interface.

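To make the (fd, pos, nbytes) idea concrete, here is a rough userland sketch of what such a vector entry might look like, plus a helper that sorts the entries and merges ranges that overlap or touch. The names `struct sync_range` and `coalesce_ranges` are made up for illustration and are not part of any existing FreeBSD interface; the sort-and-merge step only stands in for the kind of ordering the kernel could apply if it were handed the whole vector at once.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical vector entry: one (fd, pos, nbytes) triple per dirty range. */
struct sync_range {
    int    fd;      /* file the range belongs to */
    off_t  pos;     /* starting offset of the range */
    size_t nbytes;  /* length of the range in bytes */
};

/* Order by descriptor first, then by starting offset. */
static int range_cmp(const void *a, const void *b)
{
    const struct sync_range *x = a;
    const struct sync_range *y = b;

    if (x->fd != y->fd)
        return (x->fd < y->fd) ? -1 : 1;
    if (x->pos != y->pos)
        return (x->pos < y->pos) ? -1 : 1;
    return 0;
}

/*
 * Sort the vector in place by (fd, pos), then merge entries on the same
 * descriptor that overlap or are directly adjacent.  Returns the number
 * of entries left at the front of the array.
 */
size_t coalesce_ranges(struct sync_range *v, size_t n)
{
    size_t out = 0;

    if (n == 0)
        return 0;

    qsort(v, n, sizeof(*v), range_cmp);

    for (size_t i = 1; i < n; i++) {
        off_t end = v[out].pos + (off_t)v[out].nbytes;

        if (v[i].fd == v[out].fd && v[i].pos <= end) {
            /* Overlapping or touching: extend the current entry if needed. */
            off_t iend = v[i].pos + (off_t)v[i].nbytes;
            if (iend > end)
                v[out].nbytes = (size_t)(iend - v[out].pos);
        } else {
            /* Disjoint range or different file: start a new entry. */
            v[++out] = v[i];
        }
    }
    return out + 1;
}
```

A process would collect one entry per range it dirtied during a transaction and hand the coalesced vector to whatever queueing call ended up existing. That call is deliberately not shown, since it is only being proposed in this thread.
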
Someone could do this by opening another fd with O_FSYNC; the problem
is that each write blocks until the IO is done instead of allowing it
to be disksorted() into optimal ordering and possibly combined with
other writes. A process doing this can cause the disk to seek back
and forth because it has no idea of the optimal order in which to
sync its own file ranges. It then also hangs waiting for the IO to
complete.

We already have hooks to 'do stuff' when IO on a buffer completes, we
even 'own' buffers over to processes, so it seems like we're just half
a step away from realizing this functionality, i.e. adding a flag to
do the accounting, a syscall to poll/wait, a syscall to queue the
requests and finally an interface to the filesystem to do range syncs
instead of complete syncs.

Well, maybe it's a bit more than half a step. :)

-Alfred