From owner-freebsd-hackers@FreeBSD.ORG  Thu May 29 05:36:59 2003
Date: Thu, 29 May 2003 08:36:55 -0400 (EDT)
From: Daniel Ellard <ellard@eecs.harvard.edu>
To: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org
Message-ID: <Pine.BSF.4.51.0305290807120.90090@bowser.eecs.harvard.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: how to do asynchronous I/O at the device level?


I'm not sure if this is a question for fs or hackers, so I apologize if
you see this twice.

I'm writing a device driver for a "soft-mirrored" disk.  The idea is
similar to ordinary disk mirroring, except that the focus is entirely
on higher performance instead of fault tolerance -- the secondary disk
need not be an exact duplicate of the first.  (I have a method for
keeping track of which blocks on the secondary are actually in sync
with the primary, and which might contain stale data.)
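
For illustration only, one simple way such tracking could work is a
per-block "stale" bitmap.  This is a hypothetical userland sketch and
not necessarily the method referred to above; SM_NBLOCKS and all of
the sm_* names are made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical size of the mirrored device, in blocks. */
#define SM_NBLOCKS 1024

/* One bit per block: 1 = the secondary copy may be stale. */
static unsigned char sm_stale[(SM_NBLOCKS + CHAR_BIT - 1) / CHAR_BIT];

/* Record that the secondary copy of blk may lag the primary. */
void
sm_mark_stale(size_t blk)
{
	assert(blk < SM_NBLOCKS);
	sm_stale[blk / CHAR_BIT] |= (unsigned char)(1u << (blk % CHAR_BIT));
}

/* Record that the secondary copy of blk matches the primary again. */
void
sm_mark_synced(size_t blk)
{
	assert(blk < SM_NBLOCKS);
	sm_stale[blk / CHAR_BIT] &= (unsigned char)~(1u << (blk % CHAR_BIT));
}

/* Nonzero if the secondary copy of blk may be stale. */
int
sm_is_stale(size_t blk)
{
	assert(blk < SM_NBLOCKS);
	return ((sm_stale[blk / CHAR_BIT] >> (blk % CHAR_BIT)) & 1);
}

/* A read may be served from the secondary only if the block is in sync. */
int
sm_can_read_secondary(size_t blk)
{
	return (!sm_is_stale(blk));
}
```

In this sketch a write would call sm_mark_stale() before the primary
write is issued and sm_mark_synced() once the secondary write has
completed, so reads never see stale data from the secondary.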

What I want to do is an ordinary write to the primary disk and an
asynchronous write to the secondary, so that the calling process can
continue on its way before the write to the secondary has actually
finished.
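
The intended control flow can be modeled in userland roughly as
follows.  This is only a sketch under stated assumptions: the two
"disks" are in-memory arrays, the secondary write is deferred through
a simple FIFO instead of real asynchronous I/O, and every dw_* name
and constant is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DW_NBLOCKS 8	/* hypothetical device size, in blocks */
#define DW_BLKSIZE 16	/* hypothetical block size, in bytes */

/* In-memory stand-ins for the two disks. */
char dw_primary[DW_NBLOCKS][DW_BLKSIZE];
char dw_secondary[DW_NBLOCKS][DW_BLKSIZE];

/* A queued secondary write that owns a private copy of the data. */
struct dw_pending {
	int blkno;
	char data[DW_BLKSIZE];
	struct dw_pending *next;
};
static struct dw_pending *dw_head, *dw_tail;

/*
 * Write the primary immediately and queue the secondary write.
 * Returns 0 on success, -1 if the queue entry cannot be allocated.
 */
int
dw_write(int blkno, const char *data)
{
	struct dw_pending *p;

	assert(blkno >= 0 && blkno < DW_NBLOCKS);
	p = malloc(sizeof(*p));
	if (p == NULL)
		return (-1);
	memcpy(dw_primary[blkno], data, DW_BLKSIZE);	/* synchronous part */
	p->blkno = blkno;
	memcpy(p->data, data, DW_BLKSIZE);	/* caller may reuse data now */
	p->next = NULL;
	if (dw_tail != NULL)
		dw_tail->next = p;
	else
		dw_head = p;
	dw_tail = p;
	return (0);
}

/* Apply all queued secondary writes; returns how many were applied. */
int
dw_drain(void)
{
	int n = 0;

	while (dw_head != NULL) {
		struct dw_pending *p = dw_head;

		dw_head = p->next;
		memcpy(dw_secondary[p->blkno], p->data, DW_BLKSIZE);
		free(p);
		n++;
	}
	dw_tail = NULL;
	return (n);
}
```

dw_write() returns as soon as the primary is updated and the data has
been copied, and dw_drain() stands in for whatever later completes the
secondary writes.  Because each queue entry holds its own copy, the
caller can overwrite its buffer right after dw_write() returns.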

I've implemented my scheme with synchronous mirror writes by hacking
up the CCD driver.  (This wasn't a big deal, because CCD already
implements disk mirroring, but because I'm also futzing around with a
bunch of other stuff, the resulting code is structured a bit
differently.) Now I want to make the secondary writes asynchronous.

The challenge is that I need to make copies of whatever state the
device underneath CCD needs in order to do the I/Os.  As soon as the
primary writes are finished, the file system is going to deallocate or
reuse the structures it passed down to CCD.  I can't hack the file
system code to delay this, because I need to hide all this inside the
device driver.

I know I need to copy the buffer, and clone the buf struct.  My
questions are:

1.  How to properly clone the buf struct to make a "standalone" buf.

	Just bcopy'ing it will result in it being filled with pointers
	linking it to the rest of the buffer pool, which I suspect
	will lead to horrible problems later -- I'm pretty sure that I
	don't want the buffer manager to know about this buf, or this
	buf to believe it's part of the buffer pool.

2.  Whether there's any other state that I need to preserve.
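
To make question 1 concrete, here is a minimal userland sketch of what
a "standalone" clone might look like.  struct xbuf is a made-up,
heavily simplified stand-in for the kernel's struct buf, not the real
definition: the field names are only loosely borrowed, b_poolnext and
b_poolprev are hypothetical placeholders for the buffer-pool linkage,
and xbuf_clone() and xbuf_free() are hypothetical helpers.  Which
fields the real driver underneath needs is exactly the open question;
this only shows the shape.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct buf; NOT the real kernel layout. */
struct xbuf {
	char	*b_data;		/* data to transfer */
	size_t	 b_bcount;		/* number of bytes to transfer */
	long	 b_blkno;		/* target block number */
	int	 b_flags;		/* I/O flags */
	void	(*b_iodone)(struct xbuf *);	/* completion callback */
	struct xbuf *b_poolnext;	/* placeholder for pool linkage */
	struct xbuf *b_poolprev;	/* placeholder for pool linkage */
};

/*
 * Make a clone that owns a private copy of the data and carries no
 * pool linkage.  Returns NULL if memory cannot be allocated.
 */
struct xbuf *
xbuf_clone(const struct xbuf *src, void (*done)(struct xbuf *))
{
	struct xbuf *nb;

	assert(src != NULL && src->b_data != NULL && src->b_bcount > 0);
	nb = calloc(1, sizeof(*nb));	/* linkage pointers start out NULL */
	if (nb == NULL)
		return (NULL);
	nb->b_data = malloc(src->b_bcount);
	if (nb->b_data == NULL) {
		free(nb);
		return (NULL);
	}
	memcpy(nb->b_data, src->b_data, src->b_bcount);
	nb->b_bcount = src->b_bcount;
	nb->b_blkno = src->b_blkno;
	nb->b_flags = src->b_flags;
	nb->b_iodone = done;	/* the clone gets its own completion hook */
	return (nb);
}

/* Release a clone and its private data copy. */
void
xbuf_free(struct xbuf *bp)
{
	free(bp->b_data);
	free(bp);
}
```

The point of copying field by field instead of bcopy'ing the whole
struct is that the linkage pointers are never inherited, so the clone
cannot be mistaken for a member of the pool.  The completion callback
would be the natural place to call xbuf_free().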

If this sort of thing has already been implemented somewhere, just
point me to it...

Thanks,
	-Dan