Date: Wed, 6 Aug 2008 20:59:09 +0200
From: Pawel Jakub Dawidek
To: Peter Schuller
Cc: freebsd-fs@freebsd.org
Subject: Re: Asynchronous writing to zvols (ZFS)
Message-ID: <20080806185909.GC2580@garage.freebsd.pl>
References: <200807262005.54235.peter.schuller@infidyne.com> <20080726205118.GB1345@garage.freebsd.pl> <200807272026.54907.peter.schuller@infidyne.com>
In-Reply-To: <200807272026.54907.peter.schuller@infidyne.com>

On Sun, Jul 27, 2008 at 08:26:46PM +0200, Peter Schuller wrote:
> Hello,
>
> > The problem is that we don't distinguish between async and sync I/O
> > requests at the GEOM level, which is why I decided to commit a ZIL log
> > after each write, which wasn't very smart, it seems. This is handled
> > differently in the version I have in Perforce. Could you try the patch
> > below and see how it performs now?
> >
> > 	http://people.freebsd.org/~pjd/patches/zvol.c.patch
>
> The above (though the file has moved, for anyone else reading who wants
> to apply it) does eliminate the synchronicity problem. I am now seeing
> 5-15 MB/second write speeds to the zvol, with 100% constituent disk
> utilization.
>
> I am not sure why I don't see faster writes; I get more like 40-60 when
> writing to a file in a ZFS file system on the same pool. But regardless,
> the synchronicity issue is gone.

I'm not sure why that is; I haven't spent any time optimizing ZVOL yet,
sorry.

> Does your comment above regarding distinguishing between sync and async
> apply to the section of code affected by the above patch, or did you mean
> there is some other place above the zvol handling where there is a lack
> of distinction?
>
> That is, is the end effect of the above change that we *never* do
> synchronous writes (because the fact that a write is supposed to be
> synchronous is somehow lost before it reaches that point)?
>
> I understand a zil_commit is only required on BIO_FLUSH requests, which
> is what the patch fixes. But I get the impression from your phrasing
> above that the reason a zil_commit was done on every I/O from the get-go
> was an effort to honor actual synchronous writes by conservatively
> *always* doing synchronous writes, because the synchronicity of
> synchronous writes would not be propagated down to the zvol class. I
> wouldn't want to sacrifice correctness just to get the speed ;)

With the patch above, we synchronize in-memory transactions every 5
seconds, when the queue is full, or when we receive BIO_FLUSH.
Of course the previous behaviour was more conservative, but sending
writes down doesn't mean they will reach the disk platters. There is
still the disk cache in the way. If we really want to be sure that data
is safe on the disk, we should send BIO_FLUSH.

In other words, if you use UFS on a raw disk, sync writes can still be
delayed by the disk's cache. When you use UFS on top of a ZVOL, writes
can be delayed by the ZFS cache.

I think the way to go is to pass the sync/async property of an I/O
request down the GEOM stack.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!