From owner-freebsd-geom@FreeBSD.ORG Tue Jun 26 07:54:58 2007
Date: Tue, 26 Jun 2007 09:54:32 +0200
From: Pawel Jakub Dawidek
To: Fluffles
Cc: freebsd-geom@freebsd.org
Subject: Re: gjournal performance issues
Message-ID: <20070626075432.GD12278@garage.freebsd.pl>
References: <4680438E.9030407@fluffles.net>
In-Reply-To: <4680438E.9030407@fluffles.net>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="hxkXGo8AKqTJ+9QI"
Content-Disposition: inline
X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc
X-OS: FreeBSD 7.0-CURRENT i386
User-Agent: mutt-ng/devel-r804 (FreeBSD)
List-Id: GEOM-specific discussions and implementations

--hxkXGo8AKqTJ+9QI
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jun 26, 2007 at 12:37:02AM +0200, Fluffles wrote:
> Hello list,
>
> I'm testing gjournal present in -CURRENT as of 13 June 2007. So far
> I'm not really impressed with performance; I'm writing to the list
> for any suggestions and information regarding gjournal.
>
> First, my setup:
> 8 disks in RAID5 using geom_raid5, gjournal on top where both the
> journal (1GB) and the data are stored on the same consumer. Since
> gjournal journals both metadata and file/dir data, this means that,
> theoretically, the write speed of sequential operations is halved.
> Unfortunately, it appears to have crashed.
>
> My problems:
> - first, throughput appears to be only 8% of the throughput when not
> using gjournal at all, whereas it should be close to 50%.

I had numbers in my presentation somewhere, but for a single thread
doing sequential writes when the journal and data are on the same disk
(which is the worst-case scenario) I recall something between 40% and
50% of what you get from regular UFS.

> - second, during the 'switch' (writing the journal to its final
> location and starting a new journal) it appears no read operations
> are possible on the .journal device. If the .journal is /usr, that
> means the whole system basically freezes for 3 to 5 seconds. Not
> really sexy. Why would it block read requests?

On a journal switch, all operations that use vn_start_write() are
blocked. Read operations shouldn't use vn_start_write(), but the close
operation, for example, does use it. I have some patches to eliminate
that, because it is only needed when we close an already deleted file
and need to free its blocks, etc., which is quite a rare case.

> - when using one consumer for both journal and data, it appears the
> journal is placed at the end of the device. Why?
> Normally, the beginning of a disk is the fastest and therefore
> preferable location for the journal.

To allow mounting the file system without gjournal.

> - when analysing graid5 sysctl statistics, it appears gjournal is
> causing non-contiguous I/O, which causes a lot of 2-phase I/Os
> (involving both reading and writing for 1 write request); the
> performance difficulties are most probably related to this issue.
> Why doesn't gjournal read a chunk of journal (50MB) and then write
> it? And why doesn't it write contiguously?

Actually gjournal should be much better at this than UFS. It combines
contiguous I/Os up to MAXPHYS and eliminates multiple writes to the
same place. On the other hand, graid5 should be able to delay writes a
bit to see if it can do a full stripe write.

> I've tried:
> - playing with graid5 tunables, including disabling write-back buffer
> - playing with journal tunables, including disabling optimization
> (combining), reducing parallel operations to 1, reducing journal
> switch time and more
> - kmem is 500MB, gjournal can use 250MB of kernel memory for its
> cache (more than the default)
> - standard UFS2 using async option and without softupdates, newfs
> used with -J parameter
>
> Anyone has any input? I was hoping for at least 40MB/s throughput and
> no blocking I/O for read requests.

Have you tried gjournal without graid5? Could you retest on a raw disk
and/or gmirror and/or graid3?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--hxkXGo8AKqTJ+9QI
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (FreeBSD)

iD8DBQFGgMY4ForvXbEpPzQRAgjRAKCdMSgoxbWivhdbeuB2//+BuDfSAQCdHxAH
a9WXNa4FB9SmiAOiKhRLUfA=
=vKNo
-----END PGP SIGNATURE-----

--hxkXGo8AKqTJ+9QI--
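
For readers following the thread: the reply notes that gjournal combines
contiguous I/Os up to MAXPHYS and eliminates multiple writes to the same
place. The sketch below only illustrates that idea and is not gjournal's
code. The function name coalesce_writes, the byte-level map, and the
128 KiB value used for MAXPHYS are all assumptions made for the example.

```python
MAXPHYS = 128 * 1024  # assumed cap on one merged I/O, in bytes (illustrative)


def coalesce_writes(writes, maxphys=MAXPHYS):
    """Merge journaled writes into as few contiguous requests as possible.

    `writes` is a list of (offset, data) pairs in the order they were
    issued. Where writes overlap, the most recent one wins. Returns a
    list of (offset, data) pairs sorted by offset, each at most
    `maxphys` bytes long.
    """
    # Replay the writes in order so a later write overwrites an earlier
    # one at the same offset (a per-byte map: simple, not efficient).
    latest = {}
    for offset, data in writes:
        for i, byte in enumerate(data):
            latest[offset + i] = byte

    # Walk the offsets in ascending order and grow a run while the next
    # byte is adjacent and the run is still below the size cap.
    merged = []
    run_start = None
    run = bytearray()
    for pos in sorted(latest):
        contiguous = run_start is not None and pos == run_start + len(run)
        if not contiguous or len(run) >= maxphys:
            if run:
                merged.append((run_start, bytes(run)))
            run_start, run = pos, bytearray()
        run.append(latest[pos])
    if run:
        merged.append((run_start, bytes(run)))
    return merged
```

With this sketch, the four writes (0, b"aaaa"), (4, b"bbbb"), (2, b"XX")
and (100, b"cc") collapse into two requests: (0, b"aaXXbbbb") and
(100, b"cc"). Whether the layer below (graid5 in this thread) then sees
full-stripe writes depends on how such merged requests line up with its
stripe size, which the sketch does not model.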