From: David Schultz
Date: Thu, 19 Feb 2004 23:22:58 -0800
To: mi+mx@aldan.algebra.com
cc: performance@FreeBSD.ORG, fs@FreeBSD.ORG
Subject: Re: strange performance dip shown by iozone

On Wed, Feb 18, 2004, mi+mx@aldan.algebra.com wrote:
> I'm trying to tune the amrd-based RAID5 and have made several iozone
> runs on the array and -- for comparison -- on the single disk connected
> to the Serial ATA controller directly. [...]
> The filesystems displayed different performance (reads are better with
> RAID, writes -- with the single disk), but both have shown a notable dip
> in writing (and re-writing) speed when iozone used the record lengths
> of 128 and 256. Can someone explain that? Is that a known fact? How can
> that be avoided?

This is known as the small write problem for RAID 5.  Basically, any
write that does not cover a full RAID 5 stripe is performed using an
expensive read-modify-write operation so that the parity can be
recomputed: the old data and the old parity have to be read before the
new data and the new parity can be written.  The solution is to not do
that.  If you expect lots of small random writes and you can't do
anything about it, you need to either use RAID 1 instead of RAID 5, or
use a log-structured filesystem, such as NetBSD's LFS.
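
To make the cost of the read-modify-write concrete, here is a minimal
sketch in Python. It models a stripe as an in-memory list of equal-sized
blocks with XOR parity. The function names and the I/O counting are
assumptions made for illustration; this is not how amr(4) or any
particular controller is implemented.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-length blocks (the RAID 5 parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write(stripe, parity, index, new_data):
    """Overwrite one data block of a stripe (hypothetical helper).

    Returns (new_parity, reads, writes). The old data and the old parity
    must be read before anything can be written.
    """
    old_data = stripe[index]        # read 1: old data block
    old_parity = parity             # read 2: old parity block
    # Removing the old data's contribution and adding the new one:
    new_parity = xor_blocks(old_parity, old_data, new_data)
    stripe[index] = new_data        # write 1: new data block
    return new_parity, 2, 2         # write 2: new parity block


def full_stripe_write(new_blocks):
    """Overwrite a whole stripe (hypothetical helper).

    Parity is computed from the new data alone, so no reads are needed.
    Returns (parity, reads, writes).
    """
    parity = xor_blocks(*new_blocks)
    return parity, 0, len(new_blocks) + 1


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
    parity = xor_blocks(*stripe)
    parity, reads, writes = small_write(stripe, parity, 1, b"\xaa\xbb")
    print("small write:", reads, "reads,", writes, "writes")
    _, reads, writes = full_stripe_write(stripe)
    print("full stripe:", reads, "reads,", writes, "writes")
```

In this toy model a one-block update costs two reads and two writes,
while a full-stripe write of three data blocks costs four writes and no
reads, which is why batching writes into whole stripes avoids the
penalty.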