Date: Tue, 6 Oct 2009 18:03:16 +1100 (EST)
From: Bruce Evans
To: Dieter
Cc: freebsd-performance@FreeBSD.org
Subject: Re: tuning FFS for large files Re: A specific example of a disk i/o problem
Message-ID: <20091006174121.V25604@delplex.bde.org>
In-Reply-To: <200910051755.RAA11047@sopwith.solgatos.com>
References: <200910051755.RAA11047@sopwith.solgatos.com>

On Mon, 5 Oct 2009, Dieter wrote:

> I found a clue!  The problem occurs with my big data partitions,
> which are newfs-ed with options intended to improve things.
>
> Reading a large file from the normal ad4s5b partition only delays other
> commands slightly, as expected.  Reading a large file from the tuned
> ad4s11 partition yields the delay of minutes for other i/o.
> ...
> Here is the newfs command used for creating large data partitions:
> newfs -e 57984 -b 65536 -f 8192 -g 67108864 -h 16 -i 67108864 -U -o time $partition

Any block size above the default (16K) tends to thrash and fragment buffer
cache virtual memory.  This is obviously a good pessimization with lots of
small files, and apparently, as you have found, it is a good pessimization
with a few large files too.  I think severe fragmentation can easily take
several seconds to recover from.  The worst case for causing fragmentation
is probably a mixture of small and large files.

Some users fear fs consistency bugs with block sizes >= 16K, but I've never
seen them cause any bugs except performance ones.

> Even this isn't tuned the way I wanted to.
> -g * -h must be less than 4 G due to 32 bit problem (system panics).

The panic is now avoided in some versions of FreeBSD (-8 and -current at
least).

> Note the 32 bit problem is in the filesystem code, I'm running amd64.
> IIRC there is a PR about this.  (I'm assuming the bug hasn't been fixed yet)
> Result is that I must specify -g and -h smaller than they should be.

I bet you can't see any difference (except the panic) from enlarging -g and
-h.

> And they have way more inodes than needed.  (IIRC it doesn't actually
> use -i 67108864)

It has to have at least 1 inode per cg, and may as well have a full block
of them, which gives a fairly large number of inodes especially if the
block size is too large (64K), so the -i ratio is limited.

Bruce
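
For anyone wanting to check the arithmetic behind the two limits discussed
above, here is a rough sketch.  It assumes on-disk inode sizes of 256 bytes
(UFS2) and 128 bytes (UFS1), and takes "4 G" to mean 2**32.  The
cylinder-group count in the last function is a hypothetical input, not a
value from this thread, and the function names are made up for illustration.

```python
# Sketch of the arithmetic only; not taken from newfs or the FFS sources.
# Assumptions: UFS2 inodes are 256 bytes, UFS1 inodes are 128 bytes,
# and the 32-bit limit on (-g * -h) is 2**32.

BLOCK_SIZE = 65536         # newfs -b from the quoted command
AVG_FILE_SIZE = 67108864   # newfs -g from the quoted command
AVG_FILES_PER_DIR = 16     # newfs -h from the quoted command
LIMIT_32BIT = 2 ** 32


def inodes_per_block(block_size, inode_size):
    """Number of on-disk inodes that fit in one filesystem block."""
    return block_size // inode_size


def g_times_h_ok(avg_file_size, avg_files_per_dir, limit=LIMIT_32BIT):
    """True if the product of -g and -h stays below the 32-bit limit."""
    return avg_file_size * avg_files_per_dir < limit


def min_total_inodes(ncg, block_size, inode_size):
    """Lower bound on inodes if every cg gets one full block of them.

    ncg is a hypothetical cylinder-group count supplied by the caller.
    """
    return ncg * inodes_per_block(block_size, inode_size)


if __name__ == "__main__":
    print(inodes_per_block(BLOCK_SIZE, 256))               # 256 (UFS2)
    print(inodes_per_block(BLOCK_SIZE, 128))               # 512 (UFS1)
    print(g_times_h_ok(AVG_FILE_SIZE, AVG_FILES_PER_DIR))  # True: 2**30
```

With -b 65536, one block of inodes per cg already means 256 inodes per cg
on UFS2, whatever -i asks for.  The quoted -g and -h multiply out to 2**30,
which is under the limit; raising -h to 64 with the same -g would reach
2**32 exactly and no longer be below it.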