Date: Wed, 29 Sep 2010 09:14:57 +1000 (EST)
From: Bruce Evans
To: Bruce Evans
Cc: fs@freebsd.org
Subject: Re: ext2fs now extremely slow

On Wed, 29 Sep 2010, Bruce Evans wrote:

> On Wed, 29 Sep 2010, Bruce Evans wrote:
>
>> For benchmarks on ext2fs:
>>
>> Under FreeBSD-~5.2 rerun today:
>> untar:     59.17 real
>> tar:       19.52 real
>>
>> Under -current run today:
>> untar:    101.16 real
>> tar:      172.03 real
>>
>> So, -current is 8.8 times slower for tar, but only 1.7 times slower for
>> untar.
>> ...
>> So it seems that only 1 block in every 8 is used, and there is a seek
>> after every block.
>> This asks for an 8-fold reduction in throughput,
>> and it seems to have got that and a bit more for reading although not
>> for writing.  Even (or especially) with perfect hardware, it must give
>> an 8-fold reduction.  And it is likely to give more, since it defeats
>> vfs clustering by making all runs of contiguous blocks have length 1.
>>
>> Simple sequential allocation should be used unless the allocation policy
>> and implementation are very good.
>
> Things work a bit better after zapping the 8-fold way:
> ...
> This gives an improvement of:
>
> untar:    101.16 real -> 63.46
> tar:      172.03 real -> 50.70
>
> Now -current is only 1.1 times slower for untar and 2.6 times slower for
> tar.
>
> There must be a problem with bpref for things to have been so bad.  There
> is some point to leaving a gap of 7 blocks for expansion, but the gap was
> left even between blocks in a single file.
> ...
> I haven't tried the bde_blkpref hack in the above.  It should kill bpref
> completely so that there is no jump between lbn0 and lbn1, and break
> cylinder group based allocation even better.  Setting bde_blkpref to 1
> restores the bug that was present in ext2fs in FreeBSD between 1995 and
> 2010.  This bug gave sequential allocation starting at the beginning of
> the disk in almost all cases, so map searches were slow and early groups
> filled up before later groups were used at all.

I tried this (patch repeated below), and it gave essentially the same speed
as old versions.

The main problem seems to be that the `goal' variables aren't initialized.
After restoring bits verbatim from an old version, things seem to work as
expected:

% Index: ext2_alloc.c
% ===================================================================
% RCS file: /home/ncvs/src/sys/fs/ext2fs/ext2_alloc.c,v
% retrieving revision 1.2
% diff -u -2 -r1.2 ext2_alloc.c
% --- ext2_alloc.c	1 Sep 2010 05:34:17 -0000	1.2
% +++ ext2_alloc.c	28 Sep 2010 21:08:42 -0000
% @@ -1,2 +1,5 @@
% +int bde_blkpref = 0;
% +int bde_alloc8 = 0;
% +
%  /*-
%   *  modified for Lites 1.1
% @@ -117,4 +120,8 @@
%  		    ext2_alloccg);
%  	if (bno > 0) {
% +		/* set next_alloc fields as done in block_getblk */
% +		ip->i_next_alloc_block = lbn;
% +		ip->i_next_alloc_goal = bno;
% +
%  		ip->i_blocks += btodb(fs->e2fs_bsize);
%  		ip->i_flag |= IN_CHANGE | IN_UPDATE;

The only things that changed recently in this block were the 4 deleted
lines and 4 lines with tabs corrupted to spaces.  Perhaps an editing error.

% @@ -542,6 +549,12 @@
%  	   then set the goal to what we thought it should be
%  	*/
% +if (bde_blkpref == 0) {
%  	if(ip->i_next_alloc_block == lbn && ip->i_next_alloc_goal != 0)
%  		return ip->i_next_alloc_goal;
% +} else if (bde_blkpref == 1) {
% +	if(ip->i_next_alloc_block == lbn)
% +		return ip->i_next_alloc_goal;
% +} else
% +	return 0;
%
%  	/* now check whether we were provided with an array that basically

Not needed now.

% @@ -662,4 +675,5 @@
%  	 * block.
%  	 */
% +if (bde_alloc8 == 0) {
%  	if (bpref)
%  		start = dtogd(fs, bpref) / NBBY;
% @@ -679,4 +693,5 @@
%  		}
%  	}
% +}
%
%  	bno = ext2_mapsearch(fs, bbp, bpref);

The code to skip to the next 8-block boundary should be removed permanently.
After fixing the initialization, it doesn't generate holes inside files but
it still generates holes between files.  The holes are quite large with
4K-blocks.
Benchmark results with just the initialization of `goal' variables restored:

%%%
ext2fs-1024-1024:
tarcp /f srcs:                  78.79 real         0.31 user         4.94 sys
tar cf /dev/zero srcs:          24.62 real         0.19 user         1.82 sys
ext2fs-1024-1024-as:
tarcp /f srcs:                  52.07 real         0.26 user         4.95 sys
tar cf /dev/zero srcs:          24.80 real         0.10 user         1.93 sys
ext2fs-4096-4096:
tarcp /f srcs:                  74.14 real         0.34 user         3.96 sys
tar cf /dev/zero srcs:          33.82 real         0.10 user         1.19 sys
ext2fs-4096-4096-as:
tarcp /f srcs:                  53.54 real         0.36 user         3.87 sys
tar cf /dev/zero srcs:          33.91 real         0.14 user         1.15 sys
%%%

The much larger holes between the files are apparently responsible for the
decreased speed with 4K-blocks.  1K-blocks are really too small, so 4K-blocks
should be faster.

Benchmark results with the fix and bde_alloc8 = 1:

ext2fs-1024-1024:
tarcp /f srcs:                  71.60 real         0.15 user         2.04 sys
tar cf /dev/zero srcs:          22.34 real         0.05 user         0.79 sys
ext2fs-1024-1024-as:
tarcp /f srcs:                  46.03 real         0.14 user         2.02 sys
tar cf /dev/zero srcs:          21.97 real         0.05 user         0.80 sys
ext2fs-4096-4096:
tarcp /f srcs:                  59.66 real         0.13 user         1.63 sys
tar cf /dev/zero srcs:          19.88 real         0.07 user         0.46 sys
ext2fs-4096-4096-as:
tarcp /f srcs:                  37.30 real         0.12 user         1.60 sys
tar cf /dev/zero srcs:          19.93 real         0.05 user         0.49 sys

Bruce