Message-Id: <200503282003.j2SK3GEV095360@gw.catspoiler.org>
Date: Mon, 28 Mar 2005 12:03:16 -0800 (PST)
From: Don Lewis
To: dwmalone@maths.tcd.ie
In-Reply-To: <20050328153506.GA198@walton.maths.tcd.ie>
cc: freebsd-fs@FreeBSD.org
cc: rwatson@FreeBSD.org
Subject: Re: UFS Subdirectory limit.

On 28 Mar, David Malone wrote:
> Here's the benchmark results comparing a two level scheme (which
> I've labeled "sqrt") with a single directory with 150000 subdirectories
> (which I've labeled "flat").
>
> The benchmark is in 4 phases:
>
>     mkdir) This builds the directory structure.
>     write) This writes a small amount of data into 100000 files
>            in a pseudo random sequence of subdirectories.
>     read)  This reads back the data from each of the 100000
>            files (in the same order they were written).
>     rm)    This does an "rm -fr" of the whole tree.
>
> I just used /usr/bin/time on each phase and synced out the data
> between each phase. The results (averaged over 4 runs, see the end
> of the mail for the output of ministat on the data).
>
>            real time         |       user time         |        sys time
>       mkdir write  read    rm | mkdir write  read    rm | mkdir write  read    rm
> sqrt   499  4302  2409  1569 |  1.84  1.94  1.72  1.69 |  29.9  33.5  21.3 161.6
> flat  1172  4318  2407  1717 |  1.47  1.62  1.52  1.66 |  26.1  33.5  20.6 158.1
>
> So, it seems that while making the directory structure takes a bit
> longer for the flat method, there's no significant penalty in real
> time for using it. The user times are pretty irrelevant (though the
> flat scheme is slightly faster, probably because some of the phases
> don't do sqrts ;-).
>
> Interestingly, the system times for the flat structure are actually
> *better* than the two level structure! I think this supports Don's
> suggestion that the layout of data on the disk with very large
> directories is not as good as it could be.

Just for grins, you might want to try a "very-flat" experiment where you
create all 100000 files in the top directory.

Traditionally, directories were always allocated in a different cylinder
group from their parent, which would spread them all over the disk. This
turns out to be somewhat sub-optimal because it causes an excessive
amount of seek activity when traversing large directory trees. When the
dirpref code was added, it allowed a limited number of subdirectories to
be allocated in the same cylinder group as their parent, but I suspect
that the allocations will still be fairly well distributed when running
your benchmark.
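
For anyone who wants to repeat the comparison, the three layouts could
be exercised with something like the sketch below. It is not the script
used for the numbers above, which was not posted. The names dir_for,
timed and run_benchmark are made up for illustration, the two-level
"sqrt" mapping (about sqrt(ndirs) entries per level) is an assumption
about how that scheme splits the directories, shutil.rmtree stands in
for "rm -fr", and os.times() stands in for /usr/bin/time. The counts in
the usage example are scaled far down from 150000 and 100000 so that it
finishes quickly.

```python
import math
import os
import random
import shutil
import tempfile
import time


def dir_for(layout, root, index, ndirs):
    """Return the directory that slot `index` maps to under a layout."""
    if layout == "flat":
        # One directory holding ndirs subdirectories.
        return os.path.join(root, "d%06d" % index)
    if layout == "sqrt":
        # Two levels with about sqrt(ndirs) entries each (assumed mapping).
        width = math.isqrt(ndirs - 1) + 1
        return os.path.join(root, "a%04d" % (index // width),
                            "b%04d" % (index % width))
    if layout == "very-flat":
        # No subdirectories: every file lives in the top directory.
        return root
    raise ValueError("unknown layout: %r" % layout)


def timed(fn):
    """Run fn() and return (real, user, sys) seconds for this process."""
    t0, c0 = time.perf_counter(), os.times()
    fn()
    t1, c1 = time.perf_counter(), os.times()
    return (t1 - t0, c1.user - c0.user, c1.system - c0.system)


def run_benchmark(layout, ndirs, nfiles, seed=0):
    """Run the mkdir, write, read and rm phases; return timings per phase."""
    root = tempfile.mkdtemp(prefix="dirbench-")
    rng = random.Random(seed)
    # Pseudo random sequence of directory slots, reused by the read phase.
    order = [rng.randrange(ndirs) for _ in range(nfiles)]
    sync = getattr(os, "sync", lambda: None)  # os.sync is Unix-only

    def path(n, i):
        return os.path.join(dir_for(layout, root, i, ndirs), "f%06d" % n)

    def do_mkdir():
        for i in range(ndirs):
            os.makedirs(dir_for(layout, root, i, ndirs), exist_ok=True)

    def do_write():
        for n, i in enumerate(order):
            with open(path(n, i), "w") as f:
                f.write("x" * 64)

    def do_read():
        for n, i in enumerate(order):
            with open(path(n, i)) as f:
                f.read()

    def do_rm():
        shutil.rmtree(root)

    results = {}
    phases = [("mkdir", do_mkdir), ("write", do_write),
              ("read", do_read), ("rm", do_rm)]
    for name, fn in phases:
        results[name] = timed(fn)
        sync()  # flush dirty data between phases
    return results


if __name__ == "__main__":
    for layout in ("sqrt", "flat", "very-flat"):
        res = run_benchmark(layout, ndirs=1500, nfiles=1000)
        print(layout, {k: "%.2f/%.2f/%.2f" % v for k, v in res.items()})
```

The temporary directory lands wherever tempfile puts it, so to measure
UFS behaviour it would need to be pointed at a UFS filesystem (for
example through TMPDIR). At these small sizes the numbers mostly reflect
the buffer cache rather than on-disk layout.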