From: "Alexandre Biancalana"
To: "Alfred Perlstein"
Date: Fri, 21 Dec 2007 20:55:22 -0300
Message-ID: <8e10486b0712211555n3efe8729qff14387be128cf10@mail.gmail.com>
In-Reply-To: <20071221212808.GE16982@elvis.mu.org>
References: <8e10486b0712191109n3d21b02cyf5183ee0cd01d8ce@mail.gmail.com> <20071221201625.GZ16982@elvis.mu.org> <8e10486b0712211249v4c5571ddud21b277f686992b2@mail.gmail.com> <20071221212808.GE16982@elvis.mu.org>
Cc: freebsd-performance@freebsd.org
Subject: Re: Bad performance when accessing a lot of small files

On 12/21/07, Alfred Perlstein wrote:
> * Alexandre Biancalana [071221 12:48] wrote:
> > On 12/21/07, Alfred Perlstein wrote:
> > > > Hi Alfred !
> > > >
> > >
> > > There is a lot of very good tuning advice in this thread, however
> > > one thing to note is that having ~1 million files in a directory
> > > is not a very good thing to do on just about any filesystem.
> >
> > I think I was not clear, I will try to explain better.
> >
> > This Backup Server has a /backup zfs filesystem of 4TB.
> >
> > Each host that does backups to this server has a /backup/ and
> > /backup//YYYYMMDD zfs filesystems, the last contains the
> > backups for some day of that server.
> >
> > My problem is with some hosts that have in their directory structure a
> > lot of small files, independent of the hierarchy.
>
> Can you not tar these files together?

This is what I'm trying to do.

> > > One trick that a lot of people do is hashing the directories themselves
> > > so that you use some kind of computation to break this huge dir into
> > > multiple smaller dirs.
> >
> > I have the two cases: when you have a lot of files inside one directory
> > without any directory organization/distribution, but I also have
> > problems with hosts that have files organized in a hierarchy like
> > YYYY/MM/DD/ having no more than 200 files in the day directory
> > level, but almost one million files in total.
> >
> > Just for info, I made the previously suggested tuning (raise dirhash,
> > maxvnodes) but this improved nothing.
> >
> > Thanks for your hint!
>
> What application are you scanning these files with?  I know I had
> issues with rsync in particular where I had to have it rsync
> smaller pieces of a collection for it to work nicely instead of
> going for the whole hierarchy.

tar.

I run tar in /backup//YYYYMMDD, writing to an LTO3 tape drive. The
problem is that when the source directory contains a lot of small files,
the process is *much* slower. This has been my question since the start
of the thread.
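
For reference, the directory-hashing trick quoted earlier in this message (using "some kind of computation" to break one huge directory into several smaller ones) could look roughly like the following. This is only a minimal sketch in Python, not something posted in the thread: the names `shard_path` and `store`, the choice of SHA-256, and the two-level, two-hex-digit layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def shard_path(root: Path, filename: str, levels: int = 2, width: int = 2) -> Path:
    """Return root/<h1>/<h2>/filename, where h1, h2 are hex slices of a hash.

    Hypothetical helper: the hash function and layout are illustrative
    assumptions, not taken from the thread.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    # Take `levels` consecutive slices of `width` hex digits as subdirectory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return root.joinpath(*parts, filename)


def store(root: Path, filename: str, data: bytes) -> Path:
    """Write data under the hashed subdirectory, creating it if needed."""
    target = shard_path(root, filename)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

With these defaults there are 256 * 256 = 65,536 leaf directories, so one million files would average about 15 files per directory. Note that this only bounds the size of each directory; as reported above, hosts whose files are already spread over a YYYY/MM/DD/ hierarchy show the same slowdown, so it may not help with the total file count.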