Date: Thu, 18 May 2006 02:14:21 +0000
From: cpghost
To: Albert Shih
Cc: freebsd-questions@freebsd.org
Subject: Re: Fast du
Message-ID: <20060518021421.GA84475@epia2.farid-hajji.net>

On Thu, May 18, 2006 at 01:37:47AM +0200, Albert Shih wrote:
> On 17/05/2006 at 18:17:54 -0400, Charles Swiger wrote:
> > On May 17, 2006, at 5:35 PM, Albert Shih wrote:
> > > I search some technics/command/anything can make very fast "du"
> > > especialy when in the file system there are lot of lot of hard-link.

I know of no solution to this, just a few random thoughts:

If you didn't have subdirectories and hard links, you could cache the results of a slow du somewhere and look them up there, updating the cache only when a directory's mtime changes.
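For the flat case just described (no subdirectories, no hard links), the caching idea could look roughly like the sketch below. This is only an illustration, not a tested tool: the names du_cache, slow_du and cached_du are made up here, the cache lives in memory rather than on disk, and keying by inode alone assumes everything sits on a single file system.

```python
import os

# Hypothetical in-memory cache: directory inode -> (du_value_bytes, dir_mtime_ns).
# Keying by inode alone assumes a single file system.
du_cache = {}

def slow_du(path):
    """Sum the allocated bytes of regular files directly inside path (flat case only)."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                # st_blocks is counted in 512-byte units on POSIX systems.
                total += entry.stat(follow_symlinks=False).st_blocks * 512
    return total

def cached_du(path):
    """Return the cached usage if the directory mtime is unchanged, else recompute."""
    st = os.stat(path)
    hit = du_cache.get(st.st_ino)
    if hit is not None and hit[1] == st.st_mtime_ns:
        return hit[0]
    value = slow_du(path)
    du_cache[st.st_ino] = (value, st.st_mtime_ns)
    return value
```

Calling cached_du("/some/dir") twice in a row walks the directory only the first time. Keep in mind that a directory's mtime changes when entries are added, removed or renamed, but not when an existing file grows in place, so a cache like this can go stale for in-place writes.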
Let du-cache := { (dir-ino, (du-value, timestamp)) |
                  dir-ino is a directory inode number [key of the cache],
                  du-value is the disk usage of dir-ino, taken at timestamp }

But with subdirectories you need to take care of recursion, and that makes the bookkeeping for the du-cache somewhat more complicated.

With hard links, especially across directories, you need an additional hard-link cache, and AFAICS there's no way to have that updated automatically when a hard-linked file changes size elsewhere...

...unless you decide to add some hooks to VFS(9). But if you go this route, you might as well hook the entire fast-du bookkeeping in at the VFS level. That is most likely a major undertaking (if you do, remember quota(1)).

If you don't need absolute accuracy 100% of the time, you could build a du-cache once every few days or so, just like locate(1)'s database, and use directory timestamps to update it incrementally. It would then take a long time only on the first build, and hopefully much less on subsequent runs (depending on the usage pattern, of course).

> Albert SHIH
> Universite de Paris 7 (Denis DIDEROT)
> U.F.R. de Mathematiques.
> 7 ième étage, plateau D, bureau 10

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/