Date: Sat, 04 May 2002 08:15:14 -0700
From: Terry Lambert
To: utsl@quic.net
Cc: Bakul Shah, Scott Hess, "Vladimir B. Grebenschikov", fs@FreeBSD.ORG
Subject: Re: Filesystem

utsl@quic.net wrote:

[ ... linear directory search times on the majority of systems ... ]

> OTOH, I've seen a very large application (it ran on a Sun E10K) that did
> absolutely nothing about it. It was designed to put some ~1-2k files
> into a spool directory, and rotate every day. Unfortunately, the
> application didn't ever get redesigned to handle the scale it was being
> used for. So when I dealt with it, they had a filesystem that had
> 800,000 to 1M files in 15-16 directories. (Varied from day to day.) I
> found out about it when I was asked to figure out why the incremental
> backups for that filesystem never completed. They would run for ~35-40
> hours and then crash. If I remember right, the backup program was
> running out of address space.
> 8-)
>
> Even if the filesystem had used btrees, the backup program would still
> have crashed. It was trying to make a list in memory of all the files
> it needed to backup. It never actually wrote anything to tape... I don't
> know if all backup software does incrementals that way, but I'd bet most
> of them do.
>
> So there can be other disadvantages to having lots of files in a
> directory besides slow directory lookups.

I wasn't really trying to exhaustively list all the reasons that it
was bad to put a bunch of files in a large directory. There are an
incredibly large number of reasons for it to be bad, and I have
better things to do than spend the rest of time pointing out
impedance mismatches in algorithms. 8-).

My take on an application that doesn't scale is that "fixing" the
application by changing the behaviour of the underlying system is
just propping up bad code. Bad code deserves to lose.

So if someone wrote an application like that, it's just as well
that the programmer who failed to consider scaling issues lose out
to the programmer who considered them. After all, it's very likely
that the failure to consider scaling issues is more of an "all or
nothing" thing, and that the failure to consider one means that
solving it in the OS will just expose the next one.

There's really no way you can make the OS behave perfectly for all
applications. At some point, applications programmers will have to
learn how to program, or all bets are off.

-- Terry