From: Tim Kientzle <tim@kientzle.com>
Date: Thu, 22 Apr 2004 18:27:28 -0700 (PDT)
To: Eric Anderson
Cc: freebsd-current@freebsd.org
Subject: Re: Directories with 2million files

Eric Anderson wrote:

> First, let me say that I am impressed (but not shocked) - FreeBSD
> quietly handled my building of a directory with 2055476 files in it.
>
> However, several tools seem to choke on that many files ...
>
> $ ls -al | wc -l
> ls: fts_read: Cannot allocate memory
> 0
>
> Watching memory usage, it goes up to about 515Mb, and runs out of memory
> (can't swap it), and then dies.  (I only have 768Mb in this machine).

Not "can't swap", but "doesn't need to swap."  Your 'top' output shows
you've got plenty of free swap, so that's not the issue.  I suspect you
have a per-process data limit of 512MB, so the kernel is killing the
process when it gets too big.  Up that limit, and it should succeed.
What does "limit -d" say?  What is 'datasize' set to in /etc/login.conf?
What are you using for DFLDSIZ in your kernel config file?  (See
/usr/src/sys/conf/NOTES for more information on DFLDSIZ, which I believe
defaults to 512MB.)

If you're using directories with over 2 million files, you probably have
other processes that could use more memory as well, so upping this limit
is advisable.

The Real Fix

Of course, 'ls' should probably not be using this much memory.  Having
some recent experience with fts, I think I can explain why it uses so
much memory and what would be required to fix it, in case some
enterprising soul would like to tackle it.  (I might have a few details
wrong, of course.  Caveat hacker.)

ls, du, and find all use the fts library (src/lib/libc/gen/fts.c) to
walk the directory hierarchy.  fts is pretty nifty, but it can be
memory-hungry.  To understand why, I'll first explain how I once tried
to walk a directory tree before I started using fts myself: opendir()
each directory, read and return entries one at a time (recursively
handling each directory as you encounter it), then closedir().  This
approach is simple and obvious, but it breaks on very deeply nested
directory trees because it keeps an open file handle for each nested
directory.
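In rough outline, that naive approach looks something like the sketch
below (written from memory for illustration, not taken from any of the
actual tools); note that every level of recursion holds an open DIR
handle, and therefore a file descriptor:

#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Naive recursive walk: one open directory handle per nesting level. */
static void
walk(const char *path)
{
        DIR *d = opendir(path); /* fd stays open across the recursion */
        struct dirent *dp;
        struct stat sb;
        char child[PATH_MAX];

        if (d == NULL)
                return;
        while ((dp = readdir(d)) != NULL) {
                if (strcmp(dp->d_name, ".") == 0 ||
                    strcmp(dp->d_name, "..") == 0)
                        continue;
                snprintf(child, sizeof(child), "%s/%s", path, dp->d_name);
                printf("%s\n", child);
                if (lstat(child, &sb) == 0 && S_ISDIR(sb.st_mode))
                        walk(child);    /* recursion => another open fd */
        }
        closedir(d);
}

int
main(int argc, char *argv[])
{
        walk(argc > 1 ? argv[1] : ".");
        return (0);
}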
Instead, the BSD fts library reads each directory entirely into memory
when it first sees it, so it can immediately close that directory.  In
your case, the fts library is keeping an in-memory list of every file in
the directory.  ls is then requesting all of that information from fts,
formatting all of the output lines, and then holding them in memory so
that it can compute the optimal column widths.  This last bit surprised
me, too, when I stumbled across it; I'm not at all certain that 'ls -lf'
should do that.

As a result, 'ls' ends up with two copies of the directory information:
one copy in the form of 'stat' structures within the fts library, and
another copy in the form of partially-formatted lines for the final
output.  Yes, that's a lot of memory.

Here are some ideas for improving 'ls'; maybe someone will take a stab
at implementing one or more of these:

* Easy: Change ls -lf so that it does not store the lines and adjust the
  column widths, but just outputs each line immediately, using pre-set
  minimum column widths.  Less pretty, but it certainly agrees with the
  expected behavior of -f.

* Moderate: Have 'ls' not use fts when listing a single directory.
  Instead, use opendir/readdir/closedir directly.  Combined with the
  above, ls -lf would be exactly as fast as you would expect.

* Moderate-Difficult: Make fts smarter/more flexible:

  = Maybe fts should not always read a full directory as soon as it sees
    it (for instance, it might only slurp the remainder of the directory
    into memory if it encounters a subdirectory, if the client requires
    sorted output, or if fts exceeds some small limit on open file
    descriptors).

  = Adding more flags to fts might help (if fts knows the client will
    never re-visit files, for example, then fts can be more aggressive
    about releasing memory; similarly, a number of clients, including
    'ls', need _directories_ to be visited in sorted order, but not
    regular files).  ls should arguably be doing its own sorting, since
    it has to keep the lines in memory anyway.

  = fts might read only the first few entries of a directory initially,
    reading the remainder on demand.

Some combination of the above would allow fts to use a lot less memory
in a number of common situations.

Overhauling fts could make a very good student project for someone at
the senior/masters level.  It's a surprisingly tricky exercise in
algorithm and library API design; there are a number of trade-offs in
fts that would require some real insight to fully overcome.

> du does the exact same thing.

This puzzles me.  du shouldn't use nearly as much memory as ls does,
because 'du' doesn't store any per-file information outside of fts.
That's a real head-scratcher.  If I have time, I'll look at this one.

> find, however, works fine (and is very fast!):
> $ time find . | wc -l
>  2055476

'find' also uses fts, so it should be using approximately as much memory
as 'du'.  I find it a bit peculiar that it behaves so differently.

Tim Kientzle
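P.S.  For anyone who wants to experiment, the basic loop that ls, du,
and find all build around fts looks roughly like the sketch below.  It
is a bare-bones illustration of the fts_open/fts_read/fts_close
interface, not code lifted from any of those tools; the FTS_PHYSICAL
flag and the absent sort callback are just one plausible configuration.

#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stdio.h>

int
main(int argc, char *argv[])
{
        char *paths[2] = { argc > 1 ? argv[1] : ".", NULL };
        FTS *ftsp;
        FTSENT *p;

        /* FTS_PHYSICAL: don't follow symlinks; NULL: no sort callback. */
        ftsp = fts_open(paths, FTS_PHYSICAL, NULL);
        if (ftsp == NULL)
                return (1);
        /*
         * fts_read() hands back one entry at a time, but fts has
         * already read each directory's contents into memory by the
         * time its entries are returned.
         */
        while ((p = fts_read(ftsp)) != NULL) {
                if (p->fts_info == FTS_DP)  /* skip post-order dir visits */
                        continue;
                printf("%s\n", p->fts_path);
        }
        fts_close(ftsp);
        return (0);
}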