From: Scott Bennett
Date: Fri, 21 Oct 2016 08:40:52 -0500
To: freebsd-stable@freebsd.org
Cc: "Eugene M. Zheganin"
Subject: Re: zfs, a directory that used to hold lot of files and listing pause
Message-Id: <201610211340.u9LDer6D018453@sdf.org>

On Fri, 21 Oct 2016 16:51:36 +0500 "Eugene M. Zheganin" wrote:

>On 21.10.2016 15:20, Slawa Olhovchenkov wrote:
>>
>> ZFS prefetch affect performance dpeneds of workload (independed of RAM
>> size): for some workloads wins, for some workloads lose (for my
>> workload prefetch is lose and manualy disabled with 128GB RAM).
>>
>> Anyway, this system have only 24MB in ARC by 2.3GB free, this is may
>> be too low for this workload.
>You mean - "for getting a list of a directory with 20 subdirectories" ?
>Why then does only this directory have this issue with pause, not
>/usr/ports/..., which has more directories in it ?
>
>(and yes, /usr/ports/www isn't empty and holds 2410 entities)
>
>/usr/bin/time -h ls -1 /usr/ports/www
>[...]
>0.14s real 0.00s user 0.00s sys
>

Oh, my goodness, how far afield nonsense has gotten!  Have all the
good folks posting in this thread forgotten how directory blocks are
allocated in UNIX?  This isn't even a BSD-specific thing; it's really
ancient.  What Eugene has complained of is exactly what is to be
expected-- on really old hardware.  The only eyebrow-raiser is that he
has created a use case so extreme that a live human can actually notice
the delays on modern hardware.

I quote from his original posting:  "I also have one directory that
used to have a lot of (tens of thousands) files." and "But now I have
2 files and a couple of dozens directories in it".

A directory with tens of thousands of files in it at one point in time
most likely has somewhere well over one thousand blocks allocated.
Directories don't shrink.  Directory entries do not get moved around
within directories when files are added or deleted.  Directories can
remain the same length or they can grow in length.  If a directory once
had many tens of thousands of filenames and links to their primary
inodes, then the directory is still that big, even if it now only
contains two [+ 20 to 30 directory], probably widely separated, entries.
To read a file's entry, all blocks must be searched until the desired
filename is found.  Likewise, to list the contents of a directory, all
blocks must be read until the number of files found matches the link
count for the directory.
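
The allocation behavior described above can be sketched as a toy model.
This is a minimal Python illustration only, not UFS or ZFS code: the
ToyDirectory class, its slot array, and the file names are all made up
for the example, and it assumes a directory is a flat array of entry
slots that are freed and reused in place but never compacted.

```python
class ToyDirectory:
    """Toy model: a directory as a flat array of entry slots.

    Hypothetical illustration only; real filesystems differ in detail.
    """

    def __init__(self):
        self.slots = []   # each slot holds a name, or None once freed
        self.where = {}   # name -> slot index (bookkeeping for the model)
        self.free = []    # indices of freed slots available for reuse

    def add(self, name):
        # Reuse a freed slot if one exists; otherwise the directory grows.
        if self.free:
            i = self.free.pop()
            self.slots[i] = name
        else:
            i = len(self.slots)
            self.slots.append(name)
        self.where[name] = i

    def remove(self, name):
        # Freeing an entry leaves a hole; the slot array never shrinks.
        i = self.where.pop(name)
        self.slots[i] = None
        self.free.append(i)

    def listing(self):
        # Listing walks every slot, live or free, and reports how many
        # slots it had to scan.
        names = [s for s in self.slots if s is not None]
        return names, len(self.slots)


# A directory that once held 50,000 files and now holds only two.
old = ToyDirectory()
for n in range(50_000):
    old.add(f"file{n}")
for n in range(50_000):
    if n not in (7, 49_000):
        old.remove(f"file{n}")

# The same two entries placed in a freshly created directory.
fresh = ToyDirectory()
for name in old.listing()[0]:
    fresh.add(name)

old_names, old_scanned = old.listing()
fresh_names, fresh_scanned = fresh.listing()
print(len(old_names), old_scanned)      # 2 50000
print(len(fresh_names), fresh_scanned)  # 2 2
```

In this model both directories hold the same two names, but listing the
old one still scans all 50,000 slots, while the fresh one scans two.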
IOW, if you want the performance to go back to what it was when the
directory was fresh (and still small), you have to create a new
directory and then move the remaining entries from the old directory
into the new (small) directory.

The only real difference here between UFS (or even the early AT&T
filesystem) and ZFS is that the two remaining entries in a formerly
huge directory are likely to be in different directory blocks that
could be at effectively random locations scattered around the space of
a partition for one filesystem in UFS or over an entire pool of
potentially many filesystems and much more space in ZFS.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************