From: Bruce Evans
Date: Thu, 13 Dec 2012 01:50:52 +1100 (EST)
To: Olav Grønås Gjerde
Cc: freebsd-fs@freebsd.org, freebsd-performance@freebsd.org
Subject: Re: find vs ls performance for walking folders, are there any faster options?
Message-ID: <20121213012632.M1201@besplex.bde.org>

On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:

> I'm working on scanning filesystems to build a file search engine and
> came over something interesting.
>
> I can walk through 300 000 folders in ~19.5 seconds with this command:
> ls -Ra | grep -e "./.*:" | sed "s/://"
>
> With find, it surprisingly takes ~50.5 seconds:
> find . -type d

This is because 'find' with '-type' lstats all the files.  It doesn't
use DT_DIR from dirent for some reason.  ls can be slowed down similarly
using -F.

> My results are based on five runs of each command to warm up the disk cache.
> I've tried this with both UFS and ZFS, and both filesystems show
> the same speed difference.

I get almost exactly the same ratio of speeds on an old version of
FreeBSD.  All the data was cached, and there were only 7 symlinks.  The
file system was mounted with -noatime, so the cache actually worked.

> On a modern Linux distribution (Ubuntu 12.10 with EXT4), ls is just
> slightly faster than find (about 15-20%).

Apparently lstat() is relatively much slower in FreeBSD.  It only takes
5 usec here, but that is a lot for converting cached data (getpid()
takes 0.2 usec).  A file system mounted with -atime might be much slower,
for writing directory timestamps (the sync of the timestamps is delayed,
but it is a very heavyweight operation).

> Is there a faster way to walk folders on FreeBSD?  Are there some
> options (sysctl) I could tune to improve the performance?

Nothing much faster than find without -type.  Whatever fts(3) gives.

Bruce
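
For illustration only, here is a minimal sketch of a directory walk that
takes the file type from the directory entry (the DT_DIR information
mentioned above) instead of lstat()ing every file.  It is written in
Python rather than C/fts(3) purely for brevity; that substitution and the
function name walk_dirs are assumptions of this sketch, not anything
find or ls actually does.  Python documents os.scandir() as answering
is_dir() from the dirent type where the file system supplies it, with a
stat call as fallback when the type is unknown.  No timings are claimed.

```python
import os


def walk_dirs(root):
    """Yield root and every directory below it, without following symlinks.

    Hypothetical helper for illustration: the file type comes from the
    directory entry where the OS provides it, so regular files normally
    need no separate lstat() call.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        yield path
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Answered from the dirent type when available;
                    # falls back to a stat call only if it is unknown.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable or vanished directory: skip it and carry on.
            continue
```

Something like "for d in walk_dirs('.'): print(d)" gives roughly the
same list as "find . -type d", though in a different order.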