Date: Thu, 13 Dec 2012 20:38:02 +0100
Subject: Re: find vs ls performance for walking folders, are there any faster options?
From: Olav Grønås Gjerde
To: Bruce Evans
Cc: freebsd-fs@freebsd.org, freebsd-performance@freebsd.org

Thank you, that was a really good answer.

On Wed, Dec 12, 2012 at 3:50 PM, Bruce Evans wrote:
> On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:
>
>> I'm working on scanning filesystems to build a file search engine and
>> came across something interesting.
>>
>> I can walk through 300 000 folders in ~19.5 seconds with this command:
>> ls -Ra | grep -e "./.*:" | sed "s/://"
>>
>> With find, it surprisingly takes ~50.5 seconds:
>> find . -type d
>
> This is because 'find' with '-type' lstats all the files. It doesn't
> use DT_DIR from dirent for some reason. ls can be slowed down similarly
> using -F.
>
>> My results are based on five runs of each command to warm up the disk
>> cache.
>> I've tried this with both UFS and ZFS, and both filesystems show
>> the same speed difference.
>
> I get almost exactly the same ratio of speeds on an old version of FreeBSD.
> All the data was cached, and there were only 7 symlinks. The file system
> was mounted with -noatime, so the cache actually worked.
>
>> On a modern Linux distribution (Ubuntu 12.10 with EXT4), ls is just
>> slightly faster than find (about 15-20%).
>
> Apparently lstat() is relatively much slower in FreeBSD. It only takes
> 5 usec here, but that is a lot for converting cached data (getpid()
> takes 0.2 usec). A file system mounted with -atime might be much
> slower, for writing directory timestamps (the sync of the timestamps
> is delayed, but it is a very heavyweight operation).
>
>> Is there a faster way to walk folders on FreeBSD? Are there some
>> options (sysctl) I could tune to improve the performance?
>
> Nothing much faster than find without -type. Whatever fts(3) gives.
>
> Bruce
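
For the scanner itself, the point about DT_DIR can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not what find or ls do internally. The helper name list_dirs is made up for this example, and it assumes a Python 3.6+ where os.scandir() can usually answer is_dir(follow_symlinks=False) from the d_type field of the directory entry, falling back to a stat call only when the file system does not report a type.

```python
import os


def list_dirs(root="."):
    """Yield root and every directory below it, without following symlinks.

    list_dirs is a hypothetical helper name, not part of any library.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        yield path
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # is_dir() normally reads the type cached from the
                    # directory entry (d_type), so no per-file lstat()
                    # is needed unless the type is unknown.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Unreadable or vanished directory: skip it and keep walking.
            continue


if __name__ == "__main__":
    for d in list_dirs("."):
        print(d)
```

The traversal order comes from the explicit stack, so it will not match find's order. Sorting both outputs (for example against `find . -type d | sort`) is a simple way to check that the same directories are reported before timing anything.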