From owner-freebsd-performance@FreeBSD.ORG Wed Dec 12 14:20:01 2012
From: Olav Grønås Gjerde
To: freebsd-performance@freebsd.org, freebsd-fs@freebsd.org
Date: Wed, 12 Dec 2012 15:19:58 +0100
Subject: find vs ls performance for walking folders, are there any faster options?

I'm working on scanning filesystems to build a file search engine and
came across something interesting.

I can walk through 300 000 folders in ~19.5 seconds with this command:
ls -Ra | grep -e "./.*:" | sed "s/://"

With find, it surprisingly takes ~50.5 seconds:
find . -type d

My results are based on five runs of each command to warm up the disk cache.
I've tried this with both UFS and ZFS, and both filesystems show
the same speed difference.

On a modern Linux distribution (Ubuntu 12.10 with EXT4), ls is just
slightly faster than find (about 15-20%).

Is there a faster way to walk folders on FreeBSD? Are there some
options (sysctl) I could tune to improve the performance?

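For readers who want to post-process the listing without grep and sed, here is a minimal Python sketch of what the pipeline above extracts from `ls -Ra` output. It assumes that directory headers are the lines that start with "./" and end with ":", which is how the quoted grep pattern is being used; the function name `dirs_from_ls_output` and the sample listing are made up for illustration. Unlike `sed "s/://"`, which deletes the first colon on the line, this strips only the trailing one, so a directory name containing a colon survives intact. File names containing newlines would still confuse it.

```python
def dirs_from_ls_output(text):
    """Return the directory paths found in `ls -R` style output.

    Assumption: a directory header is a line that starts with "./"
    and ends with ":". Plain entries are bare names, so they never
    start with "./".
    """
    dirs = []
    for line in text.splitlines():
        if line.startswith("./") and line.endswith(":"):
            # Drop only the trailing colon that ls appends to the header.
            dirs.append(line[:-1])
    return dirs


# Hypothetical listing, shaped like `ls -Ra` output, for illustration only.
sample = """\
.
..
notes.txt
src

./src:
.
..
a:b
main.c

./src/a:b:
.
..
"""

if __name__ == "__main__":
    for path in dirs_from_ls_output(sample):
        print(path)
```

On the sample listing this prints `./src` and `./src/a:b`, while the entry `a:b` inside `./src` is ignored because it is not a header line.
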
From owner-freebsd-performance@FreeBSD.ORG Wed Dec 12 14:51:01 2012
From: Bruce Evans
To: Olav Grønås Gjerde
Cc: freebsd-fs@freebsd.org, freebsd-performance@freebsd.org
Date: Thu, 13 Dec 2012 01:50:52 +1100 (EST)
Subject: Re: find vs ls performance for walking folders, are there any faster options?
Message-ID: <20121213012632.M1201@besplex.bde.org>

On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:

> I'm working on scanning filesystems to build a file search engine and
> came across something interesting.
>
> I can walk through 300 000 folders in ~19.5 seconds with this command:
> ls -Ra | grep -e "./.*:" | sed "s/://"
>
> With find, it surprisingly takes ~50.5 seconds:
> find . -type d

This is because 'find' with '-type' lstats all the files. It doesn't
use DT_DIR from dirent for some reason. ls can be slowed down similarly
using -F.

> My results are based on five runs of each command to warm up the disk cache.
> I've tried this with both UFS and ZFS, and both filesystems show
> the same speed difference.

I get almost exactly the same ratio of speeds on an old version of FreeBSD.
All the data was cached, and there were only 7 symlinks. The file system
was mounted with -noatime, so the cache actually worked.

> On a modern Linux distribution (Ubuntu 12.10 with EXT4), ls is just
> slightly faster than find (about 15-20%).

Apparently lstat() is relatively much slower in FreeBSD. It only takes
5 usec here, but that is a lot for converting cached data (getpid()
takes 0.2 usec). A file system mounted with -atime might be much
slower, for writing directory timestamps (the sync of the timestamps
is delayed, but it is a very heavyweight operation).

> Is there a faster way to walk folders on FreeBSD? Are there some
> options (sysctl) I could tune to improve the performance?

Nothing much faster than find without -type. Whatever fts(3) gives.

Bruce

From owner-freebsd-performance@FreeBSD.ORG Thu Dec 13 14:18:43 2012
From: Miroslav Lachman <000.fbsd@quip.cz>
To: freebsd-performance@freebsd.org
Date: Thu, 13 Dec 2012 15:18:35 +0100
Subject: Phoronix comparison of Linux & FreeBSD kernel on Debian
Message-ID: <50C9E3BB.8060105@quip.cz>

Debian Linux vs. Debian kFreeBSD With Squeeze & Wheezy
http://www.phoronix.com/scan.php?page=article&item=debian_wheezy_bsd

They are comparing 4 instances of Debian. The kFreeBSD versions are running
with 8.1 and 9.0 kernels.
There are some improvements and some regressions too.

Miroslav Lachman

From owner-freebsd-performance@FreeBSD.ORG Thu Dec 13 19:38:05 2012
From: Olav Grønås Gjerde
To: Bruce Evans
Cc: freebsd-fs@freebsd.org, freebsd-performance@freebsd.org
Date: Thu, 13 Dec 2012 20:38:02 +0100
Subject: Re: find vs ls performance for walking folders, are there any faster options?
In-Reply-To: <20121213012632.M1201@besplex.bde.org>
References: <20121213012632.M1201@besplex.bde.org>

Thank you, that was a really good answer.

On Wed, Dec 12, 2012 at 3:50 PM, Bruce Evans wrote:
> On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:
>
>> I'm working on scanning filesystems to build a file search engine and
>> came across something interesting.
>>
>> I can walk through 300 000 folders in ~19.5 seconds with this command:
>> ls -Ra | grep -e "./.*:" | sed "s/://"
>>
>> With find, it surprisingly takes ~50.5 seconds:
>> find . -type d
>
> This is because 'find' with '-type' lstats all the files. It doesn't
> use DT_DIR from dirent for some reason. ls can be slowed down similarly
> using -F.
>
>> My results are based on five runs of each command to warm up the disk
>> cache.
>> I've tried this with both UFS and ZFS, and both filesystems show
>> the same speed difference.
>
> I get almost exactly the same ratio of speeds on an old version of FreeBSD.
> All the data was cached, and there were only 7 symlinks. The file system
> was mounted with -noatime, so the cache actually worked.
>
>> On a modern Linux distribution (Ubuntu 12.10 with EXT4), ls is just
>> slightly faster than find (about 15-20%).
>
> Apparently lstat() is relatively much slower in FreeBSD. It only takes
> 5 usec here, but that is a lot for converting cached data (getpid()
> takes 0.2 usec). A file system mounted with -atime might be much
> slower, for writing directory timestamps (the sync of the timestamps
> is delayed, but it is a very heavyweight operation).
>
>> Is there a faster way to walk folders on FreeBSD? Are there some
>> options (sysctl) I could tune to improve the performance?
>
> Nothing much faster than find without -type. Whatever fts(3) gives.
>
> Bruce
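
The point in the quoted reply about lstat() versus the type field of a directory entry can be sketched in Python. This is an illustration only, not how find or ls are implemented, and it is not a benchmark. It relies on os.scandir(), whose entries answer is_dir() from the type reported with the directory entry when the filesystem supplies one, and fall back to a stat call only when it does not. The function name `walk_dirs` is made up for this example.

```python
import os


def walk_dirs(root):
    """Yield root and every directory below it, without following symlinks."""
    stack = [root]
    while stack:
        current = stack.pop()
        yield current
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    # Answered from the directory entry's type field when
                    # the filesystem provides it, so regular files normally
                    # cost no extra lstat() call.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Skip directories that cannot be read (permissions, races).
            continue


if __name__ == "__main__":
    for path in walk_dirs("."):
        print(path)
```

Run from the top of a tree, this prints roughly the same set of paths as `find . -type d`, though not necessarily in the same order. Whether it is actually faster on a given FreeBSD system would need to be measured.
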