From owner-freebsd-performance@FreeBSD.ORG Sun Feb 19 04:39:48 2006
Message-ID: <43F7F691.6030303@centtech.com>
Date: Sat, 18 Feb 2006 22:39:45 -0600
From: Eric Anderson
To: freebsd-performance@freebsd.org
References: <20060219040656.GT2756@rabbit> <20060219041752.GU2756@rabbit> <43F7F58C.3020908@centtech.com>
In-Reply-To: <43F7F58C.3020908@centtech.com>
Subject: Re: stat speed
List-Id: Performance/tuning

Eric Anderson wrote:
> Mark Bucciarelli wrote:
>> On Sat, Feb 18, 2006 at 11:06:57PM -0500, Mark Bucciarelli wrote:
>>
>>> I'm curious how fast stat is.
>>>
>>> I generated a list of 200,000 file names
>>>
>>>   # find / | head -200000 > files.statspeed
>>>
>>> then ran a million iterations of randomly picking a file name and
>>> stating it (see attached program).
>>>
>>
>> Hmmm, 200,000 files, 1,000,000 iterations.  On avg, each file hit
>> five times.  Uhh, that's not a good way to avoid caching.  Doh.
>>
>> Wow, caching is pretty amazing.  I just reran the program, this time
>> using 500,000 file paths and only stat'ing 10,000 of them.
>>
>> The first run was 99,059/second, the second was 188,239.
>>
>> So I guess 100,000/second is about right on my system w/o cache.
>>
>
> I'm also wondering if by using find, and getting a list of
> files/directories in the default order, you might be seeing some
> results that aren't really completely random.  What I mean is, your
> find is traversing the tree, probably digging through directories
> based on inode number or last modified time (can't recall which), but
> either way, it's possible your list consisted of clumps of files/dirs
> in the same cylinder groups, especially since you grabbed the first
> 500k files, instead of picking a random file from the entire list of
> files on the filesystem, and building a list from that random
> plucking.  This is all speculative, but if you had lots of files in a
> directory, those could be clumped in a few cylinder groups and
> therefore you might see higher numbers than sampling from the entire
> disk (since the speed is probably mostly dominated by disk seeks I
> believe).
>
> What exactly are you trying to determine?

You are also timing the rand() function.  I suggest randomizing the
list first, then stating the files in the randomized list.

Eric

-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------
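
The shuffle-first suggestion could look roughly like the sketch below.
It is written in Python and is not the attached program, which is not
reproduced in this message. The function names load_paths and
time_stats are made up for illustration, and files.statspeed is only
assumed to hold one path per line, as the find command above would
produce. The random selection happens before the clock starts, so only
the stat calls are timed, and sampling without replacement means no
path is stat'ed twice within a run.

```python
import os
import random
import time


def load_paths(list_file):
    """Read one path per line (hypothetical helper), skipping blank lines."""
    with open(list_file, encoding="utf-8", errors="surrogateescape") as fh:
        return [line.rstrip("\n") for line in fh if line.strip()]


def time_stats(paths, sample_size, seed=None):
    """Pick a random sample first, then time only the stat() calls.

    Returns (successful_stats, elapsed_seconds).
    """
    rng = random.Random(seed)
    # Sampling without replacement: each path is visited at most once,
    # so repeat hits on an already-cached entry cannot inflate the rate.
    sample = rng.sample(paths, min(sample_size, len(paths)))

    ok = 0
    start = time.perf_counter()  # clock starts after the random work is done
    for path in sample:
        try:
            os.stat(path)
            ok += 1
        except OSError:
            # The file vanished or is unreadable; skip it.
            pass
    elapsed = time.perf_counter() - start
    return ok, elapsed
```

Calling time_stats(load_paths("files.statspeed"), 10000) and dividing
the first result by the second gives a stats/second figure. Entries
cached by an earlier run will still raise the rate on a second run;
this sketch only removes the random-number cost and the repeat hits
from the measurement.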