Date: Sat, 18 Feb 2006 22:39:45 -0600
From: Eric Anderson <anderson@centtech.com>
To: freebsd-performance@freebsd.org
Subject: Re: stat speed
Message-ID: <43F7F691.6030303@centtech.com>
In-Reply-To: <43F7F58C.3020908@centtech.com>
References: <20060219040656.GT2756@rabbit> <20060219041752.GU2756@rabbit> <43F7F58C.3020908@centtech.com>

Eric Anderson wrote:
> Mark Bucciarelli wrote:
>> On Sat, Feb 18, 2006 at 11:06:57PM -0500, Mark Bucciarelli wrote:
>>
>>> I'm curious how fast stat is.
>>>
>>> I generated a list of 200,000 file names
>>>
>>> # find / | head -200000 > files.statspeed
>>>
>>> then ran a million iterations of randomly picking a file name and
>>> stating it (see attached program).
>>>
>>
>> Hmmm, 200,000 files 1,000,000 iterations. On avg, each file hit
>> five times. Uhh, that's not a good way to avoid caching. Doh.
>>
>> Wow, caching is pretty amazing. I just reran the program, this time
>> using 500,000 file paths and only stat'ing 10,000 of them.
>>
>> The first run was 99,059/second, the second was 188,239.
>>
>> So I guess 100,000/second is about right on my system w/o cache.
>>
>
> I'm also wondering if by using find, and getting a list of
> files/directories in the default order, you might be seeing some
> results that aren't really completely random. What I mean is, your
> find is traversing the tree, probably digging through directories
> based on inode number or last modified time (can't recall which), but
> either way, it's possible your list consisted of clumps of files/dirs
> in the same cylinder groups, specially since you grabbed the first
> 500k files, instead of picking a random file from the entire list of
> files on the filesystem, and building a list from that random
> plucking.. This is all speculative, but if you had lots of files in a
> directory, those could be clumped in a few cylinder groups and
> therefore you might see higher numbers than sampling from the entire
> disk (since the speed is probably mostly dominated by disk seeks I
> believe).
>
> What exactly are you trying to determine?

You are also timing the rand() function. I suggest randomizing the list
first, then stating the files in the randomized list.

Eric

-- 
------------------------------------------------------------------------
Eric Anderson Sr. Systems Administrator Centaur Technology
Anything that works is better than anything that doesn't.
------------------------------------------------------------------------
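
The suggestion above (randomize the list first, then stat the files in the randomized list) could look roughly like the sketch below. It is not the program attached to the original post, which is not shown in this thread; the function names `load_paths` and `time_stats` are made up for illustration, and the list file is assumed to hold one path per line, as the `find` command quoted above would produce. The sample is drawn without replacement before the clock starts, so random-number generation is not timed and no path is stat'ed twice within a run. Interpreter overhead is still part of the measurement, so absolute numbers will be lower than from an equivalent C program.

```python
import os
import random
import time


def load_paths(list_file):
    """Read one path per line from a file such as files.statspeed."""
    with open(list_file, "r", errors="surrogateescape") as fh:
        return [line.rstrip("\n") for line in fh if line.strip()]


def time_stats(paths, sample_size, seed=None):
    """Pick a random sample first, then time only the stat() calls.

    Returns (successful_stats, attempted_stats, elapsed_seconds).
    """
    rng = random.Random(seed)
    # Sampling happens before the timer starts, so the cost of the
    # random number generator is excluded from the measurement.
    # random.sample draws without replacement: each path appears at
    # most once, so a run does not re-stat a file it just cached.
    sample = rng.sample(paths, min(sample_size, len(paths)))

    ok = 0
    start = time.perf_counter()
    for path in sample:
        try:
            os.stat(path)
            ok += 1
        except OSError:
            # The file may have vanished or be unreadable since the
            # list was generated; count it as attempted but not ok.
            pass
    elapsed = time.perf_counter() - start
    return ok, len(sample), elapsed
```

For example, `ok, n, secs = time_stats(load_paths("files.statspeed"), 10000)` followed by `print(n / secs)` would report attempted stats per second. Drawing the list itself from the whole filesystem rather than from the first N lines of `find` output is a separate concern that this sketch does not address.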