Date: Sun, 26 Dec 2010 19:46:53 +0200
From: Gleb Kurtsou <gleb.kurtsou@gmail.com>
To: Ivan Voras <ivoras@freebsd.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: [rfc] Replacing FNV and hash32 with Paul Hsieh's SuperFastHash
Message-ID: <20101226174653.GA45598@tops>
In-Reply-To: <AANLkTinBJnWfTijL3LSfa8MQV%2BbGPG67euDgT1uG56rD@mail.gmail.com>
References: <20101223224619.GA21984@tops> <if5gmr$a5r$1@dough.gmane.org> <20101226132431.GA16490@tops> <AANLkTinBJnWfTijL3LSfa8MQV%2BbGPG67euDgT1uG56rD@mail.gmail.com>

On (26/12/2010 15:20), Ivan Voras wrote:
> On 26 December 2010 14:24, Gleb Kurtsou <gleb.kurtsou@gmail.com> wrote:
> > On (25/12/2010 20:29), Ivan Voras wrote:
> >> On 23.12.2010 23:46, Gleb Kurtsou wrote:
> >>
> >> > For testing I've used dbench with 16 processes on a 1 Gb
> >> > swap-backed md device, UFS + SoftUpdates:
> >> > Old hash (Mb/s): 599.94 600.096 599.536
> >> > SFH hash (Mb/s): 612.439 612.341 609.673
> >> >
> >> > It's just ~1% improvement, but dbench is not a VFS metadata intensive
> >> > benchmark. Subjectively it feels faster accessing maildir mailboxes
> >> > with ~10.000 messages : )
> >>
> >> Try blogbench if you need metadata-intensive operations, or even fsx.
> >
> > blogbench should be good, but I've always had a hard time interpreting
> > its results. Besides, results tend to vary a lot, and there is no way
> > to set a seed value like in fsx, so that I could run exactly the same
> > test in different configurations.
>
> I think the exact sequence of blogbench operations depends on duration
> of previous operations (it's multithreaded) so from that angle you are

Why should it? Operation order in dbench or fsx doesn't depend on the
duration of previous operations.

> right - you can't do a repeatable run except in the trivial cases. On
> the other hand, it uses rand() without seeding it with
> srand()/sranddev() so this part is actually very repeatable :)

I once tried to make its behaviour more predictable. I can't find the
patch and can't recall any specifics, but there were architectural
issues. You are right, setting the seed and calling rand() should give
stable results; that's what I was trying to achieve.

The other way to work around such a "limitation" is to run a
sufficiently large number of tests. Which requires patience :)

> > fsx is a different beast, it reads/writes/truncates at random offsets -
> > great tool for debugging mmap/truncate issues. Patch doesn't improve it
> > in any way.
>
> It depends on what metadata operations you require - blogbench will
> create, find and write files (if we ignore atime); fsx will create a
> decent amount of traffic with file size and mtime changes. In your
> case you'll probably need to run it on a memory file system or tmpfs
> due to sensitivity to disk IO latencies (if your improvement is on
> the order of a few percent).

I meant create/readdir/remove as metadata intensive operations --
blogbench is very good for that. fsx creates a single file.

Most people will only notice the changes in vfs_cache.c and UFS'
dirhash; that's the 600 Mb/s vs 613 Mb/s improvement I've written about
above.

I'd appreciate it if someone could benchmark if_lagg. It was using
hash32 for binary data, which could result in poor hash table usage,
which in turn could make most of the data go out on a single interface.
But there would hardly be any performance improvement, due to limited
network bandwidth. Besides, the old hash32 is faster than the new SFH.
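
For anyone who wants to look at the if_lagg concern without a full
benchmark, here is a minimal userland sketch for checking how evenly a
hash spreads binary keys over a small number of ports. Everything in it
is an assumption made for illustration: hash33_buf() and bucket_spread()
are hypothetical names, hash33_buf() is only written in the style of a
byte-at-a-time multiply-by-33 hash and is not copied from the kernel,
the initial value 5381 is assumed, and picking the port as "hash modulo
number of ports" is a simplification, not necessarily what if_lagg does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a byte-at-a-time hash: for every input
 * byte, multiply the running value by 33 and add the byte.
 */
static uint32_t
hash33_buf(const void *buf, size_t len, uint32_t hash)
{
	const unsigned char *p = buf;

	while (len--)
		hash = ((hash << 5) + hash) + *p++;
	return (hash);
}

/*
 * Hash each 32-bit key as raw bytes, count how many keys fall into
 * each of nbuckets buckets (hash modulo nbuckets, an assumption), and
 * return the size of the fullest bucket.
 */
static size_t
bucket_spread(const uint32_t *keys, size_t nkeys, size_t *counts,
    size_t nbuckets)
{
	size_t i, max = 0;

	assert(nbuckets > 0);
	for (i = 0; i < nbuckets; i++)
		counts[i] = 0;
	for (i = 0; i < nkeys; i++) {
		uint32_t h = hash33_buf(&keys[i], sizeof(keys[i]), 5381);

		counts[h % nbuckets]++;
	}
	for (i = 0; i < nbuckets; i++)
		if (counts[i] > max)
			max = counts[i];
	return (max);
}
```

Feed bucket_spread() the keys of interest (for example addresses taken
from real traffic) and the number of lagg ports. Counts close to
nkeys / nbuckets suggest an even spread; one bucket holding most of the
keys would be the "single interface" case described above. The same
harness can be pointed at another hash function to compare the two.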