Date: Sun, 6 Jun 2010 10:20:20 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: freebsd-hackers@freebsd.org
Subject: Re: sysbench / fileio - Linux vs. FreeBSD
Message-ID: <201006061720.o56HKKBu069660@apollo.backplane.com>
References: <4C09932B.6040808@wooh.hu> <201006050236.17697.bruce@cran.org.uk> <4C09FC43.8070804@wooh.hu> <4C0A7F2F.3030105@elischer.org> <4C0A816A.9040403@feral.com>
:All of these tests have been apples vs. oranges for years.
:
:The following seems to be true, though:
:
:a) FreeBSD sequential write performance in UFS has always been less than
:optimal.

If there's no read activity, sequential write performance should be maximal with UFS. The key phrase here is "no read activity". UFS's main problem, easily demonstrated by running something like blogbench --iterations=100, is that read I/O is given such a huge precedence over write I/O that it can cause the write I/O to come to a complete grinding halt once the system caches are blown out and the reads start having to go to disk.

Another big issue with filesystem benchmarks is the data footprint size of the benchmark. Many benchmarks do not have a sufficiently large data footprint and wind up simply testing how much memory the kernel is willing to give over to cache the benchmark's tests, instead of testing disk performance. Bonnie++ is a really good example of this problem.

That said, all the BSDs have stall issues with parallel read & write activity on the same file. It essentially comes down to the vnode lock held during writes, which can cause reads on the same file to stall even when those reads could be satisfied from the VM/BUF cache. Linux might appear to work better in such benchmarks because Linux essentially allows infinite write buffering, up to the point where system memory is exhausted, and the BSDs do not. Infinite write buffering might make a benchmark look good, but it creates horrible stalls and inconsistencies on production systems.

I noticed that FreeBSD's ZFS implementation issues VOP_WRITE's with a shared lock instead of an exclusive lock, thus avoiding this particular problem. It would be possible to do this with UFS too with some work to prevent file size changes from colliding during concurrent writes, or even using a separate serializer for modifying/write operations so read operations can continue to run concurrently.
blogbench is a good way to test read/write interference during the system-cache phase of blogbench's operation (that would be the first 500-800 or so blogs on a 4G system). If working properly, both read and write operations should be maximal during this phase. That is, the disk should be 100% saturated with writes while all reads are still fully satisfiable from the buffer cache / VM system, and at the same time the read rate should not suffer (not be seen to stall).

It would be interesting to see a blogbench comparison between UFS and ZFS on the same hw/disk.

						-Matt