Date: Mon, 6 Nov 2006 13:04:15 +0100 (CET)
From: Oliver Fromme <olli@lurza.secnetix.de>
To: freebsd-geom@FreeBSD.ORG, Oles Hnatkevych <don_oles@able.com.ua>
Subject: Re: geom stripe perfomance question
Message-ID: <200611061204.kA6C4FXt079703@lurza.secnetix.de>
In-Reply-To: <961295086.20061105000919@able.com.ua>

Oles Hnatkevych wrote:
 > I wonder why geom stripe works much worse than the separate disks that
 > constitute stripe.

It depends on your workload (or your benchmark).

 > I have a stripe from two disks. Disks are on separate ATA channels.
 > [...]
 > Stripesize: 262144
 > [...]
 > Now let's read one of them and stripe.
 >
 > root# dd if=/dev/ad1 of=/dev/null bs=1m count=1000
 > 1048576000 bytes transferred in 14.579483 secs (71921343 bytes/sec)
 >
 > root# dd if=/dev/stripe/bigdata of=/dev/null bs=1m count=1000
 > 1048576000 bytes transferred in 15.882796 secs (66019610 bytes/sec)
 >
 > What I would expect is doubling the speed of transfer, not
 > slowing down. Am I wrong? Or is geom_stripe inefficient?
 > I tried to do the same with gvinum/stripe - the read
 > speed was degraded too. And with gmirror depending on slice size speed
 > was degraded differently.

I wonder why people always try to use dd for benchmarking.
It's bogus. dd is not for benchmarking. It works in a
sequential way, i.e. it first reads 256 KB (your stripe
size) from the first component, then 256 KB from the
second, and so on. While it reads from one disk, the other
one is idle. So it is not surprising that you don't see a
speed increase (in fact, there's a small decrease because of
the seek time overhead when switching from one disk to the
other). [*]

The performance of a stripe should be better when you use
applications that perform parallel I/O access.

Your benchmark should be as close to your real-world app
as possible. If your real-world app is dd (or another one
that accesses big files sequentially without parallelism),
then you shouldn't use striping.

Best regards
   Oliver

PS:
[*] It could be argued that the kernel could prefetch the
next 256 KB from the other disk, so both disks are kept busy
for best throughput. The problem with that is that the
kernel doesn't know that the next 256 KB will be needed,
so it doesn't know whether it makes sense to prefetch them
or not.
dd has no way to tell the kernel about its usage pattern
(it would require an API similar to madvise(2)).

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Dienstleistungen mit Schwerpunkt FreeBSD: http://www.secnetix.de/bsd
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"It combines all the worst aspects of C and Lisp: a billion different
sublanguages in one monolithic executable. It combines the power of C
with the readability of PostScript."
        -- Jamie Zawinski, when asked: "What's wrong with perl?"
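
The access pattern described in the message can be sketched with a
small model. This is only an illustration under the assumption of
plain round-robin striping; it is not geom_stripe's actual code, and
the names component_for and chunks_for_read are made up for this
example. The stripe size and disk count are the ones quoted in the
message.

```python
STRIPE_SIZE = 262144   # bytes, the "Stripesize" quoted in the message
NUM_DISKS = 2          # two components, as in the message


def component_for(offset, stripe_size=STRIPE_SIZE, num_disks=NUM_DISKS):
    """Return (component index, offset on that component) for a byte offset.

    Assumes simple round-robin striping: stripe 0 on disk 0,
    stripe 1 on disk 1, stripe 2 on disk 0 again, and so on.
    """
    stripe_no = offset // stripe_size
    disk = stripe_no % num_disks
    disk_offset = (stripe_no // num_disks) * stripe_size + offset % stripe_size
    return disk, disk_offset


def chunks_for_read(offset, length, stripe_size=STRIPE_SIZE, num_disks=NUM_DISKS):
    """Split one read request into per-component pieces, in request order."""
    pieces = []
    end = offset + length
    while offset < end:
        disk, disk_offset = component_for(offset, stripe_size, num_disks)
        # A piece ends at the next stripe boundary or at the end of the request.
        piece_len = min(stripe_size - offset % stripe_size, end - offset)
        pieces.append((disk, disk_offset, piece_len))
        offset += piece_len
    return pieces


if __name__ == "__main__":
    # One dd-style read with bs=1m, starting at the beginning of the stripe.
    for disk, disk_offset, length in chunks_for_read(0, 1024 * 1024):
        print(f"component {disk}: offset {disk_offset:>7}, {length} bytes")
```

In this model a single 1 MB read alternates between the two
components in 256 KB pieces. If those pieces are served one after
another, as the message describes, each disk sits idle about half of
the time; several readers working on different regions at once would
be one way to keep both disks busy.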