Date: Thu, 19 Sep 1996 14:13:52 -0500 (CDT)
From: Joe Greco <jgreco@brasil.moneng.mei.com>
To: michaelv@MindBender.serv.net (Michael L. VanLoon -- HeadCandy.com)
Cc: jgreco@brasil.moneng.mei.com, henrich@crh.cl.msu.edu, freebsd-isp@freebsd.org
Subject: Re: News server...
Message-ID: <199609191913.OAA11376@brasil.moneng.mei.com>
In-Reply-To: <199609191752.KAA11272@MindBender.serv.net> from "Michael L. VanLoon -- HeadCandy.com" at Sep 19, 96 10:52:19 am
> [...]
> >I also do not have a GOOD set of tools with which to measure concurrency
> >within a news filesystem.  I have some basic programs that I run several
> >of simultaneously, but how fair are they?  Dunno.  It is certainly a
> >problem that could use a researcher.  All I am interested in is convincing
> >people that a stripe size of 8K is _foolish_.  :-)
>
> I am about halfway through writing a little benchmarking tool that
> works a lot more like news (lots of files in lots of directories,
> being read and written simultaneously).  I would be "done" by now if I
> had more time...  Is there anything specific you would like to see in
> something like this?

We would all be "done" by now if we had more time.

My ideal benchmarking tool?  Hmmmmmmmm...

If you really want to be faithful to news, it would have to have 25000
hierarchically stacked directories... and it is hard to do it the way news
does it... basically, the two big "fan outs" are at /news and /news/alt,
each of which generally holds hundreds of subdirectories...

daily-planet# ls /news | wc
     777     777    4260
daily-planet# ls /news/alt | wc
    1158    1158    9477

This is quite a thing to try to allow for... it would maybe be easiest to
generate a directory pathname from a directory number n using something
like

    char pathname[16];

    sprintf(pathname, "%d/%d/%d/%d",
        n >= 10000 ? 1 : 0,                         /* ten-thousands digit */
        (n / 1000) % 10 * 10 + (n / 100) % 10,      /* thousands and hundreds */
        (n / 10) % 10,                              /* tens digit */
        n % 10);                                    /* ones digit */

which might weight the first digits in a similar fashion...  *shrug*

It would also be beneficial to weight different directories with different
numbers of articles to be "held"... hmmm, argh, does that ever get complex!

Anyways.  My thoughts:

1) It would have to be repeatable.  Use a seeded random number generator
   (sketched below).  While I am willing to run a test long enough to
   judge "average" behavior, there is no reason I can think of not to use
   a seeded generator.

2) Multiple(!!) readers, single writer.  The single writer adds file "N+1"
   to a "random" directory, some size between 1K and 16K, at a rate of
   five per second.  Multiple readers read random articles out of random
   directories, but may have a tendency to read two or three articles out
   of the same directory.  This might be something to make a configurable
   parameter... i.e., a 50% chance that the next article comes out of the
   same directory.  This is particularly important because behavior on a
   reader machine and on a feeder machine is very different.  Actually, it
   would really be good to provide that same capability for the writer
   too.  I have seen some behavior, related to writers and pre-sorted
   article lists, that I am still trying to explain.

3) Maybe an "expire" process that walks through and removes the oldest
   article(s) in a directory (sketched below).

4) Report the minimum, average, and maximum times to perform a specific
   type of operation (sketched below).  I see four operations:

   a) article write in a different dir
   b) article write in the same dir
   c) article read from a different dir
   d) article read from the same dir

5) The number of readers should be configurable, potentially to a fairly
   large number, for worst-case saturation testing (sketched below).

Michael, I do not expect your tool to do all these things, but if it did,
it would answer a lot of "fine tuning" questions I personally have (but
have not taken the time to answer, because a tool did not exist).  I could
have written one, but I do not place THAT much value on squeezing another
10-15% out of the system.

... JG
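As a rough sketch of points 1 and 2 above (seeded, repeatable selection
plus the same-directory locality knob), something like the following would
do.  NDIRS, SAME_DIR_PCT, and the helper names are made up for
illustration, not taken from any existing tool:

    #include <stdio.h>
    #include <stdlib.h>

    #define NDIRS           25000   /* directories in the simulated spool */
    #define SAME_DIR_PCT    50      /* chance the next pick stays put */

    static int last_dir = -1;       /* directory used for the previous pick */

    /* Map directory number n onto the digit-bucketed path scheme above. */
    static void
    dir_path(int n, char *buf, size_t len)
    {
        snprintf(buf, len, "%d/%d/%d/%d", n >= 10000 ? 1 : 0,
            (n / 1000) % 10 * 10 + (n / 100) % 10,
            (n / 10) % 10, n % 10);
    }

    /* Pick the next directory, honoring the locality parameter. */
    static int
    pick_dir(void)
    {
        if (last_dir >= 0 && random() % 100 < SAME_DIR_PCT)
            return (last_dir);              /* stay in the same dir */
        last_dir = random() % NDIRS;        /* jump somewhere random */
        return (last_dir);
    }

    int
    main(int argc, char *argv[])
    {
        char path[32];
        int i;

        srandom(argc > 1 ? atoi(argv[1]) : 1);  /* seeded => repeatable */
        for (i = 0; i < 10; i++) {
            dir_path(pick_dir(), path, sizeof(path));
            printf("%s\n", path);
        }
        return (0);
    }

Run twice with the same seed, it prints the same sequence of directory
paths, which is what makes timings comparable across configurations.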
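A minimal sketch of the "expire" walker in point 3, assuming article
filenames are plain numbers (as in a traditional news spool), so the
lowest number in a directory is the oldest article:

    #include <dirent.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Unlink the lowest-numbered (oldest) article in one directory. */
    static int
    expire_oldest(const char *dir)
    {
        DIR *d;
        struct dirent *de;
        long n, oldest = -1;
        char path[1024];

        if ((d = opendir(dir)) == NULL)
            return (-1);
        while ((de = readdir(d)) != NULL) {
            n = strtol(de->d_name, NULL, 10);   /* "." and ".." give 0 */
            if (n > 0 && (oldest < 0 || n < oldest))
                oldest = n;
        }
        closedir(d);
        if (oldest < 0)
            return (-1);                        /* nothing to expire */
        snprintf(path, sizeof(path), "%s/%ld", dir, oldest);
        return (unlink(path));
    }

    int
    main(int argc, char *argv[])
    {
        if (argc != 2) {
            fprintf(stderr, "usage: expire directory\n");
            return (1);
        }
        return (expire_oldest(argv[1]) == 0 ? 0 : 1);
    }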
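For the min/average/max reporting in point 4, one running record per
operation class is enough.  This sketch times a stand-in operation with
gettimeofday(); the struct and function names are again invented:

    #include <stdio.h>
    #include <sys/time.h>
    #include <unistd.h>

    struct opstats {
        double  min, max, total;    /* seconds */
        long    count;
    };

    /* Fold one timed operation into the running statistics. */
    static void
    record(struct opstats *st, struct timeval *t0, struct timeval *t1)
    {
        double elapsed = (t1->tv_sec - t0->tv_sec) +
            (t1->tv_usec - t0->tv_usec) / 1e6;

        if (st->count == 0 || elapsed < st->min)
            st->min = elapsed;
        if (elapsed > st->max)
            st->max = elapsed;
        st->total += elapsed;
        st->count++;
    }

    int
    main(void)
    {
        struct opstats st = { 0.0, 0.0, 0.0, 0 };
        struct timeval t0, t1;
        int i;

        for (i = 0; i < 100; i++) {
            gettimeofday(&t0, NULL);
            usleep(1000);       /* stand-in for one article read/write */
            gettimeofday(&t1, NULL);
            record(&st, &t0, &t1);
        }
        printf("min %.6f  avg %.6f  max %.6f  (n=%ld)\n",
            st.min, st.total / st.count, st.max, st.count);
        return (0);
    }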
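And a sketch of point 5: fork a configurable number of reader processes,
each re-seeded deterministically so whole runs remain repeatable.
run_reader() here is a stub standing in for the read loop of the first
sketch:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Stub for the read loop of the first sketch. */
    static void
    run_reader(int id, unsigned int seed)
    {
        srandom(seed + id);     /* distinct but repeatable per reader */
        /* ... read random articles until told to stop ... */
        _exit(0);
    }

    int
    main(int argc, char *argv[])
    {
        int i, nreaders;

        nreaders = argc > 1 ? atoi(argv[1]) : 8;
        for (i = 0; i < nreaders; i++) {
            switch (fork()) {
            case -1:
                perror("fork");
                return (1);
            case 0:
                run_reader(i, 1);   /* child never returns */
            }
        }
        while (wait(NULL) > 0)      /* parent reaps all readers */
            ;
        return (0);
    }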