Date: Tue, 24 Sep 1996 11:55:23 -0500 (CDT)
From: Joe Greco <jgreco@brasil.moneng.mei.com>
To: taob@io.org
Cc: freebsd-isp@FreeBSD.ORG
Subject: Re: Thoughts on a news server cluster
Message-ID: <199609241655.LAA06476@brasil.moneng.mei.com>
In-Reply-To: <Pine.NEB.3.92.960923122044.24621R-100000@zap.io.org> from "Brian Tao" at Sep 24, 96 12:06:11 pm
> On Mon, 23 Sep 1996, Joe Greco wrote:
> >
> > You get more concurrency if you use ccd too :-)
>
>     True enough, but I didn't feel ccd was stable enough when I first
> built our news server (late last year).

I've been using it for about that long without many problems... but it
was certainly rough around the edges at first.

> > If nothing else, you can get a "SCSI-SCSI" translator (made by Mylex
> > and others) where you just provide a fast/wide SCSI adaptor (2940,
> > etc) and let the black box handle the RAID aspects.
>
>     Good news... Open Storage says they will have a 5x4GB CRD-5300
> (might be a bit off on the model number) with 64MB cache available for
> me in the next couple of days.  The PPro systems are arriving this
> afternoon, and I'm going to order a bunch of 2GB drives in a rackmount
> chassis for next week.  That will give me one system with a single F/W
> drive, a ccd of 2GB drives, a Streamlogic hardware RAID and a CMD
> hardware RAID for benchmark comparisons.  The bits will be flying.  ;-)

Ahhh nice :-)

> > Support is probably going to appear for several host adapter RAID
> > solutions in the not too distant future, if I believe what people are
> > telling me :-)
>
>     Anything happening with the effort to pool some money together to
> pay a programmer to accelerate his port of the DPT drivers?  I *might*
> be able to convince the company to toss in some money towards such an
> effort.

I had heard a few words from various people, but IIRC somebody already
"almost" has a DPT driver sitting on a back burner.  Rod Grimes might
have said something about looking at this - but you will have to ask
him.

> > You do NOT want to NFS mount!!!  I have done it.  If you have the I/O
> > bandwidth and CPU (but not the RAM) to spare on a machine, it may be
> > a worthwhile option... but the tax on the host is high.  And you take
> > a major reliability hit if that host goes down.
>
>     I'm trying to do a simple sort of cost-benefit analysis.  Two F/W
> controllers and level 5 RAID with 25GB of usable capacity costs in
> the $25000 range.  Per machine.  For that kind of money, I'm
> definitely willing to give NFS-mounted reader servers a try.

J****!  Let me build this in my mind quickly...

Pentium 133 with ASUS P/E-XXXX????? MB      $  800
3 x NCR 810, 1 x SMC EtherPower 10/100      $  320
192MB RAM                                   $ 1200
6 x ST32550N                                $ 4100
4 x ST31055N                                $ 1200
2 x ST15150N                                $ 1850
Ext enclosures (3)                          $  660
                                            ------
                                            $10130

That gives you 24GB usable _local_ disk capacity and additional I/O
bandwidth on top of it... and you can build three with your $25000 plus
some change, considering that you can get quantity pricing on a purchase
of so many drives.

And you get _complete_ redundancy rather than only disk subsystem
redundancy.  That is the part that gets me excited.

> > It gives me 9 days retention on most stuff, 12 on alt, 2 on
> > alt.binaries.  It supports 150 users and _flies_, and even at 200 the
> > performance is "acceptable" (but starts to degrade.. pretty much
> > simultaneously RAM and I/O bandwidth start to dry up).
>
>     The only performance problem I'm seeing is long delays or timeouts
> when attempting to open an NNRP session.  Once I'm in, the server is
> nice and fast.  I haven't tried anything special with starting
> in.nnrpd's out of inetd and running innd on a different port, etc.  It
> seems to be related to the number of incoming innxmit connections.

Yes.  I deal with it by not launching nnrp's out of innd.
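The bare-bones version of the idea looks something like this - a rough
sketch, not my real code; the port, the path to in.nnrpd and the
per-host limit are all made up, so adjust to taste:

#!/usr/bin/perl
#
# Sketch only: accept NNTP connections ourselves, refuse a host that
# already has too many sessions open, otherwise fork and hand the
# socket to in.nnrpd.  Everything configurable here is a guess.

use Socket;
use POSIX ":sys_wait_h";

$PORT       = 119;
$NNRPD      = "/usr/news/etc/in.nnrpd";        # path is a guess
$MAXPERHOST = 4;

sub reap {                                     # count down as children exit
    while (($pid = waitpid(-1, WNOHANG)) > 0) {
        $count{$owner{$pid}}-- if defined $owner{$pid};
        delete $owner{$pid};
    }
}
$SIG{CHLD} = \&reap;

socket(LISTEN, PF_INET, SOCK_STREAM, getprotobyname("tcp")) || die "socket: $!";
setsockopt(LISTEN, SOL_SOCKET, SO_REUSEADDR, pack("l", 1));
bind(LISTEN, sockaddr_in($PORT, INADDR_ANY)) || die "bind: $!";
listen(LISTEN, SOMAXCONN)                    || die "listen: $!";

for (;;) {
    next unless $paddr = accept(CLIENT, LISTEN);   # interrupted by SIGCHLD
    ($cport, $iaddr) = sockaddr_in($paddr);
    $host = inet_ntoa($iaddr);

    if ($count{$host} >= $MAXPERHOST) {
        print CLIENT "400 too many connections from your host\r\n";
        close(CLIENT);
        next;
    }

    $pid = fork();
    if (!defined $pid) { close(CLIENT); next; }    # fork failed, punt
    if ($pid == 0) {                               # child: become the reader
        open(STDIN,  "<&CLIENT");
        open(STDOUT, ">&CLIENT");
        exec $NNRPD;
        exit 1;
    }
    $count{$host}++;                               # parent: bookkeeping
    $owner{$pid} = $host;
    close(CLIENT);
}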
I have something (local hackery) called "connectd" which is like a nnrp
inetd, but has additional intelligence and allows me to limit the number
of simultaneous connections or the respawn rate from a particular host.
We have wankers around here who like running crap like NewsBin95.

You also have windows of unavailability when the server is running
news.daily and doing a renumber, etc. etc...  spawning out of innd is
not ideal.

> > PPro200?  Heavy iron for news... what exactly are you doing with all
> > that horsepower... :-)
>
>     They were roughly the same price as Pentium 200's (a couple
> hundred dollars difference).  Maybe I'll start playing with on-the-fly
> compression of news articles.  ;-)

Why not compute a few prime numbers too.  ;-)

> > That is one way to handle it, but I find that running XREPLIC off of
> > the feeds system is perfectly acceptable... if I was going to have a
> > separate "reader" fan-out machine I would probably STILL run it as an
> > XREPLIC slave from the feeder machine... convenience.
>
>     I don't want to "lock" myself into using XREPLIC though.  If the
> main feeder blows up and I have to newfs the spool, it'll take extra
> work to resync those article numbers.  If I just treat the feeder and

Why?  Grab the active off of a slave - if you are really anal, grab the
active off all the slaves and write a little perl script to find the max
for each group, just in case the slaves were a tad out of sync (a rough
sketch of that sort of script is further down).  That is a bit of work,
I agree, but not hard.

> the primary reader machine as entirely autonomous servers, something
> that goes wrong with one is less likely to affect the other.  Also,
> isn't slave mode required for XREPLIC?

Yes.

> If the feeder server is
> unavailable, none of the reader machines will be able to post.

A qualified "Yes."  You have the same problem no matter what you do,
since INN has a synchronous posting paradigm that in my opinion bites
the big one.  I got exasperated and did something different.

I developed a smart spooling system to deal with it.  Now people can
"post" even if the master and all the other slaves are dead.  At the
same time I took the opportunity to add a comprehensive posting
accounting system that records whatever the hell gets posted.  It's
been useful several times already...

% cd /var/log/news/logposts/posts/idiot.user@execpc.com/
% grep "^Message-ID: " * | awk '{print $2}' > /tmp/cancelme
% spamcancel /tmp/cancelme

:-)

MUCH easier than in the old days... digging through logs, etc.

> I've
> not played with XREPLIC before, so my understanding may be off.

XREPLIC is a form of mildly tying your hands.  On the other hand, it
keeps your machines in sync!  Which is what an ISP needs.

> > I don't know, I would think that I would start seeing some I/O
> > contention with just one machine..
>
>     I don't think we're going to hit 1000 simultaneous readers at this
> POP for a while yet.  It will be a gradual curve up, so any
> anticipated I/O bottlenecks can be headed off before they become a
> problem.  Do we have any kernel optimizations yet for PPro memory-
> intensive operations?

Dunno.

> > And I have not seen any basis for supporting that many readers on a
> > single machine.. how big is your active file?  What does "top"'s
> > output look like on one of your readers?  Enquiring minds want to
> > know :-)
>
>     It's a pretty small active file, just under 9000 groups (407187
> bytes).  'top' looks like this:

Ahhh, that's why.  I have 25000+++ groups and an active file well over
1MB in size.
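Back to the XREPLIC resync point above: the "little perl script" could
be as dumb as the sketch below.  It assumes the usual "group himark
lomark flags" active format and that you have already copied each
slave's active file somewhere local; none of this is tested code.

#!/usr/bin/perl
#
# Sketch only: merge the active files grabbed off the slaves, keeping
# the highest article number (and lowest low-water mark) seen for each
# group.  Usage: mergeactive active.slave1 active.slave2 ... > active

die "usage: $0 active.slave1 [active.slave2 ...]\n" unless @ARGV;

while (<>) {
    chomp;
    ($group, $hi, $lo, $flags) = split(' ', $_, 4);
    next unless defined $flags;                # skip anything malformed

    $himark{$group} = $hi
        if !defined $himark{$group} || $hi > $himark{$group};
    $lomark{$group} = $lo
        if !defined $lomark{$group} || $lo < $lomark{$group};
    $flag{$group} = $flags unless defined $flag{$group};
}

foreach $group (sort keys %himark) {
    printf "%s %010d %010d %s\n",
        $group, $himark{$group}, $lomark{$group}, $flag{$group};
}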
> load averages:  0.36,  0.42,  0.41                             11:52:58
> 109 processes: 1 running, 118 sleeping
> Cpu states:  2.7% user,  1.5% nice, 14.2% system,  2.3% interrupt, 79.2% idle
> Mem: 82M Active, 6152K Inact, 20M Wired, 19M Cache, 7785K Buf, 176K Free
> Swap: 262M Total, 8336K Used, 254M Free, 3% Inuse
>
>   PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
> 27230 news      -6    0   24M    24M biowai  95:05 13.08% 11.02% innd.nodebug
> 27238 root      29    0  352K   808K RUN      0:00  3.15%  0.57% top
> 26658 news       2    4  316K   708K select   0:01  0.38%  0.38% in.nnrpd
> 25061 news       2    0  220K   352K sbwait   0:22  0.31%  0.31% innxmit
> 27200 news       2    4  292K   868K sbwait   0:00  0.23%  0.23% in.nnrpd
> 27235 news       2    0  292K   992K select   0:00  0.38%  0.19% in.nnrpd
> 27233 news      -6    0  152K   484K piperd   0:00  0.20%  0.15% overchan
> 27150 news       2    4  288K   728K sbwait   0:00  0.08%  0.08% in.nnrpd
> 27190 news       2    4  284K   692K sbwait   0:00  0.08%  0.08% in.nnrpd
> 26803 news       2    4  292K   732K sbwait   0:00  0.04%  0.04% in.nnrpd
> 26480 news       2    0  448K   548K select   0:04  0.04%  0.04% innxmit
> 23024 news       2    0  220K   308K sbwait   0:31  0.04%  0.04% innxmit
> [...]

Looks more like

load averages:  1.48,  0.89,  0.60                               11:40:14
96 processes:  2 running, 94 sleeping
Cpu states:  1.8% user, 15.5% nice, 24.1% system,  2.5% interrupt, 56.1% idle
Mem: 92M Active, 396K Inact, 19M Wired, 71M Cache, 5304K Buf, 260K Free
Swap: 369M Total, 40M Used, 329M Free, 11% Inuse

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
24372 news      74    4   47M    21M RUN    727:22 27.89% 27.89% innd
 6228 news       2    0 1048K  2368K select   0:02  1.64%  1.64% in.nnrpd
 6805 news       2    0 1080K  2384K select   0:00  2.02%  1.07% in.nnrpd
 6801 news       2    0 1032K  2340K select   0:01  1.14%  0.99% in.nnrpd
 6812 news       2    0 1028K  2324K select   0:00  1.89%  0.95% in.nnrpd
 6633 news       2    0 1020K  2332K netio    0:03  0.92%  0.92% in.nnrpd

here, see the difference in size... :-(

> Assuming 32MB for kernel and OS stuff, 32MB for innd, 150MB for 500
> readers and no feeds, that still leaves ~40MB for disk cache and other
> processes (like expires) on a 256MB machine.

Must be nice to have a small active file.  ;-)

... JG