Date:        Tue, 14 Nov 1995 12:45:40 -0700 (MST)
From:        Terry Lambert <terry@lambert.org>
To:          jgreco@brasil.moneng.mei.com (Joe Greco)
Cc:          terry@lambert.org, luigi@labinfo.iet.unipi.it, hackers@FreeBSD.org
Subject:     Re: Multiple http servers - howto ?
Message-ID:  <199511141945.MAA20656@phaeton.artisoft.com>
In-Reply-To: <199511141851.MAA29115@brasil.moneng.mei.com> from "Joe Greco" at Nov 14, 95 12:51:44 pm
> > You're still doing round-robin address assignment, which expects that
> > clients will behave statistically identical to one another.  And they
> > won't, even if the TTL is honored.
>
> Somebody else who doesn't really understand that when N is a random function
> that may not be random for small values of x, still is random enough for
> large values of x....  :-)
>
> The TTL hack simply reduces the definition of "may not be random for small
> values of x".

For P(N1(X) == N2(X)) << 1.

I quantified this as "statistically identical client behaviour".  If the
distribution of clients by duration is not uniform, then the effective
"randomness" is reduced.

> If you are trying to tell me that if I have 4 addresses and 5,000 sites do
> a DNS lookup on me, I will state that at least 1,000 sites will get assigned
> to each address.  That does not imply that the loading will be identical or
> totally equal, but it should be reasonably distributed.  I may not care if
> the distribution is 1000/1000/1000/2000, because it is still better than
> 5000 against a single box - and I would bet that it would be more evenly
> distributed than I am suggesting, most of the time.

The loading due to each client will be non-identical.  For P(Nn(X)) over
'n' hosts, the probability of divergence is given by:

	n * session duration / sample interval

I guess if you don't check your per-server connection load too often
relative to the TTL, it will be better balanced on average.  8-).

This is the problem with connecting to machines instead of to services.

In any case, the point is that distribution of server load by DNS is
non-optimal, since it assumes all clients are equal in terms of duration
and/or server load factor, etc.

The actual tendency is for a loaded server to service requests more
slowly, and so become more loaded if a truly round-robin assignment
scheme is used.

Using a machine-connection-oriented assignment mechanism (like picking
your DNS response per potential client), you want to assign clients
based on the inverse of relative server load, to optimize per-client
response times.

Maybe you could wire up a special DNS server that knew the WWW server
loads per some sample reporting interval?

This would still be inferior to a dynamic load balancing mechanism,
like service connection instead of machine connection, but it wouldn't
require protocol changes to implement.

You'd end up with no load increase on an over-used server, though its
load would not fall off proportionally to the unloading of the other
servers in the same group of assignments, like it would with service
connection or some meta-protocol for client handoff between identical
service providers.

The other thing that isn't taken into account is topology management
for geographically separate servers: there is no way to get the least
loaded server closest to your location, to reduce overall network
congestion.

Actually, someone could probably get a nice little paper out of
building an inverse-load-preferential DNS (and the load reporting
daemons) if they wanted to.  8-).


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
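
Terry's closing suggestion, a DNS responder that weights its answers by
the inverse of each server's reported load, is concrete enough to sketch.
Below is a minimal Python illustration (nothing from the thread itself;
the class and function names and the example addresses are all invented
for this sketch), assuming each WWW server pushes a scalar load figure
to the name server once per sample reporting interval:

    # Minimal sketch of an inverse-load-preferential address picker.
    # All names (InverseLoadPicker, report_load, pick_address) and the
    # addresses are hypothetical, invented for this illustration.
    import random

    class InverseLoadPicker:
        def __init__(self, addresses):
            # Start every server at a nominal load of 1.0 so servers
            # that have not reported yet remain eligible for selection.
            self.loads = {addr: 1.0 for addr in addresses}

        def report_load(self, addr, load):
            # Called once per sample interval by a load-reporting
            # daemon on each WWW server; clamp to avoid dividing by 0.
            self.loads[addr] = max(load, 0.001)

        def pick_address(self):
            # Hand out addresses with probability proportional to
            # 1/load: a server at four times the load of its peers is
            # chosen a quarter as often as any of them, instead of
            # getting an equal round-robin share.
            addrs = list(self.loads)
            weights = [1.0 / self.loads[a] for a in addrs]
            return random.choices(addrs, weights=weights, k=1)[0]

    picker = InverseLoadPicker(
        ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
    picker.report_load("10.0.0.4", 4.0)  # one box four times as loaded
    tally = {}
    for _ in range(10000):
        a = picker.pick_address()
        tally[a] = tally.get(a, 0) + 1
    # 10.0.0.4 should receive roughly a quarter of any other box's share
    print(tally)

A real responder built this way would presumably also keep the TTL on
its answers shorter than the sample reporting interval, per the TTL
discussion above, so that cached assignments expire before the load
picture goes stale.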