Date:      Sat, 6 Mar 1999 16:37:51 +0100 (MET)
From:      Luigi Rizzo <luigi@labinfo.iet.unipi.it>
To:        asmodai@wxs.nl (Jeroen Ruigrok/Asmodai)
Cc:        cmsedore@maxwell.syr.edu, freebsd-net@FreeBSD.ORG, luigi@labinfo.iet.unipi.it (Luigi Rizzo)
Subject:   Re: IP source address based load balancing
Message-ID:  <199903061537.QAA00361@labinfo.iet.unipi.it>
In-Reply-To: <XFMail.990306183203.asmodai@wxs.nl> from "Jeroen Ruigrok/Asmodai" at Mar 6, 99 06:31:44 pm

> On 06-Mar-99 Christopher M Sedore wrote:
> 
> > Yes.  I'm hoping that I'll be able to write a "cluster" daemon that will
> > monitor the machines and reconfigure on failure or addition.  A broadcast
> > heartbeat would allow failure detection.  This is somewhat more complex
> > than I had originally hoped, since the operations would ideally not need
> > a "master" machine, and would be all distributed.  This means that all
> > machines in a cluster need to somehow negotiate a mutually agreed upon
> > configuration through broadcasting to each other.  I have some
> > not-too-developed ideas on how to do this, but there are a number of
> > failure modes, etc that have to be considered.
> 
> Shouldn't a multicast heartbeat suffice? If one could avoid broadcasting
> wherever possible, avoid it.

the issue is not broadcast vs. multicast (it's one pkt per second or so),
it's guaranteeing consistent behaviour on reconfiguration -- e.g. no
dropping of active connections, no double responses to a single SYN, no
RST sent just because one of the servers receives a pkt destined to
another one...

i think there is no easy way to implement this as a fully distributed
solution, i.e. without using a centralized machine as a dispatcher/NAT
device.

The reason is, you generally need to reconfigure things (either for
adding/removing servers, or for doing some real load balancing)
without breaking existing connections. This poses problems because,
especially when you add new servers, you can't move already active
connections and so changing filters on the fly won't work. You need
your filters to work mainly on connection setup (packets with a
SYN, basically) but then you have to make TCP not send a RST for
pkts not matching any incoming connection...
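
To illustrate why changing a stateless filter on the fly breaks active
connections, here is a small Python sketch. It assumes, purely for
illustration, that each server accepts the packets whose source address
maps to its index modulo the number of servers; the owner() function and
the client addresses are made up and are not part of any existing tool.

```python
import ipaddress

def owner(src_ip: str, n_servers: int) -> int:
    """Hypothetical stateless filter: index of the server accepting src_ip."""
    return int(ipaddress.ip_address(src_ip)) % n_servers

# Twelve made-up client addresses.
clients = ["10.0.0.%d" % i for i in range(1, 13)]

before = {c: owner(c, 3) for c in clients}   # three servers
after = {c: owner(c, 4) for c in clients}    # a fourth server is added

# Clients whose packets would suddenly land on a different server,
# which has no TCP state for them and would answer with a RST.
moved = [c for c in clients if before[c] != after[c]]
print("%d of %d clients would change server" % (len(moved), len(clients)))
```

Any connection belonging to a client in "moved" would be broken by the
reconfiguration, which is why the filter alone cannot decide where
packets of an established connection go.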

All in all i think a centralized solution (e.g. in the form of a
PicoBSD machine with an in-kernel NAT for performance reasons) is
much more flexible. Most pieces are already there (including a
user-space natd which can be useful to test allocation policies
etc.).  By using a PicoBSD approach (or in general, a read-only
filesystem, since it does not have to use persistent storage), your
natd switch would be no less reliable than your average router.

Basically i think one could do things as follows:

  * define a new ipfw command, call it "natd" or so, where you define
    a list of real servers for a given (virtual) server address.
  * let the in-kernel natd allocate connections using round robin or
    by choosing the least-loaded machine.
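
The allocation step could be sketched as follows. This is a user-space
Python model, not the proposed ipfw command: the Dispatcher class, its
method names and the policy strings are all hypothetical. It only shows
the idea that a server is chosen when the SYN arrives and that later
packets follow the per-connection table.

```python
class Dispatcher:
    """Hypothetical model of the per-connection table in a NAT dispatcher."""

    def __init__(self, servers, policy="round_robin"):
        self.servers = list(servers)
        self.policy = policy
        self.table = {}      # (src_ip, src_port) -> chosen server
        self._next = 0       # round-robin cursor

    def _load(self, server):
        # One unit of load per open connection.
        return sum(1 for s in self.table.values() if s == server)

    def _pick(self):
        if self.policy == "least_loaded":
            return min(self.servers, key=self._load)
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server

    def route(self, src_ip, src_port, syn=False):
        key = (src_ip, src_port)
        if key in self.table:
            return self.table[key]       # existing connection is never moved
        if syn:
            self.table[key] = self._pick()
            return self.table[key]
        return None                      # no state and not a SYN: drop

    def close(self, src_ip, src_port):
        self.table.pop((src_ip, src_port), None)


d = Dispatcher(["s1", "s2"])
first = d.route("10.0.0.1", 1025, syn=True)
second = d.route("10.0.0.2", 1026, syn=True)
d.servers.append("s3")                   # reconfigure: add a server
still = d.route("10.0.0.1", 1025)        # later packet of the first connection
stray = d.route("10.0.0.9", 4000)        # non-SYN packet with no state
print(first, second, still, stray)
```

Because the decision is recorded per connection, changing the server
list only affects connections set up afterwards.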

The natd switch could even try to determine the load on each server
using one of the following heuristics:

  * count each open connection as a unit of load (this is ok as long
    as all connections give approx. the same load on the server);
  * only count as 'loaded' those connections for which there is no
    unacked data from either the client or the server. This should
    match situations where the connection is idle because, presumably,
    the server is processing a request.
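
The two heuristics can be compared with a small Python sketch. The Conn
record and its field names are assumptions made for the example; a real
dispatcher would derive the unacked byte counts from the sequence and
ack numbers it sees in each direction.

```python
from dataclasses import dataclass

@dataclass
class Conn:
    server: str
    client_unacked: int = 0   # bytes sent by the client, not yet acked
    server_unacked: int = 0   # bytes sent by the server, not yet acked

def load_by_count(conns, server):
    """First heuristic: every open connection is one unit of load."""
    return sum(1 for c in conns if c.server == server)

def load_by_idle(conns, server):
    """Second heuristic: count only connections with no unacked data in
    either direction, on the guess that the server is busy on a request."""
    return sum(1 for c in conns
               if c.server == server
               and c.client_unacked == 0
               and c.server_unacked == 0)

# Made-up snapshot of the connection table.
conns = [
    Conn("s1"),                        # quiet: request presumably in progress
    Conn("s1", client_unacked=512),    # client data still in flight
    Conn("s1", server_unacked=1460),   # server is sending a reply
    Conn("s2"),                        # quiet
]

for name in ("s1", "s2"):
    print(name, load_by_count(conns, name), load_by_idle(conns, name))
```

With this snapshot the first heuristic sees s1 as three times as loaded
as s2, while the second rates them equal, so the choice of heuristic can
change which server gets the next connection.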

Because the natd switch has an entry per connection, and has to do
connection matching on each packet, updating the state should be
reasonably cheap. You can monitor servers for being down by simply
checking that the connections they serve make progress. Removing a
server just requires updating the list of servers on the natd and
waiting for all of its connections to terminate. Adding a new server is
even easier, as it just requires updating the list of servers.
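
A minimal Python sketch of this bookkeeping follows. The ServerPool
class, its method names and the timeout are hypothetical; in particular,
"making progress" is approximated here by the time of the last packet
that advanced a connection, which is only one possible reading.

```python
class ServerPool:
    """Hypothetical bookkeeping for adding, draining and watching servers."""

    def __init__(self, servers):
        self.active = set(servers)   # may receive new connections
        self.draining = set()        # removed, waiting for connections to end
        self.conns = {}              # conn_id -> (server, time of last progress)

    def _has_conns(self, server):
        return any(s == server for s, _ in self.conns.values())

    def add(self, server):
        self.active.add(server)      # adding is just a list update

    def remove(self, server):
        self.active.discard(server)  # no new connections from now on
        if self._has_conns(server):
            self.draining.add(server)

    def open(self, conn_id, server, now):
        if server not in self.active:
            raise ValueError("server does not accept new connections")
        self.conns[conn_id] = (server, now)

    def progress(self, conn_id, now):
        server, _ = self.conns[conn_id]
        self.conns[conn_id] = (server, now)

    def close(self, conn_id):
        server, _ = self.conns.pop(conn_id)
        if server in self.draining and not self._has_conns(server):
            self.draining.discard(server)   # fully drained

    def suspected_down(self, server, now, timeout):
        # Down only if it has connections and none progressed recently.
        times = [t for s, t in self.conns.values() if s == server]
        return bool(times) and all(now - t > timeout for t in times)


pool = ServerPool(["s1", "s2"])
pool.open("c1", "s1", now=0)
pool.open("c2", "s2", now=0)
pool.remove("s1")                    # s1 keeps serving c1 but gets nothing new
draining_before = set(pool.draining)
pool.progress("c2", now=8)           # only s2's connection advances
s1_down = pool.suspected_down("s1", now=10, timeout=5)
s2_down = pool.suspected_down("s2", now=10, timeout=5)
pool.close("c1")                     # last connection on s1 ends
print(draining_before, s1_down, s2_down, pool.draining)
```

A server with no connections at all is never flagged by this check, so
an idle server would need some other probe.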

	cheers
	luigi
-----------------------------------+-------------------------------------
  Luigi RIZZO                      .
  EMAIL: luigi@iet.unipi.it        . Dip. di Ing. dell'Informazione
  HTTP://www.iet.unipi.it/~luigi/  . Universita` di Pisa
  TEL/FAX: +39-050-568.533/522     . via Diotisalvi 2, 56126 PISA (Italy)
-----------------------------------+-------------------------------------

