Date: Sat, 12 Oct 1996 22:37:45 -0600 (MDT)
From: Wes Peters <softweyr@xmission.com>
To: James FitzGibbon <james@nexis.net>
Cc: questions@freebsd.org
Subject: Redundancy in FBSD web server
Message-ID: <199610130437.WAA01056@obie.softweyr.com>
In-Reply-To: <72125273@toto.iv>
James FitzGibbon writes:
> I need to set up a web server that (in my client's humble words) "CANNOT
> EVER BE DOWN". They've got the budget, so I recommended two servers that
> can serve domains concurrently.
>
> I'd be interested in hearing how people have/would implement this. My
> thoughts so far would be to:
>
> a) Use a powerful box as the main server, with a backup box mirroring
> sites and ready to take over should the main one go down.
>
> -or-
>
> b) Use machines of equal power, using a DNS entry with multiple A records
> to shuffle requests back and forth.
>
> Opinions appreciated, including ways of detecting a downed host and taking
> over (ifconfig aliasing) IPs of a machine that has crashed.
I've just finished (5 minutes ago, literally) a project of this sort
at my "day" job: a redundant, 24x7 television broadcast automation
system. Our system, a large audio/video switch, uses a control
processor based on an M68000. In order to achieve reliable backup, we
put two of them in the system, and have them monitor each other's
state. This is a really simplistic system, but it works fairly well.*
What I'd suggest you do is to have two machines connected to your
router. Each has its own network interface, and neither interface has the
www.whatever address. When the "primary" machine boots, it adds the
address of www.whatever as an alias for its network interface; the
standby begins pinging (or attempting http connections to) the
www.whatever address. If the standby machine detects the primary has
gone down because it no longer answers the pings, it adds www.whatever
as an alias for *its* network interface and takes over.
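
A rough sketch of the standby side of that scheme, in Python. Everything
named here is an assumption for illustration and not something I've run
in production: the address 192.0.2.10 stands in for www.whatever, ed0 is
a made-up interface name, and the "three misses in a row" threshold is
my own guess at a sane default. The probe is a plain TCP connect to
port 80 rather than an ICMP ping.

```python
import socket
import time

SERVICE_ADDR = "192.0.2.10"  # hypothetical stand-in for the www.whatever address
IFACE = "ed0"                # hypothetical interface name
MAX_MISSES = 3               # assumed threshold, not from the original post


def http_port_answers(addr, port=80, timeout=2.0):
    """Return True if a TCP connection to addr:port succeeds."""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False


def alias_command(iface, addr):
    """Build a FreeBSD-style ifconfig command that adds addr as an alias."""
    return ["ifconfig", iface, "inet", addr,
            "netmask", "255.255.255.255", "alias"]


def watch_primary(probe, take_over, max_misses=MAX_MISSES,
                  interval=1.0, sleep=time.sleep):
    """Probe until max_misses consecutive failures, then call take_over()."""
    misses = 0
    while misses < max_misses:
        # A single successful probe resets the count of consecutive misses.
        misses = 0 if probe() else misses + 1
        if misses < max_misses:
            sleep(interval)
    take_over()


# Possible use on the standby box (needs root for ifconfig):
#   import subprocess
#   watch_primary(
#       probe=lambda: http_port_answers(SERVICE_ADDR),
#       take_over=lambda: subprocess.run(
#           alias_command(IFACE, SERVICE_ADDR), check=True),
#   )
```

Requiring several misses in a row keeps one dropped packet from
triggering a takeover, at the cost of a few seconds of extra downtime.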
Things you have to account for:
o The original machine comes back up. Does it now take over and the
"backup" shut up, or does it become the backup? In our system, a
control board always comes up "standby" and only goes "active" once
it has determined there isn't another active board.
o Keeping the HTML "database" up to date. In our system, critical
dynamic configuration data is downloaded from the active board
whenever a system comes up standby and an active board exists.
o Routing and ARP tables. You're juggling the hardware address
associated with the www.whatever IP address dynamically on your
local network. I know this isn't going to "just work," but I'd
have to study the routing implications before committing to this
approach.
o Communications between the two systems. We use a pair of dedicated
serial ports for our redundancy state messages; each board
transmits its current state 4x/sec and expects the other board to
report its state at least 2x/sec. If a report is not seen within
500 msec, it is assumed that the other board has died, and this
board becomes active. For your application, pinging over the
network may be good enough.
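
The "come up standby, report state 4x/sec, go active after 500 msec of
silence" rule can be sketched as a small state machine. This is only an
illustration in Python, not our control board code: the Board class and
its method names are made up, time is passed in by the caller so the
logic can be exercised without real serial ports, and the sketch does
not try to resolve the case where both boards are standby and both are
still reporting.

```python
STANDBY, ACTIVE = "standby", "active"
SEND_INTERVAL = 0.25  # seconds; 4 reports per second, as in the post
PEER_TIMEOUT = 0.5    # seconds; 500 msec, as in the post


class Board:
    """Hypothetical model of one side of the redundant pair."""

    def __init__(self, now):
        self.state = STANDBY   # a board always comes up standby
        self.last_heard = now  # the timeout window starts at boot
        self.next_send = now

    def on_peer_report(self, now):
        """Record that a state report arrived from the other board."""
        self.last_heard = now

    def tick(self, now, send):
        """Run periodically: check the peer timeout, then report our state."""
        if self.state == STANDBY and now - self.last_heard > PEER_TIMEOUT:
            # The peer has been silent too long, so assume it has died.
            self.state = ACTIVE
        if now >= self.next_send:
            send(self.state)
            self.next_send = now + SEND_INTERVAL
```

Sending twice as often as the receiver requires means a single lost
report does not cause a false takeover.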
People who really study backup systems will explain that you can't
have a proper redundant system with 2, or *any* even number of
processors, or with the same software on every system. On the other
hand, you can measurably increase your reliability in the face of
simple hardware failures without a lot of custom programming.
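
One common reading of the "no even number" argument is majority voting:
a node may stay active only if it can see more than half of the nodes,
itself included. The Python sketch below is my illustration of that
reading, with made-up function names, and not a description of any
particular product.

```python
def has_majority(reachable, total):
    """True if a node seeing `reachable` nodes (itself included) out of
    `total` holds a strict majority."""
    return 2 * reachable > total


def survivors_after_split(total, side_a):
    """For a two-way partition with side_a nodes on one side, return
    whether each side may stay active under the majority rule."""
    side_b = total - side_a
    return has_majority(side_a, total), has_majority(side_b, total)
```

With two nodes, a broken link leaves one node on each side and neither
has a majority, so each must guess whether the other is dead. With
three nodes, a two-way split always leaves exactly one side holding a
majority.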
Good luck. Feel free to e-mail back if I can answer any questions for
you. ;^)
Wes Peters
* Our most common mode of failure is, of course, catastrophic software
failure. In this case, what usually happens is the active board
crashes, the standby board takes over, the control system resends its
last command, and the newly active board dies in *exactly* the same
manner the previous active board did. Sigh. That's why you can't
have a truly redundant system running the *same* software. But who
can afford to develop *two* control systems, when this one took twelve
years to develop to this stage already???
--
Wes Peters |
Softweyr | Where am I, and what am I doing in this handbasket?
Consulting |
softweyr@xmission.com
