Date: Sat, 12 Oct 1996 22:37:45 -0600 (MDT)
From: Wes Peters <softweyr@xmission.com>
To: James FitzGibbon <james@nexis.net>
Cc: questions@freebsd.org
Subject: Redundancy in FBSD web server
Message-ID: <199610130437.WAA01056@obie.softweyr.com>
In-Reply-To: <72125273@toto.iv>

James FitzGibbon writes:
> I need to set up a web server that (in my client's humble words) "CANNOT
> EVER BE DOWN".  They've got the budget, so I recommended two servers that
> can serve domains concurrently.
>
> I'd be interested in hearing how people have/would implement this.  My
> thoughts so far would be to:
>
> a) Use a powerful box as the main server, with a backup box mirroring
>    sites and ready to take over should the main one go down.
>
> -or-
>
> b) Use machines of equal power, using a DNS entry with multiple A records
>    to shuffle requests back and forth.
>
> Opinions appreciated, including ways of detecting a downed host and taking
> over (ifconfig aliasing) IPs of a machine that has crashed.

I've just finished (5 minutes ago, literally) a project of this sort at
my "day" job: a redundant, 24x7 television broadcast automation system.
Our system, a large audio/video switch, uses a control processor based
on an M68000.  In order to achieve reliable backup, we put two of them
in the system and have them monitor each other's state.  This is a
really simplistic system, but it works fairly well.*

What I'd suggest you do is to have two machines connected to your
router.  Each has its own network interface address; neither is the
www.whatever address.  When the "primary" machine boots, it adds the
address of www.whatever as an alias for its network interface; the
standby begins pinging (or attempting http connections to) the
www.whatever address.  If the standby machine detects that the primary
has gone down, because it stops answering the pings, it adds
www.whatever as an alias for *its* network interface and takes over.

Things you have to account for:

 o The original machine comes back up.  Does it now take over and the
   "backup" shut up, or does it become the backup?  In our system, a
   control board always comes up "standby" and only goes "active" once
   it has determined there isn't another active board.

 o Keeping the HTML "database" up to date.  In our system, critical
   dynamic configuration data is downloaded from the active board
   whenever a system comes up standby and an active board exists.

 o Routing and ARP tables.  You're dynamically juggling the hardware
   address associated with the www.whatever IP address on your local
   network.  I know this isn't going to "just work," and I'd have to
   study the routing implications before committing to do this.

 o Communications between the two systems.  We use a pair of dedicated
   serial ports for our redundancy state messages; each board transmits
   its current state 4x/sec and expects the other board to report its
   state at least 2x/sec.  If a report is not seen within 500 msec, it
   is assumed that the other board has died, and this board becomes
   active.  For your application, pinging over the network may be good
   enough.

People who really study backup systems will explain that you can't have
a proper redundant system with 2, or *any* even number of processors, or
with the same software on every system.  On the other hand, you can
measurably increase your reliability in the face of simple hardware
failures without a lot of custom programming.

Good luck.  Feel free to e-mail back if I can answer any questions for
you.  ;^)

Wes Peters

* Our most common mode of failure is, of course, catastrophic software
  failure.  In this case, what usually happens is the active board
  crashes, the standby board takes over, the control system resends its
  last command, and the newly active board dies in *exactly* the same
  manner the previous active board did.  Sigh.  That's why you can't
  have a truly redundant system running the *same* software.  But who
  can afford to develop *two* control systems, when this one took
  twelve years to develop to this stage already???

-- 
Wes Peters | Softweyr | Where am I, and what am I doing in this handbasket?
Consulting | softweyr@xmission.com
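
The standby rule described in the message (always come up standby, go
active only after the other side has been silent for 500 msec) can be
sketched roughly as follows.  This is a minimal illustration in Python,
not the code from the broadcast system or a finished failover daemon.
The names Node, on_peer_report, tick and the "add_alias" return value
are all hypothetical; a real setup would react to "add_alias" by adding
the www.whatever address as an interface alias.  The sketch also assumes
timestamps in seconds are passed in by the caller, and it does not
handle the case where both machines boot at once and keep reporting
"standby" to each other.

```python
from dataclasses import dataclass
from typing import Optional

HEARTBEAT_TIMEOUT = 0.5  # seconds: the 500 msec limit quoted in the message


@dataclass
class Node:
    """One of the two machines; all names here are illustrative only."""
    name: str
    booted_at: float = 0.0
    state: str = "standby"                  # a node always comes up standby
    last_report_at: Optional[float] = None  # when the peer last reported
    peer_state: Optional[str] = None        # kept so a caller can tell whether
                                            # an active peer exists to copy from

    def on_peer_report(self, peer_state: str, now: float) -> None:
        """Record a state message received from the other node."""
        self.peer_state = peer_state
        self.last_report_at = now

    def tick(self, now: float) -> Optional[str]:
        """Re-evaluate our state; return "add_alias" when we take over."""
        if self.state == "active":
            return None
        # Measure silence from the last report, or from boot if none arrived.
        since = self.booted_at if self.last_report_at is None else self.last_report_at
        if now - since > HEARTBEAT_TIMEOUT:
            self.state = "active"
            return "add_alias"
        return None


# Usage: a standby that hears an active peer, then loses it.
standby = Node("standby", booted_at=1.0)
standby.on_peer_report("active", now=1.1)
standby.tick(now=1.3)   # peer heard recently, so stay standby
standby.tick(now=1.7)   # silent for more than 500 msec, so take over
```

Passing the clock in as an argument, rather than reading it inside
tick, keeps the logic easy to test without real delays.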