Date: Wed, 29 Apr 1998 09:45:28 -0400
From: "Nguyen HM (Mike)" <NguyenHM@ucarb.com>
To: Greg Lehey <grog@lemis.com>, Hans Huebner <hans@artcom.de>
Cc: freebsd-hackers@FreeBSD.ORG
Subject: RE: FreeBSD HA configuration / Ethernet address takeover
Message-ID: <332F90115D96D0119CD500805FEA976B013E85D6@HSCMS01>
At my day job, we use an HP product called MC/Serviceguard with HP-UX:
two or more machines sharing a set of disks. This system doesn't do
Ethernet address takeover (we actually use FDDI here). Instead, you
define a "package" that has its own IP address, and MC/Serviceguard
maps that address to an interface on the machine on which the
"package" is currently running. The "package" is the software plus the
volume group it lives on.

All machines in the cluster share some kind of common heartbeat
connection (e.g. serial or networking; we use a non-routed ethernet
between the machines in a cluster). If a machine goes down, MC/S
restarts the package on another machine. Of course, there is some kind
of lock manager to make sure you don't have two machines trying to run
the same package (and thus access the same disks). It works reasonably
well, and you can have, for example, two packages running on two
machines with a common failover.

It can also do things like hot spare network interfaces (e.g. two or
more interfaces connected to the same network, but only one is
active/ifconfig'ed; if it dies, MC/S automatically brings up a spare
interface). We actually use this product on several machines just for
this purpose.

This approach of using an IP address would save you the hassle of
reprogramming the hardware address of an ethernet controller, as well
as allow you to use whatever networking tech. you want.

For those interested, there's a bunch of material, including manuals
and a doc on how to set up a HA NFS environment, at
http://www.docs.hp.com/hpux/ha.

Mike.
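The package mechanism described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not MC/Serviceguard's actual interface: the names `Node`, `Package`, `ClusterLock`, `start_package` and `fail_over` are hypothetical, the address 192.0.2.10 is a documentation placeholder, and a real system would activate the volume group and configure the address on an interface where the sketch merely appends to a list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    name: str
    alive: bool = True
    # Addresses currently configured on this node (simulated).
    addresses: List[str] = field(default_factory=list)


@dataclass
class Package:
    name: str
    ip: str            # the package's own address, not tied to any node
    volume_group: str  # shared-disk volume group the package lives on


class ClusterLock:
    """Hypothetical stand-in for the cluster lock manager."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}  # package name -> node name

    def owner(self, pkg: Package) -> Optional[str]:
        return self._owner.get(pkg.name)

    def acquire(self, pkg: Package, node: Node) -> bool:
        # Refuse if another node already holds the package, so that
        # only one machine ever touches the package's disks.
        if self._owner.get(pkg.name, node.name) != node.name:
            return False
        self._owner[pkg.name] = node.name
        return True

    def release(self, pkg: Package, node: Node) -> None:
        if self._owner.get(pkg.name) == node.name:
            del self._owner[pkg.name]


def start_package(pkg: Package, node: Node, lock: ClusterLock) -> bool:
    """Run pkg on node if the node is up and the lock can be taken."""
    if not node.alive or not lock.acquire(pkg, node):
        return False
    if pkg.ip not in node.addresses:
        node.addresses.append(pkg.ip)  # simulated address mapping
    return True


def fail_over(pkg: Package, nodes: List[Node],
              lock: ClusterLock) -> Optional[Node]:
    """Restart pkg on the first live node if its owner has gone down."""
    by_name = {n.name: n for n in nodes}
    owner = by_name.get(lock.owner(pkg) or "")
    if owner is not None and owner.alive:
        return owner  # package is still running where it was
    if owner is not None:
        # Owner is down: drop its lock and its claim on the address.
        lock.release(pkg, owner)
        if pkg.ip in owner.addresses:
            owner.addresses.remove(pkg.ip)
    for node in nodes:
        if start_package(pkg, node, lock):
            return node
    return None
```

Clients keep talking to the package's address throughout; only the node behind it changes, which is why no hardware address needs to be reprogrammed.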
// Mike Nguyen
// Unix Systems Analyst and Geek
// Union Carbide Corporation * (281) 212-8073
// nguyenhm@ucarb.com * mikenguyen@sprintmail.com (personal)

> ----------
> From: Hans Huebner[SMTP:hans@artcom.de]
> Sent: Wednesday, April 29, 1998 6:02 AM
> To: Greg Lehey
> Cc: freebsd-hackers@FreeBSD.ORG
> Subject: Re: FreeBSD HA configuration / Ethernet address takeover
>
> On Wed, 29 Apr 1998, Greg Lehey wrote:
>
> > A big difference between the environment you're considering and the
> > Tandem environment is that the Tandem environments are logically a
> > single system which doesn't fail, whereas you're looking at separate
> > systems, one of which may fail. A significant problem is to determine
> > when the primary machine fails (what, you don't get a reply from the
> > machine? Maybe *your* Ethernet board has failed). This problem has
> > caused Tandem headaches for decades, and I'm not going to discuss it
> > in this message.
>
> I was suggesting this approach while I had the OpenVision Axxion HA
> approach in mind. Axxion HA is a HA system for Solaris, and it works
> with Ethernet address takeover and dual-ported FC/AL disks. Clients
> see an Axxion HA pair as two separate physical machines and one
> 'logical' machine which is run on either host of the pair. Services
> are normally accessed at the 'logical' machine, but clients are free
> to use the 'physical' systems if they feel the need to do so.
>
> Axxion HA reserves one physical ethernet port for interconnection
> between the two CPUs in a HA pair. This private ethernet is normally
> implemented with a crossed TP cable, and the HA monitoring software
> sends alive-messages through this interface. If one system detects
> that the other no longer sends alive messages, it first checks whether
> the public ethernet of the other still responds before taking over the
> public services.
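
The takeover check described in the paragraph above can be written as a small decision function. This is a rough Python sketch, not Axxion HA's actual logic: the names `Action` and `decide` and the timeout parameter are hypothetical, and what the product does when the peer still answers on the public network is not stated in the message, so holding off in that case is an assumption.

```python
from enum import Enum
from typing import Callable


class Action(Enum):
    NONE = "none"            # heartbeat is current, nothing to do
    HOLD = "hold"            # heartbeat lost, but peer still answers
    TAKE_OVER = "take_over"  # peer silent on both networks


def decide(now: float,
           last_alive: float,
           timeout: float,
           peer_public_responds: Callable[[], bool]) -> Action:
    """Decide what to do about the peer in an HA pair.

    now / last_alive: timestamps in seconds; last_alive is when the
        last alive-message arrived over the private link.
    timeout: silence longer than this counts as a lost heartbeat
        (hypothetical tuning parameter).
    peer_public_responds: probe of the peer's public interface,
        injected so the function stays testable; a real monitor
        might ping the peer's public address here.
    """
    if now - last_alive <= timeout:
        return Action.NONE
    # Heartbeat lost. Before taking over, check the public network:
    # if the peer still answers there, the private link is the more
    # likely casualty. Holding off here is an assumption.
    if peer_public_responds():
        return Action.HOLD
    return Action.TAKE_OVER
```

The public probe only runs after the heartbeat has timed out, so a healthy pair generates no extra traffic on the public network.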
>
> The disks in an Axxion HA configuration are dual-ported, but each disk
> is accessed only by one host at a time. The dual-porting is used
> solely to minimize the time a failover from one machine to the other
> takes.
>
> There are situations, of course, where such an approach fails, but a
> HA solution is not what Tandem offers (100% availability). A HA
> configuration can help nevertheless when one needs to perform system
> upgrades, reliable level 0 dumps or other work on a system which runs
> services users depend on.
>
> > 1. Reliable Ethernet
>
> [...]
>
> > In the case of a board which can't change its MAC address, the
> > alternative of assuming its IP address and sending a couple of
> > pings to the broadcast address sounds like a good workaround.
> > Certainly it will normalize things faster than waiting for the
> > application layer to try an alternative IP.
>
> This does not work with an IP alias address unless special
> configuration measures are taken. In fact, this is our primary
> problem, since we run our name service on an IP address which is an
> alias on the host our name server runs on.
>
> > 2. SCSI takeover.
>
> [...]
>
> > I can't see a good solution in using two host adaptors on two
> > different machines connected to a single string. As long as the
> > second machine doesn't have access to the first machine's buffer
> > cache, data can get lost, and a takeover must involve an fsck.
> > The overhead of fsck could go into several minutes, much longer
> > than the time that the application layer takes to try another IP
> > address. I don't think that this would make much sense from an
> > availability standpoint, though it obviously makes sense to
> > recover the file systems and make them available on another
> > machine if the first machine is going to be out of commission for
> > any length of time.
>
> With respect to the HA discussion, dual-hosted SCSI busses would only
> allow for faster access to the disks if a failover occurs. In other
> applications, dual-hosting a SCSI bus would probably make more sense
> (sharing of tape drives would be one example, fast IP connectivity
> between two hosts would be another).
>
> > What makes more sense is to replicate the data across multiple
> > systems. Possibly a software layer like the vinum volume manager
> > would be able to perform this function: put one copy of the data
> > on the local machine, another on one or two other machines via NFS
> > or some other protocol, and always read from the local machine.
> > As long as the write rate is not too high, this should allow for
> > higher availability.
>
> If there is such a thing for FreeBSD, I want it ;)
>
> Best regards,
> Hans
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message