Date: Thu, 29 Aug 2002 14:42:04 -0700 (PDT)
From: Anshuman Kanwar <akanwar@engineering.ucsb.edu>
To: <bill@carracing.com>, <friar_josh@webwarrior.net>
Cc: <freebsd-questions@FreeBSD.ORG>
Subject: RE: network link failover
Message-ID: <Pine.LNX.4.33.0208291436330.30604-100000@ecipc056.engr.ucsb.edu>

Hmm, here is a round wheel, if not exactly a *polished* round one ;)

We needed failover functionality to guard against switch failure on our
servers. Unable to find anything for FreeBSD, I wrote a Perl daemon
(appended below). It is quite specific (and exhaustive) to our setup and
might require some hacking to work with yours. I've replaced all IPs with
xxx's. You may want to look at the do_switch function for the meat of the
failover.

Please feel free to ask me about the details.

Ansh Kanwar.

---------------begin
#!/usr/bin/perl
# Version     : 0.7
# Release Date: Aug 23 2002
# Author      : Anshuman Kanwar

# ver 0.7 modifications
#  - works with FreeBSD 4.6
#  - do_switch() is cleaner, with much of the 4.4
#    specific code now guarded by a uname check
#  - intelligent route deletes
#    (if it's not there, don't try to delete it)
#  - The way to specify an alias is changed to
#    ifconfig_fxp0_alias0="inet xxx.xxx.xxx.xxx/16"
#    4.6 does not support the "alias" at the end anymore
#
# ver 0.6 additions
#  - now works with vlan aliased configs
#  - also with plain aliases

#-Summary---------------------------------------------
#
# Detects network interface failure and migrates
# connectivity to the standby interface.
#
#
# Functional Overview:
# ---------------------
# 1. Figure out which interface is active
# 2. Ping a known host in order to detect packet loss
# 3. If we detect packet loss, switch to the other interface
#
# Important properties:
#  - This script uses very little CPU when there are no
#    networking problems. This is important since it runs
#    together with important server processes that may need
#    100% CPU of the machine.
#  - During an outage, the CPU usage is a little higher,
#    but still low. We have measured a 1.6% peak on an
#    850 MHz Dell 350 server.
#  - Typically the script will detect an interface outage
#    in less than 10 seconds.
#  - Eventually the script will give up and exit if after
#    $MAX_FLAPS interface switches we do not restore connectivity.
#  - When this script exits abnormally it is meant to restore the
#    network interfaces to a "safe" state (still a TODO in log_die)
#
# WARNING: Be very careful when changing the parameters
#          defined below. You should make sure the resulting
#          script behavior conforms to the properties above.
#
#-----------------------------------------------------

use strict;

#-----------define config variables here-------
my $DEBUG          = 1;   # debug (1) or quiet (0)
my $DOIT           = 1;   # test run (0) or real thing (1)
my $IS_DAEMON      = 0;   # run as a daemon (1) or foreground process (0)
my $DO_MINUS_ALIAS = 1;   # required if you are running aliases (NOT VLAN aliases)
# keep DO_MINUS_VLAN = 0; it will eventually be dropped
my $DO_MINUS_VLAN  = 0;

#
# OS and machine specific parameters
#
my $FILE_RC_CONF = "/etc/rc.conf";   # Where the configs are. OS specific
my $INT_TYPE     = "fxp";            # Type of interface. Machine specific
my $INTF_0       = "fxp0";           # Interface zero. Machine specific
my $INTF_1       = "fxp1";           # Interface one. Machine specific

#---------------------------------------------
# Data-center specific parameters.
#
#
# Headquarters values
#
#############
my $LOCATION_HQ  = "HQ";
my $LOCATION_SNV = "SNV";

my %HASH_DEF_ROUTE = (
    $LOCATION_HQ  => "10.4.xx.xx",
    $LOCATION_SNV => "xx.xxx.xxx.xxx",
);
my %HASH_HOST1_TO_PING = (
    $LOCATION_HQ  => "10.4.xx.xxx",
    $LOCATION_SNV => "xxx.xx.xxx.xx",
);
my %HASH_HOST2_TO_PING = (
    $LOCATION_HQ  => "xx.xxx.xx.xx",
    $LOCATION_SNV => "xx.xxx.xxx.x",
);
###################

my $OUR_LOCATION  = $LOCATION_HQ;
my $DEF_ROUTE     = $HASH_DEF_ROUTE{$OUR_LOCATION};      # Default route if no explicit entry in conf
my $HOST_TO_PING  = $HASH_HOST1_TO_PING{$OUR_LOCATION};  # Router's IP (to ping)
my $HOST2_TO_PING = $HASH_HOST2_TO_PING{$OUR_LOCATION};  # backup router

unless (defined $DEF_ROUTE)     { log_die("Define default route\n"); }
unless (defined $HOST_TO_PING)  { log_die("Define Host to ping\n"); }
unless (defined $HOST2_TO_PING) { log_die("Define Host2 to ping\n"); }
#
#---------------------------------------------
#
# Algorithm parameters.
#
my $MAX_FLAPS     = 2000;   # Limit before giving up
my $WAIT_N_LOSSES = 3;      # On ping loss: retry limit

my $HOSTNAME = `hostname`; chomp($HOSTNAME);
my $OS_VER   = `uname -r`; chomp($OS_VER);

#---                                        --#
#You should not be required to edit below this#
#---                                        --#

#sub-------------log_die----------------------#
#                                             #
# Log critical error and die                  #
#---------------------------------------------#
sub log_die {
    my ($message) = @_;
    # List form keeps $message away from the shell
    system("logger", "-p", "local0.error", $message);
    #
    # TODO: restore a safe state
    #
    die($message);
}

#sub-------------ping_host--------------------#
#                                             #
# Pings the given host 1 time                 #
# Returns number of pings received            #
#---------------------------------------------#
sub ping_host {
    my (
        $host            # host to be pinged
    ) = @_;
    my $pingval = undef; # Number of successful pings

    # Call ping and capture its output
    my $pingLine = `ping -c 1 -q -t 2 $host`;
    #
    # TODO: handle pingLine == undef. If needed
    #
    my @pingin = split(/\n/, $pingLine);

    # Parse ping output to find out packet loss
    foreach my $line (@pingin) {
        $line =~ s/\s+$//;
        if ($line =~ /^\d packets transmitted, (\d) packets received,/) {
            $pingval = $1;
        }
    }
    unless (defined $pingval) { log_die(" x DYING:Unexpected ping output\n"); }
    return $pingval;
} # ping_host

#sub------------parse_rc_file---------------------#
#                                                 #
# Parse /etc/rc.conf and return the commands that #
# need to be issued to fail over.                 #
# All vlan and inet commands are included; alias  #
# entries are skipped here and handled separately #
# by alias_rc_file and vlan_rc_file.              #
#                                                 #
# It extracts commands from the config file and   #
# replaces all occurrences of $old_intf with      #
# $new_intf                                       #
#                                                 #
#-------------------------------------------------#
sub parse_rc_file {
    my (
        $conf_file,      # file to parse
        $old_intf,       # interface failed
        $new_intf,       # new interface
        $int_type        # type of int.
    ) = @_;

    # Accumulate result in these two arrays
    my @if_cmds_in   = ();   # buffer for rc.conf output
    my @vlan_cmds_in = ();   # buffer for vlan part of rc.conf

    open (PARSERC, $conf_file) || log_die (" x DIED:no rc file");
    while (my $line = <PARSERC>) {
        next if ($line =~ /^\s*$/);       # Skip blank lines
        next if ($line =~ /^\#.*/);       # Skip comments
        next if ($line =~ /^ifconfig_vlan\d_alias\d/);
        next if ($line =~ /^ifconfig.*alias/);
        if ($line =~ /^ifconfig_$int_type/) {
            $line =~ s/(_|=|\")/ /g;            # strip punctuation
            $line =~ s/$old_intf/$new_intf/g;   # replace interface
            push(@if_cmds_in, $line);           # all ifconfig !vlan
        }
        if ($line =~ /^ifconfig_vlan/) {
            $line =~ s/(_|=|\")/ /g;            # strip punctuation
            $line =~ s/$old_intf/$new_intf/g;   # replace interface
            push(@vlan_cmds_in, $line);
        }
    } # while
    close(PARSERC);
    push (@if_cmds_in, @vlan_cmds_in);
    return @if_cmds_in;
} # parse_rc_file

#sub------------vlan_rc_file--------------------
#
# Return the IP addresses of all VLAN alias
# entries found in the config file
#
#------------------------------------------------
sub vlan_rc_file {
    my (
        $conf_file,      # file to parse
    ) = @_;

    # Accumulate result in this array
    my @alias_list = ();

    open (PARSERC, $conf_file) || log_die (" x DIED:no rc file");
    while (my $line = <PARSERC>) {
        next if ($line =~ /^\s*$/);       # Skip blank lines
        next if ($line =~ /^\#.*/);       # Skip comments
        if ($line =~ /ifconfig_vlan\d_alias\d/) {
            # Dots are escaped so they only match a literal "."
            if ($line =~ /inet\s+(\d+\.\d+\.\d+\.\d+).*vlan/) {
                my $ipadd = $1;
                push(@alias_list, $ipadd);   # VLAN alias address
            }
        }
    } # while
    close(PARSERC);
    return @alias_list;
} # vlan_rc_file

#sub------------alias_rc_file--------------------
#
# Return the addresses (with prefix length, e.g.
# xxx.xxx.xxx.xxx/16) of all non-VLAN alias
# entries found in the config file
#
#------------------------------------------------
sub alias_rc_file {
    my (
        $conf_file,      # file to parse
    ) = @_;

    # Accumulate result in this array
    my @alias_list = ();

    open (PARSERC, $conf_file) || log_die (" x DIED:no rc file");
    while (my $line = <PARSERC>) {
        next if ($line =~ /^\s*$/);       # Skip blank lines
        next if ($line =~ /^\#.*/);       # Skip comments
        next if ($line =~ /ifconfig_vlan/);
        # Matches the address plus its "/NN" prefix length
        if ($line =~ m{^ifconfig.*alias.*\s([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/[0-9]+)}) {
            my $ipadd = $1;
            push(@alias_list, $ipadd);       # plain alias address
        }
    } # while
    close(PARSERC);
    return @alias_list;
} # alias_rc_file

#sub------------parse_ifconf--------------------
#
# execute ifconfig -u to list UP interfaces
# parse output to determine the number of match
# -ing interfaces and their status
#
# Returns:
#   $sys_int_total_up - number of "up" interfaces
#                       (undef if none were found)
#   $intf             - name of last "up" interface
#   $stat             - status of $intf
#-----------------------------------------------
sub parse_ifconf {
    my (
        $int_type        # Interface type
    ) = @_;
    my $intf;              # return interface
    my $stat;              # return status
    my $sys_int_total_up;  # total interfaces up

    # Get ifconfig information
    #--------------------------
    my $if_in_line = `ifconfig -u`;
    my @ifconin = split(/\n/, $if_in_line);

    # Extract info from ifconfig output
    #--------------------------------
    foreach my $ifconLine (@ifconin) {
        $ifconLine =~ s/\s+$//;    # Delete trailing blanks
        # ${int_type} is braced so that [0-9] is read as a character
        # class and not as an array subscript
        if ($ifconLine =~ /^${int_type}[0-9]:/) {
            my @fields = split(/:\s/, $ifconLine);
            $intf = $fields[0];
            $sys_int_total_up++;   # number of up interfaces
        }
        if ($ifconLine =~ /\s+status:\s/) {
            my @fields = split(/:\s/, $ifconLine);
            $stat = $fields[1];    # Status information
        }
    } # for each
    return ($sys_int_total_up, $intf, $stat)
} # parse_ifconf

#sub----------------do_switch---------------------
#                                                 #
# Switches from $old_intf to the other interface  #
#                                                 #
# Has these distinct parts                        #
# 1. parse $conf_file, typically /etc/rc.conf     #
# 2. bring failed interface down                  #
# 3. bring new interface up                       #
#                                                 #
# Returns:                                        #
#   - the name of the new interface               #
#-------------------------------------------------
#
sub do_switch {
    my (
        $old_intf,       # Interface
        $conf_file,      # file to parse
        $int_type        # interface type
    ) = @_;
    my $ret;    # temp variable for return values of system() calls

    # Based on present interface decide new interface.
    my $new_intf = undef;
    if ($old_intf eq $INTF_0) {
        $new_intf = $INTF_1;
    } elsif ($old_intf eq $INTF_1) {
        $new_intf = $INTF_0;
    } else {
        log_die(" x DYING:Interface is invalid\n");
    }

    # Always log attempts to switch
    # These will be picked up by logsurfer
    system("logger -p local0.info SWITCH ATTEMPTED: [$old_intf] to [$new_intf]");

    # Get ifconfig commands from config file
    my @if_cmds_in = parse_rc_file($conf_file, $old_intf, $new_intf, $int_type);
    if ($DEBUG) { print(@if_cmds_in); }

    if ($DOIT) {
        if ($OS_VER =~ /4\.4/) {
            # HACK because FreeBSD 4.4 does not understand -ifp
            # option for route. Not needed for 4.5 and up
            if ($DO_MINUS_ALIAS) {
                system("ifconfig $new_intf 192.168.0.200 -alias");
                my @hack_aliases = alias_rc_file($conf_file);
                foreach my $ip (@hack_aliases) {
                    # system ("ifconfig $old_intf $ip -alias");
                    # print $ip." deleted as alias\n";
                }
            }
            $ret = system("ifconfig $old_intf inet 192.168.0.200");
            if ($ret != 0) { log_die (" x DYING:hack failed\n"); }
        }

        if ($DO_MINUS_VLAN) {
            system("ifconfig vlan0 -vlandev $old_intf");
        }

        # Special parsing is required for alias IPs
        # delete them from the old interface
        my @aliases = alias_rc_file($conf_file);
        foreach my $ip (@aliases) {
            print $ip."----------\n";
            system ("ifconfig $old_intf inet $ip -alias");
        }

        # Bring failed interface down
        $ret = system("ifconfig $old_intf down");
        if ($ret != 0) { log_die(" x DYING:ifdown failed\n"); }

        # Clear ARP cache
        $ret = system("arp -a -d");
        if ($ret != 0) { log_die (" x DYING:arp failed\n"); }

        # to guard against border cases
        # check if default route is in routing table
        # only then delete it
        my $netstat_out = `netstat -rn`;
        if ($netstat_out =~ /^default\s/m) {
            # Delete old route
            # not required in 4.6 as it intelligently deletes the route
            # when the interface is downed
            $ret = system("route delete default");
            if ($ret != 0) { log_die(" x DYING:Route Delete failed\n"); }
        }

        # Bring new interface up
        $ret = system("ifconfig $new_intf up");
        if ($ret != 0) { log_die (" x DYING:ifup failed\n"); }

        # Fail Over to Other Interface including VLANs
        foreach my $execline (@if_cmds_in) {
            $ret = system($execline);
            if ($ret != 0) { log_die (" x DYING:ifconfig failed\n"); }
        }

        if ($DO_MINUS_VLAN) {
            my @ret_vlan = vlan_rc_file($conf_file);
            foreach my $vlan_ip (@ret_vlan) {
                system ("ifconfig vlan0 inet $vlan_ip/32 alias");
                print $vlan_ip."\n";
            }
        }

        # Re-add the alias IPs on the new interface
        foreach my $ip (@aliases) {
            print $ip."----------\n";
            system ("ifconfig $new_intf inet $ip alias");
        }

        # Add new route
        $ret = system("route add -ifp $new_intf default $DEF_ROUTE");
        if ($ret != 0) { log_die(" x Route Add Failed\n"); }
    }
    return $new_intf;
} # do_switch

#-[daemonize]----------------------------------------------------
#
# Makes this process independent of any terminal
# Runs as a daemon in the background
#----------------------------------------------------------------
sub daemonize {
    #chdir '/' or die "Can't chdir to /: $!";
    open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
    open STDOUT, '>>/dev/null' or die "Can't write to /dev/null: $!";
    open STDERR, '>>/dev/null' or die "Can't write to /dev/null: $!";
    defined(my $pid = fork) or die "Can't fork: $!";
    exit if $pid;
    umask 0;
}

#-[-MAIN-]---------------------
{
    my $SLEEP_TIME_NORMAL     = 3;   # 3 seconds between probes when nothing is wrong
    my $SLEEP_TIME_AGGRESSIVE = 1;   # 1 second between probes when we detect an outage
    my $nFlipFlops = 0;              # Number of consecutive flip-flops between interfaces
    my $total_iter = 0;              # total iterations so far
    my $sleeptime  = $SLEEP_TIME_NORMAL;   # time between iterations
    my $nPingLoss  = 0;              # Number of back-to-back ping tests that had packet loss

    daemonize() if ($IS_DAEMON);
    system (" echo $$ > /tmp/FAIL_PIDFILE\n");

    # Check if the config file exists, if not, then die
    die(" x Cannot start, config file missing\n") unless (-e $FILE_RC_CONF);

    #----begin main loop---------#
    MAIN: while (1) {
        # Maintain Counters
        $nFlipFlops++;
        if ($DEBUG) { $total_iter++; }
        if ($nFlipFlops > $MAX_FLAPS) {
            system("logger -p local0.info MAX FLAPS met: QUITTING");
            log_die(" x DYING:Max Flaps met\n");
        }
        if ($DEBUG) {
            print("[$HOSTNAME\@$OS_VER]___[Stuckiter:", $nFlipFlops - 1,
                  "]___[Iter:", $total_iter, "]\n");
        }

        # Step 1: get ifconfig status
        my (
            $sys_int_total_up,   # number of "up" interfaces
            $cur_intf,           # last "up" interface
            $stat_cur_intf       # state of $cur_intf
        ) = parse_ifconf($INT_TYPE);

        # The count is undef when no matching interface is up, so test
        # that first, then compare numerically (">" rather than "gt")
        if (!defined($sys_int_total_up)) {
            if ($DEBUG) { print (" x No Interface is up: Waiting\n"); }
            sleep ($SLEEP_TIME_NORMAL * 2);
            next MAIN;
        } elsif ($sys_int_total_up > 1) {
            if ($DEBUG) { print(" x MORE than ONE interfaces are UP: Waiting\n"); }
            sleep ($SLEEP_TIME_NORMAL * 2);
            next MAIN;
        }
        if ($DEBUG) { print ("+ EXACTLY One interface is up\n"); }

        # Step 2: ping $HOST_TO_PING and $HOST2_TO_PING
        my $pingval1 = ping_host($HOST_TO_PING);
        if ($DEBUG) { print ("+ PING returned ", $pingval1, "\n"); }
        my $pingval2 = ping_host($HOST2_TO_PING);
        if ($DEBUG) { print ("+ PING2 returned ", $pingval2, "\n"); }

        # Discard the lesser value of number of pings answered
        # assuming that the host/router could have gone down.
        my $pingval = ($pingval1 < $pingval2) ? $pingval2 : $pingval1;

        if ($pingval < 1) {
            # Packet loss! Make a log entry
            if ($DEBUG) { print(" - Ping Loss\n"); }
            system("logger -p local0.info PING LOSS");
        } else {
            # No packet loss. Clear error counters
            if ($DEBUG) { print(" | No ping loss\n"); }
            $nFlipFlops = 0;
            $nPingLoss  = 0;
            $sleeptime  = $SLEEP_TIME_NORMAL;
        }

        if (($stat_cur_intf eq "active") && ($pingval < 1)) {
            # The interface status is "active" and we have packet loss
            # Get aggressive (reduce sleeptime).
            $sleeptime = $SLEEP_TIME_AGGRESSIVE;
            $nPingLoss++;
            if ($DEBUG) { print(" - aggressive\n"); }

            # If we do not recover within $WAIT_N_LOSSES iterations, then
            # we call do_switch() to switch to a different interface
            if ($nPingLoss > $WAIT_N_LOSSES) {
                do_switch($cur_intf, $FILE_RC_CONF, $INT_TYPE);
                if ($DEBUG) { print(" - SWITCHING PINGLOSS from $cur_intf\n"); }
            }
        } elsif ($stat_cur_intf ne "active") {
            # The current interface is not active.
            # Maybe cable broken or switch died
            #
            # Future feature: to find out the nature of the failure
            # and act accordingly
            if ($DEBUG) { print(" - LINK LOSS: SWITCHING IMMIDIATELY from $cur_intf\n"); }
            system("logger -p local0.info LINK LOSS: SWITCHING IMMIDIATELY from $cur_intf");

            # Switch to a different interface in the hope that it works
            do_switch($cur_intf, $FILE_RC_CONF, $INT_TYPE);

            # We don't know if the new interface will work.
            # So, we stay aggressive until we know for sure
            $sleeptime = $SLEEP_TIME_AGGRESSIVE;
        }

        if ($DEBUG) { print(" | sleep for $sleeptime sec\n\n"); }

        # Hardcoded guard against thrashing. Never sleep for less than one second
        if ($sleeptime < 1) {
            sleep(1);
        } else {
            sleep($sleeptime);
        }
    } # while (1)
}
#---------------end

-----Original Message-----
From: Josh Paetzel [mailto:friar_josh@webwarrior.net]
Sent: Tuesday, August 20, 2002 12:49 PM
To: W. Desjardins
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: network link failover

On Tue, 2002-08-20 at 18:56, W. Desjardins wrote:
> Hi,
>
> I couldn't find anything in the archives specific to this question.
>
> I was wondering if there are any network card drivers that support link
> failover between either 2 cards, or 2 ports on the same card? Basically, I
> am looking to have a server hooked to 2 switches and have the link
> fail over (while maintaining the IP address) to the new port in the
> event of a dead switch.
>
> I currently use this functionality in Solaris with a daemon called
> in.mpathd that uses interface aliases as floating IPs between network
> interfaces. Solaris will fail over and back any links that fail for any
> reason. It's nice in that besides load balancing to all interfaces in a
> group, I can use any number of interfaces on any card as a group.
>
> I know some of the multiport ethernet cards such as the Intel dual-port
> and D-Link quad-port claim failover capabilities, but I suspect that is
> only for Windows. Is this correct? Is there any ability in the drivers for
> these cards to accommodate failover?
>
> Oh...and I know I can write a script in an hour or so to perform this duty
> and, unless I find any new info here, that is what I will be doing. I just
> don't care to reinvent a sub-standard wheel if a nice round one already
> exists ;)
>
> Thanks,
>
> Bill

Nope. No wheels here to reinvent. FreeBSD is basically designed to be a
one way shot to the internet routing platform. I don't know for a fact
that the Intel multiport cards are capable of what you want, but in 6
years of using FreeBSD I've never heard of one being used in that
capacity. (Take it for what it's worth)

Josh

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
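
The probe-and-switch decision made in the script's main loop can also be pulled out into a pure function, which makes it possible to exercise the logic without touching real interfaces. The following is only a minimal sketch under stated assumptions: the name next_action and the action labels 'ok', 'wait' and 'switch' are hypothetical and do not appear in the script, and the link-down case is simplified (it leaves the loss counter untouched).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the posted script): decide what the
# main loop should do next.
#   $link_status    - status string from ifconfig, e.g. "active"
#   $pings_received - best result of the two ping probes (0 or 1)
#   $n_ping_loss    - consecutive ping losses seen so far
#   $wait_n_losses  - retry limit before switching ($WAIT_N_LOSSES)
# Returns a list: (action, updated loss counter).
sub next_action {
    my ($link_status, $pings_received, $n_ping_loss, $wait_n_losses) = @_;

    # Link is down (cable pulled, switch dead): switch immediately.
    return ('switch', $n_ping_loss) if $link_status ne 'active';

    # Link is up and a probe was answered: clear the loss counter.
    return ('ok', 0) if $pings_received >= 1;

    # Link is up but the probe was lost: count it, and switch only once
    # the count exceeds the retry limit; otherwise keep probing quickly.
    $n_ping_loss++;
    my $action = $n_ping_loss > $wait_n_losses ? 'switch' : 'wait';
    return ($action, $n_ping_loss);
}
```

With the script's default retry limit of 3, a call such as next_action('active', 0, 3, 3) asks for a switch, while next_action('active', 0, 0, 3) only asks the caller to keep probing at the aggressive interval.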
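
For readers adapting parse_rc_file to their own rc.conf, the core transformation (strip the rc.conf punctuation, then point the command at the standby interface) can be tried on a single line in isolation. This is a rough sketch, not the script's actual code path: the helper name rc_line_to_cmd is hypothetical, the address in the usage note is made up for illustration, and unlike parse_rc_file it only accepts lines for the failed interface and tidies whitespace afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the posted script): turn one rc.conf
# line for the failed interface into an ifconfig command for the standby
# interface. Returns nothing (undef in scalar context) for other lines.
sub rc_line_to_cmd {
    my ($line, $old_intf, $new_intf) = @_;
    chomp $line;

    # Only lines of the form ifconfig_<old_intf>="..." are handled here.
    return unless $line =~ /^ifconfig_\Q$old_intf\E=/;

    (my $cmd = $line) =~ s/[_="]/ /g;          # strip punctuation, as parse_rc_file does
    $cmd =~ s/\b\Q$old_intf\E\b/$new_intf/g;   # retarget to the standby interface
    $cmd =~ s/\s+/ /g;                         # collapse runs of whitespace
    $cmd =~ s/^\s+|\s+$//g;                    # trim both ends
    return $cmd;
}
```

Given the illustrative line ifconfig_fxp0="inet 10.0.0.1 netmask 255.255.0.0" with fxp0 as the failed interface and fxp1 as the standby, the helper produces the command string "ifconfig fxp1 inet 10.0.0.1 netmask 255.255.0.0", which is the kind of string do_switch hands to system().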