From owner-freebsd-net@FreeBSD.ORG Thu Dec 20 11:57:33 2012
Message-ID: <50D2FD23.90505@grosbein.net>
Date: Thu, 20 Dec 2012 18:57:23 +0700
From: Eugene Grosbein
To: "net@freebsd.org"
Cc: Sergey Matveychuk
Subject: libradius dead_time option
List-Id: Networking and TCP/IP with FreeBSD

Hi!

Recently libradius(3) got the long-awaited 'dead_time' option, which allows
'dead' RADIUS servers to be skipped for the 'dead_time' timeout when
multiple servers are configured.

I'd like to ask for a small improvement to the code. At present it fails
without even trying if all servers are marked 'dead'. Instead, in that case
it should ignore the 'dead' state of the servers and make an attempt as if
all of them were marked 'alive'. That would greatly speed up recovery after
major network disasters.

Also, I'd like to be able to see notifications of such disasters in the
system logs (as an option).

Something like this (compile-tested only):

--- radlib.c.orig	2012-12-20 18:13:25.000000000 +0700
+++ radlib.c	2012-12-20 18:54:43.000000000 +0700
@@ -55,6 +55,9 @@
 #include
 #include
 #include
+#ifdef LIBRADIUS_USE_SYSLOG
+#include <syslog.h>
+#endif
 #include
 
 #include "radlib_private.h"
@@ -686,9 +689,35 @@
 	if (h->servers[h->srv].num_tries >= h->servers[h->srv].max_tries) {
 		/* Set next probe time for this server */
 		if (h->servers[h->srv].dead_time) {
+			int alldead = 1;
+			int i;
+#ifdef LIBRADIUS_USE_SYSLOG
+			char host[INET_ADDRSTRLEN]; /* AF_INET in dot notation */
+			syslog(LOG_INFO,
+			    "RADIUS server %s:%u is not responding and is being marked dead.",
+			    inet_ntop(AF_INET, &(h->servers[h->srv].addr.sin_addr),
+				host, sizeof(host)),
+			    (unsigned int)ntohs(h->servers[h->srv].addr.sin_port));
+#endif
 			h->servers[h->srv].is_dead = 1;
 			h->servers[h->srv].next_probe = now +
 			    h->servers[h->srv].dead_time;
+			for (i = 0; i < h->num_servers; i++) {
+				if (!h->servers[i].is_dead) {
+					alldead = 0;
+					break;
+				}
+			}
+			if (alldead) {
+#ifdef LIBRADIUS_USE_SYSLOG
+				syslog(LOG_NOTICE, "ALL RADIUS servers are dead.");
+#endif
+				/* don't be idle */
+				for (i = 0; i < h->num_servers; i++) {
+					h->servers[i].is_dead = 0;
+					h->servers[i].num_tries = 0;
+				}
+			}
 		}
 		do {
 			h->srv++;
@@ -698,6 +727,14 @@
 				break;
 			if (h->servers[h->srv].dead_time &&
 			    h->servers[h->srv].next_probe <= now) {
+#ifdef LIBRADIUS_USE_SYSLOG
+				char host[INET_ADDRSTRLEN]; /* AF_INET in dot notation */
+				syslog(LOG_INFO,
+				    "RADIUS server %s:%u is being marked alive.",
+				    inet_ntop(AF_INET, &(h->servers[h->srv].addr.sin_addr),
+					host, sizeof(host)),
+				    (unsigned int)ntohs(h->servers[h->srv].addr.sin_port));
+#endif
 				h->servers[h->srv].is_dead = 0;
 				h->servers[h->srv].num_tries = 0;
 				break;
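
For illustration only, the all-dead fallback can also be looked at in
isolation from libradius. The sketch below is a standalone approximation:
the struct server_state type and the mark_dead_with_fallback() function are
hypothetical names that do not exist in libradius; only the field names
mirror those used in the patch above.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified stand-in for libradius' per-server state. */
struct server_state {
	int	is_dead;	/* non-zero while the server is being skipped */
	int	num_tries;	/* attempts made so far */
	int	max_tries;	/* attempts allowed before marking dead */
	time_t	dead_time;	/* seconds to skip a dead server */
	time_t	next_probe;	/* earliest time to try a dead server again */
};

/*
 * Mark server `cur` dead and schedule its next probe. If that leaves
 * every server dead, clear the dead flag and try counter of all of them
 * so the caller keeps trying instead of failing immediately.
 * Returns 1 if all servers were revived, 0 otherwise.
 */
static int
mark_dead_with_fallback(struct server_state *srv, size_t n, size_t cur,
    time_t now)
{
	size_t i;

	srv[cur].is_dead = 1;
	srv[cur].next_probe = now + srv[cur].dead_time;

	for (i = 0; i < n; i++)
		if (!srv[i].is_dead)
			return (0);	/* at least one server is still usable */

	/* Everything is dead: don't be idle, retry all servers. */
	for (i = 0; i < n; i++) {
		srv[i].is_dead = 0;
		srv[i].num_tries = 0;
	}
	return (1);
}
```

A caller could use the return value to decide whether to log the "all
servers are dead" notice, which keeps the syslog dependency out of the
state-handling code.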