From owner-freebsd-bugs Thu Oct 4 1:30:15 2001
Delivered-To: freebsd-bugs@hub.freebsd.org
Received: from freefall.freebsd.org (freefall.FreeBSD.org [216.136.204.21])
        by hub.freebsd.org (Postfix) with ESMTP id 9775F37B407
        for ; Thu, 4 Oct 2001 01:30:01 -0700 (PDT)
Received: (from gnats@localhost)
        by freefall.freebsd.org (8.11.4/8.11.4) id f948U1I13047;
        Thu, 4 Oct 2001 01:30:01 -0700 (PDT)
        (envelope-from gnats)
Received: from tomts11-srv.bellnexxia.net (tomts11.bellnexxia.net [209.226.175.55])
        by hub.freebsd.org (Postfix) with ESMTP id EB8E437B405
        for ; Thu, 4 Oct 2001 01:23:42 -0700 (PDT)
Received: from khan.anarcat.dyndns.org ([65.92.161.107])
        by tomts11-srv.bellnexxia.net
        (InterMail vM.4.01.03.16 201-229-121-116-20010115) with ESMTP
        id <20011004082341.PBFG19737.tomts11-srv.bellnexxia.net@khan.anarcat.dyndns.org>
        for ; Thu, 4 Oct 2001 04:23:41 -0400
Received: from shall.anarcat.dyndns.org (shall.anarcat.dyndns.org [192.168.0.1])
        by khan.anarcat.dyndns.org (Postfix) with ESMTP id 6768A1AF8
        for ; Thu, 4 Oct 2001 04:23:37 -0400 (EDT)
Received: by shall.anarcat.dyndns.org (Postfix, from userid 1000)
        id A6EBC20BE1; Thu, 4 Oct 2001 04:23:42 -0400 (EDT)
Message-Id: <20011004082342.A6EBC20BE1@shall.anarcat.dyndns.org>
Date: Thu, 4 Oct 2001 04:23:42 -0400 (EDT)
From: The Anarcat
Reply-To: The Anarcat
To: FreeBSD-gnats-submit@freebsd.org
X-Send-Pr-Version: 3.113
Subject: bin/31029: syslogd remote logging back down
Sender: owner-freebsd-bugs@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

>Number:         31029
>Category:       bin
>Synopsis:       syslogd remote logging back down
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Thu Oct 04 01:30:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     The Anarcat
>Release:        FreeBSD 4.4-STABLE i386
>Organization:
Nada, inc.
>Environment:
System: FreeBSD shall.anarcat.dyndns.org 4.4-STABLE FreeBSD 4.4-STABLE #7: Sat Sep 15 00:41:38 EDT 2001 anarcat@shall.anarcat.dyndns.org:/usr/obj/usr/src/sys/SHALL i386

>Description:

From -questions:

On Tue, Oct 02, 2001 at 11:57:08AM -0400, The Anarcat wrote:
> Hi.
>
> I think I noticed what seems to me undesirable (and undocumented?)
> behavior in syslogd. When a remote logging host (@host) is
> unreachable:
>
> syslogd: sendto: Host is down
>
> syslogd *never* tries to reach it again, unless it receives a HUP.
> Shouldn't it try to reach it again, from time to time?
>
> The @host was indeed down, but when it was brought back up, remote
> logging wasn't resumed.

>How-To-Repeat:

*.*	@host

where host is down or unreachable.

>Fix:

This is a draft of what I would call "approximate exponential backoff
algorithm". :)

There's a lot of debugging code that can be removed, but they help
seeing what's going on.

There's probably a better way to do this too. :)

--- syslogd.c.orig	Wed Oct  3 15:56:32 2001
+++ syslogd.c	Thu Oct  4 00:06:49 2001
@@ -142,6 +142,9 @@
 #define MARK		0x008	/* this message is a mark */
 #define ISKERNEL	0x010	/* kernel generated message */
 
+#define DELAY_MUL	2	/* delay multiplier */
+#define DELAY_INIT	30	/* initial delay in seconds */
+
 /*
  * This structure represents the files that will have log
  * copies printed.
@@ -159,6 +162,9 @@
 #define PRI_EQ	0x2
 #define PRI_GT	0x4
 	char	*f_program;		/* program this applies to */
+	/* should this be part of the union? */
+	time_t	f_unreach;		/* time since last unreach */
+	time_t	f_delay;		/* backoff time */
 	union {
 		char	f_uname[MAXUNAMES][UT_NAMESIZE+1];
 		struct {
@@ -999,6 +1005,15 @@
 			l = MAXLINE;
 
 		if (finet) {
+			dprintf("FORW: now: %d f_unreach: %d f_delay: %d\n", (int) now, (int) f->f_unreach, (int) f->f_delay);
+			/* XXX: must make sure this is initialized to 0 */
+			if (f->f_unreach) { /* there was a failure last time */
+				dprintf("another try at host\n");
+				if ( (now - f->f_unreach) < f->f_delay) {
+					dprintf("skipping: now: %d, f_unreach: %d f_delay: %d\n", (int) now, (int) f->f_unreach, (int) f->f_delay);
+					break; /* do not send */
+				}
+			}
 			for (r = f->f_un.f_forw.f_addr; r; r = r->ai_next) {
 				for (i = 0; i < *finet; i++) {
 #if 0
@@ -1017,12 +1032,38 @@
 				if (lsent == l && !send_to_all)
 					break;
 			}
+			dprintf("lsent: %d\n", lsent);
 			if (lsent != l) {
 				int e = errno;
-				(void)close(f->f_file);
-				errno = e;
-				f->f_type = F_UNUSED;
+				dprintf("sendto: f_unreach: %d f_delay: %d\n", (int) f->f_unreach, (int) f->f_delay);
 				logerror("sendto");
+				errno = e;
+				switch (errno) {
+				case EHOSTUNREACH:
+				case EHOSTDOWN:
+					if (f->f_unreach)
+						f->f_delay *= DELAY_MUL;
+					else {
+						f->f_unreach = now;
+						f->f_delay = DELAY_INIT;
+					}
+					dprintf("setting: f_unreach: %d f_delay: %d\n", (int) f->f_unreach, (int) f->f_delay);
+					break;
+					/* case EBADF: */
+					/* case EACCES: */
+					/* case ENOTSOCK: */
+					/* case EFAULT: */
+					/* case EMSGSIZE: */
+					/* case EAGAIN: */
+					/* case ENOBUFS: */
+					/* case ECONNREFUSED: */
+				default:
+					dprintf("removing entry\n", e);
+					(void)close(f->f_file);
+					errno = e;
+					f->f_type = F_UNUSED;
+					break;
+				}
 			}
 		}
 		break;
@@ -2301,3 +2342,7 @@
 
 	return(socks);
 }
+
+/* Local Variables: *** */
+/* c-basic-offset:8 *** */
+/* End: *** */
>Release-Note:
>Audit-Trail:
>Unformatted:

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-bugs" in the body of the message
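
For readers who want to see the backoff idea apart from the rest of syslogd.c, here is a minimal standalone sketch in C. It is not the patch above. The struct and function names (fwd_backoff, backoff_should_skip, backoff_on_failure, backoff_on_success) are hypothetical; only DELAY_MUL and DELAY_INIT come from the draft. It also makes two assumptions the draft does not: the failure timestamp is refreshed on every failed send, and a successful send clears the state so that a later outage starts again from the initial delay.

```c
#include <stdbool.h>
#include <time.h>

#define DELAY_MUL	2	/* delay multiplier (value from the draft) */
#define DELAY_INIT	30	/* initial delay in seconds (value from the draft) */

/* Hypothetical stand-in for the two fields the draft adds to struct filed. */
struct fwd_backoff {
	time_t	unreach;	/* time of the last failed send, 0 if healthy */
	time_t	delay;		/* seconds to wait after `unreach` */
};

/* True while the host is still inside its backoff window at time `now`. */
static bool
backoff_should_skip(const struct fwd_backoff *b, time_t now)
{
	return (b->unreach != 0 && (now - b->unreach) < b->delay);
}

/*
 * Record a failed send at time `now`: start at DELAY_INIT, then
 * multiply the delay by DELAY_MUL on each consecutive failure.
 * Assumption: the timestamp is refreshed on every failure.
 */
static void
backoff_on_failure(struct fwd_backoff *b, time_t now)
{
	if (b->unreach == 0)
		b->delay = DELAY_INIT;
	else
		b->delay *= DELAY_MUL;
	b->unreach = now;
}

/* Assumption: a successful send clears the backoff state. */
static void
backoff_on_success(struct fwd_backoff *b)
{
	b->unreach = 0;
	b->delay = 0;
}
```

In a forwarding path, one would presumably call backoff_should_skip() before attempting sendto(), backoff_on_failure() when sendto() fails with EHOSTUNREACH or EHOSTDOWN, and backoff_on_success() once a message goes through. A real implementation would probably also cap the delay; that is left out here to keep the sketch short.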