Date: Sat, 25 Mar 2000 23:51:03 -0600 (CST)
From: Jonathan Lemon <jlemon@flugsvamp.com>
To: sue@welearn.com.au, hackers@freebsd.org
Subject: Re: syslogd stops logging - caught in the act
Message-ID: <200003260551.XAA37498@prism.flugsvamp.com>
In-Reply-To: <local.mail.freebsd-hackers/20000326140241.C43926@welearn.com.au>

I asked Sue to get a ktrace of the syslogd, and here's the output:

 18869 syslogd  954045445.977145 PSIG  SIGALRM caught handler=0x804b068 mask=0x0 code=0x0
 18869 syslogd  954045445.977343 RET   poll -1 errno 4 Interrupted system call
 18869 syslogd  954045445.977366 CALL  gettimeofday(0xbfbfc5f0,0)
 18869 syslogd  954045445.977382 RET   gettimeofday 0
 18869 syslogd  954045445.977403 CALL  setitimer(0,0xbfbfc5e8,0xbfbfc5d8)
 18869 syslogd  954045445.977424 RET   setitimer 0
 18869 syslogd  954045445.977438 CALL  old.sigreturn(0xbfbfc624)
 18869 syslogd  954045445.977456 RET   old.sigreturn JUSTRETURN
 18869 syslogd  954045445.977476 CALL  poll(0xbfbfc6f0,0x1,0x9c40)
 18869 syslogd  954045475.987785 PSIG  SIGALRM caught handler=0x804b068 mask=0x0 code=0x0
 18869 syslogd  954045475.987859 RET   poll -1 errno 4 Interrupted system call
 18869 syslogd  954045475.987879 CALL  gettimeofday(0xbfbfc5f0,0)
 18869 syslogd  954045475.987895 RET   gettimeofday 0
 18869 syslogd  954045475.987917 CALL  setitimer(0,0xbfbfc5e8,0xbfbfc5d8)
 18869 syslogd  954045475.987938 RET   setitimer 0
 18869 syslogd  954045475.987952 CALL  old.sigreturn(0xbfbfc624)
 18869 syslogd  954045475.987969 RET   old.sigreturn JUSTRETURN
 18869 syslogd  954045475.987990 CALL  poll(0xbfbfc6f0,0x1,0x9c40)
 18869 syslogd  954045505.997954 PSIG  SIGALRM caught handler=0x804b068 mask=0x0 code=0x0
 18869 syslogd  954045505.998120 RET   poll -1 errno 4 Interrupted system call

The poll() calls are from libc/net/res_send, while the gettimeofday()
calls are from the alarm handler (in syslogd). The res_send code does
roughly the following:

        msec = (timeout calculated based on # of tries)
    repeat:
        poll(pfd, 1, msec);
        if (errno == EINTR)
                goto repeat;

What seems to be happening is that once the # of tries grows to a
certain point, the timeout being passed to poll() is larger than the
interval between calls to the SIGALRM handler. Since poll() is
restarted with the full timeout every time it is interrupted, it never
expires, and this leads to an infinite loop.
In the traces above, the poll() timeout is 40000 msec (0x9c40, == 40 sec),
and the alarm handler is called every 30 sec.

The fix should probably be to change res_send.c so that it properly
decrements its timeout value after being interrupted.
--
Jonathan