Date: Tue, 19 Oct 2004 15:56:13 +0200
From: Thomas Quinot
To: freebsd-net@freebsd.org
cc: wpaul@freebsd.org
Message-ID: <20041019135612.GA27971@melusine.cuivre.fr.eu.org>
Subject: yppush going into an endless loop

On a 5.2.1-REL NIS server where NIS maps are updated every hour from
a crontab, I often see yppush going into an endless loop:

0x281220dc in __vfprintf () from /lib/libc.so.5
(gdb) bt
#0  0x281220dc in __vfprintf () from /lib/libc.so.5
#1  0x28121b3f in strchr () from /lib/libc.so.5
#2  0x28122173 in __vfprintf () from /lib/libc.so.5
#3  0x281220a6 in vfprintf () from /lib/libc.so.5
#4  0x2810ed6d in fprintf () from /lib/libc.so.5
#5  0x08049bf2 in __verr (
    fmt=0x804c2c0 "warning: exiting with transfer to %s (transid = %lu) still pending", ap=0x2813d120 "\\`\f")
    at /usr/src/RELENG_5_2/usr.sbin/ypserv/yp_error.c:58
#6  0x08049c58 in yp_error (fmt=0xbbc004d0 "Ð")
    at /usr/src/RELENG_5_2/usr.sbin/ypserv/yp_error.c:71
#7  0x0804a39f in yppush_exit (now=672387024)
    at /usr/src/RELENG_5_2/usr.sbin/yppush/yppush_main.c:195
#8  0x0804a406 in handler (sig=6)
    at /usr/src/RELENG_5_2/usr.sbin/yppush/yppush_main.c:213
#9  <signal handler called>
#10 0x280c1dcf in kill () from /lib/libc.so.5
#11 0x2812ef82 in abort () from /lib/libc.so.5
#12 0x2812d6fe in tcflow () from /lib/libc.so.5
#13 0x2812d72b in tcflow () from /lib/libc.so.5
#14 0x2812e459 in free () from /lib/libc.so.5
#15 0x281090e3 in _nsdbtput () from /lib/libc.so.5
#16 0x2810909b in _nsdbtput () from /lib/libc.so.5
---Type <return> to continue, or q <return> to quit---
#17 0x28108c98 in endhostent () from /lib/libc.so.5
#18 0x2810944f in _nsdbtput () from /lib/libc.so.5
#19 0x2812f384 in exit () from /lib/libc.so.5
#20 0x0804a3c6 in yppush_exit ()
    at /usr/src/RELENG_5_2/usr.sbin/yppush/yppush_main.c:201
#21 0x0804a406 in handler (sig=6)
    at /usr/src/RELENG_5_2/usr.sbin/yppush/yppush_main.c:213
#22 <signal handler called>
#23 0x280c1dcf in kill () from /lib/libc.so.5

Script done on Tue Oct 19 15:19:12 2004

Two questions:

1. Has anyone already observed similar behaviour?
2. From code-reading, it looks like we attempt to prevent yppush_exit
   from wreaking havoc when we call it from a signal handler by
   zeroing out yppush_jobs. Shouldn't that be yppush_joblist?

As far as 2 is concerned, the following change should work around
the problem (but then I'd still have to find out why yppush is
exiting on a signal in the first place).

Index: yppush_main.c
===================================================================
RCS file: /home/ncvs/src/usr.sbin/yppush/yppush_main.c,v
retrieving revision 1.18
diff -u -r1.18 yppush_main.c
--- yppush_main.c	3 May 2003 21:06:41 -0000	1.18
+++ yppush_main.c	19 Oct 2004 13:37:22 -0000
@@ -209,7 +209,7 @@
 handler(int sig)
 {
 	if (sig == SIGTERM || sig == SIGINT || sig == SIGABRT) {
-		yppush_jobs = 0;
+		yppush_joblist = NULL;
 		yppush_exit(1);
 	}

-- 
Thomas.Quinot@Cuivre.FR.EU.ORG
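
The backtrace shows handler() being entered a second time, through the SIGABRT that abort() raises while exit() is still running. Independently of the yppush_joblist change, one common way to make such a handler safe against re-entry is a guard flag combined with _exit(), which skips atexit handlers and stdio. The following is only a minimal sketch under that assumption, not the yppush code: begin_exit(), report_pending(), install_handlers() and the exiting flag are hypothetical names introduced for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical guard: becomes 1 once the exit path has started. */
static volatile sig_atomic_t exiting;

/* Returns 1 on the first call and 0 on every later call. */
int
begin_exit(void)
{
	if (exiting)
		return (0);
	exiting = 1;
	return (1);
}

/* Hypothetical one-shot report; write(2) is async-signal-safe. */
void
report_pending(void)
{
	static const char msg[] = "warning: exiting with transfers pending\n";

	(void)write(STDERR_FILENO, msg, sizeof(msg) - 1);
}

void
handler(int sig)
{
	(void)sig;
	/* Only the first entry reports; a re-entry falls straight through. */
	if (begin_exit())
		report_pending();
	/* _exit() does not run atexit handlers, so free()/abort() cannot recurse. */
	_exit(1);
}

/* Installs handler() for the same three signals as the patch above. */
int
install_handlers(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGTERM, &sa, NULL) == -1 ||
	    sigaction(SIGINT, &sa, NULL) == -1 ||
	    sigaction(SIGABRT, &sa, NULL) == -1)
		return (-1);
	return (0);
}
```

A program would call install_handlers() once at startup. The trade-off is that nothing registered with atexit() runs when the process dies on a signal, which may or may not be acceptable for yppush.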