From: Jilles Tjoelker
To: freebsd-bugs@FreeBSD.org
Subject: Re: kern/129172: [libc] signals are not delivered always
Date: Sat, 24 Oct 2009 18:30:10 GMT
Message-Id: <200910241830.n9OIUA1H034902@freefall.freebsd.org>

The following reply was made to PR kern/129172; it has been noted by GNATS.

From: Jilles Tjoelker
To: bug-followup@FreeBSD.org, Roman.Gritsulyak@gmail.com
Subject: Re: kern/129172: [libc] signals are not delivered always
Date: Sat, 24 Oct 2009 20:26:29 +0200

It seems what you are looking for is not reliable delivery of signals, but queuing of SIGCHLD in particular. This is implemented in FreeBSD 7.0 and newer. In FreeBSD 6 and older, SIGCHLD from child processes is not queued: if another SIGCHLD signal arrives when one is already pending, the two signals are coalesced and the handler is only called once.
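Because pending SIGCHLD signals may be coalesced, a single handler call can stand for several terminated children. The following is a minimal sketch in C of a handler that allows for this by reaping in a loop. The names on_sigchld, reaped and reap_demo are hypothetical and are not taken from the test program in the PR; the driver exists only to exercise the handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far; modified only by the handler. */
static volatile sig_atomic_t reaped;

/*
 * SIGCHLD handler. One invocation may stand for several exited children,
 * so call waitpid() until it returns 0 (none ready) or -1 (no children).
 */
static void on_sigchld(int signo)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/*
 * Hypothetical driver: fork nchildren processes that exit immediately
 * and return how many of them the handler reaped (-1 on setup failure).
 */
int reap_demo(int nchildren)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int i, started = 0;

    reaped = 0;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block SIGCHLD so the check-then-sleep loop below has no race. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);           /* child: terminate right away */
        if (pid > 0)
            started++;
    }

    /* sigsuspend() unblocks SIGCHLD and sleeps until a handler has run. */
    while (reaped < started)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

Calling reap_demo(20) from main() should return 20 whether or not the system queues SIGCHLD, since the loop does not rely on one handler call per child.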
Your test program should work fine if it calls waitpid(-1, NULL, WNOHANG) from the signal handler until it returns 0 or -1.

Even when run on FreeBSD 7, the test program has some problems.

Firstly, it may exit before all the child processes.

Secondly, it assumes that wait() returns terminated child processes in the same order as SIGCHLD signals. Apparently this is the case on Linux, but it is not the case on FreeBSD. Then, when wait() returns status for a different child process than the signal was for, the signal for that child process is dequeued (POSIX prescribes this, and it must be that way to limit the number of pending SIGCHLD signals to the number of child processes). As a result, the zombie for the original child process is never removed and the signal handler is called fewer than 100 times. If you want to wait for one process per signal handler call, you can fix this by making the handler a SA_SIGINFO one, and calling waitpid() with si->si_pid where si is the siginfo_t pointer passed to the handler. Otherwise use the simpler fix I mentioned above. Note that POSIX also says that implementations may avoid queuing if SA_SIGINFO is not enabled, but this is not the case in FreeBSD.

Thirdly, it uses unsafe functions with signal handlers. The use of sem_wait() in a signal handler is not safe (apart from data consistency issues with fast userspace implementations, the risk of deadlock is pretty high -- a signal handler is not a thread). Only sem_post() is async-signal-safe. It seems that the objective for the semaphore is already met by the automatic blocking of a signal while its handler is executing. printf() may also cause problems with signal handlers, and also with fork() (double output).

-- 
Jilles Tjoelker
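The SA_SIGINFO variant described above could look roughly like the sketch below, in C. Waiting for exactly one process per handler call relies on SIGCHLD being queued, so the hypothetical driver reap_by_pid_demo forks only a single child and does not depend on that. The names on_sigchld_info, last_reaped and reap_by_pid_demo are illustrative assumptions, and storing a pid in a sig_atomic_t assumes it fits, which is an assumption made for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* PID of the child most recently reaped by the handler, 0 if none yet.
   Assumes a pid fits in sig_atomic_t. */
static volatile sig_atomic_t last_reaped;

/*
 * SA_SIGINFO handler: reap exactly the child this signal was sent for,
 * identified by si->si_pid, instead of whichever child wait() returns.
 */
static void on_sigchld_info(int signo, siginfo_t *si, void *ctx)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */

    (void)signo;
    (void)ctx;
    if (waitpid(si->si_pid, NULL, WNOHANG) == si->si_pid)
        last_reaped = (sig_atomic_t)si->si_pid;
    errno = saved_errno;
}

/*
 * Hypothetical driver: fork one child that exits immediately and return
 * 1 if the handler reaped that child, 0 if it reaped another pid, and
 * -1 on setup failure.
 */
int reap_by_pid_demo(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t pid;

    last_reaped = 0;
    sa.sa_sigaction = on_sigchld_info;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block SIGCHLD so the check-then-sleep loop below has no race. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    pid = fork();
    if (pid == 0)
        _exit(0);               /* child: terminate right away */
    if (pid == -1) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* sigsuspend() unblocks SIGCHLD and sleeps until a handler has run. */
    while (last_reaped == 0)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return last_reaped == (sig_atomic_t)pid;
}
```

The handler calls only waitpid(), which is async-signal-safe, and needs no semaphore because SIGCHLD is blocked automatically while it runs.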