Date: Sun, 30 Mar 2003 19:22:55 -0800 (PST)
From: Mikko Työläjärvi
To: Sean Hamilton
cc: hackers@freebsd.org
Subject: Re: wait()/alarm() race condition

On Sun, 30 Mar 2003, Sean Hamilton wrote:

> Dan Nelson wrote:
> | Just make sure your signal handler has the SA_RESTART flag unset
> | (either via siginterrupt() if the handler was installed with signal(),
> | or directly if the signal was installed with sigaction() ), and the
> | signal will interrupt the wait() call.
>
> Er, I think you've missed my problem. Or I'm not getting your solution.
>
> I'm concerned about this order of events:
>
> - alarm()
> - wait() returns successfully
> - if (alarmed...) [false]
> - SIGALRM is delivered, alarmed = true
> - loop
> - wait() waits indefinitely
>
> This is incredibly unlikely to ever happen, but it's irritating me somewhat
> that the code isn't airtight. Bad design. Surely there is some atomic means
> of setting a timeout on a system call.

My stock solution to this kind of problem is to turn those pesky signals
into I/O and use an old-fashioned select() loop to handle them: create a
pipe(2), let the signal handlers write one-byte "messages" (the signal
number) into the pipe, and then use select() to dequeue the events
(signals) from the pipe.

select() has a timeout parameter you can play with to your heart's content,
and provided you don't overflow the pipe, no events will get lost. You'd
have to install a handler for SIGCHLD, of course.

$.02,
/Mikko