Date: Fri, 11 Oct 2002 06:12:03 -0700 (PDT)
From: Don Lewis
Subject: Re: [jmallett@FreeBSD.org: [PATCH] Reliable signal queues, etc., [for review]]
To: rwatson@FreeBSD.ORG
Cc: dl-freebsd@catspoiler.org, jmallett@FreeBSD.ORG, wollman@lcs.mit.edu, arch@FreeBSD.ORG

On 11 Oct, Robert Watson wrote:
> On Fri, 11 Oct 2002, Don Lewis wrote:
>
>> > I'm playing with an idea in my head such that:
>> > in a signal queuer/sender(not sendsig, that's really md_postsig -- it posts
>> > a signal),
>> > 1. signal_add is called.
>> >    a. Does this signal have an SA_SIGINFO handler?
>> >       I. Were we given a ksiginfo to queue?
>> >          1. Allocate one... Does that fail?
>> >             a. Invoke an OOM killer, or such.
>>
>> Solaris returns an EAGAIN to the caller and the target is unaffected. If
>> the caller really wants to nuke the target, it could retry with kill().
>> The same error will be returned if there are too many signals in the
>> target's queue, which should prevent the signal queue for a wedged
>> process from consuming all of kmem.
>
> Agreed. I think it would be best if the signal code itself didn't kill
> processes (Well, with the exception of cases where it is supposed to :-)
> to reclaim resources. Or, if that's the best place to put it, the caller
> should definitely be able to indicate its disposition with regards to
> failure modes. The temptation would be (assuming this was feasible):
>
> 1 If the target isn't doing anything special for the signal, don't pay the
>   price of reliable delivery.
> 2 If the target is doing something special for the signal, allow the code
>   attempting to deliver the signal figure out what to do if it fails.
>
> I know that (2) is possible, because Linux does that. I don't know
> much/anything about (1), but the conversation seems suggestive that that
> is possible. I'd be comfortable with this route as the experimental
> direction to see how well it all pulls together in the Perforce branch.
> However, for each case where we're considering (2) for a kernel generated
> signal, we need to determine what (if any) failure mode is appropriate.
> That would probably take looking at the specs closely, looking at other
> implementations, etc.

Alas, RH 7.3 doesn't seem to have a man page for sigqueue(), so I don't
know much about its failure modes. The sigaction() man page describes
all sorts of wonderful things that can be returned in the siginfo
structure.

One thing in the Solaris implementation that is not in the Linux
implementation is the SI_NOINFO value for si_code, which indicates that
no other information is being returned. Nothing needs to be allocated on
the kernel side to implement this, and it looks like a reasonable
precedent for doing an incremental implementation.

I wonder if the Linux version actually queues the information for
SIGSEGV, etc.
If the info is only returned if the signal is enabled when the
error occurs, then the info could be just copied back to user space by
the trap handler (well, it's not that easy because of the way we have to
return to user space to invoke the signal handler ...).

It should be easy to cheat for SIGCHLD. The information can be
harvested from the not yet waited-for child process.

The Linux implementation appears to have made provisions for returning
information for SIGIO, but it doesn't appear to be implemented yet. I
wonder why that is ...
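
For concreteness, here is a rough userland sketch of the sender-side
behaviour discussed above: try sigqueue(), and if it fails with EAGAIN,
retry with plain kill(). The names send_with_fallback(), install_handler(),
on_signal(), last_code and last_value are made up for illustration and come
from no implementation. The handler only trusts si_value when si_code is
SI_QUEUE; a system with Solaris's SI_NOINFO could report that code for the
fallback case, but the sketch does not depend on it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping: what the last handler invocation saw. */
static volatile sig_atomic_t last_code;
static volatile sig_atomic_t last_value;

/* SA_SIGINFO handler: read the payload only if the signal was queued. */
static void on_signal(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    last_code = info->si_code;
    last_value = (info->si_code == SI_QUEUE) ? info->si_value.sival_int : -1;
}

/* Install on_signal() for signo; returns 0 on success, -1 on error. */
int install_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/*
 * Hypothetical helper: returns 1 if the signal was queued with its value,
 * 0 if sigqueue() failed with EAGAIN and kill() succeeded (the value is
 * lost), and -1 on any other error.
 */
int send_with_fallback(pid_t pid, int signo, int value)
{
    union sigval sv;

    sv.sival_int = value;
    if (sigqueue(pid, signo, sv) == 0)
        return 1;
    if (errno != EAGAIN)
        return -1;
    /* Queue full or no memory: the target was unaffected, so retry. */
    return (kill(pid, signo) == 0) ? 0 : -1;
}
```

The point of the fallback is that the policy decision stays with the
sender, not the signal code: a caller that cares about the payload can
treat a return of 0 as a soft failure, and one that only wants the target
to notice can ignore the difference.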