Date:      Fri, 11 Oct 2002 08:52:59 -0400 (EDT)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Juli Mallett <jmallett@FreeBSD.org>
Cc:        Don Lewis <dl-freebsd@catspoiler.org>, wollman@lcs.mit.edu, arch@FreeBSD.org
Subject:   Re: [jmallett@FreeBSD.org: [PATCH] Reliable signal queues, etc., [for review]]
Message-ID:  <Pine.NEB.3.96L.1021011083816.42071C-100000@fledge.watson.org>
In-Reply-To: <20021011053720.A2431@FreeBSD.org>

On Fri, 11 Oct 2002, Juli Mallett wrote:

> > Solaris returns an EAGAIN to the caller and the target is unaffected. If
> > the caller really wants to nuke the target, it could retry with kill().
> > The same error will be returned if there are too many signals in the
> > target's queue, which should prevent the signal queue for a wedged
> > process from consuming all of kmem.
> 
> Uhm, not really.  Retrying with SIGKILL won't result in the signal being
> queued.

I think you may be missing the thrust.  There are two sources of signals
in the world:

(1) User processes signalling each other or themselves.

(2) Kernel services signalling user processes in response to a trap or an
    event.

In both cases, we're talking about an EAGAIN error being returned to the
source of the signal if insufficient resources are available, and in both
cases, we may be interested in a fail-stop approach.  The case I believe
Don is talking about specifically is this one:

  Application boomctl tries to deliver SIGUSR1 to boomd, the reliable boom
  daemon.  boomctl gets back EAGAIN because the kernel does not have the
  resources to reliably deliver the signal, and boomd has a handler for
  SIGUSR1.  boomctl/boomd have fail-stop semantics, so boomctl calls
  kill(boomd_pid, SIGKILL).  Or, if it doesn't care about the failure very
  much, it queues the notification for delivery via some other,
  non-asynchronous form of IPC.

This permits fail-stop semantics where they are needed, but doesn't force
them on applications that would rather not stop.
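
A minimal sketch of the boomctl side, assuming a POSIX-style sigqueue()
that fails with EAGAIN when the signal cannot be queued.  The helper
notify_daemon() and the notify_result names are hypothetical, made up for
illustration; they are not part of any patch under review.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical result codes for the caller to act on. */
enum notify_result {
    NOTIFY_DELIVERED,   /* signal was queued for the target */
    NOTIFY_KILLED,      /* queueing failed; target was killed (fail-stop) */
    NOTIFY_DEFERRED,    /* queueing failed; caller should use other IPC */
    NOTIFY_ERROR        /* some other error; errno is left set */
};

/*
 * Try to queue signo for pid.  If the kernel reports EAGAIN, either
 * fall back to SIGKILL (fail_stop != 0) or tell the caller to deliver
 * the notification some other way (fail_stop == 0).
 */
enum notify_result
notify_daemon(pid_t pid, int signo, int fail_stop)
{
    union sigval value;

    value.sival_int = 0;
    if (sigqueue(pid, signo, value) == 0)
        return NOTIFY_DELIVERED;
    if (errno != EAGAIN)
        return NOTIFY_ERROR;
    if (!fail_stop)
        return NOTIFY_DEFERRED;
    /* Fail-stop: plain kill() with SIGKILL does not need a queue slot. */
    return (kill(pid, SIGKILL) == 0) ? NOTIFY_KILLED : NOTIFY_ERROR;
}
```

The policy lives entirely in the caller: the kernel only reports EAGAIN,
and the application decides whether that is fatal for the target.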

Another case to consider is that of init.  Init may be interested in
SIGCHLD with process information, but not so interested that it wants to
be terminated if the pid can't be delivered with a siginfo; it can always
call wait().  You care a lot about reliable init behavior in a
memory-constrained situation because if init dies, your system either
halts or panics, depending on the circumstances.
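
To illustrate the "it can always call wait()" point, here is a rough
sketch of an init-like process that treats SIGCHLD purely as a hint and
recovers the pids by polling waitpid().  The function names are
hypothetical and this is not taken from the real init; it only assumes
standard POSIX sigaction() and waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Set by the handler; only records that some child changed state. */
static volatile sig_atomic_t child_exited;

static void
on_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;
}

/* Install a plain handler: no SA_SIGINFO, so no siginfo is required. */
int
install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Reap every child that has already exited; returns how many. */
int
reap_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    child_exited = 0;
    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;       /* pid and status come from waitpid() */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;
        break;              /* 0: none ready; -1/ECHILD: no children */
    }
    return reaped;
}
```

Because the pid and exit status come from waitpid() rather than from a
queued siginfo, a failure to queue the extra information costs nothing
beyond a slightly less direct lookup.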

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
