From owner-freebsd-arch  Thu Mar  7 22:36:22 2002
From: Nate Williams <nate@yogotech.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15496.23508.148366.980354@caddis.yogotech.com>
Date: Thu, 7 Mar 2002 23:36:04 -0700
To: Julian Elischer <julian@elischer.org>
Cc: Nate Williams <nate@yogotech.com>,
	Poul-Henning Kamp <phk@critter.freebsd.dk>, arch@FreeBSD.ORG
Subject: Re: Contemplating THIS change to signals. (fwd)
In-Reply-To: <Pine.BSF.4.21.0203072034130.46043-100000@InterJet.elischer.org>
References: <15496.14272.351722.199146@caddis.yogotech.com>
	<Pine.BSF.4.21.0203072034130.46043-100000@InterJet.elischer.org>
X-Mailer: VM 6.96 under 21.1 (patch 14) "Cuyahoga Valley" XEmacs Lucid
Reply-To: nate@yogotech.com (Nate Williams)

> > > > > My suggestion is to stop making STOP type signals an exception,
> > > > > because it should not be necessary to stop them in the middle of a
> > > > > syscall, just stop them from getting back to userspace.
> > > > 
> > > > What about when you suspend a process in the middle of read/write, which
> > > > are syscalls?  This kind of behavior is *extremely* common-place
> > > 
> > > hmm can you explain what you mean? I can't think of anything 
> > > that would change..
> > 
> > 'read' is a system call.  If a program is sitting in a read (waiting for
> > user input), this system call must be interruptible.
> > 
> 
> or not, either way it doesn't matter.
> 
> The important thing is that no matter what happens it doesn't return to
> the user while the user is suspended.

However, control is returned to the parent process (such as the shell),
correct?  Something is returning to userland.
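
To make the "something is returning to userland" part concrete, here is a
minimal userland sketch, not kernel code. It assumes a POSIX system and uses
only standard calls from Python's os and signal modules; the function name
stop_child_blocked_in_read is made up for illustration. A child blocks in
read(2) on a pipe, the parent sends SIGSTOP, and waitpid() with WUNTRACED
hands control back to the parent while the child's read is still pending.
Note that this only observes what the parent sees, so it cannot tell the
current scheme and the proposed one apart.

```python
import os
import signal


def stop_child_blocked_in_read():
    """Stop a child that is blocked in read() and report what the parent sees."""
    to_child_r, to_child_w = os.pipe()      # parent -> child
    from_child_r, from_child_w = os.pipe()  # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: block in read(2) until the parent sends something.
        try:
            os.close(to_child_w)
            os.close(from_child_r)
            data = os.read(to_child_r, 16)
            os.write(from_child_w, data.upper())
        finally:
            os._exit(0)

    # Parent: close the pipe ends that belong to the child.
    os.close(to_child_r)
    os.close(from_child_w)

    # Stop the child; WUNTRACED makes waitpid() report the stop to us.
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    stopped = os.WIFSTOPPED(status)
    stop_signal = os.WSTOPSIG(status) if stopped else None

    # Data written now sits in the pipe; the child answers only after CONT.
    os.write(to_child_w, b"hello")
    os.kill(pid, signal.SIGCONT)
    reply = os.read(from_child_r, 16)

    _, status = os.waitpid(pid, 0)
    os.close(to_child_w)
    os.close(from_child_r)
    return stopped, stop_signal, reply, os.WIFEXITED(status)
```

Calling stop_child_blocked_in_read() should report that the child was
stopped by SIGSTOP and that its reply arrived only after the SIGCONT.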

> currently: If the sleep is interruptible (PCATCH) then the code will
> receive the STOP. It will not wake up, but transition into the STOPPED
> state (the timer keeps running however).

I'm with you so far.  (Although saying that 'the timer keeps running' is
misleading, since the 'timer' isn't created on a per-process basis.)

> When the timer terminates, or the
> data arrives it attempts to wake up the process, but noticing it is
> STOPPED, doesn't actually put it on the run queue. Later when the CONT is
> received, it is put on the run queue, reads the data (or not) and returns
> to user mode.

Still with you.

> Until CONT is received it is holding resources etc.

Hopefully not. :)

> If the
> CONT happens before the data arrives or the timeout expires the process is
> put back into SLEEP mode.. When the data arrives or the timeout happens, 
> it doesn't notice it was STOPPED before.

Exactly.

> My suggestion: If the sleep is interruptible (PCATCH), a flag is set
> saying a STOP was seen, no other action. When the data arrives or the
> timeout expires, it reads the data (or not) and proceeds to the user
> boundary.

In effect, the context is actually still active inside the kernel, even
though the userland process is never given notice.  I could see this
having *really* strange effects for things that do network I/O.
Often-times I've suspended processes inside of gdb so I could set
breakpoints knowing full well that *NOTHING* was going to happen, even
if I was in the middle of a system call.

I'm still not getting a warm fuzzy that allowing the userland context to
complete and then block at the userland boundary is a good idea.  I'm
not saying it's a bad idea, but I'm almost positive there are gremlins
hiding in the details here. :)
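
To pin down the difference being discussed, here is a toy state machine
for a single thread sleeping with PCATCH. It is only a sketch of the two
schemes as described in this thread, not the kernel's code: the state
names, the event names ('stop', 'cont', 'wakeup') and the function replay
are all invented for illustration. The interesting output is
work_done_at, the point at which the syscall actually does its work.

```python
from enum import Enum, auto


class State(Enum):
    SLEEPING = auto()          # blocked in an interruptible (PCATCH) sleep
    STOPPED = auto()           # current scheme: stopped inside the syscall
    HELD_AT_BOUNDARY = auto()  # proposed scheme: syscall done, held before userland
    IN_USERLAND = auto()       # syscall has returned to the program


def replay(events, proposed):
    """Replay 'stop', 'cont' and 'wakeup' events against one sleeping thread.

    Returns (trace, work_done_at): the state after each event, and the
    index of the event at which the syscall did its work (None if never).
    """
    state = State.SLEEPING
    stop_seen = False        # proposed scheme: the "a STOP was seen" flag
    wakeup_pending = False   # current scheme: wakeup arrived while STOPPED
    work_done_at = None
    trace = []
    for i, ev in enumerate(events):
        if proposed:
            if ev == "stop":
                stop_seen = True              # flag only, no state change
            elif ev == "cont":
                stop_seen = False
                if state is State.HELD_AT_BOUNDARY:
                    state = State.IN_USERLAND
            elif ev == "wakeup" and state is State.SLEEPING:
                work_done_at = i              # syscall runs even if stopped
                state = (State.HELD_AT_BOUNDARY if stop_seen
                         else State.IN_USERLAND)
        else:
            if ev == "stop" and state is State.SLEEPING:
                state = State.STOPPED
            elif ev == "cont" and state is State.STOPPED:
                if wakeup_pending:
                    work_done_at = i          # syscall runs only after CONT
                    state = State.IN_USERLAND
                else:
                    state = State.SLEEPING    # back to sleep, as if never stopped
            elif ev == "wakeup":
                if state is State.STOPPED:
                    wakeup_pending = True     # noted, but not put on run queue
                elif state is State.SLEEPING:
                    work_done_at = i
                    state = State.IN_USERLAND
        trace.append(state)
    return trace, work_done_at
```

For the sequence stop, wakeup, cont, the current scheme does the work at
the CONT (index 2) and the proposed scheme does it at the wakeup (index
1), while the process is nominally stopped.  That is exactly the case I
am worried about when suspending something from gdb.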

> It waits there for the CONT signal. If the CONT happens before
> the data arrives or the timeout expires the flag is simply cleared again.

In effect, you're avoiding moving the process from the STOPPED queue and
back to the SLEEP queue.

> It doesn't make such a huge difference now but when there are multiple
> threads in a process, being able to report to the controlling shell,
> or the debugger "OK it's STOPPED" as soon as I can prove that all
> threads are in the kernel and not in user space makes a big
> difference.

Right, but currently, stopping a process/thread stops it *everywhere*,
including the kernel.  We're changing something that has been done this
way for a very long time, and the POLA may come back and bite us.

> Also, the bit for "this process is STOPPING" can be set by another
> thread and all threads will see it. Saving me from having to run
> around all threads and do all sorts of stuff with each.

Except that there are now locking issues since the 'other thread' is now
messing around with parts of the thread that may be in use by the thread
itself.  (*ick*)
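
Here is a userland analogy of the shared "STOPPING" bit, written with
Python threads.  It is a sketch under my own assumptions, not the KSE
code: the class Proc, its methods, and demo are all hypothetical names.
The flag and the count of threads in "userland" are touched only under
one lock (a condition variable), which is the kind of locking the real
thing would need once another thread can set the bit.

```python
import threading
import time


class Proc:
    """Toy process: one shared 'stopping' flag checked at the user boundary."""

    def __init__(self, nthreads):
        self.cond = threading.Condition()
        self.stopping = False
        self.in_user = nthreads  # threads currently running "userland" code

    def enter_kernel(self):
        with self.cond:
            self.in_user -= 1
            self.cond.notify_all()  # stop() may be waiting for this

    def return_to_user(self):
        with self.cond:
            while self.stopping:    # hold here until CONT
                self.cond.wait()
            self.in_user += 1

    def stop(self):
        """Set the flag, then wait until no thread is in userland."""
        with self.cond:
            self.stopping = True
            while self.in_user > 0:
                self.cond.wait()

    def cont(self):
        with self.cond:
            self.stopping = False
            self.cond.notify_all()


def demo(nthreads=4, iterations=200):
    proc = Proc(nthreads)
    steps = [0] * nthreads  # one slot per thread, so no sharing between workers

    def worker(i):
        for _ in range(iterations):
            steps[i] += 1          # "userland" work
            proc.enter_kernel()    # "syscall" entry
            proc.return_to_user()  # blocks here while the process is stopping
        proc.enter_kernel()        # thread exit also leaves userland

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(nthreads)]
    for t in threads:
        t.start()

    proc.stop()                    # returns once userland is empty
    frozen = list(steps)
    time.sleep(0.05)
    unchanged = frozen == steps    # nothing ran in userland while stopped

    proc.cont()
    for t in threads:
        t.join()
    return unchanged, sum(steps)
```

Here stop() can report "stopped" as soon as every thread is inside the
"kernel" or held at the boundary, without visiting each thread, but
only because every access to the flag goes through the same lock.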

> I don't necessarily want to do this in -current on its own.  I have
> code that does this in the KSE diffs. I want to know if I've forgotten
> anything because when I commit the KSE diffs I don't want to discover
> I've broken everything.

I'm sure you have, but probably not intentionally.  There is almost
certainly software that relies on the existing behavior, and will behave
differently (perhaps not wrong) with the above-described changes.

> It is possible I might be able to
> commit this change separately beforehand so that it's an isolated change,
> however, and if I can I might think about doing that.

Maybe a sysctl to turn it off/on?  However, I suspect it would be
*really* hard to keep both behaviors working side by side.

Thanks for the more detailed explanation!!


Nate
