Date:      Fri, 8 Mar 2002 09:18:06 -0700
From:      Nate Williams <nate@yogotech.com>
To:        Julian Elischer <julian@elischer.org>
Cc:        Nate Williams <nate@yogotech.com>, Poul-Henning Kamp <phk@critter.freebsd.dk>, arch@FreeBSD.ORG
Subject:   Re: Contemplating THIS change to signals. (fwd)
Message-ID:  <15496.58430.16748.970354@caddis.yogotech.com>
In-Reply-To: <Pine.BSF.4.21.0203080017330.46841-100000@InterJet.elischer.org>
References:  <15496.23508.148366.980354@caddis.yogotech.com> <Pine.BSF.4.21.0203080017330.46841-100000@InterJet.elischer.org>

> > In effect, the context is actually still active inside the kernel, even
> > though the userland process is never given notice.  I could see this
> > having *really* strange effects for things that do network I/O.
> > Often-times I've suspended processes inside of gdb so I could set
> > breakpoints knowing full well that *NOTHING* was going to happen, even
> > if I was in the middle of a system call.
> 
> You'd be surprised then because once the send() is done, the network IO
> will happen independently of the process.

I'm thinking more of send().  Once the send() system call has queued the
data for sending, it's been 'sent' (i.e., the stack has it, and will
'DTRT' with it).
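
As a rough userland illustration of "once it's queued, the stack has it",
the sketch below sends on one end of a socket pair, closes the sending end,
and only then receives on the other end.  It uses a local AF_UNIX
socketpair() as a stand-in for a network socket, which is an assumption
made to keep the example self-contained; the function name is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send msg, close the sending end, then receive.
 * Returns the number of bytes the receiver gets, or -1 on error.
 * AF_UNIX is used here only as a local stand-in for a network socket.
 */
ssize_t bytes_delivered_after_sender_closed(const char *msg)
{
    int sv[2];
    char buf[64];
    size_t len;
    ssize_t n;

    assert(msg != NULL);
    len = strlen(msg);
    if (len == 0 || len > sizeof(buf))
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    /* After send() returns, the data sits in the kernel's socket buffer. */
    if (send(sv[0], msg, len, 0) != (ssize_t)len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The sender's descriptor goes away; the queued data does not. */
    close(sv[0]);

    n = recv(sv[1], buf, sizeof(buf), 0);
    close(sv[1]);
    return n;
}
```

Calling bytes_delivered_after_sender_closed("hello") should report that all
five bytes arrive even though the sender's end was closed before the receive.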

> this is no different.

Except for read() or recvfrom() system calls, and potentially things
like 'sendfile()'.  Also, write() may behave differently (since write()
involves disk writing, not network writing).

> from the time you do the ^Z to the time the syscall thinks of returning is
> how long? If you say 3 seconds then all that is different is that in my
> case the data has been taken off the queue but previously it would have
> still been on the queue, but since the process is stopped,
> who can tell?

A lot can happen in 3 seconds. :)

> In fact if the data was already present then sleep() would
> have never been called, so the blocking would (even now) happen
> at the user boundary. All I'm doing is making it consistent.

Agreed.

> > I'm still not getting a warm fuzzy that allowing the userland context to
> > complete and then block at the userland boundary is a good idea.  I'm
> > not saying it's a bad idea, but I'm almost positive there are gremlins
> > hiding in the details here. :)
> 
> Userland context? where does userland context come into it?

Sorry, I meant 'kernel context' above.  My bad.  I'll repeat.

I'm still not getting a warm fuzzy that allowing the kernel context to
complete and then block at the userland boundary is a good idea.  I'm
not saying it's a bad idea, but I'm almost positive there are gremlins
hiding in the details here. :)

> > > It waits there for the CONT signal. If the CONT happens before
> > > the data arrives or the timeout expires the flag is simply cleared again.
> > 
> > In effect, you're avoiding moving the process from the STOPPED queue and
> > back to the SLEEP queue.
> 
> I'm allowing it to STAY in the SLEEP queue, knowing that when it decides
> to continue, it cannot get back to userland, because I've blocked that
> road with a fallen tree.

I know, but prior to this change, it moved to the STOPPED queue, and
then back to the SLEEP queue upon receipt of SIGSTOP and SIGCONT
signals.
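
To make the proposed behavior concrete, here is a minimal userland model
using pthreads.  It is not kernel code: struct proc_model, its fields, and
both functions are hypothetical names invented for the sketch.  The
"syscall" thread finishes its kernel-side work while a stop is pending, then
parks at the user boundary until the flag is cleared by a CONT.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for per-process state; not the real struct proc. */
struct proc_model {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int stopping;     /* set by STOP, cleared by CONT */
    int kernel_done;  /* kernel-side work of the "syscall" has finished */
    int in_userland;  /* thread got past the user boundary */
};

/* Kernel-side work completes despite the pending stop; the thread then
 * blocks at the user boundary for as long as the stopping flag is set. */
static void *syscall_thread(void *arg)
{
    struct proc_model *p = arg;

    pthread_mutex_lock(&p->lock);
    p->kernel_done = 1;
    pthread_cond_broadcast(&p->cv);
    while (p->stopping)                 /* the "fallen tree" */
        pthread_cond_wait(&p->cv, &p->lock);
    p->in_userland = 1;
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Starts with a stop pending, records whether the thread reached userland
 * before and after the CONT.  Returns 0 on success, -1 on error. */
int run_boundary_demo(int *before_cont, int *after_cont)
{
    struct proc_model p;
    pthread_t t;

    assert(before_cont != NULL && after_cont != NULL);
    pthread_mutex_init(&p.lock, NULL);
    pthread_cond_init(&p.cv, NULL);
    p.stopping = 1;
    p.kernel_done = 0;
    p.in_userland = 0;

    if (pthread_create(&t, NULL, syscall_thread, &p) != 0)
        return -1;

    pthread_mutex_lock(&p.lock);
    while (!p.kernel_done)
        pthread_cond_wait(&p.cv, &p.lock);
    *before_cont = p.in_userland;       /* still parked at the boundary */
    p.stopping = 0;                     /* CONT arrives */
    pthread_cond_broadcast(&p.cv);
    pthread_mutex_unlock(&p.lock);

    if (pthread_join(t, NULL) != 0)
        return -1;
    *after_cont = p.in_userland;

    pthread_cond_destroy(&p.cv);
    pthread_mutex_destroy(&p.lock);
    return 0;
}
```

In this model the thread never changes queues; it simply cannot cross the
boundary while the flag is set, which is the part of the proposal that
differs from the stop-and-requeue behavior described above.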

> > > It doesn't make such a huge difference now but when there are multiple
> > > threads in a process, being able to report to the controlling shell,
> > > or the debugger "OK it's STOPPED" as soon as I can prove that all
> > > threads are in the kernel and not in user space makes a big
> > > difference.
> > 
> > Right, but currently, stopping a process/thread stops it *everywhere*,
> > including the kernel.  We're changing something that has been done this
> > way for a very long time, and the POLA may come back and bite us.
> 
> No it doesn't stop it everywhere. If you have not set PCATCH or if you
> do not sleep you will proceed forwards oblivious of the pending signal,
> until you try to return to userland, at which point you will be stopped,
> just as I am suggesting, except I am suggesting that we should ALWAYS
> stop it there.

I'm assuming we're talking about processes with PCATCH set, since that's
the behavior that will change.  But, thanks for clarifying.

> > > Also, the bit for "this process is STOPPING" can be set by another
> > > thread and all threads will see it. Saving me from having to run
> > > around all threads and do all sorts of stuff with each.
> > 
> > Except that there are now locking issues since the 'other thread' is now
> > messing around with parts of the thread that may be in use by the thread
> > itself.  (*ick*)
> 
> No, it is in the proc struct.
> All threads sharing the same proc see the same bit.

Then we have the potential for race issues, since it could be possible
for one thread to get a SIGSTOP, and another to get a SIGCONT very soon
after, so we must guarantee that they are handled in the order they were
received.  Doing this 'safely' almost always involves locking.
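
A small sketch of what I mean by locking, again as a userland pthread model
rather than kernel code.  The struct, the enum, and every function name here
are hypothetical.  Each STOP or CONT is applied under one mutex, so the
shared flag always reflects whichever request was applied last, and no
update is lost when two threads post at nearly the same time.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum sig_op { OP_STOP, OP_CONT };

/* Hypothetical model of the shared per-process bit and its lock. */
struct proc_flags {
    pthread_mutex_t lock;
    int stopping;          /* the shared "process is STOPPING" bit */
    unsigned long seq;     /* number of requests applied so far */
    enum sig_op last_op;   /* most recently applied request */
};

/* Apply one request atomically with respect to other posters. */
static void post_signal(struct proc_flags *p, enum sig_op op)
{
    pthread_mutex_lock(&p->lock);
    p->seq++;
    p->last_op = op;
    p->stopping = (op == OP_STOP);
    pthread_mutex_unlock(&p->lock);
}

struct poster_arg {
    struct proc_flags *p;
    enum sig_op op;
    int count;
};

static void *poster(void *arg)
{
    struct poster_arg *a = arg;
    int i;

    for (i = 0; i < a->count; i++)
        post_signal(a->p, a->op);
    return NULL;
}

/* Two threads post STOP and CONT concurrently, count times each.
 * Reports the total applied and whether the flag matches the last request.
 * Returns 0 on success, -1 on error. */
int run_ordering_demo(int count, unsigned long *seq_out, int *consistent_out)
{
    struct proc_flags p;
    struct poster_arg stop_arg, cont_arg;
    pthread_t t1, t2;

    assert(count > 0 && seq_out != NULL && consistent_out != NULL);
    pthread_mutex_init(&p.lock, NULL);
    p.stopping = 0;
    p.seq = 0;
    p.last_op = OP_CONT;

    stop_arg.p = &p; stop_arg.op = OP_STOP; stop_arg.count = count;
    cont_arg.p = &p; cont_arg.op = OP_CONT; cont_arg.count = count;

    if (pthread_create(&t1, NULL, poster, &stop_arg) != 0)
        return -1;
    if (pthread_create(&t2, NULL, poster, &cont_arg) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    *seq_out = p.seq;
    *consistent_out = (p.stopping == (p.last_op == OP_STOP));
    pthread_mutex_destroy(&p.lock);
    return 0;
}
```

The lock does not decide which of the two signals wins; it only guarantees
that whatever order they are applied in is a single, well-defined order.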



Nate
