From owner-freebsd-hackers  Sat Sep 27 15:18:07 1997
Return-Path: <owner-freebsd-hackers>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id PAA15703
          for hackers-outgoing; Sat, 27 Sep 1997 15:18:07 -0700 (PDT)
Received: from usr08.primenet.com (tlambert@usr08.primenet.com [206.165.6.208])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id PAA15698
          for <freebsd-hackers@FreeBSD.ORG>; Sat, 27 Sep 1997 15:18:02 -0700 (PDT)
Received: (from tlambert@localhost)
	by usr08.primenet.com (8.8.5/8.8.5) id PAA13667;
	Sat, 27 Sep 1997 15:17:53 -0700 (MST)
From: Terry Lambert <tlambert@primenet.com>
Message-Id: <199709272217.PAA13667@usr08.primenet.com>
Subject: Re: ee taking up weird cpu amount.
To: joerg_wunsch@uriah.heep.sax.de
Date: Sat, 27 Sep 1997 22:17:52 +0000 (GMT)
Cc: freebsd-hackers@FreeBSD.ORG
In-Reply-To: <19970927145007.HB02894@uriah.heep.sax.de> from "J Wunsch" at Sep 27, 97 02:50:07 pm
X-Mailer: ELM [version 2.4 PL23]
Content-Type: text
Sender: owner-freebsd-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk

> > > I've seen this problem on Illtrix on an old Vax2000 as well.  To what
> > > shell are your claims above related?
> > 
> > None of them, of course.
> 
> Huh?  ``To what shell?'' -- ``None of them.''  Am I in the wrong
> movie?
> 
> >  This has only to do with the order of revoke()
> > processing when revoke() is a result of on-to-off-DCD transition.
> 
> The foreground process group gets properly signalled.  I've been using
> a FreeBSD-based ISP for long enough to know that it works.

Then under what circumstances is it ever possible to get a process
buzzing in a "read returns 0 bytes" loop?  If your claim is true,
that it works flawlessly, then the process will get a SIGHUP instead
because it's in the foreground.

And the error he was reporting didn't happen.

So he must be hallucinating.

;-).


> > > ksh has a weird (mis-)feature to lead all its kids to death when it
> > > dies itself.
> > 
> > This, of course, has to do with ksh's method of backgrounding a
> > process.
> 
> No.  It hasn't.  ksh properly shuffles each job into a separate
> process group (unlike the non-jobcontrol prehistoric /bin/sh).  It
> *purposely* kills all its children before dying.  That's why you can
> achieve the csh's default behaviour by logging out with ``kill -9 $$''
> -- the background process groups will then behave like they do in csh.

Uh... how does this differ from what I said?


> > The Bourne shell specifically distinguished children this way via
> > the "nohup" mechanism.  The csh implies "nohup" for all subshells,
> 
> No.  The children are still sensitive to a SIGHUP if you send it to
> them.  The shell doesn't send it to them iff they are _running in
> background_ when the shell exits.  (Stopped jobs will be reaped
> nevertheless.)  This is why the manual says that running background
> jobs are ``effectively nohuped''.  They are not really nohuped.

Which manual are you reading?

     DESCRIPTION
          The nohup utility invokes command with its arguments and at
          this time sets the signal SIGHUP to be ignored. The signal
          SIGQUIT may also be set to be ignored.

8-).


> > > Btw., nvi doesn't suffer from this behaviour.  Ignoring an error
> > > return from the input device is always an error on the side of the
> > > program in question.
> > 
> > An error return is impossible in the normal signal propagation case,
> 
> That's no reason for curses to assume an error could never happen.

Yes, it is.  It's on the order of doing a select on an fd attached to
a disk file to "see if it's readable".  The answer is "YES" before you
ask the question; why ask an inane question?  Like "I'm not catching
SIGHUP, so the default behaviour of terminating the program when the
tty is revoked and I'm sent SIGHUP is in effect, so should I ask if
that read from the tty failed because of EOF from the tty being
revoked?"

The answer is "NO" before you ask the question.
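
The select comparison is easy to try out.  A minimal sketch, assuming
POSIX select(2); fd_is_readable() is a hypothetical helper name, not
something taken from curses or ee:

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: ask select(2), without blocking, whether fd is
 * readable right now.  Returns 1 if yes, 0 if no, -1 on error.
 */
static int
fd_is_readable(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };   /* zero timeout: poll, don't wait */
    int r;

    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    r = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (r < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}
```

POSIX says descriptors for regular files always select true for
reading, so on a disk file this reports "readable" whether or not
there is any data; on an empty pipe or an idle tty it does not.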


> Errors can happen for more reasons.  Errors are to be caught.

"Never check for an error that you can't handle" -- Donald Knuth

Though I admit, he probably meant that you should handle them.  ;-).


> Anything else is sloppy programming.
> 
> There used to be a Usenix paper titled:
> 
> 	   Can't happen

Then don't check for the condition.  If you don't have a test for it,
it truly cannot hit the code bounded by the test.

> 
> 	         or
> 
> 	  /* NOTREACHED */

This is a comment to make LINT shut up.  How often does exit()
return to the caller in your programs?  8-).

> 
> 	         or
> 
> 	Real Programs dump Core.

They do, and we look at the traceback and correct the code.  And we catch
this in our testing before it ever goes out the door, because our test
cases are machine generated code coverage tests based on doing branch
path analysis, because we know one support call will cost us most of our
profit, and we're not one of those companies that erroneously look at
support as a profit center and uses 1-900 numbers instead of 1-800
numbers.  Because we learn by example, and we saw Word Perfect turf it,
and we're not morons who believe that "this time, when I let go of the
brick, it won't fall".

But that's irrelevant here... 8-).


Here are the possible returns from read(2), and their real meaning in
the current context:

-1		You screwed up.  Your code is broken.  Look at errno
		in case you are too stupid to figure out how it's broken
		without someone holding your hand.

0		You screwed up.  You are trapping SIGHUP and then not
		doing anything about it.  You should probably not trap
		it, and the default behaviour will abort your program
		like most normal programs.  To teach you to use better
		coding practices in the future, I'm going to buzz loop
		here forever, sucking CPU time until you notice me.

>0		Hey, what do you know, it's not an error!
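
For the program that does trap SIGHUP, the three cases above can be
sketched as a read loop that stops on a 0 return instead of spinning.
This is only an illustration: the enum and the names read_input() and
drain_input() are hypothetical, and it is not the code of ee, vi or
curses.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical outcome codes for one read(2) call. */
enum input_status { INPUT_DATA, INPUT_EOF, INPUT_ERROR };

/* Read once from fd and sort the result into the three cases. */
static enum input_status
read_input(int fd, char *buf, size_t len, size_t *nread)
{
    ssize_t n = read(fd, buf, len);

    if (n > 0) {
        *nread = (size_t)n;
        return INPUT_DATA;      /* >0: not an error */
    }
    *nread = 0;
    if (n == 0)
        return INPUT_EOF;       /* 0: EOF, e.g. the tty was revoked */
    return INPUT_ERROR;         /* -1: errno says what went wrong */
}

/*
 * Consume input until EOF or error.  Returns the number of bytes
 * seen, or -1 on error.  The point is that INPUT_EOF ends the loop;
 * calling read() again after a 0 return is what produces the buzz.
 */
static long
drain_input(int fd)
{
    char buf[256];
    size_t n;
    long total = 0;

    for (;;) {
        switch (read_input(fd, buf, sizeof buf, &n)) {
        case INPUT_DATA:
            total += (long)n;
            break;
        case INPUT_EOF:
            return total;
        case INPUT_ERROR:
            return -1;
        }
    }
}
```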


Here are the possible errors from read(2), and their real meaning in
the current context:

EBADF		Code needs to be corrected to be proactive in its
		decision to use a particular fd.  Fix your code.

EFAULT		Code needs to allocate buffers before it uses them.
		Fix your code.

EIO		Terminals aren't file systems.  The tty code should
		already be aware of this fact.  Fix the stupid tty code.

EINTR		Some idiot changed the default BSD system call restart
		behaviour after signals are received.  Needs to be
		changed back, and only set before specific calls,
		and then only when you are too frigging anal about
		anything that mildly resembles "goto" to use setjmp()
		before the call and longjmp() in the signal handler,
		like God intended.  Fix the stupid signal code.

EINVAL		Hello!  McFly!  You're passing in bogus descriptors!
		Go back and read the system call manual again; you
		apparently don't grasp the concept of "you have to open
		it before you can read it".  Fix your code.

EAGAIN		You are using non-blocking I/O, probably because
		you are too stupid to figure out the arguments to
		the select() system call, or too lazy to write an
		aioread/aiowrite/aiocancel and/or async call gate
		code to do it the right way.  Fix your code.
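
For anyone stuck on a system where the restart behaviour has been
changed, the usual workaround for EINTR, and a rough grouping of the
errors above by who has to fix what, might look like this.  It is a
sketch only; read_retry(), classify_read_errno() and the enum are
hypothetical names, and the grouping is one reading of the list, not
an official taxonomy.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reissue read(2) when a signal interrupts it. */
static ssize_t
read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* Hypothetical grouping of the errno values discussed above. */
enum read_fault {
    FAULT_CALLER_BUG,   /* EBADF, EFAULT, EINVAL: fix your code */
    FAULT_RETRY,        /* EINTR, EAGAIN: try the call again later */
    FAULT_DEVICE,       /* EIO: the device or tty layer's problem */
    FAULT_UNKNOWN       /* anything not on the list */
};

static enum read_fault
classify_read_errno(int err)
{
    switch (err) {
    case EBADF:
    case EFAULT:
    case EINVAL:
        return FAULT_CALLER_BUG;
    case EINTR:
    case EAGAIN:
        return FAULT_RETRY;
    case EIO:
        return FAULT_DEVICE;
    default:
        return FAULT_UNKNOWN;
    }
}
```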


I don't see one case where the error return needs to be checked in a
working program in this particular usage of read(2)... unless vi can
have its input redirected from a file in normal usage (EIO)?


					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.