From owner-freebsd-questions  Thu Feb 28 14:41:39 2002
Delivered-To: freebsd-questions@freebsd.org
Received: from segfault.monkeys.com (246.dsl6660157.rstatic.surewest.net [66.60.157.246])
	by hub.freebsd.org (Postfix) with ESMTP id 1ED4B37B405
	for <freebsd-questions@freebsd.org>; Thu, 28 Feb 2002 14:41:34 -0800 (PST)
Received: from monkeys.com (localhost [127.0.0.1])
	by segfault.monkeys.com (Postfix) with ESMTP
	id C8B3E660C; Thu, 28 Feb 2002 14:41:33 -0800 (PST)
To: =?ISO-8859-1?Q?Mikko_Ty=F6l=E4j=E4rvi?= <mikko@rsasecurity.com>
Cc: freebsd-questions@freebsd.org
Subject: Re: Annoying/non-intutive/undocumented poll(2) behavior: Bug or feature? 
In-reply-to: Your message of Thu, 28 Feb 2002 14:03:20 -0800.
             <20020228140039.T93443-100000@mikko.rsa.com> 
Date: Thu, 28 Feb 2002 14:41:33 -0800
Message-ID: <75496.1014936093@monkeys.com>
From: "Ronald F. Guilmette" <rfg@monkeys.com>
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-questions.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-questions>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-questions>
X-Loop: FreeBSD.ORG


In message <20020228140039.T93443-100000@mikko.rsa.com>, you wrote:

>
>"Ronald F. Guilmette" <rfg@monkeys.com>:
>
>> Riddle:  When is a socket error not a socket error?
>>
>> Answer:  When you are using the poll(2) syscall in FreeBSD (4.3) to
>>          check for the completion status of an outbound connect(2).
>>
>> I'm just about to file a formal problem {report}  on this...

(I did file the problem report, by the way.)

>>...
>> Eventually, after a suitable waiting period and a suitable number of
>> retries, the attempt to connect will fail, and the call to poll(2)
>> will then return.  At that point, the program checks to see if the
>> POLLERR bit is set in the returned `revents' field of the pollfd
>> structure.
>>...
>
>A connect() failure is not, as the FreeBSD man page puts it, "an
>exceptional condition" nor a "device error".

The term ``exceptional condition'' is clearly one requiring interpretation.

More to the point, it would clearly be more useful if the kernel set the
POLLERR bit in cases where an asynchronous connect attempt has failed.

>And Linux, believe it or not, is not a good indication of what might
>be considered "standard behaviour".

Oh I most definitely _do_ believe that.

It is self-evidently true that only the behavior of systems derived from
actual BSD networking code can possibly be considered as being ``definitive''
when it comes to normative... dare I say ``standard'' behavior of what
we all do, after all, call ``Berkeley sockets''.

I only mentioned the Linux behavior as an example of at least one other
system where the term ``exceptional condition'' has been given a somewhat
broader and, I would argue, more useful interpretation.

>Hint: your program does not behave
>as you would like on Solaris 7, HP-UX 11 or AIX 4.3 either.

I feel quite sure that you're right on that score, in part because I
strongly suspect that the networking in those systems and in FreeBSD
all have essentially the same lineage.

But just because the networking code in these various BSD-derived systems
behaves a given way, that certainly doesn't imply that the behavior in
question is either ``right'' (in any abstract or ultimate sense) or that
it is in any sense optimal for supporting real-world programs.

My claim is simply that a failure of an asynchronous connect attempt
can be and should be noted in the `struct pollfd' structures returned
from a call to poll(2) by setting the appropriate POLLERR bit.  That
is both the most intuitively correct outcome and also the most functionally
useful one.
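
To make that concrete, here is a minimal sketch of the only check I think
a program ought to need after poll(2) returns.  The helper name
`connect_failed_per_poll' is made up for illustration.  Note that on
FreeBSD 4.3, as described above, this test does NOT fire for a failed
connect, which is exactly my complaint.

```c
#include <poll.h>
#include <stdbool.h>

/*
 * Hypothetical helper (name invented for this example).
 * Returns true if poll(2) reported POLLERR for the descriptor.
 * This only inspects the revents field; it makes no system call.
 */
bool connect_failed_per_poll(const struct pollfd *pfd)
{
    return (pfd->revents & POLLERR) != 0;
}
```

The caller would fill in a `struct pollfd' with events set to POLLOUT,
call poll(2), and then hand the structure to this helper.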

>If you want to write even remotely portable code, you should buy, beg,
>borrow or steal a copy of Stevens "UNIX Network Programming."

Got it already, thanks.  But only the first edition.  I haven't yet got
'round to purchasing the newer edition.

>There
>you will find a whole section on non-blocking connections, learn that
>they are somewhat painful to deal with correctly...

There's no good reason that I am aware of why they must necessarily be
quite so painful as they are.

It seems to me that they are more painful to deal with than they need
to be, simply because the original implementors overlooked a few minor
but important points... such as setting POLLERR and/or POLLHUP in certain
sets of very specific circumstances.

I don't believe that it is at all too late to correct those minor
oversights.

>... and that your best
>bet on checking the status of a connect attempt is polling for
>read/write and using getsockopt() to check for error.

Umm... that's yet another kernel call.

But who's counting, right?

(Answer:  I'm counting.  I don't like having to make frivolous additional
context switches just because poll(2) isn't giving me the information that
it should, by all rights, be giving me in the first place.)
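
For the record, the workaround being recommended looks roughly like the
sketch below.  It assumes the descriptor is a socket on which a
non-blocking connect(2) has already been started.  The function names
`pending_socket_error' and `wait_for_connect', and the choice to report
a poll timeout as ETIMEDOUT, are my own inventions for this example and
do not come from Stevens or from any man page.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fetch (and clear) the pending error on a socket.
 * Returns 0 if there is none, otherwise an errno value.
 */
int pending_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;           /* getsockopt(2) itself failed */
    return err;
}

/*
 * Hypothetical helper: wait for a pending non-blocking connect(2) to
 * finish.  Polls for read/write readiness, then asks the socket for
 * its error status.  Returns 0 on success, an errno value on failure,
 * or ETIMEDOUT (an assumption of this sketch) if poll(2) timed out.
 */
int wait_for_connect(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int n;

    pfd.fd = fd;
    pfd.events = POLLIN | POLLOUT;
    pfd.revents = 0;

    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted */

    if (n < 0)
        return errno;
    if (n == 0)
        return ETIMEDOUT;

    /* The extra kernel call that I am objecting to. */
    return pending_socket_error(fd);
}
```

That is two system calls per connection attempt (poll, then getsockopt)
where one would do if POLLERR were set.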

>P.S. To answer your question: it is neither bug nor feature,
>     it's just life :-)

Well, I disagree.  I think it's a bug.

At the very least, it's a non-feature... or perhaps an anti-feature...
suppression of useful information which the kernel quite clearly _does_
already have in its possession.
