To: Mikko Työläjärvi
Cc: freebsd-questions@freebsd.org
Subject: Re: Annoying/non-intutive/undocumented poll(2) behavior: Bug or feature?
In-reply-to: Your message of Thu, 28 Feb 2002 14:03:20 -0800.
             <20020228140039.T93443-100000@mikko.rsa.com>
Date: Thu, 28 Feb 2002 14:41:33 -0800
Message-ID: <75496.1014936093@monkeys.com>
From: "Ronald F. Guilmette"

In message <20020228140039.T93443-100000@mikko.rsa.com>, you wrote:

>
>"Ronald F. Guilmette" :
>
>> Riddle:  When is a socket error not a socket error?
>>
>> Answer:  When you are using the poll(2) syscall in FreeBSD (4.3) to
>> check for the completion status of an outbound connect(2).
>>
>> I'm just about to file a formal problem {report} on this...

(I did file the problem report, by the way.)

>>...
>> Eventually, after a suitable waiting period and a suitable number of
>> retries, the attempt to connect will fail, and the call to poll(2)
>> will then return.  At that point, the program checks to see if the
>> POLLERR bit is set in the returned `revents' field of the pollfd
>> structure.
>>...
>
>A connect() failure is not, as the FreeBSD man page puts it, "an
>exceptional condition" nor a "device error".

The term ``exceptional condition'' is clearly one requiring interpretation.
More to the point, it would clearly be more useful if the kernel set the
POLLERR bit in cases where an asynchronous connect attempt has failed.

>And Linux, believe it or not, is not a good indication of what might
>be considered "standard behaviour".

Oh I most definitely _do_ believe that.  It is self-evidently true that
only the behavior of systems derived from actual BSD networking code can
possibly be considered ``definitive'' when it comes to normative... dare
I say ``standard'' behavior of what we all do, after all, call ``Berkeley
sockets''.

I only mentioned the Linux behavior as an example of at least one other
system where the term ``exceptional condition'' has been given a somewhat
broader and, I would argue, more useful interpretation.

>Hint: you program does not behave
>as you would like on Solaris 7, HP-UX 11 or AIX 4.3 either.

I feel quite sure that you're right on that score, in part because I
strongly suspect that the networking code in those systems and in FreeBSD
all shares essentially the same lineage.

But just because the networking code in these various BSD-derived systems
behaves a given way, that certainly doesn't imply that the behavior in
question is either ``right'' (in any abstract or ultimate sense) or that
it is in any sense optimal for supporting real-world programs.

My claim is simply that a failure of an asynchronous connect attempt can
be and should be noted in the `struct pollfd' structures returned from a
call to poll(2) by setting the POLLERR bit in `revents'.  That is both the
most intuitively correct outcome and also the most functionally useful one.

>If you want to write even remotely portable code, you should buy, beg,
>borrow or steal a copy of Stevens "UNIX Network Programming."

Got it already, thanks.  But only the first edition.  I haven't yet got
'round to purchasing the newer edition.

>There
>you will find a whole section on non-blocking connections, learn that
>they are somewhat painful to deal with correctly...

There's no good reason that I am aware of why they must necessarily be
quite so painful as they are.  It seems to me that they are more painful
to deal with than they need to be, simply because the original
implementors overlooked a few minor but important points... such as
setting POLLERR and/or POLLHUP in certain sets of very specific
circumstances.

I don't believe that it is at all too late to correct those minor
oversights.

>... and that your best
>bet on checking the status of a connect attempt is polling for
>read/write and using getsockopt() to check for error.

Umm... that's yet another kernel call.  But who's counting, right?

(Answer:  I'm counting.  I don't like having to make frivolous additional
context switches just because poll(2) isn't giving me the information that
it should, by all rights, be giving me in the first place.)

>P.S. To answer your question: it is neither bug nor feature,
>     it's just life :-)

Well, I disagree.  I think it's a bug.  At the very least, it's a
non-feature... or perhaps an anti-feature... suppression of useful
information which the kernel quite clearly _does_ already have in its
possession.
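
For reference, the approach recommended in the quoted text (poll for
writability, then ask getsockopt() for the pending socket error) can be
sketched roughly as follows.  This is only a minimal sketch, not code from
this thread: the helper name connect_status() and its timeout parameter are
hypothetical, it handles IPv4 only, and it deliberately ignores whether
POLLERR is set in `revents', since that is exactly the bit that cannot be
relied upon here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not from the original message):
 * start a non-blocking TCP connect to *addr, wait up to timeout_ms for
 * it to complete, and return 0 on success or an errno value on failure.
 * The socket is closed before returning; a real program would keep it.
 */
static int
connect_status(const struct sockaddr_in *addr, int timeout_ms)
{
    struct pollfd pfd;
    socklen_t len;
    int fd, flags, n, soerr;

    assert(addr != NULL);

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    /* Switch the socket to non-blocking mode. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        soerr = errno;
        close(fd);
        return soerr;
    }

    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0) {
        close(fd);
        return 0;               /* connected immediately */
    }
    if (errno != EINPROGRESS) {
        soerr = errno;          /* failed immediately */
        close(fd);
        return soerr;
    }

    /* The connect is in progress: wait until the socket is writable. */
    pfd.fd = fd;
    pfd.events = POLLOUT;
    pfd.revents = 0;
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);

    if (n < 0) {
        soerr = errno;
    } else if (n == 0) {
        soerr = ETIMEDOUT;      /* our own timeout, not the kernel's */
    } else {
        /*
         * poll() returned, but revents alone does not portably say
         * whether the connect succeeded.  The extra kernel call:
         * fetch (and clear) the pending error with SO_ERROR.
         */
        len = sizeof(soerr);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &len) < 0)
            soerr = errno;
    }

    close(fd);
    return soerr;
}
```

A caller would fill in a `struct sockaddr_in' and compare the result
against values such as ECONNREFUSED or ETIMEDOUT.  The getsockopt() call
is the additional kernel call complained about above; it is needed only
because the outcome of the connect is not reported through `revents'.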