From owner-freebsd-hackers Tue Oct 17 14:28:48 1995
From: Terry Lambert
Message-Id: <199510172123.OAA28400@phaeton.artisoft.com>
Subject: Re: getdtablesize() broken?
To: jdp@polstra.com (John Polstra)
Date: Tue, 17 Oct 1995 14:23:44 -0700 (MST)
Cc: freebsd-hackers@polstra.com
In-Reply-To: from "John Polstra" at Oct 17, 95 09:34:00 am

> > > Man pages only say that if the host does not support millisecond
> > > accuracy then the value is rounded up to the nearest legal value
> > > available.
> >
> > 10ms is the argument resolution.  On Solaris, it's 10ms, while select()
> > is till 4uS.  select() wins.  8-).
>
> Why do you say "10ms is the argument resolution"?  The man pages
> explicitly say that the timeout is specified in milliseconds.  Simple
> tests indicate that the man pages are accurate.  What is the basis for
> your claim?

The SVR4.2 man pages.  You are reading the Sun man pages.  Stop it.  8-).

1ms is too low a resolution in any case.  I have apps that need 200uS;
Colorado Memory Systems has one that needs ~100uS (or a buzz loop).

Sun hardware supports in the neighborhood of 4uS.  That includes the
printf in the test program and console scrolling.
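
A minimal sketch of the kind of test program referred to above (not the
original one): it times a pure timed wait through select() and through
poll(), using gettimeofday() as the reference clock.  The function names
time_select_us() and time_poll_us() are made up for illustration, and the
numbers obtained depend entirely on the kernel and on the resolution of
gettimeofday() itself.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Microseconds elapsed between two gettimeofday() samples. */
static long elapsed_us(const struct timeval *a, const struct timeval *b)
{
    return (long)(b->tv_sec - a->tv_sec) * 1000000L
         + (long)(b->tv_usec - a->tv_usec);
}

/* Time a wait on no descriptors via select(); timeout given in microseconds. */
long time_select_us(long usec)
{
    struct timeval tv, start, end;

    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    gettimeofday(&start, NULL);
    select(0, NULL, NULL, NULL, &tv);   /* no fds: acts as a timed sleep */
    gettimeofday(&end, NULL);
    return elapsed_us(&start, &end);
}

/* Time a wait on no descriptors via poll(); timeout given in milliseconds,
 * which is the finest value the poll() argument can express.  Some
 * implementations reportedly misbehave when no fds are passed. */
long time_poll_us(int msec)
{
    struct timeval start, end;

    gettimeofday(&start, NULL);
    poll(NULL, 0, msec);
    gettimeofday(&end, NULL);
    return elapsed_us(&start, &end);
}
```

Calling time_select_us(200) and time_poll_us(1) in a loop and printing the
results shows how far the actual wait is from the requested one; a result
that clusters around 10000 would be consistent with a 100Hz timer.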

You are suggesting that 250 times less resolution is acceptable, or
that 2500 times less resolution is acceptable, in the 10ms case.

BTW: If you have a SVR4 or Solaris source license, look at the
implementation of poll().  It uses the non-HRT timer services, which
for non-RT processes are limited to a system quantum resolution on
a 100Hz timer.  1s/100 = 10ms.

The inability to support setitimer/getitimer resolutions at the
system clock frequency is the basis of my claim that UnixWare and
SVR4.x are not in fact SVID compliant.  Solaris wasn't, but I reported
it and they fixed it, as well as fixing the ABI select() resolution
for statically, and later dynamically, linked programs that call
select().  One fault that Sun does not have is ignoring bug reports
about standards conformance.

If you have Sun hardware on hand that runs Solaris, I suggest you use
gettimeofday() (a statistically maintained value good to 4uS on most
hardware after the SPARCStation 1+) to check your poll resolution.

I'd suggest the same for SVR4, but the system clock update frequency
on SVR4 limits gettimeofday() to 10ms resolution (this is in fact
acceptable under SVID, it's just bloody useless for profiling unless
all you do is statistical profiling -- ie: how fast can I hit my
cache).

> > > [Responding to a claim by Terry that poll doesn't support simple timed
> > > waits not involving file descriptors]
> > >
> > > That hasn't been my experience.  poll(0, NULL, 10000); waits 10 seconds
> > > on SunOS, all SVR4-en I have here, HPUX, and AIX; however Digital Unix's
> > > poll loses.  In fact in SVR4 select(3) is implemented using poll(2).
> >
> > That's a bug in SVR4.  SVR4 is broken and bogus in many, many ways.
>
> You're confusing me.  First you say that poll is no good because it
> doesn't support simple timed waits.  Then somebody points out that you
> were wrong, and poll does in fact support that.  So then you say that
> polls which work that way are broken.
>
> Poll is broken, because it fails to exhibit the broken behavior which you
> originally claimed it had?  Have I got this right?

No, poll is no good because it doesn't support *sufficient resolution*
on simple timed waits.  In this case, "sufficient" is defined as being
"equal to setitimer/getitimer".  For the BSD case, this means that the
interface would have to be changed.

In addition, the select(2) call in BSD reserves the right to modify the
timeval structure to indicate the remaining time, to allow the use of
the timeout as an event outcall mechanism for logical multithreading.
The poll(2) call does not show the time remaining on the timer in the
non-timeout case.

Poll is broken on Mentat-derived streams sources when no FD's are
present in the argument list.  I think early versions of Lachman
streams (ie: SCO) also have this bug.  I don't know if Lachman has
corrected them (neither does the former Lachman employee in the next
office).

AIX no longer has this bug (as of 3.1?  3.2?) because we reported it to
IBM when we were using that mechanism for SAP daemons in the NetWare for
UNIX 4.x code.

I personally corrected the problem on the VMS version of Mentat Streams,
and either Linda Shelton or Alan Clark of Novell personally corrected
the problem in the Ultrix version.

Basically, all DEC OS's that support poll() use Mentat sources, and
then some, so you can't write portable code that depends on that
behaviour.

> > SVR4 is broken and bogus in many, many ways.
>
> Of course SVR4 is broken.  FreeBSD is broken.  Linux is broken.  VMS is
> broken.  They're all broken, in one way or another.  That doesn't
> automatically mean that their every feature is broken.

I said "broken and bogus".  FreeBSD isn't bogus.  8-).

If you can't depend on the feature working, it's effectively "not there".
You have to decide then whether you want working code or portable code,
or if you can find a different solution.

> I've used both select and poll in many, many applications.
> They both got the job done for me.

Your applications have obviously used timer resolutions of 10ms or
greater, or you were unaware of them rounding your value to 0 and
causing a non-blocking check (and eating all free system time, if this
was something's main control loop).

UnixWare actually rounds non-zero values up to 10ms, which is, in fact,
a violation of the select(3) documentation (but not the poll(2)
documentation).

This type of problem is non-obvious.  I'm not blaming you for missing
the non-conformance problems: without kernel sources, they are almost
impossible to track.  But they do exist, and I will scream bloody
murder any time someone says they don't, especially if they say that
instead of fixing the things.

					Regards,
					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
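
The rounding hazard described in this message (a small non-zero timeout
truncated to 0, turning a blocking wait into a busy non-blocking check)
can be guarded against on the application side.  A minimal sketch,
assuming a hypothetical helper named us_to_poll_ms() that converts a
microsecond interval into a poll() millisecond argument and always rounds
up.  It does nothing about the kernel's own timer granularity.

```c
#include <limits.h>

/* Hypothetical helper: convert a microsecond interval to the millisecond
 * argument taken by poll(), rounding up so that any non-zero request
 * stays non-zero.  Zero or negative input yields 0 (non-blocking check);
 * this sketch never returns a negative (infinite) timeout. */
int us_to_poll_ms(long usec)
{
    long ms;

    if (usec <= 0)
        return 0;
    ms = usec / 1000 + (usec % 1000 != 0);  /* ceiling division */
    return ms > INT_MAX ? INT_MAX : (int)ms;
}
```

With this helper a 200uS request becomes poll(fds, nfds, 1) rather than
poll(fds, nfds, 0), so a main control loop blocks instead of spinning,
at the cost of waiting longer than asked.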