Date:        Fri, 15 Nov 1996 17:00:53 -0600 (CST)
From:        Joe Greco <jgreco@brasil.moneng.mei.com>
To:          terry@lambert.org (Terry Lambert)
Cc:          jgreco@brasil.moneng.mei.com, terry@lambert.org, jdp@polstra.com, scrappy@ki.net, hackers@FreeBSD.org
Subject:     Re: Sockets question...
Message-ID:  <199611152300.RAA29354@brasil.moneng.mei.com>
In-Reply-To: <199611152155.OAA27106@phaeton.artisoft.com> from "Terry Lambert" at Nov 15, 96 02:55:30 pm
> > If I want to do a
> >
> > 	write(fd, buf, 1048576)
> >
> > on a socket connected via a 9600 baud SLIP link, I might expect the
> > system call to take around 1092 seconds.  If I have a server process
> > dealing with two such sockets, response time will be butt slow if the
> > server is currently writing to the other socket... it has to wait for
> > the write to complete because write(2) has to finish sending the
> > entire 1048576 bytes.
>
> Actually, write will return when the data has been copied into the
> local transmit buffers, not when it has actually been sent.  It's
> only when you run out of local transmit buffers that the write blocks.

Yes, that should be clear; I made the point that this is precisely what
allows non-blocking sockets to be useful in this scenario.

> And well it should: something needs to tell the server process to
> quit making calls which the kernel is unable to satisfy.  Halting
> the server process based on resource unavailability does this.

So does returning EWOULDBLOCK to the server process, which allows the
server to react by going on to service someone else.

> > So a clever software author does not do this.  He has 1048576 bytes
> > of (different, even) data that he wants to write "simultaneously" to
> > two sockets.  He wants to do the equivalent of Sun's
> >
> > 	aiowrite(fd1, buf1, 1048576, SEEK_CUR, 0, NULL);
> > 	aiowrite(fd2, buf2, 1048576, SEEK_CUR, 0, NULL);
>
> Yes.  This is *exactly* what he wants to do.
>
> > Well how the hell do you do THAT if you are busy blocked in a write
> > call?
>
> He uses a native aiowrite().

Which doesn't exist in a portable fashion.  ANYWHERE.

> Or he wants to call a write from a thread dedicated to that client,
> which may block the thread, but not the process, and therefore not
> other writes.

Which is fine IF you have a threads implementation.  Which is, again,
not a given, and therefore not portable.

> The underlying implementation may use non-blocking I/O, or it may use
> an OS implementation of aiowrite (like Sun's SunOS 4.3 LWP user space
> threads library provided).  It doesn't matter.  That's the point of
> using threads.

Yes, well, the point of using threads is currently that you're not
really assured of being portable.  I do not disagree that in an ideal
world, threads are a good way to deal with this.

> > Well, you use non-blocking I/O... and you take advantage of the fact
> > that the OS is capable of buffering some data on your behalf.
> >
> > Let's say you have "buf1" and "buf2" to write to "fd1" and "fd2", and
> > "len1" and "len2" for the size of the corresponding buf's.
> >
> > You write code to do the following:
> >
> > 	rval = write(fd1, buf1, len1)	# Wrote 2K of data
> > 	len1 -= rval;			# 1046528 bytes remain
> > 	buf1 += rval;			# Move forward 2K in buffer
> [ ... ]
> > You can trivially do this with a moderately complex select()
> > mechanism, so that the outbound buffers for both sockets are kept
> > filled.
>
> This is exactly the finite state automaton I was talking about
> having to move into user space code in order to use the interface.
>
> It makes things more complex for the user space programmer.

So?  Making things more complex is a small tradeoff if it makes it
POSSIBLE to do something in the first place.

Tell me, how else do you do this on a system that does NOT support
threads?  You can select() on writability and send one byte at a time
on a blocking socket until select() reports no further writability.
Poor solution.
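For the record, that "moderately complex select() mechanism" is maybe
thirty lines of C.  Something along these lines (a sketch, untested;
drain_two() is a name I just made up, and it assumes both descriptors
already have O_NONBLOCK set via fcntl(2)):

#include <sys/types.h>
#include <sys/time.h>
#include <unistd.h>
#include <errno.h>

/*
 * Keep the outbound buffers of two non-blocking sockets filled until
 * both user buffers have been handed to the kernel.  Returns 0 on
 * success, -1 on a hard error.
 */
int
drain_two(int fd1, char *buf1, int len1, int fd2, char *buf2, int len2)
{
	fd_set wfds;
	int maxfd, rval;

	maxfd = (fd1 > fd2) ? fd1 : fd2;
	while (len1 > 0 || len2 > 0) {
		FD_ZERO(&wfds);
		if (len1 > 0)
			FD_SET(fd1, &wfds);
		if (len2 > 0)
			FD_SET(fd2, &wfds);
		if (select(maxfd + 1, NULL, &wfds, NULL, NULL) < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return (-1);
		}
		if (len1 > 0 && FD_ISSET(fd1, &wfds)) {
			rval = write(fd1, buf1, len1);
			if (rval > 0) {
				buf1 += rval;	/* move forward in buffer */
				len1 -= rval;
			} else if (rval < 0 && errno != EWOULDBLOCK)
				return (-1);	/* real error */
		}
		if (len2 > 0 && FD_ISSET(fd2, &wfds)) {
			rval = write(fd2, buf2, len2);
			if (rval > 0) {
				buf2 += rval;
				len2 -= rval;
			} else if (rval < 0 && errno != EWOULDBLOCK)
				return (-1);
		}
	}
	return (0);
}

Neither socket can stall the other for longer than it takes to copy
data into a transmit buffer, which is the whole point.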
> > A little hard to do without nonblocking sockets.  Very useful.  I
> > don't think that this is a "stupid idea" at all.
>
> Maybe not compared to being unable to do it at all... but BSD is not
> limited this way.  We have threads.

_FREE_BSD is not limited this way.  _FREE_BSD has threads.

The local 4.3BSD Tahoe system (it _is_ a BSD system, I hope you would
agree) offers nonblocking writes but does not offer threads.  Ultrix
does not offer threads.  I am sure there are other examples...

You are missing the point as usual.  BSD != FreeBSD, and FreeBSD !=
UNIX in general.  I am continually amazed that someone like you could
make that error...  In order to write portable code, one must write
portable code.

> > > What is the point of a non-blocking write if this is what happens?
> >
> > I will leave that as your homework for tonite.
>
> Answer: for writes in a multiple client server.

Ahhhh.  You got it.

> Extra credit: the failure case that originated this discussion was
> concerned with a client using read.

That is not very relevant.  The statement which originated _THIS_
discussion was your assertion that "Non-blocking sockets for reliable
stream protocols like TCP/IP are a stupid idea."

I do not care about Karl's problem... he may well have a legitimate
problem, and I agreed that it was probably beyond the scope of a usage
discussion given his description.  I do not care about Marc's
problem... that is a separate issue.  I am simply correcting a
misconception that you are spreading: that non-blocking sockets are a
"stupid idea".

> > Please tell that to FreeBSD's FTP server, which uses a single
> > (blocking) write to perform delivery of data.
> >
> > Why should an application developer have to know or care what the
> > available buffer space is?  Please tell me where in write(2) and
> > read(2) it says I must worry about this.
> >
> > It doesn't.
>
> Exactly my point on a socket read not returning until it completes.

Yes, that's fine.  I agree that there are merits on both sides.  A
read() that returns whatever is available is probably more generally
useful, and that is what is implemented.  I am not going to argue with
the design and implementation of the Berkeley networking code, since
it is widely considered to be the standard model for networking.  Most
other folks have not found this to be a critical design flaw, and
neither do I.  I can see several cases where a fully blocking read()
would be a substantial nuisance, so I think the behaviour as it exists
makes a fair amount of sense.

> > > Indeterminate sockets are evil.  They are on the order of not
> > > knowing your lock state when entering into a function that's
> > > going to need the lock held.
> >
> > I suppose you have never written a library function.
> >
> > I suppose you do not subscribe to the philosophy that you should be
> > liberal in what you accept (in this case, assume that you may need
> > to deal with either type of socket).
>
> If I wrote a library function which operated on a non-user-opaque
> object like a socket set up by the user, then it would function for
> all potential valid states in which that object could be at the time
> of the call.  For potential invalid states, I would trap the ones
> which I could identify from subfunction returns, and state that the
> behaviour for other invalid states was "undefined" in the
> documentation which I published with the library (ie: optimise for
> the success case).

What do you define "potential valid states" to be?  I do not claim to
cover all the bases all the time, but I do at least catch exceptional
conditions I was not expecting.

In my case, I would try to write a socket-handling library function to
handle both blocking and non-blocking sockets if it was reasonably
practical to do so.  If not, I would cause it to bomb if it detected
something odd.  I think you are saying the same thing: that is good.
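Concretely, a routine along these lines will do the job no matter
which flavor of socket the caller hands it, because it checks every
return value instead of assuming (a sketch, untested; writeall() is a
hypothetical name):

#include <sys/types.h>
#include <sys/time.h>
#include <unistd.h>
#include <errno.h>

/*
 * Write all "len" bytes to "fd", whether the descriptor is blocking
 * or non-blocking.  Handles short writes, EINTR, and EWOULDBLOCK (by
 * select()ing until the socket drains).  Returns 0 on success, -1 on
 * error.
 */
int
writeall(int fd, char *buf, int len)
{
	fd_set wfds;
	int rval;

	while (len > 0) {
		rval = write(fd, buf, len);
		if (rval > 0) {			/* full or partial write */
			buf += rval;
			len -= rval;
			continue;
		}
		if (rval < 0 && errno == EINTR)
			continue;		/* signal, just retry */
		if (rval < 0 && errno == EWOULDBLOCK) {
			/* non-blocking socket is full; wait for space */
			FD_ZERO(&wfds);
			FD_SET(fd, &wfds);
			(void) select(fd + 1, NULL, &wfds, NULL, NULL);
			continue;
		}
		return (-1);		/* anything else is a real error */
	}
	return (0);
}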
> More likely, I would encapsulate the object using an opaque data
> type, and I would expect the users who wish to consume my interface
> to obtain an object of that type, operate on the object with my
> functions, and release the object when done.  In other words, I
> would employ standard data encapsulation techniques.

Nifty.  That's even possible in many cases if you are designing from
scratch.  Otherwise, it is a real pain in the butt.

> > I wonder if anyone has ever rewritten one of your programs, and made
> > a fundamental change that silently broke one of your programs
> > because an underlying concept was changed.
>
> Unlikely.  I document my assumptions.

So what?  If I, as the engineer who replaces you five years down the
road, decide that your program needs to use non-blocking writes, and I
change the program to do them, and I miss one place where you failed
to check a return value, your "documented assumptions" are worth
diddly squat.  Code your assumptions when they are this trivial to
check.

> > Any software author who writes code and does not perform reasonable
> > sanity checks on the return value, particularly for something as
> > important as the read and write system calls, is hanging a big sign
> > around their neck saying "Kick Me I Code Worth Shit".
>
> On the other hand, "do not test for an error condition which you can
> not handle".

One can handle ANY error condition by bringing it to the attention of
a higher authority.  My UNIX kernel panics when it hits a condition
that it does not know how to handle.  It does not foolishly follow
your advice, "do not test for an error condition which you can not
handle".  To do so would risk great havoc.  You ALWAYS test for error
conditions, PARTICULARLY the ones which you can not handle - because
they are the really scary ones.

> If as part of my rundown in a program, I go to close a file, and the
> close fails, what should I do about it?  Not exit?  Give me a break...

No, but if a close() fails, and you had a reasonable expectation for
it to succeed, printing a warning is not unreasonable.  According to
SunOS, there are two reasons this could happen: EBADF and EINTR.  If
you are closing an inactive descriptor, it is clearly an error in the
code, and I WOULD CERTAINLY WANT TO KNOW.  If it is due to a signal,
it is unclear what to do, but it is certainly not a "bad" idea to at
least be aware that such a thing can (and has) happened!

> > > It bothers me too... I am used to formatting my IPC data streams.
> > > I either use fixed length data units so that the receiver can post
> > > a fixed size read, or I use a fixed length data unit, and
> > > guarantee write ordering by maintaining state.  I do this in order
> > > to send a fixed length header to indicate that I'm writing a
> > > variable length packet, so the receiver can then issue a blocking
> > > read for the right size.
> >
> > I have never seen that work as expected with a large data size.
>
> I have never seen *any* IPC transport work (reliably) with large data
> sizes... depending on your definition of large.  To deal with this,
> you can only encapsulate the transport and handle them, or don't use
> large data sizes in the first place.

Okay, here we are in complete agreement.  One _always_ needs to be
aware of this, then.
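Since we are agreeing for once: the receiving half of that
header-then-data scheme only works if it also copes with short reads,
because read(2) on a stream socket may return less than was asked for.
Something like this (a sketch, untested; the four-byte network-order
length header and the function names are my own assumptions):

#include <sys/types.h>
#include <netinet/in.h>
#include <unistd.h>
#include <errno.h>

/*
 * Read exactly "len" bytes from "fd".  Returns 0 on success, -1 on
 * error or on EOF in mid-record.
 */
int
readall(int fd, char *buf, int len)
{
	int rval;

	while (len > 0) {
		rval = read(fd, buf, len);
		if (rval > 0) {
			buf += rval;
			len -= rval;
		} else if (rval == 0)
			return (-1);	/* EOF in mid-record */
		else if (errno != EINTR)
			return (-1);	/* real error */
	}
	return (0);
}

/*
 * Receive one framed packet: a fixed four-byte length header in
 * network byte order, then that many bytes of data.  Returns the
 * data length, or -1 on error.
 */
int
readpacket(int fd, char *buf, int maxlen)
{
	u_int32_t nlen;
	int len;

	if (readall(fd, (char *)&nlen, sizeof(nlen)) < 0)
		return (-1);
	len = (int)ntohl(nlen);
	if (len < 0 || len > maxlen)
		return (-1);	/* sender is confused; bail out */
	if (readall(fd, buf, len) < 0)
		return (-1);
	return (len);
}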
... JG