From: Terry Lambert
Message-Id: <199611152155.OAA27106@phaeton.artisoft.com>
Subject: Re: Sockets question...
To: jgreco@brasil.moneng.mei.com (Joe Greco)
Date: Fri, 15 Nov 1996 14:55:30 -0700 (MST)
Cc: terry@lambert.org, jgreco@brasil.moneng.mei.com, jdp@polstra.com, scrappy@ki.net, hackers@FreeBSD.org
In-Reply-To: <199611152014.OAA28769@brasil.moneng.mei.com> from "Joe Greco" at Nov 15, 96 02:14:47 pm

> If I want to do a
>
>     write(fd, buf, 1048576)
>
> on a socket connected via a 9600 baud SLIP link, I might expect the system
> call to take around 1092 seconds. If I have a server process dealing with
> two such sockets, response time will be butt slow if the server is
> currently writing to the other socket... it has to wait for the write to
> complete because write(2) has to finish sending the entire 1048576 bytes.

Actually, write will return when the data has been copied into the
local transmit buffers, not when it has actually been sent. It's only
when you run out of local transmit buffers that the write blocks.

And well it should: something needs to tell the server process to
quit making calls which the kernel is unable to satisfy. Halting
the server process based on resource unavailability does this.

> So a clever software author does not do this.
> He has 1048576 bytes of
> (different, even) data that he wants to write "simultaneously" to two
> sockets. He wants to do the equivalent of Sun's
>
>     aiowrite(fd1, buf1, 1048576, SEEK_CUR, 0, NULL);
>     aiowrite(fd2, buf2, 1048576, SEEK_CUR, 0, NULL);

Yes. This is *exactly* what he wants to do.

> Well how the hell do you do THAT if you are busy blocked in a write call?

He uses a native aiowrite(). Or he calls write from a thread
dedicated to that client, which may block the thread, but not the
process, and therefore not other writes.

The underlying implementation may use non-blocking I/O, or it may
use an OS implementation of aiowrite (like Sun's SunOS 4.3 LWP user
space threads library provided). It doesn't matter. That's the
point of using threads.

> Well, you use non-blocking I/O... and you take advantage of the fact that
> the OS is capable of buffering some data on your behalf.
>
> Let's say you have "buf1" and "buf2" to write to "fd1" and "fd2", and "len1"
> and "len2" for the size of the corresponding buf's.
>
> You write code to do the following:
>
>     rval = write(fd1, buf1, len1)    # Wrote 2K of data
>     len1 -= rval;                    # 1046528 bytes remain
>     buf1 += rval;                    # Move forward 2K in buffer

[ ... ]

> You can trivially do this with a moderately complex select() mechanism,
> so that the outbound buffers for both sockets are kept filled.

This is exactly the finite state automaton I was talking about having
to move into user space code in order to use the interface. It makes
things more complex for the user space programmer.

> A little hard to do without nonblocking sockets. Very useful. I don't
> think that this is a "stupid idea" at all.

Maybe not compared to being unable to do it at all... but BSD is not
limited this way. We have threads.

> > What is the point of a non-blocking write if this is what happens?
>
> I will leave that as your homework for tonite.

Answer: for writes in a multiple client server.
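
To make the "multiple client server" case concrete, here is a minimal sketch of the user-space state machine under discussion: two non-blocking descriptors kept filled with select(). The names struct pending, set_nonblocking(), pump() and write_both() are hypothetical and appear nowhere in the thread; SIGPIPE handling and timeouts are left out, so treat this as an illustration rather than a finished server loop.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-connection state: what is still unsent on one descriptor. */
struct pending {
    int         fd;   /* descriptor, expected to be in O_NONBLOCK mode */
    const char *buf;  /* next unsent byte */
    size_t      len;  /* bytes remaining */
};

/* Put a descriptor into non-blocking mode; 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hand the kernel as much as it will take right now; 0 = ok, -1 = error. */
static int pump(struct pending *p)
{
    while (p->len > 0) {
        ssize_t n = write(p->fd, p->buf, p->len);

        if (n > 0) {                 /* partial or full write: advance */
            p->buf += n;
            p->len -= (size_t)n;
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            return 0;                /* transmit buffer full: try later */
        } else if (n == -1 && errno == EINTR) {
            continue;                /* interrupted: retry */
        } else {
            return -1;               /* hard error */
        }
    }
    return 0;
}

/* Keep both outbound buffers filled until everything has been queued. */
static int write_both(struct pending *a, struct pending *b)
{
    assert(a != NULL && b != NULL);
    while (a->len > 0 || b->len > 0) {
        fd_set wfds;
        int maxfd = -1;

        FD_ZERO(&wfds);
        if (a->len > 0) { FD_SET(a->fd, &wfds); if (a->fd > maxfd) maxfd = a->fd; }
        if (b->len > 0) { FD_SET(b->fd, &wfds); if (b->fd > maxfd) maxfd = b->fd; }

        /* Sleep until at least one descriptor can accept more data. */
        if (select(maxfd + 1, NULL, &wfds, NULL, NULL) == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (a->len > 0 && FD_ISSET(a->fd, &wfds) && pump(a) == -1)
            return -1;
        if (b->len > 0 && FD_ISSET(b->fd, &wfds) && pump(b) == -1)
            return -1;
    }
    return 0;
}
```

A caller would run set_nonblocking() on each socket, fill in one struct pending per client, and call write_both(). With a thread per client, the same effect comes from a plain blocking write() in each thread, which is the trade-off being argued here.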
Extra credit: the failure case that originated this discussion was
concerned with a client using read.

> Please tell that to FreeBSD's FTP server, which uses a single (blocking)
> write to perform delivery of data.
>
> Why should an application developer have to know or care what the available
> buffer space is? Please tell me where in write(2) and read(2) it says I
> must worry about this.
>
> It doesn't.

Exactly my point on a socket read not returning until it completes.

> > Indeterminate sockets are evil. They are on the order of not knowing
> > your lock state when entering into a function that's going to need
> > the lock held.
>
> I suppose you have never written a library function.
>
> I suppose you do not subscribe to the philosophy that you should be
> liberal in what you accept (in this case, assume that you may need to
> deal with either type of socket).

If I wrote a library function which operated on a non user-opaque
object like a socket set up by the user, then it would function for
all potential valid states in which that object could be at the time
of the call. For potential invalid states, I would trap the ones
which I could identify from subfunction returns, and state that the
behaviour for other invalid states was "undefined" in the
documentation which I published with the library (i.e. optimise for
the success case).

More likely, I would encapsulate the object using an opaque data
type, and I would expect the users who wish to consume my interface
to obtain an object of that type, operate on the object with my
functions, and release the object when done.

In other words, I would employ standard data encapsulation techniques.

> I wonder if anyone has ever rewritten one of your programs, and made
> a fundamental change that silently broke one of your programs because
> an underlying concept was changed.

Unlikely. I document my assumptions.
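
The obtain/operate/release pattern described above could look roughly like the following. All names (chan_t, chan_obtain(), chan_write_all(), chan_release()) are hypothetical, and the choice to force the descriptor into blocking mode is only one assumed way of removing the "indeterminate socket" state. In a real library the struct definition would live in the implementation file, so that callers only ever see the incomplete type.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Public view: callers only ever hold a pointer to an incomplete type. */
typedef struct chan chan_t;

/* Private definition (hypothetical); hidden from callers in a real library. */
struct chan {
    int fd;           /* descriptor owned by the caller, not closed by us */
    int saved_flags;  /* file status flags as they were at obtain time */
};

/* Obtain: wrap fd and force it into a known (blocking) state. */
chan_t *chan_obtain(int fd)
{
    chan_t *c;
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return NULL;
    c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1) {
        free(c);
        return NULL;
    }
    c->fd = fd;
    c->saved_flags = flags;
    return c;
}

/* Operate: write the whole buffer or fail; no partial-write state leaks out. */
int chan_write_all(chan_t *c, const void *buf, size_t len)
{
    const char *p = buf;

    assert(c != NULL);
    while (len > 0) {
        ssize_t n = write(c->fd, p, len);

        if (n <= 0) {
            if (n == -1 && errno == EINTR)
                continue;           /* interrupted: retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Release: restore the caller's original flags and free the wrapper. */
int chan_release(chan_t *c)
{
    int rc;

    assert(c != NULL);
    rc = fcntl(c->fd, F_SETFL, c->saved_flags);
    free(c);
    return rc == -1 ? -1 : 0;
}
```

The library then has exactly one socket state to reason about, and the assumption is visible in the interface instead of being left to the caller.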
> Any software author who writes code and does not perform reasonable
> sanity checks on the return value, particularly for something as important
> as the read and write system calls, is hanging a big sign around their
> neck saying "Kick Me I Code Worth Shit".

On the other hand, "do not test for an error condition which you
can not handle".

If as part of my rundown in a program, I go to close a file, and
the close fails, what should I do about it? Not exit? Give me a
break...

> > It bothers me too... I am used to formatting my IPC data streams. I
> > either use fixed length data units so that the receiver can post a
> > fixed size read, or I use a fix length data unit, and guarantee write
> > ordering by maintaining state. I do this in order to send a fixed
> > length header to indicate that I'm writing a variable length packet,
> > so the receiver can then issue a blocking read for the right size.
>
> I have never seen that work as expected with a large data size.

I have never seen *any* IPC transport work (reliably) with large
data sizes... depending on your definition of large.

To deal with this, you can only encapsulate the transport and handle
large sizes inside the encapsulation, or not use large data sizes in
the first place.


                                        Terry Lambert
                                        terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
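
For reference, the fixed-length-header scheme quoted above (a fixed-size header announcing a variable-length packet, then a read for exactly that size) can be sketched as follows. The 4-byte network-order length prefix and the names read_full(), write_full(), send_msg() and recv_msg() are assumptions made for illustration; the thread does not specify a wire format.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly len bytes; -1 on error or if EOF arrives mid-message. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);

        if (n == 0)
            return -1;              /* peer closed before the unit was complete */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write exactly len bytes on a blocking descriptor; -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n <= 0) {
            if (n == -1 && errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send a fixed 4-byte length header, then the variable-length payload. */
int send_msg(int fd, const void *payload, uint32_t len)
{
    uint32_t hdr = htonl(len);

    if (write_full(fd, &hdr, sizeof hdr) == -1)
        return -1;
    return write_full(fd, payload, len);
}

/* Read the header, then post a read for exactly the announced size. */
int recv_msg(int fd, void *buf, uint32_t cap, uint32_t *len_out)
{
    uint32_t hdr;

    assert(len_out != NULL);
    if (read_full(fd, &hdr, sizeof hdr) == -1)
        return -1;
    hdr = ntohl(hdr);
    if (hdr > cap)
        return -1;                  /* too large for caller; stream is now out of sync */
    if (read_full(fd, buf, hdr) == -1)
        return -1;
    *len_out = hdr;
    return 0;
}
```

The cap argument is where a "don't use large data sizes" policy would be enforced: the receiver refuses anything bigger than it is prepared to buffer.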