Date: Mon, 26 May 2003 17:19:36 -0700
From: Terry Lambert <tlambert2@mindspring.com>
To: Igor Sysoev <is@rambler-co.ru>
Cc: arch@freebsd.org
Subject: Re: sendfile(2) SF_NOPUSH flag proposal
Message-ID: <3ED2AF18.F5EB4FA5@mindspring.com>
References: <Pine.BSF.4.21.0305262140570.45552-100000@is>

Igor Sysoev wrote:
> sendfile(2) now has two drawbacks:

Only two? ;^).

> 1) it always sends the header, the file and the trailer in separate
>    packets even if their sizes allow placing them all in one packet.
>    For example the typical HTTP response header is less than an ethernet
>    packet and sendfile() sends it in the first small packet.
>
> 2) often enough it sends a 4K page in three packets: 1460, 1460 and 1176 bytes.
>
> When I turn TCP_NOPUSH on just before sendfile() then it sends the header
> and the first part of the file in one 1460 bytes packet.
> Besides it sends file pages in the full ethernet 1460 bytes packets.
> When sendfile() completed or returned EAGAIN (I use non-blocking sockets)
> I turn TCP_NOPUSH off and the remaining file part is flushed to client.
> Without turning it off the remaining file part is delayed for 5 seconds.

OK, basically what is happening is that the data is being pushed
out as it's made available, and it's being made available in
separate chunks.

The small file case is not really the optimum case for using the
sendfile interface at all.  The problem here is that you have a
send queue depth limit on the sockets, and it's expected that the
file will end up exceeding this, so it's going to get buffered
anyway, due to a buffer size limit stall on the send side of the
socket.

> So here is a proposal. We can introduce a sendfile(2) flag, i.e. SF_NOPUSH
> that will turn TF_NOPUSH on before the sending and turn it off just
> before return. It allows to save two syscalls on each sendfile() call
> and it's especially useful with non-blocking sockets - they can cause many
> sendfile() calls.

I don't see this as being terrifically useful; small files should
probably just be mapped and written; the copy expense is still
there for the headers and trailers, no matter what, and the file
size itself is very small overhead, relatively speaking, for files
small enough for this to be an issue.

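As a quick check on the packet sizes quoted above, here is a minimal
sketch of how a 4K page splits when each segment carries at most 1460
bytes.  The helper name split_segments() is hypothetical, and the 1460
byte segment size is simply the figure from the quoted message; this
only does the arithmetic and is not how the TCP stack is implemented.

```c
#include <stddef.h>

/*
 * Hypothetical helper: split `len` bytes into segments of at most `mss`
 * bytes.  Stores up to `max` segment sizes in `out` and returns the total
 * number of segments needed (0 if `mss` is 0).
 */
static size_t
split_segments(size_t len, size_t mss, size_t *out, size_t max)
{
	size_t n = 0;

	if (mss == 0)
		return (0);
	while (len > 0) {
		size_t seg = len < mss ? len : mss;

		if (n < max)
			out[n] = seg;	/* record only what fits in `out` */
		n++;
		len -= seg;
	}
	return (n);
}
```

Called with len = 4096 and mss = 1460 this yields three segments of
1460, 1460 and 1176 bytes, matching the three packets described in the
quoted text.
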
I also think your headers and trailers are very small, if they are
fitting with the file contents in a single packet.  I think this is
atypical.

On the other hand, if you want to add a flag for this, I say "knock
yourself out" -- go ahead and add the flag; it's not really going to
benefit you that much, but it's not going to really hurt any of the
rest of us either, so there's really no reason to make you not do
it.  8-).

BTW: if you go ahead with this, you should verify that it also
works for the trailers, etc., and you should probably skip it if
your headers > transmit queue depth, or file size > transmit queue
depth, or trailers > transmit queue depth.

-- Terry
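
The skip condition suggested above could be sketched as a simple
predicate.  The function name nopush_worthwhile() and its parameters
are hypothetical, and where the transmit queue depth comes from is an
assumption left to the caller (the socket's SO_SNDBUF value would be
one possible source); this is an illustration of the check, not a
proposed kernel change.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: return true only when the headers, the file and
 * the trailers each fit within the transmit queue depth.  If any one of
 * them exceeds it, the data gets buffered anyway and the no-push
 * handling should be skipped.
 */
static bool
nopush_worthwhile(size_t hdr_len, size_t file_len, size_t trl_len,
    size_t sndq_depth)
{
	return (hdr_len <= sndq_depth &&
	    file_len <= sndq_depth &&
	    trl_len <= sndq_depth);
}
```

For example, a 300 byte header and a 4096 byte file against a 32768
byte queue pass the check, while a 1 MB file against the same queue
does not.
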