Date: Tue, 27 May 2003 21:46:54 +0400 (MSD)
From: Igor Sysoev <is@rambler-co.ru>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: arch@freebsd.org
Subject: Re: sendfile(2) SF_NOPUSH flag proposal
Message-ID: <Pine.BSF.4.21.0305272137250.49494-100000@is>
In-Reply-To: <3ED38A13.524529B2@mindspring.com>

On Tue, 27 May 2003, Terry Lambert wrote:

> Igor Sysoev wrote:
> > I mean that if you have a 230-byte header then sendfile() will send it
> > in a separate packet regardless of the size of the header and of the file.
> > Something like this - 230, 1460, 1460, ...
>
> Again, see other post: this is arguably a sendfile(2) bug,
> though a really minor one; one which should be addressed in
> the sendfile(2) implementation, and doesn't need options
> added to the API in order to address it.

How do you propose to coalesce the file pages? Wire two or more pages
to mbufs at once?

BTW, I have not seen how sendfile() works over jumbo Ethernet. I suspect
that without TCP_NOPUSH it sometimes sends 4096- or 8192-byte packets
instead of 9000.

> > > > it will return me 230 bytes:
> > >
> > > The "HEAD" is atypical, compared to the "GET"; the full Google
> > > front page is larger than that, and consists of multiple files;
> > > assuming you support HTTP/1.1 and pipelining, it's going to be
> > > a back-to-back transfer involving multiple sendfile() calls.
> >
> > I use HEAD to show you the size of the HTTP header.
> > The HEAD is atypical but such a small HTTP header is typical.
>
> Here is my problem: you are arguing both amortized cost and
> total cost, depending on which is more supportive of your
> main thesis. These arguments are separate and orthogonal to
> each other: they don't support each other. You can argue
> tiny files, and a relatively high total cost, or you can argue
> large files and pipelining, and a relatively high amortized
> cost, but you can't argue both time and large files and
> many connections and one connection at the same time.

Terry, I do not understand you. My argument is simple - I want to avoid
partial packets because that decreases the number of packets.
That's all. There's nothing about amortized cost or total cost.
I do not even know what they are.
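
To make the packet-count argument concrete, here is a minimal sketch in C
of the arithmetic behind the "230, 1460, 1460, ..." example. It is only a
model, not kernel code: the function names are made up for illustration,
and it assumes that without coalescing the header always goes out in its
own segment and the file then fills segments of one MSS each.

```c
#include <assert.h>
#include <stddef.h>

/* Segments needed to carry len bytes at mss bytes per segment. */
size_t segments_for(size_t len, size_t mss)
{
    assert(mss > 0);
    return (len + mss - 1) / mss;   /* round up */
}

/*
 * Header sent on its own, file sent separately:
 * e.g. 230, 1460, 1460, ... (hypothetical model of the behavior
 * described in the message).
 */
size_t packets_separate(size_t hdr_len, size_t file_len, size_t mss)
{
    return segments_for(hdr_len, mss) + segments_for(file_len, mss);
}

/*
 * Header and file treated as one byte stream, so only the last
 * segment may be partially filled.
 */
size_t packets_coalesced(size_t hdr_len, size_t file_len, size_t mss)
{
    return segments_for(hdr_len + file_len, mss);
}
```

With a 230-byte header, a 2690-byte file and an MSS of 1460, the separate
case needs 3 packets (230, 1460, 1230) while the coalesced case needs 2
(1460, 1460). The saving is at most one packet per response in this model.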

> Personally, I'd step back and get the arguments straight,
> and get an implementation that demonstrates statistically
> significant performance differences, and then come back, if
> I wanted to press the case for additional option flags. I
> have done this several times in the past, e.g. with my soft
> interrupt coalescing implementation that's now part of most
> of the ethernet drivers people care about.
>
> Actually, in this case, I'd just try to fix sendfile(2) to
> do the packet coalescing I'd expect, given the relative
> state of the TCP_NODELAY and TCP_NOPUSH options flags.

Actually, sendfile() already works according to the TCP_NOPUSH flag.
I do not know about TCP_NODELAY - I do not work with it.
But if you turn TCP_NOPUSH on then sendfile() will send full packets.
If you turn TCP_NOPUSH off then sendfile() will send some packets
partially filled. That is correct.

> BTW: I'm still wary of the initial fault on the file data, if
> it's not already in cache: arguably, it's better to start
> sending the headers, and avoid the startup latency of delaying
> sending the headers until the fault is satisfied: part of the
> thing that's going to be eating your PCI bandwidth is the
> disk I/O, and your disks are going to be the slowest data
> sources/sinks in the whole equation.

I agree, but after all it's a delay of 20 ms or so.

> In any case, I expect that this should be handled in the
> context of TCP_NODELAY and TCP_NOPUSH, rather than by adding
> options to work around an arguably broken sendfile(2).

sendfile() already works nicely with TCP_NOPUSH. I propose only the flags
that allow turning TCP_NOPUSH (actually TF_NOPUSH) on and off inside
sendfile(). Then in one syscall you can turn TCP_NOPUSH on, send the HTTP
header and the file pages, and turn TCP_NOPUSH off once all file pages are
wired to mbufs. And this TCP_NOPUSH state is not bound to sendfile()
internals; you can control it via setsockopt/getsockopt(TCP_NOPUSH).

Igor Sysoev
http://sysoev.ru/en/
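
For comparison, the three-syscall sequence that the proposed flags would
fold into a single sendfile() call can be sketched roughly as below. This
is an illustration under assumptions, not the proposed patch: set_nopush()
and send_header_and_file() are hypothetical helper names, the fallback to
TCP_CORK on systems without TCP_NOPUSH is added only so the helper compiles
elsewhere, and partial sends on non-blocking sockets are not handled.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <errno.h>
#include <unistd.h>

/* Turn the "hold back partial segments" option on or off for socket s. */
int set_nopush(int s, int on)
{
#if defined(TCP_NOPUSH)
    return setsockopt(s, IPPROTO_TCP, TCP_NOPUSH, &on, sizeof(on));
#elif defined(TCP_CORK)
    /* Assumption: TCP_CORK used as a stand-in where TCP_NOPUSH is absent. */
    return setsockopt(s, IPPROTO_TCP, TCP_CORK, &on, sizeof(on));
#else
    (void)s; (void)on;
    errno = ENOPROTOOPT;
    return -1;
#endif
}

#if defined(__FreeBSD__)
#include <sys/uio.h>

/*
 * Send an HTTP header followed by the whole file fd over socket s,
 * with TCP_NOPUSH set for the duration of the sendfile() call.
 */
int send_header_and_file(int s, int fd, const void *hdr, size_t hdr_len)
{
    struct iovec iov = { .iov_base = (void *)hdr, .iov_len = hdr_len };
    struct sf_hdtr hdtr = { .headers = &iov, .hdr_cnt = 1,
                            .trailers = NULL, .trl_cnt = 0 };
    off_t sent = 0;

    if (set_nopush(s, 1) != 0)          /* syscall 1: TCP_NOPUSH on */
        return -1;
    /* syscall 2: header + file; nbytes == 0 means "until end of file" */
    int rc = sendfile(fd, s, 0, 0, &hdtr, &sent, 0);
    if (set_nopush(s, 0) != 0)          /* syscall 3: TCP_NOPUSH off */
        return -1;
    return rc;
}
#endif
```

The point of the proposal is that the first and third calls would become
flags on the second, while setsockopt/getsockopt(TCP_NOPUSH) remain
available for anyone who wants to control the state directly.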