From: Igor Sysoev
Date: Tue, 27 May 2003 22:36:25 +0400 (MSD)
To: Terry Lambert
Cc: arch@freebsd.org
Subject: Re: sendfile(2) SF_NOPUSH flag proposal
In-Reply-To: <3ED38C2B.DEA23AB8@mindspring.com>

On Tue, 27 May 2003, Terry Lambert wrote:

> But this call should not be necessary.  Internally, the
> sendfile(2) implementation should treat the headers +
> file contents + trailers as a single stream.  Your problem
> is that the implementation of sendfile(2) sucks and is not
> doing this, not that you need to set TCP_NOPUSH to avoid
> separation of three back-to-back transmits: you don't *have*
> three back-to-back transmits here, you have only *one*
> transmit.
>
> Would you expect a writev(2) operation to break up each of
> the chunks described by the vector into separate back-to-back
> transmits?  If not, why do you expect sendfile(2) to do it?

Yes, I agree that sendfile() should work as writev() does.

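To make the writev() comparison concrete, here is a minimal userland
sketch.  It only shows that a single writev(2) call hands header, body
and trailer to the kernel as one gathered write.  The helper name
send_gathered() and its string arguments are hypothetical, made up for
this illustration; this is not code from the FreeBSD tree or from the
proposal.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not part of the original mail):
 * pass header, body and trailer to the kernel in ONE writev(2) call,
 * so they are queued as a single stream rather than three writes.
 * Returns the number of bytes written, or -1 on error.
 */
static ssize_t
send_gathered(int fd, const char *hdr, const char *body, const char *trl)
{
	struct iovec iov[3];

	/* iov_base is not const-qualified, hence the casts. */
	iov[0].iov_base = (void *)hdr;
	iov[0].iov_len = strlen(hdr);
	iov[1].iov_base = (void *)body;
	iov[1].iov_len = strlen(body);
	iov[2].iov_base = (void *)trl;
	iov[2].iov_len = strlen(trl);

	return (writev(fd, iov, 3));
}
```

A caller would pass a connected stream socket as fd.  A short write is
still possible, so real code has to check the return value against the
total length and retry with the remainder.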
> > Turning TF_NOPUSH off has almost the same overhead as
> > setsockopt(TCP_NOPUSH, 0) if you need to call tcp_output(tp) inside
> > sendfile(2) and has no overhead at all if you do not need to call it.
>
> The problem is that you need to break tcp_output() into a
> couple of routines, OR you need to not call it on the
> headers, file data, and trailers separately.

No, tcp_output() is called only once, and only if the data in the send
buffer is less than the MSS:

    sendfile()
    {
        if (flags & SF_NOPUSH) {
            tp->t_flags |= TF_NOPUSH;
        }

        writev(header);
        send file pages;
        writev(trailer);

        if (error == 0 && (flags & SF_PUSH)) {
            tp->t_flags &= ~TF_NOPUSH;

            if (so->so_snd.sb_cc < tp->t_maxseg) {
                error = tcp_output(tp);
            }
        }
    }

> I think we can all make up our own stories, where the overhead
> could become important enough for a specific application that
> we wouldn't complain about you eliminating it so you could do
> your application, as long as it doesn't negatively impact the
> rest of us (say, by adding non-standard sendfile(2) flags that
> no one else supports, if that isn't the only possible way to
> solve the problem).

sendfile(2) is a completely non-standard thing.  Among FreeBSD, Linux,
Solaris, HP-UX and AIX, no two have even similar prototypes, and all of
them have different functionality.

> I don't think overhead is the issue, at this point: say we agree
> with you on overhead, for your particular application, and we are
> not against you solving your overhead problem: why exactly does
> the API have to change to fix the root cause of the problem?

I do not propose a change to the API; I propose a source- and
binary-compatible addition.


Igor Sysoev
http://sysoev.ru/en/