From owner-freebsd-current@FreeBSD.ORG Thu Sep 21 11:44:44 2006 Return-Path: X-Original-To: freebsd-current@FreeBSD.org Delivered-To: freebsd-current@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CD41C16A40F; Thu, 21 Sep 2006 11:44:44 +0000 (UTC) (envelope-from glebius@FreeBSD.org) Received: from cell.sick.ru (cell.sick.ru [217.72.144.68]) by mx1.FreeBSD.org (Postfix) with ESMTP id E7D4243D69; Thu, 21 Sep 2006 11:44:43 +0000 (GMT) (envelope-from glebius@FreeBSD.org) Received: from cell.sick.ru (glebius@localhost [127.0.0.1]) by cell.sick.ru (8.13.4/8.13.3) with ESMTP id k8LBiVpc024095 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 21 Sep 2006 15:44:32 +0400 (MSD) (envelope-from glebius@FreeBSD.org) Received: (from glebius@localhost) by cell.sick.ru (8.13.4/8.13.1/Submit) id k8LBiV5U024094; Thu, 21 Sep 2006 15:44:31 +0400 (MSD) (envelope-from glebius@FreeBSD.org) X-Authentication-Warning: cell.sick.ru: glebius set sender to glebius@FreeBSD.org using -f Date: Thu, 21 Sep 2006 15:44:31 +0400 From: Gleb Smirnoff To: Andre Oppermann Message-ID: <20060921114431.GF27667@FreeBSD.org> Mail-Followup-To: Gleb Smirnoff , Andre Oppermann , freebsd-current@freebsd.org, freebsd-net@freebsd.org, alc@freebsd.org, tegge@freebsd.org, gallatin@cs.duke.edu References: <4511B9B1.2000903@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline In-Reply-To: <4511B9B1.2000903@freebsd.org> User-Agent: Mutt/1.5.6i Cc: alc@FreeBSD.org, freebsd-net@FreeBSD.org, freebsd-current@FreeBSD.org, tegge@FreeBSD.org, gallatin@cs.duke.edu Subject: Re: Much improved sendfile(2) kernel implementation X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Sep 2006 11:44:44 -0000 Andre, On Wed, Sep 20, 2006 at 11:59:13PM +0200, Andre Oppermann wrote: A> I have rewritten kern_sendfile() to work in two loops, the inner which turns A> as many pages into mbufs as it can up to the free send socket buffer space. A> The outer loop then drops the whole mbuf chain into the send socket buffer, A> calls tcp_output() on it and then waits until 50% of the socket buffer are A> free again to repeat the cycle. This way tcp_output() gets the full amount A> of data to work with and can issue up to 64K sends for TSO to chop up in the A> network adapter without using any CPU cycles. Thus it gets very efficient A> especially with the readahead the VM and I/O system do. A> A> The patch is available here: A> http://people.freebsd.org/~andre/sendfile-20060920.diff I see that mbuf allocations are done in different way depending on the SS_NBIO flag on socket. This isn't true for the current code, so your patch isn't only opmization, but is also changing the semantics of the syscall. This should be avoided in single change/commit. The non-blocking IO definition is related to the blocking on the socket buffer. It isn't related to blocking on memory, so this change is incorrect: - m_header = m_uiotombuf(hdr_uio, M_DONTWAIT, 0, 0); - if (m_header == NULL) + m = m_uiotombuf(hdr_uio, (nbio ? M_NOWAIT : M_WAITOK), + 0, 0); + if (m == NULL) { + error = nbio ? EAGAIN : ENOBUFS; There should be unconditional M_NOWAIT. Oops, the M_DONTWAIT in the current code is incorrect. It is present since rev. 1.171. If the m_uiotombuf() fails the current code returns from syscall without error! Before rev. 1.171, there wasn't m_uiotombuf(), the mbuf header was allocated below, with correct wait argument. The wait argument for m_uiotombuf() should be changed to M_WAITOK, but in a separate commit. And this one: + m0 = m_get((nbio ? M_NOWAIT : M_WAITOK), MT_DATA); + if (m0 == NULL) { + error = (nbio ? EAGAIN : ENOBUFS); + sf_buf_mext((void *)sf_buf_kva(sf), sf); + break; + } This one should be M_WAITOK always. It is M_TRYWAIT (equal to M_WAITOK) in the current code. Is there RELENG_6 version of your patch. If there is one, we can test it quite extensively. -- Totus tuus, Glebius. GLEBIUS-RIPN GLEB-RIPE