From owner-freebsd-net@FreeBSD.ORG Thu Jul 8 11:58:45 2010 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1BEAB1065670 for ; Thu, 8 Jul 2010 11:58:45 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 7A9988FC1C for ; Thu, 8 Jul 2010 11:58:43 +0000 (UTC) Received: (qmail 37813 invoked from network); 8 Jul 2010 10:34:57 -0000 Received: from localhost (HELO [127.0.0.1]) ([127.0.0.1]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 8 Jul 2010 10:34:57 -0000 Message-ID: <4C35BD7A.8020408@freebsd.org> Date: Thu, 08 Jul 2010 13:58:50 +0200 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.10) Gecko/20100512 Thunderbird/3.0.5 MIME-Version: 1.0 To: Kostik Belousov References: <7C3D15DD6E8F464998CA1470D8A322F302BB9F72@ES02CO.wgti.net> <20100707205041.GO13238@deviant.kiev.zoral.com.ua> <4C358843.5000001@freebsd.org> <4C358A01.8080206@icyb.net.ua> <20100708082943.GB2439@deviant.kiev.zoral.com.ua> <4C358EE5.8070003@icyb.net.ua> <20100708094255.GA2408@deviant.kiev.zoral.com.ua> <4C35B7C4.40507@freebsd.org> <20100708114705.GC2408@deviant.kiev.zoral.com.ua> In-Reply-To: <20100708114705.GC2408@deviant.kiev.zoral.com.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: lstewart@freebsd.org, bz@freebsd.org, Ming Fu , Andriy Gapon , freebsd-net@freebsd.org Subject: Re: kern/123095 kern/131602 sendfile X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 Jul 2010 11:58:45 -0000 On 08.07.2010 13:47, Kostik Belousov wrote: > On Thu, Jul 08, 2010 at 01:34:28PM +0200, Andre Oppermann wrote: >> On 08.07.2010 11:42, Kostik Belousov wrote: >>> On Thu, Jul 08, 2010 at 11:40:05AM +0300, Andriy Gapon wrote: >>>> on 08/07/2010 11:29 Kostik Belousov said the following: >>>>> Right, the patch maps the page in sf buffer read-only (on i386 only). >>>>> But note the parallel posting with m_cat() change. It is still not >>>>> enough, >>>>> and I am not set up for the real network testing ATM. >>>> >>>> Could you also try to experiment with mb_dupcl? >>>> Namely transfer M_RDONLY from source mbuf. >>> >>> Right, it is it. >>> >>> Below is my current patch including debugging facilities that seems to >>> work. >>> Real changes that needed are in m_cat and mb_dupcl. >>> >> ... >>> diff --git a/sys/kern/uipc_mbuf.c b/sys/kern/uipc_mbuf.c >>> index f41eb03..1701ef2 100644 >>> --- a/sys/kern/uipc_mbuf.c >>> +++ b/sys/kern/uipc_mbuf.c >>> @@ -301,7 +301,7 @@ mb_dupcl(struct mbuf *n, struct mbuf *m) >>> n->m_ext.ext_size = m->m_ext.ext_size; >>> n->m_ext.ref_cnt = m->m_ext.ref_cnt; >>> n->m_ext.ext_type = m->m_ext.ext_type; >>> - n->m_flags |= M_EXT; >>> + n->m_flags |= M_EXT | (M_RDONLY& m->m_flags); >>> } >> >> Having the M_EXT flag always implies readonly and M_WRITABLE gets this >> right. Not inheriting all the flags from the source seems questionable. >> So IMHO this should be done here: >> >> n->m_flags |= (M_EXT | m->m_flags) >> >>> /* >>> @@ -911,7 +911,8 @@ m_cat(struct mbuf *m, struct mbuf *n) >>> m = m->m_next; >>> while (n) { >>> if (m->m_flags& M_EXT || >>> - m->m_data + m->m_len + n->m_len>=&m->m_dat[MLEN]) { >>> + m->m_data + m->m_len + n->m_len>=&m->m_dat[MLEN] || >>> + !M_WRITABLE(m)) { >> >> Here you can fully replace the (m->m_flags& M_EXT) test with >> M_WRITABLE(). The M_EXT test is included in it. >> >>> /* just join the two chains */ >>> m->m_next = n; >>> return; > > The patch is below. Works as well for me, thank you for the feedback. You may want to run the change to m_dupcl() by rwatson just to be sure. It should be the right thing to do but better have another mbuf expert look at it. Does this fix the sendfile corruption problem for all cases or just the panic for the loopback case? > diff --git a/sys/kern/uipc_mbuf.c b/sys/kern/uipc_mbuf.c > index f41eb03..58567a4 100644 > --- a/sys/kern/uipc_mbuf.c > +++ b/sys/kern/uipc_mbuf.c > @@ -301,7 +301,7 @@ mb_dupcl(struct mbuf *n, struct mbuf *m) > n->m_ext.ext_size = m->m_ext.ext_size; > n->m_ext.ref_cnt = m->m_ext.ref_cnt; > n->m_ext.ext_type = m->m_ext.ext_type; > - n->m_flags |= M_EXT; > + n->m_flags |= M_EXT | m->m_flags; > } > > /* > @@ -910,7 +910,7 @@ m_cat(struct mbuf *m, struct mbuf *n) > while (m->m_next) > m = m->m_next; > while (n) { > - if (m->m_flags& M_EXT || > + if (!M_WRITABLE(m) || > m->m_data + m->m_len + n->m_len>=&m->m_dat[MLEN]) { > /* just join the two chains */ > m->m_next = n; -- Andre