From owner-freebsd-net@FreeBSD.ORG Tue Mar 5 14:03:57 2013 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0075E801 for ; Tue, 5 Mar 2013 14:03:56 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 77294120 for ; Tue, 5 Mar 2013 14:03:56 +0000 (UTC) Received: (qmail 41221 invoked from network); 5 Mar 2013 15:17:45 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 5 Mar 2013 15:17:45 -0000 Message-ID: <5135FB48.1000809@freebsd.org> Date: Tue, 05 Mar 2013 15:03:52 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Lawrence Stewart Subject: Re: Bug in sbsndptr() References: <512CBADB.3050004@freebsd.org> <5134CD5D.6090107@freebsd.org> <513564AD.7000006@freebsd.org> In-Reply-To: <513564AD.7000006@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-net@freebsd.org" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Mar 2013 14:03:57 -0000 On 05.03.2013 04:21, Lawrence Stewart wrote: > On 03/05/13 03:35, Andre Oppermann wrote: >> On 26.02.2013 14:38, Lawrence Stewart wrote: >>> Hi Andre, >> >> Hi Lawrence, :-) >> >>> A colleague and I spent a very frustrating day tracing an accounting bug >>> in the multipath TCP patch we're working on at CAIA to a bug in >>> sbsndptr(). I haven't tested it with regular TCP yet, but I believe the >>> following patch fixes the bug (proposed commit log message is at the top >>> of the patch): >>> >>> http://people.freebsd.org/~lstewart/patches/misctcp/sbsndptr_mnext_10.x.r247314.diff >>> >>> >>> The patch should have no tangible effect to operation other than to >>> ensure the function delivers on the promise to return the closest mbuf >>> in the chain for the given offset. >> >> I agree that the description of sbsndptr() can be misleading as it refers >> to the point in time when the pointer was updated last. Relative to now >> the real offset may be at the beginning of the next mbuf. > > Right, and we ran into the issue because we made an assumption based on > the use of the present tense in the comment: > > "Return closest mbuf in chain for current offset." I apologize for the incorrect and misleading description. :-) >> As you note in the proposed commit message by the time the send pointer >> is calculated we may have reached the end of the chain and must avoid >> storing a NULL pointer. The mbuf copy routines simply skips over the >> additional mbuf in the chain using the returned offset. >> >> I wonder how this has caused trouble with your multipath patch. You'd >> have to copy the sockbuf contents as well and unless you're using custom >> sockbuf and mbuf chain functions this shouldn't be a problem. Using >> custom functions on a socket buffer is a delicate approach. For a sockbuf >> consumer being able to handle valid offsets into an mbuf chain is a core >> feature and must-have part of the functionality. > > No custom sockbuf or mbuf routines are in use. We've implemented a > mapping shim between subflows and the socket buffer. When a subflow asks > the multipath layer for some data to send, the multipath layer returns a > mapping onto the socket buffer, which will remain valid until such time > as the subflow has marked the mapped data as acknowledged. > > Part of the map accounting is tracking the pointer of the first mbuf in > the sockbuf where the map's data begins. Our accounting assumed the mbuf > + the offset returned by sbsndptr had data available, which is how we > triggered the problem. We could have accounted for the issue in our new > map accounting code, but that would add additional complexity to some > already complex code and the better solution is to make sbsndptr DTRT. So effectively you run a separate sbsndptr for each subflow using the real sbsndptr to track the head of the queue? /me fears the day a mptcp import comes up. tcp-complexity^^3. :-o >>> I would appreciate a review and any thoughts. >> >> I think you have found a valid (micro-)optimization. However you're >> still making a dangerous assumption in that the next mbuf is indeed >> the one you want. This may not be true in subtle ways when the chain >> contains m_len=0 mbufs in it. I'm not aware of it actually happening >> but it can't be ruled out either if custom sockbuf manipulation functions >> are in use. > > True, though I'm struggling to think why there would be m_len=0 mbufs > interspersed with m_len > 0 mbufs in a socket send buffer mbuf chain. sbcompress() doesn't allow for m_len=0 mbufs. This holds true as long as the sbappend functions are used. If not, we may get anything there. As long as nobody is using custom sockbuf appends we're safe. Because I first assumed from your description some custom sockbuf munging the guarantee wouldn't haven been there anymore. >> I'd recommend the following: >> have you custom sockbuf function handle forward seeking like the other >> m_copy() functions; and/or apply a patch along the (untested) example >> below. > > If you believe it is both correct and possible for m_len=0 mbufs to > exist in a socket buffer chain, then I agree that we should amend my > proposed patch to loop and skip over m_len=0 mbufs as you've suggested. No. So far it is neither possible or correct. > However, I'm more inclined to suspect it is undesirable and potentially > buggy behaviour to end up with m_len=0 mbufs in a socket buffer chain on > which sbsndptr is being used, and would instead suggest a > "KASSERT(ret->m_len > 0, (...));" be added to the end of my proposed if > block. Agreed. -- Andre