From owner-freebsd-net@freebsd.org  Fri Oct 13 07:59:50 2017
In-Reply-To: <59DFD3CC.2000401@xiplink.com>
References: <59567148.1020902@xiplink.com>
 <CAJ-VmomhJVbZO-G1Ki2sg5Wxrn6xL-zYU1ggoEKS-qPGuocG2g@mail.gmail.com>
 <31535133-f95a-5db6-a04c-acc0175fa287@yandex.ru>
 <59DFD3CC.2000401@xiplink.com>
From: Adrian Chadd <adrian.chadd@gmail.com>
Date: Fri, 13 Oct 2017 00:59:47 -0700
Message-ID: <CAJ-Vmo=JhFwo+7FgsZUgQMwOSimcoS8zHL+AJFONKS-+tv7Eww@mail.gmail.com>
Subject: Re: m_move_pkthdr leaves m_nextpkt 'dangling'
To: Karim Fodil-Lemelin <kfodil-lemelin@xiplink.com>,
 Gleb Smirnoff <glebius@freebsd.org>
Cc: "Andrey V. Elsukov" <bu7cher@yandex.ru>,
 FreeBSD Net <freebsd-net@freebsd.org>
Content-Type: text/plain; charset="UTF-8"

Gleb, what do you think?


-a


On 12 October 2017 at 13:42, Karim Fodil-Lemelin
<kfodil-lemelin@xiplink.com> wrote:
> On 2017-07-07 10:46 AM, Andrey V. Elsukov wrote:
>>
>> On 05.07.2017 19:23, Adrian Chadd wrote:
>>>>
>>>> As many of you know, when dealing with IP fragments the kernel will
>>>> build a list of packets (fragments) chained together through the
>>>> m_nextpkt pointer. This is all good until someone tries to do a
>>>> M_PREPEND on one of the packets in the chain and the M_PREPEND has to
>>>> create an extra mbuf to prepend at the beginning of the chain.
>>>>
>>>> When doing so m_move_pkthdr is called to copy the current PKTHDR fields
>>>> (tags and flags) to the mbuf that was prepended. The function also does:
>>>>
>>>> to->m_pkthdr = from->m_pkthdr;
>>>>
>>>> This, for the case I am interested in, essentially leaves the 'from'
>>>> mbuf with a dangling m_nextpkt pointer pointing to the next fragment.
>>>> While this is mostly harmless, because only mbufs of pkthdr type are
>>>> supposed to have m_nextpkt, it triggers some panics when running with
>>>> INVARIANTS in NetGraph (see ng_base.c :: CHECK_DATA_MBUF(m)):
>>>>
>>>> ...
>>>>                          if (n->m_nextpkt != NULL)                \
>>>>                                  panic("%s: m_nextpkt", __func__); \
>>>>                  }
>>>> ...
>>>>
>>>> So I would like to propose the following patch:
>>>>
>>>> @@ -442,10 +442,11 @@ m_move_pkthdr(struct mbuf *to, struct mbuf *from)
>>>>          if ((to->m_flags & M_EXT) == 0)
>>>>                  to->m_data = to->m_pktdat;
>>>>          to->m_pkthdr = from->m_pkthdr;          /* especially tags */
>>>>          SLIST_INIT(&from->m_pkthdr.tags);       /* purge tags from src */
>>>>          from->m_flags &= ~M_PKTHDR;
>>>> +       from->m_nextpkt = NULL;
>>>>   }
>>>>
>>>> It will reset the m_nextpkt so we don't have two mbufs pointing to the
>>>> same next packet. This is fairly harmless and solves a problem for us
>>>> here at XipLink.
>>>
>>> This seems like a no-brainer. :-) Any objections?
>>
>> I think the change is reasonable.
>> On the other hand, m_demote_pkthdr() may also need this change.
>> Maybe we can wait until Gleb is back so he can review this? He is also
>> the author of the mentioned assertion in the netgraph code.
>>
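
For anyone skimming the thread, here is a minimal user-space sketch of the
problem described above. It is not the kernel code: the toy_* names, the
trimmed struct and the clear_nextpkt switch are all made up for
illustration, and the assumption that the prepended mbuf takes over the
m_nextpkt link is modelled on the description in this thread only. The
stale-link counter is a loose analogue of the check quoted from
CHECK_DATA_MBUF and, as a simplification, only looks at mbufs behind the
head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define TOY_M_PKTHDR 0x1                /* hypothetical stand-in for M_PKTHDR */

struct toy_pkthdr {
	int len;                        /* total packet length */
};

struct toy_mbuf {                       /* heavily trimmed model of struct mbuf */
	struct toy_mbuf *m_next;        /* next buffer of the same packet */
	struct toy_mbuf *m_nextpkt;     /* next packet (fragment) in the list */
	int m_flags;
	struct toy_pkthdr m_pkthdr;
};

/* Move the packet header; clear_nextpkt models the proposed one-line fix. */
static void
toy_move_pkthdr(struct toy_mbuf *to, struct toy_mbuf *from, int clear_nextpkt)
{
	to->m_flags |= TOY_M_PKTHDR;
	to->m_pkthdr = from->m_pkthdr;
	from->m_flags &= ~TOY_M_PKTHDR;
	if (clear_nextpkt)
		from->m_nextpkt = NULL;
}

/* Put mn in front of m. Assumption: the new head inherits the list link. */
static void
toy_prepend(struct toy_mbuf *mn, struct toy_mbuf *m, int clear_nextpkt)
{
	mn->m_nextpkt = m->m_nextpkt;
	toy_move_pkthdr(mn, m, clear_nextpkt);
	mn->m_next = m;
}

/* Count mbufs behind the head that still carry an m_nextpkt link. */
static int
toy_count_stale_links(const struct toy_mbuf *head)
{
	int stale = 0;

	for (const struct toy_mbuf *n = head->m_next; n != NULL; n = n->m_next)
		if (n->m_nextpkt != NULL)
			stale++;
	return (stale);
}

/* Build a two-fragment list, prepend to the first fragment, count leftovers. */
static int
toy_demo(int clear_nextpkt)
{
	struct toy_mbuf frag2 = { 0 };
	struct toy_mbuf frag1 = { .m_nextpkt = &frag2,
	    .m_flags = TOY_M_PKTHDR, .m_pkthdr = { .len = 100 } };
	struct toy_mbuf new_head = { 0 };

	toy_prepend(&new_head, &frag1, clear_nextpkt);
	return (toy_count_stale_links(&new_head));
}

#ifdef TOY_DEMO_MAIN                    /* build with -DTOY_DEMO_MAIN to run */
int
main(void)
{
	printf("stale links without the fix: %d\n", toy_demo(0));
	printf("stale links with the fix:    %d\n", toy_demo(1));
	return (0);
}
#endif
```

In this model the old head keeps its m_nextpkt link after the header has
moved unless the extra assignment is made, which is the situation the
INVARIANTS check trips over.
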
> Hi,
>
> Any updates on this one?
>
> There is another interesting patch I would like to share. This is regarding
> the m_tag_free function pointer in the m_tag structure.
>
> As it turns out, we use this field (m_tag_free) at work to track some
> mbuf tags and, to do reference counting on them properly, we had to
> modify m_tag_copy() as follows so that the copy's m_tag_free pointer
> points to the same function as the original tag's (the code is a lot
> easier to understand than the text ...).
>
>
> @@ -437,6 +437,7 @@ m_tag_copy(struct m_tag *t, int how)
>         } else
>  #endif
>                 bcopy(t + 1, p + 1, t->m_tag_len); /* Copy the data */
> +       p->m_tag_free = t->m_tag_free; /* copy the 'free' function pointer */
>         return p;
>  }
>
> This is because m_tag_copy() uses m_tag_alloc(), which resets the
> m_tag_free pointer to m_tag_free_default. It would be nice if this could
> make its way into the mbuf tag base code.
>
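
To make the m_tag_free point concrete, a small user-space sketch follows.
Again this is not the kernel code: struct toy_tag, the toy_* functions, the
two counters and the keep_free_fn switch are hypothetical names used only
to show why a copy built on top of an allocator that installs a default
free routine ends up losing a custom one unless it is copied explicitly.

```c
#include <stdlib.h>
#include <string.h>

struct toy_tag {                        /* trimmed model of struct m_tag */
	void (*free_fn)(struct toy_tag *);
	int len;                        /* payload length; data follows */
};

static int toy_default_frees;           /* calls to the default free routine */
static int toy_custom_frees;            /* calls to the custom free routine */

static void
toy_tag_free_default(struct toy_tag *t)
{
	toy_default_frees++;
	free(t);
}

/* Stand-in for a private free routine that does its own accounting. */
static void
toy_tag_free_counted(struct toy_tag *t)
{
	toy_custom_frees++;
	free(t);
}

/* The allocator always installs the default free routine. */
static struct toy_tag *
toy_tag_alloc(int len)
{
	struct toy_tag *t;

	if (len < 0)
		return (NULL);
	t = calloc(1, sizeof(*t) + (size_t)len);
	if (t == NULL)
		return (NULL);
	t->free_fn = toy_tag_free_default;
	t->len = len;
	return (t);
}

/* Copy a tag; keep_free_fn models the proposed extra assignment. */
static struct toy_tag *
toy_tag_copy(const struct toy_tag *t, int keep_free_fn)
{
	struct toy_tag *p = toy_tag_alloc(t->len);

	if (p == NULL)
		return (NULL);
	memcpy(p + 1, t + 1, (size_t)t->len);   /* copy the payload */
	if (keep_free_fn)
		p->free_fn = t->free_fn;        /* keep the original routine */
	return (p);
}
```

With keep_free_fn set to 0 the copy is released through the default
routine, so any accounting done by the custom routine never sees it; with
it set to 1 the copy and the original are released the same way.
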
> Best regards,
>
> Karim.
>
>