From owner-freebsd-net@FreeBSD.ORG Mon Mar 23 13:40:35 2015 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C5AFA2A8 for ; Mon, 23 Mar 2015 13:40:35 +0000 (UTC) Received: from mail-ig0-x230.google.com (mail-ig0-x230.google.com [IPv6:2607:f8b0:4001:c05::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 8694DCE8 for ; Mon, 23 Mar 2015 13:40:35 +0000 (UTC) Received: by igbud6 with SMTP id ud6so42196311igb.1 for ; Mon, 23 Mar 2015 06:40:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type:content-transfer-encoding; bh=FukouFyuqlfFHkT64sZcUQRAoNi7xsPFx/Rqfgm8LoQ=; b=CEtWh54ooP0YrSIM5dkOE4HmJoqefzFqU0ACLm/tMn8mJlC5/2y8yyW8c+X13VXUZT BVz2lLmLMrqH8/CtfEz0l5jbLvloQK9serrkvAHihj1plu/Ld7s/SvLFQ0uKEK/S5qFa wM2vZ9DhQu2FAG5f0yhAD2ShvKXmZ+pU0mmq3vIcDZejh1uPR7NdAODqznG+RoWK5p44 GEPUzRBTvQOosR5R1jyKsnlpQStx9a21E16KilMqnvVz9Shdyam0HYCNm1SqDqnT15If BSFfxrDhJTC5VNxrwxtQWmtNOZHyjrbuAIayU9OwjO5dJPi4iiX5pKlQ5by5p8yYtqWj Dtgw== X-Received: by 10.50.39.65 with SMTP id n1mr14413102igk.37.1427118034886; Mon, 23 Mar 2015 06:40:34 -0700 (PDT) Received: from [10.10.1.5] ([192.252.130.194]) by mx.google.com with ESMTPSA id k4sm2986831igf.0.2015.03.23.06.40.34 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 23 Mar 2015 06:40:34 -0700 (PDT) Message-ID: <551017D1.60008@gmail.com> Date: Mon, 23 Mar 2015 09:40:33 -0400 From: Karim Fodil-Lemelin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: freebsd-net@freebsd.org Subject: Re: Another fragment question / patch References: <550C3A62.3080403@gmail.com> <550C5F6C.3080302@selasky.org> <550C7D0C.3090603@gmail.com> <550D3675.5030900@selasky.org> In-Reply-To: <550D3675.5030900@selasky.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Mar 2015 13:40:35 -0000 On 2015-03-21 5:14 AM, Hans Petter Selasky wrote: > On 03/20/15 21:03, Karim Fodil-Lemelin wrote: >> On 2015-03-20 1:57 PM, Hans Petter Selasky wrote: >>> On 03/20/15 16:18, Karim Fodil-Lemelin wrote: >>>> Hi, >>>> >>>> While reading through a previous comment on this list about fragments >>>> I've noticed that mbuf tags aren't being copied from the leading >>>> fragment (header) to the subsequent fragment packets. In other words, >>>> one would expect that all fragments of a packet are carrying the same >>>> tags that were set on the original packet. I have built a simple test >>>> were I use ipfw with ALTQ and sent large packet (bigger then MTU) off >>>> that BSD machine. I have observed that the leading fragment (m0) >>>> packet >>>> is going through the right class although the next fragments are >>>> hitting >>>> the default class for unclassified packets. >>>> >>>> Here is a patch that makes things works as expected (all fragments >>>> carry >>>> the ALTQ tag): >>>> >>>> diff --git a/freebsd/sys/netinet/ip_output.c >>>> b/freebsd/sys/netinet/ip_output.c >>>> index d650949..7d8f041 100644 >>>> --- a/freebsd/sys/netinet/ip_output.c >>>> +++ b/freebsd/sys/netinet/ip_output.c >>>> @@ -1184,7 +1184,10 @@ smart_frag_failure: >>>> ipstat.ips_odropped++; >>>> goto done; >>>> } >>>> - m->m_flags |= (m0->m_flags & M_MCAST) | M_FRAG; >>>> + >>>> + m->m_flags |= (m0->m_flags & M_COPYFLAGS) | M_FRAG; >>>> + m_tag_copy_chain(m, m0, M_NOWAIT); >>>> + >>>> /* >>>> * In the first mbuf, leave room for the link >>>> header, then >>>> * copy the original IP header including options. The >>>> payload >>>> diff --git a/freebsd/sys/sys/mbuf.h b/freebsd/sys/sys/mbuf.h >>>> index 2efff38..6ad8439 100644 >>>> --- a/freebsd/sys/sys/mbuf.h >>>> >>> >>> Hi, >>> >>> I see your point about copying the tags. I'm not sure however that >>> M_COPYFLAGS is correct, because it also copies M_RDONLY, which is not >>> relevant for this case. Can you explain what flags need copying in >>> addition to M_MCAST ? Maybe we need to define these flags separately. >>> >>> Thank you! >>> >>> --HPS >> Hi, >> >> Arguably the M_RDONLY is added when m_copy() is called a bit later in >> that function. m_copym() does a shallow copy (through a call to >> mb_dupcl) and will set the RDONLY flag when doing so. So the fact it was >> copied over from M_COPYFLAGS shouldn't be a problem in terms of >> functionality. A similar logic applies to the M_VLANTAG since it should >> never be set in ip_output() (severe layer violation). I guess >> M_COPYFLAGS is kinda safe but not necessarily correct. >> >> In terms of appropriate behavior (whats the real purpose of >> M_COPYFLAGS?) my initial patch is debatable and to answer your question >> I would consider to copy the remaining flags: >> >> M_PKTHDR => already in there through the m_gethdr() call >> M_BCAST => no need to copy >> M_MCAST => already in there in ip_fragment() >> M_PROTOFLAGS >> M_SKIP_FIREWALL => for layer 2 fire-walling? >> >> So something like? >> >> - m->m_flags |= (m0->m_flags & M_MCAST); >> + m->m_flags |= (m0->m_flags & (M_MCAST | M_PROTOFLAGS)); >> + m_tag_copy_chain(m, m0, M_NOWAIT); >> > > Hi, > > Could you have a look at the attached patch, test and give some > comments? I believe the right thing to do in this case is to use the > copy packet header function. > > --HPS > Hi, Your patch seems to work as well as mine, albeit completely swinging the other way now since your copying a lot more from the packet header with the call to m_dup_pkthdr() (including the M_COPYFLAGS, vlan tags, etc..). I think this is fine, one would expect IP fragments to be somewhat very close to the original packet. Thanks, Karim.