From owner-freebsd-net@FreeBSD.ORG Sat Mar 21 09:13:52 2015 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5350B246 for ; Sat, 21 Mar 2015 09:13:52 +0000 (UTC) Received: from mail.turbocat.net (heidi.turbocat.net [88.198.202.214]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id EA316CBF for ; Sat, 21 Mar 2015 09:13:51 +0000 (UTC) Received: from laptop015.home.selasky.org (cm-176.74.213.204.customer.telag.net [176.74.213.204]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.turbocat.net (Postfix) with ESMTPSA id A19F11FE022; Sat, 21 Mar 2015 10:13:42 +0100 (CET) Message-ID: <550D3675.5030900@selasky.org> Date: Sat, 21 Mar 2015 10:14:29 +0100 From: Hans Petter Selasky User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Karim Fodil-Lemelin , freebsd-net@freebsd.org Subject: Re: Another fragment question / patch References: <550C3A62.3080403@gmail.com> <550C5F6C.3080302@selasky.org> <550C7D0C.3090603@gmail.com> In-Reply-To: <550C7D0C.3090603@gmail.com> Content-Type: multipart/mixed; boundary="------------030400090601090601060304" X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 21 Mar 2015 09:13:52 -0000 This is a multi-part message in MIME format. --------------030400090601090601060304 Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit On 03/20/15 21:03, Karim Fodil-Lemelin wrote: > On 2015-03-20 1:57 PM, Hans Petter Selasky wrote: >> On 03/20/15 16:18, Karim Fodil-Lemelin wrote: >>> Hi, >>> >>> While reading through a previous comment on this list about fragments >>> I've noticed that mbuf tags aren't being copied from the leading >>> fragment (header) to the subsequent fragment packets. In other words, >>> one would expect that all fragments of a packet are carrying the same >>> tags that were set on the original packet. I have built a simple test >>> were I use ipfw with ALTQ and sent large packet (bigger then MTU) off >>> that BSD machine. I have observed that the leading fragment (m0) packet >>> is going through the right class although the next fragments are hitting >>> the default class for unclassified packets. >>> >>> Here is a patch that makes things works as expected (all fragments carry >>> the ALTQ tag): >>> >>> diff --git a/freebsd/sys/netinet/ip_output.c >>> b/freebsd/sys/netinet/ip_output.c >>> index d650949..7d8f041 100644 >>> --- a/freebsd/sys/netinet/ip_output.c >>> +++ b/freebsd/sys/netinet/ip_output.c >>> @@ -1184,7 +1184,10 @@ smart_frag_failure: >>> ipstat.ips_odropped++; >>> goto done; >>> } >>> - m->m_flags |= (m0->m_flags & M_MCAST) | M_FRAG; >>> + >>> + m->m_flags |= (m0->m_flags & M_COPYFLAGS) | M_FRAG; >>> + m_tag_copy_chain(m, m0, M_NOWAIT); >>> + >>> /* >>> * In the first mbuf, leave room for the link >>> header, then >>> * copy the original IP header including options. The >>> payload >>> diff --git a/freebsd/sys/sys/mbuf.h b/freebsd/sys/sys/mbuf.h >>> index 2efff38..6ad8439 100644 >>> --- a/freebsd/sys/sys/mbuf.h >>> >> >> Hi, >> >> I see your point about copying the tags. I'm not sure however that >> M_COPYFLAGS is correct, because it also copies M_RDONLY, which is not >> relevant for this case. Can you explain what flags need copying in >> addition to M_MCAST ? Maybe we need to define these flags separately. >> >> Thank you! >> >> --HPS > Hi, > > Arguably the M_RDONLY is added when m_copy() is called a bit later in > that function. m_copym() does a shallow copy (through a call to > mb_dupcl) and will set the RDONLY flag when doing so. So the fact it was > copied over from M_COPYFLAGS shouldn't be a problem in terms of > functionality. A similar logic applies to the M_VLANTAG since it should > never be set in ip_output() (severe layer violation). I guess > M_COPYFLAGS is kinda safe but not necessarily correct. > > In terms of appropriate behavior (whats the real purpose of > M_COPYFLAGS?) my initial patch is debatable and to answer your question > I would consider to copy the remaining flags: > > M_PKTHDR => already in there through the m_gethdr() call > M_BCAST => no need to copy > M_MCAST => already in there in ip_fragment() > M_PROTOFLAGS > M_SKIP_FIREWALL => for layer 2 fire-walling? > > So something like? > > - m->m_flags |= (m0->m_flags & M_MCAST); > + m->m_flags |= (m0->m_flags & (M_MCAST | M_PROTOFLAGS)); > + m_tag_copy_chain(m, m0, M_NOWAIT); > Hi, Could you have a look at the attached patch, test and give some comments? I believe the right thing to do in this case is to use the copy packet header function. --HPS --------------030400090601090601060304 Content-Type: text/x-diff; name="ip_output.c.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="ip_output.c.diff" Index: ip_output.c =================================================================== --- ip_output.c (revision 279890) +++ ip_output.c (working copy) @@ -787,11 +787,13 @@ IPSTAT_INC(ips_odropped); goto done; } - /* make sure the flowid is the same for the fragmented mbufs */ - M_HASHTYPE_SET(m, M_HASHTYPE_GET(m0)); - m->m_pkthdr.flowid = m0->m_pkthdr.flowid; - /* copy multicast flag, if any */ - m->m_flags |= (m0->m_flags & M_MCAST); + /* make sure the packet header gets copied */ + if (m_dup_pkthdr(m, m0, M_NOWAIT) == 0) { + m_free(m); + error = ENOBUFS; + IPSTAT_INC(ips_odropped); + goto done; + } /* * In the first mbuf, leave room for the link header, then * copy the original IP header including options. The payload @@ -821,11 +823,9 @@ goto done; } m->m_pkthdr.len = mhlen + len; - m->m_pkthdr.rcvif = NULL; #ifdef MAC mac_netinet_fragment(m0, m); #endif - m->m_pkthdr.csum_flags = m0->m_pkthdr.csum_flags; mhip->ip_off = htons(mhip->ip_off); mhip->ip_sum = 0; if (m->m_pkthdr.csum_flags & CSUM_IP & ~if_hwassist_flags) { --------------030400090601090601060304--