Date: Wed, 13 Jan 2016 12:23:03 -0800 From: Adrian Chadd <adrian.chadd@gmail.com> To: Karim Fodil-Lemelin <fodillemlinkarim@gmail.com> Cc: freebsd-ipfw@freebsd.org, freebsd-net <freebsd-net@freebsd.org> Subject: Re: ipfw NAT, igb and hardware checksums Message-ID: <CAJ-Vmo=ZGWYCa_v%2BkPs9vUpoYNQG4yQCxQza7Oh=q2c2K0tKNw@mail.gmail.com> In-Reply-To: <5696ABBE.4050709@gmail.com> References: <5696ABBE.4050709@gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
This looks mostly sensible. hm! -a On 13 January 2016 at 11:55, Karim Fodil-Lemelin <fodillemlinkarim@gmail.com> wrote: > Hi, > > I've hit a very interesting problem with ipfw-nat and local TCP traffic that > has enough TCP options to hit a special case in m_megapullup(). Here is the > story: > > I am using the following NIC: > > igb0@pci0:4:0:0: class=0x020000 card=0x00008086 chip=0x150e8086 > rev=0x01 hdr=0x00 > > And when I do ipfw nat to locally emitted packets I see packets not being > processed in the igb driver for HW checksum. Now a quick search for m_pullup > in the igb driver code will show that our igb driver expects a contiguous > ethernet + ip header in igb_tx_ctx_setup(). Now the friendly m_megapullup() > in alias.c doesn't reserve any space before the ip header for the ethernet > header after its call to m_getcl like tcp_output.c (see m->m_data += > max_linkhdr in tcp_output.c). > > So the call to M_PREPEND() in ether_output() is forced to prepend a new mbuf > for the ethernet header, leading to a non contiguous ether + ip. This in > turn leads to a failure to properly read the IP protocol in the igb driver > and apply the proper HW checksum function. Particularly this call in > igb_tcp_ctx_setup(): ip = (struct ip *)(mp->m_data + ehdrlen); > > To reproduce the issue I simply create a NAT rule for an igb interface and > initiate a TCP connection locally going out through that interface (it > should go through NAT obviously) something like: > > ipfw nat 1 config igb0 reset > ipfw add 10 nat 1 via igb0 > > Although you need to make sure you fill enough of the SYN packet to trigger > the allocation of new memory in m_megapullup. You can do this by using > enough TCP options so its filling up almost all of the 256 mbuf or make > RESERVE something like 300 bytes in alias.c. > > The fix I propose is very simple and faster for all drivers, including the > ones that do perform a check for ether + ip to be contiguous upon accessing > the IP header. If the leading space is available it doesn't allocate any > extra space (as it should for most cases) but if for some reason the mbuf > used doesn't have 100 bytes (RESERVE in megapullup) of free space it will > reserve some at the front too. If the leading space isn't necessary then it > won't cause any harm. > > > -Subproject commit cfe39807fe9b1a23c13f73aabde302046736fa1c > +Subproject commit cfe39807fe9b1a23c13f73aabde302046736fa1c-dirty > diff --git a/freebsd/sys/netinet/libalias/alias.c > b/freebsd/sys/netinet/libalias/alias.c > index 876e958..dc424a6 100644 > --- a/freebsd/sys/netinet/libalias/alias.c > +++ b/freebsd/sys/netinet/libalias/alias.c > @@ -1757,7 +1757,8 @@ m_megapullup(struct mbuf *m, int len) { > * writable and has some extra space for expansion. > * XXX: Constant 100bytes is completely empirical. */ > #define RESERVE 100 > - if (m->m_next == NULL && M_WRITABLE(m) && M_TRAILINGSPACE(m) >= RESERVE) > + if (m->m_next == NULL && M_WRITABLE(m) && > + M_TRAILINGSPACE(m) >= RESERVE && M_LEADINGSPACE(m) >= > max_linkhdr) > return (m); > > if (len <= MCLBYTES - RESERVE) { > @@ -1779,6 +1780,7 @@ m_megapullup(struct mbuf *m, int len) { > goto bad; > > m_move_pkthdr(mcl, m); > + mcl->m_data += max_linkhdr; > m_copydata(m, 0, len, mtod(mcl, caddr_t)); > mcl->m_len = mcl->m_pkthdr.len = len; > m_freem(m); > > It would be nice if some FBSD comitter could review and hopefully add this > patch to FBSD. > > Thank you, > > Karim. > _______________________________________________ > freebsd-net@freebsd.org mailing list > https://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-Vmo=ZGWYCa_v%2BkPs9vUpoYNQG4yQCxQza7Oh=q2c2K0tKNw>