Date:      Sun, 18 Nov 2007 14:44:09 +0900
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        Mike Andrews <mandrews@bit0.com>
Cc:        Denis Shaposhnikov <dsh@vlink.ru>, Kip Macy <kip.macy@gmail.com>, Mike Silbersack <silby@freebsd.org>, Andre Oppermann <andre@freebsd.org>, freebsd-current@freebsd.org
Subject:   Re: bizarre em + TSO + MSS issue in RELENG_7
Message-ID:  <20071118054409.GA1044@cdnetworks.co.kr>
In-Reply-To: <473FBD1A.8010207@bit0.com>
References:  <20071117003504.R31357@mindcrime.int.bit0.com> <20071117213316.499be43b@vlink.ru> <b1fa29170711171308x62a6371dnbb939748c5c59ae2@mail.gmail.com> <20071117170537.F59492@mindcrime.int.bit0.com> <b1fa29170711171519r65473426s1b9f3d9666ff6a92@mail.gmail.com> <20071117182232.T59492@mindcrime.int.bit0.com> <b1fa29170711171619x24233a3cw4361e0f3ca395e4c@mail.gmail.com> <473F9552.50402@bit0.com> <b1fa29170711171804x36e4ae51ie03d01e4bc0220ac@mail.gmail.com> <473FBD1A.8010207@bit0.com>


--BOKacYhQ+x31HxR3
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Nov 17, 2007 at 11:18:34PM -0500, Mike Andrews wrote:
 > Kip Macy wrote:
 > >On Nov 17, 2007 5:28 PM, Mike Andrews <mandrews@bit0.com> wrote:
 > >>Kip Macy wrote:
 > >>>On Nov 17, 2007 3:23 PM, Mike Andrews <mandrews@bit0.com> wrote:
 > >>>>On Sat, 17 Nov 2007, Kip Macy wrote:
 > >>>>
 > >>>>>On Nov 17, 2007 2:33 PM, Mike Andrews <mandrews@bit0.com> wrote:
 > >>>>>>On Sat, 17 Nov 2007, Kip Macy wrote:
 > >>>>>>
 > >>>>>>>On Nov 17, 2007 10:33 AM, Denis Shaposhnikov <dsh@vlink.ru> wrote:
 > >>>>>>>>On Sat, 17 Nov 2007 00:42:54 -0500 (EST)
 > >>>>>>>>Mike Andrews <mandrews@bit0.com> wrote:
 > >>>>>>>>
 > >>>>>>>>>Has anyone run into problems with MSS not being respected when 
 > >>>>>>>>>using
 > >>>>>>>>>TSO, specifically on em cards?
 > >>>>>>>>Yes, I wrote about this problem on the beginning of 2007, see
 > >>>>>>>>
 > >>>>>>>>    http://tinyurl.com/3e5ak5
 > >>>>>>>>
 > >>>>>>>if_em.c:3502
 > >>>>>>>       /*
 > >>>>>>>        * Payload size per packet w/o any headers.
 > >>>>>>>        * Length of all headers up to payload.
 > >>>>>>>        */
 > >>>>>>>       TXD->tcp_seg_setup.fields.mss = 
 > >>>>>>>       htole16(mp->m_pkthdr.tso_segsz);
 > >>>>>>>       TXD->tcp_seg_setup.fields.hdr_len = hdr_len;
 > >>>>>>>
 > >>>>>>>
 > >>>>>>>Please print out the value of tso_segsz here. It appears to be being
 > >>>>>>>set correctly. The only thing I can think of is that t_maxopd is not
 > >>>>>>>correct. As tso_segsz is correct here:
 > >>>>>>It repeatedly prints 1368 during a 1 meg file transfer over a 
 > >>>>>>connection
 > >>>>>>with a 1380 MSS.  Any other printf's I can add?  I'm working on a web 
 > >>>>>>page
 > >>>>>>with tcpdump / firewall log output illustrating the issue...
 > >>>>>Mike -
 > >>>>>Denis' tcpdump output doesn't show oversized segments, something else
 > >>>>>appears to be happening there. Can you post your tcpdump output
 > >>>>>somewhere?
 > >>>>URL sent off-list.
 > >>>       if (tso) {
 > >>>               m->m_pkthdr.csum_flags = CSUM_TSO;
 > >>>               m->m_pkthdr.tso_segsz = tp->t_maxopd - optlen;
 > >>>       }
 > >>>
 > >>>
 > >>>Please print the value of maxopd and optlen under "if (tso)" in
 > >>>tcp_output. I think the calculated optlen may be too small.
 > >>
 > >>maxopt=1380 - optlen=12 = tso_segsz=1368
 > >>
 > >>Weird though, after this reboot, I had to re-copy a 4 meg file 5 times
 > >>to start getting the firewall to log any drops.  Transfer rate was
 > >>around 240KB/sec before the firewall started to drop, then it went down
 > >>to about 64KB/sec during the 5th copy, and stayed there for subsequent
 > >>copies.  The actual packet size the firewall said it was dropping was
 > >>varying all over the place still, yet the maxopt/optlen/tso_segsz values
 > >>stayed constant.  But it's interesting that it didn't start dropping
 > >>immediately after the reboot -- though the transfer rate was still
 > >>sub-optimal.
 > >
 > >Ok, next theory :D. You shouldn't be seeing "bad len" packets from
 > >tcpdump. I'm wondering if that means you're sending down more than
 > >64k. Can you please print out the value of mp->m_pkthdr.len around the
 > >same place that you printed out tso_segsz? 64k is the generally
 > >accepted limit for TSO, I'm wondering if the card firmware does
 > >something weird if you give it more.
 > 
 > OK.  In that last message, where I said it took 5 times to start 
 > reproducing the problem... this time it took until I actually toggled 
 > TSO back off and back on again, and then it started acting up again.  I 
 > don't know what the actual trigger is... it's very weird.
 > 
 > Initially, w/ TSO on and it wasn't dropping yet (but was still 
 > transferring slow)...
 > 
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=8306
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=8306
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=8306
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=8306
 > (etc, always 8306)
 > 
 > After toggling off/on which caused the drops to start (and the speed to 
 > drop even further):
 > 
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=7507
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=3053
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1677
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=3037
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=2264
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1656
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1902
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1888
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1640
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1871
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=2461
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=1849
 > BIT0 DEBUG: tso_segsz=1368  hdr_len=66  mp->m_pkthdr.len=2092
 > 
 > and so on, with more seemingly random lengths... but none of them ever 
 > over 8306, much less 64K.

It seems that em_tso_setup() doesn't clear txd_upper/txd_lower in its
failure path, so an uninitialized value could be used in the subsequent
Tx descriptor setup.
How about clearing those variables? (Patch attached)
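
To illustrate the pattern only (this is a rough sketch, not the driver's
code): sketch_tso_setup() and sketch_encap() are made-up stand-ins, the
descriptor bit values are invented, and "stale" simply models whatever was
left in the two locals before the call.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a setup routine such as em_tso_setup():
 * on success it fills both outputs and returns 1; on failure it
 * returns 0 and leaves the outputs untouched.
 */
static int
sketch_tso_setup(int can_tso, uint32_t *txd_upper, uint32_t *txd_lower)
{
	if (!can_tso)
		return (0);		/* failure path: nothing written */
	*txd_upper = 0x0300;		/* made-up descriptor bits */
	*txd_lower = 0x0400;
	return (1);
}

/*
 * Models the caller.  "stale" stands for whatever happened to be in the
 * two locals before the call; "clear_first" selects the patched behaviour.
 * Returns the bits that would end up in the Tx descriptor.
 */
static uint32_t
sketch_encap(int can_tso, int clear_first, uint32_t stale)
{
	uint32_t txd_upper = stale, txd_lower = stale;

	if (clear_first)
		txd_upper = txd_lower = 0;
	(void)sketch_tso_setup(can_tso, &txd_upper, &txd_lower);
	return (txd_upper | txd_lower);
}
```

With the setup failing, sketch_encap(0, 0, 0xdead) hands the stale bits on
to the descriptor, while sketch_encap(0, 1, 0xdead) yields 0. That is all
the one-line patch is meant to guarantee.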

It seems that em(4) uses EM_TSO_SIZE (64K) to create the DMA tag. A packet
can have a 64K payload under TSO, so the maximum size of the mbuf
chain would be 64K + sizeof(link layer). So I guess EM_TSO_SIZE
should be increased to hold sizeof(link layer).
It has been a long time since I looked into em(4), so I'm not sure.
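
The arithmetic behind that guess, as a sketch under assumptions: the
SKETCH_* constants and both helpers are hypothetical, 64K is taken as
65536 bytes, and the link-layer header is assumed to be a plain Ethernet
header (14 bytes) with an optional 802.1Q tag (4 bytes). None of this is
taken from if_em.c.

```c
#include <stddef.h>

#define SKETCH_TSO_MAX		(64 * 1024)	/* the "64K" above; assumed value */
#define SKETCH_ETHER_HDR_LEN	14		/* dst + src + ethertype */
#define SKETCH_VLAN_ENCAP_LEN	4		/* 802.1Q tag */

/* Largest mbuf chain to expect: 64K of TSO data plus the link-layer header. */
static size_t
sketch_dma_maxsize(int with_vlan)
{
	size_t linkhdr = SKETCH_ETHER_HDR_LEN;

	if (with_vlan)
		linkhdr += SKETCH_VLAN_ENCAP_LEN;
	return (SKETCH_TSO_MAX + linkhdr);
}

/* Returns 1 if a chain of chain_len bytes fits a tag created with tag_maxsize. */
static int
sketch_fits(size_t chain_len, size_t tag_maxsize)
{
	return (chain_len <= tag_maxsize);
}
```

Under these assumptions, a full-sized chain of sketch_dma_maxsize(0) bytes
does not fit a tag sized at SKETCH_TSO_MAX, but does fit one sized at
sketch_dma_maxsize(0).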

-- 
Regards,
Pyun YongHyeon

--BOKacYhQ+x31HxR3
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="em.tso.patch"

Index: if_em.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/em/if_em.c,v
retrieving revision 1.184
diff -u -r1.184 if_em.c
--- if_em.c	10 Sep 2007 21:50:40 -0000	1.184
+++ if_em.c	18 Nov 2007 05:42:35 -0000
@@ -1791,6 +1791,7 @@
 	m_head = *m_headp;
 
 	/* Do hardware assists */
+	txd_upper = txd_lower = 0;
 	if (em_tso_setup(adapter, m_head, &txd_upper, &txd_lower))
 		/* we need to make a final sentinel transmit desc */
 		tso_desc = TRUE;

--BOKacYhQ+x31HxR3--


