Date: Sat, 17 Nov 2007 23:18:34 -0500
From: Mike Andrews <mandrews@bit0.com>
To: Kip Macy <kip.macy@gmail.com>
Cc: Denis Shaposhnikov <dsh@vlink.ru>, Mike Silbersack <silby@freebsd.org>,
    Andre Oppermann <andre@freebsd.org>, freebsd-current@freebsd.org
Subject: Re: bizarre em + TSO + MSS issue in RELENG_7
Message-ID: <473FBD1A.8010207@bit0.com>
In-Reply-To: <b1fa29170711171804x36e4ae51ie03d01e4bc0220ac@mail.gmail.com>
References: <20071117003504.R31357@mindcrime.int.bit0.com>
    <20071117213316.499be43b@vlink.ru>
    <b1fa29170711171308x62a6371dnbb939748c5c59ae2@mail.gmail.com>
    <20071117170537.F59492@mindcrime.int.bit0.com>
    <b1fa29170711171519r65473426s1b9f3d9666ff6a92@mail.gmail.com>
    <20071117182232.T59492@mindcrime.int.bit0.com>
    <b1fa29170711171619x24233a3cw4361e0f3ca395e4c@mail.gmail.com>
    <473F9552.50402@bit0.com>
    <b1fa29170711171804x36e4ae51ie03d01e4bc0220ac@mail.gmail.com>
Kip Macy wrote:
> On Nov 17, 2007 5:28 PM, Mike Andrews <mandrews@bit0.com> wrote:
>> Kip Macy wrote:
>>> On Nov 17, 2007 3:23 PM, Mike Andrews <mandrews@bit0.com> wrote:
>>>> On Sat, 17 Nov 2007, Kip Macy wrote:
>>>>
>>>>> On Nov 17, 2007 2:33 PM, Mike Andrews <mandrews@bit0.com> wrote:
>>>>>> On Sat, 17 Nov 2007, Kip Macy wrote:
>>>>>>
>>>>>>> On Nov 17, 2007 10:33 AM, Denis Shaposhnikov <dsh@vlink.ru> wrote:
>>>>>>>> On Sat, 17 Nov 2007 00:42:54 -0500 (EST)
>>>>>>>> Mike Andrews <mandrews@bit0.com> wrote:
>>>>>>>>
>>>>>>>>> Has anyone run into problems with MSS not being respected when using
>>>>>>>>> TSO, specifically on em cards?
>>>>>>>> Yes, I wrote about this problem on the beginning of 2007, see
>>>>>>>>
>>>>>>>> http://tinyurl.com/3e5ak5
>>>>>>>>
>>>>>>> if_em.c:3502
>>>>>>>         /*
>>>>>>>          * Payload size per packet w/o any headers.
>>>>>>>          * Length of all headers up to payload.
>>>>>>>          */
>>>>>>>         TXD->tcp_seg_setup.fields.mss = htole16(mp->m_pkthdr.tso_segsz);
>>>>>>>         TXD->tcp_seg_setup.fields.hdr_len = hdr_len;
>>>>>>>
>>>>>>>
>>>>>>> Please print out the value of tso_segsz here. It appears to be being
>>>>>>> set correctly. The only thing I can think of is that t_maxopd is not
>>>>>>> correct. As tso_segsz is correct here:
>>>>>> It repeatedly prints 1368 during a 1 meg file transfer over a connection
>>>>>> with a 1380 MSS. Any other printf's I can add? I'm working on a web page
>>>>>> with tcpdump / firewall log output illustrating the issue...
>>>>> Mike -
>>>>> Denis' tcpdump output doesn't show oversized segments, something else
>>>>> appears to be happening there. Can you post your tcpdump output
>>>>> somewhere?
>>>> URL sent off-list.
>>> if (tso) {
>>>         m->m_pkthdr.csum_flags = CSUM_TSO;
>>>         m->m_pkthdr.tso_segsz = tp->t_maxopd - optlen;
>>> }
>>>
>>>
>>> Please print the value of maxopd and optlen under "if (tso)" in
>>> tcp_output. I think the calculated optlen may be too small.
>>
>> maxopt=1380 - optlen=12 = tso_segsz=1368
>>
>> Weird though, after this reboot, I had to re-copy a 4 meg file 5 times
>> to start getting the firewall to log any drops. Transfer rate was
>> around 240KB/sec before the firewall started to drop, then it went down
>> to about 64KB/sec during the 5th copy, and stayed there for subsequent
>> copies. The actual packet size the firewall said it was dropping was
>> varying all over the place still, yet the maxopt/optlen/tso_segsz values
>> stayed constant. But it's interesting that it didn't start dropping
>> immediately after the reboot -- though the transfer rate was still
>> sub-optimal.
>
> Ok, next theory :D. You shouldn't be seeing "bad len" packets from
> tcpdump. I'm wondering if that means you're sending down more than
> 64k. Can you please print out the value of mp->m_pkthdr.len around the
> same place that you printed out tso_segsz? 64k is the generally
> accepted limit for TSO, I'm wondering if the card firmware does
> something weird if you give it more.

OK. In that last message, where I said it took 5 times to start
reproducing the problem... this time it took until I actually toggled
TSO back off and back on again, and then it started acting up again. I
don't know what the actual trigger is... it's very weird.

Initially, w/ TSO on and it wasn't dropping yet (but was still
transferring slow)...

BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=8306
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=8306
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=8306
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=8306
(etc, always 8306)

After toggling off/on which caused the drops to start (and the speed to
drop even further):

BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=7507
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=3053
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1677
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=3037
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=2264
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1656
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1902
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1888
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1640
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1871
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=2461
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=1849
BIT0 DEBUG: tso_segsz=1368 hdr_len=66 mp->m_pkthdr.len=2092

and so on, with more seemingly random lengths... but none of them ever
over 8306, much less 64K.
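
The logged lengths can be checked against the segment size with a little
arithmetic. What follows is a minimal userland sketch, not code from
if_em.c or tcp_output.c: `struct tso_split` and `tso_expected_split` are
hypothetical names, and it assumes the NIC cuts the payload
(m_pkthdr.len minus hdr_len) into tso_segsz-byte pieces and prepends
hdr_len bytes of headers to each piece.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; nothing like it exists in the em(4) driver. */
struct tso_split {
	uint32_t payload;      /* TCP payload bytes in the whole TSO request */
	uint32_t nsegs;        /* frames the NIC would be expected to emit */
	uint32_t last_payload; /* payload bytes carried by the final frame */
	uint32_t max_frame;    /* largest expected frame, headers included */
};

/*
 * Assumption: segmentation is a plain split of the payload into
 * tso_segsz-sized chunks, each sent with a copy of the hdr_len headers.
 */
static struct tso_split
tso_expected_split(uint32_t pkt_len, uint32_t hdr_len, uint32_t tso_segsz)
{
	struct tso_split s;

	assert(tso_segsz > 0 && pkt_len > hdr_len);

	s.payload = pkt_len - hdr_len;
	/* Round up: a short trailing chunk still needs its own frame. */
	s.nsegs = (s.payload + tso_segsz - 1) / tso_segsz;
	s.last_payload = s.payload - (s.nsegs - 1) * tso_segsz;
	/* A request smaller than one segment yields a single short frame. */
	s.max_frame = hdr_len +
	    (s.payload < tso_segsz ? s.payload : tso_segsz);
	return (s);
}
```

With the values logged above, tso_expected_split(8306, 66, 1368) gives
8240 payload bytes in 7 frames, the last carrying 32 bytes, and no frame
should exceed 66 + 1368 = 1434 bytes. If hdr_len=66 includes a 14-byte
Ethernet header (an assumption), that is a 1420-byte IP datagram, which
matches a 1380 MSS plus 40 bytes of IP and TCP headers. Anything larger
seen on the wire would suggest the segmentation is happening somewhere
other than where this sketch assumes.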