Date: Sat, 23 Oct 2010 13:10:58 +0000 (UTC) From: "Bjoern A. Zeeb" <bz@FreeBSD.org> To: Andre Oppermann <andre@FreeBSD.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r212803 - head/sys/netinet Message-ID: <20101023131021.C66242@maildrop.int.zabbadoz.net> In-Reply-To: <201009172205.o8HM5RPG043265@svn.freebsd.org> References: <201009172205.o8HM5RPG043265@svn.freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 17 Sep 2010, Andre Oppermann wrote: > Author: andre > Date: Fri Sep 17 22:05:27 2010 > New Revision: 212803 > URL: http://svn.freebsd.org/changeset/base/212803 > > Log: > Rearrange the TSO code to make it more readable and to clearly > separate the decision logic, of whether we can do TSO, and the > calculation of the burst length into two distinct parts. > > Change the way the TSO burst length calculation is done. While > TSO could do bursts of 65535 bytes that can't be represented in > ip_len together with the IP and TCP header. Account for that and > use IP_MAXPACKET instead of TCP_MAXWIN as base constant (both > have the same value of 64K). When more data is available prevent > less than MSS sized segments from being sent during the current > TSO burst. > > Add two more KASSERTs to ensure the integrity of the packets. > > Tested by: Ben Wilber <ben-at-desync com> > MFC after: 10 days As this hasn't happned yet, please do not do. It breaks things. I'll follow-up later as soon as I have more details. > Modified: > head/sys/netinet/tcp_output.c > > Modified: head/sys/netinet/tcp_output.c > ============================================================================== > --- head/sys/netinet/tcp_output.c Fri Sep 17 21:53:56 2010 (r212802) > +++ head/sys/netinet/tcp_output.c Fri Sep 17 22:05:27 2010 (r212803) > @@ -465,9 +465,8 @@ after_sack_rexmit: > } > > /* > - * Truncate to the maximum segment length or enable TCP Segmentation > - * Offloading (if supported by hardware) and ensure that FIN is removed > - * if the length no longer contains the last data byte. > + * Decide if we can use TCP Segmentation Offloading (if supported by > + * hardware). > * > * TSO may only be used if we are in a pure bulk sending state. The > * presence of TCP-MD5, SACK retransmits, SACK advertizements and > @@ -475,10 +474,6 @@ after_sack_rexmit: > * (except for the sequence number) for all generated packets. This > * makes it impossible to transmit any options which vary per generated > * segment or packet. > - * > - * The length of TSO bursts is limited to TCP_MAXWIN. That limit and > - * removal of FIN (if not already catched here) are handled later after > - * the exact length of the TCP options are known. > */ > #ifdef IPSEC > /* > @@ -487,22 +482,15 @@ after_sack_rexmit: > */ > ipsec_optlen = ipsec_hdrsiz_tcp(tp); > #endif > - if (len > tp->t_maxseg) { > - if ((tp->t_flags & TF_TSO) && V_tcp_do_tso && > - ((tp->t_flags & TF_SIGNATURE) == 0) && > - tp->rcv_numsacks == 0 && sack_rxmit == 0 && > - tp->t_inpcb->inp_options == NULL && > - tp->t_inpcb->in6p_options == NULL > + if ((tp->t_flags & TF_TSO) && V_tcp_do_tso && len > tp->t_maxseg && > + ((tp->t_flags & TF_SIGNATURE) == 0) && > + tp->rcv_numsacks == 0 && sack_rxmit == 0 && > #ifdef IPSEC > - && ipsec_optlen == 0 > + ipsec_optlen == 0 && > #endif > - ) { > - tso = 1; > - } else { > - len = tp->t_maxseg; > - sendalot = 1; > - } > - } > + tp->t_inpcb->inp_options == NULL && > + tp->t_inpcb->in6p_options == NULL) > + tso = 1; > > if (sack_rxmit) { > if (SEQ_LT(p->rxmit + len, tp->snd_una + so->so_snd.sb_cc)) > @@ -732,28 +720,53 @@ send: > * bump the packet length beyond the t_maxopd length. > * Clear the FIN bit because we cut off the tail of > * the segment. > - * > - * When doing TSO limit a burst to TCP_MAXWIN minus the > - * IP, TCP and Options length to keep ip->ip_len from > - * overflowing. Prevent the last segment from being > - * fractional thus making them all equal sized and set > - * the flag to continue sending. TSO is disabled when > - * IP options or IPSEC are present. > */ > if (len + optlen + ipoptlen > tp->t_maxopd) { > flags &= ~TH_FIN; > + > if (tso) { > - if (len > TCP_MAXWIN - hdrlen - optlen) { > - len = TCP_MAXWIN - hdrlen - optlen; > - len = len - (len % (tp->t_maxopd - optlen)); > + KASSERT(ipoptlen == 0, > + ("%s: TSO can't do IP options", __func__)); > + > + /* > + * Limit a burst to IP_MAXPACKET minus IP, > + * TCP and options length to keep ip->ip_len > + * from overflowing. > + */ > + if (len > IP_MAXPACKET - hdrlen) { > + len = IP_MAXPACKET - hdrlen; > + sendalot = 1; > + } > + > + /* > + * Prevent the last segment from being > + * fractional unless the send sockbuf can > + * be emptied. > + */ > + if (sendalot && off + len < so->so_snd.sb_cc) { > + len -= len % (tp->t_maxopd - optlen); > sendalot = 1; > - } else if (tp->t_flags & TF_NEEDFIN) > + } > + > + /* > + * Send the FIN in a separate segment > + * after the bulk sending is done. > + * We don't trust the TSO implementations > + * to clear the FIN flag on all but the > + * last segment. > + */ > + if (tp->t_flags & TF_NEEDFIN) > sendalot = 1; > + > } else { > len = tp->t_maxopd - optlen - ipoptlen; > sendalot = 1; > } > - } > + } else > + tso = 0; > + > + KASSERT(len + hdrlen + ipoptlen <= IP_MAXPACKET, > + ("%s: len > IP_MAXPACKET", __func__)); > > /*#ifdef DIAGNOSTIC*/ > #ifdef INET6 > @@ -1068,6 +1081,9 @@ send: > m->m_pkthdr.tso_segsz = tp->t_maxopd - optlen; > } > > + KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL), > + ("%s: mbuf chain shorter than expected", __func__)); > + > /* > * In transmit state, time the transmission and arrange for > * the retransmit. In persist state, just set snd_max. > -- Bjoern A. Zeeb Welcome a new stage of life. <ks> Going to jail sucks -- <bz> All my daemons like it! http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20101023131021.C66242>