Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 24 Oct 2010 19:35:34 +0000 (UTC)
From:      "Bjoern A. Zeeb" <bz@FreeBSD.org>
To:        Andre Oppermann <andre@FreeBSD.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r212803 - head/sys/netinet
Message-ID:  <20101024191523.R66242@maildrop.int.zabbadoz.net>
In-Reply-To: <20101023131021.C66242@maildrop.int.zabbadoz.net>
References:  <201009172205.o8HM5RPG043265@svn.freebsd.org> <20101023131021.C66242@maildrop.int.zabbadoz.net>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 23 Oct 2010, Bjoern A. Zeeb wrote:

> On Fri, 17 Sep 2010, Andre Oppermann wrote:
>
>> Author: andre
>> Date: Fri Sep 17 22:05:27 2010
>> New Revision: 212803
>> URL: http://svn.freebsd.org/changeset/base/212803
>> 
>> Log:
>>  Rearrange the TSO code to make it more readable and to clearly
>>  separate the decision logic, of whether we can do TSO, and the
>>  calculation of the burst length into two distinct parts.
>>
>>  Change the way the TSO burst length calculation is done. While
>>  TSO could do bursts of 65535 bytes that can't be represented in
>>  ip_len together with the IP and TCP header. Account for that and
>>  use IP_MAXPACKET instead of TCP_MAXWIN as base constant (both
>>  have the same value of 64K). When more data is available prevent
>>  less than MSS sized segments from being sent during the current
>>  TSO burst.
>>
>>  Add two more KASSERTs to ensure the integrity of the packets.


I changed the order for simplicity of explanations.

>> @@ -1068,6 +1081,9 @@ send:
>> 		m->m_pkthdr.tso_segsz = tp->t_maxopd - optlen;
>> 	}
>> 
>> +	KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL),
>> +	    ("%s: mbuf chain shorter than expected", __func__));
>> +

I have hit this KASSERT() and it was unhelpful to figure out what
was wrong in first place.  I had hit it with a TCP6 connect initially.

Adter I changed it I realized that it was due to IPSEC and not due to IPv6.
So I have two local changes now, neither of which I particularly like
but I didn't want to revert ipoptlen in the IPSEC case though it might
be the better option if INVARIANTS is on, as it's not used anymore
further on but it would obviously confusing as well.

+#ifdef IPSEC
+       KASSERT(len + hdrlen + ipoptlen - ipsec_optlen == m_length(m, NULL),
+           ("%s: mbuf chain shorter than expected: %ld + %u + %u - %u != %u",
+           __func__, len, hdrlen, ipoptlen, ipsec_optlen, m_length(m, NULL)));
+#else
         KASSERT(len + hdrlen + ipoptlen == m_length(m, NULL),
-           ("%s: mbuf chain shorter than expected", __func__));
+           ("%s: mbuf chain shorter than expected: %ld + %u + %u != %u",
+           __func__, len, hdrlen, ipoptlen, m_length(m, NULL)));
+#endif

>> 	/*
>> 	 * In transmit state, time the transmission and arrange for
>> 	 * the retransmit.  In persist state, just set snd_max.



The second worry I have is ...

>
>
>> Modified:
>>  head/sys/netinet/tcp_output.c
>> 
>> Modified: head/sys/netinet/tcp_output.c
>> ==============================================================================
>> --- head/sys/netinet/tcp_output.c	Fri Sep 17 21:53:56 2010 
>> (r212802)
>> +++ head/sys/netinet/tcp_output.c	Fri Sep 17 22:05:27 2010 
>> (r212803)
>> @@ -465,9 +465,8 @@ after_sack_rexmit:
>> 	}
>>
>> 	/*
>> -	 * Truncate to the maximum segment length or enable TCP Segmentation
>> -	 * Offloading (if supported by hardware) and ensure that FIN is 
>> removed
>> -	 * if the length no longer contains the last data byte.
>> +	 * Decide if we can use TCP Segmentation Offloading (if supported by
>> +	 * hardware).
>> 	 *
>> 	 * TSO may only be used if we are in a pure bulk sending state.  The
>> 	 * presence of TCP-MD5, SACK retransmits, SACK advertizements and
>> @@ -475,10 +474,6 @@ after_sack_rexmit:
>> 	 * (except for the sequence number) for all generated packets.  This
>> 	 * makes it impossible to transmit any options which vary per 
>> generated
>> 	 * segment or packet.
>> -	 *
>> -	 * The length of TSO bursts is limited to TCP_MAXWIN.  That limit and
>> -	 * removal of FIN (if not already catched here) are handled later 
>> after
>> -	 * the exact length of the TCP options are known.
>> 	 */
>> #ifdef IPSEC
>> 	/*
>> @@ -487,22 +482,15 @@ after_sack_rexmit:
>> 	 */
>> 	ipsec_optlen = ipsec_hdrsiz_tcp(tp);
>> #endif
>> -	if (len > tp->t_maxseg) {
>> -		if ((tp->t_flags & TF_TSO) && V_tcp_do_tso &&
>> -		    ((tp->t_flags & TF_SIGNATURE) == 0) &&
>> -		    tp->rcv_numsacks == 0 && sack_rxmit == 0 &&
>> -		    tp->t_inpcb->inp_options == NULL &&
>> -		    tp->t_inpcb->in6p_options == NULL
>> +	if ((tp->t_flags & TF_TSO) && V_tcp_do_tso && len > tp->t_maxseg &&
>> +	    ((tp->t_flags & TF_SIGNATURE) == 0) &&
>> +	    tp->rcv_numsacks == 0 && sack_rxmit == 0 &&
>> #ifdef IPSEC
>> -		    && ipsec_optlen == 0
>> +	    ipsec_optlen == 0 &&
>> #endif
>> -		    ) {
>> -			tso = 1;
>> -		} else {
>> -			len = tp->t_maxseg;
>> -			sendalot = 1;
>> -		}
>> -	}
>> +	    tp->t_inpcb->inp_options == NULL &&
>> +	    tp->t_inpcb->in6p_options == NULL)
>> +		tso = 1;

Given the fact that in the TSO case we still check for
 	len > tp->t_maxseg
there is the assumption that it can happen.

Unfortunately in the non-TSO we no longer adjust len in that case
anymore, even though len is just used the second next line and all
further on for various calculations and comparisons until the new
code comes to re-adjust it, in case of only TSO.  Now that changes
the meanings of quite a bit of code further down in the former
 	if (!TSO && len > tp->t_maxseg)
 		len = tp->t_maxseg;
case.  So why was the old code wrong and why is the new code right,
assuming it would not just be an oversight.

It seems it's just a subtle change but I fear it has non-understood
consequences given there was no Reviewed by: on the commit and no
explanation for this.  Can you explain?


>> 	if (sack_rxmit) {
>> 		if (SEQ_LT(p->rxmit + len, tp->snd_una + so->so_snd.sb_cc))

-- 
Bjoern A. Zeeb                              Welcome a new stage of life.
         <ks> Going to jail sucks -- <bz> All my daemons like it!
   http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20101024191523.R66242>