Date: Sat, 17 Aug 2002 10:49:18 -0700 (PDT) From: Nate Lawson <nate@root.org> To: Matthew Dillon <dillon@apollo.backplane.com> Cc: freebsd-hackers@FreeBSD.ORG, freebsd-net@FreeBSD.ORG Subject: Re: Commit schedule for bandwidth delay product pipeline limiting for TCP Message-ID: <Pine.BSF.4.21.0208171027280.47412-100000@root.org> In-Reply-To: <200208170233.g7H2XgqG047569@apollo.backplane.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Comments below. Consider them only semi-informed. :)
On Fri, 16 Aug 2002, Matthew Dillon wrote:
> +void
> +tcp_xmit_bandwidth_limit(struct tcpcb *tp, tcp_seq ack_seq)
> +{
> + u_long bw;
> + u_long bwnd;
> + int save_ticks;
> +
> + /*
> + * If inflight_enable is disabled in the middle of a tcp connection,
> + * make sure snd_bwnd is effectively disabled.
> + */
> + if (tcp_inflight_enable == 0) {
> + tp->snd_bwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT;
> + tp->snd_bandwidth = 0;
> + return;
> + }
> +
> + /*
> + * Figure out the bandwidth. Due to the tick granularity this
> + * is a very rough number and it MUST be averaged over a fairly
> + * long period of time.
> + */
> + save_ticks = ticks;
> + if ((u_int)(save_ticks - tp->t_bw_rtttime) < 1)
> + return;
If tp->t_bw_rtttime is greater than save_ticks, the unsigned compare will
fail. Can this ever happen (say with the "tick count goes
backwards" issue)? Why not do this as a signed compare and check for
<= 0?
> + bw = (int64_t)(ack_seq - tp->t_bw_rtseq) * hz /
> + (save_ticks - tp->t_bw_rtttime);
> + tp->t_bw_rtttime = save_ticks;
> + tp->t_bw_rtseq = ack_seq;
> + if (tp->t_bw_rtttime == 0)
> + return;
> + bw = ((int64_t)tp->snd_bandwidth * 15 + bw) >> 4;
I'm not familiar with the paper -- where does 15 come from? I'm guessing
this is a cheap way to perturb the lower bits.
> + tp->snd_bandwidth = bw;
> +
> + /*
> + * Calculate the semi-static bandwidth delay product, plus two maximal
> + * segments. The additional slop puts us squarely in the sweet
> + * spot and also handles the bandwidth run-up case. Without the
> + * slop we could be locking ourselves into a lower bandwidth.
> + *
> + * Situations Handled:
> + * (1) prevents over-queueing of packets on LANs, especially
> + * high speed LANs, allowing larger TCP buffers to be
> + * specified.
> + *
> + * (2) able to handle increased network loads (bandwidth drops
> + * so bwnd drops).
> + *
> + * (3) Randomly changes the window size in order to force
> + * bandwidth balancing between connections.
> + */
> +#define USERTT ((tp->t_srtt + tp->t_rttbest) / 2)
> + bwnd = (int64_t)bw * USERTT / (hz << TCP_RTT_SHIFT) + 2 * tp->t_maxseg;
Would prefer an #undef USERTT after you're done with it.
> + if (tcp_inflight_debug > 0) {
> + static int ltime;
What happens when multiple packets arrive at once. Can they make the
calculation below fail? Do you want splimp or a lock on ltime?
> + if ((u_int)(ticks - ltime) >= hz / tcp_inflight_debug) {
Why the division by a debug enable/disable int? Is it possible for this
to ever be 0 at this point?
> + ltime = ticks;
> + printf("%p bw %ld rttbest %d srtt %d bwnd %ld\n",
> + tp,
> + bw,
> + tp->t_rttbest,
> + tp->t_srtt,
> + bwnd
> + );
> + }
> + }
> + if ((long)bwnd < tcp_inflight_min)
> + bwnd = tcp_inflight_min;
> + if (bwnd > tcp_inflight_max)
> + bwnd = tcp_inflight_max;
> + if ((long)bwnd < tp->t_maxseg * 2)
> + bwnd = tp->t_maxseg * 2;
> + tp->snd_bwnd = bwnd;
> +}
> +
> Index: netinet/tcp_var.h
> ===================================================================
> RCS file: /home/ncvs/src/sys/netinet/tcp_var.h,v
> retrieving revision 1.82
> diff -u -r1.82 tcp_var.h
> --- netinet/tcp_var.h 19 Jul 2002 18:27:39 -0000 1.82
> +++ netinet/tcp_var.h 21 Jul 2002 02:26:36 -0000
>
> @@ -137,6 +139,9 @@
> int t_rtttime; /* round trip time */
> tcp_seq t_rtseq; /* sequence number being timed */
>
> + int t_bw_rtttime; /* used for bandwidth calculation */
Does this need to be signed?
> + tcp_seq t_bw_rtseq; /* used for bandwidth calculation */
> +
> int t_rxtcur; /* current retransmit value (ticks) */
> u_int t_maxseg; /* maximum segment size */
> int t_srtt; /* smoothed round-trip time */
> @@ -144,6 +149,7 @@
>
> int t_rxtshift; /* log(2) of rexmt exp. backoff */
> u_int t_rttmin; /* minimum rtt allowed */
> + u_int t_rttbest; /* best rtt we've seen */
> u_long t_rttupdated; /* number of times rtt sampled */
> u_long max_sndwnd; /* largest window peer has offered */
-Nate
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-net" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.4.21.0208171027280.47412-100000>
