Date: Sat, 17 Aug 2002 10:49:18 -0700 (PDT) From: Nate Lawson <nate@root.org> To: Matthew Dillon <dillon@apollo.backplane.com> Cc: freebsd-hackers@FreeBSD.ORG, freebsd-net@FreeBSD.ORG Subject: Re: Commit schedule for bandwidth delay product pipeline limiting for TCP Message-ID: <Pine.BSF.4.21.0208171027280.47412-100000@root.org> In-Reply-To: <200208170233.g7H2XgqG047569@apollo.backplane.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Comments below. Consider them only semi-informed. :) On Fri, 16 Aug 2002, Matthew Dillon wrote: > +void > +tcp_xmit_bandwidth_limit(struct tcpcb *tp, tcp_seq ack_seq) > +{ > + u_long bw; > + u_long bwnd; > + int save_ticks; > + > + /* > + * If inflight_enable is disabled in the middle of a tcp connection, > + * make sure snd_bwnd is effectively disabled. > + */ > + if (tcp_inflight_enable == 0) { > + tp->snd_bwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT; > + tp->snd_bandwidth = 0; > + return; > + } > + > + /* > + * Figure out the bandwidth. Due to the tick granularity this > + * is a very rough number and it MUST be averaged over a fairly > + * long period of time. > + */ > + save_ticks = ticks; > + if ((u_int)(save_ticks - tp->t_bw_rtttime) < 1) > + return; If tp->t_bw_rtttime is greater than save_ticks, the unsigned compare will fail. Can this ever happen (say with the "tick count goes backwards" issue)? Why not do this as a signed compare and check for <= 0? > + bw = (int64_t)(ack_seq - tp->t_bw_rtseq) * hz / > + (save_ticks - tp->t_bw_rtttime); > + tp->t_bw_rtttime = save_ticks; > + tp->t_bw_rtseq = ack_seq; > + if (tp->t_bw_rtttime == 0) > + return; > + bw = ((int64_t)tp->snd_bandwidth * 15 + bw) >> 4; I'm not familiar with the paper -- where does 15 come from? I'm guessing this is a cheap way to perturb the lower bits. > + tp->snd_bandwidth = bw; > + > + /* > + * Calculate the semi-static bandwidth delay product, plus two maximal > + * segments. The additional slop puts us squarely in the sweet > + * spot and also handles the bandwidth run-up case. Without the > + * slop we could be locking ourselves into a lower bandwidth. > + * > + * Situations Handled: > + * (1) prevents over-queueing of packets on LANs, especially > + * high speed LANs, allowing larger TCP buffers to be > + * specified. > + * > + * (2) able to handle increased network loads (bandwidth drops > + * so bwnd drops). > + * > + * (3) Randomly changes the window size in order to force > + * bandwidth balancing between connections. > + */ > +#define USERTT ((tp->t_srtt + tp->t_rttbest) / 2) > + bwnd = (int64_t)bw * USERTT / (hz << TCP_RTT_SHIFT) + 2 * tp->t_maxseg; Would prefer an #undef USERTT after you're done with it. > + if (tcp_inflight_debug > 0) { > + static int ltime; What happens when multiple packets arrive at once. Can they make the calculation below fail? Do you want splimp or a lock on ltime? > + if ((u_int)(ticks - ltime) >= hz / tcp_inflight_debug) { Why the division by a debug enable/disable int? Is it possible for this to ever be 0 at this point? > + ltime = ticks; > + printf("%p bw %ld rttbest %d srtt %d bwnd %ld\n", > + tp, > + bw, > + tp->t_rttbest, > + tp->t_srtt, > + bwnd > + ); > + } > + } > + if ((long)bwnd < tcp_inflight_min) > + bwnd = tcp_inflight_min; > + if (bwnd > tcp_inflight_max) > + bwnd = tcp_inflight_max; > + if ((long)bwnd < tp->t_maxseg * 2) > + bwnd = tp->t_maxseg * 2; > + tp->snd_bwnd = bwnd; > +} > + > Index: netinet/tcp_var.h > =================================================================== > RCS file: /home/ncvs/src/sys/netinet/tcp_var.h,v > retrieving revision 1.82 > diff -u -r1.82 tcp_var.h > --- netinet/tcp_var.h 19 Jul 2002 18:27:39 -0000 1.82 > +++ netinet/tcp_var.h 21 Jul 2002 02:26:36 -0000 > > @@ -137,6 +139,9 @@ > int t_rtttime; /* round trip time */ > tcp_seq t_rtseq; /* sequence number being timed */ > > + int t_bw_rtttime; /* used for bandwidth calculation */ Does this need to be signed? > + tcp_seq t_bw_rtseq; /* used for bandwidth calculation */ > + > int t_rxtcur; /* current retransmit value (ticks) */ > u_int t_maxseg; /* maximum segment size */ > int t_srtt; /* smoothed round-trip time */ > @@ -144,6 +149,7 @@ > > int t_rxtshift; /* log(2) of rexmt exp. backoff */ > u_int t_rttmin; /* minimum rtt allowed */ > + u_int t_rttbest; /* best rtt we've seen */ > u_long t_rttupdated; /* number of times rtt sampled */ > u_long max_sndwnd; /* largest window peer has offered */ -Nate To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-net" in the body of the message
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.4.21.0208171027280.47412-100000>