Date:      Tue, 24 Sep 2019 12:52:08 -0700
From:      Randall Stewart <rrs@netflix.com>
To:        "O. Hartmann" <ohartmann@walstatt.org>
Cc:        Randall Ray Stewart <rrs@FreeBSD.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r352657 - in head/sys: conf kern modules/tcp modules/tcp/bbr netinet netinet/tcp_stacks sys
Message-ID:  <2837AF7D-8040-491C-B9AE-9749412F9AE6@netflix.com>
In-Reply-To: <20190924212918.01e52920@thor.intern.walstatt.dynvpn.de>
References:  <201909241818.x8OIIBNr039667@repo.freebsd.org> <20190924212918.01e52920@thor.intern.walstatt.dynvpn.de>

This is strange. I built this and have it running on my machine
with the standard

make buildkernel KERNCONF=myconf
and
make installkernel KERNCONF=myconf

Why can I build it when it blows up for you? The last time I saw this I was
building in amd64/compile and was getting a warning that somehow is an error,
but this time it *should* have built fine :(

R

> On Sep 24, 2019, at 12:28 PM, O. Hartmann <ohartmann@walstatt.org> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> On Tue, 24 Sep 2019 18:18:11 +0000 (UTC)
> Randall Stewart <rrs@FreeBSD.org> wrote:
>
>> Author: rrs
>> Date: Tue Sep 24 18:18:11 2019
>> New Revision: 352657
>> URL: https://svnweb.freebsd.org/changeset/base/352657
>>
>> Log:
>>  This commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
>>  is a completely separate TCP stack (tcp_bbr.ko) that will be built only if
>>  you add the make options WITH_EXTRA_TCP_STACKS=1 and also include the option
>>  TCPHPTS. You can also include the RATELIMIT option if you have a NIC interface
>>  that supports hardware pacing; BBR understands how to use such a feature.
>>
>>  Note that this commit also adds a general-purpose time filter which
>>  allows you to have a min-filter or max-filter. A filter allows you to
>>  have a low (or high) value for some period of time and degrade slowly
>>  to another value as time passes. You can find out the details of
>>  BBR by looking at the original paper at:
>>
>>  https://queue.acm.org/detail.cfm?id=3022184
>>
>>  or consult the many other web resources you can find
>>  by searching for "BBR congestion control". It should be noted that
>>  BBRv1 (which this is) does tend to unfairness in cases of small
>>  buffered paths, and it will usually get less bandwidth in the case
>>  of large BDP paths (when competing with new-reno or cubic flows). BBR
>>  is still an active research area and we do plan on implementing V2
>>  of BBR to see if it is an improvement over V1.
>>
>>  Sponsored by:	Netflix Inc.
>>  Differential Revision:	https://reviews.freebsd.org/D21582
>>
>> Added:
>>  head/sys/kern/subr_filter.c   (contents, props changed)
>>  head/sys/modules/tcp/bbr/
>>  head/sys/modules/tcp/bbr/Makefile   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/bbr.c   (contents, props changed)
>>  head/sys/netinet/tcp_stacks/tcp_bbr.h   (contents, props changed)
>>  head/sys/sys/tim_filter.h   (contents, props changed)
>> Modified:
>>  head/sys/conf/files
>>  head/sys/modules/tcp/Makefile
>>  head/sys/netinet/ip_output.c
>>  head/sys/netinet/ip_var.h
>>  head/sys/netinet/tcp.h
>>  head/sys/netinet/tcp_stacks/rack.c
>>  head/sys/netinet/tcp_stacks/rack_bbr_common.c
>>  head/sys/netinet/tcp_stacks/rack_bbr_common.h
>>  head/sys/netinet/tcp_stacks/sack_filter.c
>>  head/sys/netinet/tcp_stacks/sack_filter.h
>>  head/sys/netinet/tcp_stacks/tcp_rack.h
>>  head/sys/sys/mbuf.h
>>
>> Modified: head/sys/conf/files
>> ==============================================================================
>> --- head/sys/conf/files	Tue Sep 24 17:06:32 2019	(r352656)
>> +++ head/sys/conf/files	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -3808,6 +3808,7 @@ kern/subr_epoch.c		standard
>> kern/subr_eventhandler.c	standard
>> kern/subr_fattime.c		standard
>> kern/subr_firmware.c		optional firmware
>> +kern/subr_filter.c              standard
>> kern/subr_gtaskqueue.c		standard
>> kern/subr_hash.c		standard
>> kern/subr_hints.c		standard
>>
>> Added: head/sys/kern/subr_filter.c
>> ==============================================================================
>> --- /dev/null	00:00:00 1970	(empty, because file is newly added)
>> +++ head/sys/kern/subr_filter.c	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -0,0 +1,482 @@
>> +/*-
>> + * Copyright (c) 2016-2019 Netflix, Inc.
>> + * All rights reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + * 1. Redistributions of source code must retain the above copyright
>> + *    notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + *    notice, this list of conditions and the following disclaimer in the
>> + *    documentation and/or other materials provided with the distribution.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
>> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
>> + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
>> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
>> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
>> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>> + * SUCH DAMAGE.
>> + */
>> +
>> +/*
>> + * Author: Randall Stewart <rrs@netflix.com>
>> + */
>> +#include <sys/cdefs.h>
>> +__FBSDID("$FreeBSD$");
>> +#include <sys/types.h>
>> +#include <sys/time.h>
>> +#include <sys/errno.h>
>> +#include <sys/tim_filter.h>
>> +
>> +void
>> +reset_time(struct time_filter *tf, uint32_t time_len)
>> +{
>> +	tf->cur_time_limit = time_len;
>> +}
>> +
>> +void
>> +reset_time_small(struct time_filter_small *tf, uint32_t time_len)
>> +{
>> +	tf->cur_time_limit = time_len;
>> +}
>> +
>> +/*
>> + * A time filter can be a filter for MIN or MAX.
>> + * You call setup_time_filter() with the pointer to
>> + * the filter structure, the type (FILTER_TYPE_MIN/MAX) and
>> + * the time length. You can optionally reset the time length
>> + * later with reset_time().
>> + *
>> + * You generally call apply_filter_xxx() to apply the new value
>> + * to the filter. You also provide a time (now). The filter will
>> + * age out entries based on the time now and your time limit
>> + * so that you are always maintaining the min or max in that
>> + * window of time. Time is a relative thing, it might be ticks
>> + * in milliseconds, it might be round trip times, it's really
>> + * up to you to decide what it is.
>> + *
>> + * To access the current filtered value you can use the macro
>> + * get_filter_value() which returns the correct entry that
>> + * has the "current" value in the filter.
>> + *
>> + * One thing that used to be here is a single apply_filter(). But
>> + * this meant that we then had to store the type of filter in
>> + * the time_filter structure. In order to keep it at a cache
>> + * line size I split it to two functions.
>> + *
>> + */
>> +int
>> +setup_time_filter(struct time_filter *tf, int fil_type, uint32_t time_len)
>> +{
>> +	uint64_t set_val;
>> +	int i;
>> +
>> +	/*
>> +	 * You must specify either a MIN or MAX filter,
>> +	 * though it's up to the user to use the correct
>> +	 * apply.
>> +	 */
>> +	if ((fil_type != FILTER_TYPE_MIN) &&
>> +	    (fil_type != FILTER_TYPE_MAX))
>> +		return(EINVAL);
>> +
>> +	if (time_len < NUM_FILTER_ENTRIES)
>> +		return(EINVAL);
>> +
>> +	if (fil_type == FILTER_TYPE_MIN)
>> +		set_val = 0xffffffffffffffff;
>> +	else
>> +		set_val = 0;
>> +
>> +	for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].value = set_val;
>> +		tf->entries[i].time_up = 0;
>> +	}
>> +	tf->cur_time_limit = time_len;
>> +	return(0);
>> +}
>> +
>> +int
>> +setup_time_filter_small(struct time_filter_small *tf, int fil_type, uint32_t time_len)
>> +{
>> +	uint32_t set_val;
>> +	int i;
>> +
>> +	/*
>> +	 * You must specify either a MIN or MAX filter,
>> +	 * though it's up to the user to use the correct
>> +	 * apply.
>> +	 */
>> +	if ((fil_type != FILTER_TYPE_MIN) &&
>> +	    (fil_type != FILTER_TYPE_MAX))
>> +		return(EINVAL);
>> +
>> +	if (time_len < NUM_FILTER_ENTRIES)
>> +		return(EINVAL);
>> +
>> +	if (fil_type == FILTER_TYPE_MIN)
>> +		set_val = 0xffffffff;
>> +	else
>> +		set_val = 0;
>> +
>> +	for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].value = set_val;
>> +		tf->entries[i].time_up = 0;
>> +	}
>> +	tf->cur_time_limit = time_len;
>> +	return(0);
>> +}
>> +
>> +
>> +static void
>> +check_update_times(struct time_filter *tf, uint64_t value, uint32_t now)
>> +{
>> +	int i, j, fnd;
>> +	uint32_t tim;
>> +	uint32_t time_limit;
>> +	for(i=0; i<(NUM_FILTER_ENTRIES-1); i++) {
>> +		tim = now - tf->entries[i].time_up;
>> +		time_limit = (tf->cur_time_limit * (NUM_FILTER_ENTRIES-i))/NUM_FILTER_ENTRIES;
>> +		if (tim >= time_limit) {
>> +			fnd = 0;
>> +			for(j=(i+1); j<NUM_FILTER_ENTRIES; j++) {
>> +				if (tf->entries[i].time_up < tf->entries[j].time_up) {
>> +					tf->entries[i].value = tf->entries[j].value;
>> +					tf->entries[i].time_up = tf->entries[j].time_up;
>> +					fnd = 1;
>> +					break;
>> +				}
>> +			}
>> +			if (fnd == 0) {
>> +				/* Nothing but the same old entry */
>> +				tf->entries[i].value = value;
>> +				tf->entries[i].time_up = now;
>> +			}
>> +		}
>> +	}
>> +	i = NUM_FILTER_ENTRIES-1;
>> +	tim = now - tf->entries[i].time_up;
>> +	time_limit = (tf->cur_time_limit * (NUM_FILTER_ENTRIES-i))/NUM_FILTER_ENTRIES;
>> +	if (tim >= time_limit) {
>> +		tf->entries[i].value = value;
>> +		tf->entries[i].time_up = now;
>> +	}
>> +}
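
A note on the expiry arithmetic in check_update_times(): each slot i gets a shrinking share of the window, and the age is an unsigned subtraction. The standalone sketch below is not part of the commit; it assumes three slots (SK_SLOTS standing in for NUM_FILTER_ENTRIES, whose value is defined in tim_filter.h and not shown here), and the sk_ names are hypothetical.

```c
#include <stdint.h>

/* Assumption: three filter slots, standing in for NUM_FILTER_ENTRIES. */
#define SK_SLOTS 3

/*
 * Expiry limit for slot i, using the same expression as the diff:
 * slot 0 gets the whole window, later slots get smaller fractions.
 */
static uint32_t
sk_slot_limit(uint32_t window, int i)
{
	return (window * (uint32_t)(SK_SLOTS - i)) / SK_SLOTS;
}

/*
 * Age of a sample. Both operands are uint32_t, so the subtraction
 * is done modulo 2^32 and still gives the elapsed time after the
 * clock wraps past zero.
 */
static uint32_t
sk_age(uint32_t now, uint32_t time_up)
{
	return now - time_up;
}
```

With a window of 30 time units this gives limits of 30, 20 and 10 for slots 0, 1 and 2, so the later slots turn over faster than the slot that holds the current filtered value.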
>> +
>> +static void
>> +check_update_times_small(struct time_filter_small *tf, uint32_t value, uint32_t now)
>> +{
>> +	int i, j, fnd;
>> +	uint32_t tim;
>> +	uint32_t time_limit;
>> +	for(i=0; i<(NUM_FILTER_ENTRIES-1); i++) {
>> +		tim = now - tf->entries[i].time_up;
>> +		time_limit = (tf->cur_time_limit * (NUM_FILTER_ENTRIES-i))/NUM_FILTER_ENTRIES;
>> +		if (tim >= time_limit) {
>> +			fnd = 0;
>> +			for(j=(i+1); j<NUM_FILTER_ENTRIES; j++) {
>> +				if (tf->entries[i].time_up < tf->entries[j].time_up) {
>> +					tf->entries[i].value = tf->entries[j].value;
>> +					tf->entries[i].time_up = tf->entries[j].time_up;
>> +					fnd = 1;
>> +					break;
>> +				}
>> +			}
>> +			if (fnd == 0) {
>> +				/* Nothing but the same old entry */
>> +				tf->entries[i].value = value;
>> +				tf->entries[i].time_up = now;
>> +			}
>> +		}
>> +	}
>> +	i = NUM_FILTER_ENTRIES-1;
>> +	tim = now - tf->entries[i].time_up;
>> +	time_limit = (tf->cur_time_limit * (NUM_FILTER_ENTRIES-i))/NUM_FILTER_ENTRIES;
>> +	if (tim >= time_limit) {
>> +		tf->entries[i].value = value;
>> +		tf->entries[i].time_up = now;
>> +	}
>> +}
>> +
>> +
>> +
>> +void
>> +filter_reduce_by(struct time_filter *tf, uint64_t reduce_by, uint32_t now)
>> +{
>> +	int i;
>> +	/*
>> +	 * Reduce our filter main by reduce_by and
>> +	 * update its time. Then walk other's and
>> +	 * make them the new value too.
>> +	 */
>> +	if (reduce_by < tf->entries[0].value)
>> +		tf->entries[0].value -= reduce_by;
>> +	else
>> +		tf->entries[0].value = 0;
>> +	tf->entries[0].time_up = now;
>> +	for(i=1; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].value = tf->entries[0].value;
>> +		tf->entries[i].time_up = now;
>> +	}
>> +}
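
The guard in filter_reduce_by() is a saturating subtraction: the value stops at zero instead of wrapping around to a huge unsigned number. A minimal standalone sketch of just that step follows; sk_saturating_sub is a hypothetical helper name, not something in the commit.

```c
#include <stdint.h>

/*
 * Subtract reduce_by from value, clamping at zero. Mirrors the
 * "if (reduce_by < value) ... else value = 0" test in the diff.
 */
static uint64_t
sk_saturating_sub(uint64_t value, uint64_t reduce_by)
{
	return (reduce_by < value) ? value - reduce_by : 0;
}
```

As far as the diff shows, filter_increase_by() has no matching guard on the way up; it adds incr_by directly.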
>> +
>> +void
>> +filter_reduce_by_small(struct time_filter_small *tf, uint32_t reduce_by, uint32_t now)
>> +{
>> +	int i;
>> +	/*
>> +	 * Reduce our filter main by reduce_by and
>> +	 * update its time. Then walk other's and
>> +	 * make them the new value too.
>> +	 */
>> +	if (reduce_by < tf->entries[0].value)
>> +		tf->entries[0].value -= reduce_by;
>> +	else
>> +		tf->entries[0].value = 0;
>> +	tf->entries[0].time_up = now;
>> +	for(i=1; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].value = tf->entries[0].value;
>> +		tf->entries[i].time_up = now;
>> +	}
>> +}
>> +
>> +void
>> +filter_increase_by(struct time_filter *tf, uint64_t incr_by, uint32_t now)
>> +{
>> +	int i;
>> +	/*
>> +	 * Increase our filter main by incr_by and
>> +	 * update its time. Then walk other's and
>> +	 * make them the new value too.
>> +	 */
>> +	tf->entries[0].value += incr_by;
>> +	tf->entries[0].time_up = now;
>> +	for(i=1; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].value = tf->entries[0].value;
>> +		tf->entries[i].time_up = now;
>> +	}
>> +}
>> +
>> +void
>> +filter_increase_by_small(struct time_filter_small *tf, uint32_t incr_by, uint32_t now)
>> +{
>> +	int i;
>> +	/*
>> +	 * Increase our filter main by incr_by and
>> +	 * update its time. Then walk other's and
>> +	 * make them the new value too.
>> +	 */
>> +	tf->entries[0].value += incr_by;
>> +	tf->entries[0].time_up = now;
>> +	for(i=1; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].value = tf->entries[0].value;
>> +		tf->entries[i].time_up = now;
>> +	}
>> +}
>> +
>> +void
>> +forward_filter_clock(struct time_filter *tf, uint32_t ticks_forward)
>> +{
>> +	/*
>> +	 * Bring forward all time values by N ticks. This
>> +	 * postpones expiring slots by that amount.
>> +	 */
>> +	int i;
>> +
>> +	for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].time_up += ticks_forward;
>> +	}
>> +}
>> +
>> +
>> +void
>> +forward_filter_clock_small(struct time_filter_small *tf, uint32_t ticks_forward)
>> +{
>> +	/*
>> +	 * Bring forward all time values by N ticks. This
>> +	 * postpones expiring slots by that amount.
>> +	 */
>> +	int i;
>> +
>> +	for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +		tf->entries[i].time_up += ticks_forward;
>> +	}
>> +}
>> +
>> +
>> +void
>> +tick_filter_clock(struct time_filter *tf, uint32_t now)
>> +{
>> +	int i;
>> +	uint32_t tim, time_limit;
>> +
>> +	/*
>> +	 * We start at two positions back. This
>> +	 * is because the oldest worst value is
>> +	 * preserved always, i.e. it can't expire
>> +	 * due to clock ticking with no updated value.
>> +	 *
>> +	 * The other choice would be to fill it in with
>> +	 * zero, but I don't like that option since
>> +	 * some measurement is better than none (even
>> +	 * if it's your oldest measurement).
>> +	 */
>> +	for(i=(NUM_FILTER_ENTRIES-2); i>=0 ; i--) {
>> +		tim = now - tf->entries[i].time_up;
>> +		time_limit = (tf->cur_time_limit * (NUM_FILTER_ENTRIES-i))/NUM_FILTER_ENTRIES;
>> +		if (tim >= time_limit) {
>> +			/*
>> +			 * This entry is expired, pull down
>> +			 * the next one up.
>> +			 */
>> +			tf->entries[i].value = tf->entries[(i+1)].value;
>> +			tf->entries[i].time_up = tf->entries[(i+1)].time_up;
>> +		}
>> +
>> +	}
>> +}
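
To make the "oldest value is preserved" point in tick_filter_clock() concrete, here is a small standalone sketch of the same pull-down loop. It is not the committed code: it assumes three slots (SK_N in place of NUM_FILTER_ENTRIES), and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Assumption: three slots, standing in for NUM_FILTER_ENTRIES. */
#define SK_N 3

struct sk_slot {
	uint64_t value;		/* sample held in this slot */
	uint32_t time_up;	/* time the sample was recorded */
};

/*
 * Advance the clock with no new sample. Walk from the second-to-last
 * slot down to slot 0; an expired slot copies its higher neighbour.
 * The last slot is never touched, so the oldest sample survives.
 */
static void
sk_tick(struct sk_slot s[SK_N], uint32_t window, uint32_t now)
{
	for (int i = SK_N - 2; i >= 0; i--) {
		uint32_t age = now - s[i].time_up;
		uint32_t limit = (window * (uint32_t)(SK_N - i)) / SK_N;

		if (age >= limit)
			s[i] = s[i + 1];	/* pull down the next one up */
	}
}
```

If every slot already holds the same old sample, ticking the clock far into the future leaves that sample in place rather than replacing it with zero.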
>> +
>> +void
>> +tick_filter_clock_small(struct time_filter_small *tf, uint32_t now)
>> +{
>> +	int i;
>> +	uint32_t tim, time_limit;
>> +
>> +	/*
>> +	 * We start at two positions back. This
>> +	 * is because the oldest worst value is
>> +	 * preserved always, i.e. it can't expire
>> +	 * due to clock ticking with no updated value.
>> +	 *
>> +	 * The other choice would be to fill it in with
>> +	 * zero, but I don't like that option since
>> +	 * some measurement is better than none (even
>> +	 * if it's your oldest measurement).
>> +	 */
>> +	for(i=(NUM_FILTER_ENTRIES-2); i>=0 ; i--) {
>> +		tim = now - tf->entries[i].time_up;
>> +		time_limit = (tf->cur_time_limit * (NUM_FILTER_ENTRIES-i))/NUM_FILTER_ENTRIES;
>> +		if (tim >= time_limit) {
>> +			/*
>> +			 * This entry is expired, pull down
>> +			 * the next one up.
>> +			 */
>> +			tf->entries[i].value = tf->entries[(i+1)].value;
>> +			tf->entries[i].time_up = tf->entries[(i+1)].time_up;
>> +		}
>> +
>> +	}
>> +}
>> +
>> +uint32_t
>> +apply_filter_min(struct time_filter *tf, uint64_t value, uint32_t now)
>> +{
>> +	int i, j;
>> +
>> +	if (value <= tf->entries[0].value) {
>> +		/* Zap them all */
>> +		for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +			tf->entries[i].value = value;
>> +			tf->entries[i].time_up = now;
>> +		}
>> +		return (tf->entries[0].value);
>> +	}
>> +	for (j=1; j<NUM_FILTER_ENTRIES; j++) {
>> +		if (value <= tf->entries[j].value) {
>> +			for(i=j; i<NUM_FILTER_ENTRIES; i++) {
>> +				tf->entries[i].value = value;
>> +				tf->entries[i].time_up = now;
>> +			}
>> +			break;
>> +		}
>> +	}
>> +	check_update_times(tf, value, now);
>> +	return (tf->entries[0].value);
>> +}
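
For anyone who wants to see the windowed-min behaviour end to end without a kernel build, the sketch below re-creates setup, apply and expiry in plain C. It is a simplified illustration, not the committed code: the struct layout and the three-slot count are assumptions (the real definitions live in tim_filter.h), all sk_ names are hypothetical, and the sketch returns a 64-bit value where apply_filter_min() in the diff is declared uint32_t.

```c
#include <stdint.h>

/* Assumption: three slots, standing in for NUM_FILTER_ENTRIES. */
#define SK_ENTRIES 3

struct sk_entry {
	uint64_t value;		/* sample held in this slot */
	uint32_t time_up;	/* time the sample was recorded */
};

struct sk_filter {
	uint32_t cur_time_limit;	/* window length, caller's units */
	struct sk_entry entries[SK_ENTRIES];
};

/* Start a min filter: every slot holds the largest possible value. */
static void
sk_setup_min(struct sk_filter *tf, uint32_t time_len)
{
	for (int i = 0; i < SK_ENTRIES; i++) {
		tf->entries[i].value = UINT64_MAX;
		tf->entries[i].time_up = 0;
	}
	tf->cur_time_limit = time_len;
}

/* Age out expired slots, following check_update_times() in the diff. */
static void
sk_check_update_times(struct sk_filter *tf, uint64_t value, uint32_t now)
{
	for (int i = 0; i < SK_ENTRIES; i++) {
		uint32_t age = now - tf->entries[i].time_up;
		uint32_t limit =
		    (tf->cur_time_limit * (uint32_t)(SK_ENTRIES - i)) / SK_ENTRIES;
		int found = 0;

		if (age < limit)
			continue;
		/* Promote the first newer sample from a later slot. */
		for (int j = i + 1; j < SK_ENTRIES; j++) {
			if (tf->entries[i].time_up < tf->entries[j].time_up) {
				tf->entries[i] = tf->entries[j];
				found = 1;
				break;
			}
		}
		if (!found) {
			/* Nothing newer: fall back to the current sample. */
			tf->entries[i].value = value;
			tf->entries[i].time_up = now;
		}
	}
}

/* Feed one sample and return the minimum over the window. */
static uint64_t
sk_apply_min(struct sk_filter *tf, uint64_t value, uint32_t now)
{
	int i, j;

	if (value <= tf->entries[0].value) {
		/* New overall minimum: overwrite every slot. */
		for (i = 0; i < SK_ENTRIES; i++) {
			tf->entries[i].value = value;
			tf->entries[i].time_up = now;
		}
		return (value);
	}
	for (j = 1; j < SK_ENTRIES; j++) {
		if (value <= tf->entries[j].value) {
			for (i = j; i < SK_ENTRIES; i++) {
				tf->entries[i].value = value;
				tf->entries[i].time_up = now;
			}
			break;
		}
	}
	sk_check_update_times(tf, value, now);
	return (tf->entries[0].value);
}
```

With a window of 30, a sample of 100 at time 100 stays the reported minimum while larger samples arrive at times 105, 112 and 125; once a sample arrives at time 131 the 100 has aged out and the filter reports 120, the smallest sample still inside the window.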
>> +
>> +uint32_t
>> +apply_filter_min_small(struct time_filter_small *tf,
>> +		       uint32_t value, uint32_t now)
>> +{
>> +	int i, j;
>> +
>> +	if (value <= tf->entries[0].value) {
>> +		/* Zap them all */
>> +		for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +			tf->entries[i].value = value;
>> +			tf->entries[i].time_up = now;
>> +		}
>> +		return (tf->entries[0].value);
>> +	}
>> +	for (j=1; j<NUM_FILTER_ENTRIES; j++) {
>> +		if (value <= tf->entries[j].value) {
>> +			for(i=j; i<NUM_FILTER_ENTRIES; i++) {
>> +				tf->entries[i].value = value;
>> +				tf->entries[i].time_up = now;
>> +			}
>> +			break;
>> +		}
>> +	}
>> +	check_update_times_small(tf, value, now);
>> +	return (tf->entries[0].value);
>> +}
>> +
>> +uint32_t
>> +apply_filter_max(struct time_filter *tf, uint64_t value, uint32_t now)
>> +{
>> +	int i, j;
>> +
>> +	if (value >= tf->entries[0].value) {
>> +		/* Zap them all */
>> +		for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +			tf->entries[i].value = value;
>> +			tf->entries[i].time_up = now;
>> +		}
>> +		return (tf->entries[0].value);
>> +	}
>> +	for (j=1; j<NUM_FILTER_ENTRIES; j++) {
>> +		if (value >= tf->entries[j].value) {
>> +			for(i=j; i<NUM_FILTER_ENTRIES; i++) {
>> +				tf->entries[i].value = value;
>> +				tf->entries[i].time_up = now;
>> +			}
>> +			break;
>> +		}
>> +	}
>> +	check_update_times(tf, value, now);
>> +	return (tf->entries[0].value);
>> +}
>> +
>> +
>> +uint32_t
>> +apply_filter_max_small(struct time_filter_small *tf,
>> +		       uint32_t value, uint32_t now)
>> +{
>> +	int i, j;
>> +
>> +	if (value >= tf->entries[0].value) {
>> +		/* Zap them all */
>> +		for(i=0; i<NUM_FILTER_ENTRIES; i++) {
>> +			tf->entries[i].value = value;
>> +			tf->entries[i].time_up = now;
>> +		}
>> +		return (tf->entries[0].value);
>> +	}
>> +	for (j=1; j<NUM_FILTER_ENTRIES; j++) {
>> +		if (value >= tf->entries[j].value) {
>> +			for(i=j; i<NUM_FILTER_ENTRIES; i++) {
>> +				tf->entries[i].value = value;
>> +				tf->entries[i].time_up = now;
>> +			}
>> +			break;
>> +		}
>> +	}
>> +	check_update_times_small(tf, value, now);
>> +	return (tf->entries[0].value);
>> +}
>>
>> Modified: head/sys/modules/tcp/Makefile
>> ==============================================================================
>> --- head/sys/modules/tcp/Makefile	Tue Sep 24 17:06:32 2019	(r352656)
>> +++ head/sys/modules/tcp/Makefile	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -6,10 +6,12 @@ SYSDIR?=${SRCTOP}/sys
>> .include "${SYSDIR}/conf/kern.opts.mk"
>>
>> SUBDIR=	\
>> +        ${_tcp_bbr} \
>>         ${_tcp_rack} \
>> 	${_tcpmd5} \
>>
>> .if ${MK_EXTRA_TCP_STACKS} != "no" || defined(ALL_MODULES)
>> +_tcp_bbr= 	bbr
>> _tcp_rack= 	rack
>> .endif
>>
>>
>> Added: head/sys/modules/tcp/bbr/Makefile
>> ==============================================================================
>> --- /dev/null	00:00:00 1970	(empty, because file is newly added)
>> +++ head/sys/modules/tcp/bbr/Makefile	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -0,0 +1,23 @@
>> +#
>> +# $FreeBSD$
>> +#
>> +
>> +.PATH: ${.CURDIR}/../../../netinet/tcp_stacks
>> +
>> +STACKNAME=	bbr
>> +KMOD=	tcp_${STACKNAME}
>> +SRCS=	bbr.c sack_filter.c rack_bbr_common.c
>> +
>> +SRCS+=	opt_inet.h opt_inet6.h opt_ipsec.h
>> +SRCS+=	opt_tcpdebug.h
>> +SRCS+=	opt_kern_tls.h
>> +
>> +#
>> +# Enable full debugging
>> +#
>> +#CFLAGS += -g
>> +
>> +CFLAGS+=	-DMODNAME=${KMOD}
>> +CFLAGS+=	-DSTACKNAME=${STACKNAME}
>> +
>> +.include <bsd.kmod.mk>
>>
>> Modified: head/sys/netinet/ip_output.c
>> ==============================================================================
>> --- head/sys/netinet/ip_output.c	Tue Sep 24 17:06:32 2019	(r352656)
>> +++ head/sys/netinet/ip_output.c	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -212,7 +212,7 @@ ip_output_pfil(struct mbuf **mp, struct ifnet *ifp, in
>>
>> static int
>> ip_output_send(struct inpcb *inp, struct ifnet *ifp, struct mbuf *m,
>> -    const struct sockaddr_in *gw, struct route *ro)
>> +    const struct sockaddr_in *gw, struct route *ro, bool stamp_tag)
>> {
>> #ifdef KERN_TLS
>> 	struct ktls_session *tls = NULL;
>> @@ -256,7 +256,7 @@ ip_output_send(struct inpcb *inp, struct ifnet *ifp, s
>> 			mst = inp->inp_snd_tag;
>> 	}
>> #endif
>> -	if (mst != NULL) {
>> +	if (stamp_tag && mst != NULL) {
>> 		KASSERT(m->m_pkthdr.rcvif == NULL,
>> 		    ("trying to add a send tag to a forwarded packet"));
>> 		if (mst->ifp != ifp) {
>> @@ -791,7 +791,8 @@ sendit:
>> 		 */
>> 		m_clrprotoflags(m);
>> 		IP_PROBE(send, NULL, NULL, ip, ifp, ip, NULL);
>> -		error = ip_output_send(inp, ifp, m, gw, ro);
>> +		error = ip_output_send(inp, ifp, m, gw, ro,
>> +		    (flags & IP_NO_SND_TAG_RL) ? false : true);
>> 		goto done;
>> 	}
>>
>> @@ -827,7 +828,7 @@ sendit:
>>
>> 			IP_PROBE(send, NULL, NULL, mtod(m, struct ip *), ifp,
>> 			    mtod(m, struct ip *), NULL);
>> -			error = ip_output_send(inp, ifp, m, gw, ro, true);
>> +			error = ip_output_send(inp, ifp, m, gw, ro, true);
>> 		} else
>> 			m_freem(m);
>> 	}
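
The new stamp_tag argument boils down to one bit test on the ip_output() flags. A tiny standalone sketch of that decision follows; the 0x80 value is taken from the ip_var.h hunk in this commit, while the SK_ macro and the helper name are hypothetical stand-ins so the snippet builds outside the kernel.

```c
#include <stdbool.h>

/* Same value as IP_NO_SND_TAG_RL in the ip_var.h hunk of this commit. */
#define SK_IP_NO_SND_TAG_RL	0x80

/*
 * Decide whether the send tag should be stamped on the mbuf.
 * Equivalent to "(flags & IP_NO_SND_TAG_RL) ? false : true".
 */
static bool
sk_should_stamp_tag(int flags)
{
	return (flags & SK_IP_NO_SND_TAG_RL) == 0;
}
```

Other flag bits do not affect the result; only the rate-limit opt-out bit suppresses the tag.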
>>
>> Modified: head/sys/netinet/ip_var.h
>> ==============================================================================
>> --- head/sys/netinet/ip_var.h	Tue Sep 24 17:06:32 2019	(r352656)
>> +++ head/sys/netinet/ip_var.h	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -166,6 +166,7 @@ void	kmod_ipstat_dec(int statnum);
>> #define IP_ROUTETOIF		SO_DONTROUTE	/* 0x10 bypass routing tables */
>> #define IP_ALLOWBROADCAST	SO_BROADCAST	/* 0x20 can send broadcast packets */
>> #define	IP_NODEFAULTFLOWID	0x40		/* Don't set the flowid from inp */
>> +#define IP_NO_SND_TAG_RL	0x80		/* Don't send down the ratelimit tag */
>>
>> #ifdef __NO_STRICT_ALIGNMENT
>> #define IP_HDR_ALIGNED_P(ip)	1
>>
>> Modified: head/sys/netinet/tcp.h
>> ==============================================================================
>> --- head/sys/netinet/tcp.h	Tue Sep 24 17:06:32 2019	(r352656)
>> +++ head/sys/netinet/tcp.h	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -239,6 +239,7 @@ struct tcphdr {
>> #define TCP_BBR_ACK_COMP_ALG   1096 	/* Not used */
>> #define TCP_BBR_TMR_PACE_OH    1096	/* Recycled in 4.2 */
>> #define TCP_BBR_EXTRA_GAIN     1097
>> +#define TCP_RACK_DO_DETECTION  1097	/* Recycle of extra gain for rack, attack detection */
>> #define TCP_BBR_RACK_RTT_USE   1098	/* what RTT should we use 0, 1, or 2? */
>> #define TCP_BBR_RETRAN_WTSO    1099
>> #define TCP_DATA_AFTER_CLOSE   1100
>>
>> Added: head/sys/netinet/tcp_stacks/bbr.c
>> ==============================================================================
>> --- /dev/null	00:00:00 1970	(empty, because file is newly added)
>> +++ head/sys/netinet/tcp_stacks/bbr.c	Tue Sep 24 18:18:11 2019	(r352657)
>> @@ -0,0 +1,15189 @@
>> +/*-
>> + * Copyright (c) 2016-2019
>> + *	Netflix Inc.
>> + *      All rights reserved.
>> + *
>> + * Redistribution and use in source and binary forms, with or without
>> + * modification, are permitted provided that the following conditions
>> + * are met:
>> + * 1. Redistributions of source code must retain the above copyright
>> + *    notice, this list of conditions and the following disclaimer.
>> + * 2. Redistributions in binary form must reproduce the above copyright
>> + *    notice, this list of conditions and the following disclaimer in the
>> + *    documentation and/or other materials provided with the distribution.
>> + *
>> + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
>> + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
>> + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
>> + * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
>> + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>> + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
>> + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>> + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
>> + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>> + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>> + * SUCH DAMAGE.
>> + *
>> + */
>> +/**
>> + * Author: Randall Stewart <rrs@netflix.com>
>> + * This work is based on the ACM Queue paper
>> + * BBR - Congestion Based Congestion Control
>> + * and also numerous discussions with Neal, Yuchung and Van.
>> + */
>> +
>> +#include <sys/cdefs.h>
>> +__FBSDID("$FreeBSD$");
>> +
>> +#include "opt_inet.h"
>> +#include "opt_inet6.h"
>> +#include "opt_ipsec.h"
>> +#include "opt_tcpdebug.h"
>> +#include "opt_ratelimit.h"
>> +#include "opt_kern_tls.h"
>> +#include <sys/param.h>
>> +#include <sys/module.h>
>> +#include <sys/kernel.h>
>> +#ifdef TCP_HHOOK
>> +#include <sys/hhook.h>
>> +#endif
>> +#include <sys/malloc.h>
>> +#include <sys/mbuf.h>
>> +#include <sys/proc.h>
>> +#include <sys/socket.h>
>> +#include <sys/socketvar.h>
>> +#ifdef KERN_TLS
>> +#include <sys/ktls.h>
>> +#endif
>> +#include <sys/sysctl.h>
>> +#include <sys/systm.h>
>> +#include <sys/qmath.h>
>> +#include <sys/tree.h>
>> +#ifdef NETFLIX_STATS
>> +#include <sys/stats.h> /* Must come after qmath.h and tree.h */
>> +#endif
>> +#include <sys/refcount.h>
>> +#include <sys/queue.h>
>> +#include <sys/eventhandler.h>
>> +#include <sys/smp.h>
>> +#include <sys/kthread.h>
>> +#include <sys/lock.h>
>> +#include <sys/mutex.h>
>> +#include <sys/tim_filter.h>
>> +#include <sys/time.h>
>> +#include <vm/uma.h>
>> +#include <sys/kern_prefetch.h>
>> +
>> +#include <net/route.h>
>> +#include <net/vnet.h>
>> +
>> +#define TCPSTATES		/* for logging */
>> +
>> +#include <netinet/in.h>
>> +#include <netinet/in_kdtrace.h>
>> +#include <netinet/in_pcb.h>
>> +#include <netinet/ip.h>
>> +#include <netinet/ip_icmp.h>	/* required for icmp_var.h */
>> +#include <netinet/icmp_var.h>	/* for ICMP_BANDLIM */
>> +#include <netinet/ip_var.h>
>> +#include <netinet/ip6.h>
>> +#include <netinet6/in6_pcb.h>
>> +#include <netinet6/ip6_var.h>
>> +#define	TCPOUTFLAGS
>> +#include <netinet/tcp.h>
>> +#include <netinet/tcp_fsm.h>
>> +#include <netinet/tcp_seq.h>
>> +#include <netinet/tcp_timer.h>
>> +#include <netinet/tcp_var.h>
>> +#include <netinet/tcpip.h>
>> +#include <netinet/tcp_hpts.h>
>> +#include <netinet/cc/cc.h>
>> +#include <netinet/tcp_log_buf.h>
>> +#include <netinet/tcp_ratelimit.h>
>> +#include <netinet/tcp_lro.h>
>> +#ifdef TCPDEBUG
>> +#include <netinet/tcp_debug.h>
>> +#endif				/* TCPDEBUG */
>> +#ifdef TCP_OFFLOAD
>> +#include <netinet/tcp_offload.h>
>> +#endif
>> +#ifdef INET6
>> +#include <netinet6/tcp6_var.h>
>> +#endif
>> +#include <netinet/tcp_fastopen.h>
>> +
>> +#include <netipsec/ipsec_support.h>
>> +#include <net/if.h>
>> +#include <net/if_var.h>
>> +#include <net/ethernet.h>
>> +
>> +#if defined(IPSEC) || defined(IPSEC_SUPPORT)
>> +#include <netipsec/ipsec.h>
>> +#include <netipsec/ipsec6.h>
>> +#endif				/* IPSEC */
>> +
>> +#include <netinet/udp.h>
>> +#include <netinet/udp_var.h>
>> +#include <machine/in_cksum.h>
>> +
>> +#ifdef MAC
>> +#include <security/mac/mac_framework.h>
>> +#endif
>> +
>> +#include "sack_filter.h"
>> +#include "tcp_bbr.h"
>> +#include "rack_bbr_common.h"
>> +uma_zone_t bbr_zone;
>> +uma_zone_t bbr_pcb_zone;
>> +
>> +struct sysctl_ctx_list bbr_sysctl_ctx;
>> +struct sysctl_oid *bbr_sysctl_root;
>> +
>> +#define	TCPT_RANGESET_NOSLOP(tv, value, tvmin, tvmax) do { \
>> +	(tv) = (value); \
>> +	if ((u_long)(tv) < (u_long)(tvmin)) \
>> +		(tv) = (tvmin); \
>> +	if ((u_long)(tv) > (u_long)(tvmax)) \
>> +		(tv) = (tvmax); \
>> +} while(0)
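
TCPT_RANGESET_NOSLOP is a plain clamp, with one wrinkle worth knowing: both comparisons are done after a cast to unsigned long. The sketch below copies the macro under a hypothetical SK_ name, spelling u_long as unsigned long so it builds outside the kernel, and wraps it in a hypothetical helper.

```c
/* Copy of the macro from the diff, with u_long written as unsigned long. */
#define SK_RANGESET_NOSLOP(tv, value, tvmin, tvmax) do { \
	(tv) = (value); \
	if ((unsigned long)(tv) < (unsigned long)(tvmin)) \
		(tv) = (tvmin); \
	if ((unsigned long)(tv) > (unsigned long)(tvmax)) \
		(tv) = (tvmax); \
} while (0)

/* Clamp value into [tvmin, tvmax] using the macro. */
static int
sk_clamp_timer(int value, int tvmin, int tvmax)
{
	int tv;

	SK_RANGESET_NOSLOP(tv, value, tvmin, tvmax);
	return (tv);
}
```

Because of the unsigned casts, a negative input compares as a very large number, so it ends up at tvmax rather than tvmin.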
>> +
>> +/*#define BBR_INVARIANT 1*/
>> +
>> +/*
>> + * initial window
>> + */
>> +static uint32_t bbr_def_init_win =3D 10;
>> +static int32_t bbr_persist_min =3D 250000;	/* 250ms */
>> +static int32_t bbr_persist_max =3D 1000000;	/* 1 Second */
>> +static int32_t bbr_cwnd_may_shrink =3D 0;
>> +static int32_t bbr_cwndtarget_rtt_touse =3D BBR_RTT_PROP;
>> +static int32_t bbr_num_pktepo_for_del_limit =3D =
BBR_NUM_RTTS_FOR_DEL_LIMIT;
>> +static int32_t bbr_hardware_pacing_limit =3D 8000;
>> +static int32_t bbr_quanta =3D 3;	/* How much extra quanta do we =
get? */
>> +static int32_t bbr_no_retran =3D 0;
>> +static int32_t bbr_tcp_map_entries_limit =3D 1500;
>> +static int32_t bbr_tcp_map_split_limit =3D 256;
>> +
>> +static int32_t bbr_error_base_paceout =3D 10000; /* usec to pace */
>> +static int32_t bbr_max_net_error_cnt =3D 10;
>> +/* Should the following be dynamic too -- loss wise */
>> +static int32_t bbr_rtt_gain_thresh =3D 0;
>> +/* Measurement controls */
>> +static int32_t bbr_use_google_algo =3D 1;
>> +static int32_t bbr_ts_limiting =3D 1;
>> +static int32_t bbr_ts_can_raise =3D 0;
>> +static int32_t bbr_do_red =3D 600;
>> +static int32_t bbr_red_scale =3D 20000;
>> +static int32_t bbr_red_mul =3D 1;
>> +static int32_t bbr_red_div =3D 2;
>> +static int32_t bbr_red_growth_restrict =3D 1;
>> +static int32_t  bbr_target_is_bbunit =3D 0;
>> +static int32_t bbr_drop_limit =3D 0;
>> +/*
>> + * How much gain do we need to see to
>> + * stay in startup?
>> + */
>> +static int32_t bbr_marks_rxt_sack_passed = 0;
>> +static int32_t bbr_start_exit = 25;
>> +static int32_t bbr_low_start_exit = 25;	/* When we are in reduced gain */
>> +static int32_t bbr_startup_loss_thresh = 2000;	/* 20.00% loss */
>> +static int32_t bbr_hptsi_max_mul = 1;	/* These two mul/div assure a min pacing */
>> +static int32_t bbr_hptsi_max_div = 2;	/* time, 0 means turned off. We need this
>> +					 * if we go back ever to where the pacer
>> +					 * has priority over timers.
>> +					 */
>> +static int32_t bbr_policer_call_from_rack_to = 0;
>> +static int32_t bbr_policer_detection_enabled = 1;
>> +static int32_t bbr_min_measurements_req = 1;	/* We need at least 2
>> +						 * measurements before we are
>> +						 * "good"; note that 2 == 1.
>> +						 * This is because we use a >
>> +						 * comparison. This means if
>> +						 * min_measure was 0, it takes
>> +						 * num-measures > min(0) and
>> +						 * you get 1 measurement and
>> +						 * you are good. Set to 1, you
>> +						 * have to have two
>> +						 * measurements (this is done
>> +						 * to prevent it from being ok
>> +						 * to have no measurements). */
>> +static int32_t bbr_no_pacing_until = 4;
>> +
>> +static int32_t bbr_min_usec_delta = 20000;	/* 20,000 usecs */
>> +static int32_t bbr_min_peer_delta = 20;		/* 20 units */
>> +static int32_t bbr_delta_percent = 150;		/* 15.0 % */
>> +
>> +static int32_t bbr_target_cwnd_mult_limit = 8;
>> +/*
>> + * bbr_cwnd_min_val is the number of
>> + * segments we hold to in the RTT probe
>> + * state, typically 4.
>> + */
>> +static int32_t bbr_cwnd_min_val = BBR_PROBERTT_NUM_MSS;
>> +
>> +
>> +static int32_t bbr_cwnd_min_val_hs = BBR_HIGHSPEED_NUM_MSS;
>> +
>> +static int32_t bbr_gain_to_target = 1;
>> +static int32_t bbr_gain_gets_extra_too = 1;
>> +/*
>> + * bbr_high_gain is the 2/ln(2) value we need
>> + * to double the sending rate in startup. This
>> + * is used for both cwnd and hptsi gains.
>> + */
>> +static int32_t bbr_high_gain = BBR_UNIT * 2885 / 1000 + 1;
>> +static int32_t bbr_startup_lower = BBR_UNIT * 1500 / 1000 + 1;
>> +static int32_t bbr_use_lower_gain_in_startup = 1;
>> +
>> +/* thresholds for reduction on drain in sub-states/drain */
>> +static int32_t bbr_drain_rtt = BBR_SRTT;
>> +static int32_t bbr_drain_floor = 88;
>> +static int32_t google_allow_early_out = 1;
>> +static int32_t google_consider_lost = 1;
>> +static int32_t bbr_drain_drop_mul = 4;
>> +static int32_t bbr_drain_drop_div = 5;
>> +static int32_t bbr_rand_ot = 50;
>> +static int32_t bbr_can_force_probertt = 0;
>> +static int32_t bbr_can_adjust_probertt = 1;
>> +static int32_t bbr_probertt_sets_rtt = 0;
>> +static int32_t bbr_can_use_ts_for_rtt = 1;
>> +static int32_t bbr_is_ratio = 0;
>> +static int32_t bbr_sub_drain_app_limit = 1;
>> +static int32_t bbr_prtt_slam_cwnd = 1;
>> +static int32_t bbr_sub_drain_slam_cwnd = 1;
>> +static int32_t bbr_slam_cwnd_in_main_drain = 1;
>> +static int32_t bbr_filter_len_sec = 6;	/* How long does the rttProp filter
>> +					 * hold */
>> +static uint32_t bbr_rtt_probe_limit = (USECS_IN_SECOND * 4);
>> +/*
>> + * bbr_drain_gain is the reverse of the high_gain
>> + * designed to drain back out the standing queue
>> + * that is formed in startup by causing a larger
>> + * hptsi gain and thus draining the packets
>> + * in flight.
>> + */
>> +static int32_t bbr_drain_gain = BBR_UNIT * 1000 / 2885;
>> +static int32_t bbr_rttprobe_gain = 192;
>> +
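The gain constants quoted above are fixed-point fractions of BBR_UNIT, with 2885/1000 standing in for 2/ln(2), which is roughly 2.885. BBR_UNIT itself is not defined in the quoted excerpt, so the sketch below assumes 256 purely for illustration (EXAMPLE_BBR_UNIT). That assumption is at least consistent with the excerpt: it makes the drain gain come out at 88, the same number as bbr_drain_floor, and it makes bbr_rttprobe_gain = 192 read as 0.75. All function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1.0 in fixed point. Not taken from the quoted diff. */
#define	EXAMPLE_BBR_UNIT	256

/* Startup gain, same integer expression as bbr_high_gain. */
static int32_t
example_high_gain(int32_t unit)
{
	assert(unit > 0);
	return (unit * 2885 / 1000 + 1);
}

/* Drain gain, same integer expression as bbr_drain_gain. */
static int32_t
example_drain_gain(int32_t unit)
{
	assert(unit > 0);
	return (unit * 1000 / 2885);
}

/* Scale a rate (or a cwnd) by a fixed-point gain. */
static uint64_t
example_apply_gain(uint64_t rate, int32_t gain, int32_t unit)
{
	assert(gain >= 0 && unit > 0);
	return (rate * (uint64_t)gain / (uint64_t)unit);
}
```

Under that assumed unit the integer division truncates: 256 * 2885 / 1000 is 738, so the trailing "+ 1" rounds the startup gain up to 739, while the drain gain truncates down to 88.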
>> +/*
>> + * The cwnd_gain is the default cwnd gain applied when
>> + * calculating a target cwnd. Note that the cwnd is
>> + * a secondary factor in the way BBR works (see the
>> + * paper and think about it, it will take some time).
>> + * Basically the hptsi_gain spreads the packets out
>> + * so you never get more than BDP to the peer even
>> + * if the cwnd is high. In our implementation that
>> + * means in non-recovery/retransmission scenarios
>> + * cwnd will never be reached by the flight-size.
>> + */
>> +static int32_t bbr_cwnd_gain = BBR_UNIT * 2;
>> +static int32_t bbr_tlp_type_to_use = BBR_SRTT;
>> +static int32_t bbr_delack_time = 100000;	/* 100ms in usecs */
>> +static int32_t bbr_sack_not_required = 0;	/* set to one to allow non-sack to use bbr */
>> +static int32_t bbr_initial_bw_bps = 62500;	/* 500kbps in bytes ps */
>> +static int32_t bbr_ignore_data_after_close = 1;
>> +static int16_t bbr_hptsi_gain[] = {
>> +	(BBR_UNIT * 5 / 4),
>> +	(BBR_UNIT * 3 / 4),
>> +	BBR_UNIT,
>> +	BBR_UNIT,
>> +	BBR_UNIT,
>> +	BBR_UNIT,
>> +	BBR_UNIT,
>> +	BBR_UNIT
>> +};
>> +int32_t bbr_use_rack_resend_cheat = 1;
>> +int32_t bbr_sends_full_iwnd = 1;
>> +
>> +#define BBR_HPTSI_GAIN_MAX 8
>> +/*
>> + * The BBR module incorporates a number of
>> + * TCP ideas that have been put out into the IETF
>> + * over the last few years:
>> + * - Yuchung Cheng's RACK TCP (for which it's named) that
>> + *    will stop us using the number of dup acks and instead
>> + *    use time as the gauge of when we retransmit.
>> + * - Reorder Detection of RFC4737 and the Tail-Loss probe draft
>> + *    of Dukkipati et al.
>> + * - Van Jacobson et al.'s BBR.
>> + *
>> + * RACK depends on SACK, so if an endpoint arrives that
>> + * cannot do SACK the state machine below will shuttle the
>> + * connection back to using the "default" TCP stack that is
>> + * in FreeBSD.
>> + *
>> + * To implement BBR and RACK the original TCP stack was first decomposed
>> + * into a functional state machine with individual states
>> + * for each of the possible TCP connection states. The do_segment
>> + * function's role in life is to mandate the connection supports SACK
>> + * initially and then assure that the RACK state matches the connection
>> + * state before calling the state's do_segment function. Data processing
>> + * of inbound segments also now happens in the hpts_do_segment in general
>> + * with only one exception. This is so we can keep the connection on
>> + * a single CPU.
>> + *
>> + * Each state is simplified due to the fact that the original do_segment
>> + * has been decomposed and we *know* what state we are in (no
>> + * switches on the state) and all tests for SACK are gone. This
>> + * greatly simplifies what each state does.
>> + *
>> + * TCP output is also over-written with a new version since it
>> + * must maintain the new rack scoreboard and has had hptsi
>> + * integrated as a requirement. Still to do is to eliminate the
>> + * use of the callout_() system and use the hpts for all
>> + * timers as well.
>> + */
>> +static uint32_t bbr_rtt_probe_time = 200000;	/* 200ms in microseconds */
>> +static uint32_t bbr_rtt_probe_cwndtarg = 4;	/* How many mss's outstanding */
>> +static const int32_t bbr_min_req_free = 2;	/* The min we must have on the
>> +						 * free list */
>> +static int32_t bbr_tlp_thresh = 1;
>> +static int32_t bbr_reorder_thresh = 2;
>> +static int32_t bbr_reorder_fade = 60000000;	/* 0 - never fade, def
>> +						 * 60,000,000 - 60 seconds */
>> +static int32_t bbr_pkt_delay = 1000;
>> +static int32_t bbr_min_to = 1000;	/* Number of usec's minimum timeout */
>> +static int32_t bbr_incr_timers = 1;
>> +
>> +static int32_t bbr_tlp_min = 10000;	/* 10ms in usecs */
>> +static int32_t bbr_delayed_ack_time = 200000;	/* 200ms in usecs */
>> +static int32_t bbr_exit_startup_at_loss = 1;
>> +
>> +/*
>> + * bbr_lt_bw_ratio is 1/8th
>> + * bbr_lt_bw_diff is  < 4 Kbit/sec
>> + */
>> +static uint64_t bbr_lt_bw_diff = 4000 / 8;	/* In bytes per second */
>> +static uint64_t bbr_lt_bw_ratio = 8;	/* For 1/8th */
>> +static uint32_t bbr_lt_bw_max_rtts = 48;	/* How many rtt's do we use
>> +						 * the lt_bw for */
>> +static uint32_t bbr_lt_intvl_min_rtts = 4;	/* Min num of RTT's to measure
>> +						 * lt_bw */
>> +static int32_t bbr_lt_intvl_fp = 0;		/* False positive epoch diff */
>> +static int32_t bbr_lt_loss_thresh = 196;	/* Lost vs delivered % */
>> +static int32_t bbr_lt_fd_thresh = 100;		/* false detection % */
>> +
>> +static int32_t bbr_verbose_logging = 0;
>> +/*
>> + * Currently regular tcp has a rto_min of 30ms;
>> + * the backoff goes 12 times so that ends up
>> + * being a total of 122.850 seconds before a
>> + * connection is killed.
>> + */
>> +static int32_t bbr_rto_min_ms = 30;	/* 30ms same as main freebsd */
>> +static int32_t bbr_rto_max_sec = 4;	/* 4 seconds */
>> +
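The 122.850 second figure in the quoted comment can be reproduced by summing twelve timeouts that start at 30 ms and double each time: 30 * (2^12 - 1) = 122850 ms. The sketch below does that sum. It assumes plain doubling with no upper clamp, which is what matches the comment's number; a real stack also bounds each timeout (the excerpt itself sets bbr_rto_max_sec = 4), so this only illustrates the arithmetic. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: total time spent waiting if the timer starts
 * at rto_min_ms and doubles on each of `tries` consecutive timeouts.
 * Assumes no maximum is applied to the individual timeouts.
 */
static uint64_t
example_total_backoff_ms(uint32_t rto_min_ms, unsigned int tries)
{
	uint64_t total = 0;
	unsigned int i;

	assert(tries < 32);	/* keep the shift well inside 64 bits */
	for (i = 0; i < tries; i++)
		total += (uint64_t)rto_min_ms << i;
	return (total);
}
```

Calling example_total_backoff_ms(30, 12) gives 122850, that is 122.850 seconds.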
>> +/****************************************************/
>> +/* DEFAULT TSO SIZING  (cpu performance impacting)  */
>> +/****************************************************/
>> +/* What amount is our formula using to get TSO size */
>> +static int32_t bbr_hptsi_per_second = 1000;
>> +
>> +/*
>> + * For hptsi under bbr_cross_over connections what is delay
>> + * target 7ms (in usec) combined with a seg_max of 2
>> + * gets us close to identical google behavior in
>> + * TSO size selection (possibly more 1MSS sends).
>> + */
>> +static int32_t bbr_hptsi_segments_delay_tar = 7000;
>> +
>> +/* Does pacing delay include overhead's in its time calculations? */
>> +static int32_t bbr_include_enet_oh = 0;
>> +static int32_t bbr_include_ip_oh = 1;
>> +static int32_t bbr_include_tcp_oh = 1;
>> +static int32_t bbr_google_discount = 10;
>> +
>> +/* Do we use (nf mode) pkt-epoch to drive us or rttProp? */
>> +static int32_t bbr_state_is_pkt_epoch = 0;
>> +static int32_t bbr_state_drain_2_tar = 1;
>> +/* What is the max the 0 - bbr_cross_over MBPS TSO target
>> + * can reach using our delay target. Note that this
>> + * value becomes the floor for the cross over
>> + * algorithm.
>>
>> *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
>
> This breaks kernel builds:
>
> [...]
> /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:5613:9: error: implicit declaration of function 'tcp_chg_pacing_rate' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
>     nrte = tcp_chg_pacing_rate(bbr->r_ctl.crte,
>            ^
> /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:5613:9: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
> /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:5613:7: error: incompatible integer to pointer conversion assigning to 'const struct tcp_hwrate_limit_table *' from 'int' [-Werror,-Wint-conversion]
>     nrte = tcp_chg_pacing_rate(bbr->r_ctl.crte,
>          ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> --- all_subdir_toecore ---
> Building /usr/obj/usr/src/amd64.amd64/sys/THOR/modules/usr/src/sys/modules/toecore/toecore.ko
> --- all_subdir_tcp ---
> /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:10443:4: error: implicit declaration of function 'tcp_rel_pacing_rate' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
>     tcp_rel_pacing_rate(bbr->r_ctl.crte, bbr->rc_tp);
>     ^
> --- all_subdir_tpm ---
> ===> tpm (all)
> --- all_subdir_tcp ---
> /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:10443:4: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
> --- all_subdir_trm ---
> ===> trm (all)
> --- all_subdir_tcp ---
> /usr/src/sys/modules/tcp/bbr/../../../netinet/tcp_stacks/bbr.c:14307:21: error: implicit declaration of function 'tcp_set_pacing_rate' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
>     bbr->r_ctl.crte = tcp_set_pacing_rate(bbr->rc_tp,
>
> --
> O. Hartmann
>
> I object to the use or transmission of my data for advertising
> purposes or for market or opinion research (§ 28 Abs. 4 BDSG).
> -----BEGIN PGP SIGNATURE-----
>
> iHUEARYIAB0WIQSy8IBxAPDkqVBaTJ44N1ZZPba5RwUCXYpujgAKCRA4N1ZZPba5
> R7bwAQD9cgJgJyb5PqX8A8x9R+H9Tun9b+YSg4TNK3vP/VffHwEA8MN2B/QTvhJc
> WISysiLHeOrQKGhCJbtjW2RUbprLfAY=
> =0hIu
> -----END PGP SIGNATURE-----
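
The diagnostics quoted above all say the same thing: at the call sites in bbr.c no prototype for tcp_chg_pacing_rate, tcp_rel_pacing_rate or tcp_set_pacing_rate was visible. One possible reason for a tree that builds under one KERNCONF and fails under another is that the declarations sit behind a kernel option that only some configs define. That is an assumption here, not something this thread confirms. A common way to keep callers compiling either way is for the header to supply inline stubs when the option is off. A minimal sketch of that pattern follows, with every name (EXAMPLE_RATELIMIT, example_rate_table, example_set_pacing_rate, example_has_hw_pacing) hypothetical:

```c
#include <stddef.h>

/* Hypothetical stand-in for a hardware rate table entry. */
struct example_rate_table {
	unsigned long rate;	/* bytes per second */
};

#ifdef EXAMPLE_RATELIMIT
/* Option on: the real implementation lives in another file. */
const struct example_rate_table *
    example_set_pacing_rate(unsigned long bytes_per_sec);
#else
/* Option off: a stub that reports "no hardware pacing available". */
static inline const struct example_rate_table *
example_set_pacing_rate(unsigned long bytes_per_sec)
{
	(void)bytes_per_sec;
	return (NULL);
}
#endif

/* Caller compiles in both configurations and handles the NULL case. */
static int
example_has_hw_pacing(unsigned long bytes_per_sec)
{
	return (example_set_pacing_rate(bytes_per_sec) != NULL);
}
```

Without the #else branch, a build that leaves the option undefined would see no declaration at all and fail with the same kind of implicit-declaration error shown in the log.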

------
Randall Stewart
rrs@netflix.com
