Date: Mon, 11 Aug 2014 10:06:06 -0700
From: John-Mark Gurney
To: Vlad Zolotarov
Subject: Re: TCP Rx window auto sizing relies on TCP timestamp option?
Message-ID: <20140811170606.GV83475@funkthat.com>
In-Reply-To: <53E8B424.2000904@cloudius-systems.com>
Cc: freebsd-net@freebsd.org

Vlad Zolotarov wrote this message on Mon, Aug 11, 2014 at 15:16 +0300:
> Hi, I have the most strange question about the TCP Rx window auto sizing
> implementation in a FreeBSD networking stack.
> When I looked at the FreeBSD code (hash
> 9abce0e567c9a5a0520cdd94d5c633c7baf9a184) I noticed that
> the mentioned above feature will not be "enabled" if there isn't a TCP
> timestamp option present in the current TCP session:
>
> See sys/netinet/tcp_input.c: line 1813 in tcp_do_segment() function:
>
>     if (V_tcp_do_autorcvbuf &&
>         *to.to_tsecr* && <-------- this is what I'm talking about
>         (so->so_rcv.sb_flags & SB_AUTOSIZE))
>
> So, if i read the code correctly, if there isn't a TS option (negotiated
> and thus present in every received packet) the receive socket buffer
> won't grow thus preventing the growth of the Rx window.
> If that's the case this is very strange since TS option is not promised
> and even more - in many cases it won't be present.
> For example in Linux this feature is disabled by default (controlled by
> /proc/sys/net/ipv4/tcp_timestamps).
> This is how I actually noticed the problem the first place: I ran iperf
> test where Linux was an initiator and a transmitter (iperf -c) FreeBSD
> box was a receiver (iperf -s) and I noticed that the Rx window wasn't
> opening up because Linux box hasn't negotiated the TS option in the SYN.
> As a result, the throughput numbers were significantly lower compared to
> Linux-to-Linux setup (Linux uses a Dynamic Right-Sizing (DRS) algorithm
> http://public.lanl.gov/radiant/pubs.html#DRS, which doesn't rely on TS).
>
> Could anybody comment on this, pls.?
> Did I miss anything?
> Is it true that FreeBSD assumes that TS option is always present and if
> not how can I cause an Rx Window to open up when TS option hasn't been
> negotiated?

This means the receive buffer won't grow beyond the default of 64k...
But, as the comment says:
 * On the receive side the socket buffer memory is only rarely
 * used to any significant extent.  This allows us to be much

The receive buffer will only get used if the application takes too long
to read its buffer, or it isn't currently waiting...  If that's the
case, then the application should be fixed to be able to process the
data as quickly as it comes in...

So, I don't see much of an issue w/ the code you pointed out.  Yes, the
receive buffer won't grow, but there are options you can set (sysctl
net.inet.tcp.recvspace, and SO_RCVBUF in the application) that will
address it otherwise...  Obviously setting the default too large will
just waste memory...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
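
For readers who want to try the SO_RCVBUF route mentioned in the reply, here is a minimal sketch in C.  The helper name set_rcvbuf() and the sizes used are illustrative assumptions, not anything from the thread or from FreeBSD itself; only setsockopt(2) and getsockopt(2) with SOL_SOCKET/SO_RCVBUF are standard calls.  It requests a fixed receive buffer and reads back what the kernel reports, since the kernel may clamp or adjust the request.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from the thread): request a receive buffer of
 * `bytes` on socket `fd`, then read back the size the kernel reports.
 * Returns the reported size in bytes, or -1 on error (errno is set by the
 * failing setsockopt/getsockopt call).
 */
static int
set_rcvbuf(int fd, int bytes)
{
	int effective = 0;
	socklen_t len = sizeof(effective);

	/* A non-positive request is a caller bug, not a runtime condition. */
	assert(bytes > 0);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
		return (-1);
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) == -1)
		return (-1);
	return (effective);
}
```

A receiver such as the iperf server in the original report would presumably call something like set_rcvbuf(listen_fd, 256 * 1024) before listen(), because the TCP window scale factor is negotiated in the SYN exchange and a buffer enlarged later may not be fully usable.  The value read back can differ from the request (some systems report a doubled or clamped figure), so it is worth logging rather than assuming.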