From owner-freebsd-net@freebsd.org Wed Aug 26 21:30:09 2015 Return-Path: Delivered-To: freebsd-net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 751D19C2C6A for ; Wed, 26 Aug 2015 21:30:09 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from mango.stankevitz.com (mango.stankevitz.com [208.79.93.194]) by mx1.freebsd.org (Postfix) with ESMTP id 51B2F155A for ; Wed, 26 Aug 2015 21:30:09 +0000 (UTC) (envelope-from chris@stankevitz.com) Received: from Chriss-MacBook-Pro.local (209-203-101-124.static.twtelecom.net [209.203.101.124]) (using TLSv1.2 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mango.stankevitz.com (Postfix) with ESMTPSA id D8123145E; Wed, 26 Aug 2015 14:30:07 -0700 (PDT) Message-ID: <55DE2FDF.5030707@stankevitz.com> Date: Wed, 26 Aug 2015 14:30:07 -0700 From: Chris Stankevitz User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.7.0 MIME-Version: 1.0 To: John-Mark Gurney CC: freebsd-net@freebsd.org Subject: Re: ssh over WAN: TCP window too small References: <55DCF080.7080208@stankevitz.com> <20150826010323.GN33167@funkthat.com> <55DD2A98.2010605@stankevitz.com> <20150826082457.GQ33167@funkthat.com> In-Reply-To: <20150826082457.GQ33167@funkthat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Aug 2015 21:30:09 -0000 John-Mark, Thanks again. I appreciate you teaching me "how to fish". I basically spent all morning reading kdump output. On 8/26/15 1:24 AM, John-Mark Gurney wrote: > It really looks like we should set TCPRcvBufPoll by default on > FreeBSD... According to /etc/ssh/sshd_config, TCPRcvBufPoll defaults to true. ktrace (thank you for showing me this) shows the ssh process continuously polling SO_RCVBUF. In my case with TCPRcvBufPoll on and HPNBufferSize and TCPRcvBuf unset, ssh never sets SO_RCVBUF or SO_SNDBUF so presumably FreeBSD has complete control over those values (you saw the same thing). But perhaps all this is moot in my case because netstat shows that the sender and receiver have sufficiently large SND/RCV buffers: (can you confirm my interpretation is correct?) Sender S-BMAX 662816 Receiver R-BMAX 528416 >> I will experiment with TCPRcvBuf. > > It does look like the values are in KB Yes, README.hpn says they are in KB. However, even though it is described in README.hpn, I cannot set TCPRcvBuf in sshd_config: /etc/ssh/sshd_config: line 141: Bad configuration option: TCPRcvBuf /etc/ssh/sshd_config: line 141: Bad configuration option: TcpRcvBuf However, as I described above, I believe the buffer size is a red herring. ktrace shows pretty clearly what is happening. ssh isn't even bothering to read from /dev/zero until select(2) gives the green light. And in my case select(2) blocks ssh for 0.1 second for every ~30KB of data: 94146 ssh 6.592042 CALL getsockopt(0x3,SOL_SOCKET,SO_RCVBUF 94146 ssh 6.592054 RET getsockopt 0 94146 ssh 6.592065 CALL select(0x7,0x803817168,0x803817178,0,0) 94146 ssh 6.685873 RET select 1 94146 ssh 6.685907 CALL read(0x3,0x7fffffffae40,0x2000) 94146 ssh 6.685921 GIO fd 3 read 36 bytes [ read of fd 3 snipped] 94146 ssh 6.685931 RET read 36/0x24 94146 ssh 6.685950 CALL getpid 94146 ssh 6.685962 RET getpid 94146/0x16fc2 94146 ssh 6.685974 CALL getpid 94146 ssh 6.685984 RET getpid 94146/0x16fc2 94146 ssh 6.685996 CALL getpid 94146 ssh 6.686006 RET getpid 94146/0x16fc2 94146 ssh 6.686017 CALL getpid 94146 ssh 6.686027 RET getpid 94146/0x16fc2 94146 ssh 6.686091 CALL getsockopt(0x3,SOL_SOCKET,SO_RCVBUF 94146 ssh 6.686103 RET getsockopt 0 94146 ssh 6.686116 CALL select(0x7,0x803817168,0x803817178,0,0) 94146 ssh 6.686128 RET select 2 94146 ssh 6.686140 CALL read(0x4,0x7fffffff6c70,0x4000) 94146 ssh 6.686154 GIO fd 4 read 4096 bytes [ read of stdin (/dev/zero) snipped) Note in the above the blocking call to select at time 6.592065 that takes ~0.1 second. This is the reason my connection is slow. The content appears to be encrypted... I presume it's an application-level ssh ack. BTW when I disable HPN (HPNdisabled=yes) that same select happens, but it blocks for less time (~0.05 second) and roughly doubles my throughput. > Also, don't forget that if you set this in .ssh/config, you only set > the client size recive buffer, not the server side, so you'd probably > need to add this to the server's sshd_config to enable it for server > receive side... I understand. > ktrace -i ssh ... Thank you, this is awesome. Is there a way for me to ktrace 'b' in this example: `a | b | c`? I tried `ktrace a | b | c` and `ktrace test.sh` (which included a|b|c) but neither seemed to work. I'm using stream redirection now but it doesn't afford me the bs control of dd. Perhaps named pipes is the solution. > Oh, I forgot to ask to make sure that net.inet.tcp.{send,recv}buf_auto > is enabled: Unfortunately it is enabled :-/ > Maybe a dump of your net.inet.tcp might also be helpful... This should all be standard out-of-the-box: net.inet.tcp.rfc1323: 1 net.inet.tcp.mssdflt: 536 net.inet.tcp.keepidle: 7200000 net.inet.tcp.keepintvl: 75000 net.inet.tcp.sendspace: 32768 net.inet.tcp.recvspace: 65536 net.inet.tcp.keepinit: 75000 net.inet.tcp.delacktime: 100 net.inet.tcp.v6mssdflt: 1220 net.inet.tcp.cc.algorithm: newreno net.inet.tcp.cc.available: newreno net.inet.tcp.hostcache.cachelimit: 15360 net.inet.tcp.hostcache.hashsize: 512 net.inet.tcp.hostcache.bucketlimit: 30 net.inet.tcp.hostcache.count: 6 net.inet.tcp.hostcache.expire: 3600 net.inet.tcp.hostcache.prune: 300 net.inet.tcp.hostcache.purge: 0 net.inet.tcp.log_in_vain: 0 net.inet.tcp.blackhole: 0 net.inet.tcp.delayed_ack: 1 net.inet.tcp.drop_synfin: 0 net.inet.tcp.rfc3042: 1 net.inet.tcp.rfc3390: 1 net.inet.tcp.experimental.initcwnd10: 1 net.inet.tcp.rfc3465: 1 net.inet.tcp.abc_l_var: 2 net.inet.tcp.ecn.enable: 0 net.inet.tcp.ecn.maxretries: 1 net.inet.tcp.insecure_rst: 0 net.inet.tcp.recvbuf_auto: 1 net.inet.tcp.recvbuf_inc: 16384 net.inet.tcp.recvbuf_max: 2097152 net.inet.tcp.path_mtu_discovery: 1 net.inet.tcp.tso: 1 net.inet.tcp.sendbuf_auto: 1 net.inet.tcp.sendbuf_inc: 8192 net.inet.tcp.sendbuf_max: 2097152 net.inet.tcp.reass.maxsegments: 255100 net.inet.tcp.reass.cursegments: 0 net.inet.tcp.reass.overflows: 471 net.inet.tcp.sack.enable: 1 net.inet.tcp.sack.maxholes: 128 net.inet.tcp.sack.globalmaxholes: 65536 net.inet.tcp.sack.globalholes: 0 net.inet.tcp.minmss: 216 net.inet.tcp.log_debug: 0 net.inet.tcp.tcbhashsize: 524288 net.inet.tcp.do_tcpdrain: 1 net.inet.tcp.pcbcount: 43 net.inet.tcp.icmp_may_rst: 1 net.inet.tcp.isn_reseed_interval: 0 net.inet.tcp.soreceive_stream: 0 net.inet.tcp.syncookies: 1 net.inet.tcp.syncookies_only: 0 net.inet.tcp.syncache.bucketlimit: 30 net.inet.tcp.syncache.cachelimit: 15375 net.inet.tcp.syncache.count: 0 net.inet.tcp.syncache.hashsize: 512 net.inet.tcp.syncache.rexmtlimit: 3 net.inet.tcp.syncache.rst_on_sock_fail: 1 net.inet.tcp.msl: 30000 net.inet.tcp.rexmit_min: 30 net.inet.tcp.rexmit_slop: 200 net.inet.tcp.always_keepalive: 1 net.inet.tcp.fast_finwait2_recycle: 0 net.inet.tcp.finwait2_timeout: 60000 net.inet.tcp.keepcnt: 8 net.inet.tcp.rexmit_drop_options: 0 net.inet.tcp.per_cpu_timers: 0 net.inet.tcp.timer_race: 0 net.inet.tcp.maxtcptw: 27767 net.inet.tcp.nolocaltimewait: 0 >> S-BCNT 57344 > > You were probably unlucky when you sampled this value, and caught it at > a bad time... Also, look at how much CPU time ssh uses... ssh can > introduce additional latency that isn't apparent from the network... S-BCNT is always ~60KB when HPN is enabled. This jives with my "ssh is spending all of its time waiting on select(2)" theory. > It's very possible that we don't set any of these values, so what > happens is that ssh reads the value of the receive buffer at startup, > which is 64k or so, and only does buffering in that size.. Then you > end up w/ a latency not of your network, but of the speed at which > your computer can encrypt at... Just a thought, but you could also > measure latency between writes using ktrace to help figure this > out... Yes, I believe something like this is happening now. Thank you again for your help... this thread is proving to me one of those quantum leap moments for me in terms of FreeBSD knowledge. Chris