From owner-freebsd-net@freebsd.org Wed Aug 26 22:32:16 2015 Return-Path: Delivered-To: freebsd-net@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id A97C99C32B8 for ; Wed, 26 Aug 2015 22:32:16 +0000 (UTC) (envelope-from jmg@gold.funkthat.com) Received: from gold.funkthat.com (gate2.funkthat.com [208.87.223.18]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "gold.funkthat.com", Issuer "gold.funkthat.com" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 88194682 for ; Wed, 26 Aug 2015 22:32:16 +0000 (UTC) (envelope-from jmg@gold.funkthat.com) Received: from gold.funkthat.com (localhost [127.0.0.1]) by gold.funkthat.com (8.14.5/8.14.5) with ESMTP id t7QMWFdc036248 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 26 Aug 2015 15:32:15 -0700 (PDT) (envelope-from jmg@gold.funkthat.com) Received: (from jmg@localhost) by gold.funkthat.com (8.14.5/8.14.5/Submit) id t7QMWFCa036247; Wed, 26 Aug 2015 15:32:15 -0700 (PDT) (envelope-from jmg) Date: Wed, 26 Aug 2015 15:32:15 -0700 From: John-Mark Gurney To: Chris Stankevitz Cc: freebsd-net@freebsd.org Subject: Re: ssh over WAN: TCP window too small Message-ID: <20150826223215.GS33167@funkthat.com> References: <55DCF080.7080208@stankevitz.com> <20150826010323.GN33167@funkthat.com> <55DD2A98.2010605@stankevitz.com> <20150826082457.GQ33167@funkthat.com> <55DE2FDF.5030707@stankevitz.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <55DE2FDF.5030707@stankevitz.com> X-Operating-System: FreeBSD 9.1-PRERELEASE amd64 X-PGP-Fingerprint: 54BA 873B 6515 3F10 9E88 9322 9CB1 8F74 6D3F A396 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html X-TipJar: bitcoin:13Qmb6AeTgQecazTWph4XasEsP7nGRbAPE X-to-the-FBI-CIA-and-NSA: HI! HOW YA DOIN? can i haz chizburger? User-Agent: Mutt/1.5.21 (2010-09-15) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.7 (gold.funkthat.com [127.0.0.1]); Wed, 26 Aug 2015 15:32:15 -0700 (PDT) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Aug 2015 22:32:16 -0000 Chris Stankevitz wrote this message on Wed, Aug 26, 2015 at 14:30 -0700: > Thanks again. I appreciate you teaching me "how to fish". I basically > spent all morning reading kdump output. No worries, glad you're learning... > On 8/26/15 1:24 AM, John-Mark Gurney wrote: > > It really looks like we should set TCPRcvBufPoll by default on > > FreeBSD... > > According to /etc/ssh/sshd_config, TCPRcvBufPoll defaults to true. > ktrace (thank you for showing me this) shows the ssh process > continuously polling SO_RCVBUF. > > In my case with TCPRcvBufPoll on and HPNBufferSize and TCPRcvBuf unset, > ssh never sets SO_RCVBUF or SO_SNDBUF so presumably FreeBSD has complete > control over those values (you saw the same thing). > > But perhaps all this is moot in my case because netstat shows that the > sender and receiver have sufficiently large SND/RCV buffers: (can you > confirm my interpretation is correct?) > > Sender S-BMAX 662816 > Receiver R-BMAX 528416 Yes, this is correct... So definately not an issue w/ window size of socket buffer size... > >> I will experiment with TCPRcvBuf. > > > > It does look like the values are in KB > > Yes, README.hpn says they are in KB. However, even though it is > described in README.hpn, I cannot set TCPRcvBuf in sshd_config: > > /etc/ssh/sshd_config: line 141: Bad configuration option: TCPRcvBuf > /etc/ssh/sshd_config: line 141: Bad configuration option: TcpRcvBuf Looks like: https://www.psc.edu/index.php/networking/695 Says that TCPRcvBuf is client side only.. Which is odd, but not an issue w/ FreeBSD... > However, as I described above, I believe the buffer size is a red herring. Agreed.... > ktrace shows pretty clearly what is happening. ssh isn't even bothering > to read from /dev/zero until select(2) gives the green light. And in my > case select(2) blocks ssh for 0.1 second for every ~30KB of data: > > > 94146 ssh 6.592042 CALL getsockopt(0x3,SOL_SOCKET,SO_RCVBUF > 94146 ssh 6.592054 RET getsockopt 0 > 94146 ssh 6.592065 CALL select(0x7,0x803817168,0x803817178,0,0) > 94146 ssh 6.685873 RET select 1 > 94146 ssh 6.685907 CALL read(0x3,0x7fffffffae40,0x2000) > 94146 ssh 6.685921 GIO fd 3 read 36 bytes > [ read of fd 3 snipped] > 94146 ssh 6.685931 RET read 36/0x24 > 94146 ssh 6.685950 CALL getpid > 94146 ssh 6.685962 RET getpid 94146/0x16fc2 > 94146 ssh 6.685974 CALL getpid > 94146 ssh 6.685984 RET getpid 94146/0x16fc2 > 94146 ssh 6.685996 CALL getpid > 94146 ssh 6.686006 RET getpid 94146/0x16fc2 > 94146 ssh 6.686017 CALL getpid > 94146 ssh 6.686027 RET getpid 94146/0x16fc2 > 94146 ssh 6.686091 CALL getsockopt(0x3,SOL_SOCKET,SO_RCVBUF > 94146 ssh 6.686103 RET getsockopt 0 > 94146 ssh 6.686116 CALL select(0x7,0x803817168,0x803817178,0,0) > 94146 ssh 6.686128 RET select 2 > 94146 ssh 6.686140 CALL read(0x4,0x7fffffff6c70,0x4000) > 94146 ssh 6.686154 GIO fd 4 read 4096 bytes > [ read of stdin (/dev/zero) snipped) It would be interesting to know how long from the read of stdin (and is it really reading stdin in 4k blocks? If so, that should be fixed) to the write out the socket... Basicly, how long does it take to encrypt the data... > Note in the above the blocking call to select at time 6.592065 that > takes ~0.1 second. This is the reason my connection is slow. The > content appears to be encrypted... I presume it's an application-level > ssh ack. > > BTW when I disable HPN (HPNdisabled=yes) that same select happens, but > it blocks for less time (~0.05 second) and roughly doubles my throughput. > > > Also, don't forget that if you set this in .ssh/config, you only set > > the client size recive buffer, not the server side, so you'd probably > > need to add this to the server's sshd_config to enable it for server > > receive side... > > I understand. > > > ktrace -i ssh ... > > Thank you, this is awesome. Is there a way for me to ktrace 'b' in this > example: `a | b | c`? I tried `ktrace a | b | c` and `ktrace test.sh` When you do this, make sure you ktrace -i to inherit the ktrace flag to the children... In the case of a | b | c, just doing: a | ktrace b | c Should work too... > (which included a|b|c) but neither seemed to work. I'm using stream > redirection now but it doesn't afford me the bs control of dd. Perhaps > named pipes is the solution. > > > Oh, I forgot to ask to make sure that net.inet.tcp.{send,recv}buf_auto > > is enabled: > > Unfortunately it is enabled :-/ > > > Maybe a dump of your net.inet.tcp might also be helpful... > > This should all be standard out-of-the-box: [...] > net.inet.tcp.sendspace: 32768 > net.inet.tcp.recvspace: 65536 Try increasing these and reporting back... the buf_auto should override those, but this may be limiting it... > net.inet.tcp.cc.algorithm: newreno > net.inet.tcp.cc.available: newreno Another option could be to look at other cc algorithms... See mod_cc(4) for more info... [...] > >> S-BCNT 57344 > > > > You were probably unlucky when you sampled this value, and caught it at > > a bad time... Also, look at how much CPU time ssh uses... ssh can > > introduce additional latency that isn't apparent from the network... > > S-BCNT is always ~60KB when HPN is enabled. This jives with my "ssh is > spending all of its time waiting on select(2)" theory. So, I looked at what getsockopt SO_RCVBUF returns, and it returns: case SO_RCVBUF: optval = so->so_rcv.sb_hiwat; Which is NOT S-BMAX, so it looks like OpenSSH only ever gets 66052 bytes for the buffer... If it's decided to base it's waiting for ack on SO_RCVBUF, then this is probably where the issue is... Try setting HPNBufferSize to something resonably large, like 1MB, or above your bandwidth-delay product to see if this will make a difference.. Per: Conditions: HPNBufferSize SET, TCPRcvBufPoll enabled, TCPRcvBuf NOT Set Result: HPN Buffer Size = grows to HPNBufferSize The buffer will grow up to the maximum size specified here. This could be an interaction w/ the HPN buffering and the socket buffer size... As the receive buffer never grows large, HPN can't buffer enough data to meet the bandwidth product delay, even though it should be able to.. > > It's very possible that we don't set any of these values, so what > > happens is that ssh reads the value of the receive buffer at startup, > > which is 64k or so, and only does buffering in that size.. Then you > > end up w/ a latency not of your network, but of the speed at which > > your computer can encrypt at... Just a thought, but you could also > > measure latency between writes using ktrace to help figure this > > out... > > Yes, I believe something like this is happening now. > > Thank you again for your help... this thread is proving to me one of > those quantum leap moments for me in terms of FreeBSD knowledge. Glad I could be of help. -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not."