Date: Wed, 26 Aug 2015 01:24:57 -0700
From: John-Mark Gurney <jmg@funkthat.com>
To: Chris Stankevitz <chris@stankevitz.com>
Cc: freebsd-net@freebsd.org
Subject: Re: ssh over WAN: TCP window too small
Message-ID: <20150826082457.GQ33167@funkthat.com>
In-Reply-To: <55DD2A98.2010605@stankevitz.com>
References: <55DCF080.7080208@stankevitz.com> <20150826010323.GN33167@funkthat.com> <55DD2A98.2010605@stankevitz.com>
Chris Stankevitz wrote this message on Tue, Aug 25, 2015 at 19:55 -0700:
> John-Mark,
>
> Thank you for your reply.
>
> On 8/25/15 6:03 PM, John-Mark Gurney wrote:
> > Chris Stankevitz wrote this message on Tue, Aug 25, 2015 at 15:47 -0700:
> >> # cat /dev/urandom | ssh root@host 'cat > /dev/null'
> >
> > Don't use this for testing... use /dev/zero or some other device
> > that can produce data faster than this...
>
> Okay. As I'm sure you can imagine, I used urandom to avoid compression
> artifacts. My urandom produces data at ~300 Mbps... but I will use
> /dev/zero from now on.

Yeh, unless you enable compression, ssh doesn't use it, so it won't be
an issue... Also, if you want to free up even more CPU, you can test w/
"-c none", which disables encryption...

> > So, our SSH does have the HPN patches:
> > https://www.psc.edu/index.php/hpn-ssh
> >
> > and the README says:
> > BUFFER SIZES:
> > - if HPN is disabled the receive buffer size will be set to the
> >   OpenSSH default of 64K.
>
> Yes... I spent some time reading that document and fretting over whether
> or not HPN was really incorporated in my setup. I "confirmed" that it
> was available and enabled by setting "HPNDisabled no" and restarting
> sshd (on both sides) without complaint. I'm half-tempted to build from
> ports to be certain.
>
> > Looks like there are undocumented options like TCPRcvBuf that you can
> > use to adjust the recv buffer window...
>
> According to the HPN README the default (which I am using) is the
> "system wide TCP receive buffer size". I don't know what value that is
> or where it comes from (net.inet.tcp.???). I will experiment with
> TCPRcvBuf.
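[For anyone following along: a sketch of where the HPN options discussed
in this thread would go. The option names come from the HPN-SSH patches;
the host name and the commented-out 7168 KB value are purely
illustrative, not recommendations.]

```
# Client side: ~/.ssh/config -- affects only the client's receive buffer
Host bigpipe.example.com            # hypothetical host
    HPNDisabled no
    TCPRcvBufPoll yes
    # TCPRcvBuf 7168               # value is in KB; leave unset to keep
                                   # the kernel's autotuned buffer size

# Server side: /etc/ssh/sshd_config -- affects the server's receive buffer
HPNDisabled no
TCPRcvBufPoll yes
```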
It does look like the values are in KB, as I tried to set it to 30000,
and I got this error message:

Couldn't set socket receive buffer to 30720000: No buffer space available

Also, don't forget that if you set this in .ssh/config, you only set the
client-side receive buffer, not the server side, so you'd probably need
to add this to the server's sshd_config to enable it for the server
receive side...

> > We have code that will auto grow buffer sizes properly so that slow
> > connections won't use up too much buffer space...
>
> That is what I expected, although I believe openssh tries to
> thwart/limit this by requesting particular buffer sizes (I'm really
> unqualified to talk about this). And it is my understanding that HPN
> undoes these limitations, although I'm not sure if it opens the door
> to FreeBSD having full control or uses its own voodoo.

You can verify this w/ ktrace -i ssh <params>... Then after that, run
kdump | grep SO_RCVBUF | grep setsockopt to see if the program set
any... If you see something like:

$ kdump | grep SO_RCVBUF | grep setsockopt
  6641 ssh      CALL  setsockopt(0x3,SOL_SOCKET,SO_RCVBUF,0x62dd8c,0x4)
  6641 ssh      CALL  setsockopt(0x7,SOL_SOCKET,SO_RCVBUF,0x7fffffffcaa4,0x4)

then the buffer size is being set... I don't see this w/o TCPRcvBuf in
my config file, but I do when I add it...

> > In a quick test of mine, I'm seeing a buffer size of ~520k from my
> > MacOSX box, and ~776k from my 9.2-R box... Server in both cases is
> > a June -CURRENT
>
> Thank you for those numbers. Since my system is basically stock, I
> wonder if my bad behavior is an artifact of something on my network.
> Did you invoke ssh more or less as "cat /dev/zero | ssh root@host
> 'cat > /dev/null'"? Are you quoting S-BCNT numbers?

The exact command is:

dd if=/dev/zero bs=1m | ssh carbon dd of=/dev/null bs=1m

And the numbers I was quoting were the R-BMAX numbers... As that is:

R-BMAX    Maximum bytes that can be used in the receive buffer.
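[The setsockopt(SO_RCVBUF) calls that kdump reveals above can be
reproduced with a small sketch; Python is used here purely for
illustration, and the exact size the kernel grants is OS-dependent.]

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# The default receive buffer, as the kernel reports it (OS-dependent).
default_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

# What ssh does when TCPRcvBuf is set: an explicit setsockopt(SO_RCVBUF).
# A fixed request like this is also what pins the buffer and keeps the
# kernel's autotuning from growing it on that socket.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 65536)
fixed_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

# The kernel grants at least what was asked for (some OSes round up;
# Linux, for instance, doubles it to account for bookkeeping overhead).
print(default_rcvbuf, fixed_rcvbuf)
s.close()
```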
S-BCNT is just the number of bytes waiting to be sent, not the largest
possible number that could be buffered... You really can't depend upon
that number, as only on high-latency links will it have an appreciable
value; otherwise you'll likely catch it between the ack draining it and
the program running to refill it...

> > netstat -xAanfinet is helpful on this...
>
> That is brilliant! I was using pcap and wireshark to deduce some of
> that info.

Yep, there are lots of great debugging ways... You could have also used
dtrace for some of this too.. :)

> I include my sender and receiver netstats below for the ssh-ing of
> /dev/zero. It differs from iperf (which works well), most notably in
> S-BCNT (~1MB for iperf, ~64kB for ssh). I think in my case the
> question is:
>
> - who is keeping S-BCNT so low (openssh, HPN, or FreeBSD)?

As it's easier to change the client's recv buffer, you might want to try
the command:

ssh <host> dd if=/dev/zero bs=1m > /dev/null

And then you can play around w/ TCPRcvBuf, though you should verify
that SO_RCVBUF is being set first...

> - Is the limitation introduced by the sending or receiving system?
>
> - what is the mechanism by which S-BCNT grows when using ssh over
>   long/fat pipes?

Oh, I forgot to ask you to make sure that net.inet.tcp.{send,recv}buf_auto
is enabled:

$ sysctl net.inet.tcp.{send,recv}buf_auto
net.inet.tcp.sendbuf_auto: 1
net.inet.tcp.recvbuf_auto: 1

Maybe a dump of your net.inet.tcp might also be helpful...

> Thank you again,
>
> Chris
>
> SSH Sender
> Recv-Q     0
> Send-Q     50132
> R-MBUF     0
> S-MBUF     16
> R-CLUS     0
> S-CLUS     14
> R-HIWA     66052
> S-HIWA     82852
> R-LOWA     1
> S-LOWA     2048
> R-BCNT     0
> S-BCNT     57344

You were probably unlucky when you sampled this value, and caught it at
a bad time... Also, look at how much CPU time ssh uses... ssh can
introduce additional latency that isn't apparent from the network...

Ahhh, also make sure that TCPRcvBufPoll is enabled...
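[For context on why these buffer sizes matter on a WAN: a TCP connection
can never move data faster than its window divided by the round-trip
time. A sketch of that arithmetic, using a hypothetical 100 ms RTT and
the buffer sizes mentioned in this thread:]

```python
# Bandwidth-delay product ceiling: throughput <= window / RTT,
# regardless of how much raw capacity the link has.
def max_throughput_mbps(window_bytes, rtt_seconds):
    return window_bytes * 8 / rtt_seconds / 1e6

# A fixed 64 KiB receive buffer (the stock OpenSSH default quoted from
# the HPN README above) on a 100 ms path caps the transfer at ~5.2 Mbit/s.
print(round(max_throughput_mbps(64 * 1024, 0.100), 1))   # -> 5.2

# The ~520 KB autotuned buffer seen in the netstat output raises the
# ceiling to roughly 42.6 Mbit/s at the same RTT.
print(round(max_throughput_mbps(520 * 1024, 0.100), 1))  # -> 42.6
```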
I'm not sure if that is the default, but I think it will tell ssh to
check what the current recv buffer size is, and buffer up to that
amount of data:

Conditions: HPNBufferSize NOT Set, TCPRcvBufPoll enabled,
            TCPRcvBuf NOT Set
Result: HPN Buffer Size = up to 64MB
This is the default state. The HPN buffer size will grow to a maximum
of 64MB as the TCP receive buffer grows. The maximum HPN buffer size
of 64MB is geared towards 10GigE transcontinental connections.

> R-BMAX     528416
> S-BMAX     662816

These look correct...

> rexmt      0.29
> persist    0
> keep       7199.98
> 2msl       0
> delack     0
> rcvtime    0.01
>
> SSH Receiver
> Recv-Q     0
> Send-Q     36
> R-MBUF     0
> S-MBUF     1
> R-CLUS     0
> S-CLUS     0
> R-HIWA     66052
> S-HIWA     33700
> R-LOWA     1
> S-LOWA     2048
> R-BCNT     0
> S-BCNT     256
> R-BMAX     528416
> S-BMAX     269600

These also look correct...

> rexmt      0.24
> persist    0
> keep       7199.96
> 2msl       0
> delack     0.06
> rcvtime    0.03

It's very possible that we don't set any of these values, so what
happens is that ssh reads the value of the receive buffer at startup,
which is 64k or so, and only does buffering in that size... Then you
end up w/ a latency not of your network, but of the speed at which your
computer can encrypt...

Just a thought, but you could also measure latency between writes using
ktrace to help figure this out...

It really looks like we should set TCPRcvBufPoll by default on
FreeBSD...

-- 
John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."