Date:      Thu, 25 Jul 2002 09:26:39 -0500
From:      "Jaime Bozza" <jbozza@thinkburst.com>
To:        "'Matthew Dillon'" <dillon@apollo.backplane.com>
Cc:        <stable@FreeBSD.ORG>
Subject:   RE: RE: Abominable NFSv3 read performance / FreeBSD server / Solaris client
Message-ID:  <02d401c233e7$49ef80d0$6401010a@bozza>
In-Reply-To: <200207250002.g6P02m07030238@apollo.backplane.com>

Matt,
   First of all, I want to thank you for all the help in explaining the
dumps and so forth.  I have a much better understanding of the
interaction of NFS between the two systems.  You've been more than
patient with my stumbling and learning.

   In comparing the differences between the freebsd-dump and the
solaris-dump (new ones in which I've been able to increase the
advertised window, but not more than around 32K, even with much higher
buffers), one difference I noticed is that the FreeBSD client seems to
be advertising a sliding window. (RFC1323 I assume?)  Even if I set
tcp_wscale_always on the Solaris client, it still only advertises the
same window size every time.  I've fiddled with both the tcp and the nfs
tunables and it just seems that the Solaris system I'm testing with
can't seem to handle that much data in its buffers.  (I was able to cut
the time in half using the default rsize of 32768, but the system still
just couldn't seem to handle the blocks as quickly as a smaller rsize.)

   I also installed tcpdump on the Solaris system so I could look at
Solaris-to-Solaris dumps and compare.  From that, I noticed the
Solaris server advertises a much smaller (around 24k) window no matter
what, even with the client advertising something higher.  (I tried
setting xmit_hiwat in the startup scripts and restarting the Solaris
server to ensure the setting was changed before the nfs daemons came
online.)  I may still not be getting the settings correct, but I'm at a
loss as to what I'm missing.

   Regardless, thanks again for the help.  I have enough data to make
the connections perform similarly, even if what happens behind the
scenes isn't anything alike.


Jaime

   

-----Original Message-----
From: Matthew Dillon [mailto:dillon@apollo.backplane.com] 
Sent: Wednesday, July 24, 2002 7:03 PM
To: Jaime Bozza
Cc: stable@FreeBSD.ORG
Subject: Re: RE: Abominable NFSv3 read performance / FreeBSD server / Solaris client



:Ok, put me in my corner.  I *knew* there was something wrong with the
:tcpdump, but I sat there looking at it and just thinking it was
:different because of the OS.  A big DUH from me.
:
:Ok, attached are three dumps (using your params below) from the FreeBSD
:server side.  All are TCP (I had to force TCP on the FreeBSD client
:since it defaults to UDP - Solaris doesn't really give you the choice)
:Even though it may not be relevant, I gave two dumps from Solaris, one
:with 8K rsize and one with 32K rsize.  (Since 32K is where the massive
:increase in time occurs)
:
:Just to test your point, a UDP connection with a Solaris client showed a
:similar tcpdump (to the FreeBSD UDP dump) and the speed was also
:similar, so I think the network itself is fine.
:
:10.1.2.10 = FreeBSD Server
:10.1.2.9 = Solaris Client
:10.1.2.50 = FreeBSD Client
:
:Jaime

    Well, looking at the solaris8k-dump the solaris client is way behind
    on its acks.  It's acking the 16K point after the FreeBSD server has
    pushed out 33KB, so the FreeBSD server is probably hitting the Solaris
    client's TCP window limit.  The FreeBSD server is then not restarting
    the transmit as quickly as it could, but the basic problem is that
    Solaris is advertising too small a window I think.

16:48:42.376155 10.1.2.10.2049 > 10.1.2.9.3887210299: reply ERR 1460 (DF)
16:48:42.376609 10.1.2.9.1016 > 10.1.2.10.2049: . ack 47461 win 24820 (DF)
16:48:42.376835 10.1.2.9.1016 > 10.1.2.10.2049: . ack 50381 win 24820 (DF)
16:48:42.377041 10.1.2.9.1016 > 10.1.2.10.2049: . ack 53301 win 24820 (DF)
16:48:42.456437 10.1.2.9.290121090 > 10.1.2.10.2049: 172 read fh 957,375898/2134869 8192 bytes @ 0x000016000 (DF)

    Above, solaris is queueing the next read command.

16:48:42.456486 10.1.2.10.2049 > 10.1.2.9.1381004381: reply ERR 588 (DF)

    Above, FreeBSD is *finishing* sending the data from the previous
    read command.  Normally FreeBSD would have burst this data up top just
    after the 3887210299 point but it didn't, probably because it ran out
    of window space.

    And below FreeBSD is starting to send the data for the most recent
    read command.

16:48:42.456656 10.1.2.10.2049 > 10.1.2.9.290121090: reply ok 1460 read (DF)
16:48:42.456668 10.1.2.10.2049 > 10.1.2.9.1859070703: reply ERR 1460 (DF)


    Note: When you are trying to read the tcpdump output just ignore the
    'reply ERR' stuff, it's just TCPDUMP trying to interpret data blocks
    in the stream as commands when they're really just data blocks.

    What you need to do is get Solaris to advertise a much larger window.
    Perhaps you've tried this already and it did not seem to work, but
    perhaps that is because you didn't reset the TCP connection (killing
    and restarting the NFSD's on the FreeBSD server should suffice to reset
    the Solaris client's TCP connections).  (At the same time, make sure
    you keep FreeBSD's transmit buffers bumped up to at least 65535, but
    the main issue appears to be Solaris's advertised window.)

    If you do a dump on FreeBSD and Solaris does not advertise larger
    windows (it's advertising 24820 most of the time in the dumps you've
    provided to date), then you have not managed to get Solaris to
    advertise a larger window.

						-Matt
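
[Editorial sketch, not part of the original message.]  The check
described above, whether the Solaris client ever advertises more than
24820, can be run mechanically over a saved text dump.  The Python below
is a minimal sketch under these assumptions: the dump is in the one-line
tcpdump summary format quoted in this thread, each packet is on a single
unwrapped line, and the function name advertised_windows and the sample
list are hypothetical names introduced only for illustration.

```python
import re

# Matches "<time> <src> > <dst>: ... ack N win W" in tcpdump's one-line
# TCP summary.  Lines without an "ack ... win ..." pair are skipped.
ACK_WIN = re.compile(r"^(\S+) (\S+) > (\S+): .*\back (\d+) win (\d+)")

def advertised_windows(lines, sender_prefix):
    """Collect the 'win' values advertised by senders whose
    address.port string starts with sender_prefix."""
    wins = []
    for line in lines:
        m = ACK_WIN.match(line)
        if m and m.group(2).startswith(sender_prefix):
            wins.append(int(m.group(5)))
    return wins

# Sample lines copied from the dump quoted in this thread.
sample = [
    "16:48:42.376609 10.1.2.9.1016 > 10.1.2.10.2049: . ack 47461 win 24820 (DF)",
    "16:48:42.376835 10.1.2.9.1016 > 10.1.2.10.2049: . ack 50381 win 24820 (DF)",
    "16:48:42.456486 10.1.2.10.2049 > 10.1.2.9.1381004381: reply ERR 588 (DF)",
]

# The trailing dot keeps "10.1.2.9." from also matching e.g. 10.1.2.90.
wins = advertised_windows(sample, "10.1.2.9.")
print("min window:", min(wins), "max window:", max(wins))
```

If the maximum stays near 24820 after the tunables were changed and the
TCP connection was reset, the larger window has not taken effect.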








