Date:      Wed, 29 Jan 2014 17:30:43 -0600
From:      Bryan Venteicher <bryanv@freebsd.org>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        freebsd-net@freebsd.org, J David <j.david.lists@gmail.com>, Garrett Wollman <wollman@freebsd.org>, Bryan Venteicher <bryanv@freebsd.org>
Subject:   Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID:  <CAGaYwLcDVMA3=1x4hXXVvRojCBewWFZUyZfdiup=jo685+51+A@mail.gmail.com>
In-Reply-To: <1352428787.18632865.1391036503658.JavaMail.root@uoguelph.ca>
References:  <CABXB=RQj2evY7=Q0_7vbHrQrH3fPkW774gjNxWLwWbRXMzjdDA@mail.gmail.com> <1352428787.18632865.1391036503658.JavaMail.root@uoguelph.ca>

On Wed, Jan 29, 2014 at 5:01 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:

> J David wrote:
> > On Tue, Jan 28, 2014 at 7:32 PM, Rick Macklem <rmacklem@uoguelph.ca>
> > wrote:
> > > Hopefully Garrett and/or you will be able to do some testing of it
> > > and report back w.r.t. performance gains, etc.
> >
> > OK, it has seen light testing.
> >
> > As predicted the vtnet drops are eliminated and CPU load is reduced.
> >
> Ok, that's good news. Bryan, is increasing VTNET_MAX_TX_SEGS in the
> driver feasible?
>
>

I've been busy the last few days, and won't be able to get to any code
until the weekend.

The current MAX_TX_SEGS value is mostly arbitrary - the implicit limit is
VIRTIO_MAX_INDIRECT. That value is used in virtqueue.c to allocate an array
of 'struct vring_desc', which is 16 bytes per entry, and the allocation is
rounded up to the next power of two anyway, so we can make MAX_TX_SEGS
bigger without using any real additional memory.
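
To make the memory point concrete, here is a quick standalone sketch (not
the actual virtqueue.c allocation code; the struct and function names are
made up) of why a larger segment count can land in the same power-of-two
bucket:

    #include <stddef.h>
    #include <stdint.h>

    struct vring_desc_example {        /* mirrors the 16-byte struct vring_desc */
            uint64_t addr;
            uint32_t len;
            uint16_t flags;
            uint16_t next;
    };

    static size_t
    rounded_indirect_size(int nsegs)
    {
            size_t want = (size_t)nsegs * sizeof(struct vring_desc_example);
            size_t size = 1;

            while (size < want)        /* next power-of-two rounding */
                    size <<= 1;
            return (size);
    }

    /* rounded_indirect_size(33) == rounded_indirect_size(64) == 1024 */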

But also note that I put a MAX_TX_SEGS-sized array of 'struct sglist_seg'
on the stack, so it cannot be made too big. Even what is currently there is
probably already pushing what's a Good Idea to put on the stack anyway
(especially since it is near the bottom of a typically pretty deep call
stack). I've been meaning to move that to hang off the 'struct vtnet_txq'
instead.
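
Roughly what I have in mind, using the sglist(9) API, is sketched below;
the structure and function names are illustrative only, not the actual
driver code:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/errno.h>
    #include <sys/malloc.h>
    #include <sys/mbuf.h>
    #include <sys/sglist.h>

    /* Illustrative stand-in for the real struct vtnet_txq. */
    struct vtnet_txq_sketch {
            struct sglist   *vtntx_sg;      /* allocated once at attach time */
            /* ... existing per-queue state ... */
    };

    static int
    vtnet_txq_sg_alloc_sketch(struct vtnet_txq_sketch *txq, int max_tx_segs)
    {
            /* Preallocate the segment array instead of a big stack buffer. */
            txq->vtntx_sg = sglist_alloc(max_tx_segs, M_WAITOK);
            return (txq->vtntx_sg == NULL ? ENOMEM : 0);
    }

    static int
    vtnet_txq_map_mbuf_sketch(struct vtnet_txq_sketch *txq, struct mbuf *m)
    {
            /* Reuse the preallocated sglist on every transmit. */
            sglist_reset(txq->vtntx_sg);
            return (sglist_append_mbuf(txq->vtntx_sg, m));
    }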

I think all TSO-capable drivers that use m_collapse(..., 32) (and don't set
if_hw_tsomax) are broken - there appear to be several. I was slightly on top
of my game by using 33, since it appears m_collapse() does not touch the
pkthdr mbuf (I think that was my reasoning 3 years ago, and a quick glance
at the code seems to confirm it). I think drivers using m_defrag(..., 32)
are OK, but that function can be much, much more expensive.
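
For illustration, the pattern I'm describing looks something like the
sketch below (the constant and the fallback policy are illustrative, not a
patch); setting if_hw_tsomax at attach time is the other half of the fix,
so the stack never builds a TSO chain the ring cannot describe:

    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/malloc.h>
    #include <sys/mbuf.h>

    #define SKETCH_MAX_TX_SEGS      33      /* illustrative, mirrors the 33 above */

    /*
     * Try the cheap fix first: m_collapse() coalesces in place and, as far
     * as I can tell, leaves the pkthdr mbuf alone.  Fall back to m_defrag(),
     * which copies the whole chain and is much more expensive.
     */
    static struct mbuf *
    tso_chain_fixup_sketch(struct mbuf *m)
    {
            struct mbuf *m2;

            m2 = m_collapse(m, M_NOWAIT, SKETCH_MAX_TX_SEGS);
            if (m2 != NULL)
                    return (m2);

            m2 = m_defrag(m, M_NOWAIT);
            if (m2 == NULL)
                    m_freem(m);     /* no luck; caller sees NULL */
            return (m2);
    }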


> However, I do suspect we'll be putting a refined version of the patch
> in head someday (maybe April, sooner would have to be committed by
> someone else). I suspect that Garrett's code for server read will work
> well and I'll cobble something together for server readdir and client
> write.
>
> > The performance is also improved:
> >
> > Test      Before     After
> > SeqWr       1506      7461
> > SeqRd        566    192015
> > RndRd        602    218730
> > RndWr         44     13972
> >
> > All numbers are in KiB/s.
> >
> If you get the chance, you can try a few tunables on the server.
> vfs.nfsd.fha.enable=0
> - ken@ found that FHA was necessary for ZFS exports, to avoid out
>   of order reads from confusing ZFS's sequential reading heuristic.
> However, FHA also means that all readaheads for a file are serialized
> with the reads for the file (same fh->same nfsd thread). Somehow, it
> seems to me that doing reads concurrently in the server (given shared
> vnode locks) could be a good thing.
> --> I wonder what the story is for UFS?
> So, it would be interesting to see what disabling FHA does for the
> sequential read test.
>
> I think I already mentioned the DRC cache ones:
> vfs.nfsd.tcphighwater=100000
> vfs.nfsd.tcpcachetimeo=600 (actually I think Garrett uses 300)
>
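(For reference, those should be settable on the fly with sysctl(8); the
exact invocation below is a sketch, and fha.enable in particular may behave
differently if nfsd is already running.)

    # on the NFS server
    sysctl vfs.nfsd.fha.enable=0
    sysctl vfs.nfsd.tcphighwater=100000
    sysctl vfs.nfsd.tcpcachetimeo=600    # Garrett reportedly uses 300
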
> Good to see some progress, rick
> ps: Daniel reports that he will be able to test the patch this
>     weekend, to see if it fixes his problem that required TSO
>     to be disabled, so we'll wait and see.
>
> > There were initially still some problems with lousy hostcache values
> > on the client after the test, which is what causes the iperf
> > performance to tank after the NFS test, but after a reboot of both
> > sides and fresh retest, I haven't reproduced that again.  If it comes
> > back, I'll try to figure out what's going on.
> >
> Hopefully a networking type might know what is going on, because this
> is way out of my area of expertise.
>
> > But this definitely looks like a move in the right direction.
> >
> > Thanks!
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to
> > "freebsd-net-unsubscribe@freebsd.org"
> >
>


