Date: Wed, 29 Jan 2014 17:30:43 -0600
From: Bryan Venteicher <bryanv@freebsd.org>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-net@freebsd.org, J David <j.david.lists@gmail.com>,
    Garrett Wollman <wollman@freebsd.org>,
    Bryan Venteicher <bryanv@freebsd.org>
Subject: Re: Terrible NFS performance under 9.2-RELEASE?
Message-ID: <CAGaYwLcDVMA3=1x4hXXVvRojCBewWFZUyZfdiup=jo685+51+A@mail.gmail.com>
In-Reply-To: <1352428787.18632865.1391036503658.JavaMail.root@uoguelph.ca>
References: <CABXB=RQj2evY7=Q0_7vbHrQrH3fPkW774gjNxWLwWbRXMzjdDA@mail.gmail.com>
    <1352428787.18632865.1391036503658.JavaMail.root@uoguelph.ca>
On Wed, Jan 29, 2014 at 5:01 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> J David wrote:
> > On Tue, Jan 28, 2014 at 7:32 PM, Rick Macklem <rmacklem@uoguelph.ca>
> > wrote:
> > > Hopefully Garrett and/or you will be able to do some testing of it
> > > and report back w.r.t. performance gains, etc.
> >
> > OK, it has seen light testing.
> >
> > As predicted, the vtnet drops are eliminated and CPU load is reduced.
> >
> OK, that's good news. Bryan, is increasing VTNET_MAX_TX_SEGS in the
> driver feasible?
>

I've been busy the last few days and won't be able to get to any code
until the weekend.

The current MAX_TX_SEGS value is mostly arbitrary - the implicit limit
is VIRTIO_MAX_INDIRECT. The value is used in virtqueue.c to allocate an
array of 'struct vring_desc' entries, which are 16 bytes each, so with
malloc's next-power-of-2 rounding we can make it bigger without any real
additional memory use.

But also note that I put a MAX_TX_SEGS-sized array of 'struct
sglist_seg' on the stack, so it cannot be made too big. Even what is
currently there is probably pushing the limits of what's a Good Idea to
put on the stack anyway (especially since it is near the bottom of a
typically pretty deep call stack). I've been meaning to move that to
hang off the 'struct vtnet_txq' instead.

I think all TSO-capable drivers that use m_collapse(..., 32) (and don't
set if_hw_tsomax) are broken - there look to be several. I was slightly
on top of my game by using 33, since it appears m_collapse() does not
touch the pkthdr mbuf (I think that was my reasoning 3 years ago, and a
quick glance at the code suggests it is still the case). I think drivers
using m_defrag(..., 32) are OK, but that function can be much, much more
expensive.

> However, I do suspect we'll be putting a refined version of the patch
> in head someday (maybe April; sooner would have to be committed by
> someone else). I suspect that Garrett's code for server read will work
> well, and I'll cobble something together for server readdir and client
> write.
>
> > The performance is also improved:
> >
> > Test     Before      After
> > SeqWr      1506       7461
> > SeqRd       566     192015
> > RndRd       602     218730
> > RndWr        44      13972
> >
> > All numbers in kiB/sec.
> >
> If you get the chance, you can try a few tunables on the server:
>   vfs.nfsd.fha.enable=0
> - ken@ found that FHA was necessary for ZFS exports, to keep
> out-of-order reads from confusing ZFS's sequential-read heuristic.
> However, FHA also means that all readaheads for a file are serialized
> with the reads for the file (same fh -> same nfsd thread). Somehow, it
> seems to me that doing reads concurrently in the server (given shared
> vnode locks) could be a good thing.
> --> I wonder what the story is for UFS?
> So, it would be interesting to see what disabling FHA does for the
> sequential read test.
>
> I think I already mentioned the DRC cache ones:
>   vfs.nfsd.tcphighwater=100000
>   vfs.nfsd.tcpcachetimeo=600 (actually I think Garrett uses 300)
>
> Good to see some progress, rick
>
> ps: Daniel reports that he will be able to test the patch this
> weekend, to see if it fixes his problem that required TSO
> to be disabled, so we'll wait and see.
>
> > There were initially still some problems with lousy hostcache values
> > on the client after the test, which is what causes the iperf
> > performance to tank after the NFS test, but after a reboot of both
> > sides and a fresh retest, I haven't reproduced that again. If it
> > comes back, I'll try to figure out what's going on.
> Hopefully a networking type might know what is going on, because this
> is way out of my area of expertise.
>
> > But this definitely looks like a move in the right direction.
> >
> > Thanks!
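To make the sizing argument above concrete: if MAX_TX_SEGS were 33, the
indirect descriptor array would be 33 * 16 = 528 bytes, which kernel
malloc rounds up to a 1024-byte bucket anyway, so anything up to 64
segments costs no additional memory. Below is a minimal sketch of the
change Bryan says he has been meaning to make: preallocating the
scatter/gather segment array in the per-queue structure at attach time
rather than declaring it on the stack in the transmit path. The field
and function names are illustrative assumptions, not the actual vtnet
code.

    #include <sys/param.h>
    #include <sys/mbuf.h>
    #include <sys/sglist.h>

    #define VTNET_MAX_TX_SEGS	64	/* example value, not the shipped one */

    struct vtnet_txq {
    	/* ... existing per-queue fields ... */
    	struct sglist		vtntx_sg;	/* header for the array below */
    	struct sglist_seg	vtntx_segs[VTNET_MAX_TX_SEGS];
    };

    /* Done once at queue setup, instead of a stack array per transmit. */
    static void
    vtnet_txq_sg_init(struct vtnet_txq *txq)
    {
    	sglist_init(&txq->vtntx_sg, VTNET_MAX_TX_SEGS, txq->vtntx_segs);
    }

    /* In the transmit path, reuse the preallocated sglist. */
    static int
    vtnet_txq_map_mbuf(struct vtnet_txq *txq, struct mbuf *m)
    {
    	struct sglist *sg = &txq->vtntx_sg;

    	sglist_reset(sg);
    	return (sglist_append_mbuf(sg, m));
    }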
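Likewise, a sketch of the m_collapse() fallback pattern under
discussion. The maxfrags argument must leave room for any descriptor
the device consumes beyond the mbuf chain itself (vtnet, for instance,
spends one on the virtio-net header), and whether m_collapse() counts
the pkthdr mbuf against that limit is exactly the off-by-one Bryan
describes. This is generic pseudologic under those assumptions, not a
quote of any shipping driver:

    #define TX_MAX_SEGS	64	/* assumed device descriptor limit */

    static int
    tx_encap_sketch(struct sglist *sg, struct mbuf **mp)
    {
    	struct mbuf *m;
    	int error;

    	error = sglist_append_mbuf(sg, *mp);
    	if (error != EFBIG)
    		return (error);

    	/*
    	 * Too many segments: rearrange the chain in place, leaving
    	 * one descriptor free for the device header.
    	 */
    	m = m_collapse(*mp, M_NOWAIT, TX_MAX_SEGS - 1);
    	if (m == NULL)
    		return (ENOBUFS);	/* caller frees *mp */
    	*mp = m;

    	sglist_reset(sg);
    	return (sglist_append_mbuf(sg, m));
    }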
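And for anyone reproducing the test, the server-side knobs Rick lists,
in /etc/sysctl.conf form; the values are the ones from his mail:

    # Apply with sysctl(8) at runtime or persist in /etc/sysctl.conf.
    vfs.nfsd.fha.enable=0          # disable file-handle affinity for the read test
    vfs.nfsd.tcphighwater=100000   # DRC: cache more TCP replies before trimming
    vfs.nfsd.tcpcachetimeo=600     # DRC: entry timeout in seconds (Garrett uses 300)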