Date: Tue, 1 Apr 2014 09:53:49 +0800 From: Marcelo Araujo <araujobsdport@gmail.com> To: Rick Macklem <rmacklem@uoguelph.ca> Cc: FreeBSD Filesystems <freebsd-fs@freebsd.org>, Alexander Motin <mav@freebsd.org> Subject: Re: RFC: How to fix the NFS/iSCSI vs TSO problem Message-ID: <CAOfEmZikpVQo84Tq-4Lp5EXHYkXk3XmF69XPkod1-PtQKdEuAw@mail.gmail.com> In-Reply-To: <2056019527.3811582.1396316605342.JavaMail.root@uoguelph.ca> References: <CAOfEmZhUtUhX_OOGV6R4ogTJPTL0cEPGDv3WgPM2M3hiPs9mxQ@mail.gmail.com> <2056019527.3811582.1396316605342.JavaMail.root@uoguelph.ca>
next in thread | previous in thread | raw e-mail | index | archive | help
2014-04-01 9:43 GMT+08:00 Rick Macklem <rmacklem@uoguelph.ca>: > Marcelo Araujo wrote: > > > > Hello Rick, > > > > > > We have made couple more benchmarks here with additional options, > > such like '64 threads' and readahead=8. > > > I can't remember, but if you haven't already done so, another thing to > try are these sysctls on the server: > sysctl vfs.nfsd.tcphighwater=100000 > sysctl vfs.nfsd.tcpcachetimeo=300 > > These should reduce the server's CPU overhead (how important these > setting are depends on how current your kernel is). > I haven't done it, I don't have these sysctl on my system. > > > > > Now, we add nfsstat and netstat -m into the table. > > Here attached is the full benchmark, and I can say, this patch really > > improved the read speed. > > > I noticed a significant reduction in CPU usage on the server (about 20%). > An interesting question would be "Is this CPU reduction a result of > avoiding the m_defrag() calls in the ix driver?". > I do believe it is because avoid m_defrag(), but I didn't try to dig into it to check if is really m_defrag(). > Unfortunately, the only way I can think of answering this is doing the > benchmarks on hardware without the 32 mbuf chain limitation, but I > doubt that you can do that? > No, I don't have any hardware without the 32mbuf limitation. > > Put another way, it would be interesting to compare "with vs without" > the patch on machines where the network interface can handle 35 mbufs > in the transmit chain, so there aren't m_defrag() calls being done for > the non-patched case. > > Anyhow, have fun with it, rick > Maybe Christopher can do this benchmark as well in his environment. > > > > > I understand your concern about add more one sysctl, however maybe we > > can do something like ZFS does, if it detect the system is AMD and > > have more than X of RAM it enables some options by default, or a > > kind of warning can be displayed show the new sysctl option. > > > > > > Of, course other people opinion will be very welcome. > > > > > > Best Regards, > > > > > > > > 2014-03-29 6:44 GMT+08:00 Rick Macklem < rmacklem@uoguelph.ca > : > > > > > > > > > > Marcelo Araujo wrote: > > > 2014-03-28 5:37 GMT+08:00 Rick Macklem < rmacklem@uoguelph.ca >: > > > > > > > Christopher Forgeron wrote: > > > > > I'm quite sure the problem is on 9.2-RELEASE, not 9.1-RELEASE > > > > > or > > > > > earlier, > > > > > as a 9.2-STABLE from last year I have doesn't exhibit the > > > > > problem. > > > > > New > > > > > code in if.c at line 660 looks to be what is starting this, > > > > > which > > > > > makes me > > > > > wonder how TSO was being handled before 9.2. > > > > > > > > > > I also like Rick's NFS patch for cluster size. I notice an > > > > > improvement, but > > > > > don't have solid numbers yet. I'm still stress testing it as we > > > > > speak. > > > > > > > > > Unfortunately, this causes problems for small i386 systems, so I > > > > am reluctant to commit it to head. Maybe a variant that is only > > > > enabled for amd64 systems with lots of memory would be ok? > > > > > > > > > > > Rick, > > > > > > Maybe you can create a SYSCTL to enable/disable it by the end user > > > will be > > > more reasonable. Also, of course, it is so far safe if only 64Bits > > > CPU can > > > enable this SYSCTL. Any other option seems not OK, will be hard to > > > judge > > > what is lots of memory and what is not, it will depends what is > > > running > > > onto the system. > > > > > I guess adding it so it can be optionally enabled via a sysctl isn't > > a bad idea. I think the largest risk here is "how do you tell people > > what the risk of enabling this is"? > > > > There are already a bunch of sysctls related to NFS that few people > > know how to use. (I recall that Alexander has argued that folk don't > > want > > worry about these tunables and I tend to agree.) > > > > If I do a variant of the patch that uses m_getjcl(..M_WAITOK..), then > > at least the "breakage" is thread(s) sleeping on "btallo", which is > > fairly easy to check for, althouggh rather obscure. > > (Btw, I've never reproduced this for a patch that changes the code to > > always use MJUMPAGESIZE mbuf clusters. > > I can only reproduce it intermittently when the patch mixes > > allocation of > > MCLBYTES clusters and MJUMPAGESIZE clusters.) > > > > I've been poking at it to try and figure out how to get > > m_getjcl(..M_NOWAIT..) > > to return NULL instead of looping when it runs out of boundary tags > > (to > > see if that can result in a stable implementation of the patch), but > > haven't had much luck yet. > > > > Bottom line: > > I just don't like committing a patch that can break the system in > > such an > > obscure way, even if it is enabled via a sysctl. > > > > Others have an opinion on this? > > > > Thanks, rick > > > > > > > The SYSCTL will be great, and in case you don't have time to do it, > > > I > > > can > > > give you a hand. > > > > > > I'm gonna do more benchmarks today and will send another report, > > > but > > > in our > > > product here, I'm inclined to use this patch, because 10~20% speed > > > up > > > in > > > read for me is a lot. :-) > > > > > > Thank you so much and best regards, > > > -- > > > Marcelo Araujo > > > araujo@FreeBSD.org > > > > > > > _______________________________________________ > > > freebsd-net@freebsd.org mailing list > > > http://lists.freebsd.org/mailman/listinfo/freebsd-net > > > To unsubscribe, send any mail to > > > " freebsd-net-unsubscribe@freebsd.org " > > > > > > > > > > > > > -- > > Marcelo Araujo > > araujo@FreeBSD.org > -- Marcelo Araujo araujo@FreeBSD.org
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOfEmZikpVQo84Tq-4Lp5EXHYkXk3XmF69XPkod1-PtQKdEuAw>