Date: Thu, 1 Jul 2010 13:18:15 -0700 (PDT)
From: alan bryan <alan.bryan@yahoo.com>
To: Garrett Cooper <yanefbsd@gmail.com>
Cc: freebsd-stable@freebsd.org
Subject: Re: NFS 75 second stall
Message-ID: <604345.41122.qm@web50504.mail.re2.yahoo.com>
In-Reply-To: <AANLkTillzgI775xETcZcmyj4TyTVihZJ5tSznxOoWE_r@mail.gmail.com>

--- On Thu, 7/1/10, Garrett Cooper <yanefbsd@gmail.com> wrote:

> From: Garrett Cooper <yanefbsd@gmail.com>
> Subject: Re: NFS 75 second stall
> To: "alan bryan" <alan.bryan@yahoo.com>
> Cc: freebsd-stable@freebsd.org
> Date: Thursday, July 1, 2010, 12:23 PM
> On Thu, Jul 1, 2010 at 11:51 AM, alan bryan <alan.bryan@yahoo.com> wrote:
> >
> > --- On Thu, 7/1/10, Garrett Cooper <yanefbsd@gmail.com> wrote:
> >
> >> From: Garrett Cooper <yanefbsd@gmail.com>
> >> Subject: Re: NFS 75 second stall
> >> To: "alan bryan" <alan.bryan@yahoo.com>
> >> Cc: freebsd-stable@freebsd.org
> >> Date: Thursday, July 1, 2010, 11:13 AM
> >> On Thu, Jul 1, 2010 at 11:01 AM, alan bryan <alan.bryan@yahoo.com> wrote:
> >> > Setup:
> >> >
> >> > server - FreeBSD 8-stable from today. 2 UFS dirs exported via NFS.
> >> > client - FreeBSD 8.0-Release. Running a test php script that copies
> >> > around various files to/from 2 separate NFS mounts.
> >> >
> >> > Situation:
> >> >
> >> > script is started (forked to do 20 simultaneous runs) and 20 1GB files
> >> > are copied to the NFS dir which works fine. When it then switches to
> >> > reading those files back and simultaneously writing to the other NFS
> >> > mount I see a hang of 75 seconds. If I do an "ls -l" on the NFS mount
> >> > it hangs too. After 75 seconds the client has reported:
> >> >
> >> > nfs server 192.168.10.133:/usr/local/export1: not responding
> >> > nfs server 192.168.10.133:/usr/local/export1: is alive again
> >> > nfs server 192.168.10.133:/usr/local/export1: not responding
> >> > nfs server 192.168.10.133:/usr/local/export1: is alive again
> >> >
> >> > and then things start working again. The server was originally
> >> > FreeBSD 8.0-Release also but was upgraded to the latest stable to see
> >> > if this issue could be avoided.
> >> >
> >> > # nfsstat -s -W -w 1
> >> >  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
> >> >       0      0      0    222    257      0      0      0
> >> >       0      0      0    178    135      0      0      0
> >> >       0      0      0     85    127      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >
> >> > ... for 75 rows of all zeros
> >> >
> >> >       0      0      0    272    266      0      0      0
> >> >       0      0      0    167    165      0      0      0
> >> >
> >> > I also tried runs with 15 simultaneous processes and 25. 15 processes
> >> > gave only about a 5 second stall but 25 gave again the same 75 second
> >> > stall.
> >> >
> >> > Further, I tested with 2 mounts to the same server but from ZFS
> >> > filesystems with the exact same stall/timeout periods. So, it doesn't
> >> > appear to matter what the underlying filesystem is - it's something
> >> > in NFS or networking code.
> >> >
> >> > Any ideas on what's going on here? What's causing the complete stall
> >> > period of zero NFS activity? Any flaws with my testing methods?
> >> >
> >> > Thanks for any and all help/ideas.
> >>
> >> What network driver are you using? Have you tried tcpdumping the
> >> packets?
> >> -Garrett
> >>
> >
> > I'm using igb currently but have also used em. I have not tried
> > tcpdumping the packets yet on this test. Any suggestions on things to
> > look out for (I'm not that familiar with that whole process).
> >
> > Which brings up another point - I'm using TCP connections for NFS,
> > not UDP.
>
> Is the net.inet.tcp.tso sysctl enabled or not? What about rxcsum and
> txcsum?
> Thanks,
> -Garrett
>

I haven't intentionally/explicitly set any of this so it's "default":

# sysctl net.inet.tcp.tso
net.inet.tcp.tso: 1

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=13b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,TSO4>
        ether 00:30:48:c3:26:94
        inet 192.168.10.133 netmask 0xffffff00 broadcast 192.168.10.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active

Thanks,
Alan
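
For anyone repeating this check across several hosts, the question about TSO, rxcsum and txcsum can be answered mechanically from the "options=" line that ifconfig prints. The following is only a rough sketch in Python, not part of the original exchange: the function name parse_ifconfig_options and the OFFLOADS tuple are made up for illustration, and the sample line is copied from the igb0 output in this message.

```python
import re


def parse_ifconfig_options(line):
    """Return the set of capability names found in an ifconfig 'options=' line.

    Hypothetical helper: it only understands the 'options=<hex><A,B,C>' form
    shown in this message and returns an empty set for anything else.
    """
    match = re.search(r"options=[0-9a-fA-F]+<([^>]*)>", line)
    if not match:
        return set()
    return {name for name in match.group(1).split(",") if name}


# Offload features asked about in the thread (TSO4 is the IPv4 TSO flag
# that appears in the igb0 output).
OFFLOADS = ("RXCSUM", "TXCSUM", "TSO4")

# Sample line copied from the igb0 output in this message.
options_line = "options=13b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,TSO4>"

enabled = parse_ifconfig_options(options_line)
status = {name: name in enabled for name in OFFLOADS}

for name in OFFLOADS:
    print(f"{name}: {'on' if status[name] else 'off'}")
```

For the igb0 line above, all three offloads show up as enabled, which matches the "default" state described in the reply.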
