Date:      Thu, 1 Jul 2010 12:23:49 -0700
From:      Garrett Cooper <yanefbsd@gmail.com>
To:        alan bryan <alan.bryan@yahoo.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: NFS 75 second stall
Message-ID:  <AANLkTillzgI775xETcZcmyj4TyTVihZJ5tSznxOoWE_r@mail.gmail.com>
In-Reply-To: <538823.39365.qm@web50508.mail.re2.yahoo.com>
References:  <AANLkTilNvy3FYUNjjiJ85eWrF7jTAvJJ9E7Q2eqhhQj6@mail.gmail.com> <538823.39365.qm@web50508.mail.re2.yahoo.com>

On Thu, Jul 1, 2010 at 11:51 AM, alan bryan <alan.bryan@yahoo.com> wrote:
>
>
> --- On Thu, 7/1/10, Garrett Cooper <yanefbsd@gmail.com> wrote:
>
>> From: Garrett Cooper <yanefbsd@gmail.com>
>> Subject: Re: NFS 75 second stall
>> To: "alan bryan" <alan.bryan@yahoo.com>
>> Cc: freebsd-stable@freebsd.org
>> Date: Thursday, July 1, 2010, 11:13 AM
>> On Thu, Jul 1, 2010 at 11:01 AM, alan bryan <alan.bryan@yahoo.com> wrote:
>> > Setup:
>> >
>> > server - FreeBSD 8-stable from today.  2 UFS dirs exported via NFS.
>> > client - FreeBSD 8.0-Release.  Running a test php script that copies
>> > around various files to/from 2 separate NFS mounts.
>> >
>> > Situation:
>> >
>> > script is started (forked to do 20 simultaneous runs) and 20 1GB files
>> > are copied to the NFS dir which works fine.  When it then switches to
>> > reading those files back and simultaneously writing to the other NFS
>> > mount I see a hang of 75 seconds.  If I do an "ls -l" on the NFS mount
>> > it hangs too.  After 75 seconds the client has reported:
>> >
>> > nfs server 192.168.10.133:/usr/local/export1: not responding
>> > nfs server 192.168.10.133:/usr/local/export1: is alive again
>> > nfs server 192.168.10.133:/usr/local/export1: not responding
>> > nfs server 192.168.10.133:/usr/local/export1: is alive again
>> >
>> > and then things start working again.  The server was originally
>> > FreeBSD 8.0-Release also but was upgraded to the latest stable to see
>> > if this issue could be avoided.
>> >
>> > # nfsstat -s -W -w 1
>> >  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
>> >       0      0      0    222    257      0      0      0
>> >       0      0      0    178    135      0      0      0
>> >       0      0      0     85    127      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >       0      0      0      0      0      0      0      0
>> >
>> > ... for 75 rows of all zeros
>> >
>> >       0      0      0    272    266      0      0      0
>> >       0      0      0    167    165      0      0      0
>> >
>> > I also tried runs with 15 simultaneous processes and 25.  15 processes
>> > gave only about a 5 second stall but 25 gave again the same 75 second
>> > stall.
>> >
>> > Further, I tested with 2 mounts to the same server but from ZFS
>> > filesystems with the exact same stall/timeout periods.  So, it doesn't
>> > appear to matter what the underlying filesystem is - it's something in
>> > NFS or networking code.
>> >
>> > Any ideas on what's going on here?  What's causing the complete stall
>> > period of zero NFS activity?  Any flaws with my testing methods?
>> >
>> > Thanks for any and all help/ideas.
>>
>> What network driver are you using? Have you tried tcpdumping the packets?
>> -Garrett
>>
>
> I'm using igb currently but have also used em.  I have not tried
> tcpdumping the packets yet on this test.  Any suggestions on things to
> look out for?  (I'm not that familiar with that whole process.)
>
> Which brings up another point - I'm using TCP connections for NFS, not UDP.

    Is the net.inet.tcp.tso sysctl enabled or not? What about rxcsum and txcsum?
Thanks,
-Garrett
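
For anyone following along, here is a minimal sketch of how the two questions above might be answered from a shell. The interface name igb0 and the helper function offload_flags are assumptions made for illustration and are not part of the original message; RXCSUM, TXCSUM and TSO4 are the capability names ifconfig(8) prints on its "options=" line.

```sh
# Hypothetical helper: read ifconfig output on stdin and report whether
# RXCSUM, TXCSUM and TSO4 appear in the first "options=<...>" line.
offload_flags() {
    # Extract the comma-separated capability list between < and >.
    opts=$(grep -m1 'options=' | sed -n 's/.*<\(.*\)>.*/\1/p')
    for flag in RXCSUM TXCSUM TSO4; do
        case ",$opts," in
            *",$flag,"*) echo "$flag on" ;;
            *)           echo "$flag off" ;;
        esac
    done
}

# On the FreeBSD client or server (igb0 is an assumed interface name):
#   sysctl net.inet.tcp.tso        # 1 = TSO enabled globally, 0 = disabled
#   ifconfig igb0 | offload_flags  # per-interface offload settings
```

If any of these turn out to be enabled, one way to test whether they are involved is to switch them off temporarily with "ifconfig igb0 -tso -rxcsum -txcsum" (again assuming igb0) and rerun the copy test to see whether the stall changes.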


