From: alan bryan <alan.bryan@yahoo.com>
To: Garrett Cooper
Cc: freebsd-stable@freebsd.org
Date: Thu, 1 Jul 2010 13:18:15 -0700 (PDT)
Subject: Re: NFS 75 second stall

--- On Thu, 7/1/10, Garrett Cooper wrote:

> From: Garrett Cooper
> Subject: Re: NFS 75 second stall
> To: "alan bryan"
> Cc: freebsd-stable@freebsd.org
> Date: Thursday, July 1, 2010, 12:23 PM
> On Thu, Jul 1, 2010 at 11:51 AM, alan bryan wrote:
> >
> > --- On Thu, 7/1/10, Garrett Cooper wrote:
> >
> >> From: Garrett Cooper
> >> Subject: Re: NFS 75 second stall
> >> To: "alan bryan"
> >> Cc: freebsd-stable@freebsd.org
> >> Date: Thursday, July 1, 2010, 11:13 AM
> >> On Thu, Jul 1, 2010 at 11:01 AM, alan bryan wrote:
> >> > Setup:
> >> >
> >> > server - FreeBSD 8-stable from today.  2 UFS dirs exported
> >> > via NFS.
> >> > client - FreeBSD 8.0-Release.  Running a test php script that
> >> > copies around various files to/from 2 separate NFS mounts.
> >> >
> >> > Situation:
> >> >
> >> > script is started (forked to do 20 simultaneous runs) and 20
> >> > 1GB files are copied to the NFS dir which works fine.  When it
> >> > then switches to reading those files back and simultaneously
> >> > writing to the other NFS mount I see a hang of 75 seconds.  If
> >> > I do an "ls -l" on the NFS mount it hangs too.  After 75
> >> > seconds the client has reported:
> >> >
> >> > nfs server 192.168.10.133:/usr/local/export1: not responding
> >> > nfs server 192.168.10.133:/usr/local/export1: is alive again
> >> > nfs server 192.168.10.133:/usr/local/export1: not responding
> >> > nfs server 192.168.10.133:/usr/local/export1: is alive again
> >> >
> >> > and then things start working again.  The server was
> >> > originally FreeBSD 8.0-Release also but was upgraded to the
> >> > latest stable to see if this issue could be avoided.
> >> >
> >> > # nfsstat -s -W -w 1
> >> >  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
> >> >       0      0      0    222    257      0      0      0
> >> >       0      0      0    178    135      0      0      0
> >> >       0      0      0     85    127      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >       0      0      0      0      0      0      0      0
> >> >
> >> > ... for 75 rows of all zeros
> >> >
> >> >       0      0      0    272    266      0      0      0
> >> >       0      0      0    167    165      0      0      0
> >> >
> >> > I also tried runs with 15 simultaneous processes and 25.  15
> >> > processes gave only about a 5 second stall but 25 gave again
> >> > the same 75 second stall.
> >> >
> >> > Further, I tested with 2 mounts to the same server but from
> >> > ZFS filesystems with the exact same stall/timeout periods.
> >> > So, it doesn't appear to matter what the underlying
> >> > filesystem is - it's something in NFS or networking code.
> >> >
> >> > Any ideas on what's going on here?  What's causing the
> >> > complete stall period of zero NFS activity?  Any flaws with
> >> > my testing methods?
> >> >
> >> > Thanks for any and all help/ideas.
> >>
> >> What network driver are you using? Have you tried
> >> tcpdumping the packets?
> >> -Garrett
> >>
> >
> > I'm using igb currently but have also used em.  I have not tried
> > tcpdumping the packets yet on this test.  Any suggestions on
> > things to look out for (I'm not that familiar with that whole
> > process).
> >
> > Which brings up another point - I'm using TCP connections for
> > NFS, not UDP.
>
>     Is the net.inet.tcp.tso sysctl enabled or not? What about
> rxcsum and txcsum?
> Thanks,
> -Garrett
>

I haven't intentionally/explicitly set any of this so it's "default":

# sysctl net.inet.tcp.tso
net.inet.tcp.tso: 1


igb0: flags=8843 metric 0 mtu 1500
	options=13b
	ether 00:30:48:c3:26:94
	inet 192.168.10.133 netmask 0xffffff00 broadcast 192.168.10.255
	media: Ethernet autoselect (1000baseT )
	status: active

Thanks,
Alan
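
The capability names that ifconfig normally prints after "options=13b" did not survive in the output above, so the rxcsum/txcsum question has to be answered from the hex value itself. The following is a minimal Python sketch of that decoding. The bit table is my recollection of the IFCAP_* definitions in FreeBSD's net/if.h and should be checked against the source tree in use; the helper names decode_ifcap and unknown_bits are hypothetical, not part of any FreeBSD tool.

```python
# Capability bits as I recall them from FreeBSD's <net/if.h> (IFCAP_*).
# Assumption: only the low 11 bits are listed; verify against your tree.
IFCAP_BITS = {
    0x001: "RXCSUM",
    0x002: "TXCSUM",
    0x004: "NETCONS",
    0x008: "VLAN_MTU",
    0x010: "VLAN_HWTAGGING",
    0x020: "JUMBO_MTU",
    0x040: "POLLING",
    0x080: "VLAN_HWCSUM",
    0x100: "TSO4",
    0x200: "TSO6",
    0x400: "LRO",
}


def decode_ifcap(options_hex):
    """Return the capability names set in an ifconfig 'options=' hex value."""
    value = int(options_hex, 16)
    return [name for bit, name in sorted(IFCAP_BITS.items()) if value & bit]


def unknown_bits(options_hex):
    """Return any set bits that the table above does not cover."""
    value = int(options_hex, 16)
    known = 0
    for bit in IFCAP_BITS:
        known |= bit
    return value & ~known


if __name__ == "__main__":
    # The value reported for igb0 in the message above.
    print(decode_ifcap("13b"))
    print(hex(unknown_bits("13b")))
```

Under that table, 13b decodes to RXCSUM, TXCSUM, VLAN_MTU, VLAN_HWTAGGING, JUMBO_MTU and TSO4, which would mean both checksum offloads and TSO are enabled on igb0. If one wanted to rule them out as a factor, "ifconfig igb0 -tso -rxcsum -txcsum" would turn them off for a test run; whether that changes the stall is not something the thread has established.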