From: Garrett Cooper
To: alan bryan
Cc: freebsd-stable@freebsd.org
Date: Thu, 1 Jul 2010 11:13:19 -0700
Subject: Re: NFS 75 second stall
In-Reply-To: <425902.41392.qm@web50501.mail.re2.yahoo.com>
List-Id: Production branch of FreeBSD source code

On Thu, Jul 1, 2010 at 11:01 AM, alan bryan wrote:
> Setup:
>
> server - FreeBSD 8-stable from today. 2 UFS dirs exported via NFS.
> client - FreeBSD 8.0-Release. Running a test php script that copies around various files to/from 2 separate NFS mounts.
>
> Situation:
>
> script is started (forked to do 20 simultaneous runs) and 20 1GB files are copied to the NFS dir which works fine. When it then switches to reading those files back and simultaneously writing to the other NFS mount I see a hang of 75 seconds. If I do an "ls -l" on the NFS mount it hangs too. After 75 seconds the client has reported:
>
> nfs server 192.168.10.133:/usr/local/export1: not responding
> nfs server 192.168.10.133:/usr/local/export1: is alive again
> nfs server 192.168.10.133:/usr/local/export1: not responding
> nfs server 192.168.10.133:/usr/local/export1: is alive again
>
> and then things start working again. The server was originally FreeBSD 8.0-Release also but was upgraded to the latest stable to see if this issue could be avoided.
>
> # nfsstat -s -W -w 1
>  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
>       0      0      0    222    257      0      0      0
>       0      0      0    178    135      0      0      0
>       0      0      0     85    127      0      0      0
>       0      0      0      0      0      0      0      0
>       0      0      0      0      0      0      0      0
>       0      0      0      0      0      0      0      0
>       0      0      0      0      0      0      0      0
>       0      0      0      0      0      0      0      0
>
> ... for 75 rows of all zeros
>
>       0      0      0    272    266      0      0      0
>       0      0      0    167    165      0      0      0
>
> I also tried runs with 15 simultaneous processes and 25. 15 processes gave only about a 5 second stall but 25 gave again the same 75 second stall.
>
> Further, I tested with 2 mounts to the same server but from ZFS filesystems with the exact same stall/timeout periods. So, it doesn't appear to matter what the underlying filesystem is - it's something in NFS or networking code.
>
> Any ideas on what's going on here? What's causing the complete stall period of zero NFS activity? Any flaws with my testing methods?
>
> Thanks for any and all help/ideas.

What network driver are you using? Have you tried tcpdumping the packets?
-Garrett
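
To compare stall lengths across runs (15, 20, 25 processes) without counting rows by eye, the nfsstat output quoted above could be fed through a small script. This is only a sketch in Python: the function name and the sample rows are hypothetical, and it assumes each row covers one second (as with `-w 1`) and that any "> " quoting has been stripped from the captured output first.

```python
def longest_zero_run(lines):
    """Return the longest run of consecutive all-zero counter rows."""
    longest = current = 0
    for line in lines:
        fields = line.split()
        # Skip blank lines and header rows (GtAttr Lookup ...): anything
        # that is not made up purely of integer counters.
        if not fields or not all(f.isdigit() for f in fields):
            continue
        if all(int(f) == 0 for f in fields):
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Hypothetical, shortened sample in the same column layout as the output above.
sample = """\
 GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
      0      0      0    222    257      0      0      0
      0      0      0     85    127      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0      0      0      0      0      0
      0      0      0    272    266      0      0      0
""".splitlines()

print(longest_zero_run(sample))
```

With one row per second, the returned count approximates the stall duration in seconds; header rows are skipped rather than treated as the end of a run, on the assumption that nfsstat may reprint its header periodically.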