From: Rick Macklem
To: Ahmed Kamal
Cc: Graham Allan, Ahmed Kamal via freebsd-fs
Date: Thu, 23 Jul 2015 17:59:09 -0400 (EDT)
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Ahmed Kamal wrote:
> Can you please let me know the ultimate packet trace command I'd need to
> run in case of any nfs4 troubles .. I guess this should be comprehensive
> even at the expense of a larger output size (which we can trim later)..
> Thanks a lot for the help!
>
tcpdump -s 0 -w FILE.pcap host CLIENT
(FILE refers to a file name you choose and CLIENT refers to the host name
of a client generating traffic.)
--> But you won't be able to allow this to run for long during the storm or
    the file will be huge.

Then you look at FILE.pcap in wireshark, which knows NFS.

rick

> On Thu, Jul 23, 2015 at 11:53 PM, Rick Macklem wrote:
>
> > Graham Allan wrote:
> > > For our part, the user whose code triggered the pathological behaviour
> > > on SL5 reran it on SL6 without incident - I still see lots of
> > > sequence-id errors in the logs, but nothing bad happened.
> > >
> > > I'd still like to ask them to rerun again on SL5 to see if the "accept
> > > skipped seqid" patch had any effect, though I think we expect not. Maybe
> > > it would be nice if I could get set up to capture rolling tcpdumps of
> > > the nfs traffic before they run that though...
> > >
> > > Graham
> > >
> > > On 7/20/2015 10:26 PM, Ahmed Kamal wrote:
> > > > Hi folks,
> > > >
> > > > I've upgraded a test client to rhel6 today, and I'll keep an eye on it
> > > > to see what happens.
> > > >
> > > > During the process, I made the (I guess mistake) of zfs send | recv to a
> > > > locally attached usb disk for backup purposes .. long story short,
> > > > sharenfs property on the received filesystem was causing some nfs/mountd
> > > > errors in logs .. I wasn't too happy with what I got .. I destroyed the
> > > > backup datasets and the whole pool eventually .. and then rebooted the
> > > > whole nas box .. After reboot my logs are still flooded with
> > > >
> > > > Jul 21 05:12:36 nas kernel: nfsrv_cache_session: no session
> > > > Jul 21 05:13:07 nas last message repeated 7536 times
> > > > Jul 21 05:15:08 nas last message repeated 29664 times
> > > >
> > > > Not sure what that means .. or how it can be stopped .. Anyway, will
> > > > keep you posted on progress.
> > >
> > Oh, I didn't see the part about "reboot" before. Unfortunately, it sounds
> > like the client isn't recovering after the session is lost. When the server
> > reboots, the client(s) will get NFS4ERR_BAD_SESSION errors back because the
> > server reboot has deleted all sessions. The NFS4ERR_BAD_SESSION should
> > trigger state recovery on the client.
> > (It doesn't sound like the clients went into recovery, starting with a
> > Create_session operation, but without a packet trace, I can't be sure?)
> >
> > rick
> >
> > >
> > > --
> > > -------------------------------------------------------------------------
> > > Graham Allan - gta@umn.edu - allan@physics.umn.edu
> > > School of Physics and Astronomy - University of Minnesota
> > > -------------------------------------------------------------------------
> > >
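
The file-size concern raised in the reply, and the rolling capture Graham mentions, could perhaps be handled with tcpdump's -C option (start a new file after roughly the given number of millions of bytes) together with -W (limit the number of files, overwriting the oldest). The following is only a sketch: the helper name nfs_capture_cmd, the nfs-trace prefix, client.example.com, and the 100/10 defaults are illustrative assumptions and do not come from the thread. The helper only prints the command so it can be reviewed before being run as root.

```shell
# Hypothetical helper (not from the thread): prints, but does not run,
# a size-limited rolling tcpdump command for one client.
nfs_capture_cmd() {
    client="$1"                # host name of the client generating traffic
    prefix="${2:-nfs-trace}"   # output file prefix you choose (assumed default)
    size_mb="${3:-100}"        # -C: rotate after about this many million bytes
    count="${4:-10}"           # -W: keep at most this many files, then reuse them
    printf 'tcpdump -s 0 -C %s -W %s -w %s.pcap host %s\n' \
        "$size_mb" "$count" "$prefix" "$client"
}

# Show the command for an example client; run the printed line as root.
nfs_capture_cmd client.example.com
```

With -C and -W together, tcpdump appends a number to the output file name and cycles through the files, so disk use stays bounded at roughly size times count while the most recent traffic is kept for wireshark.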