From owner-freebsd-fs@freebsd.org Thu Jul 23 21:53:06 2015
Date: Thu, 23 Jul 2015 17:53:03 -0400 (EDT)
From: Rick Macklem
To: Graham Allan
Cc: Ahmed Kamal, Ahmed Kamal via freebsd-fs
Message-ID: <1935759160.2320694.1437688383362.JavaMail.zimbra@uoguelph.ca>
In-Reply-To: <55B12EB7.6030607@physics.umn.edu>
References: <684628776.2772174.1435793776748.JavaMail.zimbra@uoguelph.ca>
 <791936587.3443190.1435873993955.JavaMail.zimbra@uoguelph.ca>
 <20150716235022.GF32479@physics.umn.edu>
 <184170291.10949389.1437161519387.JavaMail.zimbra@uoguelph.ca>
 <55B12EB7.6030607@physics.umn.edu>
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)

Graham Allan wrote:
> For our part, the user whose code triggered the pathological behaviour
> on SL5 reran it on SL6 without incident - I still see lots of
> sequence-id errors in the logs, but nothing bad happened.
>
> I'd still like to ask them to rerun again on SL5 to see if the "accept
> skipped seqid" patch had any effect, though I think we expect not. Maybe
> it would be nice if I could get set up to capture rolling tcpdumps of
> the nfs traffic before they run that though...
>
> Graham
>
> On 7/20/2015 10:26 PM, Ahmed Kamal wrote:
> > Hi folks,
> >
> > I've upgraded a test client to rhel6 today, and I'll keep an eye on it
> > to see what happens.
> >
> > During the process, I made the (I guess mistake) of zfs send | recv to a
> > locally attached usb disk for backup purposes .. long story short,
> > sharenfs property on the received filesystem was causing some nfs/mountd
> > errors in logs ..
> > I wasn't too happy with what I got .. I destroyed the
> > backup datasets and the whole pool eventually .. and then rebooted the
> > whole nas box .. After reboot my logs are still flooded with
> >
> > Jul 21 05:12:36 nas kernel: nfsrv_cache_session: no session
> > Jul 21 05:13:07 nas last message repeated 7536 times
> > Jul 21 05:15:08 nas last message repeated 29664 times
> >
> > Not sure what that means .. or how it can be stopped .. Anyway, will
> > keep you posted on progress.
>
Oh, I didn't see the part about "reboot" before. Unfortunately, it sounds
like the client isn't recovering after the session is lost. When the server
reboots, the client(s) will get NFS4ERR_BAD_SESSION errors back because the
server reboot has deleted all sessions. The NFS4ERR_BAD_SESSION should
trigger state recovery on the client.
(It doesn't sound like the clients went into recovery, starting with a
Create_session operation, but without a packet trace, I can't be sure?)

rick

>
> --
> -------------------------------------------------------------------------
> Graham Allan - gta@umn.edu - allan@physics.umn.edu
> School of Physics and Astronomy - University of Minnesota
> -------------------------------------------------------------------------
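
To illustrate the recovery sequence described in the reply above, here is a toy Python model of it. This is only a sketch under stated assumptions: ToyServer, ToyClient and their methods are hypothetical names invented for the illustration, not code from the FreeBSD server or the Linux client, and real recovery involves more steps than the single Create_session shown here.

```python
from itertools import count

# Status names as used in the message above; plain strings, not wire values.
NFS4_OK = "NFS4_OK"
NFS4ERR_BAD_SESSION = "NFS4ERR_BAD_SESSION"


class ToyServer:
    """Hypothetical in-memory stand-in for an NFSv4.1 server."""

    def __init__(self):
        self._next_id = count(1)
        self.sessions = set()

    def create_session(self):
        # Hand out a fresh session id and remember it.
        session_id = next(self._next_id)
        self.sessions.add(session_id)
        return session_id

    def reboot(self):
        # A reboot discards every session the server knew about.
        self.sessions.clear()

    def op(self, session_id, name):
        # Any operation on an unknown session is rejected.
        if session_id not in self.sessions:
            return NFS4ERR_BAD_SESSION
        return NFS4_OK


class ToyClient:
    """Hypothetical client that recovers when its session is lost."""

    def __init__(self, server):
        self.server = server
        self.session_id = server.create_session()
        self.recoveries = 0

    def call(self, name):
        status = self.server.op(self.session_id, name)
        if status == NFS4ERR_BAD_SESSION:
            # Recovery starts with creating a new session,
            # after which the original operation is retried once.
            self.session_id = self.server.create_session()
            self.recoveries += 1
            status = self.server.op(self.session_id, name)
        return status


server = ToyServer()
client = ToyClient(server)
client.call("GETATTR")   # succeeds on the original session
server.reboot()          # all sessions are gone
client.call("GETATTR")   # rejected once, then retried on a new session
```

A client that behaves like this stops seeing the error after one round of recovery. A client that keeps sending operations on the old session would be rejected every time, which would be consistent with the repeated "no session" log lines, though a packet trace would be needed to confirm that.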