From: Ahmed Kamal
Date: Thu, 23 Jul 2015 23:54:58 +0200
Subject: Re: Linux NFSv4 clients are getting (bad sequence-id error!)
To: Rick Macklem
Cc: Graham Allan, Ahmed Kamal via freebsd-fs

Can you please let me know the ultimate packet trace command I'd need to
run in case of any NFSv4 troubles? I guess it should be comprehensive,
even at the expense of a larger output size (which we can trim later).
Thanks a lot for the help!

On Thu, Jul 23, 2015 at 11:53 PM, Rick Macklem wrote:

> Graham Allan wrote:
> > For our part, the user whose code triggered the pathological behaviour
> > on SL5 reran it on SL6 without incident - I still see lots of
> > sequence-id errors in the logs, but nothing bad happened.
> >
> > I'd still like to ask them to rerun again on SL5 to see if the "accept
> > skipped seqid" patch had any effect, though I think we expect not. Maybe
> > it would be nice if I could get set up to capture rolling tcpdumps of
> > the nfs traffic before they run that though...
> >
> > Graham
> >
> > On 7/20/2015 10:26 PM, Ahmed Kamal wrote:
> > > Hi folks,
> > >
> > > I've upgraded a test client to rhel6 today, and I'll keep an eye on it
> > > to see what happens.
> > >
> > > During the process, I made the (I guess mistake) of zfs send | recv to a
> > > locally attached usb disk for backup purposes .. long story short,
> > > sharenfs property on the received filesystem was causing some nfs/mountd
> > > errors in logs .. I wasn't too happy with what I got .. I destroyed the
> > > backup datasets and the whole pool eventually .. and then rebooted the
> > > whole nas box ..
> > > After reboot my logs are still flooded with
> > >
> > > Jul 21 05:12:36 nas kernel: nfsrv_cache_session: no session
> > > Jul 21 05:13:07 nas last message repeated 7536 times
> > > Jul 21 05:15:08 nas last message repeated 29664 times
> > >
> > > Not sure what that means .. or how it can be stopped .. Anyway, will
> > > keep you posted on progress.
> >
> Oh, I didn't see the part about "reboot" before. Unfortunately, it sounds
> like the client isn't recovering after the session is lost. When the
> server reboots, the client(s) will get NFS4ERR_BAD_SESSION errors back
> because the server reboot has deleted all sessions. The
> NFS4ERR_BAD_SESSION should trigger state recovery on the client.
> (It doesn't sound like the clients went into recovery, starting with a
> Create_session operation, but without a packet trace, I can't be sure?)
>
> rick
>
> > --
> > -------------------------------------------------------------------------
> > Graham Allan - gta@umn.edu - allan@physics.umn.edu
> > School of Physics and Astronomy - University of Minnesota
> > -------------------------------------------------------------------------
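
To put a number on the log flood quoted above, here is a minimal Python sketch that totals how often a given kernel message appears, including the occurrences syslogd folds into "last message repeated N times" lines. It assumes each repeat line refers to the most recent ordinary message above it; the names count_message and sample are mine, for illustration only.

```python
import re

# Assumption: syslogd compresses duplicates as "last message repeated N times",
# and each such line refers to the most recent ordinary message above it.
REPEAT = re.compile(r"last message repeated (\d+) times?")


def count_message(lines, needle):
    """Count occurrences of `needle`, including syslog-compressed repeats."""
    total = 0
    last_matched = False  # did the most recent ordinary message contain needle?
    for line in lines:
        m = REPEAT.search(line)
        if m:
            # A repeat line adds N more copies of the previous message.
            if last_matched:
                total += int(m.group(1))
            continue
        last_matched = needle in line
        if last_matched:
            total += 1
    return total


# The three log lines quoted in this thread.
sample = [
    "Jul 21 05:12:36 nas kernel: nfsrv_cache_session: no session",
    "Jul 21 05:13:07 nas last message repeated 7536 times",
    "Jul 21 05:15:08 nas last message repeated 29664 times",
]

print(count_message(sample, "nfsrv_cache_session: no session"))  # prints 37201
```

On the three quoted lines this comes to 1 + 7536 + 29664 = 37201 messages in under three minutes. In practice the lines would be read from the syslog file rather than from a hard-coded list.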