Subject: Re: Linux NFSv4 clients: bad sequence-id errors.
To: Rick Macklem, "freebsd-fs@freebsd.org"
Cc: "rmacklem@FreeBSD.org"
From: Ruben
Message-ID: <897d0abd-0770-a61c-5a1f-f267cd10e3d2@osfux.nl>
Date: Wed, 28 Feb 2018 12:45:32 +0100

On 27/02/2018 23:54, Rick Macklem wrote:
> Ruben wrote:
>> I'm experiencing a strange issue on a machine providing a couple of
>> nfsv4 exports. A Linux client that generates a lot of traffic to and
>> from the nfs server sometimes starts throwing "bad sequence-id errors":
>> Feb 27 10:39:42 localhost kernel: [12481477.608103] NFS: v4 server
>> returned a bad sequence-id error on an unconfirmed sequence 80f7d0d0!
> The handling of sequence-id in NFSv4.0 is complex and I won't even try
> to guess why this is happening.
> I am surprised that your Linux mounts are using NFSv4.0 and not NFSv4.1?
> (Usually Linux uses the most recent version supported by the server.)
> I mention this since "sessions" replaced the sequence-id stuff in NFSv4.1
> and, as such, shouldn't have such an issue.

After some digging around I found out that the Linux kernel on the client
(Raspbian running Hexxeh firmware) was compiled with the following options
regarding NFSv4 functionality:

    CONFIG_NFS_V4=y
    # CONFIG_NFS_V4_1 is not set

Thank you for pointing that out, I'll resolve that.

>
>> They typically occur after a couple of months of uptime on the nfsd
>> machine. Every couple of seconds they are thrown by the client. The
>> situation is "remedied" by restarting the nfsd on the server.
>> Although functionality on the specific client does not appear to be
>> affected (much?), it's a bit disturbing. I've done some digging and found :
> The fact that this is fixed by restarting the nfsd suggests a client side
> problem.
> Why?
> Because restarting the nfsd does not reset any server state, so the sequence-id
> situation would not be affected by doing this. (To get rid of server side state,
> you must unload the nfsd.ko after killing off the nfsd daemon.)
>
> All restarting the nfsd daemon will do is force the client to establish a new
> TCP connection. That is at a layer below the NFS state.

Thank you for your elaboration!

>
>> https://lists.freebsd.org/pipermail/freebsd-fs/2015-July/021707.html
>>
>> and the patch attached by Rick ( nfsv41exch.patch :
>> http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20150729/586f776a/attachment.bin
>> ) .
>>
>> Since the issue started manifesting itself I have restarted the nfs
>> daemon (grabbed a pcap and the corresponding error lines mentioning the
>> sequences prior to doing that in case anyone is interested).
> If you email me the pcap as an attachment, I can take a look at it in wireshark.

I've sent you a download link to the results of my troubleshooting efforts.
Please do not look into it purely on my behalf (I'm focussing on getting the
client to actually run NFSv4.1 instead of v4.0)!

>
>> The nfs server runs FreeBSD 11.1 :
> I'm being lazy and not looking, but I am almost sure a 2015 patch will be in 11.1
> and probably also in 10.2 and 11.0.
>> I'm wondering: can the 2015 patch provided by Rick still be "safely"
>> applied or has the nfs code changed too much since then? I've witnessed
>> this issue a couple of times now and would very much like to test the
>> patch provided.
> As above, I'd be surprised if the patch isn't already in your 11.1 kernel,
> but you can take a look.
> If it isn't, let me know because that means it slipped through the cracks
> and I need to get it committed, etc.
I'm not at all versed in C, but if I find anything I'll get back to you.

> rick

Ruben
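
For anyone following the thread who wants to confirm which NFS version a Linux client actually negotiated (4.0 versus 4.1, the distinction discussed above), the following is a minimal sketch and not something taken from this exchange. It assumes the mount table uses the usual /proc/mounts layout and that the options field carries "vers=" (or the older "nfsvers=" / "minorversion=" forms). The function name nfs_versions is hypothetical.

```python
def nfs_versions(mounts_text):
    """Return {mount_point: version} for every NFS mount in mounts_text.

    mounts_text is assumed to be in /proc/mounts format:
    device, mount point, fstype, options, dump, pass.
    """
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # Turn "rw,vers=4.0,rsize=65536" into {"rw": "", "vers": "4.0", ...}.
        opts = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            opts[key] = value
        vers = opts.get("vers") or opts.get("nfsvers") or "unknown"
        # Assumption: older kernels may report "vers=4,minorversion=1"
        # rather than "vers=4.1"; join the two parts in that case.
        if vers.isdigit() and "minorversion" in opts:
            vers = vers + "." + opts["minorversion"]
        versions[fields[1]] = vers
    return versions
```

On the client this could be called as nfs_versions(open("/proc/mounts").read()); a mount still reported as 4.0 after rebuilding the kernel with CONFIG_NFS_V4_1 would suggest the minor version is not being negotiated.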