From: Ahmed Kamal
Date: Wed, 1 Jul 2015 18:23:35 +0200
Subject: Linux NFSv4 clients are getting (bad sequence-id error!)
To: freebsd-fs@freebsd.org

Hi all,

I'm a refugee from Linux land. I just set up my first FreeBSD 10.1 ZFS box, sharing /home over NFS. Since every home directory is its own ZFS dataset, I chose NFSv4 so that clients can recursively mount any directory under /home (I understand NFSv4 is a must in this scenario!).

I'm able to mount from Linux (RHEL 5, latest kernel) successfully, and users are working fine. However, every now and then a user screams that his session is frozen. Usually the processes are stuck in nfs_wait or rpc_* state. I tried a much newer Linux kernel (3.2), but it hit the same problem.

The errors in the Linux log files are mostly:

Jul 1 17:41:47 mammoth kernel: NFS: v4 server nas returned a bad sequence-id error!
Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_locks: unhandled error -11. Zeroing state
Jul 1 17:52:32 mammoth kernel: nfs4_reclaim_open_state: Lock reclaim failed!

My search led me to a detailed analysis of the issue (https://access.redhat.com/solutions/1328073), which you can read here:
https://dl.dropboxusercontent.com/u/51939288/nfs4-bad-seq.pdf

NetApp confirmed this was a bug on their side. I'm wondering whether the same bug is still present in FreeBSD?!

PS: Right before sending this, I saw dmesg on the FreeBSD box advising me to increase vfs.nfsd.tcphighwater, so I raised it to 64000. I also raised the number of NFS server threads (-t) from 10 to 60 (we have roughly 40 Linux machines).

Any advice is most appreciated!

Thanks