From owner-freebsd-stable@freebsd.org Fri Aug 5 14:54:53 2016
Subject: Re: zfs recv causes nfs server to throw NFSERR_IO i/o errors
To: Alan Somers
References: <1df33129-0c3e-dfc3-6867-46ab0473ae57@gmx.de>
Cc: FreeBSD
From: Daniel Genis
Message-ID: <1eb419ed-4180-11c2-3bf3-5d3013a07197@gmx.de>
Date: Fri, 5 Aug 2016 16:54:49 +0200
List-Id: Production branch of FreeBSD source code

Thank you, that probably is what the doctor ordered!
My quick testing shows that it's very likely fixed.

Kudos! :-)

On 08/05/2016 04:11 PM, Alan Somers wrote:
> On Fri, Aug 5, 2016 at 7:22 AM, Daniel Genis wrote:
>> Hi everyone,
>>
>> we've been tracing an issue where snapshot replication is causing
>> interruptions for the NFS service.
>>
>> The problem is as follows:
>>
>> Every time a zfs recv finishes, there is a chance for the NFS server to
>> return an NFSERR_IO for a GETATTR call. This shows up as input/output
>> errors on the NFS clients.
>>
>> Here is the tcpdump showing the NFS conversation:
>> https://nopaste.me/view/95d1a79d
>>
>> NFS 202 V3 GETATTR Call (Reply In 6043), FH: 0x8c711a60
>> NFS 98 V3 GETATTR Reply (Call In 6042) Error: NFS3ERR_IO
>> NFS 222 V3 LOOKUP Call (Reply In 6046), DH: 0x6694634f/example.file.txt
>> NFS 102 V3 LOOKUP Reply (Call In 6045) Error: NFS3ERR_ACCES
>>
>> We've been able to verify that there is a _direct_ correlation between
>> the zfs recv command and these NFS errors. For every input/output error
>> we can find a log entry of a replication just finishing (zfs recv exiting).
>>
>> The receiving server is running 10.3-RELEASE.
>>
>> I've read about a VFS/ZFS deadlock issue for which a fix is to be included
>> in FreeBSD 11.0-BETA4.
>>
>> Could our issue be related?
>> Otherwise, does anyone have any suggestions on how to tackle this issue?
>>
>>
>> For the record, say we have two volumes:
>> tank/volumeA and tank/volumeB
>>
>> If there is a zfs recv busy for tank/volumeA then tank/volumeB can get
>> these NFS "io" errors; it does not have to be the same volume.
>>
>>
>> Has anyone else seen/experienced this as well?
>>
>> Any insights are appreciated!
>>
>> With kind regards,
>>
>> Daniel
> Try adding mountd_flags="-S" to /etc/rc.conf.
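
For the archive, the suggestion quoted above would look roughly like the fragment below. This is only a sketch. The thread gives just the variable and the flag; the comment describing -S paraphrases mountd(8), and the restart step is an assumption that the thread does not state. If /etc/rc.conf already sets mountd_flags, it is probably safer to add -S to the existing value than to replace it.

```sh
# /etc/rc.conf (fragment)
# -S: per mountd(8), suspend the nfsd threads while the exports list
# is being reloaded, so clients do not see intermittent access errors.
mountd_flags="-S"

# Assumed follow-up (not stated in the thread): restart mountd so the
# new flag takes effect, e.g.
#   service mountd restart
```

According to the manual page, the trade-off is a short delay in RPC responses while the exports reload is in progress.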