From owner-freebsd-fs@FreeBSD.ORG Sun Jul 7 21:53:41 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2174F18A for ; Sun, 7 Jul 2013 21:53:41 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id D12DD1829 for ; Sun, 7 Jul 2013 21:53:40 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 2229A2F29A; Sun, 7 Jul 2013 21:53:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=3la/j6bSMw3ck74triXRMzkePLk=; b=Z6XRZUSVucse6pj0yVZ4noLquWbg SxUgr13QSWYJ+VpQxACOzy1jZEp4S51qM8q42MO9lY7J/PvPCAjJjmEFE5RThfsG 04gYBEwkMMDhWUSb7WP2fAurubUvQl51iLyCG9DUwj3/jqcWShWXVqdXVaFfdCL5 LsSJ1g/+JXhTi3U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=vQL2OJ 1NnszonaANyyUyG3InLCIr6FSsC5X8tAgFARCwNaDxkKbc/7wVPJjSRpCiJbzt69 FRepb91z/jDZCSBLiX8ukuq4a12OanRAHh77GND/IWjHm2lyvoSswaA5v0MilS/P E+qN0YmdnegCQAnSjmyKThBamovrG9hglG7uc= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 16FE52F298; Sun, 7 Jul 2013 21:53:33 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 2CAEF2F295; Sun, 7 Jul 2013 21:53:32 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id D1CE65C55; Mon, 8 Jul 2013 09:53:24 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 
0A37349FB979; Mon, 8 Jul 2013 09:53:29 +1200 (NZST) Date: Mon, 08 Jul 2013 09:53:28 +1200 Message-ID: <87zjty11gn.wl%berend@pobox.com> From: Berend de Boer To: Markus Gebert Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze? In-Reply-To: <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch> References: <87li5o5tz2.wl%berend@pobox.com> <87ehbg5raq.wl%berend@pobox.com> <20130703055047.GA54853@icarus.home.lan> <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com> <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Mon_Jul__8_09:53:28_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: AA680946-E74F-11E2-A3B9-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Jul 2013 21:53:41 -0000 --pgp-sign-Multipart_Mon_Jul__8_09:53:28_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Markus" == Markus Gebert writes: Markus> But taking a zfs snapshot is an atomic operation. Why not Markus> use that? For example: Markus> 1. snapshot the zfs at the same point in time you'd issue Markus> that ioctl on Linux 2. take the EBS snapshot at any time Markus> 3. clone the EBS snapshot to the new/other VM 4. zfs Markus> import the pool there 5. zfs rollback the filesystem to Markus> the snapshot taken in step 1 (or clone it and use that) OK, various tests later: this does not really work. 
If you create the snapshot and then make a backup, the snapshot does not show up on the backup (whatever the reason; perhaps the disks are so inconsistent that zfs has to roll back).

But the biggest issue is that if writing is going on while you make the EBS snapshot, you can't really mount it. Maybe zfs mounts it after hours, but I gave up when it still hadn't mounted after 1.5 hours.

Another interesting thing I've seen was a completely empty drive after mount!

So clearly EBS snapshots on a mounted multi-drive pool don't work.

-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Mon_Jul__8_09:53:28_2013-1--

From owner-freebsd-fs@FreeBSD.ORG Sun Jul 7 22:54:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9FD47CF8 for ; Sun, 7 Jul 2013 22:54:14 +0000 (UTC) 
(envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 5CB231A91 for ; Sun, 7 Jul 2013 22:54:13 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 7C87D2F6E2 for ; Sun, 7 Jul 2013 22:54:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:subject:mime-version:content-type :content-transfer-encoding; s=sasl; bh=DAQ3ZXlLe3+p+AZsyB3AblmDJ vs=; b=waIkvqt1QjFp+jeeDwPRtJ7jKtksQgdp2zLw634X59b0xFq11Z7HJfGCw i3jX1YuVtu/3Mx/LbZctNz3vgNnqBaH/SQdWZVVtpvHJfqJjAYlpgrmc2tfgJWQk P4nTisezXmdPfIJqUsazzyCxd54DTXRnNEeyPn0D38zVYbNIPU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:subject:mime-version:content-type :content-transfer-encoding; q=dns; s=sasl; b=d4dOyuwDTKNW7yN+LkW xnOD/w8cjKE0dMeC3LUpD2NGAOgDPto0oGC4tXTFr6TYdvlj7EmfKO8N0+pJno2a SJuPHtTIrm8mZlSZHf7/SPPFKtciyCdQA4uJyXERr5pvhz3NEM9wGVK71+xDFmYI cBik2HNSjHXvOZvLtpcwlVX4= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 710842F6E1 for ; Sun, 7 Jul 2013 22:54:12 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 00B3A2F6DE for ; Sun, 7 Jul 2013 22:54:12 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 1A9085C84 for ; Mon, 8 Jul 2013 10:54:05 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 7203B49FB979 for ; Mon, 8 Jul 2013 10:54:09 +1200 (NZST) Date: Mon, 08 Jul 2013 10:54:09 +1200 Message-ID: <87y59i0yni.wl%berend@pobox.com> From: Berend de Boer To: freebsd-fs Subject: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 
(Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Mon_Jul__8_10:54:09_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 23E84B2A-E758-11E2-B4B1-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 07 Jul 2013 22:54:14 -0000 --pgp-sign-Multipart_Mon_Jul__8_10:54:09_2013-1 Content-Type: text/plain; charset=US-ASCII

Hi All,

I've just completed a round of NFS testing on FreeBSD 9.1 on AWS. The underlying file system is ZFS. I have a real NFS killer test: doing an "svn update" of a directory of 3541 files.

Performance on the NFS server itself is good: checking this out takes 11 seconds (doing the same on a not really comparable Linux NFS server takes 43 seconds).

Doing this on a client writing to an NFS-mounted home directory is, however, terrible. Really terrible. This takes 25 minutes! With "sync=disabled" it's 16 minutes.

(Doing this against the underpowered Linux NFS server takes about 4.5 minutes.)

The problem might be that the NFS server (nfsd) runs at 70-80% CPU. So my writing speed is CPU bound (go figure).

Repeating this with nfs4 + udp: doesn't work at all; I get an input/output error after a few files.

nfs3: 1m55s (nfsd cpu is in the 6%-8% range)

Varying tcp/udp, lock/nolock, sync/async, or enabling zfs sync=standard doesn't make any particular difference. All timings were within the 1m49s - 2m3s range.

So what's up with NFS4 on FreeBSD?

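To put the numbers above on one scale, here is a short sketch in POSIX sh. It assumes the 25- and 16-minute runs were NFSv4 over TCP (the subject line and the later UDP test suggest this, but the message does not say so outright). The helper names ratio and ms_per_file are illustrative only.

```shell
#!/bin/sh
# Timings reported in the message, converted to seconds.
NFS4_SYNC=$((25 * 60))      # client run, assumed NFSv4 over TCP: 25 minutes
NFS4_NOSYNC=$((16 * 60))    # same run with ZFS sync disabled: 16 minutes
NFS3=$((1 * 60 + 55))       # NFSv3 client run: 1m55s
ON_SERVER=11                # svn update run directly on the server
FILES=3541                  # number of files in the test directory

# ratio A B: print A/B rounded to one decimal place, integer arithmetic only.
ratio() {
    tenths=$(( ($1 * 10 + $2 / 2) / $2 ))
    echo "$((tenths / 10)).$((tenths % 10))"
}

# ms_per_file SECONDS: average milliseconds per file (truncated).
ms_per_file() {
    echo $(( $1 * 1000 / FILES ))
}
```

For example, ratio "$NFS4_SYNC" "$NFS3" prints 13.0, so the first run took about 13 times as long as the NFSv3 run, and ms_per_file "$NFS4_SYNC" prints 423 against 32 for ms_per_file "$NFS3".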
-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Mon_Jul__8_10:54:09_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 00:19:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 54F55CFA for ; Mon, 8 Jul 2013 00:19:21 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 1BD8C1CF0 for ; Mon, 8 Jul 2013 00:19:20 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=ME3lrcP4jFDzpPiCSQywCMKJiHtpRWeRXBDIYmR1BZg= c=1 sm=2 a=ctSXsGKhotwA:10 a=FKkrIqjQGGEA:10 a=V5z4IuhVU5kA:10 a=IkcTkHD0fZMA:10 a=6I5d2MoRAAAA:8 a=GzJd4s-eAAAA:8 a=jhuFN07rW3Fs5YaUhqQA:9 a=QEXdDO2ut3YA:10 X-IronPort-Anti-Spam-Filtered: true 
X-IronPort-Anti-Spam-Result: AqIEAJQE2lGDaFve/2dsb2JhbABaFoMlSoMIvViBIXSCIwEBBAEjBFIFFhgCAg0ZAlkGiBwGDKdHkDuBJo4RNAeCU4EcA5h8kB+DLSCBbA X-IronPort-AV: E=Sophos;i="4.87,1016,1363147200"; d="scan'208";a="38683382" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 07 Jul 2013 20:19:19 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 7921EB402C; Sun, 7 Jul 2013 20:19:19 -0400 (EDT) Date: Sun, 7 Jul 2013 20:19:19 -0400 (EDT) From: Rick Macklem To: Berend de Boer Message-ID: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> In-Reply-To: <87y59i0yni.wl%berend@pobox.com> Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 00:19:21 -0000 Berend de Boer wrote: > Hi All, > > I've just completed a round of NFS testing on FreeBSD 9.1 on AWS. The > underlying file system is ZFS. I have a real nfs killer test: doing > an "svn update" of a directory of 3541 files. > > Performance on the NFS server itself is good: checking this out takes > 11 seconds (doing the same on a not really comparable Linux NFS > server > is 43 seconds). > > Doing this on a client writing to an NFS mounted home directory is > however terrible. Really terrible. This takes 25 minutes! With > "sync=disable" it's 16 minutes. > > (doing this against the underpowered Linux NFS server is about 4.5 > minutes). > > The problem might be that the NFS server (nfsd) runs at 70-80% CPU. 
> So > my writing speed is cpu bound (go figure). > See below w.r.t. patch that reduces cpu overheads in the DRC. > Repeating this with nfs4 + udp: doesn't work at all, get input/output > error after a few files. > The RFCs for NFSv4 require use of transport protocols that include congestion control --> no UDP support. What client are you using? (If it is a FreeBSD one, I need to patch it to make the mount fail. Since NFSv4 over UDP was done in early testing, the client might still have that bogus code in it.) > nfs3: 1m55s (nfsd cpu is in the 6%-8% range) > > Varying tcp/udp, lock/nolock, sync/async, or enabling zfs > sync=standard doesn't make any particular difference. All timing were > within the 1m49s - 2m3s range. > > So what's up with NFS4 on FreeBSD? > Please try this patch: http://people.freebsd.org/rmacklem/~drc4.patch - After you apply the patch and boot the rebuilt kernel, the cpu overheads should be reduced after you increase the value of vfs.nfsd.tcphighwater. The larger you make it, the more space (mbuf clusters and other kernel malloc'd data structures) the DRC uses, but hopefully with reduced CPU overheads. The plan is to commit a patch semantically similar to this to head and then MFC it someday. ivoras@ has a similar patch, but written in cleaner C. However, I've never gotten around to combining the patches into a version for head. Someday I'll get to it, but not in time for 9.2. Good luck with it, rick ps: There is also ken@'s file handle affinity patch, which is in head (and I think stable/9), but it only works for NFSv3 at this point. Hopefully we'll come up with a patch for NFSv4 for it someday. 
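Once a kernel with the patch is running, raising the limit mentioned above is an ordinary sysctl change. A small sketch follows. The value 100000 is an arbitrary placeholder, not a recommendation from this thread, and drc_sysctl_line is a hypothetical helper that only builds and validates the setting.

```shell
#!/bin/sh
# drc_sysctl_line VALUE: print the "name=value" setting for the DRC limit
# named in the message above, or fail if VALUE is not a plain integer.
drc_sysctl_line() {
    case "$1" in
        ''|*[!0-9]*)
            echo "tcphighwater must be a non-negative integer" >&2
            return 1
            ;;
    esac
    echo "vfs.nfsd.tcphighwater=$1"
}

# Example (run as root on the NFS server; 100000 is only a placeholder):
#   sysctl vfs.nfsd.tcphighwater                  # show the current value
#   sysctl "$(drc_sysctl_line 100000)"            # apply the new value now
#   drc_sysctl_line 100000 >> /etc/sysctl.conf    # keep it across reboots
```

Since a larger value trades kernel memory (mbuf clusters and malloc'd structures) for lower CPU overhead, it is probably worth raising it in steps and watching nfsd CPU use between runs.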
> -- > All the best, > > Berend de Boer > > > ------------------------------------------------------ > Awesome Drupal hosting: https://www.xplainhosting.com/ > From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 00:31:40 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7BBA3F53 for ; Mon, 8 Jul 2013 00:31:40 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 42A7C1D5D for ; Mon, 8 Jul 2013 00:31:39 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=ME3lrcP4jFDzpPiCSQywCMKJiHtpRWeRXBDIYmR1BZg= c=1 sm=2 a=4x594vOIrDwA:10 a=FKkrIqjQGGEA:10 a=V5z4IuhVU5kA:10 a=IkcTkHD0fZMA:10 a=6I5d2MoRAAAA:8 a=GzJd4s-eAAAA:8 a=v161nJs2HSYVeKKg6NEA:9 a=QEXdDO2ut3YA:10 a=SV7veod9ZcQA:10 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqQEAOIH2lGDaFve/2dsb2JhbABaFoMlSoMIvViBIXSCIwEBAQEDAQEBIAQnIAsbGAICDRkCKQEJJgYIBwQBHASHbgynSJA7gSaNE340B4JTgRwDlQ6DbpAfgy0gMoEDNw X-IronPort-AV: E=Sophos;i="4.87,1016,1363147200"; d="scan'208";a="38684045" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 07 Jul 2013 20:31:16 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id B19C3B4055; Sun, 7 Jul 2013 20:31:16 -0400 (EDT) Date: Sun, 7 Jul 2013 20:31:16 -0400 (EDT) From: Rick Macklem To: Berend de Boer Message-ID: <50381589.2918947.1373243476714.JavaMail.root@uoguelph.ca> In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 
(Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 00:31:40 -0000 I wrote: > Berend de Boer wrote: > > Hi All, > > > > I've just completed a round of NFS testing on FreeBSD 9.1 on AWS. > > The > > underlying file system is ZFS. I have a real nfs killer test: > > doing > > an "svn update" of a directory of 3541 files. > > > > Performance on the NFS server itself is good: checking this out > > takes > > 11 seconds (doing the same on a not really comparable Linux NFS > > server > > is 43 seconds). > > > > Doing this on a client writing to an NFS mounted home directory is > > however terrible. Really terrible. This takes 25 minutes! With > > "sync=disable" it's 16 minutes. > > > > (doing this against the underpowered Linux NFS server is about 4.5 > > minutes). > > > > The problem might be that the NFS server (nfsd) runs at 70-80% CPU. > > So > > my writing speed is cpu bound (go figure). > > > See below w.r.t. patch that reduces cpu overheads in the DRC. > > > Repeating this with nfs4 + udp: doesn't work at all, get > > input/output > > error after a few files. > > > The RFCs for NFSv4 require use of transport protocols that include > congestion control --> no UDP support. What client are you using? > (If it is a FreeBSD one, I need to patch it to make the mount fail. > Since NFSv4 over UDP was done in early testing, the client might > still have that bogus code in it.) > > > nfs3: 1m55s (nfsd cpu is in the 6%-8% range) > > > > Varying tcp/udp, lock/nolock, sync/async, or enabling zfs > > sync=standard doesn't make any particular difference. All timing > > were > > within the 1m49s - 2m3s range. > > > > So what's up with NFS4 on FreeBSD? 
> > > Please try this patch: > http://people.freebsd.org/rmacklem/~drc4.patch Oops, it actually is: http://people.freebsd.org/~rmacklem/drc4.patch > - After you apply the patch and boot the rebuilt kernel, the cpu > overheads should be reduced after you increase the value of > vfs.nfsd.tcphighwater. The larger you make it, the more space > (mbuf clusters and other kernel malloc'd data structures) the DRC > uses, but hopefully with reduced CPU overheads. > The plan is to commit a patch semantically similar to this to head > and > then MFC it someday. ivoras@ has a similar patch, but written in > cleaner > C. However, I've never gotten around to combining the patches into a > version for head. Someday I'll get to it, but not in time for 9.2. > > Good luck with it, rick > ps: There is also ken@'s file handle affinity patch, which is in head > (and I think stable/9), but it only works for NFSv3 at this > point. > Hopefully we'll come up with a patch for NFSv4 for it someday. > > > -- > > All the best, > > > > Berend de Boer > > > > > > ------------------------------------------------------ > > Awesome Drupal hosting: https://www.xplainhosting.com/ > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 01:19:20 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7C450A80 for ; Mon, 8 Jul 2013 01:19:20 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 3AD6C1ED9 for ; Mon, 8 Jul 2013 01:19:19 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 3ACDC1DA06; Mon, 8 Jul 
2013 01:19:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=f3vft/iOKQ59MpwieDnrogl95xs=; b=iI1iZIO+H3e9GgGh1q0/U87ELuRN 6bs8TPUaAeB3zk1sF7VeFMbbC9v2lMDi6TD2ET/+FUxzi508Bvm53gTecx8147r3 vf3Je6E04goLrLQITNQc3OH/J/HU1Er6BHs2+Fa9MOVxrQmEGcxiSI+UbzdrOXhC bMyYYs1lDq/oyA0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=Jh6tW2 ogvpU9Tijixm9RpliChrxMJDF1dSNSJfSCboU+AsjayTAlSTgMUI/SnYIlLgEdHj 6KujyQLGM9bUfNVa/2Kj6ZOzJ4oBmhk6G9s4m8MlHa4V0fvbNJgZE/LBnd94eg2G Y7gXAw1qd8x9IQA+kKEJhSBbyjL9K1e3VrWC0= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 306BF1DA05; Mon, 8 Jul 2013 01:19:17 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 9F0921DA04; Mon, 8 Jul 2013 01:19:16 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id C24915C55; Mon, 8 Jul 2013 13:19:09 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 2137449FB979; Mon, 8 Jul 2013 13:19:14 +1200 (NZST) Date: Mon, 08 Jul 2013 13:19:14 +1200 Message-ID: <87sizp26i5.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) 
Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Mon_Jul__8_13:19:13_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 6848B016-E76C-11E2-9AF6-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 01:19:20 -0000 --pgp-sign-Multipart_Mon_Jul__8_13:19:13_2013-1 Content-Type: text/plain; charset=US-ASCII

>>>>> "Rick" == Rick Macklem writes:

    >> Repeating this with nfs4 + udp: doesn't work at all, get
    >> input/output error after a few files.
    >>
    Rick> The RFCs for NFSv4 require use of transport protocols that
    Rick> include congestion control --> no UDP support. What client
    Rick> are you using?

Happened to be Ubuntu 10.04 LTS.

    >>
    Rick> Please try this patch:
    Rick> http://people.freebsd.org/rmacklem/~drc4.patch - After you
    Rick> apply the patch and boot the rebuilt kernel,

Ah, the kernel. The problem is I'm on Amazon, and I doubt I can rebuild the kernel that easily. The patches to make that work only recently landed. I'll need to ask Colin Percival how I can safely do that.

-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Mon_Jul__8_13:19:13_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 05:56:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 04D47762 for ; Mon, 8 Jul 2013 05:56:23 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id C55DF198A for ; Mon, 8 Jul 2013 05:56:22 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 42F2B29452; Mon, 8 Jul 2013 05:56:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; 
bh=M3enjgX5anq7WbQns2e0zWumv6Y=; b=GqTnuvxDR3fE1ljp0/oU5RLSsTcc 8vwMhHZ2UII3FueEibNsgEVJ51Rmg4vxUkdhMcx9olj+S0/Q1llbCOVusp34RLN8 lPSjTO7V5SvZ0thqZVPH5J9T4lmTei8ZV4Rnm+Dtv0KczIJOfYeCXwfddWsyeFk1 53+HjFZVcmt76ws= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=IgE+j1 jS09+aE5637cuU4Y/H5X08OHAz1VmaSUB7MopOUgamiV1CVNwW396eXnIEa9zCxO IK4Doj/E+5oe9XovMYWXvghgatNAY5waiAZvSkbvO4W1/5yiONGfF2a70tbn8aRH CWe6DjnVwkV47K51jrhE487koeDMYdRgxfmjQ= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 3925329451; Mon, 8 Jul 2013 05:56:20 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 32E2829450; Mon, 8 Jul 2013 05:56:19 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 37CD75C55; Mon, 8 Jul 2013 17:56:12 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 6B98749FB979; Mon, 8 Jul 2013 17:56:16 +1200 (NZST) Date: Mon, 08 Jul 2013 17:56:16 +1200 Message-ID: <87d2qt1tof.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Mon_Jul__8_17:56:16_2013-1"; micalg=pgp-sha256; 
protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 1C197226-E793-11E2-BBB8-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 05:56:23 -0000 --pgp-sign-Multipart_Mon_Jul__8_17:56:16_2013-1 Content-Type: text/plain; charset=US-ASCII

>>>>> "Rick" == Rick Macklem writes:

    Rick> Please try this patch:

Hi Rick,

Could you please reroll the patch against 9.1-RELEASE? Not sure what version of FreeBSD you made this for. I have 9.1-RELEASE and get three failures:

# patch --check -p0 < ~/drc4.patch
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- fs/nfsserver/nfs_nfsdcache.c.orig 2013-01-07 09:04:13.000000000 -0500
|+++ fs/nfsserver/nfs_nfsdcache.c 2013-03-12 22:42:05.000000000 -0400
--------------------------
Patching file fs/nfsserver/nfs_nfsdcache.c using Plan A...
Hunk #1 succeeded at 160.
Hunk #2 succeeded at 216.
Hunk #3 succeeded at 271.
Hunk #4 succeeded at 357.
Hunk #5 succeeded at 370.
Hunk #6 succeeded at 381.
Hunk #7 succeeded at 396 with fuzz 2.
Hunk #8 succeeded at 426.
Hunk #9 succeeded at 444.
Hunk #10 succeeded at 468.
Hunk #11 failed at 476.
Hunk #12 failed at 501.
Hunk #13 succeeded at 523.
Hunk #14 succeeded at 531.
Hunk #15 succeeded at 547.
Hunk #16 succeeded at 568.
Hunk #17 succeeded at 579.
Hunk #18 succeeded at 601.
Hunk #19 succeeded at 665.
Hunk #20 succeeded at 674.
Hunk #21 failed at 683.
Hunk #22 succeeded at 718.
Hunk #23 succeeded at 729.
Hunk #24 succeeded at 750.
Hunk #25 succeeded at 779.
Hunk #26 succeeded at 788.
Hunk #27 succeeded at 803.
Hunk #28 succeeded at 828.
Hunk #29 succeeded at 927.
Hunk #30 succeeded at 943.
3 out of 30 hunks failed--saving rejects to fs/nfsserver/nfs_nfsdcache.c.rej
Hmm... The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- fs/nfsserver/nfs_nfsdport.c.orig 2013-03-02 18:19:34.000000000 -0500
|+++ fs/nfsserver/nfs_nfsdport.c 2013-03-12 17:51:31.000000000 -0400
--------------------------
Patching file fs/nfsserver/nfs_nfsdport.c using Plan A...
Hunk #1 succeeded at 59 with fuzz 1 (offset -2 lines).
Hunk #2 succeeded at 3284 (offset -22 lines).
Hunk #3 succeeded at 3351 with fuzz 1 (offset -5 lines).
Hmm... The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- fs/nfs/nfsport.h.orig 2013-03-02 18:35:13.000000000 -0500
|+++ fs/nfs/nfsport.h 2013-03-12 17:51:31.000000000 -0400
--------------------------
Patching file fs/nfs/nfsport.h using Plan A...
Hunk #1 succeeded at 547 (offset -62 lines).
Hmm... The next patch looks like a unified diff to me...
The text leading up to this was:
--------------------------
|--- fs/nfs/nfsrvcache.h.orig 2013-01-07 09:04:15.000000000 -0500
|+++ fs/nfs/nfsrvcache.h 2013-03-12 18:02:42.000000000 -0400
--------------------------
Patching file fs/nfs/nfsrvcache.h using Plan A...
Hunk #1 succeeded at 41.
done

I just want to make sure the patch applies cleanly, so that if it doesn't help, it's not because of a mistake I made while getting it to apply.

-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Mon_Jul__8_17:56:16_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 11:35:38 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A8C0CCC4 for ; Mon, 8 Jul 2013 11:35:38 +0000 (UTC) (envelope-from markus.gebert@hostpoint.ch) Received: from mail.adm.hostpoint.ch (mail.adm.hostpoint.ch [IPv6:2a00:d70:0:a::e0]) by mx1.freebsd.org (Postfix) with ESMTP id 6AF0710FB for ; Mon, 8 Jul 2013 11:35:38 +0000 (UTC) Received: from [2001:1620:2013:1:8863:9957:fb76:e10e] (port=50969) by mail.adm.hostpoint.ch with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1Uw93Q-000Acw-Ll; Mon, 08 Jul 2013 12:52:24 +0200 Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\)) 
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
From: Markus Gebert
In-Reply-To: <87zjty11gn.wl%berend@pobox.com>
Date: Mon, 8 Jul 2013 12:51:42 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch>
References: <87li5o5tz2.wl%berend@pobox.com> <87ehbg5raq.wl%berend@pobox.com> <20130703055047.GA54853@icarus.home.lan> <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com> <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch> <87zjty11gn.wl%berend@pobox.com>
To: Berend de Boer
X-Mailer: Apple Mail (2.1508)
Cc: freebsd-fs
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 08 Jul 2013 11:35:38 -0000

On 07.07.2013, at 23:53, Berend de Boer wrote:

>>>>>> "Markus" == Markus Gebert writes:
>
> Markus> But taking a zfs snapshot is an atomic operation. Why not
> Markus> use that? For example:
>
> Markus> 1. snapshot the zfs at the same point in time you'd issue
> Markus> that ioctl on Linux 2. take the EBS snapshot at any time
> Markus> 3. clone the EBS snapshot to the new/other VM 4. zfs
> Markus> import the pool there 5. zfs rollback the filesystem to
> Markus> the snapshot taken in step 1 (or clone it and use that)
>
> OK, various tests later: this does not really work. If you create the
> snapshot, and make a backup, the snapshot does not show up on the
> backup (whatever the reason, perhaps the disks so inconsistent zfs has
> to rollback).

I was under the impression that metadata operations are always synchronous in zfs, so since 'zfs snapshot' is such an operation it should be on disk as soon as the command completes. But I've never actually confirmed this.
So, it could be that zfs snapshots don't get committed to disk immediately after all, or EBS confirmed write/flush commands that were not committed to whatever is considered stable storage in that cloud. I don't know how well-behaved EBS and its snapshots are when it comes to flush commands, write order etc. So all just speculation, stopping here...

> But the biggest issue is that if writing is going on and you make the
> EBS snapshot, you can't really mount it. Maybe zfs mounts after hours,
> but I just gave if it didn't mount it after 1.5 hours.

By 'mount' do you mean the import of the pool? Did you use -F on import? In any case, this sounds too long. Was the system doing IO?

> Another interesting thing I've seen was a completely empty drive after
> mount!

That's a bit unspecific. What's empty? Disk full of zeros? Partition full of zeros? Pool without file system on it? Pool with empty filesystems?

> So clearly EBS snapshots on a mounted multi-drive pool don't work.

I think with a lot of writes and transactions, you can't reliably avoid a scenario where zfs can't find a valid or good enough mutual transaction across all disks. By avoiding writes while doing the EBS snapshots, you are more likely to end up with something you can actually import. If you can't avoid the writes, you're out of luck. What you really need in that case is the ability to snapshot all EBS disks as a group.

Anyway, with the tools at hand (no IO "freeze" like Linux, no EBS snapshots for groups of disks), I don't think you can accomplish what you originally intended.
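
For reference, the sequence discussed in this thread can be written down as an ordered command list. This is only a sketch under stated assumptions: the helper name snapshot_plan, the pool, dataset and snapshot names and the volume IDs are made up for illustration; the `aws ec2 create-snapshot` step stands in for whatever EBS tooling is in use (it is not named in the thread); and creating and attaching volumes from the EBS snapshots on the other VM is left out. The code only builds the commands, it does not run them, and as reported above the approach is not reliable while the pool is being written to.

```python
import shlex


def snapshot_plan(pool, dataset, snap, volume_ids):
    """Return the commands of the snapshot-then-rollback sequence, in order.

    All arguments are illustrative placeholders; nothing is executed.
    """
    target = f"{pool}/{dataset}@{snap}"
    plan = [["zfs", "snapshot", target]]          # step 1: atomic ZFS snapshot
    for vol in volume_ids:                        # step 2: one EBS snapshot per disk
        # Assumption: AWS CLI. The disks are NOT snapshotted as a group.
        plan.append(["aws", "ec2", "create-snapshot", "--volume-id", vol])
    # Steps 3-5 run on the other VM, once volumes created from the EBS
    # snapshots are attached there (that part is omitted in this sketch).
    plan.append(["zpool", "import", "-F", pool])  # -F: recovery-mode import
    plan.append(["zfs", "rollback", target])      # back to the step-1 snapshot
    return [shlex.join(cmd) for cmd in plan]


if __name__ == "__main__":
    for line in snapshot_plan("tank", "data", "pre-ebs", ["vol-aaaa", "vol-bbbb"]):
        print(line)
```

Because each disk gets its own `create-snapshot` call, the per-disk snapshots are taken at slightly different times, which is exactly the gap that a group snapshot or an IO freeze would close.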
Markus From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 11:41:12 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 6D363593 for ; Mon, 8 Jul 2013 11:41:12 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 5FEFF15DA for ; Mon, 8 Jul 2013 11:41:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r68BeXgq053981 for ; Mon, 8 Jul 2013 11:41:12 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r68B6Z7b046101 for freebsd-fs@FreeBSD.org; Mon, 8 Jul 2013 11:06:35 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 8 Jul 2013 11:06:35 GMT Message-Id: <201307081106.r68B6Z7b046101@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 11:41:12 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- p kern/180236 fs [zfs] [nullfs] Leakage free space using ZFS with nullf o kern/178854 fs [ufs] FreeBSD kernel crash in UFS o kern/178713 fs [nfs] [patch] Correct WebNFS support in NFS server and o kern/178412 fs [smbfs] Coredump when smbfs mounted o kern/178388 fs [zfs] [patch] allow up to 8MB recordsize o kern/178349 fs [zfs] zfs scrub on deduped data could be much less see o kern/178329 fs [zfs] extended attributes leak o kern/178238 fs [nullfs] nullfs don't release i-nodes on unlink. f kern/178231 fs [nfs] 8.3 nfsv4 client reports "nfsv4 client/server pr o kern/178103 fs [kernel] [nfs] [patch] Correct support of index files o kern/177985 fs [zfs] disk usage problem when copying from one zfs dat o kern/177971 fs [nfs] FreeBSD 9.1 nfs client dirlist problem w/ nfsv3, o kern/177966 fs [zfs] resilver completes but subsequent scrub reports o kern/177658 fs [ufs] FreeBSD panics after get full filesystem with uf o kern/177536 fs [zfs] zfs livelock (deadlock) with high write-to-disk o kern/177445 fs [hast] HAST panic o kern/177240 fs [zfs] zpool import failed with state UNAVAIL but all d o kern/176978 fs [zfs] [panic] zfs send -D causes "panic: System call i o kern/176857 fs [softupdates] [panic] 9.1-RELEASE/amd64/GENERIC panic o bin/176253 fs zpool(8): zfs pool indentation is misleading/wrong o kern/176141 fs [zfs] sharesmb=on makes errors for sharenfs, and still o kern/175950 fs [zfs] Possible deadlock in zfs after long uptime o kern/175897 fs [zfs] operations on readonly zpool hang o kern/175449 fs [unionfs] unionfs and devfs misbehaviour o kern/175179 fs [zfs] ZFS may attach wrong device on move o kern/175071 fs [ufs] [panic] softdep_deallocate_dependencies: unrecov o kern/174372 fs [zfs] Pagefault appears to be related to ZFS o kern/174315 fs [zfs] chflags uchg not supported o kern/174310 fs [zfs] root point mounting broken on CURRENT with multi o kern/174279 fs [ufs] 
UFS2-SU+J journal and filesystem corruption o kern/173830 fs [zfs] Brain-dead simple change to ZFS error descriptio o kern/173718 fs [zfs] phantom directory in zraid2 pool f kern/173657 fs [nfs] strange UID map with nfsuserd o kern/173363 fs [zfs] [panic] Panic on 'zpool replace' on readonly poo o kern/173136 fs [unionfs] mounting above the NFS read-only share panic o kern/172942 fs [smbfs] Unmounting a smb mount when the server became o kern/172348 fs [unionfs] umount -f of filesystem in use with readonly o kern/172334 fs [unionfs] unionfs permits recursive union mounts; caus o kern/171626 fs [tmpfs] tmpfs should be noisier when the requested siz o kern/171415 fs [zfs] zfs recv fails with "cannot receive incremental o kern/170945 fs [gpt] disk layout not portable between direct connect o bin/170778 fs [zfs] [panic] FreeBSD panics randomly o kern/170680 fs [nfs] Multiple NFS Client bug in the FreeBSD 7.4-RELEA o kern/170497 fs [xfs][panic] kernel will panic whenever I ls a mounted o kern/169945 fs [zfs] [panic] Kernel panic while importing zpool (afte o kern/169480 fs [zfs] ZFS stalls on heavy I/O o kern/169398 fs [zfs] Can't remove file with permanent error o kern/169339 fs panic while " : > /etc/123" o kern/169319 fs [zfs] zfs resilver can't complete o kern/168947 fs [nfs] [zfs] .zfs/snapshot directory is messed up when o kern/168942 fs [nfs] [hang] nfsd hangs after being restarted (not -HU o kern/168158 fs [zfs] incorrect parsing of sharenfs options in zfs (fs o kern/167979 fs [ufs] DIOCGDINFO ioctl does not work on 8.2 file syste o kern/167977 fs [smbfs] mount_smbfs results are differ when utf-8 or U o kern/167688 fs [fusefs] Incorrect signal handling with direct_io o kern/167685 fs [zfs] ZFS on USB drive prevents shutdown / reboot o kern/167612 fs [portalfs] The portal file system gets stuck inside po o kern/167272 fs [zfs] ZFS Disks reordering causes ZFS to pick the wron o kern/167260 fs [msdosfs] msdosfs disk was mounted the second time whe o kern/167109 fs 
[zfs] [panic] zfs diff kernel panic Fatal trap 9: gene o kern/167105 fs [nfs] mount_nfs can not handle source exports wiht mor o kern/167067 fs [zfs] [panic] ZFS panics the server o kern/167065 fs [zfs] boot fails when a spare is the boot disk o kern/167048 fs [nfs] [patch] RELEASE-9 crash when using ZFS+NULLFS+NF o kern/166912 fs [ufs] [panic] Panic after converting Softupdates to jo o kern/166851 fs [zfs] [hang] Copying directory from the mounted UFS di o kern/166477 fs [nfs] NFS data corruption. o kern/165950 fs [ffs] SU+J and fsck problem o kern/165521 fs [zfs] [hang] livelock on 1 Gig of RAM with zfs when 31 o kern/165392 fs Multiple mkdir/rmdir fails with errno 31 o kern/165087 fs [unionfs] lock violation in unionfs o kern/164472 fs [ufs] fsck -B panics on particular data inconsistency o kern/164370 fs [zfs] zfs destroy for snapshot fails on i386 and sparc o kern/164261 fs [nullfs] [patch] fix panic with NFS served from NULLFS o kern/164256 fs [zfs] device entry for volume is not created after zfs o kern/164184 fs [ufs] [panic] Kernel panic with ufs_makeinode o kern/163801 fs [md] [request] allow mfsBSD legacy installed in 'swap' o kern/163770 fs [zfs] [hang] LOR between zfs&syncer + vnlru leading to o kern/163501 fs [nfs] NFS exporting a dir and a subdir in that dir to o kern/162944 fs [coda] Coda file system module looks broken in 9.0 o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph o kern/162751 fs [zfs] [panic] kernel panics during file operations o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo o kern/161864 fs [ufs] removing journaling from UFS partition fails on o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin o kern/161438 fs [zfs] [panic] recursed on non-recursive 
spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic f kern/160860 fs [ufs] Random UFS root filesystem corruption with SU+J o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159930 fs [ufs] [panic] kernel core o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159077 fs [zfs] Can't cd .. with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs amd(8) ICMP storm and unkillable process. 
o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs p kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o bin/153142 fs [zfs] ls -l outputs `ls: ./.zfs: Operation not support o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o 
kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server o kern/145750 fs [unionfs] [hang] unionfs 
locks the machine s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141950 fs [unionfs] [lor] ufs/unionfs/ufs Lock order reversal o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/137588 fs [unionfs] [lor] LOR nfs/ufs/nfs o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis p kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126973 fs [unionfs] [hang] System hang with unionfs and init chr o kern/126553 fs [unionfs] unionfs move directory problem 2 (files appe o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files o bin/123574 fs [unionfs] df(1) -t option destroys info for unionfs (a o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] 
pwd(1)/getcwd(2) fails with Permission denied o kern/121385 fs [unionfs] unionfs cross mount -> kernel panic o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118318 fs [nfs] NFS server hangs under special circumstances o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o 
bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/67326 fs [msdosfs] crash after attempt to mount write protected o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t o kern/9619 fs [nfs] Restarting mountd kills existing mounts 326 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 12:41:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1516EDFA; Mon, 8 Jul 2013 12:41:01 +0000 (UTC) (envelope-from artem.naluzhnyy@gmail.com) Received: from mail-wg0-x230.google.com (mail-wg0-x230.google.com [IPv6:2a00:1450:400c:c00::230]) by mx1.freebsd.org (Postfix) with ESMTP id 80E3F1E7B; Mon, 8 Jul 2013 12:41:00 +0000 (UTC) Received: by mail-wg0-f48.google.com with SMTP id f11so3647706wgh.15 for ; Mon, 08 Jul 2013 05:40:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=3C6s6nuonag5V+OfmlZYnUOWv/ddTgytAS0EZmFz7Ng=; b=xI2Frg0EwzlW+AwT2r58d1GgN1/B6d9TogCY59nNJaYDz0fpr6KL4ORmYCDICpRyiy 3G4t1TqdCFONwbjS4ZgJ2eXYkoUJ+e3bmcMeehntHhX0W9NFcSVUzM+x2WYhQgRLp9Or Rm3NL3D/vJvttSUtQyo2Wy49g3NtM80jVa5dWhBVB1+ASYrCK01kN+74vfzw2S3EbOSp +ikoNV1EMQxHPvcpPQUmh95i+2pOKIhFulmRrRfbPA/ElZNj9HMvNn62VOSzlfYUmF7c Jx+ao0LtWTJvK+kiHL6TG6pUeq/5X4oFASuqgZyefzcE9sz4qqTFDIV0O9gWOZb7qs61 8LRA== X-Received: by 10.180.211.171 with SMTP id nd11mr11512989wic.17.1373287259569; Mon, 08 Jul 2013 05:40:59 -0700 (PDT) MIME-Version: 1.0 Received: by 10.217.123.138 with HTTP; Mon, 8 Jul 2013 05:40:19 -0700 (PDT) From: Artem Naluzhnyy Date: Mon, 8 Jul 2013 15:40:19 +0300 Message-ID: Subject: RAID10 stripe size and PostgreSQL performance To: freebsd-database@freebsd.org, freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 12:41:01 -0000 Hi, I'm benchmarking PostgreSQL using different RAID10 stripe size values for a new server. 
Tried bonnie++ and pgbench on two stripe size configurations:

* 32 KB (a half of current UFS bsize) - 254 pgbench tps
* 1 MB (max supported by the RAID controller) - 626 pgbench tps

See OS/hardware configuration, benchmark methodology and raw results here - http://pastebin.com/F8uZEZdm

Is this expected behavior with more than twice higher pgbench tps on 1MB stripe size? Are there any RAID stripe size recommendations for better PostgreSQL performance?

(I can not change the FS type, standard PG block size etc. - they are locked by vendor in this commercial FreeBSD distribution)

--
Artem Naluzhnyy

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 15:17:10 2013
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8623AB86 for ; Mon, 8 Jul 2013 15:17:10 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org)
Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) by mx1.freebsd.org (Postfix) with ESMTP id 44CD31838 for ; Mon, 8 Jul 2013 15:17:09 +0000 (UTC)
Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1UwDBW-000125-Dt for freebsd-fs@freebsd.org; Mon, 08 Jul 2013 17:17:02 +0200
Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Jul 2013 17:17:02 +0200
Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 08 Jul 2013 17:17:02 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-fs@freebsd.org
From: Ivan Voras
Subject: Re: RAID10 stripe size and PostgreSQL performance
Date: Mon, 08 Jul 2013 17:16:45 +0200
Lines: 50
Message-ID:
References:
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2UBCEUFKHBCWAHTMQXKDM"
X-Complaints-To: usenet@ger.gmane.org
X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64;
rv:17.0) Gecko/20130322 Thunderbird/17.0.4
In-Reply-To:
X-Enigmail-Version: 1.5.1
Cc: freebsd-database@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 08 Jul 2013 15:17:10 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
------enig2UBCEUFKHBCWAHTMQXKDM
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 08/07/2013 14:40, Artem Naluzhnyy wrote:
> Hi,
>
> I'm benchmarking PostgreSQL using different RAID10 stripe size values
> for a new server. Tried bonnie++ and pgbench on two stripe size
> configurations:
>
> * 32 KB (a half of current UFS bsize) - 254 pgbench tps
> * 1 MB (max supported by the RAID controller) - 626 pgbench tps
>
> See OS/hardware configuration, benchmark methodology and raw results
> here - http://pastebin.com/F8uZEZdm
>
> Is this expected behavior with more than twice higher pgbench tps on
> 1MB stripe size?

No, it is not.

For a start, can you please repeat your benchmarks but with restarting the PostgreSQL server between each pgbench run?

Also, you should make sure that the database is located at the same location on the disk platters in both configurations, e.g. by creating a small partition which is about 150% larger than your pgbench database (and your pgbench database should be at least 2x larger than your RAM, if you are going to benchmark IO and not memory caches), and which is located at the same position (byte offset) in your RAID10 volume.
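
A rough way to turn these sizing rules into numbers is sketched below. The function name pgbench_sizing is made up for illustration, and the figure of about 15 MB of data per pgbench scale unit is a rule of thumb assumed here, not something stated in this thread, so it should be measured on the target system. "About 150% larger" is read literally as 2.5x the database size; 1.5x is an equally plausible reading, which is why the factor is a parameter.

```python
import math

# Assumption (not from the thread): roughly 15 MB of data per pgbench scale unit.
MB_PER_SCALE_UNIT = 15


def pgbench_sizing(ram_gb, db_to_ram=2.0, partition_factor=2.5):
    """Estimate a pgbench scale factor and a test partition size.

    ram_gb           -- installed RAM in GB
    db_to_ram        -- database size as a multiple of RAM (at least 2x)
    partition_factor -- partition size as a multiple of the database size
    """
    db_mb = math.ceil(ram_gb * 1024 * db_to_ram)
    return {
        "db_mb": db_mb,
        "scale": math.ceil(db_mb / MB_PER_SCALE_UNIT),
        "partition_mb": math.ceil(db_mb * partition_factor),
    }


if __name__ == "__main__":
    print(pgbench_sizing(16))
```

The resulting scale value would be the argument to `pgbench -i -s` when initialising the test database, and the same partition offset and size would be used for both stripe-size runs.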
------enig2UBCEUFKHBCWAHTMQXKDM Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlHa194ACgkQ/QjVBj3/HSzcsACgn/Z2oEbEzOYTUc+jxH8ZY90s y3AAoJel6Gebm0Wi0ScGFpMXrdo9PKv1 =Q4jH -----END PGP SIGNATURE----- ------enig2UBCEUFKHBCWAHTMQXKDM-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 17:04:10 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id BD8AC402; Mon, 8 Jul 2013 17:04:10 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 963C71CE8; Mon, 8 Jul 2013 17:04:10 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r68H4A5Z017715; Mon, 8 Jul 2013 17:04:10 GMT (envelope-from avg@freefall.freebsd.org) Received: (from avg@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r68H4A3O017712; Mon, 8 Jul 2013 17:04:10 GMT (envelope-from avg) Date: Mon, 8 Jul 2013 17:04:10 GMT Message-Id: <201307081704.r68H4A3O017712@freefall.freebsd.org> To: mloftis@wgops.com, avg@FreeBSD.org, freebsd-fs@FreeBSD.org From: avg@FreeBSD.org Subject: Re: kern/141305: [zfs] FreeBSD ZFS+sendfile severe performance issues (no cache) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 17:04:10 -0000 Synopsis: [zfs] FreeBSD ZFS+sendfile severe performance issues (no cache) State-Changed-From-To: open->closed State-Changed-By: avg 
State-Changed-When: Mon Jul 8 17:04:10 UTC 2013 State-Changed-Why: This was fixed long ago. http://www.freebsd.org/cgi/query-pr.cgi?pr=141305 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 18:39:04 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F2F6D7F7; Mon, 8 Jul 2013 18:39:03 +0000 (UTC) (envelope-from bofh@terranova.net) Received: from tog.net (tog.net [216.89.226.5]) by mx1.freebsd.org (Postfix) with ESMTP id D1A0C11A0; Mon, 8 Jul 2013 18:39:03 +0000 (UTC) Received: from [IPv6:2605:5a00:ffff::face] (unknown [IPv6:2605:5a00:ffff::face]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by tog.net (Postfix) with ESMTPSA id 3bpwRP3JKbz6NN; Mon, 8 Jul 2013 14:38:57 -0400 (EDT) Message-ID: <51DB0730.1010200@terranova.net> Date: Mon, 08 Jul 2013 14:38:40 -0400 From: Travis Mikalson Organization: TerraNovaNet Internet Services User-Agent: Thunderbird 2.0.0.24 (Windows/20100228) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Report: ZFS deadlock in 9-STABLE References: <51D45401.5050801@terranova.net> <51D47A5F.3030501@delphij.net> In-Reply-To: <51D47A5F.3030501@delphij.net> X-Enigmail-Version: 0.96.0 OpenPGP: url=http://www.terranova.net/pgp/bofh Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: kib@freebsd.org, d@delphij.net X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 18:39:04 -0000 Xin Li wrote: > Hi, > > Sorry for the top posting but I am quite convinced that this is a > known issue that we have seen with our customer. Please try applying > this patch [1] and please report back if that fixes your problem. 
> > Note that if you would like to provide more help, we would appreciate > that you test Konstantin's patch as well, at: > > http://lists.freebsd.org/pipermail/freebsd-hackers/2013-May/042876.html > > [1] See attachment; the commit is > https://github.com/trueos/trueos/commit/f678ae7c7f72fba577b00e3d0c237c4f297575c6 As of ~12 hours ago I am running that system with your patch (which seems to have incorporated Konstantin's patch already) and if it was effective I will not have anything to report back for at least 20 days. No news will be good news in this case. If I make it to 20+ days without a livelock, that will be a pretty good indication that this helped if not fixed it and I will report back at that time. Thus far this year I've not gone longer than 15 days without a livelock. Thank you again! From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 20:06:33 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7D086FF0 for ; Mon, 8 Jul 2013 20:06:33 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 44AB016FC for ; Mon, 8 Jul 2013 20:06:32 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 3834E2FFB3; Mon, 8 Jul 2013 20:06:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=pC03HdW5XSxRW3TjPPhnOWTymdk=; b=sUiETYgGroc+jOX9+8Ehn8pJVtOT 7YNgCOhGuiOC5Y/omOrhJlberSpbJQafDOYovZEZg0qlZJtdkSNwn9tC8XkB9hGq 0FuC3AMclos/bWV1DvXF6J8vlzNXX7q3htqT0iINabg8zJnitCDsLMmISz0wOJdg VMJMEENR4vzsADM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; 
q=dns; s=sasl; b=d6I++Z o2LYvyP+fT2tgz8GFeSCfCNN1QHCQvUnhsd4C82LeTQRu30Y/TO+JB36PJGXyoKX 36aaIbcAi7Gf8KGddX795TdZEdj71XqlQHXUa+p2Q8JeuOGFez23M1Wu4oYOOOe2 EEFxwqpYlt/UCxRmUKr9gZJdSbbTwq+7CSHhY= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 2BB492FFB2; Mon, 8 Jul 2013 20:06:29 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 0F9FA2FFA6; Mon, 8 Jul 2013 20:06:28 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id E32F45C55; Tue, 9 Jul 2013 08:06:20 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 38A7A49FB97A; Tue, 9 Jul 2013 08:06:25 +1200 (NZST) Date: Tue, 09 Jul 2013 08:06:25 +1200 Message-ID: <877gh024vy.wl%berend@pobox.com> From: Berend de Boer To: Markus Gebert Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze? 
In-Reply-To: <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch> References: <87li5o5tz2.wl%berend@pobox.com> <87ehbg5raq.wl%berend@pobox.com> <20130703055047.GA54853@icarus.home.lan> <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com> <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch> <87zjty11gn.wl%berend@pobox.com> <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Tue_Jul__9_08:06:24_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: DFC09486-E809-11E2-827F-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 20:06:33 -0000 --pgp-sign-Multipart_Tue_Jul__9_08:06:24_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Markus" == Markus Gebert writes: Markus> By 'mount' do you mean the import of the pool? Did you use Markus> -F on import? In any case, this sounds too long. Was the Markus> system doing IO? Yep, did -F as well, I believe I have seen the most disastrous imports in that case, i.e. empty disks. And it did continuous I/O, as best as I could determine at max disk performance. >> Another interesting thing I've seen was a completely empty >> drive after mount! Markus> That's a bit unspecific. What's empty? Disk full of zeros? Markus> Partition full of zeros? Pool without file system on it? Markus> Pool with empty filesystems? With empty file systems. 
Markus> What you really need in that case is the ability to Markus> snapshot all EBS disks as group. Which Linux offers. Markus> Anyway, with the tools at hand (no IO "freeze" like Linux, Markus> no EBS snapshots for groups of disks), I don't think you Markus> can acomplish what you originally intended. Indeed. I think my best strategy is doing frequent ZFS snapshots to a stand-by server, and take the backups from there. -- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_08:06:24_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR2xvAAAoJEKOfeD48G3g5Hg4P/2uEveGay27gtcSb5KEKg62q xy3aW6Hm5GH40WbXYxdRAGMcwQwnVwzYPTefE05KTd+KYl6fXHPA2prVLmOm2PVJ u82ff/E3SjEUq6eAqIl8pngztXfoTcoSosATV4bIk01a4zuSGf+KndMWSqt8LhuQ Rixpz/nakYEa7T1zcIYkNiH5BSoqSKdFuWqOhTCUeDC61tN1sno+uzhBi3+X0gFv uUdcrl5Aic3hfVmLzSL4jHznMuS9xEDgsossEmTNfKK6tgw+F0QVZK+5o4hMY3z6 1vMyAN1K5GuDxFyGusqsjf+B2rcFZYN68rRx+VjviV9yqTtKS6QO7P/HroWaRNtL bP7xacwWw3MCcwhbO9K6csVJSXSty5aHB+tbKBTtRo+0oNppZrItbOWfWZzZbpfa TiCDjMFXmkRrsw1k5cQlN5YHuovGnmc4RaS3owyi1nQKQLIt/1vGC7LsRI2ykIjs jNwbbKI09cWDsBtNjwu1aGxR4lpd7FUE2TOlMQ2+pp0HIL2JWD4L6EdZlD+uXFLd 1TnwLE4HPq8bLLJ8T22E85DQSp29Wygc/wFwEiLudGIXc9FF4LeSGZdlC8TE5B/3 hXDGfA7yB9DHEe0mVi1QF88iDIL7NWAR0pcme/jJd+Wj3lT+sWAAkug+fiCbzevK nEfproH4w+KiCGGs2Dp1 =tF0Q -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_08:06:24_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 20:09:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D0B3B133 for ; Mon, 8 Jul 2013 20:09:01 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com 
[208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 8DB3A1719 for ; Mon, 8 Jul 2013 20:09:01 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id C64432E1A4; Mon, 8 Jul 2013 20:09:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=JLD0c3/QohyG3FYzb2Y5d4k/KFg=; b=A916SWVMEcDayQUPVGFI2PNJyQtV rrBqAwAOwDdtLQO9gMwTHo/LHAtHQBTmcb07KzmsXEXv3MQTTdloLkgP6X4f8oHP ZSATlzuo21KGLq2ego5IMn96BElemSoztLEmw/bF5HGuy0Kir5RurVgishMhUX6v Y7skQ+qNybKpWp8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=Dk0Lqm 4Wiky6P4gXWoIQLw4mN7Caqd+QXw9j9Xq/WbT0srEeNiUQRPmWYwJw3QMw044JCh C3TwqlWSSHL0fjGov+qPAHRIwXAdDo215WYJaBXVyx8uNOJdkD4Thdgi9+qxYIWd 4CkAfZ25tzUyEoCne54jG95VKeMeWGIiDrdfg= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id BA3E62E1A3; Mon, 8 Jul 2013 20:09:00 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 3C8742E19E; Mon, 8 Jul 2013 20:09:00 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 5C5F45C55; Tue, 9 Jul 2013 08:08:53 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id A8F8849FB97A; Tue, 9 Jul 2013 08:08:57 +1200 (NZST) Date: Tue, 09 Jul 2013 08:08:57 +1200 Message-ID: <8761wk24rq.wl%berend@pobox.com> From: Berend de Boer To: Will Andrews Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze? 
In-Reply-To: References: <87li5o5tz2.wl%berend@pobox.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Tue_Jul__9_08:08:57_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 3A74C1CC-E80A-11E2-B383-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: "freebsd-fs@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 20:09:01 -0000 --pgp-sign-Multipart_Tue_Jul__9_08:08:57_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Will" == Will Andrews writes: Will> zpool freeze is a debugging-only command, as the comment Will> suggests. It is not really of much use outside of testing Will> changes to ZIL code. Once run, the only thing you can do to Will> get normal I/O running again is to export the pool and Will> import it again. Right, so I might have screwed my server? It's only semi-production, but you're saying something is wrong with it right now? Does it not commit stuff to disk anymore? I ask because I might have screwed up some EBS snapshot tests, as I tested this "zpool freeze" thing a few times.
-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_08:08:57_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR2xxZAAoJEKOfeD48G3g525sQAM9AP8wWeqRzhBgHj8zTxrOS wxtufZyoguqy4EABCK0vauUhWc6rzM5gjuMOkUgywfUFB3uXnZPFKh/DfrHU3z1D 021ep66+9IDAX2hLj2PwgSD6KsKY+AasWzqYF4a2/VYtdaOECIRcMlu8MuTu6KP9 Wg6MyxZXQjJ48bLs1oH5Bh+Vs4dqrsGSyY4CTXS7jzS6GffeSTg946NCILn3kO/E lNYeHl/AXMYqaHMJA9Hamoi310CZ8waz0EYnkq+dUQfcC97BgR5S4ez9M4SphiRF xgeBagyWpjDz8t4da+4Si6FEz0uGUUZsaOUpt0fCpto7QBeuiR6e7skq7sKt69qr Jtk1jh2vZ66Wo6Wnjapo3zub0cjzyEKQw7wjfrkpnvMkEF1FE2bTNIvnN3HBqcIG uvqjoJ1yftLz2tNcj24Mn3MFP1Z19GxC5ajtWWnj4ctEDPTAAGzheybeCFpXsvI9 7NEtVjvsbWZIiQjIxnzmJBGIU7lQCSic2w8t+NKAmOpfnNLZ+q62SEM2pPthx8gU j9p7n2HNS9T5DferYAjys0n0NMMn2yLoQKfiey/JyZaNSoTFyngB+q5E6UY+dUlp cmoOOXqysVzAExzGNxUyW2IFOX1eVT0TtxbplemYtlZJm2yHLgB5/PMtfX7kOKoE nAb4LJJMNWgAE8QiJXC7 =5oi8 -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_08:08:57_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 21:02:07 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4C81890F for ; Mon, 8 Jul 2013 21:02:07 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from relay4-d.mail.gandi.net (relay4-d.mail.gandi.net [217.70.183.196]) by mx1.freebsd.org (Postfix) with ESMTP id C86CF18DC for ; Mon, 8 Jul 2013 21:02:06 +0000 (UTC) Received: from mfilter25-d.gandi.net (mfilter25-d.gandi.net [217.70.178.153]) by relay4-d.mail.gandi.net (Postfix) with ESMTP id 3146217209A; Mon, 8 Jul 2013 23:01:49 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter25-d.gandi.net Received: from relay4-d.mail.gandi.net ([217.70.183.196]) by mfilter25-d.gandi.net 
(mfilter25-d.gandi.net [10.0.15.180]) (amavisd-new, port 10024) with ESMTP id Zama8qeo26Sh; Mon, 8 Jul 2013 23:01:47 +0200 (CEST) X-Originating-IP: 76.102.14.35 Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net [76.102.14.35]) (Authenticated sender: jdc@koitsu.org) by relay4-d.mail.gandi.net (Postfix) with ESMTPSA id E3611172094; Mon, 8 Jul 2013 23:01:46 +0200 (CEST) Received: by icarus.home.lan (Postfix, from userid 1000) id 13CAF73A31; Mon, 8 Jul 2013 14:01:45 -0700 (PDT) Date: Mon, 8 Jul 2013 14:01:45 -0700 From: Jeremy Chadwick To: Berend de Boer Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze? Message-ID: <20130708210145.GA89605@icarus.home.lan> References: <87ehbg5raq.wl%berend@pobox.com> <20130703055047.GA54853@icarus.home.lan> <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com> <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch> <87zjty11gn.wl%berend@pobox.com> <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch> <877gh024vy.wl%berend@pobox.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <877gh024vy.wl%berend@pobox.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 21:02:07 -0000 On Tue, Jul 09, 2013 at 08:06:25AM +1200, Berend de Boer wrote: > >>>>> "Markus" == Markus Gebert writes: > Markus> What you really need in that case is the ability to > Markus> snapshot all EBS disks as group. > > Which Linux offers. And now I shall quote your original mail that started this thread: > I'm experimenting with building a FreeBSD NFS server on Amazon AWS > EC2. I've created a zpool with 5 disks in a raidz2 configuration. > > How can I make a consistent backup of this using EBS? 
Therefore, the answer/solution for you at this stage seems to be: use Linux. Linux does what you need -- it offers you guest-level utilities that interface with the proprietary storage system back-end (EBS) that is offered by your choice of hosting vendor (Amazon). So what's the problem with using Linux? Why sound so apathetic-yet-confrontational (re: "Linux offers this, FreeBSD doesn't")? There is absolutely no shame in any way/shape/form in using an OS that meets your needs/requirements. I can't speak for others, but if that's Windows, great -- if that's Linux, great -- if that's some proprietary thing that only 30 people use, great. I really couldn't care less what someone uses, as long as it allows them to accomplish what they need. Otherwise, if this is some sort of "deal-breaker" for you and you absolutely need this functionality on FreeBSD, my advice is to talk to Amazon. EBS is a closed, black-box-proprietary thing. The userland utilities they may offer on Linux, depending on how they work, could be made to work on FreeBSD (and NOT through Linux emulation, thank you very much). If this is something you want, you should talk to them about it. They aren't going to know/care unless you tell them it's something that interests you as a customer. If they respond "that's nice, we have only a small interest in FreeBSD", then you should be able to take that response and make decisions based upon it, depending on what your needs are. ***You*** are responsible for those choices, not anyone here. :-) So please do not try to make this a "Linux vs. FreeBSD" thing when the actual limitation here is being indirectly imposed on you by your choice of hosting vendor. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 21:33:23 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6F981D59 for ; Mon, 8 Jul 2013 21:33:23 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-qc0-x22e.google.com (mail-qc0-x22e.google.com [IPv6:2607:f8b0:400d:c01::22e]) by mx1.freebsd.org (Postfix) with ESMTP id 3226E1A91 for ; Mon, 8 Jul 2013 21:33:23 +0000 (UTC) Received: by mail-qc0-f174.google.com with SMTP id m15so2580655qcq.19 for ; Mon, 08 Jul 2013 14:33:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=i+k/1pXDN6BfUtD+tQPpFciauU24D9iDsK4I+3mY/Dw=; b=L5OUb+865GPbNoVU9uUVDiTj9Plq7VoGuK0LFhD1BtprsEsE7locnww3moW3TsR5eV fqUrQki5sqLRjkQVkOiz91w2j5jLbd2ltOKH2GJPHh1R0mD7wioackoeVHl8ahvwaKGv OnwK9EU+t3FQbeyI2mEuU6anLpmny40DbGA1YwE+nfaYFF4mYqoWK4JSpWLoSTLq0a0k 2OJqfAyD6NeUCDC4P0DW6ne4/ib71xdPfdLZR2iBVNP4GfFrumE6UoXEvRkrPORZZz+1 K4HnH+sqFz9ZodMzEbdEyHuFqboNlnZwoEeLFX4oY7dkMnGgmvYgxPdKfI2d0ei6SIME EDpg== MIME-Version: 1.0 X-Received: by 10.49.85.4 with SMTP id d4mr18600731qez.10.1373319202701; Mon, 08 Jul 2013 14:33:22 -0700 (PDT) Received: by 10.49.49.135 with HTTP; Mon, 8 Jul 2013 14:33:22 -0700 (PDT) In-Reply-To: <20130708210145.GA89605@icarus.home.lan> References: <87ehbg5raq.wl%berend@pobox.com> <20130703055047.GA54853@icarus.home.lan> <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com> <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch> <87zjty11gn.wl%berend@pobox.com> <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch> <877gh024vy.wl%berend@pobox.com> <20130708210145.GA89605@icarus.home.lan> Date: Mon, 8 Jul 2013 14:33:22 -0700 Message-ID: Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze? 
From: Freddie Cash To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 21:33:23 -0000 On Mon, Jul 8, 2013 at 2:01 PM, Jeremy Chadwick wrote: > On Tue, Jul 09, 2013 at 08:06:25AM +1200, Berend de Boer wrote: > > >>>>> "Markus" == Markus Gebert writes: > > Markus> What you really need in that case is the ability to > > Markus> snapshot all EBS disks as group. > > > > Which Linux offers. > > And now I shall quote your original mail that started this thread: > > > I'm experimenting with building a FreeBSD NFS server on Amazon AWS > > EC2. I've created a zpool with 5 disks in a raidz2 configuration. > > > > How can I make a consistent backup of this using EBS? > > Therefore, the answer/solution for you at this stage seems to be: use > Linux. Linux does what you need -- it offers you guest-level utilities > that interface with the proprietary storage system back-end (EBS) that > is offered by your choice of hosting vendor (Amazon). So what's the > problem with using Linux? Why sound so apathetic-yet-confrontational > (re: "Linux offers this, FreeBSD doesn't")? > Something else to consider is that this may not be a FreeBSD issue at all, but a filesystem/storage system issue. Meaning, if you use ZFS on Linux ... EBS backups will not work. Same if you try to use Solaris or Illumos or any other ZFS-enabled OS that will run in Amazon's cloud. At which point, it would make more sense taking the discussion upstream to Illumos to find a way to quiesce a ZFS pool in such a way that EBS backups would work. Once that is done, then it can filter downstream to FreeBSD, Linux, and others. 
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 22:23:00 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1BC9C984 for ; Mon, 8 Jul 2013 22:23:00 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id D6C5C1C70 for ; Mon, 8 Jul 2013 22:22:59 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=u+Bwc9JL7tMNtl/i9xObSTPSFclN5AOtXcIZY5dPsHA= c=1 sm=2 a=ctSXsGKhotwA:10 a=FKkrIqjQGGEA:10 a=V5z4IuhVU5kA:10 a=IkcTkHD0fZMA:10 a=6I5d2MoRAAAA:8 a=GzJd4s-eAAAA:8 a=MUmArWVO6LuOtltXXSgA:9 a=QEXdDO2ut3YA:10 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEAHU621GDaFve/2dsb2JhbABZFoMlTYMIvXqBMHSCIwEBBAEjVgUWGAICDRkCWQYTiAkGDKdckSuBJo0DgQ40B4JUgRwDmHyQH4MtIIE1Nw X-IronPort-AV: E=Sophos;i="4.87,1022,1363147200"; d="scan'208";a="39456094" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 08 Jul 2013 18:22:52 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id D9571B3F11; Mon, 8 Jul 2013 18:22:52 -0400 (EDT) Date: Mon, 8 Jul 2013 18:22:52 -0400 (EDT) From: Rick Macklem To: Berend de Boer Message-ID: <2014252440.3314312.1373322172877.JavaMail.root@uoguelph.ca> In-Reply-To: <87d2qt1tof.wl%berend@pobox.com> Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 08 Jul 2013 22:23:00 -0000 Berend de Boer wrote: > >>>>> "Rick" == Rick Macklem writes: > > Rick> Please try this patch: > > Hi Rick, > > Could you please reroll the patc hagainst 9.1-RELEASE? Not sure what > version of FreeBSD you made this for. I haven't a clue, either, to be honest;-) > I have 9.1-RELEASE. Get three > failures: > I don't have a copy of 9.1-RELEASE handy, but here's one for stable/9. If that still fails, just email and I'll download 9.1-RELEASE and create one for it. http://people.freebsd.org/~rmacklem/drc4-stable9.patch rick > # patch --check -p0 < ~/drc4.patch > Hmm... Looks like a unified diff to me... > The text leading up to this was: > -------------------------- > |--- fs/nfsserver/nfs_nfsdcache.c.orig 2013-01-07 > |09:04:13.000000000 -0500 > |+++ fs/nfsserver/nfs_nfsdcache.c 2013-03-12 22:42:05.000000000 > |-0400 > -------------------------- > Patching file fs/nfsserver/nfs_nfsdcache.c using Plan A... > Hunk #1 succeeded at 160. > Hunk #2 succeeded at 216. > Hunk #3 succeeded at 271. > Hunk #4 succeeded at 357. > Hunk #5 succeeded at 370. > Hunk #6 succeeded at 381. > Hunk #7 succeeded at 396 with fuzz 2. > Hunk #8 succeeded at 426. > Hunk #9 succeeded at 444. > Hunk #10 succeeded at 468. > Hunk #11 failed at 476. > Hunk #12 failed at 501. > Hunk #13 succeeded at 523. > Hunk #14 succeeded at 531. > Hunk #15 succeeded at 547. > Hunk #16 succeeded at 568. > Hunk #17 succeeded at 579. > Hunk #18 succeeded at 601. > Hunk #19 succeeded at 665. > Hunk #20 succeeded at 674. > Hunk #21 failed at 683. > Hunk #22 succeeded at 718. > Hunk #23 succeeded at 729. > Hunk #24 succeeded at 750. > Hunk #25 succeeded at 779. > Hunk #26 succeeded at 788. > Hunk #27 succeeded at 803. > Hunk #28 succeeded at 828. > Hunk #29 succeeded at 927. > Hunk #30 succeeded at 943. > 3 out of 30 hunks failed--saving rejects to > fs/nfsserver/nfs_nfsdcache.c.rej > Hmm... 
The next patch looks like a unified diff to me... > The text leading up to this was: > -------------------------- > |--- fs/nfsserver/nfs_nfsdport.c.orig 2013-03-02 18:19:34.000000000 > |-0500 > |+++ fs/nfsserver/nfs_nfsdport.c 2013-03-12 17:51:31.000000000 > |-0400 > -------------------------- > Patching file fs/nfsserver/nfs_nfsdport.c using Plan A... > Hunk #1 succeeded at 59 with fuzz 1 (offset -2 lines). > Hunk #2 succeeded at 3284 (offset -22 lines). > Hunk #3 succeeded at 3351 with fuzz 1 (offset -5 lines). > Hmm... The next patch looks like a unified diff to me... > The text leading up to this was: > -------------------------- > |--- fs/nfs/nfsport.h.orig 2013-03-02 18:35:13.000000000 -0500 > |+++ fs/nfs/nfsport.h 2013-03-12 17:51:31.000000000 -0400 > -------------------------- > Patching file fs/nfs/nfsport.h using Plan A... > Hunk #1 succeeded at 547 (offset -62 lines). > Hmm... The next patch looks like a unified diff to me... > The text leading up to this was: > -------------------------- > |--- fs/nfs/nfsrvcache.h.orig 2013-01-07 09:04:15.000000000 -0500 > |+++ fs/nfs/nfsrvcache.h 2013-03-12 18:02:42.000000000 -0400 > -------------------------- > Patching file fs/nfs/nfsrvcache.h using Plan A... > Hunk #1 succeeded at 41. > done > > > I just want to make sure I can apply it cleanly and that it's not a > mistake I made when making this work when it doesn't. 
>
> --
> All the best,
>
> Berend de Boer
>
>
> ------------------------------------------------------
> Awesome Drupal hosting: https://www.xplainhosting.com/
>

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 22:26:07 2013
Date: Tue, 09 Jul 2013 10:25:55 +1200
Message-ID: <87wqp0zo24.wl%berend@pobox.com>
From: Berend de Boer
To: Jeremy Chadwick
Cc: freebsd-fs
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
In-Reply-To: <20130708210145.GA89605@icarus.home.lan>
References: <87ehbg5raq.wl%berend@pobox.com>
 <20130703055047.GA54853@icarus.home.lan>
 <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com>
 <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch>
 <87zjty11gn.wl%berend@pobox.com>
 <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch>
 <877gh024vy.wl%berend@pobox.com>
 <20130708210145.GA89605@icarus.home.lan>
Organization: Xplain Technology Ltd

>>>>> "Jeremy" == Jeremy Chadwick writes:

Jeremy> Therefore, the answer/solution for you at this stage seems
Jeremy> to be: use Linux. Linux does what you need -- it offers
Jeremy> you guest-level utilities that interface with the
Jeremy> proprietary storage system back-end (EBS) that is offered
Jeremy> by your choice of hosting vendor (Amazon).

Ding dong, nothing to do with guest level utilities. Completely
irrelevant, and I've repeated that numerous times here. All Amazon's
guest level tools work perfectly fine on FreeBSD. Amazon has ZERO
Linux/Windows/FreeBSD/whatever specific stuff.

Jeremy> There is absolutely no shame in any way/shape/form in
Jeremy> using an OS that meets your needs/requirements.

And that's why I'm trying to use FreeBSD as it has features that Linux
doesn't have. Trust me, if Linux had everything FreeBSD has, why would
I use FreeBSD? It's even more expensive on AWS!

Jeremy> EBS is a closed, black-box-proprietary thing.

Ding dong, it's just block storage like every other block storage out
there. Sigh.

I fail to see why mentioning something Linux can do on this mailing
list would upset you. You don't want to know or read that??

--
All the best,

Berend de Boer


------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 22:31:54 2013
Date: Tue, 09 Jul 2013 10:31:49 +1200
Message-ID: <87vc4kznsa.wl%berend@pobox.com>
From: Berend de Boer
To: Freddie Cash
Cc: freebsd-fs
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
References: <87ehbg5raq.wl%berend@pobox.com>
 <20130703055047.GA54853@icarus.home.lan>
 <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com>
 <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch>
 <87zjty11gn.wl%berend@pobox.com>
 <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch>
 <877gh024vy.wl%berend@pobox.com>
 <20130708210145.GA89605@icarus.home.lan>
Organization: Xplain Technology Ltd

>>>>> "Freddie" == Freddie Cash writes:

Freddie> Something else to consider is that this may not be a
Freddie> FreeBSD issue at all, but a filesystem/storage system
Freddie> issue. Meaning, if you use ZFS on Linux ... EBS backups
Freddie> will not work. Same if you try to use Solaris or Illumos
Freddie> or any other ZFS-enabled OS that will run in Amazon's
Freddie> cloud.

And you are exactly right. I can only freeze file systems on Linux
that support that. For example lvm/xfs support that. Not sure about
ext, last time I tried that (Ubuntu 12.04) it didn't.

I didn't dare to use ZFS on Linux, so didn't check, but given my
experience with ZFS so far I doubt they would have added this. And
they would have to, as it is a file system specific thing.

Freddie> At which point, it would make more sense taking the
Freddie> discussion upstream to Illumos to find a way to quiesce a
Freddie> ZFS pool in such a way that EBS backups would work. Once
Freddie> that is done, then it can filter downstream to FreeBSD,
Freddie> Linux, and others.

Great tip. Didn't know exactly if the ZFS implementation in FreeBSD
was forked or not. I see something on their home page about submitting
patches :-)

I've been on #zfs but not much feedback there. I'll join their mailing
list and ask this question.

--
Thanks again,

Berend de Boer


------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 22:37:47 2013
Date: Mon, 8 Jul 2013 15:37:46 -0700
From: Freddie Cash
To: Berend de Boer
Cc: freebsd-fs
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
In-Reply-To: <87vc4kznsa.wl%berend@pobox.com>
References: <87ehbg5raq.wl%berend@pobox.com>
 <20130703055047.GA54853@icarus.home.lan>
 <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com>
 <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch>
 <87zjty11gn.wl%berend@pobox.com>
 <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch>
 <877gh024vy.wl%berend@pobox.com>
 <20130708210145.GA89605@icarus.home.lan>
 <87vc4kznsa.wl%berend@pobox.com>

On Mon, Jul 8, 2013 at 3:31 PM, Berend de Boer wrote:

> >>>>> "Freddie" == Freddie Cash writes:
>
> Freddie> At which point, it would make more sense taking the
> Freddie> discussion upstream to Illumos to find a way to quiesce a
> Freddie> ZFS pool in such a way that EBS backups would work. Once
> Freddie> that is done, then it can filter downstream to FreeBSD,
> Freddie> Linux, and others.
>
> Great tip. Didn't know exactly if the ZFS implementation in FreeBSD
> was forked or not. I see something on their home page about submitting
> patches :-)
>

The FreeBSD implementation of ZFS isn't 100% identical to the Illumos
(aka "reference") implementation, mainly due to GEOM; however, the
FreeBSD ZFS maintainers try to keep it at feature parity with Illumos
(and even push patches upstream that get added to Illumos).

Same with the Linux implementation of ZFS, although there are more
changes made to that one to shoehorn it into that wonderful mess they
call "a storage stack". :) There are a handful of features available in
the ZFS-on-Linux implementation that aren't anywhere else (like
"-o ashift=" for zpool create/add).

All in all, the ZFS-using OS projects try to stay as close to the
Illumos version as is reasonable for the OS.

It certainly would be interesting to have a "zfs freeze" and/or a
"zpool freeze" (depending on where you want to quiesce things), but it
may not play into how ZFS works (wanting to have complete control over
the block devices, meaning no special magic underneath like block-level
snapshots). :)

Or, it may be the "next great feature" of ZFS.
:)

--
Freddie Cash
fjwcash@gmail.com

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 23:04:47 2013
Date: Mon, 8 Jul 2013 19:04:46 -0400
From: Outback Dingo
To: Rick Macklem
Cc: freebsd-fs
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
In-Reply-To: <2014252440.3314312.1373322172877.JavaMail.root@uoguelph.ca>
References: <87d2qt1tof.wl%berend@pobox.com>
 <2014252440.3314312.1373322172877.JavaMail.root@uoguelph.ca>

On Mon, Jul 8, 2013 at 6:22 PM, Rick Macklem wrote:

> Berend de Boer wrote:
> > >>>>> "Rick" == Rick Macklem writes:
> >
> > Rick> Please try this patch:
> >
> > Hi Rick,
> >
> > Could you please reroll the patch against 9.1-RELEASE? Not sure what
> > version of FreeBSD you made this for.
> I haven't a clue, either, to be honest;-)
>
> > I have 9.1-RELEASE. Get three
> > failures:
> >
> I don't have a copy of 9.1-RELEASE handy, but here's one for stable/9.
> If that still fails, just email and I'll download 9.1-RELEASE and create
> one for it.
> http://people.freebsd.org/~rmacklem/drc4-stable9.patch

this patches cleanly against 9/stable updated as of 20 mins ago

>
> rick
>
> > # patch --check -p0 < ~/drc4.patch
> > Hmm... Looks like a unified diff to me...
> > The text leading up to this was:
> > --------------------------
> > |--- fs/nfsserver/nfs_nfsdcache.c.orig 2013-01-07 09:04:13.000000000 -0500
> > |+++ fs/nfsserver/nfs_nfsdcache.c 2013-03-12 22:42:05.000000000 -0400
> > --------------------------
> > Patching file fs/nfsserver/nfs_nfsdcache.c using Plan A...
> > Hunk #1 succeeded at 160.
> > Hunk #2 succeeded at 216.
> > Hunk #3 succeeded at 271.
> > Hunk #4 succeeded at 357.
> > Hunk #5 succeeded at 370.
> > Hunk #6 succeeded at 381.
> > Hunk #7 succeeded at 396 with fuzz 2.
> > Hunk #8 succeeded at 426.
> > Hunk #9 succeeded at 444.
> > Hunk #10 succeeded at 468.
> > Hunk #11 failed at 476.
> > Hunk #12 failed at 501.
> > Hunk #13 succeeded at 523.
> > Hunk #14 succeeded at 531.
> > Hunk #15 succeeded at 547.
> > Hunk #16 succeeded at 568.
> > Hunk #17 succeeded at 579.
> > Hunk #18 succeeded at 601.
> > Hunk #19 succeeded at 665.
> > Hunk #20 succeeded at 674.
> > Hunk #21 failed at 683.
> > Hunk #22 succeeded at 718.
> > Hunk #23 succeeded at 729.
> > Hunk #24 succeeded at 750.
> > Hunk #25 succeeded at 779.
> > Hunk #26 succeeded at 788.
> > Hunk #27 succeeded at 803.
> > Hunk #28 succeeded at 828.
> > Hunk #29 succeeded at 927.
> > Hunk #30 succeeded at 943.
> > 3 out of 30 hunks failed--saving rejects to fs/nfsserver/nfs_nfsdcache.c.rej
> > Hmm... The next patch looks like a unified diff to me...
> > The text leading up to this was:
> > --------------------------
> > |--- fs/nfsserver/nfs_nfsdport.c.orig 2013-03-02 18:19:34.000000000 -0500
> > |+++ fs/nfsserver/nfs_nfsdport.c 2013-03-12 17:51:31.000000000 -0400
> > --------------------------
> > Patching file fs/nfsserver/nfs_nfsdport.c using Plan A...
> > Hunk #1 succeeded at 59 with fuzz 1 (offset -2 lines).
> > Hunk #2 succeeded at 3284 (offset -22 lines).
> > Hunk #3 succeeded at 3351 with fuzz 1 (offset -5 lines).
> > Hmm... The next patch looks like a unified diff to me...
> > The text leading up to this was:
> > --------------------------
> > |--- fs/nfs/nfsport.h.orig 2013-03-02 18:35:13.000000000 -0500
> > |+++ fs/nfs/nfsport.h 2013-03-12 17:51:31.000000000 -0400
> > --------------------------
> > Patching file fs/nfs/nfsport.h using Plan A...
> > Hunk #1 succeeded at 547 (offset -62 lines).
> > Hmm... The next patch looks like a unified diff to me...
> > The text leading up to this was:
> > --------------------------
> > |--- fs/nfs/nfsrvcache.h.orig 2013-01-07 09:04:15.000000000 -0500
> > |+++ fs/nfs/nfsrvcache.h 2013-03-12 18:02:42.000000000 -0400
> > --------------------------
> > Patching file fs/nfs/nfsrvcache.h using Plan A...
> > Hunk #1 succeeded at 41.
> > done
> >
> >
> > I just want to make sure I can apply it cleanly, and that the
> > failures are not down to a mistake I made.
> >
> > --
> > All the best,
> >
> > Berend de Boer
> >
> >
> > ------------------------------------------------------
> > Awesome Drupal hosting: https://www.xplainhosting.com/
> >
> >
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 23:12:33 2013
Date: Tue, 09 Jul 2013 11:12:28 +1200
Message-ID: <87txk4zlwj.wl%berend@pobox.com>
From: Berend de Boer
To: Outback Dingo
Cc: freebsd-fs
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
References: <87d2qt1tof.wl%berend@pobox.com>
 <2014252440.3314312.1373322172877.JavaMail.root@uoguelph.ca>
Organization: Xplain Technology Ltd

>>>>> "Outback" == Outback Dingo writes:

Outback> this patches cleanly against 9/stable updated as of 20
Outback> mins ago

It does, but it does not compile, the `i' variable wasn't declared. I
just fixed that and started the compile again.

--
All the best,

Berend de Boer


------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 23:17:47 2013
Date: Mon, 8 Jul 2013 16:17:26 -0700
From: Jeremy Chadwick
To: Berend de Boer
Cc: freebsd-fs
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
Message-ID: <20130708231726.GA91280@icarus.home.lan>
In-Reply-To: <87wqp0zo24.wl%berend@pobox.com>
References: <20130703055047.GA54853@icarus.home.lan>
 <6488DECC-2455-4E92-B432-C39490D18484@dragondata.com>
 <14A2336A-969C-4A13-9EFA-C0C42A12039F@hostpoint.ch>
 <87zjty11gn.wl%berend@pobox.com>
 <41CC5720-B1EA-4841-8BA5-893F4A628EAD@hostpoint.ch>
 <877gh024vy.wl%berend@pobox.com>
 <20130708210145.GA89605@icarus.home.lan>
 <87wqp0zo24.wl%berend@pobox.com>

On Tue, Jul 09, 2013 at 10:25:55AM +1200, Berend de Boer wrote:
> >>>>> "Jeremy" == Jeremy Chadwick writes:
>
> Jeremy> Therefore, the answer/solution for you at this stage seems
> Jeremy> to be: use Linux. Linux does what you need -- it offers
> Jeremy> you guest-level utilities that interface with the
> Jeremy> proprietary storage system back-end (EBS) that is offered
> Jeremy> by your choice of hosting vendor (Amazon).
>
> Ding dong, nothing to do with guest level utilities. Completely
> irrelevant, and I've repeated that numerous times here. All Amazon's
> guest level tools work perfectly fine on FreeBSD. Amazon has ZERO
> Linux/Windows/FreeBSD/whatever specific stuff.

Then please explain this comment:

> On Tue, Jul 09, 2013 at 08:06:25AM +1200, Berend de Boer wrote:
> > >>>>> "Markus" == Markus Gebert writes:
> > Markus> What you really need in that case is the ability to
> > Markus> snapshot all EBS disks as group.
> >
> > Which Linux offers.

Because what you're saying here, vs. what you said above, is highly
conflicting as I read it. I do not understand how you can say "Amazon's
guest-level tools work on FreeBSD", then immediately say "Amazon has
zero OS-specific stuff", *while* saying "Linux offers {said thing I
want}". These statements are extremely confusing.

I think it would help if you would really start to provide *actual
commands* of what you're doing to accomplish the tasks you want to
accomplish -- both as you do them now on Linux, as well as the
ZFS-related stuff you've done on FreeBSD.

To my knowledge, nowhere in this thread have you actually shown
scrollback/proof/etc. of what you've been doing, just magical
one-liners that seem strange (for example, the "empty disk" thing would
imply that EBS isn't even doing storage caching correctly, which makes
no sense and cannot be true else *nothing* would work!).

It might turn out to be that what you're seeing can be traced down to
user error, or it can be traced back to the controller driver (on
FreeBSD) not behaving how you want (might require SCSI quirks, etc.),
etc... The possibilities are endless, and without hard data, I think
everyone is struggling/throwing their arms up in the air.
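
As a rough illustration of the kind of "actual commands" being asked for
here, the Linux-side sequence discussed in this thread (freeze the file
system, request an EBS snapshot of every backing volume, then thaw) could
be sketched as below. This is only a sketch under assumptions: nobody in
the thread has posted their real commands, and the use of fsfreeze(8) from
util-linux, the AWS CLI "aws ec2 create-snapshot" call, the mount point
and the volume IDs are all illustrative choices, not something taken from
the original posts. The function name and the "run" parameter are
hypothetical helpers.

```python
from typing import Callable, Sequence

# A runner executes one command given as an argv list. In real use this
# could be subprocess.check_call; here it is injected so the ordering can
# be exercised without touching any disk or cloud API.
Runner = Callable[[Sequence[str]], None]


def snapshot_while_frozen(mountpoint: str,
                          volume_ids: Sequence[str],
                          run: Runner) -> None:
    """Freeze a file system, request one snapshot per volume, then thaw.

    Assumes a Linux guest with fsfreeze(8) and a file system that
    supports freezing (the thread mentions lvm/xfs). The snapshot
    command is an assumption: any tool that starts an EBS snapshot
    would fit in the same place.
    """
    run(["fsfreeze", "--freeze", mountpoint])
    try:
        # All snapshot requests are issued while writes are blocked, so
        # every volume is captured at the same logical point in time.
        for volume_id in volume_ids:
            run(["aws", "ec2", "create-snapshot", "--volume-id", volume_id])
    finally:
        # Always thaw, even if a snapshot request failed, so the file
        # system is never left frozen.
        run(["fsfreeze", "--unfreeze", mountpoint])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    snapshot_while_frozen(
        "/data",
        ["vol-11111111", "vol-22222222"],
        run=lambda cmd: print(" ".join(cmd)),
    )
```

The point of the sketch is the ordering and the unconditional thaw; whether
an equivalent quiesce step exists for a ZFS pool on FreeBSD is exactly the
open question in this thread.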
The reason I mention storage controller drivers is, for example, on ESXi (I forget what versions) it was discovered that cache flushing (the OS's way of telling the storage driver, which then tells the disks, "make sure everything is written to the platters") was a no-op of sorts -- ESXi would say "yep I've done it" when in fact it hadn't yet. With virtualisation this is a common situation, **solely because** of all the abstraction and caching (at so many layers that it's almost unfathomable -- no joke). I put EBS into the same category. Because really, all you know ***TRULY*** is "I see disk daX using controller xyzX", and the rest you have to pray/hope works (from the OS driver level all the way down to the physical media this stuff is written to on the back end / within EBS). The more abstraction = the more chances something will not behave how bare metal expects. > Jeremy> There is absolutely no shame in any way/shape/form in > Jeremy> using an OS that meets your needs/requirements. > > And that's why I'm trying to use FreeBSD as it has features that Linux > doesn't have. Trust me, if Linux had everything FreeBSD has, why would > I use FreeBSD? It's even more expensive on AWS! I'm glad you're trying it, but if you're at the stage where you're saying "Linux can do X/Y/Z", then it's up to **you** to weigh the pros and cons of moving away from Linux. Nobody here can make that decision for you, whether it be based on technical need or money or whatever else. > Jeremy> EBS is a closed, black-box-proprietary thing. > > Ding dong, it's just block storage like every other block storage out > there. Sigh. You once again generalise. Every kind of storage back-end is different, I don't know why you have such trouble understanding this. 
You're quite literally telling me that EBS is the same as iSCSI is the same as an ESXi file-based backing store is the same as a physical disk (remember your words: "disks are just software"), which is completely and entirely wrong -- their behaviours differ severely, not just as solutions/methods, but in *how they behave* when submit with I/O requests. All that "black magic" that goes on under the hood? It matters. Every single bit of it. > I fail to see that mentioning something Linux can do on this mailing > list would upset you? You don't want to know or read that?? I am one of the most "anti-FreeBSD" people you will ever encounter -- go ahead, ask anyone on this list, you will find that I am the one who is usually quick to take my "well then FreeBSD needs to get it's sh** together on a server level" soapbox. And that doesn't come lightly given that I've been using/working professionally with FreeBSD since the 2.2.x days. Prior to that I ran Linux (0.99pl45 until early 1.3.x). I'm just trying to give you some perspective into how I am, if it matters. What upsets me is that you're saying "Linux has XYZ", in turn creating a "FreeBSD vs. Linux" conflict, when I personally strive to see that kind of "OS war" advocacy end. I want to see people use whatever tool/OS/thing solves their dilemmas. If that thing is Linux, cool. If it's FreeBSD, cool. If it's Windows, cool. The dilemmas, limitations, and needs/etc. vary per person, per org, per company. I am against all forms of OS advocacy. I prefer to keep an open mind. So while I give you two thumbs up for giving FreeBSD a try, and ZFS too, but if Linux does something that you can't get on FreeBSD, then to me it seems like the choice is obvious. But that's me -- I love oversimplifying. ;-) -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Making life hard for others since 1977. 
PGP 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 23:18:07 2013
In-Reply-To: <87txk4zlwj.wl%berend@pobox.com>
Date: Mon, 8 Jul 2013 19:18:06 -0400
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
From: Outback Dingo
To: Berend de Boer
Cc: freebsd-fs

On Mon, Jul 8, 2013 at 7:12 PM, Berend de Boer wrote:

> >>>>> "Outback" == Outback Dingo writes:
>
>     Outback> this patches cleanly against 9/stable updated as of 20
>     Outback> mins ago
>
> It does, but it does not compile, the `i' variable wasn't declared. I
> just fixed that and started the compile again.
>

yupp just hit that also, whats the fix ???

> --
> All the best,
>
> Berend de Boer
>
>
> ------------------------------------------------------
> Awesome Drupal hosting: https://www.xplainhosting.com/
>

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 23:27:32 2013
Date: Tue, 09 Jul 2013 11:27:12 +1200
Message-ID: <87sizozl7z.wl%berend@pobox.com>
From: Berend de Boer
To: Outback Dingo
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
Cc: freebsd-fs

>>>>> "Outback" == Outback Dingo writes:

    Outback> yupp just hit that also, whats the fix ???

Just add a line with:

  int i;

at the beginning of the function, below the other variable
declarations. Then everything compiles.

--
All the best,

Berend de Boer

------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 8 23:28:23 2013
In-Reply-To: <87sizozl7z.wl%berend@pobox.com>
Date: Mon, 8 Jul 2013 19:28:22 -0400
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
From: Outback Dingo
To: Berend de Boer
Cc: freebsd-fs

On Mon, Jul 8, 2013 at 7:27 PM, Berend de Boer wrote:

> >>>>> "Outback" == Outback Dingo writes:
>
>     Outback> yupp just hit that also, whats the fix ???
>
> Just add a line with:
>
>   int i;
>
> at the beginning of the function, below the other variable
> declarations. Then everything compiles.
>

jeeeeeez damned if im not a programmer but considered that.... :)
thanks for clarifying

> --
> All the best,
>
> Berend de Boer
>
>
> ------------------------------------------------------
> Awesome Drupal hosting: https://www.xplainhosting.com/
>

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 00:05:26 2013
Date: Mon, 8 Jul 2013 17:05:08 -0700
From: Jeremy Chadwick
To: Freddie Cash
Subject: Re: EBS snapshot backups from a FreeBSD zfs file system: zpool freeze?
Message-ID: <20130709000508.GA92194@icarus.home.lan>
Cc: freebsd-fs

On Mon, Jul 08, 2013 at 03:37:46PM -0700, Freddie Cash wrote:
> On Mon, Jul 8, 2013 at 3:31 PM, Berend de Boer wrote:
>
> > >>>>> "Freddie" == Freddie Cash writes:
> >
> >     Freddie> At which point, it would make more sense taking the
> >     Freddie> discussion upstream to Illumos to find a way to quiesce a
> >     Freddie> ZFS pool in such a way that EBS backups would work. Once
> >     Freddie> that is done, then it can filter downstream to FreeBSD,
> >     Freddie> Linux, and others.
> >
> > Great tip. Didn't know exactly if the ZFS implementation in FreeBSD
> > was forked or not. I see on their home page about submitting patches
> > :-)
> >
>
> The FreeBSD implementation of ZFS isn't 100% identical to the Illumos (aka
> "reference") implementation, mainly due to GEOM; however, the FreeBSD ZFS
> maintainers try to keep it at feature parity with Illumos (and even push
> patches upstream that get added to Illumos).
>
> Same with the Linux implementation of ZFS, although there are more changes
> made to that one to shoehorn it into that wonderful mess they call "a
> storage stack". :) There are a handful of features available in the
> ZFS-on-Linux implementation that aren't anywhere else (like "-o ashift="
> for zpool create/add).
>
> All in all, the ZFS-using OS projects try to stay as close to the Illumos
> version as is reasonable for the OS.
>
> It certainly would be interesting to have a "zfs freeze" and/or a "zpool
> freeze" (depending on where you want to quiesce things), but it may not
> play into how ZFS works (wanting to have complete control over the block
> devices, meaning no special magic underneath like block-level snapshots).
> :) Or, it may be the "next great feature" of ZFS. :)

Well, back to his original statement, quoting:

> On Linux' file systems I can freeze a file system, start the backup of
> all disks, and unfreeze. This freeze usually only takes 100ms or so.

I interpret this statement to mean, on Linux:

1. Some command is issued at the filesystem level that causes all I/O
   operations (read and write) directed to/from that filesystem to block
   (wait) indefinitely, and that all pending queued writes to the disk
   are flushed to disk (on FreeBSD we would call this BIO_FLUSH),

2. Some other command is issued (at the Amazon EBS level, whether it be
   done via a web page or via CLI commands on the same Linux box --
   though I don't know how that would work unless the CLI tools are on a
   completely separate filesystem), where an EBS snapshot is taken
   (similar to a filesystem snapshot but at the actual storage level).
   Possibly, if this is a Linux command, there's an actual device driver
   that sits between the storage layer and EBS which can effectively
   "halt" or "control" things in some manner (would not be surprised!
   VMs often offer this) -- I'll call this a "shim",

3. Some command is issued at the filesystem level that releases that
   block/wait, and all future I/O requests go through.

What this means is that "block-level snapshots" are what would be
necessary -- the key here is that writes pending (scheduled to be
written to the disk) need to be flushed, and that any other I/O must
block. I do not think something like CACHE FLUSH EXT (i.e. the ATA
command used to actually flush disk-level cache to the platters) matters
-- with EBS, whether the data is "in its cache" or not has no bearing;
it should know what to do in either case. All this would be because of
what EBS would require/mandate.

On FreeBSD we don't have the Linux equivalent of #1/#3 -- the layer
where this would be done, ideally, is at the GEOM level (ex. a
hypothetical "gfreeze" command would block all I/O and also issue
BIO_FLUSH to ensure things had been written). Due to the split between
GEOM and filesystems (unrelated things per se), one would have to issue
"gfreeze" on the disks that make up the filesystem, followed by doing
the EBS backup/snapshot, followed by "gfreeze -u" on all the disks.
Wishful thinking, and very idealistic, but that's my take on it.

I have no idea how you'd issue this command to select disks without
there being some risk; i.e. with a 5-disk raidz1, you'd issue that
command 5 times (even if just in 1 single command, the kernel still has
to iterate over 5 items linearly), which means there's a chance the
filesystem could have successfully written parts of something to some of
those 5 disks, thus upon an EBS snapshot restore the filesystem is
actually inconsistent (ZFS reporting checksum failures, for example).

I have no idea how at the filesystem level (ex. zfs, not zpool) such
could be accomplished because again BIO_FLUSH is what's needed, and that
would be at the "provider" level (GEOM term) -- I think (kernel folks
please correct me). I also have no idea how other layers (ex. CAM) would
react to such a "freeze". Likewise, I worry about userland applications;
100ms is a nice and convenient number... ;-)

On FreeBSD I think what most folks do is avoid all of the above and use
filesystem snapshots exclusively, either ZFS or UFS, although UFS
snapshots... well... don't get me started. Filesystem snapshots are
"supposed" to be fast, but they depend greatly on a lot of things and
how they're implemented.
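The partial-write risk described above for a multi-disk pool can be
made concrete with a toy model. The Python sketch below is purely
illustrative and rests on loud assumptions: the function names are
hypothetical, the per-disk "freeze" stands in for the wished-for
"gfreeze" (which does not exist), and raidz parity, copy-on-write and
ZFS transaction groups are not modelled at all. It only shows that
freezing disks one at a time leaves a window in which a striped write
can land on some disks and not others.

```python
def snapshot_after_sequential_freeze(num_disks, write_arrives_after):
    """Toy model: disks are frozen one at a time, in order.

    Every disk starts out holding generation 1 of a stripe. A full-stripe
    write of generation 2 arrives once `write_arrives_after` disks are
    already frozen; it lands only on the disks that are still unfrozen.
    Returns the generation each disk holds when the snapshot is taken.
    """
    generations = [1] * num_disks
    frozen = [False] * num_disks
    for step in range(num_disks + 1):
        if step == write_arrives_after:
            # Frozen disks keep the old data; unfrozen ones take the new write.
            generations = [old if is_frozen else 2
                           for old, is_frozen in zip(generations, frozen)]
        if step < num_disks:
            frozen[step] = True  # the hypothetical per-disk "freeze"
    return generations


def is_consistent(generations):
    """A stripe is consistent if every disk holds the same generation."""
    return len(set(generations)) == 1


if __name__ == "__main__":
    for arrives_after in (0, 2, 5):
        state = snapshot_after_sequential_freeze(5, arrives_after)
        print(arrives_after, state,
              "consistent" if is_consistent(state) else "TORN")
```

In this model, only a write that arrives before the first freeze or
after the last one leaves all five disks in agreement; a write arriving
in between is captured half-applied, which is the situation a snapshot
taken at that moment would preserve.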
But honestly they're what most people turn to, rather than doing backups
at the "block level" (e.g. EBS). I've never encountered anything like a
"block level" freeze or snapshot on bare metal (this would have to be
done somehow at the controller level; SANs have this, I believe, but not
the simple HBAs that I've worked with).

One can't even do something like extend sync(8) to somehow issue
BIO_FLUSH, because nothing guarantees exclusion between the BIO_FLUSH
and the time things are done -- more writes could enter the queue, or
maybe enough that the queue is full and gets processed right then and
there, leading to the same situation.

This whole thing is a mess due to the layers of disconnect between all
the pieces (including on Linux -- it just so happens they have some
interesting way with **very specific filesystems** to accomplish this
task), and if you ask me, a complete disconnect from reality between the
"cloud providers" (Amazon, etc.) and how actual storage and filesystems
*work*. Very naughty assumptions being made on their part, unless, of
course, there is that "shim" I spoke about.

--
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Making life hard for others since 1977.
PGP 4BD6C0CB |

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 01:02:28 2013
Date: Tue, 09 Jul 2013 13:02:18 +1200
Message-ID: <87ppuszgth.wl%berend@pobox.com>
From: Berend de Boer
To: Rick Macklem
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca>
Cc: freebsd-fs

>>>>> "Rick" == Rick Macklem writes:

    Rick> After you apply the patch and boot the rebuilt kernel, the
    Rick> cpu overheads should be reduced after you increase the value
    Rick> of vfs.nfsd.tcphighwater.

What number would I be looking at? 100? 100,000?
--
All the best,

Berend de Boer

------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 01:04:41 2013
Date: Tue, 09 Jul 2013 13:04:37 +1200
Message-ID: <87obaczgpm.wl%berend@pobox.com>
From: Berend de Boer
To: Rick Macklem
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca>
Cc: freebsd-fs

>>>>> "Rick" == Rick Macklem writes:

    Rick> After you apply the patch and boot the rebuilt kernel, the
    Rick> cpu overheads should be reduced after you increase the value
    Rick> of vfs.nfsd.tcphighwater.

Do I need to umount the share? Restart something after changing this
value?

--
All the best,

Berend de Boer

------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 01:41:08 2013
Date: Mon, 8 Jul 2013 21:41:01 -0400 (EDT)
From: Rick Macklem
To: Berend de Boer
Message-ID: <1306739954.3352769.1373334061402.JavaMail.root@uoguelph.ca>
In-Reply-To: <87obaczgpm.wl%berend@pobox.com>
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
Cc: freebsd-fs

Berend de Boer wrote:
> >>>>> "Rick" == Rick Macklem writes:
>
>     Rick> After you apply the patch and boot the rebuilt kernel, the
>     Rick> cpu overheads should be reduced after you increase the value
>     Rick> of vfs.nfsd.tcphighwater.
>
> Do I need to umount the share? Restart something after changing this
> value?
>

I think you can safely change it "on the fly". It simply defines how
large the DRC (duplicate request cache) can grow before the nfsd thread
will try to trim it down. The default of 0 means "trim every RPC", which
keeps it at a minimal size, but can result in significant CPU overhead.

rick

> --
> All the best,
>
> Berend de Boer
>
>
> ------------------------------------------------------
> Awesome Drupal hosting: https://www.xplainhosting.com/
>

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 01:43:53 2013
Date: Mon, 8 Jul 2013 21:43:52 -0400 (EDT)
From: Rick Macklem
To: Berend de Boer
Message-ID: <27783474.3353362.1373334232356.JavaMail.root@uoguelph.ca>
In-Reply-To: <87ppuszgth.wl%berend@pobox.com>
Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2
Cc: freebsd-fs, Garrett Wollman

Berend de Boer wrote:
> >>>>> "Rick" == Rick Macklem writes:
>
>     Rick> After you apply the patch and boot the rebuilt kernel, the
>     Rick> cpu overheads should be reduced after you increase the value
>     Rick> of vfs.nfsd.tcphighwater.
>
> What number would I be looking at? 100? 100,000?
>

Garrett Wollman might have more insight into this, but I would say on
the order of 100s to maybe 1000s.
rick > -- > All the best, > > Berend de Boer > > > ------------------------------------------------------ > Awesome Drupal hosting: https://www.xplainhosting.com/ > > From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 02:20:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6CCBD981 for ; Tue, 9 Jul 2013 02:20:14 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 2A3D21DAA for ; Tue, 9 Jul 2013 02:20:11 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 3773A24D57; Tue, 9 Jul 2013 02:20:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=szancFYaaOWJ2fQvPJkzjBGzNwU=; b=Mg9l4TB7IyjMC/t+uDNYVwj5dDc9 Qr6GOB0FMQg4Nvpjn5/IKGrUW+8SXs0Fw0OnpKd7sgWEODXfNSGBinX5cve+V87U oXabynf0yBkJPHRna26KC9CGdTodyy0v6faOJ2wfhJECYOEevvUG6z44LiBjU2OO wO7Wt+5NGFMW7gY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=gaVDpf xNd0zOMNvZjcm3lRhwdUp5TiVEjCdS+pcoVe7wOJlnb3HsRQEs7E7XXV0tel25ib IefJ8DprjbV02P1+b8EUjgZ1IDuHyycMOEtyFUlL81eHb1D5cQb9QC4umqnYeQaP bDXmtkXZEqA0fKi/J3qW5pWsejWfgd7rTneZA= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 2ED0824D55; Tue, 9 Jul 2013 02:20:09 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id A6DB624D53; Tue, 9 Jul 2013 02:20:08 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with 
ESMTP id 9C6D95C77; Tue, 9 Jul 2013 14:20:01 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id E62D149FB97A; Tue, 9 Jul 2013 14:20:05 +1200 (NZST) Date: Tue, 09 Jul 2013 14:20:05 +1200 Message-ID: <87k3l0zd7u.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Tue_Jul__9_14:20:05_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 1379FF90-E83E-11E2-87A9-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 02:20:14 -0000 --pgp-sign-Multipart_Tue_Jul__9_14:20:05_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Rick" == Rick Macklem writes: Rick> After you apply the patch and boot the rebuilt kernel, the Rick> cpu overheads should be reduced after you increase the value Rick> of vfs.nfsd.tcphighwater. Have set it to 10,000, max cpu for nfsd I've seen is below 6%. Makes no real difference whatsoever to the great slowness of nfs4 in this use-case. I.e. 
did two tests: 17.5 minutes with sync=disabled, 21.5 minutes with sync=enabled, but the difference in this case could simply be due to whatever else was going on at that time. FYI, in the nfs3 mount nfsd is at 0% at all times, basically uses no cpu whatsoever. The weird thing is that nfs3 performance seems to have been greatly affected: the same test which ran in 2 minutes on udp now takes between 7 and 11 minutes. As this could be a problem with how I'm testing now (I recompiled the kernel), I'll try to see what numbers I get when I undo the patch and work against a recompiled kernel. -- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_14:20:05_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR23NVAAoJEKOfeD48G3g5fqoP/16z4OGMu//AEaKoMkXNbWMo a6M0BWyKec0BKKL1V1tKoyp5oKu3IjYvZ7RPWRCmd6TxPd8h1wigr+PayyL4VekX Wuz1u/Jgd18BNTWYyIqqNCi/cH/hMYS58AAf5Jq5SEmOZy3V3bFnPmdoM+jovTzR fkM26Hw9siFtcMoPzCw8SnRC3jFiLScfHtzNib2BQa/FerTeA+gKe5M8C5ylyhP/ wa5k4Lt+NtFZEXfHnsXNK11fOmITmrfaz6vnMnVURkcf4j0ITa45dQiufYjHxl6a 9MqC8YnVo18i/3coaKKDnD4qf5m3fGpNoW1Pmkmb+uiwDnqne8G0eVTUW5lqBug7 gblxstadl/diFMiEuv5oBXnUSfZTVraihyFAiShkj79fISA4mLpthmY6Pm50k1ov fjVj+2tpSfx1/FGQNTOwaVMqLfa+QeWrVePirfQU2e+JMgkdr/F6VckIkIeYhCZB b0G9KlaEf8V283xOlQ7tuFUkuXhFq9jRB3E13+73DW/QhMcq2RGa2Qv04BAWoB0U ICD5+ZIQRqIfVOvlXsxtn2yN3cEoWMq5+NhKwWTKA2WDxqzh4+15s/s5+G3j9qvL oIca+U8qn1WficYYqPqxMDHH/bjGSVe10pUV1oiSD1Dv52DhJY44TeLaFmXdnZFi +Z5vILE394P0eFdjBzmw =4kgg -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_14:20:05_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 02:24:39 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 
37CBDA99 for ; Tue, 9 Jul 2013 02:24:39 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id D1CC61DEF for ; Tue, 9 Jul 2013 02:24:38 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r692OakD018309; Mon, 8 Jul 2013 22:24:36 -0400 (EDT) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r692OaHZ018306; Mon, 8 Jul 2013 22:24:36 -0400 (EDT) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <20955.29796.228750.131498@hergotha.csail.mit.edu> Date: Mon, 8 Jul 2013 22:24:36 -0400 From: Garrett Wollman To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <27783474.3353362.1373334232356.JavaMail.root@uoguelph.ca> References: <87ppuszgth.wl%berend@pobox.com> <27783474.3353362.1373334232356.JavaMail.root@uoguelph.ca> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Mon, 08 Jul 2013 22:24:36 -0400 (EDT) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 02:24:39 -0000 < said: > Berend de Boer wrote: >> >>>>> "Rick" == Rick Macklem writes: >> Rick> After you apply the patch and boot the rebuilt kernel, the Rick> cpu overheads should be reduced after you increase 
the >> value Rick> of vfs.nfsd.tcphighwater. >> >> What number would I be looking at? 100? 100,000? >> > Garrett Wollman might have more insight into this, but I would say on > the order of 100s to maybe 1000s.

On my production servers, I'm running with the following tuning (after Rick's drc4.patch):

----loader.conf----
kern.ipc.nmbclusters="1048576"
vfs.zfs.scrub_limit="16"
vfs.zfs.vdev.max_pending="24"
vfs.zfs.arc_max="48G"
#
# Tunable per mps(4). We had significant numbers of allocation failures
# with the default value of 2048, so bump it up and see whether there's
# still an issue.
#
hw.mps.max_chains="4096"
#
# Simulate the 10-CURRENT autotuning of maxusers based on available memory
#
kern.maxusers="8509"
#
# Attempt to make the message buffer big enough to retain all the crap
# that gets spewed on the console when we boot. 64K (the default) isn't
# enough to even list all of the disks.
#
kern.msgbufsize="262144"
#
# Tell the TCP implementation to use the specialized, faster but possibly
# fragile implementation of soreceive. NFS calls soreceive() a lot and
# using this implementation, if it works, should improve performance
# significantly.
#
net.inet.tcp.soreceive_stream="1"
#
# Six queues per interface means twelve queues total
# on this hardware, which is a good match for the number
# of processor cores we have.
#
hw.ixgbe.num_queues="6"

----sysctl.conf----
# Make sure that device interrupts are not throttled (10GbE can make
# lots and lots of interrupts).
hw.intr_storm_threshold=12000
# If the NFS replay cache isn't larger than the number of operations nfsd
# can perform in a second, the nfsd service threads will spend all of their
# time contending for the mutex that protects the cache data structure so
# that they can trim them. If the cache is big enough, it will only do this
# once a second.
vfs.nfsd.tcpcachetimeo=300
vfs.nfsd.tcphighwater=150000

----modules/nfs/server/freebsd.pp----
exec {'sysctl vfs.nfsd.minthreads':
  command => "sysctl vfs.nfsd.minthreads=${min_threads}",
  onlyif  => "test $(sysctl -n vfs.nfsd.minthreads) -ne ${min_threads}",
  require => Service['nfsd'],
}
exec {'sysctl vfs.nfsd.maxthreads':
  command => "sysctl vfs.nfsd.maxthreads=${max_threads}",
  onlyif  => "test $(sysctl -n vfs.nfsd.maxthreads) -ne ${max_threads}",
  require => Service['nfsd'],
}

($min_threads and $max_threads are manually configured based on hardware, currently 16/64 on 8-core machines and 16/96 on 12-core machines.)

As this is the summer, we are currently very lightly loaded. There's apparently still a bug in drc4.patch, because both of my non-scratch production servers show a negative CacheSize in nfsstat.

(I hope that all of these patches will make it into 9.2 so we don't have to maintain our own mutant NFS implementation.)

-GAWollman

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 06:47:03 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7CB38D19 for ; Tue, 9 Jul 2013 06:47:03 +0000 (UTC) (envelope-from Ivailo.Tanusheff@skrill.com) Received: from co1outboundpool.messaging.microsoft.com (co1ehsobe002.messaging.microsoft.com [216.32.180.185]) by mx1.freebsd.org (Postfix) with ESMTP id 3AAFC1B82 for ; Tue, 9 Jul 2013 06:47:02 +0000 (UTC) Received: from mail94-co1-R.bigfish.com (10.243.78.245) by CO1EHSOBE029.bigfish.com (10.243.66.94) with Microsoft SMTP Server id 14.1.225.22; Tue, 9 Jul 2013 06:46:55 +0000 Received: from mail94-co1 (localhost [127.0.0.1]) by mail94-co1-R.bigfish.com (Postfix) with ESMTP id CDCF3B40119; Tue, 9 Jul 2013 06:46:55 +0000 (UTC) X-Forefront-Antispam-Report: CIP:157.56.249.213; KIP:(null); UIP:(null); IPV:NLI; H:AM2PRD0710HT002.eurprd07.prod.outlook.com; RD:none; EFVD:NLI X-SpamScore: 0 X-BigFish: 
PS0(zz9371I542Izz1f42h1ee6h1de0h1fdah2073h1202h1e76h1d1ah1d2ah1fc6hzz8275dhz2fh2a8h668h839h944hd24hf0ah1220h1288h12a5h12a9h12bdh137ah13b6h1441h1504h1537h153bh162dh1631h1758h18e1h1946h19b5h19ceh1ad9h1b0ah1d07h1d0ch1d2eh1d3fh1de9h1dfeh1dffh1e1dh9a9j1155h) Received-SPF: pass (mail94-co1: domain of skrill.com designates 157.56.249.213 as permitted sender) client-ip=157.56.249.213; envelope-from=Ivailo.Tanusheff@skrill.com; helo=AM2PRD0710HT002.eurprd07.prod.outlook.com ; .outlook.com ; X-Forefront-Antispam-Report-Untrusted: SFV:NSPM; SFS:(199002)(189002)(13464003)(377454003)(49866001)(77982001)(74366001)(16406001)(76482001)(63696002)(74662001)(74316001)(66066001)(4396001)(53806001)(74706001)(77096001)(54356001)(79102001)(76786001)(69226001)(74502001)(33646001)(59766001)(31966008)(46102001)(47736001)(80022001)(81342001)(76796001)(50986001)(65816001)(47976001)(81542001)(83072001)(51856001)(47446002)(76576001)(56816003)(74876001)(56776001)(54316002)(24736002); DIR:OUT; SFP:; SCL:1; SRVR:DB3PR07MB057; H:DB3PR07MB059.eurprd07.prod.outlook.com; RD:InfoNoRecords; A:1; MX:1; LANG:en; Received: from mail94-co1 (localhost.localdomain [127.0.0.1]) by mail94-co1 (MessageSwitch) id 1373352413615733_19166; Tue, 9 Jul 2013 06:46:53 +0000 (UTC) Received: from CO1EHSMHS023.bigfish.com (unknown [10.243.78.237]) by mail94-co1.bigfish.com (Postfix) with ESMTP id 89C32D0005D; Tue, 9 Jul 2013 06:46:53 +0000 (UTC) Received: from AM2PRD0710HT002.eurprd07.prod.outlook.com (157.56.249.213) by CO1EHSMHS023.bigfish.com (10.243.66.33) with Microsoft SMTP Server (TLS) id 14.1.225.23; Tue, 9 Jul 2013 06:46:53 +0000 Received: from DB3PR07MB057.eurprd07.prod.outlook.com (10.242.137.144) by AM2PRD0710HT002.eurprd07.prod.outlook.com (10.255.165.37) with Microsoft SMTP Server (TLS) id 14.16.329.3; Tue, 9 Jul 2013 06:46:22 +0000 Received: from DB3PR07MB059.eurprd07.prod.outlook.com (10.242.137.149) by DB3PR07MB057.eurprd07.prod.outlook.com (10.242.137.144) with Microsoft SMTP Server (TLS) id 15.0.702.21; 
Tue, 9 Jul 2013 06:46:20 +0000 Received: from DB3PR07MB059.eurprd07.prod.outlook.com ([169.254.2.80]) by DB3PR07MB059.eurprd07.prod.outlook.com ([169.254.2.80]) with mapi id 15.00.0702.005; Tue, 9 Jul 2013 06:46:20 +0000 From: Ivailo Tanusheff To: mxb , Steven Hartland Subject: RE: Slow resilvering with mirrored ZIL Thread-Topic: Slow resilvering with mirrored ZIL Thread-Index: AQHOd+tbf+F9+D3Sm0aSEsDbaMdqDJlS63aAgAAJSYCAABH2AIAABPjZgAAD5YCAAJSVgIABGwyAgAAFcYCAAAWKgIAAGraAgAAITruAABYvAIAAKSq7gAC7DACAAEdnAIAAEH3lgAA/DoCABXDj4A== Date: Tue, 9 Jul 2013 06:46:19 +0000 Message-ID: <111013dbe0cd4cebb20791a0f19ce724@DB3PR07MB059.eurprd07.prod.outlook.com> References: <20130704000405.GA75529@icarus.home.lan> <20130704171637.GA94539@icarus.home.lan> <2A261BEA-4452-4F6A-8EFB-90A54D79CBB9@alumni.chalmers.se> <20130704191203.GA95642@icarus.home.lan> <43015E9015084CA6BAC6978F39D22E8B@multiplay.co.uk> <3CFB4564D8EB4A6A9BCE2AFCC5B6E400@multiplay.co.uk> <51D6A206.2020303@digsys.bg> <20130705145332.GA5449@icarus.home.lan> <9052B6E6-F742-4C10-87B5-2EFE03FDB31E@alumni.chalmers.se> In-Reply-To: <9052B6E6-F742-4C10-87B5-2EFE03FDB31E@alumni.chalmers.se> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [217.18.249.148] x-forefront-prvs: 0902222726 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: skrill.com Cc: "freebsd-fs@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 06:47:03 -0000 Hi, Sorry for the late response, but you may also enable vdev prefetch to increase the speed of the resilver: sysctl vfs.zfs.vdev.cache.size Some say this speeds up the process :) Regards, Ivailo Tanusheff -----Original Message----- From: owner-freebsd-fs@freebsd.org [mailto:owner-freebsd-fs@freebsd.org] On 
Behalf Of mxb Sent: Friday, July 05, 2013 10:38 PM To: Steven Hartland Cc: freebsd-fs@freebsd.org Subject: Re: Slow resilvering with mirrored ZIL Thanks everyone for the very good info provided in this discussion! The question is whether I should wait for the resilvering to finish. It runs at 97B/s now. Do I have any other options in this situation? Put back the old disk? I really don't want to lose all data on this pool. //mxb From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 07:08:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D2F3F2F9 for ; Tue, 9 Jul 2013 07:08:53 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 937991C43 for ; Tue, 9 Jul 2013 07:08:53 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id B208C2A619; Tue, 9 Jul 2013 07:08:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=9onPaN8lOZtZxd1xbQzEjkujd/c=; b=g88mopY0m5EWhH1364PCfXasa9Z7 9TGvCUDUAQ5yu86ROUikjqL4j9ScZ7ysHcKbKURoKfGALZZvC7s2QhcqI1bIj9b4 Wawy8ExXrYxQAwvu/52+IcpoJe8TV0D6Rz/ZcFwYhdoVo3+/o0Ta0xn40fvA3bVj Ura6uyyQJaGu/4k= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=IrYw/E PN6InMnMQHiuTqg1LCwxd5GPjiHZOK+iJkEaUCIRdpSqe9N6z0wyTn1E3ciIE2w/ zR0Mea+2snwCXqAxSZEl4/o10tqB/VlJFyodW2rin5GaVWeOhsv+pr5wbvk/WSwL GxcrL9+ZAVCzvA4L5I6o/yQD+f0ZFMc4FBxsM= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id A50AC2A615; Tue, 9 Jul 2013 07:08:51 +0000 (UTC) Received: from bmach.nederware.nl (unknown 
[27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 1BBBD2A612; Tue, 9 Jul 2013 07:08:50 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 3FA885C55; Tue, 9 Jul 2013 19:08:43 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 3A97D49FB97A; Tue, 9 Jul 2013 19:08:47 +1200 (NZST) Date: Tue, 09 Jul 2013 19:08:47 +1200 Message-ID: <87bo6cyzuo.wl%berend@pobox.com> From: Berend de Boer To: Garrett Wollman Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <20955.29796.228750.131498@hergotha.csail.mit.edu> References: <87ppuszgth.wl%berend@pobox.com> <27783474.3353362.1373334232356.JavaMail.root@uoguelph.ca> <20955.29796.228750.131498@hergotha.csail.mit.edu> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Tue_Jul__9_19:08:46_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 68739C0E-E866-11E2-AE3F-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 07:08:54 -0000 --pgp-sign-Multipart_Tue_Jul__9_19:08:46_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Garrett" == Garrett Wollman writes: Garrett> Great stuff, thanks Garrett! Garrett> ----modules/nfs/server/freebsd.pp---- Where does this go? 
-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_19:08:46_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR27b+AAoJEKOfeD48G3g5z4UQAMrM6k56gs8dLe2Gt8CQZt0Z XJJWGS3WOpqHPjtonQbzye3lsjaemyUzs8iXcgr3uj3+/wF3IfwiSBoY8GyYenow h+k0e/h8BNPied0XJDeVkV2p0jf/JtrJ9oEdjp31ZLjIm3WUKdGosxqRiWhVGh4H atydMqKYXUsrYd8mF9M4v9Z5mTPDVNHGX3w4o8vfhpghboiuM6jSsH9wssSlj1oM 92OPCcrHBUww/Rxml+/NJVJDdw/HxxTdFFdm6MItC+/Wy3SY4mjONMl3yv98WMTj COvXmKFija8IPynVZ39GbpH1fz8/Srj4QCfkHzNnJrZMiAxVZkqBvlwZPZfsDP+u uh6mDJkJofXIEnbZIrF9XVhMhlXIFkTOdA95qM1nyuPvwR0N7daDeesU0zK3iCKX KG0OsOy8+tC/apEW+x6HlxqC5KiAEMN2Xov7MN/8ArpH9/MFZf4B3Cn8rF/OVNvn JT9v96GIfrZnZ4gYv0DSEQpckBF3JIbIK5mbCjtNobwZYmWojBKHYpx/HjmLA4Uc k5Qv8M7oYswNJ40blQXFJnZahSWUOzBOrB7W6qGCeCDsfL0XaHme+Xk/mkxbyqlE R/n+izLDSkHVOS4uLuTllL/kyih3qsxi8Q4ZV8mVrhB+ThEU9DEKQhWqJJM2Kavt Q+OAAs4VCqdgRBT8Xo4v =mWiF -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_19:08:46_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 07:48:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 405B3440 for ; Tue, 9 Jul 2013 07:48:14 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 0D2A61E3F for ; Tue, 9 Jul 2013 07:48:13 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 76C952CF77; Tue, 9 Jul 2013 07:48:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; 
bh=JuRPQ4vw3EGHFNvZY2fue7owmIc=; b=j8PU2LwsxwaIANQVCdu9BSMwO32a dPVuBaAD7z9UuLdizQ3dKqtCkjHxFSkUymAolkNGRxDRFGyyBz62yhnADK3pbBOo EbyH8hE1dBbeL0bp7SPY2ecnBOV53bFL4uFh85Ydsxz7cforOfUev/9pB+vCsdxB JsogOnCr3uUz+go= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=THRgw1 bHDJlSRQrbo/FA8C5hOrv2k36YREVCnKv/bf1E6DXfMEneSmnFPHqEQ3BaIGyE0V h5A27uMMnpALPyHgnstvCHYVA+irHXzgrdd7ZpOcxeBvtUm9R/aU+LPs0yWxSW2o M8ul4CNX/csKhvyG5ypRFF9hFTQBdlDAoI6Fk= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 6DD7C2CF76; Tue, 9 Jul 2013 07:48:09 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 8AAD62CF6F; Tue, 9 Jul 2013 07:48:08 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 324D85C55; Tue, 9 Jul 2013 19:48:01 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 7D86349FB97A; Tue, 9 Jul 2013 19:48:05 +1200 (NZST) Date: Tue, 09 Jul 2013 19:48:05 +1200 Message-ID: <87a9lwyy16.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Tue_Jul__9_19:48:05_2013-1"; micalg=pgp-sha256; 
protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: E59A31DE-E86B-11E2-B70A-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 07:48:14 -0000 --pgp-sign-Multipart_Tue_Jul__9_19:48:05_2013-1 Content-Type: text/plain; charset=US-ASCII

>>>>> "Rick" == Rick Macklem writes:

    Rick> After you
    Rick> apply the patch and boot the rebuilt kernel, the cpu
    Rick> overheads should be reduced after you increase the value of
    Rick> vfs.nfsd.tcphighwater.

OK, completely disregard my previous email. I actually was testing against a server in a different data centre, didn't think it would matter too much, but clearly it does (ping times 2-3 times higher).

So moved server + disks into the same data centre as the nfs client.

1. Does not affect nfs3.

2. When I do not set vfs.nfsd.tcphighwater, I get a "Remote I/O error" on the client. On the server I see: nfsd server cache flooded, try to increase nfsrc_floodlevel (this is just FYI).

3. With vfs.nfsd.tcphighwater set to 150,000, I get very high cpu, 50%. Performance is now about 8m15s, which is better, but still twice as slow as a lower-spec Linux NFS4 server, and four times slower than nfs3 on the same box.

4. With Garrett's settings, I looked at when the cpu starts to increase. It starts slowly, but rises quickly to 50% in about 1 minute. Time was similar: 7m54s.

5. I lowered vfs.nfsd.tcphighwater to 10,000 but then it actually became worse, cpu quickly went to 70%, i.e. not much difference from FreeBSD without the patch. Didn't keep this test running to see if it became slower over time. With 300,000 the cpu seems to increase more slowly (but it keeps rising). So what I observe is that the patch makes the rise in cpu slower, but doesn't stop it. I.e.
after a few minutes, even with setting 300,000 the cpu is getting to 50%, but dropped a bit after a while to hover around 40%. Then it crept back to over 50%. 6. So the conclusion is: this patch helps somewhat, but nfs4 behaviour is still majorly impaired compared to nfs3. -- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_19:48:05_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR28A1AAoJEKOfeD48G3g5yPIQANLfMW8UEy6PjEtkTuraF8JB 26WLNBr3+P7r/nOpJF+Wa148rQ+Snb338W4eLATcmxDvXseoN/Q3EZkDQaxl9kDu CDc3wiT0UNJ4bUcuF5gemeon8pPCdcy8NYkfBoz0bUzLP/nHO81i9twk4Evlsgzz sgEng3NxP4GYjykyqX8tWWiW83i8a/BNL3p5Oi1srJp/hbbzPF/dhv4FrfFGWHIK lIh6AUc12UZh7MTyHhrwdLWNMerYyOL5BH2WGAgs/2+Z9ZU/AvjjVpsrUJQE0nZh LA2hkm5CG3XzDHEW+8B8Qgz1G3HQXjRD4AGQ4ygOAaPS9UIrWKOxvbH/iTQ1vBXQ 6/6VhPBNWlFzsU0VLRHhXcGXtnmXg7N0E21WLipzKUCa9U/zD3wgfKxSoX2eIz13 JOROZ1jXZrPwmAcynYMh+WouZSyWCS4sN6rbVAGoJyExkmRRH18mGTfAJDZcVqgA jL2o9Onfe8mDADWRQU5mYoHDUGYP61pqyCuqTPantw1SSjKySCRuIfJL7hh3c0P6 UZu3JtXun8o4ojmg1o76qd0KN4i7Wrqn+lGn7J0QaYE6zDMA0sHgDMVf7zFvvKhj ZtOOzOapP5jmv6RBmtU985E5THVowfBGSWSzHGSNWCf2kh9OtAr0qKyS/YFWzS+/ Tc9ZBp6ezyVBwUJgszKi =WX8Z -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_19:48:05_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 08:11:12 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8AE76E03 for ; Tue, 9 Jul 2013 08:11:12 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.21.123]) by mx1.freebsd.org (Postfix) with ESMTP id 118301F27 for ; Tue, 9 Jul 2013 08:11:11 +0000 (UTC) Received: from dcave.digsys.bg 
(dcave.digsys.bg [193.68.6.1]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.6/8.14.6) with ESMTP id r698B1sN074040 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Tue, 9 Jul 2013 11:11:02 +0300 (EEST) (envelope-from daniel@digsys.bg) Message-ID: <51DBC595.4020407@digsys.bg> Date: Tue, 09 Jul 2013 11:11:01 +0300 From: Daniel Kalchev User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130627 Thunderbird/17.0.7 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> <87a9lwyy16.wl%berend@pobox.com> In-Reply-To: <87a9lwyy16.wl%berend@pobox.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 08:11:12 -0000

On 09.07.13 10:48, Berend de Boer wrote:
> OK, completely disregard my previous email. I actually was testing
> against a server in a different data centre, didn't think it would
> matter too much, but clearly it does (ping times 2-3 times higher).
>

Could you please actually post a diagram of your setup, with all the components, including the "low spec Linux server". Do not forget the RTT (ping) between these hosts. If you have made any network tuning too.

Networking protocols like NFS are heavily influenced by factors like RTT. An "underpowered" box that is "nearby" (has lower RTT) usually performs much better than a "powerful box" with larger RTT and other network bottlenecks.

Unfortunately, AWS is far from perfect hardware emulation and there might be other layers that interfere with the NFS protocol.
Daniel From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 08:43:07 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4B881273 for ; Tue, 9 Jul 2013 08:43:07 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 0A66010EA for ; Tue, 9 Jul 2013 08:43:06 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 30A7B2C4D9; Tue, 9 Jul 2013 08:43:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=0c1Ne4bQKU9xGxUFjBfsM1rS5iQ=; b=jfxkpEAOCtd6YMw7oRqqWyjVOIwj IhFDPAQmhuZR3rnnh9ZAIErthVUAQtMCUADGOeryc3ZpjV72VD//UFXyraf8BZeP EqNsFWXGku3PBGV8nnbHQwtGRykid/Ueuc8Pj8lCHQJP/7RJh6FtaXdL70jxBfCj pyBWv7ew6HBQ5xw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=bQCRur hLAuclRYoj9MUEDa12kbpGZbgbEjMM0Os62hvhyf49fLuxRrOLGCxleuA1xPQWzw fYmKIon0+rCquN9F4bNVGjoWqKLNl78niT7M4lYbQOeN60xuWjfgbH2HEkW1szpp 0sv3ao72xnUBWcmU2XAz2oS2ZUrZaxuuUyniY= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id 21FDF2C4D7; Tue, 9 Jul 2013 08:43:05 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 8F45F2C4D5; Tue, 9 Jul 2013 08:43:04 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id B1A325C77; Tue, 9 Jul 2013 20:42:57 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 
02C1649FB97A; Tue, 9 Jul 2013 20:43:02 +1200 (NZST) Date: Tue, 09 Jul 2013 20:43:01 +1200 Message-ID: <877gh0yvhm.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <87a9lwyy16.wl%berend@pobox.com> References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> <87a9lwyy16.wl%berend@pobox.com> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Tue_Jul__9_20:43:01_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 922F1412-E873-11E2-A2E6-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 08:43:07 -0000 --pgp-sign-Multipart_Tue_Jul__9_20:43:01_2013-1 Content-Type: text/plain; charset=US-ASCII

>>>>> "Berend" == Berend de Boer writes:

    Berend> 1. Does not affect nfs3.

One update: with zfs sync=standard, I can get nfs3 to perform as badly as the nfs4+patch.

1. without nolock, without async, sync=standard: 8m21s
2. with nolock, without async, sync=standard: 5m37s
3. without nolock, with async, sync=standard: 4m56s
4. with nolock, with async, sync=standard: 4m23s
5. without nolock, without async, sync=disabled: 1m57s

PS: the nfs4 test was done with sync=disabled.
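
As a rough aid for comparing the nfs3 runs listed above, the timings can be converted to seconds and expressed relative to the fastest run. This is only a sketch: the labels are paraphrased from the list ("lock" stands for "without nolock"), and treating run 5 (sync=disabled) as the baseline is an assumption made for illustration.

```python
def to_seconds(text):
    """Convert a timing such as '8m21s' or '1m57' to whole seconds."""
    minutes, _, seconds = text.rstrip("s").partition("m")
    return int(minutes) * 60 + int(seconds)


# Timings as reported in the message; labels are paraphrased.
runs = {
    "lock, no async, sync=standard": "8m21s",
    "nolock, no async, sync=standard": "5m37s",
    "lock, async, sync=standard": "4m56s",
    "nolock, async, sync=standard": "4m23s",
    "lock, no async, sync=disabled": "1m57",
}

# Assumption: use the sync=disabled run as the reference point.
baseline = to_seconds(runs["lock, no async, sync=disabled"])
slowdown = {label: to_seconds(t) / baseline for label, t in runs.items()}

for label, factor in slowdown.items():
    print(f"{label:34s} {to_seconds(runs[label]):4d}s  {factor:.2f}x")
```

Read this way, the sync=standard runs come out somewhere between roughly two and four times the sync=disabled run, which matches the "factors of 2 or 4" scale used elsewhere in the thread.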
-- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_20:43:01_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR280VAAoJEKOfeD48G3g5tcUQALnPe0TB3qdItA0HB4hch019 OLbAo6TXj1lw+0mki2iBedAUsS4Ol3RlByViRxLpL/ZE9NOyQLuAmXPiMRI/3smZ qfeIQQ2osJQn2OHOY/ov+WbyQpaAe/A9UwwuRBOXgjXw3Qxx6NWUBdR46j9E132Z /PQ/Q3BKYE3xr5Xwaq9tV8bzleyNHexKe6pipt8qR911tPbiA3SUhR828KLiFewr iLKKzBdLPFzqnOkgTw6bqlH10iqjD4vzJYuERAg9Od9MVpHmzB2tU0AmDOrvSSE5 agfD35V80AGHx7T9/hGQSY7WgiBpvmepsZZT4woPZjUwEPt0+s4RJ9NQto3cZu3K QECgWjqQ7Xot8of+16K4MqT42FlwXKkPHh0F8Hs2AWmH+ZtP2NaQyszWtH20KO/D VBcJip97Hxm4uHFrc/kJhuRhYqT0doOf2dQfWP71m81KkYMVMaSNpVMmJzTHLo0d ia0cwgf4fDM9RygWE/dK8Jnz3KaOthMsd7Th7QrGn2msJHyaJQaN3NBF5bGlTR26 G+naR5XRj6GWmcFtx1saC/NX/5RZzdga7VIJfiOFeSNXcQjhdQOiAkz9+qY4360u tkZf738U44R2PU8+pzCCj6Qy4B+bkl+9iuV3n/scNxUT+WAFmM8L28AKaSXPT088 8jPTq2MOYwRuE1Oxu83d =vc4D -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_20:43:01_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 08:50:08 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7B58D654 for ; Tue, 9 Jul 2013 08:50:08 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 2D2EC1132 for ; Tue, 9 Jul 2013 08:50:07 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id E5A872C9AD; Tue, 9 Jul 2013 08:50:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; 
bh=tboU3nG049FKipe+Ay3dDawXFmc=; b=P4hl75wsWcT8T2P3ue/qjwP50bvZ ii0bGXfL1PLcLyaQMYYgA1GRyp5c4RdHYvvBrdwS/2+hv7CeSOT+HVSCzZX7cLTY thSHuP8EbGhMBAFnbn19/WQp2IJPwzJE48h/c6V3Gsy37wRh/PigojylK8kv8yHg eLsdUFQMdz5txJ4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=KpYHAb A3w51zQQyK9i0nHaTCRgx0+Y3iohMcxQF97cFdzEKcr6Fsincr1/eBW8v6VHbK/j x1bnL6PyiZFtENrv3Qw2nldG+PrZIJPJNfaFN3nWbrJkvSwmwOedi3pj72oD9IxQ 1ovS1UITi9f8OYXC3vKVyMk4ssN2SU07sq3eE= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id DAF152C9AB; Tue, 9 Jul 2013 08:50:06 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id EF5E42C9A8; Tue, 9 Jul 2013 08:50:05 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 16D7E5C77; Tue, 9 Jul 2013 20:49:59 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 59B2649FB97A; Tue, 9 Jul 2013 20:50:03 +1200 (NZST) Date: Tue, 09 Jul 2013 20:50:03 +1200 Message-ID: <8761wkyv5w.wl%berend@pobox.com> From: Berend de Boer To: Daniel Kalchev Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <51DBC595.4020407@digsys.bg> References: <87y59i0yni.wl%berend@pobox.com> <580122426.2916694.1373242759482.JavaMail.root@uoguelph.ca> <87a9lwyy16.wl%berend@pobox.com> <51DBC595.4020407@digsys.bg> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; 
boundary="pgp-sign-Multipart_Tue_Jul__9_20:50:03_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 8D5AE19A-E874-11E2-B78B-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 08:50:08 -0000 --pgp-sign-Multipart_Tue_Jul__9_20:50:03_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Daniel" == Daniel Kalchev writes: Daniel> Could you please actually post a diagram of your setup, Daniel> with all the components, including the "low spec Linux Daniel> server". Do not forget the RTT (ping) between these Daniel> hosts. If you have made any network tuning too. To be honest, I think that's of little use. I have no control over where my server is placed in the AWS data centre, nor over what other load (cpu or i/o) is also taking place on these boxes. So having approximate values (I'm posting factors of 2 or 4) is the only useful strategy. As you can see I'm not talking about 5% differences here, only about 200% (or 2000%!!!) differences. Pings within an AWS data centre are similar, let's say about 0.450ms, but they vary over a reasonable range. Unless you know AWS I think it's no use posting hardware specs (there are none). The Linux box is a c1.medium, the FreeBSD is an m1.large. FreeBSD is EBS optimised (but disks are not). Daniel> Networking protocols like NFS are heavily influenced by Daniel> factors like RTT. An "underpowered" box that is "nearby" Daniel> (has lower RTT) usually performs much better than a Daniel> "powerful box" with larger RTT and other network Daniel> bottlenecks. I fully agree with that. Daniel> Unfortunately, AWS is far from perfect hardware emulation Daniel> and there might be other layers that intervene with the Daniel> NFS protocol. Exactly right. 
Given that we are talking factors of difference here I think the hardware in this case does not matter. The problem is in the software. -- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Tue_Jul__9_20:50:03_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR2867AAoJEKOfeD48G3g5+BkQAK1hqpXTzGsivtMBuHa9Rp4a JE1dVHH7+midancQrNifhK3PB4r7iL3RT8hU6DK4jmcVT1QfBLX1w62yzJzfpOtt /gK6NRDhsiv2IctUA/LwYb1Q3yJkAEJ4ukJAgJEtUld7tUQHvp7z1Bk+4qemRgLF tujdhkIYSOItYUfmoNA73A9iLA1QQrZJ99aQml1TzqbloUcPu9q+fMP80eHey0Ew pPTMRjaPtTktkPdFxxeMQu9VcHkdrY00uMU8VkLVICLkb9y0j/RUoIY5mkOLKOwF CPDQyNZLREDMrzlttARhZZaNOM3LK8qIlleLIuRYawtkrrjDa+u/KK0V+WRelv5E 8rrEoEs77ZClZScne/AGz55fvxm8fcMfqv2s6YkG9C2kg3IIEtEHJoNacdLp0kF0 IIShWNKgjMt7pymPt5QVDKUshe74WTolygkRupiBfo3X99JvtjiN2QFQdX3msRq5 oY6P4h2Ro+RJPLvZ+5r9Lvsx7QlEpMdF69lqVegKvVglc3sLAjtbyeXuUqYLLXtR E7/SogQVQHTvhE3km6nvCAXBqbv3mzbaHI5jf3q0WDq+fLma1m6NRgobJ36ycOlS z5DinD47LYZkbMGqwvANrblw6gK7yrzDU1KlIDV9kcCE6k0Lgzg1IDD7D0iPzMOV gS3pXLJCoYaconVkhTGO =Hhfq -----END PGP SIGNATURE----- --pgp-sign-Multipart_Tue_Jul__9_20:50:03_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 13:03:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D6CEAFD0; Tue, 9 Jul 2013 13:03:58 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-qe0-x22c.google.com (mail-qe0-x22c.google.com [IPv6:2607:f8b0:400d:c02::22c]) by mx1.freebsd.org (Postfix) with ESMTP id 8DF3A1EAC; Tue, 9 Jul 2013 13:03:58 +0000 (UTC) Received: by mail-qe0-f44.google.com with SMTP id 5so2999404qeb.31 for ; Tue, 09 Jul 2013 06:03:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:content-type; bh=v6dXmHtkO65ny6FVKgySzmqEPWEhI0hhz94ihD4RwZY=; b=ObPJCUxozMhMf0Q33Ed/ZLHI2BXytmJLyHq0pEF58T9UC8TAP5ce3CrVQ7aczKDGc2 Kw4d3jUlV68kRqXDLF911jMF+oBOQ1yPl6c8Lz/cW2+MpukH1r50npc9sY8+Gf9Kc8Hq +1eUv2VO4UzY6DIPL9JWAeLn6IkbidwAMTuAgrlYmcIDZmYBeGBEIgeqDFDpySOyAExH vfjcqEO09LUNapwJ+7dS90cbcKwHSeNrAgTybpDsq7dEP5eZzVTVduzo4J8mwMf/IsvB iD/7O7XXB5sW/cV3m2s3HLrTp2FjtAgCS1jh6GtdKvwhgFOxrCM0xN3B3OOi1O4hbOgn c3Yw== MIME-Version: 1.0 X-Received: by 10.224.13.19 with SMTP id z19mr23377106qaz.12.1373375038092; Tue, 09 Jul 2013 06:03:58 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.195.72 with HTTP; Tue, 9 Jul 2013 06:03:58 -0700 (PDT) Date: Tue, 9 Jul 2013 06:03:58 -0700 X-Google-Sender-Auth: 7ibDvLFRUe6vBywUWbbUlGnDlIw Message-ID: Subject: Deadlock in nullfs/zfs somewhere From: Adrian Chadd To: freebsd-current , freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 13:03:58 -0000 Hi all, I'm doing some -10 i386/amd64 package builds on a 32-core build server running: FreeBSD vm0.freebsd.org 10.0-CURRENT FreeBSD 10.0-CURRENT #0 r252897: Sat Jul 6 23:16:03 UTC 2013 sbruno@vm0.freebsd.org:/usr/obj/usr/src/sys/VM0 amd64 And I hit a deadlock: Unread portion of the kernel message buffer: panic: deadlkres: possible deadlock detected for 0xfffffe00adc2a920, blocked for 1800101 ticks (kgdb) tid 100874 [Switching to thread 799 (Thread 100874)]#0 sched_switch (td=0xfffffe00adc2a920, newtd=, flags=) at /usr/src/sys/kern/sched_ule.c:1954 1954 cpuid = PCPU_GET(cpuid); (kgdb) bt #0 sched_switch (td=0xfffffe00adc2a920, newtd=, flags=) at /usr/src/sys/kern/sched_ule.c:1954 #1 0xffffffff804e70ee in mi_switch (flags=260, newtd=0x0) 
at /usr/src/sys/kern/kern_synch.c:487 #2 0xffffffff8052150a in sleepq_wait (wchan=0x0, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:620 #3 0xffffffff804c2abc in sleeplk (lk=, flags=524544, ilk=, wmesg=0xffffffff80f1b89a "zfs", pri=, timo=) at /usr/src/sys/kern/kern_lock.c:226 #4 0xffffffff804c22f5 in __lockmgr_args (lk=0xfffffe00ad56a068, flags=, ilk=0xfffffe00ad56a098, wmesg=0xffffffff80f1b89a "zfs", pri=96, timo=51, line=) at /usr/src/sys/kern/kern_lock.c:919 #5 0xffffffff8056a26c in vop_stdlock (ap=) at lockmgr.h:97 #6 0xffffffff80790ded in VOP_LOCK1_APV (vop=, a=) at vnode_if.c:2084 #7 0xffffffff805891a3 in _vn_lock (vp=0xfffffe00ad56a000, flags=, file=0xffffffff807fb89e "/usr/src/sys/kern/vfs_subr.c", line=2099) at vnode_if.h:859 #8 0xffffffff805791aa in vget (vp=0xfffffe00ad56a000, flags=524544, td=0xfffffe00adc2a920) at /usr/src/sys/kern/vfs_subr.c:2099 #9 0xffffffff805664b2 in cache_lookup (dvp=0xfffffe00ad4e1588, vpp=0xffffff9049b29188, cnp=0xffffff9049b295a0, tsp=0x0, ticksp=0x0) at /usr/src/sys/kern/vfs_cache.c:674 #10 0xffffffff80567651 in vfs_cache_lookup (ap=) at /usr/src/sys/kern/vfs_cache.c:1033 #11 0xffffffff8078efa2 in VOP_LOOKUP_APV (vop=, a=) at vnode_if.c:129 #12 0xffffffff8126714b in null_lookup (ap=0xffffff9049b29248) at vnode_if.h:54 #13 0xffffffff8078efa2 in VOP_LOOKUP_APV (vop=, a=) at vnode_if.c:129 #14 0xffffffff8056f6eb in lookup (ndp=0xffffff9049b29520) at vnode_if.h:54 #15 0xffffffff8056ee84 in namei (ndp=0xffffff9049b29520) at /usr/src/sys/kern/vfs_lookup.c:292 #16 0xffffffff80588952 in vn_open_cred (ndp=0xffffff9049b29520, flagp=0xffffff9049b296a0, cmode=0, vn_open_flags=, cred=0xfffffe071c32a900, fp=0x0) at /usr/src/sys/kern/vfs_vnops.c:202 #17 0xffffffff8056a774 in vop_stdvptocnp (ap=) at /usr/src/sys/kern/vfs_default.c:797 #18 0xffffffff81267a1b in null_vptocnp (ap=0xffffff9049b29878) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:824 #19 0xffffffff80792628 in VOP_VPTOCNP_APV (vop=, a=) at vnode_if.c:3649 #20 
0xffffffff80567ee3 in vn_vptocnp_locked (vp=0xffffff9049b29900, cred=0xfffffe071c32a900, buf=0xfffffe00ad708800 "", buflen=0xffffff9049b298fc) at vnode_if.h:1564 #21 0xffffffff80567a02 in vn_fullpath1 (td=0xfffffe00adc2a920, vp=0xfffffe03ec1d5ce8, rdir=0xfffffe071b898760, buf=0xfffffe00ad708800 "", retbuf=0xffffff9049b29960, buflen=1004) at /usr/src/sys/kern/vfs_cache.c:1325 #22 0xffffffff805677b5 in kern___getcwd (td=0xfffffe00adc2a920, buf=0x80dd3d4
, bufseg=UIO_USERSPACE, buflen=Cannot access memory at address 0x400 ) at /usr/src/sys/kern/vfs_cache.c:1089 #23 0xffffffff8076554c in ia32_syscall (frame=0xffffff9049b29ac0) at subr_syscall.c:134 #24 0xffffffff807227a5 in Xint0x80_syscall () at ia32_exception.S:73 #25 0x0000000008072c33 in ?? () Previous frame inner to this frame (corrupt stack?) .. and it's here: (kgdb) sleepchain 100874 thread 100874 (pid 75371, make) blocked on lk "zfs" SHARED (count 2) Now, this system doesn't have witness (yet!), so a bunch more hoops need to be jumped through to figure out what else is blocking on that particular lock. Does anyone have any ideas as to what's going on? Or has it been fixed over the last couple days and I haven't noticed? Thanks! -adrian From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 14:47:47 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2B95151C for ; Tue, 9 Jul 2013 14:47:47 +0000 (UTC) (envelope-from rmh.aybabtu@gmail.com) Received: from mail-qe0-x236.google.com (mail-qe0-x236.google.com [IPv6:2607:f8b0:400d:c02::236]) by mx1.freebsd.org (Postfix) with ESMTP id E835F1453 for ; Tue, 9 Jul 2013 14:47:46 +0000 (UTC) Received: by mail-qe0-f54.google.com with SMTP id ne12so3111039qeb.27 for ; Tue, 09 Jul 2013 07:47:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=3/yYai+KZlHuI1qUWSKdHZThLZdXYbprL7l25MRLPIM=; b=W3zJ6UhEGyzOEhLCKYs34zjHsP8xAN+Km+rtCc0zZ7iYa7K5qJUneHVnnrMSA3mGcQ YWOsskAunxaEMYGN+8E07Gp0/PjJ/ERK/bl+vPBS6SOVFJObNzUkEGopa5L9vpMgJEYy Tkg++Nj3P1zjBA40JQ4BcZGuwwWBAusDNuqPkcI8GEHvhrmA34q5dzhpTW2mz3/ovmGC 4p9QtPV7sDWCYE+0gzeQ9HiaLjpaLEVneiKsqq4bgvkTCMnUG1Y4uNFuXeo5rtTWDXae Ni4wvBvBoBE3lfblLmq7ebbgE9CnH38VSObkyXO+FBBzvN3YYnazyDKULPkjDlXT2EIm QMxg== MIME-Version: 1.0 X-Received: by 
10.224.98.140 with SMTP id q12mr23553911qan.99.1373381266451; Tue, 09 Jul 2013 07:47:46 -0700 (PDT) Sender: rmh.aybabtu@gmail.com Received: by 10.49.26.193 with HTTP; Tue, 9 Jul 2013 07:47:46 -0700 (PDT) In-Reply-To: <20130702000732.GA72587@icarus.home.lan> References: <20130702000732.GA72587@icarus.home.lan> Date: Tue, 9 Jul 2013 16:47:46 +0200 X-Google-Sender-Auth: FSY9sXeBmiZkKPSdgc-F7QAXl7E Message-ID: Subject: Re: Compatibility options for mount(8) From: Robert Millan To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 14:47:47 -0000 Hi Jeremy, 2013/7/2 Jeremy Chadwick : > On Tue, Jul 02, 2013 at 01:11:52AM +0200, Robert Millan wrote: > Minor but are well-justified given quality of code: > > 1. Put "n" in alphabetical order/after "l" and not at the end of the > getopt() string, i.e.: > > while ((ch = getopt(argc, argv, "adF:fLlno:prt:uvw")) != -1) Will do. Thanks for the correction. > 2. Please use strncmp(). I know other parts of the same code use strcmp() > and those should really be improved at some other time, but while you're > already there you might as well use strncmp() (you'll see others have > done the same), i.e.: > > } else if (strncmp(p, "remount", 7) == 0) { What is the rationale behind this? > 3. mount(8) man page should reflect these (IMO). Attached is the diff > for that; wording based off of some other man pages. Thank you. 
-- Robert Millan From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 14:49:04 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C5D5C596 for ; Tue, 9 Jul 2013 14:49:04 +0000 (UTC) (envelope-from rmh.aybabtu@gmail.com) Received: from mail-qc0-x233.google.com (mail-qc0-x233.google.com [IPv6:2607:f8b0:400d:c01::233]) by mx1.freebsd.org (Postfix) with ESMTP id 8E61D1462 for ; Tue, 9 Jul 2013 14:49:04 +0000 (UTC) Received: by mail-qc0-f179.google.com with SMTP id e11so3003592qcx.38 for ; Tue, 09 Jul 2013 07:49:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=xt6/WYgGAKAmviezvwHTQ7tq2Bn4d9q0tE8EOv5xYHs=; b=Q6U7P00i8e/VQLhEyW1ennC+/us1p2MUgMHontMBPMVy5RefOI7wjdhJxBZX86nFDf 0KVrYIMD2p4Q6ml11jDzhybkBqEyFno7Lbh95Lp6lWXiyQvLDi0cpWYhmXngyYbc/bc5 AA374mdn0TnIyJBXw1fB4r+Je7gA2yuizF/n2jzlzvezzp96yb3gKdxsg9dkvj1mMpdJ 1v2Lv+KaVMPE8NTujhcj7wI8JYo40XUhR6il0qTOtoTEyukxVc0Q0eJwrqFH0dDKBHHJ S3AmH6xUeYpZUhpvHO3SHbUaxX99a4ExNDCsBzsVs0ob9pijWJysy9WxBGg0913b8WZV OwOQ== MIME-Version: 1.0 X-Received: by 10.49.59.228 with SMTP id c4mr21046127qer.15.1373381344200; Tue, 09 Jul 2013 07:49:04 -0700 (PDT) Sender: rmh.aybabtu@gmail.com Received: by 10.49.26.193 with HTTP; Tue, 9 Jul 2013 07:49:04 -0700 (PDT) In-Reply-To: <201307020342.r623gOTv012017@chez.mckusick.com> References: <201307020342.r623gOTv012017@chez.mckusick.com> Date: Tue, 9 Jul 2013 16:49:04 +0200 X-Google-Sender-Auth: 4vtwV0ekibCND5QPXRsMWby6-oI Message-ID: Subject: Re: Compatibility options for mount(8) From: Robert Millan To: Kirk McKusick Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 14:49:04 -0000 2013/7/2 Kirk McKusick : >> + append_arg(a, strdup("-o")); >> + append_arg(a, strdup("update")); >> + continue; > > As noted above, I would recommend using "-u". Will do. Thanks for the pointer. -- Robert Millan From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 15:02:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9128E84E; Tue, 9 Jul 2013 15:02:21 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [217.70.183.195]) by mx1.freebsd.org (Postfix) with ESMTP id 4F2FE1625; Tue, 9 Jul 2013 15:02:21 +0000 (UTC) Received: from mfilter16-d.gandi.net (mfilter16-d.gandi.net [217.70.178.144]) by relay3-d.mail.gandi.net (Postfix) with ESMTP id 6CB3AA80D1; Tue, 9 Jul 2013 17:02:04 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter16-d.gandi.net Received: from relay3-d.mail.gandi.net ([217.70.183.195]) by mfilter16-d.gandi.net (mfilter16-d.gandi.net [10.0.15.180]) (amavisd-new, port 10024) with ESMTP id hvmYYwxcxdKr; Tue, 9 Jul 2013 17:01:32 +0200 (CEST) X-Originating-IP: 76.102.14.35 Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net [76.102.14.35]) (Authenticated sender: jdc@koitsu.org) by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 91CA0A80C6; Tue, 9 Jul 2013 17:01:32 +0200 (CEST) Received: by icarus.home.lan (Postfix, from userid 1000) id 1534B73A31; Tue, 9 Jul 2013 08:01:29 -0700 (PDT) Date: Tue, 9 Jul 2013 08:01:29 -0700 From: Jeremy Chadwick To: Robert Millan Subject: Re: Compatibility options for mount(8) Message-ID: <20130709150129.GA8289@icarus.home.lan> References: <20130702000732.GA72587@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org X-BeenThere: 
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 15:02:21 -0000 On Tue, Jul 09, 2013 at 04:47:46PM +0200, Robert Millan wrote: > > 2. Please use strncmp(). I know other parts of the same code use strcmp() > > and those should really be improved at some other time, but while you're > > already there you might as well use strncmp() (you'll see others have > > done the same), i.e.: > > > > } else if (strncmp(p, "remount", 7) == 0) { > > What is the rationale behind this? Primarily security and stability. I won't get into a discussion about this as it'll just bikeshed, particularly when there's an almost indefinite amount of information online about the dangers of strcmp(3). -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 15:22:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 9B5D943C for ; Tue, 9 Jul 2013 15:22:02 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 5BAE717A9 for ; Tue, 9 Jul 2013 15:22:02 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.5/8.14.5) with ESMTP id r69FM0m1026572; Tue, 9 Jul 2013 11:22:00 -0400 (EDT) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.5/8.14.4/Submit) id r69FLxSG026556; Tue, 9 Jul 2013 11:21:59 -0400 (EDT) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: 
<20956.10903.687565.787628@hergotha.csail.mit.edu> Date: Tue, 9 Jul 2013 11:21:59 -0400 From: Garrett Wollman To: Berend de Boer Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <87bo6cyzuo.wl%berend@pobox.com> References: <87ppuszgth.wl%berend@pobox.com> <27783474.3353362.1373334232356.JavaMail.root@uoguelph.ca> <20955.29796.228750.131498@hergotha.csail.mit.edu> <87bo6cyzuo.wl%berend@pobox.com> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Tue, 09 Jul 2013 11:22:01 -0400 (EDT) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on hergotha.csail.mit.edu Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 15:22:02 -0000 < said: Garrett> ----modules/nfs/server/freebsd.pp---- > Where does this go? On your Puppet server. 
-GAWollman From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 16:57:03 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 69F0C7E9; Tue, 9 Jul 2013 16:57:03 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id D5E551D0C; Tue, 9 Jul 2013 16:57:02 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id r69Gux9a077444; Tue, 9 Jul 2013 19:56:59 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua r69Gux9a077444 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id r69GuxkU077443; Tue, 9 Jul 2013 19:56:59 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 9 Jul 2013 19:56:59 +0300 From: Konstantin Belousov To: Robert Millan Subject: Re: Compatibility options for mount(8) Message-ID: <20130709165658.GO91021@kib.kiev.ua> References: <20130702000732.GA72587@icarus.home.lan> <20130709150129.GA8289@icarus.home.lan> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="g7MhQE0aLFS2f2WM" Content-Disposition: inline In-Reply-To: <20130709150129.GA8289@icarus.home.lan> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 16:57:03 -0000 --g7MhQE0aLFS2f2WM Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jul 09, 2013 at 08:01:29AM -0700, Jeremy Chadwick wrote: > On Tue, Jul 09, 2013 at 04:47:46PM +0200, Robert Millan wrote: > > > 2. Please use strncmp(). I know other parts of the same code use strcmp() > > > and those should really be improved at some other time, but while you're > > > already there you might as well use strncmp() (you'll see others have > > > done the same), i.e.: > > > > > > } else if (strncmp(p, "remount", 7) == 0) { > > > > What is the rationale behind this? > > Primarily security and stability. I won't get into a discussion about > this as it'll just bikeshed, particularly when there's an almost > indefinite amount of information online about the dangers of strcmp(3). Robert, please ignore this. The person does not know what he talks about. The use of strncmp() is plain wrong. E.g., it would match "remount1" as well as any longer option starting with "remount". Original patch is fine. 
--g7MhQE0aLFS2f2WM Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iQIcBAEBAgAGBQJR3EDaAAoJEJDCuSvBvK1B048P/RxRZHYXHbofTOyg0zJx7ORl xi+qaeHQACvjMAwQSUsmCQLb+zz6bN5fYuB26aXRoWjmKkUhVzcLpb/9vAcIXZ46 JPxbNMnR817dJ+pCqt/pX0LTKAGJX9QIdwoEfG9w0G82VOOf8/G1X9Q4BK0YJj4x 6W/gCqd5Ps5j3TewQAg+NHtGq0xFVPCGHEM7oM91xONSVHm7IWRatOMV7ph8oU4d HsFt/BREsWX+5TcC0zL1FWi+PFt0OQi5gDBBAVid7+9iCe8/oDkEUBCTy5VPJJLc gfvi8RefZJtODFzbUes8l89cJMWktKohISupQQ+jtaGvSLg/8PjyyONqeZcU0fIU HooFHvtIMlo8cYTbu7ubetUfANZrhEoXj8rYe5aUE2ZZV1uRPq92xWcKS8w8fBkk p89olpxcufwkKWpzgfH0lSSw/OZdobOuZhQOhnh4+EulY0MdVRxEJuCWHm0PuoP5 dPpILPtYolIYhJo1nOfI20v50O6XD//kItlAD7fpRCb7KWjlvItCzYGBnugiRvBl pK4o+PIHmn6QVtorQYlxaRLFbBURdFz3vzLWAUPO7mMsH3YOzG9GBSstvELqUBbK J+KBceG9hnO7r3vCXE9595J3hZIKEl9jPkM0X5WVOPNl4tJjThjFsyXbK1vqkIBj UbBKncSUN9LiYdv/M9ob =ekgr -----END PGP SIGNATURE----- --g7MhQE0aLFS2f2WM-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 17:11:15 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D3A85EBF; Tue, 9 Jul 2013 17:11:15 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [217.70.183.195]) by mx1.freebsd.org (Postfix) with ESMTP id 87A821DD9; Tue, 9 Jul 2013 17:11:15 +0000 (UTC) Received: from mfilter17-d.gandi.net (mfilter17-d.gandi.net [217.70.178.145]) by relay3-d.mail.gandi.net (Postfix) with ESMTP id 43E72A80C7; Tue, 9 Jul 2013 19:11:04 +0200 (CEST) X-Virus-Scanned: Debian amavisd-new at mfilter17-d.gandi.net Received: from relay3-d.mail.gandi.net ([217.70.183.195]) by mfilter17-d.gandi.net (mfilter17-d.gandi.net [10.0.15.180]) (amavisd-new, port 10024) with ESMTP id k8Mb3Zfy5nLF; Tue, 9 Jul 2013 19:11:02 +0200 (CEST) X-Originating-IP: 76.102.14.35 Received: from jdc.koitsu.org (c-76-102-14-35.hsd1.ca.comcast.net [76.102.14.35]) (Authenticated sender: 
jdc@koitsu.org) by relay3-d.mail.gandi.net (Postfix) with ESMTPSA id 5D029A80DD; Tue, 9 Jul 2013 19:11:02 +0200 (CEST) Received: by icarus.home.lan (Postfix, from userid 1000) id 895E773A31; Tue, 9 Jul 2013 10:11:00 -0700 (PDT) Date: Tue, 9 Jul 2013 10:11:00 -0700 From: Jeremy Chadwick To: Konstantin Belousov Subject: Re: Compatibility options for mount(8) Message-ID: <20130709171100.GA10423@icarus.home.lan> References: <20130702000732.GA72587@icarus.home.lan> <20130709150129.GA8289@icarus.home.lan> <20130709165658.GO91021@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130709165658.GO91021@kib.kiev.ua> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 17:11:15 -0000 On Tue, Jul 09, 2013 at 07:56:59PM +0300, Konstantin Belousov wrote: > On Tue, Jul 09, 2013 at 08:01:29AM -0700, Jeremy Chadwick wrote: > > On Tue, Jul 09, 2013 at 04:47:46PM +0200, Robert Millan wrote: > > > > 2. Please use strncmp(). I know other parts of the same code use strcmp() > > > > and those should really be improved at some other time, but while you're > > > > already there you might as well use strncmp() (you'll see others have > > > > done the same), i.e.: > > > > > > > > } else if (strncmp(p, "remount", 7) == 0) { > > > > > > What is the rationale behind this? > > > > Primarily security and stability. I won't get into a discussion about > > this as it'll just bikeshed, particularly when there's an almost > > indefinite amount of information online about the dangers of strcmp(3). > > Robert, please ignore this. The person does not know what he talks about. > > The use of strncmp() is plain wrong. E.g., it would match "remount1" > as well as any longer option starting with "remount". 
Original patch > is fine. kib@, thanks for correcting me -- you're absolutely right in this case. I was looking at the mountprog/userquota=/groupquota= examples and did not notice the use of strsep(3) within the while(). So yes, use of strncmp(3) in this case is completely wrong. My apologies. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 23:38:08 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 25FFCCC8 for ; Tue, 9 Jul 2013 23:38:08 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id E290B15C2 for ; Tue, 9 Jul 2013 23:38:07 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=u+Bwc9JL7tMNtl/i9xObSTPSFclN5AOtXcIZY5dPsHA= c=1 sm=2 a=4x594vOIrDwA:10 a=FKkrIqjQGGEA:10 a=V5z4IuhVU5kA:10 a=IkcTkHD0fZMA:10 a=ybZZDoGAAAAA:8 a=GzJd4s-eAAAA:8 a=9Pf269GSlW1ixed9Z58A:9 a=QEXdDO2ut3YA:10 a=qIVjreYYsbEA:10 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEAD6e3FGDaFve/2dsb2JhbABbgztNgwi+C4ErdIIjAQEEASNWBRYYAgINGQJZBhOICQapA5EaBIEmjhE0B4JWgR4DqR2DLSCBbA X-IronPort-AV: E=Sophos;i="4.87,1031,1363147200"; d="scan'208";a="39661795" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 09 Jul 2013 19:38:01 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 1E23DB408F; Tue, 9 Jul 2013 19:38:01 -0400 (EDT) Date: Tue, 9 Jul 2013 19:38:01 -0400 (EDT) From: Rick Macklem To: Berend de Boer Message-ID: <818900293.3878290.1373413081112.JavaMail.root@uoguelph.ca> In-Reply-To: <877gh0yvhm.wl%berend@pobox.com> Subject: Re: 
Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.203] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 23:38:08 -0000 Berend de Boer wrote: > >>>>> "Berend" == Berend de Boer writes: > > Berend> 1. Does not affect nfs3. > > One update: with zfs sync=standard, I can get nfs3 to perform as badly > as the nfs4+patch. > Hmm, this is interesting. ken@'s file handle affinity patch works for NFSv3, but not NFSv4. If I understood his posts correctly, the fh affinity patch was needed so that ZFS's heuristic for recognizing sequential reading would function correctly. (A file handle affinity patch for NFSv4 will take some time, since all RPCs in NFSv4 are compounds, with reads/writes embedded in them, along with other ops.) If you could do some testing where you export a UFS volume, the results might help to isolate the issue to ZFS vs nfsd. > 1. without nolock, without async, sync=standard: 8m21s > > 2. with nolock, without async, sync=standard: 5m37s > > 3. without nolock, with async, sync=standard: 4m56s. > > 4. with nolock, with async, sync=standard: 4m23s. > > 5. without nolock, without async, sync=disabled: 1m57. > > > PS: the nfs4 test was done with sync=disabled. > W.r.t. CPU overheads and the size of vfs.nfsd.tcphighwater: NFSRVCACHE_HASHSIZE was increased to 500 by the patch. 500 seems about right for a vfs.nfsd.tcphighwater of a few thousand. If you are going to use a very large value (100000), I'd suggest you increase NFSRVCACHE_HASHSIZE to something like 10000. (You have to edit sys/fs/nfs/nfsrvcache.h and recompile to change this.) 
I'd suggest you try something like: vfs.nfsd.tcphighwater=5000 vfs.nfsd.tcpcachetimeo=300 (5 minutes instead of 12hrs) Also, I can't remember if you've bumped up the # of nfsd threads, but I'd go for 256. nfs_server_flags="-u -t -n 256" - in your /etc/rc.conf I never use ZFS, so I can't help w.r.t. ZFS tuning, rick > > -- > All the best, > > Berend de Boer > > > ------------------------------------------------------ > Awesome Drupal hosting: https://www.xplainhosting.com/ > From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 23:53:55 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CBE1125C for ; Tue, 9 Jul 2013 23:53:55 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 93F121761 for ; Tue, 9 Jul 2013 23:53:55 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=u+Bwc9JL7tMNtl/i9xObSTPSFclN5AOtXcIZY5dPsHA= c=1 sm=2 a=ctSXsGKhotwA:10 a=FKkrIqjQGGEA:10 a=V5z4IuhVU5kA:10 a=IkcTkHD0fZMA:10 a=GzJd4s-eAAAA:8 a=wAD_S4EvPvu0p8q3YE0A:9 a=QEXdDO2ut3YA:10 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqIEAPGh3FGDaFve/2dsb2JhbABbgztNgwi+C4ErdIIjAQEEASMEUgUWGAICDRkCWQYTiAkGqHKRGgSBJo0DgQ40B4JWgR4DqR2DLSCBNTc X-IronPort-AV: E=Sophos;i="4.87,1031,1363147200"; d="scan'208";a="39663632" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 09 Jul 2013 19:53:54 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 9A184B405E; Tue, 9 Jul 2013 19:53:54 -0400 (EDT) Date: Tue, 9 Jul 2013 19:53:54 -0400 (EDT) From: Rick Macklem To: Berend de Boer Message-ID: <1212484294.3883594.1373414034622.JavaMail.root@uoguelph.ca> In-Reply-To:
<87a9lwyy16.wl%berend@pobox.com> Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 23:53:56 -0000 Berend de Boer wrote: > >>>>> "Rick" == Rick Macklem writes: > > Rick> After you > Rick> apply the patch and boot the rebuilt kernel, the cpu > Rick> overheads should be reduced after you increase the value of > Rick> vfs.nfsd.tcphighwater. > > OK, completely disregard my previous email. I actually was testing > against a server in a different data centre, didn't think it would > matter too much, but clearly it does (ping times 2-3 times higher). > > So moved server + disks into the same data centre as the nfs client. > > 1. Does not effect nfs3. > > 2. When I do not set vfs.nfsd.tcphighwater, I get a "Remote I/O > error" > on the client. On server I see: > > nfsd server cache flooded, try to increase nfsrc_floodlevel > > (this just FYI). > > 3. With vfs.nfsd.tcphighwater set to 150,000. I get very high cpu, > 50%. > The patch I sent you does not tune nfsrc_floodlevel based on what you set vfs.nfsd.tcphighwater to. That needs to be added to the patch. (I had some code that did this, but others recommended that it should be done as a part of the sysctl, but I haven't gotten around to coding that.) --> For things to work ok, vfs.nfsd.tcphighwater needs to be less than nfsrc_floodlevel (which is 16384). *** Again, I'd recommend setting vfs.nfsd.tcphighwater=5000 to 10000, which is well under 16384 and for which a hash table size of 500 should be ok. 
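The constraint above can be written down as a small sanity check. This is only a sketch: the function name is made up for illustration, and 16384 is the nfsrc_floodlevel value quoted above, hard-coded here instead of being read from a running kernel.

```python
NFSRC_FLOODLEVEL = 16384  # value quoted in the text; not read from the kernel


def check_tcphighwater(tcphighwater: int, floodlevel: int = NFSRC_FLOODLEVEL) -> bool:
    """Return True if the proposed tcphighwater stays below the flood level."""
    return 0 <= tcphighwater < floodlevel


# Candidate settings mentioned in this thread.
for value in (5000, 10000, 150000):
    status = "ok" if check_tcphighwater(value) else "not below nfsrc_floodlevel"
    print(f"vfs.nfsd.tcphighwater={value}: {status}")
```

With the unmodified flood level, 5000 and 10000 pass, while 150000 does not.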
Believe it or not, this server was developed about 10 years ago on a PIII with 32 (no, not Gbytes, but Mbytes) of RAM. The sizing worked well for that hardware, but is obviously a bit small for newer hardware;-) > Performance is now about 8m15s. Which is better, but still twice > above a lower spec Linux NFS4 server, and four times slower than > nfs3 on the same box. > > 4. With Garrett's settings, I looked at when the cpu starts to > increase. It starts slow, but raises quickly to 50% in about 1 > minute. > I think his code uses a nfsrc_floodlevel tuned based on vfs.nfsd.tcphighwater and I suspect a much larger hash table size, too. > Time was similar 7m54s. > One other thing you can try is enabling delegations. On the server: vfs.nfsd.issue_delegations=1 > 5. I lowered vfs.nfsd.tcphighwater to 10,000 but then it actually > became worse, cpu quickly went to 70%, i.e. not much difference > with FreeBSD without patch. Didn't keep this test running to see > if > it became slower over time. > > Making it 300,000 seems that the cpu increases are slower (but it > keeps rising). > > So from what I observe from the patch is that it makes the rise in > cpu increase slower, but doesn't stop it. I.e. after a few > minutes, > even with setting 300,000 the cpu is getting to 50%, but dropped a > bit after a while to hover around 40%. Then it crept back to over > 50%. > > 6. So the conclusion is: this patch helps somewhat, but nfs4 > behaviour > is still majorly impaired compared to nfs3. > Well, reading and writing is the same for NFSv4 as NFSv3, except there isn't any file handle affinity support for NFSv4 (ties a set of nfsd thread(s) to reading/writing of a file). File handle affinity results in a more sequential series of VOP_READ()/VOP_WRITE() calls to the server file system. The other big difference between NFSv3 and NFSv4 are the Open operations. The only way to reduce the # of these done may be enabling delegations. How much effect this has depends on the client. 
rick > > -- > All the best, > > Berend de Boer > > > ------------------------------------------------------ > Awesome Drupal hosting: https://www.xplainhosting.com/ > > From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 23:57:03 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1F9BD426 for ; Tue, 9 Jul 2013 23:57:03 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id DBA93179E for ; Tue, 9 Jul 2013 23:57:02 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=ME3lrcP4jFDzpPiCSQywCMKJiHtpRWeRXBDIYmR1BZg= c=1 sm=2 a=ctSXsGKhotwA:10 a=FKkrIqjQGGEA:10 a=V5z4IuhVU5kA:10 a=IkcTkHD0fZMA:10 a=pkkU0Bg7WzlNbPfhUykA:9 a=QEXdDO2ut3YA:10 a=fi6rhVxsa7yJvVoJ:21 a=BcIZAEyNbUJ55jHW:21 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqAEADOi3FGDaFve/2dsb2JhbABbhAiDCL4LgSt0giMBAQUjVhsYAgINGQJZBhOID6hykR6BJo4RNAeCVoEeA5QBlRyDLSCBbA X-IronPort-AV: E=Sophos;i="4.87,1031,1363147200"; d="scan'208";a="39048264" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 09 Jul 2013 19:57:02 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 16E06B3FAC; Tue, 9 Jul 2013 19:57:02 -0400 (EDT) Date: Tue, 9 Jul 2013 19:57:02 -0400 (EDT) From: Rick Macklem To: Garrett Wollman Message-ID: <74469452.3886197.1373414222081.JavaMail.root@uoguelph.ca> In-Reply-To: <20955.29796.228750.131498@hergotha.csail.mit.edu> Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: 
freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 23:57:03 -0000 Garrett Wollman wrote: > < said: > > > Berend de Boer wrote: > >> >>>>> "Rick" == Rick Macklem writes: > >> > Rick> After you apply the patch and boot the rebuilt kernel, the > Rick> cpu overheads should be reduced after you increase the > >> value > Rick> of vfs.nfsd.tcphighwater. > >> > >> What number would I be looking at? 100? 100,000? > >> > > Garrett Wollman might have more insight into this, but I would say > > on > > the order of 100s to maybe 1000s. > > On my production servers, I'm running with the following tuning > (after Rick's drc4.patch): > > ----loader.conf---- > kern.ipc.nmbclusters="1048576" > vfs.zfs.scrub_limit="16" > vfs.zfs.vdev.max_pending="24" > vfs.zfs.arc_max="48G" > # > # Tunable per mps(4). We had significant numbers of allocation > failures > # with the default value of 2048, so bump it up and see whether > there's > # still an issue. > # > hw.mps.max_chains="4096" > # > # Simulate the 10-CURRENT autotuning of maxusers based on available > memory > # > kern.maxusers="8509" > # > # Attempt to make the message buffer big enough to retain all the > crap > # that gets spewed on the console when we boot. 64K (the default) > isn't > # enough to even list all of the disks. > # > kern.msgbufsize="262144" > # > # Tell the TCP implementation to use the specialized, faster but > possibly > # fragile implementation of soreceive. NFS calls soreceive() a lot > and > # using this implementation, if it works, should improve performance > # significantly. > # > net.inet.tcp.soreceive_stream="1" > # > # Six queues per interface means twelve queues total > # on this hardware, which is a good match for the number > # of processor cores we have.
> # > hw.ixgbe.num_queues="6" > > ----sysctl.conf---- > # Make sure that device interrupts are not throttled (10GbE can make > # lots and lots of interrupts). > hw.intr_storm_threshold=12000 > > # If the NFS replay cache isn't larger than the number of operations > nfsd > # can perform in a second, the nfsd service threads will spend all of > their > # time contending for the mutex that protects the cache data > structure so > # that they can trim them. If the cache is big enough, it will only > do this > # once a second. > vfs.nfsd.tcpcachetimeo=300 > vfs.nfsd.tcphighwater=150000 > > ----modules/nfs/server/freebsd.pp---- > exec {'sysctl vfs.nfsd.minthreads': > command => "sysctl vfs.nfsd.minthreads=${min_threads}", > onlyif => "test $(sysctl -n vfs.nfsd.minthreads) -ne > ${min_threads}", > require => Service['nfsd'], > } > > exec {'sysctl vfs.nfsd.maxthreads': > command => "sysctl vfs.nfsd.maxthreads=${max_threads}", > onlyif => "test $(sysctl -n vfs.nfsd.maxthreads) -ne > ${max_threads}", > require => Service['nfsd'], > } > > ($min_threads and $max_threads are manually configured based on > hardware, currently 16/64 on 8-core machines and 16/96 on 12-core > machines.) > > As this is the summer, we are currently very lightly loaded. There's > apparently still a bug in drc4.patch, because both of my non-scratch > production servers show a negative CacheSize in nfsstat. > > (I hope that all of these patches will make it into 9.2 so we don't > have to maintain our own mutant NFS implementation.) > Afraid not. I was planning on getting it in, but the release schedule appeared with a short time to code slush. Hopefully a cleaned up version of this will be in 10.0 and 9.3. 
rick > -GAWollman > > From owner-freebsd-fs@FreeBSD.ORG Tue Jul 9 23:57:55 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 002A64B7 for ; Tue, 9 Jul 2013 23:57:54 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id B2F0D17AF for ; Tue, 9 Jul 2013 23:57:54 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id D97F62F4C1; Tue, 9 Jul 2013 23:57:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=JZE5LshtvPP2EyFZbH0IKLyIyqc=; b=eLmbrYZuCC1pWXd9HoStx3t8IZEI 2M2Iwd+J2WW6iQoGexJ/Su8c4hubk2TiOw9kwYrk96jbvRf6H9jPjG3qnjntSBbO XyWaCzKwkZZQk2Uryz9wvFzpoGICbOV5EJsY4ueb4Ezo4m9pQwBpMsc+eTSV0zWK FPDg6PeVffi6gCQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=uUGM1k 639Y6UrC8sHNj7Z5GtH7Mzp25K99ELUfs4ULfmbLtl3izv5FGiyjQNF5FUgezlU0 RZCbPxXgfzPtCjHBw0Z6MxyDBfa7oe5sFw+o5YbDkw7P921KJmAoARqA5lI6SOo0 0UsvgqSekBT2t/+zgOpOf4KMpNuldi7WLwdzY= Received: from b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id CF9422F4BF; Tue, 9 Jul 2013 23:57:47 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 56EFA2F4BE; Tue, 9 Jul 2013 23:57:47 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 8031D5C89; Wed, 10 Jul 2013 11:57:40 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) 
with ESMTP id C74E849FB97C; Wed, 10 Jul 2013 11:57:44 +1200 (NZST) Date: Wed, 10 Jul 2013 11:57:44 +1200 Message-ID: <87k3kzxp53.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + ZFS + AWS EC2 In-Reply-To: <818900293.3878290.1373413081112.JavaMail.root@uoguelph.ca> References: <877gh0yvhm.wl%berend@pobox.com> <818900293.3878290.1373413081112.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Wed_Jul_10_11:57:44_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: 5ADCB16A-E8F3-11E2-BEBA-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 09 Jul 2013 23:57:55 -0000 --pgp-sign-Multipart_Wed_Jul_10_11:57:44_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Rick" == Rick Macklem writes: Rick> Hmm, this is interesting. ken@'s file handle affinity patch Rick> works for NFSv3, but not NFSv4. If I understood his posts Rick> correctly, the fh affinity patch was needed, so ZFS's Rick> heuristic for recognizing sequential reading would function Rick> correctly. (A file handle affinity patch for NFSv4 will take Rick> some time, since all RPCs in NFSv4 are compounds, with Rick> reads/writes imbedded in them, along with other ops.) These are all very small files, not much large, so it's not some kind of sequential reading/writing test. Just thousands of small files being written, and a fair amount of reads. 
Rick> Also, I can't remember if you've bumped up the # of nfsd Rick> threads, but I'd go for 256. nfs_server_flags="-u -t -n Rick> 256" - in your /etc/rc.conf Didn't. But only one client was writing, so I figured that shouldn't matter. -- All the best, Berend de Boer ------------------------------------------------------ Awesome Drupal hosting: https://www.xplainhosting.com/ --pgp-sign-Multipart_Wed_Jul_10_11:57:44_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR3KN4AAoJEKOfeD48G3g576cP/3A0slrBgVVBqJ82XaNTOSd7 oMWOXe5RyK3YxrlbpfjfziGZkeCIVQSp/ip1gqZKABYHaGi+UD0Hw6apZNuyeuDa miHlYLBMijzqh8O1E0rJ72rrbdhlD4t+9J0jCjCwoGIz69WToz1ckUyW/y0iL+lq nkwMzi3JwXo0vO3RtIY4x0DBMyqjNH0LR3ROHLemz7CxoixJibWnRpLnrJO7Zx/y tUQo5l3CI4/EQQ6qCK0y9Hb27MJ3u4fjIJ3XgRpHdsekFDk32JpCzQ2mPeC8XmQx /Ftu59lk+hf5V2e93vmdV7mzLpsrW4IIu1A9WVXsVCkbzxaNX76f7Oo3/6tjvGUV R1JenQUtj5SKhPa4o80KSN24OEdaLhM5XPa1BisCy8un+4lgFZe6OBGLjZB0+MQ8 W1H8tWgZYHcPA4KMWLWWF5uNibZTSp8FzJyg6Z+H+NOwVagvokhdy45Q1UXa0J1q DnJsoQGpWCa6nDCEmLMBg5AezNYVPbLENtGn20uov0yCIMtSp0N4Yv8G6Dt4ntPL pRnkaL3+cmN5lrXT28BqIZBT2VeO5UczbitSpe/kRHbJKCfuox8/Qa2q/B6SFpXh Br+MEAWCkPru/w3mCqs17AhUpAnkoIjOirT+Pa40Hw/dl6GW4CFRynbpZGKkT0IC bFyq80l8ZjjaFsNFq2QZ =g89k -----END PGP SIGNATURE----- --pgp-sign-Multipart_Wed_Jul_10_11:57:44_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 03:03:31 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B1D762F2 for ; Wed, 10 Jul 2013 03:03:31 +0000 (UTC) (envelope-from jiansong.liu@gmail.com) Received: from mail-wg0-x231.google.com (mail-wg0-x231.google.com [IPv6:2a00:1450:400c:c00::231]) by mx1.freebsd.org (Postfix) with ESMTP id 5223B1EE0 for ; Wed, 10 Jul 2013 03:03:31 +0000 (UTC) Received: by mail-wg0-f49.google.com with SMTP id a12so5359922wgh.4 
for ; Tue, 09 Jul 2013 20:03:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:from:date:message-id:subject:to:content-type; bh=OvqfK0DDQZRXTG1TUTBK9SKm/jI3kvbbBg0LF/Z3FaM=; b=JN1FovnlQOrUKYcmrsqRUk/Zzx2LR0QbUbZfMPpzCCgokeFlqwe6KyN8qgg5R4iAJ4 cBZeE/6Gvi/9pI/n7EKKDqwu5HLDYhur32L5u5CRpX7WWCGkgMw5MOG5J/BtfnV03UfC AKY7fD4+rAoBZgBoR6502co6EZxh5ArG3s8BPzy1idcxV1YVmzzE+Xa/57d9tfL3mJod 2By57gHemDRp9Gii3R4bMlo6+Q1q+NFN0lZDdd9dNG+XtpBYLLdkvKTeKP/HVYBms7Fl XSm5TbYG29VVYsMBXaP2DqoWKll6wObNrR9jcPQyfyQ1JjJQ54N5vqjp0Cf8jxm8uyzS UVpw== X-Received: by 10.194.60.5 with SMTP id d5mr17139072wjr.26.1373425410398; Tue, 09 Jul 2013 20:03:30 -0700 (PDT) MIME-Version: 1.0 Received: by 10.194.14.4 with HTTP; Tue, 9 Jul 2013 20:03:15 -0700 (PDT) From: Jiansong Liu Date: Wed, 10 Jul 2013 11:03:15 +0800 Message-ID: Subject: zpool import -D failed, "guid mismatch for provider /dev/da#:" To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 03:03:31 -0000 Hi All, I destroyed a pool and try to recovery with the "zpool import" command failed and it says have no pool to import, then I realized that I missed the "-D", so I run the command "zpool import -D", now it says UNAVAIL, the pool has six vdev (da0 da1 da2 da3 da5 da6) original: # zpool import -D pool: storage id: 8511691845980256432 state: UNAVAIL (DESTROYED) status: One or more devices are missing from the system. action: The pool cannot be imported. Attach the missing devices and try again. 
see: http://illumos.org/msg/ZFS-8000-3C config: storage UNAVAIL insufficient replicas raidz2-0 UNAVAIL insufficient replicas 8368872959405194221 UNAVAIL cannot open 16925320148488343503 UNAVAIL cannot open 2721065418012152096 UNAVAIL cannot open 1488947662741999881 UNAVAIL cannot open 16956133848943560671 UNAVAIL cannot open 7236613667503893647 UNAVAIL cannot open Every time I run the "zpool import -D", the zfs debug will output a error for every vdev member, seems the vdev returned a wrong guid: vdev_geom_open_by_path:550[1]: Found provider by name /dev/da6. vdev_geom_attach:97[1]: Attaching to da6. vdev_geom_attach:118[1]: Created geom and consumer for da6. vdev_geom_read_config:243[1]: Reading config from da6... vdev_geom_detach:158[1]: Closing access to da6. vdev_geom_detach:162[1]: Destroyed consumer to da6. vdev_geom_detach:170[1]: Destroyed geom zfs::vdev. vdev_geom_open_by_path:562[1]: guid mismatch for provider /dev/da6: 7236613667503893647 != 0. vdev_geom_open_by_guid:518[1]: Searching by guid [7236613667503893647]. vdev_geom_read_config:243[1]: Reading config from da4s1g... vdev_geom_read_config:243[1]: Reading config from da4s1f... vdev_geom_read_config:243[1]: Reading config from da4s1e... vdev_geom_read_config:243[1]: Reading config from da4s1d... vdev_geom_read_config:243[1]: Reading config from da4s1b... vdev_geom_read_config:243[1]: Reading config from da4s1a... vdev_geom_read_config:243[1]: Reading config from da4s1... vdev_geom_read_config:243[1]: Reading config from da6... vdev_geom_read_config:243[1]: Reading config from da5... vdev_geom_read_config:243[1]: Reading config from da4... vdev_geom_read_config:243[1]: Reading config from da3... vdev_geom_read_config:243[1]: Reading config from da2... vdev_geom_read_config:243[1]: Reading config from da1... vdev_geom_read_config:243[1]: Reading config from da0... vdev_geom_open_by_guid:532[1]: Search by guid [7236613667503893647] failed. vdev_geom_open:617[1]: Provider /dev/da6 not found. 
the system version is 9-STABLE r250636 any comment and advice are appreciated, thanks in advance. Best regards, Jiansong Liu From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 04:06:50 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 59E04A89 for ; Wed, 10 Jul 2013 04:06:50 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id DB1601149 for ; Wed, 10 Jul 2013 04:06:49 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 26D2247E21; Wed, 10 Jul 2013 06:00:51 +0200 (CEST) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.3 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.0.2] (c38-073.client.duna.pl [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id B745747E16 for ; Wed, 10 Jul 2013 06:00:51 +0200 (CEST) Message-ID: <51DCDC6C.2020205@platinum.linux.pl> Date: Wed, 10 Jul 2013 06:00:44 +0200 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: zpool import -D failed, "guid mismatch for provider /dev/da#:" References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 04:06:50 -0000 It fails because of a check at http://fxr.watson.org/fxr/source/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c?im=10#L272 state == POOL_STATE_DESTROYED will be true and no config will be read, then 
http://fxr.watson.org/fxr/source/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c?im=10#L445 returns 0 for guid. removing that check should make the pool importable On 2013-07-10 05:03, Jiansong Liu wrote: > Hi All, > > I destroyed a pool and try to recovery with the "zpool import" command > failed and it says have no pool to import, then I realized that I > missed the "-D", so I run the command "zpool import -D", now it says > UNAVAIL, the pool has six vdev (da0 da1 da2 da3 da5 da6) original: > > # zpool import -D > pool: storage > id: 8511691845980256432 > state: UNAVAIL (DESTROYED) > status: One or more devices are missing from the system. > action: The pool cannot be imported. Attach the missing > devices and try again. > see: http://illumos.org/msg/ZFS-8000-3C > config: > > storage UNAVAIL insufficient replicas > raidz2-0 UNAVAIL insufficient replicas > 8368872959405194221 UNAVAIL cannot open > 16925320148488343503 UNAVAIL cannot open > 2721065418012152096 UNAVAIL cannot open > 1488947662741999881 UNAVAIL cannot open > 16956133848943560671 UNAVAIL cannot open > 7236613667503893647 UNAVAIL cannot open > > Every time I run the "zpool import -D", the zfs debug will output a > error for every vdev member, seems the vdev returned a wrong guid: > > vdev_geom_open_by_path:550[1]: Found provider by name /dev/da6. > vdev_geom_attach:97[1]: Attaching to da6. > vdev_geom_attach:118[1]: Created geom and consumer for da6. > vdev_geom_read_config:243[1]: Reading config from da6... > vdev_geom_detach:158[1]: Closing access to da6. > vdev_geom_detach:162[1]: Destroyed consumer to da6. > vdev_geom_detach:170[1]: Destroyed geom zfs::vdev. > vdev_geom_open_by_path:562[1]: guid mismatch for provider /dev/da6: > 7236613667503893647 != 0. > vdev_geom_open_by_guid:518[1]: Searching by guid [7236613667503893647]. > vdev_geom_read_config:243[1]: Reading config from da4s1g... > vdev_geom_read_config:243[1]: Reading config from da4s1f... 
> vdev_geom_read_config:243[1]: Reading config from da4s1e... > vdev_geom_read_config:243[1]: Reading config from da4s1d... > vdev_geom_read_config:243[1]: Reading config from da4s1b... > vdev_geom_read_config:243[1]: Reading config from da4s1a... > vdev_geom_read_config:243[1]: Reading config from da4s1... > vdev_geom_read_config:243[1]: Reading config from da6... > vdev_geom_read_config:243[1]: Reading config from da5... > vdev_geom_read_config:243[1]: Reading config from da4... > vdev_geom_read_config:243[1]: Reading config from da3... > vdev_geom_read_config:243[1]: Reading config from da2... > vdev_geom_read_config:243[1]: Reading config from da1... > vdev_geom_read_config:243[1]: Reading config from da0... > vdev_geom_open_by_guid:532[1]: Search by guid [7236613667503893647] failed. > vdev_geom_open:617[1]: Provider /dev/da6 not found. > > the system version is 9-STABLE r250636 > any comment and advice are appreciated, thanks in advance. > > > Best regards, > Jiansong Liu > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 04:41:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 41AE7F3F for ; Wed, 10 Jul 2013 04:41:21 +0000 (UTC) (envelope-from sowmya@cloudbyte.co) Received: from mail-wi0-f182.google.com (mail-wi0-f182.google.com [209.85.212.182]) by mx1.freebsd.org (Postfix) with ESMTP id D221D122F for ; Wed, 10 Jul 2013 04:41:20 +0000 (UTC) Received: by mail-wi0-f182.google.com with SMTP id m6so5882609wiv.9 for ; Tue, 09 Jul 2013 21:41:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type :x-gm-message-state; 
bh=cAsPo4aKrh9sHqFOloAcirZIZs08HAlCFA5YwNfmEN8=; b=Sh7qNliflcW+pOFiBwv3fWw6vs248jSFnApmWRSAHFKNwaRUKw9ywQiqQLCvRMeGCy 5keWGDymRNEBufMr6GvLmYOlohh0kgamjLnx3F+cryqpS4M6SpDcCLdeKKPwwOg/Pg19 eYXGzQlbjGt96iE8/xAWp+mrgg6RZwDgrldQuRsqvNUv7ndKM4CQYe2vdhUPSRjcaIDJ szt09nVNCFXKt9h/xzVD96Ee7f1lFHI7AhNw1waECNTHmEYrQLVFvoGWvVzgAh3DQb+J LbFCM7S4cliyl9euVrg2h++482cBo2Q7YEFZzx/01g8es0Fv3KZ7frjgiEnWUYTEdvjR JMhA== MIME-Version: 1.0 X-Received: by 10.194.48.116 with SMTP id k20mr17164260wjn.23.1373430938654; Tue, 09 Jul 2013 21:35:38 -0700 (PDT) Received: by 10.217.142.204 with HTTP; Tue, 9 Jul 2013 21:35:38 -0700 (PDT) Date: Wed, 10 Jul 2013 10:05:38 +0530 Message-ID: Subject: I/O error on the pool created using g_multipath device From: Sowmya L To: freebsd-fs@freebsd.org X-Gm-Message-State: ALoCoQksOeA9WhunNMGXlOYJK673BVXoYC1Qa8XiV0zb/6LdB/AaJJ81932i4xQLhEPObpRunmqr Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 04:41:21 -0000 Hi, i am finding read/write errors on the pool when the active path cable is pulled from jbod while running I/O on the pool. *freebsd version* : 9.0 *patches taken from stable 9 are* : MFC r234415 MFC r227464 , r227471 *g_multipath configuration:* Geom name: newdisk2 Type: AUTOMATIC Mode: Active/Passive UUID: 1ea053ef-e4a5-11e2-9887-00e0ed158a78 State: OPTIMAL Providers: 1. Name: multipath/newdisk2 Mediasize: 2000398933504 (1.8T) Sectorsize: 512 Mode: r0w0e0 State: OPTIMAL Consumers: 1. Name: da0 Mediasize: 2000398934016 (1.8T) Sectorsize: 512 Mode: r1w1e1 State: ACTIVE 2. Name: da6 Mediasize: 2000398934016 (1.8T) Sectorsize: 512 Mode: r1w1e1 State: PASSIVE Geom name: newdisk1 Type: AUTOMATIC Mode: Active/Passive UUID: 166e0467-e4a5-11e2-9887-00e0ed158a78 State: OPTIMAL Providers: 1. 
Name: multipath/newdisk1 Mediasize: 299999999488 (279G) Sectorsize: 512 Mode: r0w0e0 State: OPTIMAL Consumers: 1. Name: da2 Mediasize: 300000000000 (279G) Sectorsize: 512 Mode: r1w1e1 State: ACTIVE 2. Name: da4 Mediasize: 300000000000 (279G) Sectorsize: 512 Mode: r1w1e1 State: PASSIVE Geom name: newdisk3 Type: AUTOMATIC Mode: Active/Active UUID: 78a76feb-e84f-11e2-9b69-00e0ed158a78 State: OPTIMAL Providers: 1. Name: multipath/newdisk3 Mediasize: 299999999488 (279G) Sectorsize: 512 Mode: r1w1e1 State: OPTIMAL Consumers: 1. Name: da7 Mediasize: 300000000000 (279G) Sectorsize: 512 Mode: r2w2e2 State: ACTIVE 2. Name: da8 Mediasize: 300000000000 (279G) Sectorsize: 512 Mode: r2w2e2 State: ACTIVE Geom name: newdisk Type: AUTOMATIC Mode: Active/Passive UUID: 0a42c877-e4a5-11e2-9887-00e0ed158a78 State: OPTIMAL Providers: 1. Name: multipath/newdisk Mediasize: 2000398933504 (1.8T) Sectorsize: 512 Mode: r0w0e0 State: OPTIMAL Consumers: 1. Name: da1 Mediasize: 2000398934016 (1.8T) Sectorsize: 512 Mode: r1w1e1 State: ACTIVE 2. Name: da3 Mediasize: 2000398934016 (1.8T) Sectorsize: 512 Mode: r1w1e1 State: PASSIVE Geom name: newdisk4 Type: AUTOMATIC Mode: Active/Active UUID: 7b9f9e79-e84f-11e2-9b69-00e0ed158a78 State: OPTIMAL Providers: 1. Name: multipath/newdisk4 Mediasize: 299999999488 (279G) Sectorsize: 512 Mode: r1w1e1 State: OPTIMAL Consumers: 1. Name: da5 Mediasize: 300000000000 (279G) Sectorsize: 512 Mode: r2w2e2 State: ACTIVE 2. Name: da9 Mediasize: 300000000000 (279G) Sectorsize: 512 Mode: r2w2e2 State: ACTIVE *pool configuration:* pool: mypool state: ONLINE scan: none requested config: NAME STATE READ WRITE CKSUM mypool ONLINE 0 0 0 mirror-0 ONLINE 0 0 0 multipath/newdisk3 ONLINE 0 0 0 multipath/newdisk4 ONLINE 0 0 0 errors: No known data errors Are there any dependencies for the patch that is taken from stable 9 code? 
-- Thanks & Regards, Sowmya L From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 06:26:53 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A27294E2 for ; Wed, 10 Jul 2013 06:26:53 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id EF35316EB for ; Wed, 10 Jul 2013 06:26:52 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA01190; Wed, 10 Jul 2013 09:26:48 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1UwnrU-000HW6-3v; Wed, 10 Jul 2013 09:26:48 +0300 Message-ID: <51DCFE70.90302@FreeBSD.org> Date: Wed, 10 Jul 2013 09:25:52 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130708 Thunderbird/17.0.7 MIME-Version: 1.0 To: Adam Nowacki , jiansong.liu@gmail.com Subject: Re: zpool import -D failed, "guid mismatch for provider /dev/da#:" References: <51DCDC6C.2020205@platinum.linux.pl> In-Reply-To: <51DCDC6C.2020205@platinum.linux.pl> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 06:26:53 -0000 on 10/07/2013 07:00 Adam Nowacki said the following: > It fails because of a check at > http://fxr.watson.org/fxr/source/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c?im=10#L272 > > state == POOL_STATE_DESTROYED will be true and no config will be read, then > 
http://fxr.watson.org/fxr/source/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c?im=10#L445 > returns 0 for guid. > > removing that check should make the pool importable And this has already been fixed in head. > On 2013-07-10 05:03, Jiansong Liu wrote: >> Hi All, >> >> I destroyed a pool and try to recovery with the "zpool import" command >> failed and it says have no pool to import, then I realized that I >> missed the "-D", so I run the command "zpool import -D", now it says >> UNAVAIL, the pool has six vdev (da0 da1 da2 da3 da5 da6) original: >> >> # zpool import -D >> pool: storage >> id: 8511691845980256432 >> state: UNAVAIL (DESTROYED) >> status: One or more devices are missing from the system. >> action: The pool cannot be imported. Attach the missing >> devices and try again. >> see: http://illumos.org/msg/ZFS-8000-3C >> config: >> >> storage UNAVAIL insufficient replicas >> raidz2-0 UNAVAIL insufficient replicas >> 8368872959405194221 UNAVAIL cannot open >> 16925320148488343503 UNAVAIL cannot open >> 2721065418012152096 UNAVAIL cannot open >> 1488947662741999881 UNAVAIL cannot open >> 16956133848943560671 UNAVAIL cannot open >> 7236613667503893647 UNAVAIL cannot open >> >> Every time I run the "zpool import -D", the zfs debug will output a >> error for every vdev member, seems the vdev returned a wrong guid: >> >> vdev_geom_open_by_path:550[1]: Found provider by name /dev/da6. >> vdev_geom_attach:97[1]: Attaching to da6. >> vdev_geom_attach:118[1]: Created geom and consumer for da6. >> vdev_geom_read_config:243[1]: Reading config from da6... >> vdev_geom_detach:158[1]: Closing access to da6. >> vdev_geom_detach:162[1]: Destroyed consumer to da6. >> vdev_geom_detach:170[1]: Destroyed geom zfs::vdev. >> vdev_geom_open_by_path:562[1]: guid mismatch for provider /dev/da6: >> 7236613667503893647 != 0. >> vdev_geom_open_by_guid:518[1]: Searching by guid [7236613667503893647]. >> vdev_geom_read_config:243[1]: Reading config from da4s1g... 
>> vdev_geom_read_config:243[1]: Reading config from da4s1f... >> vdev_geom_read_config:243[1]: Reading config from da4s1e... >> vdev_geom_read_config:243[1]: Reading config from da4s1d... >> vdev_geom_read_config:243[1]: Reading config from da4s1b... >> vdev_geom_read_config:243[1]: Reading config from da4s1a... >> vdev_geom_read_config:243[1]: Reading config from da4s1... >> vdev_geom_read_config:243[1]: Reading config from da6... >> vdev_geom_read_config:243[1]: Reading config from da5... >> vdev_geom_read_config:243[1]: Reading config from da4... >> vdev_geom_read_config:243[1]: Reading config from da3... >> vdev_geom_read_config:243[1]: Reading config from da2... >> vdev_geom_read_config:243[1]: Reading config from da1... >> vdev_geom_read_config:243[1]: Reading config from da0... >> vdev_geom_open_by_guid:532[1]: Search by guid [7236613667503893647] failed. >> vdev_geom_open:617[1]: Provider /dev/da6 not found. >> >> the system version is 9-STABLE r250636 >> any comment and advice are appreciated, thanks in advance. 
>> >> >> Best regards, >> Jiansong Liu >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" >> > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 06:28:37 2013 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C5E3F55A; Wed, 10 Jul 2013 06:28:37 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 83C5716F9; Wed, 10 Jul 2013 06:28:36 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA01202; Wed, 10 Jul 2013 09:28:34 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1UwntC-000HWD-FD; Wed, 10 Jul 2013 09:28:34 +0300 Message-ID: <51DCFEDA.1090901@FreeBSD.org> Date: Wed, 10 Jul 2013 09:27:38 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130708 Thunderbird/17.0.7 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: Deadlock in nullfs/zfs somewhere References: In-Reply-To: X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-current X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 06:28:37 -0000 
on 09/07/2013 16:03 Adrian Chadd said the following: > Does anyone have any ideas as to what's going on? Please provide output of 'thread apply all bt' from kgdb, then perhaps someone might be able to tell. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 08:17:18 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 97890B40 for ; Wed, 10 Jul 2013 08:17:18 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 2233A1B11 for ; Wed, 10 Jul 2013 08:17:17 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004901886.msg for ; Wed, 10 Jul 2013 09:17:15 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 09:17:15 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-fs@freebsd.org Message-ID: <9DFC0C4FC3594628A5B53FB743C76FE6@multiplay.co.uk> From: "Steven Hartland" To: "Jiansong Liu" , References: Subject: Re: zpool import -D failed, "guid mismatch for provider /dev/da#:" Date: Wed, 10 Jul 2013 09:17:32 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 08:17:18 -0000 
This was fixed by r252056 in head and r252308 in stable/9 Regards Steve ----- Original Message ----- From: "Jiansong Liu" > Hi All, > > I destroyed a pool and try to recovery with the "zpool import" command > failed and it says have no pool to import, then I realized that I > missed the "-D", so I run the command "zpool import -D", now it says > UNAVAIL, the pool has six vdev (da0 da1 da2 da3 da5 da6) original: > > # zpool import -D > pool: storage > id: 8511691845980256432 > state: UNAVAIL (DESTROYED) > status: One or more devices are missing from the system. > action: The pool cannot be imported. Attach the missing > devices and try again. > see: http://illumos.org/msg/ZFS-8000-3C > config: > > storage UNAVAIL insufficient replicas > raidz2-0 UNAVAIL insufficient replicas > 8368872959405194221 UNAVAIL cannot open > 16925320148488343503 UNAVAIL cannot open > 2721065418012152096 UNAVAIL cannot open > 1488947662741999881 UNAVAIL cannot open > 16956133848943560671 UNAVAIL cannot open > 7236613667503893647 UNAVAIL cannot open > > Every time I run the "zpool import -D", the zfs debug will output a > error for every vdev member, seems the vdev returned a wrong guid: > > vdev_geom_open_by_path:550[1]: Found provider by name /dev/da6. > vdev_geom_attach:97[1]: Attaching to da6. > vdev_geom_attach:118[1]: Created geom and consumer for da6. > vdev_geom_read_config:243[1]: Reading config from da6... > vdev_geom_detach:158[1]: Closing access to da6. > vdev_geom_detach:162[1]: Destroyed consumer to da6. > vdev_geom_detach:170[1]: Destroyed geom zfs::vdev. > vdev_geom_open_by_path:562[1]: guid mismatch for provider /dev/da6: > 7236613667503893647 != 0. > vdev_geom_open_by_guid:518[1]: Searching by guid [7236613667503893647]. > vdev_geom_read_config:243[1]: Reading config from da4s1g... > vdev_geom_read_config:243[1]: Reading config from da4s1f... > vdev_geom_read_config:243[1]: Reading config from da4s1e... > vdev_geom_read_config:243[1]: Reading config from da4s1d... 
> vdev_geom_read_config:243[1]: Reading config from da4s1b... > vdev_geom_read_config:243[1]: Reading config from da4s1a... > vdev_geom_read_config:243[1]: Reading config from da4s1... > vdev_geom_read_config:243[1]: Reading config from da6... > vdev_geom_read_config:243[1]: Reading config from da5... > vdev_geom_read_config:243[1]: Reading config from da4... > vdev_geom_read_config:243[1]: Reading config from da3... > vdev_geom_read_config:243[1]: Reading config from da2... > vdev_geom_read_config:243[1]: Reading config from da1... > vdev_geom_read_config:243[1]: Reading config from da0... > vdev_geom_open_by_guid:532[1]: Search by guid [7236613667503893647] failed. > vdev_geom_open:617[1]: Provider /dev/da6 not found. > > the system version is 9-STABLE r250636 > any comment and advice are appreciated, thanks in advance. > > > Best regards, > Jiansong Liu > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 09:02:24 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 05561558; Wed, 10 Jul 2013 09:02:24 +0000 (UTC) (envelope-from des@des.no) Received: from smtp.des.no (smtp.des.no [194.63.250.102]) by mx1.freebsd.org (Postfix) with ESMTP id BBCE41D11; Wed, 10 Jul 2013 09:02:23 +0000 (UTC) Received: from nine.des.no (smtp.des.no [194.63.250.102]) by smtp-int.des.no (Postfix) with ESMTP id 885E342F1; Wed, 10 Jul 2013 09:02:16 +0000 (UTC) Received: by nine.des.no (Postfix, from userid 1001) id D7E0E352BD; Wed, 10 Jul 2013 11:02:00 +0200 (CEST) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org Subject: Make ZFS use the physical sector size when computing initial ashift Date: Wed, 10 Jul 2013 11:02:00 +0200 Message-ID: <86zjtupz3r.fsf@nine.des.no> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Cc: ivoras@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 09:02:24 -0000 --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable The attached patch causes ZFS to base the minimum transfer size for a new vdev on the GEOM provider's stripesize (physical sector size) rather than sectorsize (logical sector size), provided that stripesize is a power of two larger than sectorsize and smaller than or equal to VDEV_PAD_SIZE. This should eliminate the need for ivoras@'s gnop trick when creating ZFS pools on Advanced Format drives. 
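The selection rule described above can be sketched as a small standalone userland program. This is only an illustration under stated assumptions, not the kernel code: the constant values (SPA_MINBLOCKSIZE taken as 512, VDEV_PAD_SIZE taken as 8 KiB) are assumed here rather than pulled from the ZFS headers, and highbit_sketch(), is_pow2() and initial_ashift() are hypothetical helper names introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the real definitions live in the ZFS headers. */
#define SPA_MINBLOCKSIZE 512u
#define VDEV_PAD_SIZE    (8u << 10)

/* 1-based index of the highest set bit; returns 0 for x == 0. */
int
highbit_sketch(uint64_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return (n);
}

/* Non-zero if x is a power of two. */
int
is_pow2(uint64_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Pick the ashift for a vdev: use the physical sector size (stripesize)
 * only when the vdev is not yet part of a pool (vdev_asize == 0) and
 * stripesize is a power of two, larger than sectorsize, and no larger
 * than VDEV_PAD_SIZE. Otherwise fall back to the logical sector size.
 */
int
initial_ashift(uint32_t sectorsize, uint32_t stripesize, uint64_t vdev_asize)
{
	uint32_t size;

	assert(sectorsize > 0);
	if (vdev_asize == 0 && stripesize > sectorsize &&
	    is_pow2(stripesize) && stripesize <= VDEV_PAD_SIZE)
		size = stripesize;
	else
		size = sectorsize;
	if (size < SPA_MINBLOCKSIZE)
		size = SPA_MINBLOCKSIZE;
	return (highbit_sketch(size) - 1);
}
```

With these assumptions, a drive reporting 512-byte logical and 4096-byte physical sectors gets ashift 12 when it is not yet part of a pool, and keeps ashift 9 once vdev_asize is non-zero, which is the compatibility behaviour the patch comment describes.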
DES
-- 
Dag-Erling Smørgrav - des@des.no

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=zfs-vdev-stripesize.diff

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	(revision 253138)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	(working copy)
@@ -578,6 +578,7 @@
 {
 	struct g_provider *pp;
 	struct g_consumer *cp;
+	u_int sectorsize;
 	size_t bufsize;
 	int error;
 
@@ -661,8 +662,21 @@
 
 	/*
 	 * Determine the device's minimum transfer size.
+	 *
+	 * This is a bit of a hack. For performance reasons, we would
+	 * prefer to use the physical sector size (reported by GEOM as
+	 * stripesize) as minimum transfer size. However, doing so
+	 * unconditionally would break existing vdevs. Therefore, we
+	 * compute ashift based on stripesize when the vdev isn't already
+	 * part of a pool (vdev_asize == 0), and sectorsize otherwise.
 	 */
-	*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
+	if (vd->vdev_asize == 0 && pp->stripesize > pp->sectorsize &&
+	    ISP2(pp->stripesize) && pp->stripesize <= VDEV_PAD_SIZE) {
+		sectorsize = pp->stripesize;
+	} else {
+		sectorsize = pp->sectorsize;
+	}
+	*ashift = highbit(MAX(sectorsize, SPA_MINBLOCKSIZE)) - 1;
 
 	/*
 	 * Clear the nowritecache settings, so that on a vdev_reopen()
--=-=-=--

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 09:16:30 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4F5309ED for ; Wed, 10 Jul 2013 09:16:30 +0000 (UTC) (envelope-from jiansong.liu@gmail.com) Received: from mail-vc0-x230.google.com (mail-vc0-x230.google.com [IPv6:2607:f8b0:400c:c03::230]) by mx1.freebsd.org (Postfix) with ESMTP id 136D51DDF for ; Wed, 10 Jul 2013 09:16:30 +0000 (UTC) Received: by mail-vc0-f176.google.com with SMTP id ha12so5141893vcb.7 for ; Wed, 10 Jul 2013 02:16:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=/gF29k3PqGJdrthSUgTh3RZFOOBKpfgoVDt/ys8skmU=; b=tViElBaFwNV6Hjku0NRY0wJxqvgZwZPVq5aXzBrPAW8losQbJ3szphKZDWA3sbWqRa bnwm+Mx1m7TigwltG9JPmG5zPHVQu7iM9rfWUw7C90ObA7Vnt0BWd800rXi35paBTF8A C9SVSzf6l18M4Sq5+GqLEHRq0iX8Jcyrusd3fF6wkdRT8mtiNbtoA5/cFh6o5UmwT2KT SQeP1q3IXSxoCpiDapDvptpL6VDdaxWeh6TM8RAwX2sp3R2ktN6NjRCRxAs47O97GjT6 3u+rz+Z0ZmSMCVd7eBd2XUA6BQvdpErtEFyfausW2tywFsmn3FLsIKoMcM554ELpnhEo Ex0g== X-Received: by 10.220.198.133 with SMTP id eo5mr18486705vcb.24.1373447789532; Wed, 10 Jul 2013 02:16:29 -0700 (PDT) MIME-Version: 1.0 Received: by 10.220.65.72 with HTTP; Wed, 10 Jul 2013 02:16:14 -0700 (PDT) In-Reply-To: <9DFC0C4FC3594628A5B53FB743C76FE6@multiplay.co.uk> References: <9DFC0C4FC3594628A5B53FB743C76FE6@multiplay.co.uk> From: Jiansong Liu Date: Wed, 10 Jul 2013 17:16:14 +0800 Message-ID: 
Subject: Re: zpool import -D failed, "guid mismatch for provider /dev/da#:" To: Steven Hartland Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 09:16:30 -0000 After a svn update and rebuild, the problem resolved, thank you very much guys. Best regards, Jiansong On Wed, Jul 10, 2013 at 4:17 PM, Steven Hartland wrote: > This was fixed by r252056 in head and r252308 in stable/9 > > Regards > Steve > > ----- Original Message ----- From: "Jiansong Liu" > > > Hi All, >> >> I destroyed a pool and try to recovery with the "zpool import" command >> failed and it says have no pool to import, then I realized that I >> missed the "-D", so I run the command "zpool import -D", now it says >> UNAVAIL, the pool has six vdev (da0 da1 da2 da3 da5 da6) original: >> >> # zpool import -D >> pool: storage >> id: 8511691845980256432 >> state: UNAVAIL (DESTROYED) >> status: One or more devices are missing from the system. >> action: The pool cannot be imported. Attach the missing >> devices and try again. >> see: http://illumos.org/msg/ZFS-**8000-3C >> config: >> >> storage UNAVAIL insufficient replicas >> raidz2-0 UNAVAIL insufficient replicas >> 8368872959405194221 UNAVAIL cannot open >> 16925320148488343503 UNAVAIL cannot open >> 2721065418012152096 UNAVAIL cannot open >> 1488947662741999881 UNAVAIL cannot open >> 16956133848943560671 UNAVAIL cannot open >> 7236613667503893647 UNAVAIL cannot open >> >> Every time I run the "zpool import -D", the zfs debug will output a >> error for every vdev member, seems the vdev returned a wrong guid: >> >> vdev_geom_open_by_path:550[1]: Found provider by name /dev/da6. >> vdev_geom_attach:97[1]: Attaching to da6. >> vdev_geom_attach:118[1]: Created geom and consumer for da6. 
>> vdev_geom_read_config:243[1]: Reading config from da6... >> vdev_geom_detach:158[1]: Closing access to da6. >> vdev_geom_detach:162[1]: Destroyed consumer to da6. >> vdev_geom_detach:170[1]: Destroyed geom zfs::vdev. >> vdev_geom_open_by_path:562[1]: guid mismatch for provider /dev/da6: >> 7236613667503893647 != 0. >> vdev_geom_open_by_guid:518[1]: Searching by guid [7236613667503893647]. >> vdev_geom_read_config:243[1]: Reading config from da4s1g... >> vdev_geom_read_config:243[1]: Reading config from da4s1f... >> vdev_geom_read_config:243[1]: Reading config from da4s1e... >> vdev_geom_read_config:243[1]: Reading config from da4s1d... >> vdev_geom_read_config:243[1]: Reading config from da4s1b... >> vdev_geom_read_config:243[1]: Reading config from da4s1a... >> vdev_geom_read_config:243[1]: Reading config from da4s1... >> vdev_geom_read_config:243[1]: Reading config from da6... >> vdev_geom_read_config:243[1]: Reading config from da5... >> vdev_geom_read_config:243[1]: Reading config from da4... >> vdev_geom_read_config:243[1]: Reading config from da3... >> vdev_geom_read_config:243[1]: Reading config from da2... >> vdev_geom_read_config:243[1]: Reading config from da1... >> vdev_geom_read_config:243[1]: Reading config from da0... >> vdev_geom_open_by_guid:532[1]: Search by guid [7236613667503893647] >> failed. >> vdev_geom_open:617[1]: Provider /dev/da6 not found. >> >> the system version is 9-STABLE r250636 >> any comment and advice are appreciated, thanks in advance. >> >> >> Best regards, >> Jiansong Liu >> ______________________________**_________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/**mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@**freebsd.org >> " >> >> > ==============================**================== > This e.mail is private and confidential between Multiplay (UK) Ltd. and > the person or entity to whom it is addressed. 
In the event of misdirection, > the recipient is prohibited from using, copying, printing or otherwise > disseminating it or any information contained in it. > In the event of misdirection, illegible or incomplete transmission please > telephone +44 845 868 1337 > or return the E.mail to postmaster@multiplay.co.uk. > > From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 09:25:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D675FEB5; Wed, 10 Jul 2013 09:25:27 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id BDCB21E65; Wed, 10 Jul 2013 09:25:26 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004902843.msg; Wed, 10 Jul 2013 10:25:24 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 10:25:24 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <628C5D1AF6044488B708484203D70B7A@multiplay.co.uk> From: "Steven Hartland" To: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= , , , References: <86zjtupz3r.fsf@nine.des.no> Subject: Re: Make ZFS use the physical sector size when computing initial ashift Date: Wed, 10 Jul 2013 10:25:36 +0100 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_NextPart_000_1133_01CE7D57.D127DB40" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: ivoras@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 09:25:28 -0000

This is a multi-part message in MIME format.

------=_NextPart_000_1133_01CE7D57.D127DB40
Content-Type: text/plain; format=flowed; charset="utf-8"; reply-type=original
Content-Transfer-Encoding: 8bit

Hi DES,

Unfortunately you need quite a bit more than this to work compatibly.

I've had a patch here that does just this for quite some time, but there's been some discussion on how we want additional control over this, so it's not been committed.

If others are interested, I've attached it, as it achieves what we needed here and so may be of use to others too.

There's also a big discussion on illumos about this very subject ATM, so I'm monitoring that too.

Hopefully a clear conclusion will come from that on how people want to proceed, and we'll be able to get a change in that works for everyone.

Regards
Steve

----- Original Message -----
From: "Dag-Erling Smørgrav"
To: ;
Cc:
Sent: Wednesday, July 10, 2013 10:02 AM
Subject: Make ZFS use the physical sector size when computing initial ashift

The attached patch causes ZFS to base the minimum transfer size for a new vdev on the GEOM provider's stripesize (physical sector size) rather than sectorsize (logical sector size), provided that stripesize is a power of two larger than sectorsize and smaller than or equal to VDEV_PAD_SIZE. This should eliminate the need for ivoras@'s gnop trick when creating ZFS pools on Advanced Format drives.

DES
-- 
Dag-Erling Smørgrav - des@des.no

--------------------------------------------------------------------------------

> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd.
and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.

In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.

------=_NextPart_000_1133_01CE7D57.D127DB40
Content-Type: application/octet-stream; name="zzz-zfs-ashift-fix.patch"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="zzz-zfs-ashift-fix.patch"

Changes zfs zpool initial / desired ashift to be based off stripesize
instead of sectorsize making it compatible with drives marked with
the 4k sector size quirk.

Without the correct min block size BIO_DELETE requests passed to
a large number of current SSD's via TRIM don't actually perform
any LBA TRIM so its vital for the correct operation of TRIM to get
the correct min block size.

To do this we added the additional dashift (desired ashift) to
vdev_open_func_t calls. This was needed as just updating ashift to
be based off stripesize would mean that a devices reported minimum
transfer size (ashift) could increase and that in turn would cause
member devices to be unusable and hence break pools with error
ZFS-8000-5E.

The global minimum ashift used for new zpools can now also be
tuned using the vfs.zfs.min_create_ashift sysctl. This defaults
to 12 (4096 byte blocks) in order to optimise for newer disks which
are migrating from 512 to 4096 byte sectors.

The value of vfs.zfs.min_create_ashift is limited to min of
SPA_MINBLOCKSHIFT (9) and a max of SPA_MAXBLOCKSHIFT (17).
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c.orig	2011-06-06 09:36:46.000000000 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_disk.c	2012-11-02 14:47:55.293668071 +0000
@@ -32,6 +32,8 @@
 #include
 #include
 
+extern int zfs_min_ashift;
+
 /*
  * Virtual device vector for disks.
  */
@@ -103,7 +105,7 @@
 }
 
 static int
-vdev_disk_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_disk_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	spa_t *spa = vd->vdev_spa;
 	vdev_disk_t *dvd;
@@ -284,7 +286,7 @@
 	}
 
 	/*
-	 * Determine the device's minimum transfer size.
+	 * Determine the device's minimum and desired transfer size.
 	 * If the ioctl isn't supported, assume DEV_BSIZE.
 	 */
 	if (ldi_ioctl(dvd->vd_lh, DKIOCGMEDIAINFOEXT, (intptr_t)&dkmext,
@@ -292,6 +294,7 @@
 		dkmext.dki_pbsize = DEV_BSIZE;
 
 	*ashift = highbit(MAX(dkmext.dki_pbsize, SPA_MINBLOCKSIZE)) - 1;
+	*dashift = highbit(MAX(dkmext.dki_pbsize, (1ULL << zfs_min_ashift))) - 1;
 
 	/*
 	 * Clear the nowritecache bit, so that on a vdev_reopen() we will
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c.orig	2012-01-05 22:31:25.000000000 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c	2012-11-02 14:47:38.252107541 +0000
@@ -30,6 +30,8 @@
 #include
 #include
 
+extern int zfs_min_ashift;
+
 /*
  * Virtual device vector for files.
  */
@@ -47,7 +49,7 @@
 }
 
 static int
-vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_file_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	vdev_file_t *vf;
 	vnode_t *vp;
@@ -127,6 +129,7 @@
 
 	*psize = vattr.va_size;
 	*ashift = SPA_MINBLOCKSHIFT;
+	*dashift = zfs_min_ashift;
 
 	return (0);
 }
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c.orig	2012-11-02 12:20:15.918986181 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	2012-11-02 14:47:48.135273692 +0000
@@ -36,6 +36,8 @@
 #include
 #include
 
+extern int zfs_min_ashift;
+
 /*
  * Virtual device vector for GEOM.
  */
@@ -408,7 +410,7 @@
 }
 
 static int
-vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_geom_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	struct g_provider *pp;
 	struct g_consumer *cp;
@@ -494,9 +496,10 @@
 	*psize = pp->mediasize;
 
 	/*
-	 * Determine the device's minimum transfer size.
+	 * Determine the device's minimum and desired transfer size.
 	 */
 	*ashift = highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;
+	*dashift = highbit(MAX(pp->stripesize, (1ULL << zfs_min_ashift))) - 1;
 
 	/*
 	 * Clear the nowritecache settings, so that on a vdev_reopen()
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c.orig	2012-07-03 11:49:22.342245151 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c	2012-07-03 11:58:02.161948585 +0000
@@ -127,7 +127,7 @@
 }
 
 static int
-vdev_mirror_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_mirror_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	int numerrors = 0;
 	int lasterror = 0;
@@ -150,6 +150,7 @@
 
 		*asize = MIN(*asize - 1, cvd->vdev_asize - 1) + 1;
 		*ashift = MAX(*ashift, cvd->vdev_ashift);
+		*dashift = MAX(*dashift, cvd->vdev_dashift);
 	}
 
 	if (numerrors == vd->vdev_children) {
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c.orig	2012-07-03 11:49:10.545275865 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_missing.c	2012-07-03 11:58:07.670470640 +0000
@@ -40,7 +40,7 @@
 
 /* ARGSUSED */
 static int
-vdev_missing_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift)
+vdev_missing_open(vdev_t *vd, uint64_t *psize, uint64_t *ashift, uint64_t *dashift)
 {
 	/*
 	 * Really this should just fail. But then the root vdev will be in the
@@ -50,6 +50,7 @@
 	 */
 	*psize = 0;
 	*ashift = 0;
+	*dashift = 0;
 	return (0);
 }
 
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c.orig	2012-07-03 11:49:03.675875505 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c	2012-07-03 11:58:15.334806334 +0000
@@ -1447,7 +1447,7 @@
 }
 
 static int
-vdev_raidz_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_raidz_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	vdev_t *cvd;
 	uint64_t nparity = vd->vdev_nparity;
@@ -1476,6 +1476,7 @@
 
 		*asize = MIN(*asize - 1, cvd->vdev_asize - 1) + 1;
 		*ashift = MAX(*ashift, cvd->vdev_ashift);
+		*dashift = MAX(*dashift, cvd->vdev_dashift);
 	}
 
 	*asize *= vd->vdev_children;
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c.orig	2012-07-03 11:49:27.901760380 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_root.c	2012-07-03 11:58:19.704427068 +0000
@@ -50,7 +50,7 @@
 }
 
 static int
-vdev_root_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift)
+vdev_root_open(vdev_t *vd, uint64_t *asize, uint64_t *ashift, uint64_t *dashift)
 {
 	int lasterror = 0;
 	int numerrors = 0;
@@ -78,6 +78,7 @@
 
 	*asize = 0;
 	*ashift = 0;
+	*dashift = 0;
 
 	return (0);
 }
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c.orig	2012-10-22 20:41:50.234005351 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	2012-10-22 20:42:16.355805894 +0000
@@ -1125,6 +1125,7 @@
 	uint64_t osize = 0;
 	uint64_t asize, psize;
 	uint64_t ashift = 0;
+	uint64_t dashift = 0;
 
 	ASSERT(vd->vdev_open_thread == curthread ||
 	    spa_config_held(spa, SCL_STATE_ALL, RW_WRITER) == SCL_STATE_ALL);
@@ -1154,7 +1155,7 @@
 		return (ENXIO);
 	}
 
-	error = vd->vdev_ops->vdev_op_open(vd, &osize, &ashift);
+	error = vd->vdev_ops->vdev_op_open(vd, &osize, &ashift, &dashift);
 
 	/*
 	 * Reset the vdev_reopening flag so that we actually close
@@ -1255,14 +1256,16 @@
 		 */
 		vd->vdev_asize = asize;
 		vd->vdev_ashift = MAX(ashift, vd->vdev_ashift);
+		vd->vdev_dashift = MAX(dashift, vd->vdev_dashift);
 	} else {
 		/*
 		 * Make sure the alignment requirement hasn't increased.
 		 */
 		if (ashift > vd->vdev_top->vdev_ashift) {
+			printf("ZFS ashift open failure of %s (%ld > %ld)\n", vd->vdev_path, ashift, vd->vdev_top->vdev_ashift);
 			vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN,
 			    VDEV_AUX_BAD_LABEL);
 			return (EINVAL);
 		}
 	}
 
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c.orig	2012-11-05 15:27:52.092194343 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c	2012-11-05 15:53:26.449021023 +0000
@@ -145,9 +145,12 @@
 #include
 
 static boolean_t vdev_trim_on_init = B_TRUE;
+static boolean_t vdev_dashift_enable = B_TRUE;
 SYSCTL_DECL(_vfs_zfs_vdev);
 SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, trim_on_init, CTLFLAG_RW,
     &vdev_trim_on_init, 0, "Enable/disable full vdev trim on initialisation");
+SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, optimal_ashift, CTLFLAG_RW,
+    &vdev_dashift_enable, 0, "Enable/disable optimal ashift usage on initialisation");
 
 /*
  * Basic routines to read and write from a vdev label.
@@ -282,6 +285,16 @@
 		    vd->vdev_ms_array) == 0);
 		VERIFY(nvlist_add_uint64(nv, ZPOOL_CONFIG_METASLAB_SHIFT,
 		    vd->vdev_ms_shift) == 0);
+		/*
+		 * We use the max of ashift and dashift (the desired/optimal
+		 * ashift), which is typically the stripesize of a device, to
+		 * ensure we get the best performance from underlying devices.
+		 *
+		 * Its done here as it should only ever have an effect on new
+		 * zpool creation.
+		 */
+		if (vdev_dashift_enable)
+			vd->vdev_ashift = MAX(vd->vdev_ashift, vd->vdev_dashift);
 		VERIFY(nvlist_add_uint64(nv, ZPOOL_CONFIG_ASHIFT,
 		    vd->vdev_ashift) == 0);
 		VERIFY(nvlist_add_uint64(nv, ZPOOL_CONFIG_ASIZE,
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h.orig	2012-10-22 20:40:08.361577293 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h	2012-10-22 21:02:52.447781800 +0000
@@ -55,7 +55,7 @@
 /*
  * Virtual device operations
  */
-typedef int vdev_open_func_t(vdev_t *vd, uint64_t *size, uint64_t *ashift);
+typedef int vdev_open_func_t(vdev_t *vd, uint64_t *size, uint64_t *ashift, uint64_t *dashift);
 typedef void vdev_close_func_t(vdev_t *vd);
 typedef uint64_t vdev_asize_func_t(vdev_t *vd, uint64_t psize);
 typedef int vdev_io_start_func_t(zio_t *zio);
@@ -119,6 +119,7 @@
 	uint64_t vdev_asize; /* allocatable device capacity */
 	uint64_t vdev_min_asize; /* min acceptable asize */
 	uint64_t vdev_ashift; /* block alignment shift */
+	uint64_t vdev_dashift; /* desired blk alignment shift */
 	uint64_t vdev_state; /* see VDEV_STATE_* #defines */
 	uint64_t vdev_prevstate; /* used when reopening a vdev */
 	vdev_ops_t *vdev_ops; /* vdev operations */
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c.orig	2012-11-02 14:56:29.474248887 +0000
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c	2012-11-03 01:27:28.066912403 +0000
@@ -41,6 +41,30 @@
 #include
 #include
 
+#define ZFS_MIN_ASHIFT SPA_MINBLOCKSHIFT
+/*
+ * Max ashift - limited by how labels are accessed by zio_read_phys using offsets
+ * within vdev_label_t
+ *
+ * If label access is fixed to work with ashift properly then the max should be
+ * set to SPA_MAXBLOCKSHIFT
+ */
+#define ZFS_MAX_ASHIFT 13
+/*
+ * Optimum ashift - defaults to 12 which results in a min block size of 4096 as
+ * this is the optimum value for newer disks which are migrating from 512 to 4096
+ * byte sectors
+ */
+#define ZFS_OPTIMUM_ASHIFT 12
+
+/*
+ * Minimum ashift used when creating new pools
+ *
+ * This can be tuned using the sysctl vfs.zfs.min_create_ashift but is limited
+ * to a min of ZFS_MIN_ASHIFT and a max of ZFS_MAX_ASHIFT
+ *
+ */
+int zfs_min_ashift = MAX(SPA_MINBLOCKSHIFT, ZFS_OPTIMUM_ASHIFT);
 int zfs_no_write_throttle = 0;
 int zfs_write_limit_shift = 3; /* 1/8th of physical memory */
 int zfs_txg_synctime_ms = 1000; /* target millisecs to sync a txg */
@@ -54,6 +78,9 @@
 
 static pgcnt_t old_physmem = 0;
 
+#ifdef _KERNEL
+static int min_ashift_sysctl(SYSCTL_HANDLER_ARGS);
+
 SYSCTL_DECL(_vfs_zfs);
 TUNABLE_INT("vfs.zfs.no_write_throttle", &zfs_no_write_throttle);
 SYSCTL_INT(_vfs_zfs, OID_AUTO, no_write_throttle, CTLFLAG_RDTUN,
@@ -78,6 +105,32 @@
 TUNABLE_QUAD("vfs.zfs.write_limit_override", &zfs_write_limit_override);
 SYSCTL_QUAD(_vfs_zfs, OID_AUTO, write_limit_override, CTLFLAG_RDTUN,
     &zfs_write_limit_override, 0, "");
+SYSCTL_PROC(_vfs_zfs, OID_AUTO, min_create_ashift, CTLTYPE_INT | CTLFLAG_RW,
+    &zfs_min_ashift, 0, min_ashift_sysctl, "I",
+    "Minimum ashift used when creating new pools");
+
+static int
+min_ashift_sysctl(SYSCTL_HANDLER_ARGS)=0A= +{=0A= + int error, value;=0A= +=0A= + value =3D *(int *)arg1;=0A= +=0A= + error =3D sysctl_handle_int(oidp, &value, 0, req);=0A= +=0A= + if ((error !=3D 0) || (req->newptr =3D=3D NULL))=0A= + return (error);=0A= +=0A= + if (value < ZFS_MIN_ASHIFT)=0A= + value =3D ZFS_MIN_ASHIFT;=0A= + else if (value > ZFS_MAX_ASHIFT)=0A= + value =3D ZFS_MAX_ASHIFT;=0A= +=0A= + *(int *)arg1 =3D value;=0A= +=0A= + return (0);=0A= +}=0A= +#endif=0A= =0A= int=0A= dsl_pool_open_special_dir(dsl_pool_t *dp, const char *name, dsl_dir_t = **ddp)=0A= ------=_NextPart_000_1133_01CE7D57.D127DB40-- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 10:46:27 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 98129BB2; Wed, 10 Jul 2013 10:46:27 +0000 (UTC) (envelope-from des@des.no) Received: from smtp.des.no (smtp.des.no [194.63.250.102]) by mx1.freebsd.org (Postfix) with ESMTP id 5A20012B2; Wed, 10 Jul 2013 10:46:27 +0000 (UTC) Received: from nine.des.no (smtp.des.no [194.63.250.102]) by smtp-int.des.no (Postfix) with ESMTP id 2FD32447F; Wed, 10 Jul 2013 10:46:26 +0000 (UTC) Received: by nine.des.no (Postfix, from userid 1001) id 85689352D6; Wed, 10 Jul 2013 12:46:10 +0200 (CEST) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: "Steven Hartland" Subject: Re: Make ZFS use the physical sector size when computing initial ashift References: <86zjtupz3r.fsf@nine.des.no> <628C5D1AF6044488B708484203D70B7A@multiplay.co.uk> Date: Wed, 10 Jul 2013 12:46:10 +0200 In-Reply-To: <628C5D1AF6044488B708484203D70B7A@multiplay.co.uk> (Steven Hartland's message of "Wed, 10 Jul 2013 10:25:36 +0100") Message-ID: <86vc4ipua5.fsf@nine.des.no> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, 
 zfs-devel@FreeBSD.org, ivoras@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Jul 2013 10:46:27 -0000

"Steven Hartland" writes:
> Hi DES, unfortunately you need a quite bit more than this to work
> compatibly.

*chirp* *chirp* *chirp*

DES
-- 
Dag-Erling Smørgrav - des@des.no

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 10:55:51 2013
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 30C9B3FB for ; Wed, 10 Jul 2013 10:55:51 +0000 (UTC) (envelope-from bryan-lists@shatow.net)
Received: from secure.xzibition.com (secure.xzibition.com [173.160.118.92]) by mx1.freebsd.org (Postfix) with ESMTP id F020F1349 for ; Wed, 10 Jul 2013 10:55:50 +0000 (UTC)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=shatow.net; h=message-id :date:from:mime-version:to:subject:content-type :content-transfer-encoding; q=dns; s=sweb; b=26o8EEbRbiqZ+U02P+D kuMD/RTo3PTYsoZTG2YZmCPaNykK+VxDqQ2davgMsHAqBrBNiedwuICoYO5L1Dzt 5fmjgRKsxDI5MMG/FpFg+hbbRSYlYUGDn6L97SMEZ5A0/kuVun8ar9j0iUusBjxC XG3Y2vW9kPNJnOP9mgRH9hvw=
DKIM-Signature: v=1; a=rsa-sha256; c=simple; d=shatow.net; h=message-id :date:from:mime-version:to:subject:content-type :content-transfer-encoding; s=sweb; bh=kE3aRJHASQUD0IYOtKKYsfShW //TNzZYzaCS29AGNaw=; b=n6LBmFG25tipXXbPwcaeRYM55Zcg7Qxq+aUiVK52w Ei4ehLM/j9qXGSkMDESRhajFCohaSjRT5eeKwbFRdJ5s8VKjtHq3fdCVj/AWrXaa S6BFT28tuooMCVnw1IwzEONuLFYDNmHTLvg3OmK9ugRjy3Z1ao4IblcYxMsrFaf2 zk=
Received: (qmail 34209 invoked from network); 10 Jul 2013 05:49:08 -0500
Received: from unknown (HELO ?10.10.0.24?)
 (bryan@shatow.net@10.10.0.24) by sweb.xzibition.com with ESMTPA; 10 Jul 2013 05:49:08 -0500
Message-ID: <51DD3C1F.1000609@shatow.net>
Date: Wed, 10 Jul 2013 05:49:03 -0500
From: Bryan Drewery
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7
MIME-Version: 1.0
To: freebsd-fs@FreeBSD.org, FreeBSD Current
Subject: NFS panic: newnfs_copycred: negative nfsc_ngroups (client HEAD r253033, server 9.1-R)
X-Enigmail-Version: 1.5.1
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Jul 2013 10:55:51 -0000

I received this panic on the client while doing heavy parallel
reads/writes over NFS. I only recently moved these files to NFS, so I
don't know whether or not it's a recent regression.

Client: HEAD r253033
Server: 9.1-R

core.txt: http://people.freebsd.org/~bdrewery/nfs.txt

fstab of related paths:
> tank:/tank/distfiles/freebsd /mnt/distfiles nfs rw,bg,noatime,intr,rsize=65536,wsize=65536,readahead=8,nfsv4 0 0
> tank:/usr/packages/ /mnt/all-packages nfs rw,bg,noatime,soft,retrycnt=3,rsize=65536,wsize=65536,readahead=8,nfsv4 0 0

Server: params on these paths: -maproot=root -network 10.10.0.0/16

tcpdump at the time:
> 21:43:05.396585 IP 10.10.0.7.4180315003 > 10.10.0.5.2049: 168 getattr fh 0,4/2
> 21:43:05.396589 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48265029:48266477, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396603 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48266477:48267925, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396605 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48266477, win 3916, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> 21:43:05.396608 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48267925:48269373, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396621 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48269373:48270821, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396624 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48269373, win 3870, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> 21:43:05.396641 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48270821:48272269, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396653 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48272269:48273717, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396656 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48272269, win 3825, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> 21:43:05.396659 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48273717:48275165, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396671 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48275165:48276613, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396674 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48275165, win 3780, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> 21:43:05.396676 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48276613:48278061, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> 21:43:05.396689 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48278061:48279509, ack 4394885, win 29124, options [nop,nop,TS val
> Write failed: Broken pipe

I have nfsuserd running on both client/server. nfscbd is running.
nfs_client_enable=yes in rc.conf.
User lookups seem to work fine: > -rw-r--r-- 1 root bryan 1554804 Jul 6 10:50 /mnt/distfiles/pkg-1.1.4.tar.xz I ran a find -ls on these paths and all files return a user/group. I am guessing there is a race condition with files being written and looking up the associated groups. -- Regards, Bryan Drewery bdrewery@freenode/EFNet From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 11:03:20 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id AD306958; Wed, 10 Jul 2013 11:03:20 +0000 (UTC) (envelope-from borjam@sarenet.es) Received: from proxypop04.sare.net (proxypop04.sare.net [194.30.0.65]) by mx1.freebsd.org (Postfix) with ESMTP id 6FF721453; Wed, 10 Jul 2013 11:03:20 +0000 (UTC) Received: from [172.16.2.2] (izaro.sarenet.es [192.148.167.11]) by proxypop04.sare.net (Postfix) with ESMTPSA id 0E5DE9DC4EE; Wed, 10 Jul 2013 13:03:13 +0200 (CEST) Subject: Re: Make ZFS use the physical sector size when computing initial ashift Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=us-ascii From: Borja Marcos X-Priority: 3 In-Reply-To: <628C5D1AF6044488B708484203D70B7A@multiplay.co.uk> Date: Wed, 10 Jul 2013 13:03:09 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <774B60E8-19C2-4A3A-880D-0D8726DC6727@sarenet.es> References: <86zjtupz3r.fsf@nine.des.no> <628C5D1AF6044488B708484203D70B7A@multiplay.co.uk> To: Steven Hartland X-Mailer: Apple Mail (2.1283) Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , zfs-devel@FreeBSD.org, ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 11:03:20 -0000 On Jul 10, 2013, at 11:25 AM, Steven Hartland wrote: > If others are interested I've attached this as it achieves what 
we needed here so
> may also be of use for others too.
>
> There's also a big discussion on illumos about this very subject ATM so I'm
> monitoring that too.
>
> Hopefully there will be a nice conclusion come from that how people want to
> proceed and we'll be able to get a change in that works for everyone.

Hmm. I wonder if the simplest approach would be better. I mean, adding a flag to zpool.

At home I have a playground FreeBSD machine with a ZFS zmirror and, you guessed it, I was careless when I purchased the components: I asked for two "1 TB drives" and that is what I got, but as different models, one of them "advanced format" and the other one "classic".

I don't think it's that bad to create a pool on a classic disk using 4 KB blocks, and it's quite likely that replacement disks will be 4 KB in the near future.

Also, if you use SSDs the situation is similar.

Borja.

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 11:10:44 2013
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2686ED3A; Wed, 10 Jul 2013 11:10:44 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk)
Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 5096715E8; Wed, 10 Jul 2013 11:10:43 +0000 (UTC)
Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004904388.msg; Wed, 10 Jul 2013 12:10:35 +0100
X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 12:10:35 +0100 (not processed: message from valid local sender)
X-MDDKIM-Result: neutral (mail1.multiplay.co.uk)
X-MDRemoteIP: 82.69.141.170
X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk
X-Envelope-From: killing@multiplay.co.uk
Message-ID: <197F9EAB64AD4A1FBC4DD75F7255D55D@multiplay.co.uk>
From: "Steven Hartland"
To: "Borja Marcos"
References: <86zjtupz3r.fsf@nine.des.no> <628C5D1AF6044488B708484203D70B7A@multiplay.co.uk> <774B60E8-19C2-4A3A-880D-0D8726DC6727@sarenet.es>
Subject: Re: Make ZFS use the physical sector size when computing initial ashift
Date: Wed, 10 Jul 2013 12:10:54 +0100
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.5931
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157
Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , zfs-devel@FreeBSD.org, ivoras@freebsd.org, freebsd-hackers@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Jul 2013 11:10:44 -0000

There's a lot more to consider when choosing a way forward, not least that ashift isn't a zpool configuration option but is set per top-level vdev, and the space implications of moving from 512b to 4k. See previous and current discussions on zfs-devel@freebsd.org and zfs@lists.illumos.org for details.

Regards
Steve

----- Original Message -----
From: "Borja Marcos"

On Jul 10, 2013, at 11:25 AM, Steven Hartland wrote:

> If others are interested I've attached this as it achieves what we needed here so
> may also be of use for others too.
>
> There's also a big discussion on illumos about this very subject ATM so I'm
> monitoring that too.
>
> Hopefully there will be a nice conclusion come from that how people want to
> proceed and we'll be able to get a change in that works for everyone.

Hmm. I wonder if the simplest approach would be the better. I mean, adding a flag to zpool.
At home I have a playground FreeBSD machine with a ZFS zmirror, and, you guessed it, I was careless when I purchased the components, I asked for two "1 TB drives" and that I got, but different models, one of them "advanced format" and the other one "classic". I don't think it's that bad to create a pool on a classic disk using 4 KB blocks, and it's quite likely that replacement disks will be 4 KB in the near future. Also, if you use SSDs the situation is similar. ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 13:13:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9A5AB857; Wed, 10 Jul 2013 13:13:01 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from kirsi1.inet.fi (mta-out.inet.fi [195.156.147.13]) by mx1.freebsd.org (Postfix) with ESMTP id 0B5A51C7D; Wed, 10 Jul 2013 13:13:01 +0000 (UTC) Received: from jh (84.250.9.81) by kirsi1.inet.fi (8.5.140.03) (authenticated as heinja-g6) id 51BB5BE301E813B2; Wed, 10 Jul 2013 16:11:51 +0300 Date: Wed, 10 Jul 2013 16:11:50 +0300 From: Jaakko Heinonen To: Robert Millan Subject: Re: Compatibility options for mount(8) Message-ID: <20130710131150.GA74301@jh> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 13:13:01 -0000 On 2013-07-02, Robert Millan wrote: > - Map "-o remount" to its FreeBSD equivalent, "-o update". I am not sure if mount(8) is the right place for the translation. This seems to be the first string option translated by mount(8). The "rdonly" compatibility option is translated to "ro" in kernel. Looks inconsistent to me. -- Jaakko From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 14:43:05 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A7635EEA; Wed, 10 Jul 2013 14:43:05 +0000 (UTC) (envelope-from c.kworr@gmail.com) Received: from mail-lb0-x233.google.com (mail-lb0-x233.google.com [IPv6:2a00:1450:4010:c04::233]) by mx1.freebsd.org (Postfix) with ESMTP id F23681231; Wed, 10 Jul 2013 14:43:04 +0000 (UTC) Received: by mail-lb0-f179.google.com with SMTP id w20so5725303lbh.38 for ; Wed, 10 Jul 2013 07:43:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=T+wtj9dsT9X29vQvfbzYbVtb8wdrLpRsAQo9BWrUNMU=; b=UnGxWrH84PQ6ILWEwYk9GR7lRObtxViLuqpDZbCkX+/WX+lQ53diZia0RsGXedQ95f FBo0qPehBD3kuDsDdmjWSmIqUKVkA4ueraNpQkqlJMlnldc0N8w/xWZP+pvdKju8c2w3 zMYV5YcCf5Tx8khoapj8es2miim+h0XV00EH386zwJw5MA/slLBuINagbQ28KCpu2nH+ 7ipTZzzxqXghdDCHWAQ5BKcsRPTk7Wv7pdHpENmyuztyNX2P/WOM4iUT24iFIyh9YLDC xax/qu3J/e5RXihLjbyyHAcuxAXIh/tCu8onNQUZnVFJotKtfnDEDFTSBlJYlpudoHAH 5B5Q== X-Received: by 10.152.27.9 with SMTP id p9mr15317348lag.4.1373467383949; Wed, 10 Jul 2013 07:43:03 -0700 (PDT) Received: from [192.168.1.139] (mau.donbass.com. 
[92.242.127.250]) by mx.google.com with ESMTPSA id n17sm10864712lbv.2.2013.07.10.07.43.02 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 10 Jul 2013 07:43:03 -0700 (PDT) Message-ID: <51DD72F5.8090008@gmail.com> Date: Wed, 10 Jul 2013 17:43:01 +0300 From: Volodymyr Kostyrko User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130627 Thunderbird/17.0.7 MIME-Version: 1.0 To: d@delphij.net Subject: Re: ZFS default compression algo for contemporary FreeBSD versions References: <51D576E1.6030803@gmail.com> <51D59B6C.5030600@gmail.com> <51D59C88.9060403@FreeBSD.org> <51D5DAB9.4070507@gmail.com> <51D5DCDF.2030503@delphij.net> <51D5DEC4.2000101@gmail.com> <51D5E42C.5010506@delphij.net> In-Reply-To: <51D5E42C.5010506@delphij.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Cc: freebsd-fs@FreeBSD.org, Dmitry Morozovsky , Andriy Gapon X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 14:43:05 -0000 05.07.2013 00:07, Xin Li wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA256 > > On 7/4/13 1:44 PM, Volodymyr Kostyrko wrote: >> 04.07.2013 23:36, Xin Li wrote: >>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 >>> >>> On 7/4/13 1:27 PM, Volodymyr Kostyrko wrote: >>>> 04.07.2013 19:02, Andriy Gapon wrote: >>>>> on 04/07/2013 18:57 Volodymyr Kostyrko said the following: >>>>>> Yes. Much better in terms of speed. >>>>> >>>>> And compression too. >>>> >>>> Can't really say. >>>> >>>> When the code first appeared in stable I moved two of my >>>> machines (desktops) to LZ4 recreating each dataset. To my >>>> surprise gain at transition from lzjb was fairly minimal and >>>> sometimes LZ4 even loses to lzjb in compression size. 
However
>>>> better compression/decompression speed and moreover earlier
>>>> takeoff when data is incompressible clearly makes lz4 a
>>>> winner.
>>>
>>> I'm interested in this -- what's the nature of data on that
>>> dataset (e.g. plain text? binaries? images?)
>>
>> Triple no. Biggest difference in lzjb favor was at zvol with Mac OS
>> X Snow Leo.
>>
>> Maybe it's just because recordsize is too small on zvols? Anyway
>> the difference was like a 1% or 2%. Can't remember but can retest.
>
> Hmm that's weird. I haven't tried Mac iSCSI volumes but do have tried
> Windows iSCSI volumes, and lz4 was a win.
>
> It may be helpful if you can post your 'zfs get all '
> output so we can try to reproduce the problem at lab?

Sorry, my virtual machines are all phased out, so I can't reproduce it. The most I can show now is the state of these two volumes:

arcade@ar1l0u\/home/arcade# zfs get all ar1l0u/vbox_mac_0_cp
NAME                  PROPERTY              VALUE                  SOURCE
ar1l0u/vbox_mac_0_cp  type                  volume                 -
ar1l0u/vbox_mac_0_cp  creation              Wed Jul 10 11:55 2013  -
ar1l0u/vbox_mac_0_cp  used                  18,2G                  -
ar1l0u/vbox_mac_0_cp  available             13,0G                  -
ar1l0u/vbox_mac_0_cp  referenced            18,2G                  -
ar1l0u/vbox_mac_0_cp  compressratio         1.11x                  -
ar1l0u/vbox_mac_0_cp  reservation           none                   default
ar1l0u/vbox_mac_0_cp  volsize               30G                    local
ar1l0u/vbox_mac_0_cp  volblocksize          8K                     -
ar1l0u/vbox_mac_0_cp  checksum              sha256                 inherited from ar1l0u
ar1l0u/vbox_mac_0_cp  compression           lzjb                   local
ar1l0u/vbox_mac_0_cp  readonly              off                    default
ar1l0u/vbox_mac_0_cp  copies                1                      default
ar1l0u/vbox_mac_0_cp  refreservation        none                   default
ar1l0u/vbox_mac_0_cp  primarycache          metadata               local
ar1l0u/vbox_mac_0_cp  secondarycache        all                    default
ar1l0u/vbox_mac_0_cp  usedbysnapshots       0                      -
ar1l0u/vbox_mac_0_cp  usedbydataset         18,2G                  -
ar1l0u/vbox_mac_0_cp  usedbychildren        0                      -
ar1l0u/vbox_mac_0_cp  usedbyrefreservation  0                      -
ar1l0u/vbox_mac_0_cp  logbias               throughput             local
ar1l0u/vbox_mac_0_cp  dedup                 off                    default
ar1l0u/vbox_mac_0_cp  mlslabel                                     -
ar1l0u/vbox_mac_0_cp  sync                  standard               default
ar1l0u/vbox_mac_0_cp  refcompressratio      1.11x                  -
ar1l0u/vbox_mac_0_cp  written               18,2G                  -
ar1l0u/vbox_mac_0_cp  logicalused           20,2G                  -
ar1l0u/vbox_mac_0_cp  logicalreferenced     20,2G                  -

arcade@ar1l0u\/home/arcade# zfs get all ar1l0u/vbox_mac_1
NAME               PROPERTY              VALUE                    SOURCE
ar1l0u/vbox_mac_1  type                  volume                   -
ar1l0u/vbox_mac_1  creation              Wed Jun 26 18:02 2013    -
ar1l0u/vbox_mac_1  used                  5,48G                    -
ar1l0u/vbox_mac_1  available             13,0G                    -
ar1l0u/vbox_mac_1  referenced            18,0G                    -
ar1l0u/vbox_mac_1  compressratio         1.09x                    -
ar1l0u/vbox_mac_1  origin                ar1l0u/vbox_mac_0@xcode  -
ar1l0u/vbox_mac_1  reservation           none                     default
ar1l0u/vbox_mac_1  volsize               30G                      local
ar1l0u/vbox_mac_1  volblocksize          8K                       -
ar1l0u/vbox_mac_1  checksum              sha256                   inherited from ar1l0u
ar1l0u/vbox_mac_1  compression           lz4                      inherited from ar1l0u
ar1l0u/vbox_mac_1  readonly              off                      default
ar1l0u/vbox_mac_1  copies                1                        default
ar1l0u/vbox_mac_1  refreservation        none                     default
ar1l0u/vbox_mac_1  primarycache          metadata                 local
ar1l0u/vbox_mac_1  secondarycache        all                      default
ar1l0u/vbox_mac_1  usedbysnapshots       0                        -
ar1l0u/vbox_mac_1  usedbydataset         5,48G                    -
ar1l0u/vbox_mac_1  usedbychildren        0                        -
ar1l0u/vbox_mac_1  usedbyrefreservation  0                        -
ar1l0u/vbox_mac_1  logbias               throughput               local
ar1l0u/vbox_mac_1  dedup                 off                      default
ar1l0u/vbox_mac_1  mlslabel                                       -
ar1l0u/vbox_mac_1  sync                  standard                 default
ar1l0u/vbox_mac_1  refcompressratio      1.13x                    -
ar1l0u/vbox_mac_1  written               5,48G                    -
ar1l0u/vbox_mac_1  logicalused           5,98G                    -
ar1l0u/vbox_mac_1  logicalreferenced     20,2G                    -

The latter is actually a clone of another volume, so I'm not sure whether this is a result of working with snapshots or a real compression difference.

-- 
Sphinx of black quartz, judge my vow.
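
For readers comparing the two compressratio figures in the message above, here is a minimal sketch that recomputes them from the reported sizes. It assumes that compressratio is roughly logicalused divided by used, and that sizes are printed with a decimal comma and a binary-unit suffix as in the output shown; the helper names parse_size and ratio are hypothetical and not part of any ZFS tool.

```python
# Sketch only: recompute the compression ratios from the sizes reported by
# `zfs get all` in the message above. Assumption: ratio ~= logicalused / used.

UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert a size such as '18,2G' (decimal comma, unit suffix) to bytes."""
    number, suffix = text[:-1], text[-1]
    return float(number.replace(",", ".")) * UNITS[suffix]


def ratio(logical, physical):
    """Return logical size divided by physical size."""
    return parse_size(logical) / parse_size(physical)


# vbox_mac_0_cp uses lzjb: logicalused 20,2G, used 18,2G
lzjb = ratio("20,2G", "18,2G")
# vbox_mac_1 uses lz4 (and is a clone): logicalused 5,98G, used 5,48G
lz4 = ratio("5,98G", "5,48G")

print(f"lzjb: {lzjb:.2f}x")
print(f"lz4:  {lz4:.2f}x")
print(f"relative difference: {(lzjb / lz4 - 1) * 100:.1f}%")
```

Under these assumptions the recomputed figures agree with the reported 1.11x and 1.09x, and the gap between them is under 2%, in line with the "1% or 2%" mentioned earlier in the thread. Because vbox_mac_1 is a clone, its used and logicalused values only cover blocks written after the clone was taken, so the two ratios are not measured over the same data.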
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 16:50:21 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5C3965A7; Wed, 10 Jul 2013 16:50:21 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-qc0-x231.google.com (mail-qc0-x231.google.com [IPv6:2607:f8b0:400d:c01::231]) by mx1.freebsd.org (Postfix) with ESMTP id 01FA71D35; Wed, 10 Jul 2013 16:50:20 +0000 (UTC) Received: by mail-qc0-f177.google.com with SMTP id n1so3704244qcx.8 for ; Wed, 10 Jul 2013 09:50:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=AmCt1GzwaZCCpyOCYz8zhrSjI1VPhCiCainfFADeGy4=; b=VC88wQR6AF4PHIiVxA7JEWQPw5P+WXBMPKdfacboTdLVj5DxtuPSOVOrWLpya6n7OW Xah/bOjPBXkp3deY5zg7gBriAHfvrk/Da7PMRsstQikirQZseamM+8c61nHKIUeXKk0o 5Mgff9tKEpMZW7ukE9mYT3ZHVA6DjlhuT7jM3TsQFymSIlKIwDEhD0O9EuR0AnTf9gKh j8y65AFAezKXoonwOva8iGPyh11M5Ea22yVPkUAh1Co/F2t0GjSCmuEHsxXDjOzon6qK 9qtEv5PzDJhMxm7h812WeWzlFR9/MZG15LeHJu/8Q5xuVmzvmHqZVRqNINmX/Qb4hj3A aIeA== MIME-Version: 1.0 X-Received: by 10.224.127.73 with SMTP id f9mr28788036qas.4.1373475020543; Wed, 10 Jul 2013 09:50:20 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.224.195.72 with HTTP; Wed, 10 Jul 2013 09:50:19 -0700 (PDT) In-Reply-To: <51DCFEDA.1090901@FreeBSD.org> References: <51DCFEDA.1090901@FreeBSD.org> Date: Wed, 10 Jul 2013 09:50:19 -0700 X-Google-Sender-Auth: QMfOZsPHqNrttdI8Zn7jI9WzjJw Message-ID: Subject: Re: Deadlock in nullfs/zfs somewhere From: Adrian Chadd To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org, freebsd-current X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 
Jul 2013 16:50:21 -0000 On 9 July 2013 23:27, Andriy Gapon wrote: > on 09/07/2013 16:03 Adrian Chadd said the following: >> Does anyone have any ideas as to what's going on? > > Please provide output of 'thread apply all bt' from kgdb, then perhaps someone > might be able to tell. Done - http://people.freebsd.org/~adrian/ath/20130710-vm0-zfs-hang.txt adrian From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 17:21:09 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2077B847; Wed, 10 Jul 2013 17:21:09 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212]) by mx1.freebsd.org (Postfix) with ESMTP id 0C6201EF6; Wed, 10 Jul 2013 17:21:08 +0000 (UTC) Received: from zeta.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by anubis.delphij.net (Postfix) with ESMTPSA id 6BE4499BE; Wed, 10 Jul 2013 10:21:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=delphij.net; s=anubis; t=1373476867; bh=QWvdSJ8fOuO5/5aNWRb6P76tenHC3kB+m/UvBmfWclI=; h=Date:From:Reply-To:To:CC:Subject:References:In-Reply-To; b=kT7QZTcEcKXKd/5RzfQJW5lU3NogRHfy2Rzb06m9QqvqizHRMaBWxS604Vzh372ww QJth8LvJ0u13P7HRHcKwqChUG+qLJ5yIY80fRB5W3L90TPC4R0ocrX7aNkjflni6tu Bs+17tlBb5ffPo+6t/pREZzieHwHv/V9KFj7szkw= Message-ID: <51DD9801.4090808@delphij.net> Date: Wed, 10 Jul 2013 10:21:05 -0700 From: Xin Li Organization: The FreeBSD Project MIME-Version: 1.0 To: =?UTF-8?B?RGFnLUVybGluZyBTbcO4cmdyYXY=?= Subject: Re: Make ZFS use the physical sector size when computing initial ashift References: <86zjtupz3r.fsf@nine.des.no> In-Reply-To: <86zjtupz3r.fsf@nine.des.no> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, ivoras@freebsd.org 
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
Reply-To: d@delphij.net
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Jul 2013 17:21:09 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 07/10/13 02:02, Dag-Erling Smørgrav wrote:
> The attached patch causes ZFS to base the minimum transfer size for
> a new vdev on the GEOM provider's stripesize (physical sector size)
> rather than sectorsize (logical sector size), provided that
> stripesize is a power of two larger than sectorsize and smaller
> than or equal to VDEV_PAD_SIZE. This should eliminate the need for
> ivoras@'s gnop trick when creating ZFS pools on Advanced Format
> drives.

I think there are multiple versions of this (I also have one[1]), but the concern is that if a pool was created with ashift=9 and its devices are now detected as ashift=12, the pool becomes unimportable. So there needs to be a way to disable this behavior.

Another thing (not really related to the automatic detection) is that we need a way to manually override this setting from the command line when creating the pool; this is under active discussion on the Illumos mailing list right now.

[1] https://github.com/trueos/trueos/commit/3d2e3a38faad8df4acf442b055c5e98ab873fb26

Cheers,
- -- 
Xin LI https://www.delphij.net/
FreeBSD - The Power to Serve!
Live free or die -----BEGIN PGP SIGNATURE----- iQEcBAEBCgAGBQJR3ZgAAAoJEG80Jeu8UPuzM6kIALu3Ud4uu+kdcsp+zNS54iw6 Etx2xWOjbHhJ1PZ0BKJ4R5/BOfpW4b1DrarPtpZLxoyg55GwlEVCH8Cia9ucznfP KgFGwzztQlsiI5hcWD6RVNkAx/2o7sSynbprxxP1UdEdmH7f5MWVpNwjGE2KiIpA 0TxfTu8Sg0/QB7h3pGWt5sJSuwyogewvHIfTAgHEqnQdYPXxpadH7PS7shSJVdim z2C9GoyLVQ6BMxXzQDcmA+fllgMZVKXROG7SxDFNDTWPnZ9HMZp2OJKELLtuZB1y Iaq/gd3uPR2ZzPxw2OjdYKe7khWtmuU5Ox6+natsOKCqfoAfCjArA8zJZYsZoMI= =Nd1V -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 17:24:25 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 00E419D5 for ; Wed, 10 Jul 2013 17:24:24 +0000 (UTC) (envelope-from gvazz@yahoo.com) Received: from nm23-vm6.bullet.mail.ne1.yahoo.com (nm23-vm6.bullet.mail.ne1.yahoo.com [98.138.91.116]) by mx1.freebsd.org (Postfix) with ESMTP id A71691F22 for ; Wed, 10 Jul 2013 17:24:24 +0000 (UTC) Received: from [98.138.101.131] by nm23.bullet.mail.ne1.yahoo.com with NNFMP; 10 Jul 2013 17:24:18 -0000 Received: from [98.138.101.182] by tm19.bullet.mail.ne1.yahoo.com with NNFMP; 10 Jul 2013 17:24:18 -0000 Received: from [127.0.0.1] by omp1093.mail.ne1.yahoo.com with NNFMP; 10 Jul 2013 17:24:18 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 625724.14538.bm@omp1093.mail.ne1.yahoo.com Received: (qmail 14597 invoked by uid 60001); 10 Jul 2013 17:24:18 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1373477058; bh=+sj2AFxlFW+lxBOBfq9bmtctty32zwqoguL6R3XQzPQ=; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=0QERRWaFpbdELYQzXnTD54lOD9aMGr04fy2hgfhC0JNrZKplbJYuiWPPVCVQImwHVozfEyxbJTd1c9nupdX8U4fecWSEkt1++WaFz3T8TVoxrooI8boFeuk8iUH4uq9K02XMQnV+0p1/BLN7rk3sXLBDaaBPZYxK0h4RBOOD0pM= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; 
h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=GYGpuEubXnd42/Gb2Xj9zWDARfcMmmN8rtGKfSg3x/6fzjnv56Cz7b0N04K9vPr6xi8NfzC4BW3pqM8qARonNTr8W75Lnn0Mz5GQ19J5YQ2vwajoRF+eMEllZXg9QdVDfYRh0zmDKUa+fU1LLe8jXwejxpYfldY8cK077/S5Wag= ; X-YMail-OSG: Ld5w4ugVM1lisy7qkvYsGcIiC4aN0g8MNDVpb43e.gNEsW2 7DQT.ToNADEFTQwv9NJYE9s8vf3TGjEkPenx1AaDvQFna8tIdW12IuYB0wP4 k1opZZIu8jtzHcBysM6o4flYbK_ZNkzxEfXBw4loMvo_9KXLkE2Iu2kyrxek N14GhgK5ODurh9v0wmFvtOdAvrarYlSgfsDyKDT8PetGFm9V053FawwhSdsk aRQSwE8fa5yHO0cS83K2jebSmXRKZb3OWDa2B4H5aLua1lf5EiKmfuRuHNx7 J0sjX4G5SVAr3owiw.8gc_rTkyB.SlOT97B2JfeQ4OSIoELnMKiUmoAecXVB pk23wqhTO4ZwqzNtK4c13WgCbOfjwDj593apvjgYh5RqJ5OMGI41WfEUy._v JDOgWshbXpz__N2KyMB7zsBLKlJwMVbLt0Oh82UuHBoS4bSwonxxb1naJYw0 ok4f8WVHU_v.26mil6v9_gL3XUt3Z6qYr8XltLkuFCLMS4hGEm1sCfm1NwKT fMrRbw24TjE.y2JEil9k96wPv4qiZ4gAtMrjwRJdF5Ka7.dS9.O6opFVWqd8 wqCEiAYgU5cTLc3mqiRF0yQ-- Received: from [148.87.67.210] by web120501.mail.ne1.yahoo.com via HTTP; Wed, 10 Jul 2013 10:24:18 PDT X-Rocket-MIMEInfo: 002.001, wqAgSSB3b3VsZCBsaWtlIHRvIGtub3cgd2hhdCBpcyB0aGUgYnVmZmVyIHNpemUgcGFzc2VkIGJ5IHRoZSBjbGllbnQgYXMgYW4gYXJndW1lbnQgaW4gdGhlIHJlYWRkaXIvcmVhZGRpcnBsdXMKcmVxdWVzdC4gSXMgaXQgYmFzZWQgb24gZHRwcmVmIHNldHRpbmcgb24gdGhlIHNlcnZlcj8gT1IgaXMgaXTCoCBhIGZpeGVkIHZhbHVlLgoKwqBJIGFtIGNoYW5naW5nIGR0cHJlZiB2YWx1ZSBvbiB0aGUgU29sYXJpcyBhbmQgSSB3b3VsZCBsaWtlIHRvIGtub3cgaG93IGZyZWVic2QgY2xpZW50IHdvdWxkIHJlYWQBMAEBAQE- X-Mailer: YahooMailWebService/0.8.148.557 Message-ID: <1373477058.14313.YahooMailNeo@web120501.mail.ne1.yahoo.com> Date: Wed, 10 Jul 2013 10:24:18 -0700 (PDT) From: G V Subject: Nfs readdir/readdirplus buffer size To: "freebsd-fs@freebsd.org" MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: G V List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 17:24:25 -0000

I would like to know what buffer size the client passes as an argument
in the readdir/readdirplus request. Is it based on the dtpref setting
on the server, or is it a fixed value?

I am changing the dtpref value on Solaris and I would like to know how
the FreeBSD client would read that.

thanks,
Girish

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 17:38:58 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7B7BA101; Wed, 10 Jul 2013 17:38:58 +0000 (UTC) (envelope-from gibbs@FreeBSD.org) Received: from aslan.scsiguy.com (aslan.scsiguy.com [70.89.174.89]) by mx1.freebsd.org (Postfix) with ESMTP id 58F54101B; Wed, 10 Jul 2013 17:38:58 +0000 (UTC) Received: from [192.168.6.139] (207-225-98-3.dia.static.qwest.net [207.225.98.3]) (authenticated bits=0) by aslan.scsiguy.com (8.14.7/8.14.5) with ESMTP id r6AHcoCY097797 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Wed, 10 Jul 2013 17:38:50 GMT (envelope-from gibbs@FreeBSD.org) Content-Type: multipart/signed; boundary="Apple-Mail=_4D9E9496-59FD-423A-B74B-D55D497C0941"; protocol="application/pgp-signature"; micalg=pgp-sha1 Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\)) Subject: Re: Make ZFS use the physical sector size when computing initial ashift From: "Justin T.
Gibbs" In-Reply-To: <51DD9801.4090808@delphij.net> Date: Wed, 10 Jul 2013 11:38:45 -0600 Message-Id: <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> To: d@delphij.net X-Mailer: Apple Mail (2.1508) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (aslan.scsiguy.com [70.89.174.89]); Wed, 10 Jul 2013 17:38:50 +0000 (UTC) Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 17:38:58 -0000

--Apple-Mail=_4D9E9496-59FD-423A-B74B-D55D497C0941
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset=utf-8

On Jul 10, 2013, at 11:21 AM, Xin Li wrote:

> Signed PGP part
> On 07/10/13 02:02, Dag-Erling Smørgrav wrote:
> > The attached patch causes ZFS to base the minimum transfer size for
> > a new vdev on the GEOM provider's stripesize (physical sector size)
> > rather than sectorsize (logical sector size), provided that
> > stripesize is a power of two larger than sectorsize and smaller
> > than or equal to VDEV_PAD_SIZE. This should eliminate the need for
> > ivoras@'s gnop trick when creating ZFS pools on Advanced Format
> > drives.
>
> I think there are multiple versions of this (I also have one[1]) but
> the concern is that if one creates a pool with ashift=9, and now
> ashift=12, the pool gets unimportable. So there need a way to disable
> this behavior.
>
> Another thing (not really related to the automatic detection) is that
> we need a way to manually override this setting from command line when
> creating the pool, this is under active discussion at Illumos mailing
> list right now.
>
> [1]
> https://github.com/trueos/trueos/commit/3d2e3a38faad8df4acf442b055c5e98ab873fb26
>
> Cheers,
> - --
> Xin LI https://www.delphij.net/
> FreeBSD - The Power to Serve! Live free or die
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

I'm sure lots of folks have "some solution" to this. Here is an
old version of what we use at Spectra:

http://people.freebsd.org/~gibbs/zfs_patches/zfs_auto_ashift.diff

The above patch is missing some cleanup that was motivated by my
discussions with George Wilson about this change in April. I'll
dig that up later tonight. Even if you don't read the full diff,
please read the included checkin comment since it explains the
motivation behind this particular solution.

This is on my list of things to upstream in the next week or so after
I add logic to the userspace tools to report whether or not the
TLVs in a pool are using an optimal allocation size. This is only
possible if you actually make ZFS fully aware of logical, physical,
and the configured allocation size. All of the other patches I've seen
just treat physical as logical.
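
As a rough illustration of the three quantities named above (a sketch
only, not code from any of the patches in this thread): ashift is the
base-2 logarithm of an I/O size in bytes, so a 512-byte sector gives
ashift 9 and a 4096-byte sector gives ashift 12. The Python below
assumes that GEOM's sectorsize supplies the logical size and stripesize
the physical size, as described earlier in the thread. The function
names, the rule for ignoring an unusable stripesize, and the default
cap of 13 are assumptions made for the example.

```python
def ashift_for(size_bytes):
    """Return log2(size_bytes); size_bytes must be a power of two."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("%d is not a power of two" % size_bytes)
    return size_bytes.bit_length() - 1


def choose_ashift(sectorsize, stripesize, max_auto_ashift=13):
    """Return (logical, physical, configured) ashift values.

    Hypothetical policy: never go below the logical ashift, prefer the
    physical ashift, and cap the automatic choice at max_auto_ashift.
    """
    logical = ashift_for(sectorsize)
    physical = logical
    # Use stripesize only if it is a power of two larger than sectorsize.
    if stripesize > sectorsize and stripesize & (stripesize - 1) == 0:
        physical = ashift_for(stripesize)
    configured = max(logical, min(physical, max_auto_ashift))
    return logical, physical, configured


def is_optimal(physical, configured):
    """An allocation size is optimal when it covers the physical size."""
    return configured >= physical
```

For a drive with 512-byte logical and 4096-byte physical sectors,
choose_ashift(512, 4096) gives (9, 12, 12). A pool created earlier with
ashift 9 on that drive is still usable, because 9 is not below the
logical ashift, but is_optimal(12, 9) is False, which is the kind of
condition a reporting tool could flag.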
-- Justin --Apple-Mail=_4D9E9496-59FD-423A-B74B-D55D497C0941 Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=signature.asc Content-Type: application/pgp-signature; name=signature.asc Content-Description: Message signed with OpenPGP using GPGMail -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.19 (Darwin) iQEcBAEBAgAGBQJR3ZwlAAoJED9n8CuvaSf4Aj0H/AgxokI9bUkCTo2Krp0PG6qJ BLPugsux3zOTmOoaChH41M9xEiPRu7wlzc7aHNqZQC8MDpk1LTTI81sfJ9M5e1UH DwSCvfRTp5NIBC4sgXt/z9mMogvI3HU1cn2TQp4AfCoKprBBiSnOSPXfp1tujxr6 LZWB0vAAQOlviBS/c4upPn5/gN8VC5qkudu2cLnS+XVxq/udkttjHnLXxV87Lh8/ Dw+R5wAKlAGUMlXTmSc4mJmMxi5jsqxgQ7izNPOwZqZooETSNIOfT9E6Ppl4n+DW CZYHjorTFUCmXiXWCNAmUox00LJcYcrWZZA9sOaGj5FIQ5iMeUYkAbml8PaKQyU= =Znt+ -----END PGP SIGNATURE----- --Apple-Mail=_4D9E9496-59FD-423A-B74B-D55D497C0941-- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 18:05:48 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 952D9C13; Wed, 10 Jul 2013 18:05:48 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id CF4D01179; Wed, 10 Jul 2013 18:05:47 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004909991.msg; Wed, 10 Jul 2013 19:05:44 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 19:05:44 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: From: "Steven Hartland" To: , =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> Subject: Re: Make ZFS use the physical 
sector size when computing initial ashift Date: Wed, 10 Jul 2013 19:06:00 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="utf-8"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, ivoras@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 18:05:48 -0000

----- Original Message ----- From: "Xin Li"
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On 07/10/13 02:02, Dag-Erling Smørgrav wrote:
>> The attached patch causes ZFS to base the minimum transfer size for
>> a new vdev on the GEOM provider's stripesize (physical sector size)
>> rather than sectorsize (logical sector size), provided that
>> stripesize is a power of two larger than sectorsize and smaller
>> than or equal to VDEV_PAD_SIZE. This should eliminate the need for
>> ivoras@'s gnop trick when creating ZFS pools on Advanced Format
>> drives.
>
> I think there are multiple versions of this (I also have one[1]) but
> the concern is that if one creates a pool with ashift=9, and now
> ashift=12, the pool gets unimportable. So there need a way to disable
> this behavior.

I've tested my patch in all configurations I can think of, including
exported ashift=9 pools being imported, all with no issues.

For your example, e.g.:

# Create a 4K pool (min_create_ashift=4K, dev=512)
test:src> sysctl vfs.zfs.min_create_ashift
vfs.zfs.min_create_ashift: 12
test:src> mdconfig -a -t swap -s 128m -S 512 -u 0
test:src> zpool create mdpool md0
test:src> zdb mdpool | grep ashift
ashift: 12
ashift: 12

# Create a 512b pool (min_create_ashift=512, dev=512)
test:src> zpool destroy mdpool
test:src> sysctl vfs.zfs.min_create_ashift=9
vfs.zfs.min_create_ashift: 12 -> 9
test:src> zpool create mdpool md0
test:src> zdb mdpool | grep ashift
ashift: 9
ashift: 9

# Import a 512b pool (min_create_ashift=4K, dev=512)
test:src> zpool export mdpool
test:src> sysctl vfs.zfs.min_create_ashift=12
vfs.zfs.min_create_ashift: 9 -> 12
test:src> zpool import mdpool
test:src> zdb mdpool | grep ashift
ashift: 9
ashift: 9

# Create a 4K pool (min_create_ashift=512, dev=4K)
test:src> zpool destroy mdpool
test:src> mdconfig -d -u 0
test:src> mdconfig -a -t swap -s 128m -S 4096 -u 0
test:src> sysctl vfs.zfs.min_create_ashift=9
vfs.zfs.min_create_ashift: 12 -> 9
test:src> zpool create mdpool md0
test:src> zdb mdpool | grep ashift
ashift: 12
ashift: 12

# Import a 4K pool (min_create_ashift=4K, dev=4K)
test:src> zpool export mdpool
test:src> sysctl vfs.zfs.min_create_ashift=12
vfs.zfs.min_create_ashift: 9 -> 12
test:src> zpool import mdpool
test:src> zdb mdpool | grep ashift
ashift: 12
ashift: 12

> Another thing (not really related to the automatic detection) is that
> we need a way to manually override this setting from command line when
> creating the pool, this is under active discussion at Illumos mailing
> list right now.
>
> [1]
> https://github.com/trueos/trueos/commit/3d2e3a38faad8df4acf442b055c5e98ab873fb26

Yep has been on my list for a while, based on previous discussions on
zfs-devel@. I've not had any time recently but I'm following the illumos
thread to see what conclusions they come to.

Regards
Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd.
and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 18:13:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 29E1C31D; Wed, 10 Jul 2013 18:13:14 +0000 (UTC) (envelope-from delphij@delphij.net) Received: from anubis.delphij.net (anubis.delphij.net [64.62.153.212]) by mx1.freebsd.org (Postfix) with ESMTP id 12FF4120B; Wed, 10 Jul 2013 18:13:13 +0000 (UTC) Received: from zeta.ixsystems.com (drawbridge.ixsystems.com [206.40.55.65]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by anubis.delphij.net (Postfix) with ESMTPSA id 76ECB9DEA; Wed, 10 Jul 2013 11:13:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=delphij.net; s=anubis; t=1373479993; bh=FjIMYhZ22un7bkY8tLUyfXF6Kw82/IJxBZuDu/11fGg=; h=Date:From:Reply-To:To:CC:Subject:References:In-Reply-To; b=cI/vIr2mBorhCFAMuqfi4qLnifvdXjcGZNUtC8yUr40f6siCjnr0d5ICNur5B3ErI qWIrspOqjADKHiBsismW2/z0wb/kXm/EkZLNF+mU4y8UDSWUwuDeyqY3nVq5UpiH4L tJbID7e7paQLNlS9mByFjRD/QBC9vPt6/7mZim1o= Message-ID: <51DDA433.7040707@delphij.net> Date: Wed, 10 Jul 2013 11:13:07 -0700 From: Xin Li Organization: The FreeBSD Project MIME-Version: 1.0 To: "Justin T. 
Gibbs" Subject: Re: Make ZFS use the physical sector size when computing initial ashift References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> In-Reply-To: <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, =?UTF-8?B?RGFnLUVybGluZyBTbcO4cmdyYXY=?= , d@delphij.net, ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: d@delphij.net List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 18:13:14 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 07/10/13 10:38, Justin T. Gibbs wrote: [snip] > I'm sure lots of folks have "some solution" to this. Here is an > old version of what we use at Spectra: > > http://people.freebsd.org/~gibbs/zfs_patches/zfs_auto_ashift.diff > > The above patch is missing some cleanup that was motivated by my > discussions with George Wilson about this change in April. I'll > dig that up later tonight. Even if you don't read the full diff, > please read the included checkin comment since it explains the > motivation behind this particular solution. > > This is on my list of things to upstream in the next week or so > after I add logic to the userspace tools to report whether or not > the TLVs in a pool are using an optimal allocation size. This is > only possible if you actually make ZFS fully aware of logical, > physical, and the configured allocation size. All of the other > patches I've seen just treat physical as logical. Yes, me too. Your version is superior. Cheers, - -- Xin LI https://www.delphij.net/ FreeBSD - The Power to Serve! 
Live free or die -----BEGIN PGP SIGNATURE----- iQEcBAEBCgAGBQJR3aQzAAoJEG80Jeu8UPuzHn8H/1ZpoTqAQ4+mgQOttOwXgBcr 2Fgh52ztW8fCEQSeIosxXKO06hP7HxFfTPvmeeWyjT8zIpSUSFV6G0NclebKDncP huGFofvx3BKPRmfzZp4iZx1wWQUxSHTmv6ceDwvP7P8GJ0mON+SrZxmmwUjKrf7V W9Sazl0p8e0nxSQykLyjjrkaBx5Iv+aUxu8Alomwy9BmpM8+gd2yutvzghW5L36L 0CvAtIMXdlc+eUdAqa/2rOk/nMOA9sfWVW0gkKYCZk6wvj2DMzjii05UechZ4Z+l 6nEU3UdVsbTX73CABZv4my4JAWc5Yk1s/cWrxtn68AfK8LMPFJCJcVXXOSckMWI= =351W -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 19:06:14 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6BFEF3ED; Wed, 10 Jul 2013 19:06:14 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 8F3B41619; Wed, 10 Jul 2013 19:06:13 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004910770.msg; Wed, 10 Jul 2013 20:06:11 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 20:06:11 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <97E5A0A8DFBF4F75AAE8EDEFDF849EB0@multiplay.co.uk> From: "Steven Hartland" To: "Justin T. 
Gibbs" , References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> Subject: Re: Make ZFS use the physical sector size when computing initial ashift Date: Wed, 10 Jul 2013 20:06:26 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="UTF-8"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@freebsd.org, =?UTF-8?Q?Dag-Erling_Sm=C3=B8rgrav?= , ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 19:06:14 -0000

----- Original Message ----- From: "Justin T. Gibbs"
> I'm sure lots of folks have "some solution" to this. Here is an
> old version of what we use at Spectra:
>
> http://people.freebsd.org/~gibbs/zfs_patches/zfs_auto_ashift.diff
>
> The above patch is missing some cleanup that was motivated by my
> discussions with George Wilson about this change in April. I'll
> dig that up later tonight. Even if you don't read the full diff,
> please read the included checkin comment since it explains the
> motivation behind this particular solution.
>
> This is on my list of things to upstream in the next week or so after
> I add logic to the userspace tools to report whether or not the
> TLVs in a pool are using an optimal allocation size. This is only
> possible if you actually make ZFS fully aware of logical, physical,
> and the configured allocation size. All of the other patches I've seen
> just treat physical as logical.

Reading through your patch, it seems that your logical_ashift equates to
the current ashift value, which for GEOM devices is based on sectorsize,
and your physical_ashift is based on stripesize.

This is almost identical to the approach I used: adding a "desired ashift",
which equates to your physical_ashift, alongside the standard ashift,
i.e. the required aka logical_ashift value :)

One issue I did spot in your patch is that you currently expose
zfs_max_auto_ashift as a sysctl but don't clamp its value, which would
cause problems should a user configure values > 13.

If you're interested in the reason for this, it's explained in the comments
in my version, which does a very similar thing with validation.

Regards
Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 19:24:40 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B6BDD948; Wed, 10 Jul 2013 19:24:40 +0000 (UTC) (envelope-from gibbs@FreeBSD.org) Received: from aslan.scsiguy.com (mail.scsiguy.com [70.89.174.89]) by mx1.freebsd.org (Postfix) with ESMTP id 72B8916E5; Wed, 10 Jul 2013 19:24:39 +0000 (UTC) Received: from [192.168.6.139] (207-225-98-3.dia.static.qwest.net [207.225.98.3]) (authenticated bits=0) by aslan.scsiguy.com (8.14.7/8.14.5) with ESMTP id r6AJOWr5098398 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Wed, 10 Jul 2013 19:24:32 GMT (envelope-from gibbs@FreeBSD.org) Subject: Re: Make ZFS use the physical sector size when computing initial ashift Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\)) Content-Type: text/plain; charset=us-ascii From: "Justin T.
Gibbs" X-Priority: 3 In-Reply-To: <97E5A0A8DFBF4F75AAE8EDEFDF849EB0@multiplay.co.uk> Date: Wed, 10 Jul 2013 13:24:26 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <0A3A05F7-7859-4285-B15A-5E7DDB751062@FreeBSD.org> References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> <97E5A0A8DFBF4F75AAE8EDEFDF849EB0@multiplay.co.uk> To: "Steven Hartland" X-Mailer: Apple Mail (2.1508) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (aslan.scsiguy.com [70.89.174.89]); Wed, 10 Jul 2013 19:24:32 +0000 (UTC) Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , d@delphij.net, ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 19:24:40 -0000

On Jul 10, 2013, at 1:06 PM, "Steven Hartland" wrote:

> ----- Original Message ----- From: "Justin T. Gibbs"
>> I'm sure lots of folks have "some solution" to this. Here is an
>> old version of what we use at Spectra:
>> http://people.freebsd.org/~gibbs/zfs_patches/zfs_auto_ashift.diff
>> The above patch is missing some cleanup that was motivated by my
>> discussions with George Wilson about this change in April. I'll
>> dig that up later tonight. Even if you don't read the full diff,
>> please read the included checkin comment since it explains the
>> motivation behind this particular solution.
>>
>> This is on my list of things to upstream in the next week or so after
>> I add logic to the userspace tools to report whether or not the
>> TLVs in a pool are using an optimal allocation size. This is only
>> possible if you actually make ZFS fully aware of logical, physical,
>> and the configured allocation size. All of the other patches I've seen
>> just treat physical as logical.
>
> Reading through your patch it seems that your logical_ashift equates to
> the current ashift values which for geom devices is based off sectorsize
> and your physical_ashift is based stripesize.
>
> This is almost identical to the approach I used adding a "desired ashift",
> which equates to your physical_ashift, along side the standard ashift
> i.e. required aka logical_ashift value :)

Yes, the approaches are similar. Our current version records the logical
access size in the vdev structure too, which might relate to the issue
below.

> One issue I did spot in your patch is that you currently expose
> zfs_max_auto_ashift as a sysctl but don't clamp its value which would
> cause problems should a user configure values > 13.

I would expect the zio pipeline to simply insert an ashift aligned thunking
buffer for these operations, but I haven't tried going past an ashift of 13
in my tests. If it is an issue, it seems the restriction should be based on
logical access size, not optimal access size.
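
The thunking mentioned above amounts to widening a request so that it
starts and ends on a multiple of 2^ashift, doing the I/O into a bounce
buffer, and copying out the part that was actually asked for. The sketch
below shows only that range arithmetic, in Python. It is not taken from
the zio pipeline or from either patch, and aligned_range is a
hypothetical helper name used purely for illustration.

```python
def aligned_range(offset, size, ashift):
    """Widen (offset, size) to boundaries that are multiples of 2**ashift.

    Returns (aligned_offset, aligned_size, skip), where skip is how far
    into the aligned buffer the originally requested bytes begin.
    """
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    mask = (1 << ashift) - 1
    start = offset & ~mask                 # round the start down
    end = (offset + size + mask) & ~mask   # round the end up
    return start, end - start, offset - start
```

With ashift 14 (16 KiB), a 512-byte read at offset 8192 becomes a
16384-byte read at offset 0, with the wanted bytes starting 8192 bytes
into the buffer. A request that is already aligned comes back unchanged
with a skip of 0.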
-- Justin

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 19:41:50 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 200F2DF7; Wed, 10 Jul 2013 19:41:50 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 34B9A1796; Wed, 10 Jul 2013 19:41:48 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004911185.msg; Wed, 10 Jul 2013 20:41:47 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 20:41:47 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: <7BB4167807A4434A9CD5FB0F1600439F@multiplay.co.uk> From: "Steven Hartland" To: "Justin T.
Gibbs" References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> <97E5A0A8DFBF4F75AAE8EDEFDF849EB0@multiplay.co.uk> <0A3A05F7-7859-4285-B15A-5E7DDB751062@FreeBSD.org> Subject: Re: Make ZFS use the physical sector size when computing initial ashift Date: Wed, 10 Jul 2013 20:42:01 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , d@delphij.net, ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 19:41:50 -0000 ----- Original Message ----- From: "Justin T. Gibbs" > On Jul 10, 2013, at 1:06 PM, "Steven Hartland" wrote: >> ----- Original Message ----- From: "Justin T. Gibbs" >>> I'm sure lots of folks have "some solution" to this. Here is an >>> old version of what we use at Spectra: >>> http://people.freebsd.org/~gibbs/zfs_patches/zfs_auto_ashift.diff >>> The above patch is missing some cleanup that was motivated by my >>> discussions with George Wilson about this change in April. I'll >>> dig that up later tonight. Even if you don't read the full diff, >>> please read the included checkin comment since it explains the >>> motivation behind this particular solution. >>> >>> This is on my list of things to upstream in the next week or so after >>> I add logic to the userspace tools to report whether or not the >>> TLVs in a pool are using an optimal allocation size. This is only >>> possible if you actually make ZFS fully aware of logical, physical, >>> and the configured allocation size. 
All of the other patches I've seen
>>> just treat physical as logical.
>>
>> Reading through your patch it seems that your logical_ashift equates to
>> the current ashift values which for geom devices is based off sectorsize
>> and your physical_ashift is based stripesize.
>>
>> This is almost identical to the approach I used adding a "desired ashift",
>> which equates to your physical_ashift, along side the standard ashift
>> i.e. required aka logical_ashift value :)
>
> Yes, the approaches are similar. Our current version records the logical
> access size in the vdev structure too, which might relate to the issue
> below.
>
>> One issue I did spot in your patch is that you currently expose
>> zfs_max_auto_ashift as a sysctl but don't clamp its value which would
>> cause problems should a user configure values > 13.
>
> I would expect the zio pipeline to simply insert an ashift aligned thunking
> buffer for these operations, but I haven't tried going past an ashift of 13 in
> my tests. If it is an issue, it seems the restriction should be based on
> logical access size, not optimal access size.

Yes, with your methodology you'll only see the issue if zfs_max_auto_ashift
and physical_ashift are both > 13, but this can be the case, for example,
on a RAID controller with a large stripesize.

Looking back at my old patch, it too suffers from the same issue, along with
the current code base, but that would only happen if the logical sector size
resulted in an ashift > 13, which is going to be much less common ;-)

Regards
Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.
In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 19:50:49 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 74AF9353; Wed, 10 Jul 2013 19:50:49 +0000 (UTC) (envelope-from gibbs@FreeBSD.org) Received: from aslan.scsiguy.com (mail.scsiguy.com [70.89.174.89]) by mx1.freebsd.org (Postfix) with ESMTP id 3F9C21828; Wed, 10 Jul 2013 19:50:48 +0000 (UTC) Received: from [192.168.6.139] (207-225-98-3.dia.static.qwest.net [207.225.98.3]) (authenticated bits=0) by aslan.scsiguy.com (8.14.7/8.14.5) with ESMTP id r6AJoccA098537 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Wed, 10 Jul 2013 19:50:39 GMT (envelope-from gibbs@FreeBSD.org) Subject: Re: Make ZFS use the physical sector size when computing initial ashift Mime-Version: 1.0 (Mac OS X Mail 6.5 \(1508\)) Content-Type: text/plain; charset=iso-8859-1 From: "Justin T. 
Gibbs" X-Priority: 3 In-Reply-To: <7BB4167807A4434A9CD5FB0F1600439F@multiplay.co.uk> Date: Wed, 10 Jul 2013 13:50:33 -0600 Content-Transfer-Encoding: quoted-printable Message-Id: <00205B20-742F-44F6-B538-3B809D8BC03F@FreeBSD.org> References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> <97E5A0A8DFBF4F75AAE8EDEFDF849EB0@multiplay.co.uk> <0A3A05F7-7859-4285-B15A-5E7DDB751062@FreeBSD.org> <7BB4167807A4434A9CD5FB0F1600439F@multiplay.co.uk> To: "Steven Hartland" X-Mailer: Apple Mail (2.1508) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.4.3 (aslan.scsiguy.com [70.89.174.89]); Wed, 10 Jul 2013 19:50:39 +0000 (UTC) Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , d@delphij.net, ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 19:50:49 -0000

On Jul 10, 2013, at 1:42 PM, "Steven Hartland" wrote:
>
> ----- Original Message ----- From: "Justin T. Gibbs"
>> On Jul 10, 2013, at 1:06 PM, "Steven Hartland" wrote:
>>> ----- Original Message ----- From: "Justin T. Gibbs"
>>>> I'm sure lots of folks have "some solution" to this. Here is an
>>>> old version of what we use at Spectra:
>>>> http://people.freebsd.org/~gibbs/zfs_patches/zfs_auto_ashift.diff
>>>> The above patch is missing some cleanup that was motivated by my
>>>> discussions with George Wilson about this change in April. I'll
>>>> dig that up later tonight. Even if you don't read the full diff,
>>>> please read the included checkin comment since it explains the
>>>> motivation behind this particular solution.
>>>> This is on my list of things to upstream in the next week or so after
>>>> I add logic to the userspace tools to report whether or not the
>>>> TLVs in a pool are using an optimal allocation size. This is only
>>>> possible if you actually make ZFS fully aware of logical, physical,
>>>> and the configured allocation size. All of the other patches I've seen
>>>> just treat physical as logical.
>>> Reading through your patch it seems that your logical_ashift equates to
>>> the current ashift values which for geom devices is based off sectorsize
>>> and your physical_ashift is based stripesize.
>>> This is almost identical to the approach I used adding a "desired ashift",
>>> which equates to your physical_ashift, along side the standard ashift
>>> i.e. required aka logical_ashift value :)
>>
>> Yes, the approaches are similar. Our current version records the logical
>> access size in the vdev structure too, which might relate to the issue
>> below.
>>
>> > One issue I did spot in your patch is that you currently expose
>> > zfs_max_auto_ashift as a sysctl but don't clamp its value which would
>> > cause problems should a user configure values > 13.
>>
>> I would expect the zio pipeline to simply insert an ashift aligned thunking
>> buffer for these operations, but I haven't tried going past an ashift of 13 in
>> my tests. If it is an issue, it seems the restriction should be based on
>> logical access size, not optimal access size.
>
> Yes with your methodology you'll only see the issue if zfs_max_auto_ashift
> and physical_ashift are both > 13, but this can be the case for example
> on a RAID controller with large stripsize.

I'm not sure I follow. logical_ashift is available in our latest code, as is
the physical_ashift. But even without the logical_ashift, why doesn't the zio
pipeline properly thunk zio_phys_read() access based on the configured ashift?
-- Justin From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 20:37:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EF3286B7; Wed, 10 Jul 2013 20:37:54 +0000 (UTC) (envelope-from prvs=1903808b5b=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 1FA4D1A84; Wed, 10 Jul 2013 20:37:53 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004911799.msg; Wed, 10 Jul 2013 21:37:51 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 10 Jul 2013 21:37:51 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1903808b5b=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk Message-ID: From: "Steven Hartland" To: "Justin T. 
Gibbs" References: <86zjtupz3r.fsf@nine.des.no> <51DD9801.4090808@delphij.net> <2B9367B6-8759-45C9-B120-9D00A381228F@FreeBSD.org> <97E5A0A8DFBF4F75AAE8EDEFDF849EB0@multiplay.co.uk> <0A3A05F7-7859-4285-B15A-5E7DDB751062@FreeBSD.org> <7BB4167807A4434A9CD5FB0F1600439F@multiplay.co.uk> <00205B20-742F-44F6-B538-3B809D8BC03F@FreeBSD.org> Subject: Re: Make ZFS use the physical sector size when computing initial ashift Date: Wed, 10 Jul 2013 21:38:07 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Cc: freebsd-fs@freebsd.org, =?iso-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , d@delphij.net, ivoras@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 20:37:55 -0000 ----- Original Message ----- From: "Justin T. Gibbs" ... >>> > One issue I did spot in your patch is that you currently expose >>> > zfs_max_auto_ashift as a sysctl but don't clamp its value which would >>> > cause problems should a user configure values > 13. >>> >>> I would expect the zio pipeline to simply insert an ashift aligned thunking >>> buffer for these operations, but I haven't tried going past an ashift of 13 in >>> my tests. If it is an issue, it seems the restriction should be based on >>> logical access size, not optimal access size. >> >> Yes with your methodology you'll only see the issue if zfs_max_auto_ashift >> and physical_ashift are both > 13, but this can be the case for example >> on a RAID controller with large stripsize. > > I'm not sure I follow. logical_ashift is available in our latest code, as is the > physical_ashift. 
But even without the logical_ashift, why doesn't the zio
> pipeline properly thunk zio_phys_read() access based on the configured ashift?

When I looked at it, which was a long time ago now, so please excuse me if
I'm a little rusty on the details, zio_phys_read() was working more by luck
than judgement, as the offsets passed in were calculated from a valid start +
increment based on the size of a structure within vdev_label_offset(), with no
ashift logic applied that I could find.

The result was that pools created with large ashifts were unstable when I tested.

Regards
Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.
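
For readers following the ashift discussion above, here is a minimal sketch of the clamping idea, not code from either patch. It assumes ashift is the base-2 logarithm of an access size, that the logical size is a hard requirement while the physical size is only a preference, and that 13 is the upper limit mentioned in the thread. The names size_to_ashift, choose_ashift and SKETCH_MAX_ASHIFT are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical limit: the thread treats ashift values above 13 as unsafe. */
#define SKETCH_MAX_ASHIFT 13

/* Return the smallest shift such that (1 << shift) >= size. */
static unsigned
size_to_ashift(uint64_t size)
{
	unsigned shift = 0;

	while (shift < 63 && ((uint64_t)1 << shift) < size)
		shift++;
	return (shift);
}

/*
 * Pick an allocation ashift from the logical (required) and physical
 * (preferred) access sizes. max_auto stands in for the user tunable and is
 * clamped first, so an oversized setting cannot raise the preferred value
 * above SKETCH_MAX_ASHIFT.
 */
static unsigned
choose_ashift(uint64_t logical_size, uint64_t physical_size, unsigned max_auto)
{
	unsigned logical = size_to_ashift(logical_size);
	unsigned physical = size_to_ashift(physical_size);
	unsigned ashift;

	if (max_auto > SKETCH_MAX_ASHIFT)
		max_auto = SKETCH_MAX_ASHIFT;

	/* Prefer the physical size, but stay within the clamped tunable... */
	ashift = physical < max_auto ? physical : max_auto;
	/* ...and never go below what the device requires. */
	if (ashift < logical)
		ashift = logical;
	return (ashift);
}
```

With these assumptions, a 512-byte logical / 4096-byte physical device yields 12, and a 1 MB stripe size with an oversized tunable is held at 13. A logical size above 8 KB would still produce a value above 13, which is the rarer case both posters mention; the sketch does not try to reject it.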
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 10 20:44:08 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3CC1F923; Wed, 10 Jul 2013 20:44:08 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 15B391AD1; Wed, 10 Jul 2013 20:44:08 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r6AKi7vi012463; Wed, 10 Jul 2013 20:44:07 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r6AKi7Bh012462; Wed, 10 Jul 2013 20:44:07 GMT (envelope-from linimon) Date: Wed, 10 Jul 2013 20:44:07 GMT Message-Id: <201307102044.r6AKi7Bh012462@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Subject: Re: kern/180438: [smbfs] [patch] mount_smbfs fails on arm because of wrong endianess assumption in libsmb X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Jul 2013 20:44:08 -0000 Old Synopsis: [patch] mount_smbfs fails on arm because of wrong endianess assumption in libsmb New Synopsis: [smbfs] [patch] mount_smbfs fails on arm because of wrong endianess assumption in libsmb Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Jul 10 20:43:40 UTC 2013 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=180438 From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 04:50:01 2013 Return-Path: Delivered-To: freebsd-fs@smarthost.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5143585E for ; Thu, 11 Jul 2013 04:50:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 44C2E1FFB for ; Thu, 11 Jul 2013 04:50:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r6B4o1G9018137 for ; Thu, 11 Jul 2013 04:50:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r6B4o1kI018136; Thu, 11 Jul 2013 04:50:01 GMT (envelope-from gnats) Date: Thu, 11 Jul 2013 04:50:01 GMT Message-Id: <201307110450.r6B4o1kI018136@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org Cc: From: dfilter@FreeBSD.ORG (dfilter service) Subject: Re: kern/180236: commit references a PR X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: dfilter service List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 04:50:01 -0000 The following reply was made to PR kern/180236; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/180236: commit references a PR Date: Thu, 11 Jul 2013 04:48:02 +0000 (UTC) Author: kib Date: Thu Jul 11 04:47:44 2013 New Revision: 253183 URL: http://svnweb.freebsd.org/changeset/base/253183 Log: MFC r252714: The tvp vnode on rename is usually unlinked. Drop the cached null vnode for tvp to allow the free of the lower vnode, if needed. 
PR: kern/180236 Modified: stable/9/sys/fs/nullfs/null_vnops.c Directory Properties: stable/9/sys/ (props changed) stable/9/sys/fs/ (props changed) Modified: stable/9/sys/fs/nullfs/null_vnops.c ============================================================================== --- stable/9/sys/fs/nullfs/null_vnops.c Thu Jul 11 03:57:53 2013 (r253182) +++ stable/9/sys/fs/nullfs/null_vnops.c Thu Jul 11 04:47:44 2013 (r253183) @@ -554,6 +554,7 @@ null_rename(struct vop_rename_args *ap) struct vnode *fvp = ap->a_fvp; struct vnode *fdvp = ap->a_fdvp; struct vnode *tvp = ap->a_tvp; + struct null_node *tnn; /* Check for cross-device rename. */ if ((fvp->v_mount != tdvp->v_mount) || @@ -568,7 +569,11 @@ null_rename(struct vop_rename_args *ap) vrele(fvp); return (EXDEV); } - + + if (tvp != NULL) { + tnn = VTONULL(tvp); + tnn->null_flags |= NULLV_DROP; + } return (null_bypass((struct vop_generic_args *)ap)); } _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 10:24:37 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0FD4C9B8; Thu, 11 Jul 2013 10:24:37 +0000 (UTC) (envelope-from rmh.aybabtu@gmail.com) Received: from mail-qa0-x22b.google.com (mail-qa0-x22b.google.com [IPv6:2607:f8b0:400d:c00::22b]) by mx1.freebsd.org (Postfix) with ESMTP id B90241051; Thu, 11 Jul 2013 10:24:36 +0000 (UTC) Received: by mail-qa0-f43.google.com with SMTP id d13so7556708qak.16 for ; Thu, 11 Jul 2013 03:24:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=NgTYgBDZrtLWQLbn99bdptbOPhhgXey94WKDY2dIf94=; 
b=OytQnyDklaSBKwVfwBgWmWIgMc6HtvwBOP+UGQl3WYljzu3X16SsOe+DrUwhxczyAP z9Axjfn9JToPH7aK/+J12WpCd2krVI5WRtFi0GQAl+R9Wj4ZUa4mtr1z71IXDYOay5ZU AbMkmChvszzYNh4Mb5ljD+4WDY66jQlCPo+Lzgv4XslfUvLzPjLCigp8xMD6nu3bHN8p Sb2cMOckXEI4uPvqnWpBu9nfY1I/8sVHBu70642F+USPK7RatDkk3uLKEM6IsWUFgMKi w41zoKDu7/BlNZ4CTu65PG9qfyAiZVXUQqxdK/Hl9XoPKtMPCD8beb8shcyQcFXTfczm gm6g== MIME-Version: 1.0 X-Received: by 10.49.58.70 with SMTP id o6mr28989489qeq.1.1373538276295; Thu, 11 Jul 2013 03:24:36 -0700 (PDT) Sender: rmh.aybabtu@gmail.com Received: by 10.49.26.193 with HTTP; Thu, 11 Jul 2013 03:24:36 -0700 (PDT) In-Reply-To: <20130710131150.GA74301@jh> References: <20130710131150.GA74301@jh> Date: Thu, 11 Jul 2013 12:24:36 +0200 X-Google-Sender-Auth: HMU1-rsvBXSWUOs3Q_CYxnMwg0I Message-ID: Subject: Re: Compatibility options for mount(8) From: Robert Millan To: Jaakko Heinonen Content-Type: multipart/mixed; boundary=047d7b2e52900b496a04e139cd82 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 10:24:37 -0000 --047d7b2e52900b496a04e139cd82 Content-Type: text/plain; charset=UTF-8 2013/7/10 Jaakko Heinonen : > I am not sure if mount(8) is the right place for the translation. This > seems to be the first string option translated by mount(8). The "rdonly" > compatibility option is translated to "ro" in kernel. Looks inconsistent > to me. Makes sense... I can look this part up later. For now, how about only adding -n? Is everyone fine with that? See attachment. 
--047d7b2e52900b496a04e139cd82 Content-Type: application/octet-stream; name="mount_n.diff" Content-Disposition: attachment; filename="mount_n.diff" Content-Transfer-Encoding: base64 X-Attachment-Id: f_hizt7muo0 SW5kZXg6IHNiaW4vbW91bnQvbW91bnQuOAo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09Ci0tLSBzYmluL21vdW50L21vdW50 LjgJKHJldmlzaW9uIDI1MjgyNCkKKysrIHNiaW4vbW91bnQvbW91bnQuOAkod29ya2luZyBjb3B5 KQpAQCAtMTE4LDYgKzExOCw5IEBACiAuRmwgYQogb3B0aW9uLCBhbHNvIG1vdW50IHRob3NlIGZp bGUgc3lzdGVtcyB3aGljaCBhcmUgbWFya2VkIGFzCiAuRHEgTGkgbGF0ZSAuCisuSXQgRmwgbgor Rm9yIGNvbXBhdGliaWxpdHkgd2l0aCBzb21lIG90aGVyIGltcGxlbWVudGF0aW9uczsgdGhpcyBm bGFnIGlzCitjdXJyZW50bHkgYSBuby1vcC4KIC5JdCBGbCBvCiBPcHRpb25zIGFyZSBzcGVjaWZp ZWQgd2l0aCBhCiAuRmwgbwpJbmRleDogc2Jpbi9tb3VudC9tb3VudC5jCj09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0t IHNiaW4vbW91bnQvbW91bnQuYwkocmV2aXNpb24gMjUyODI0KQorKysgc2Jpbi9tb3VudC9tb3Vu dC5jCSh3b3JraW5nIGNvcHkpCkBAIC0yNTMsNyArMjUzLDcgQEAKIAlvcHRpb25zID0gTlVMTDsK IAl2ZnNsaXN0ID0gTlVMTDsKIAl2ZnN0eXBlID0gInVmcyI7Ci0Jd2hpbGUgKChjaCA9IGdldG9w dChhcmdjLCBhcmd2LCAiYWRGOmZMbG86cHJ0OnV2dyIpKSAhPSAtMSkKKwl3aGlsZSAoKGNoID0g Z2V0b3B0KGFyZ2MsIGFyZ3YsICJhZEY6Zkxsbm86cHJ0OnV2dyIpKSAhPSAtMSkKIAkJc3dpdGNo IChjaCkgewogCQljYXNlICdhJzoKIAkJCWFsbCA9IDE7CkBAIC0yNzQsNiArMjc0LDkgQEAKIAkJ Y2FzZSAnbCc6CiAJCQlsYXRlID0gMTsKIAkJCWJyZWFrOworCQljYXNlICduJzoKKwkJCS8qIEZv ciBjb21wYXRpYmlsaXR5IHdpdGggdGhlIExpbnV4IHZlcnNpb24gb2YgbW91bnQuICovCisJCQli cmVhazsKIAkJY2FzZSAnbyc6CiAJCQlpZiAoKm9wdGFyZykgewogCQkJCW9wdGlvbnMgPSBjYXRv cHQob3B0aW9ucywgb3B0YXJnKTsK --047d7b2e52900b496a04e139cd82-- From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 11:36:13 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E6D2C6FB for ; Thu, 11 Jul 2013 11:36:13 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from 
esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id A4A43164D for ; Thu, 11 Jul 2013 11:36:12 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=ME3lrcP4jFDzpPiCSQywCMKJiHtpRWeRXBDIYmR1BZg= c=1 sm=2 a=2CN1efILQXEA:10 a=FKkrIqjQGGEA:10 a=kQLhFrLAn7oA:10 a=IkcTkHD0fZMA:10 a=6I5d2MoRAAAA:8 a=wkfKJPNXhD8R7PrmbvYA:9 a=QEXdDO2ut3YA:10 a=SV7veod9ZcQA:10 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqMEAMOX3lGDaFve/2dsb2JhbABagztNgwa+SYEcdIIjAQEBAwEBAQEgBCcgCwUWGAICDRkCKQEJJgYIBwQBHASHaAYMpXiRPoEmjQl+NAeCVoEfA5UTg3CQIYMtIDKBAzc X-IronPort-AV: E=Sophos;i="4.87,1043,1363147200"; d="scan'208";a="39282997" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 11 Jul 2013 07:36:11 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id CC56A79204; Thu, 11 Jul 2013 07:36:11 -0400 (EDT) Date: Thu, 11 Jul 2013 07:36:11 -0400 (EDT) From: Rick Macklem To: G V Message-ID: <1247906018.4407.1373542571821.JavaMail.root@uoguelph.ca> In-Reply-To: <1373477058.14313.YahooMailNeo@web120501.mail.ne1.yahoo.com> Subject: Re: Nfs readdir/readdirplus buffer size MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 11:36:14 -0000 Girish wrote: > =C2=A0 I would like to know what is the buffer size passed by the client > =C2=A0 as an argument in the readdir/readdirplus > request. Is it based on dtpref setting on the server? OR is it=C2=A0 a > fixed value. 
>
> I am changing dtpref value on the Solaris and I would like to know
> how freebsd client would read that.
>
I'm afraid I don't have time to look at the code right now, so this
might not be 100% accurate (and might be different for the old client):
- the size readdir uses is
  - capped at MAXBSIZE (64K)
  - capped at the min(rsize,readdirsize) if specified as mount arguments
  - capped at dtmax as specified by the server
--> I can't remember how dtpref is used. You'll need to look at the sources
    (or just try it and look at a packet trace in wireshark) to figure
    that one out.

rick

> thanks,
> Girish
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 17:22:54 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A86B1343; Thu, 11 Jul 2013 17:22:54 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452]) by mx1.freebsd.org (Postfix) with ESMTP id 8AA591B46; Thu, 11 Jul 2013 17:22:54 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r6BHMohd099772; Thu, 11 Jul 2013 10:22:50 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201307111722.r6BHMohd099772@chez.mckusick.com> To: Robert Millan Subject: Re: Compatibility options for mount(8) In-reply-to: Date: Thu, 11 Jul 2013 10:22:50 -0700 From: Kirk McKusick Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 17:22:54 -0000

> Date: Thu,
11 Jul 2013 12:24:36 +0200 > Subject: Re: Compatibility options for mount(8) > From: Robert Millan > To: Jaakko Heinonen > Cc: freebsd-fs@freebsd.org > > 2013/7/10 Jaakko Heinonen : >> I am not sure if mount(8) is the right place for the translation. This >> seems to be the first string option translated by mount(8). The "rdonly" >> compatibility option is translated to "ro" in kernel. Looks inconsistent >> to me. > > Makes sense... > > I can look this part up later. For now, how about only adding -n? Is > everyone fine with that? > > See attachment. I am fine with your proposed addition. I would favor changing the manual page from +For compatibility with some other implementations; this flag is to +For compatibility with some Linux implementations; this flag is as it is (primarily) Linux compatibility and also reflects the comment that you have added in the code. Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 21:08:01 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 39831893; Thu, 11 Jul 2013 21:08:01 +0000 (UTC) (envelope-from artem.naluzhnyy@gmail.com) Received: from mail-we0-x22b.google.com (mail-we0-x22b.google.com [IPv6:2a00:1450:400c:c03::22b]) by mx1.freebsd.org (Postfix) with ESMTP id 7A409162D; Thu, 11 Jul 2013 21:08:00 +0000 (UTC) Received: by mail-we0-f171.google.com with SMTP id m46so7403294wev.2 for ; Thu, 11 Jul 2013 14:07:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=erX7BSvS/+JGXgu0ZR797YljNpENpjJTIq6rAuTDR6M=; b=U6sx+MLMZOPGLY4t5scTLwORo9g/QhB2xsyUD1SYktg9fhFMQjXjNnyFAhvIx2rv2o WXtlKFLD6Vs6XM20zV2eeQUB+P4K0rfufGxp/j4qcY924gQbvd+xdRUk1ceZm6M0KZuP M/fuZ3RGvh2CnVu9IFeckC7OmKwxjPswRaNAV6E5x1Hg6Lxs8fWjLoonLYgSiTpMmZze 
951+GtzB0mU37nlAT22bESFbzydkSSw6yCThthJOHgDdmK9fwLNBgG8boib8ZX4PTfW/ C8d+X7Lwt+474pBHQ/AN8xGwrvePLZT9OAW2VsmvmEAChsKlmPSR1quax++0qHCMiLLa eJQQ== X-Received: by 10.181.11.227 with SMTP id el3mr16669415wid.31.1373576879571; Thu, 11 Jul 2013 14:07:59 -0700 (PDT) MIME-Version: 1.0 Sender: artem.naluzhnyy@gmail.com Received: by 10.216.203.68 with HTTP; Thu, 11 Jul 2013 14:07:19 -0700 (PDT) In-Reply-To: References: From: Artem Naluzhnyy Date: Fri, 12 Jul 2013 00:07:19 +0300 X-Google-Sender-Auth: Wv2vzzctBS3XMN8UzLrHMHLKl7Y Message-ID: Subject: Re: RAID10 stripe size and PostgreSQL performance To: Ivan Voras Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, freebsd-database@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 21:08:01 -0000 On Mon, Jul 8, 2013 at 6:16 PM, Ivan Voras wrote: > On 08/07/2013 14:40, Artem Naluzhnyy wrote: >> Is this expected behavior with more than twice higher pgbench tps on >> 1MB stripe size? > > No, it is not. > > For start, can you please repeat your benchmarks but with restarting the > PostgreSQL server between each pgbench run? Fresh OS installation without DB warning, reboot after pgbench DB initialization (DB size: 26 GB) before benchmarking: * 32 KB (half of the UFS bsize) - tps=198 * 64 KB - tps=226 * 128 KB (default for the RAID controller) - tps=298 * 1 MB (max for the RAID controller) - tps=347 > Also, you should make sure that the database is located on the same > location on the disk platters by e.g. creating a small partition which > is about 150% larger than your pgbench database (and your pgbench > database should be at least 2x larger than your RAM, if you are going to > benchmark IO and not memory caches), which is located at the same > position (byte offset) in your RAID10 volume. 
Unfortunately it's not that easy to set up custom partitioning. However,
all tests were done just after reinstalling the server, using exactly the
same sequence of commands.

The server has 24 GB RAM, so with an 88 GB DB we have:
* 32 KB stripe - tps=161
* 1 MB stripe - tps=258

The server is used for VoIP billing, and there is also a lot of plain-text
log file writing. Would it still be better to use a 1 MB stripe size, or
might that have side effects on performance?

--
Artem Naluzhnyy

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 22:17:02 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2A646BE1 for ; Thu, 11 Jul 2013 22:17:02 +0000 (UTC) (envelope-from david.i.noel@gmail.com) Received: from mail-wi0-x229.google.com (mail-wi0-x229.google.com [IPv6:2a00:1450:400c:c05::229]) by mx1.freebsd.org (Postfix) with ESMTP id BCB801926 for ; Thu, 11 Jul 2013 22:17:01 +0000 (UTC) Received: by mail-wi0-f169.google.com with SMTP id c10so61249wiw.0 for ; Thu, 11 Jul 2013 15:17:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:reply-to:in-reply-to:references:date:message-id :subject:from:to:content-type; bh=FRi6Cp/85PUmIpzJ+scbZo02guhPVgU7IatWFUEAKH4=; b=rMKCtkqXvTV5j80giqz1oKa/2zMUqL2Ct5NVJVmYFwe57YnnrRXU9e00zn7fQhVEFr zuOZkfQ1uiv7dBsRLOcWXXCm5PJ7FE9ElpX+KFMsRnec0rsxyVSkgL2nmbl6DoFRifZ8 99xm/5uAz/Dk21NDtoCwYOXs6a1GoY4q68gsdlx8eISf4SRhomMvKsPZjK6w2sL1lqYF R2o5rFyuBa/SDyoMv/0EtgTAjpv9ZdJZI/GwnSyoVAF9TrtB/CXgXzWOj8sIdLyxj8BV WGLrz2AvrD0ej7K9xIVQ/mvENrkN3qgJMlc3GyWWJgNISs2esVdLjFGqG7JUXeqMqKQL kAiA== MIME-Version: 1.0 X-Received: by 10.180.160.203 with SMTP id xm11mr20832836wib.58.1373581020901; Thu, 11 Jul 2013 15:17:00 -0700 (PDT) Received: by 10.216.180.138 with HTTP; Thu, 11 Jul 2013 15:17:00 -0700 (PDT) In-Reply-To: References: Date: Thu, 11 Jul 2013 17:17:00 -0500 Message-ID: Subject: FreeBSD upgrade woes
(8.3 -> 8.4) From: David Noel To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: David.I.Noel@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 22:17:02 -0000

I've been directed to the freebsd-fs list, so hopefully I'm in the
right place for this question.

I have 4 servers I'm upgrading from 8.3 to 8.4. Two of them went
without a hitch, two of them blew up in my face. The only difference
between the two is the ones that worked have a 2-disk ZFS mirror and
the ones that didn't have a 4-disk ZFS striped mirror configuration
(RAID10). They both use the GPT.

After installworld && installkernel they made it through boot, but
right before the login prompt I'm getting a panic and stack dump. The
backtrace looks something like this (roughly):

0 kdb_backtrace
1 panic
2 trap_fatal
3 trap_pfault
4 trap
5 calltrap
6 vdev_mirror_child_select
7 vdev_mirror_io_start
8 zio_vdev_io_start
9 zio_execute
10 arc_read
11 dbuf_read
12 dbuf_findbp
13 dbuf_hold_impl
14 dbuf_hold
15 dnode_hold_impl
16 dmu_buf_hold
17 zap_lockdir

Does anyone have any idea what went wrong? Does anyone have any
suggestions on how to get past this? Is there any more information I
could provide to help debug this?
Thanks, David From owner-freebsd-fs@FreeBSD.ORG Thu Jul 11 23:30:53 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1A0CAB7D; Thu, 11 Jul 2013 23:30:53 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id BF79F1D86; Thu, 11 Jul 2013 23:30:52 +0000 (UTC) X-Cloudmark-SP-Filtered: true X-Cloudmark-SP-Result: v=1.1 cv=u+Bwc9JL7tMNtl/i9xObSTPSFclN5AOtXcIZY5dPsHA= c=1 sm=2 a=2CN1efILQXEA:10 a=FKkrIqjQGGEA:10 a=l0nrKk16v60A:10 a=IkcTkHD0fZMA:10 a=6I5d2MoRAAAA:8 a=tG8P3wK5P364QUJtx80A:9 a=QEXdDO2ut3YA:10 a=SV7veod9ZcQA:10 a=HCYscHdTzkEnlTMq:21 a=Hg57A2LVB9d5kEav:21 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqQEAIs/31GDaFve/2dsb2JhbABaFoMkT4MGvlCBHXSCIwEBAQMBAQEBICsgCxsYAgINGQIpAQkmBggHBAEcAQOHaAYMpiaRO4Emi2qBDxB+NAeCVoEfA5UVg3GIeYcrgViBVSAygQM3 X-IronPort-AV: E=Sophos;i="4.89,648,1367985600"; d="scan'208";a="40006670" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 11 Jul 2013 19:30:45 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 51E5F79204; Thu, 11 Jul 2013 19:30:45 -0400 (EDT) Date: Thu, 11 Jul 2013 19:30:45 -0400 (EDT) From: Rick Macklem To: Bryan Drewery Message-ID: <672055679.467398.1373585445319.JavaMail.root@uoguelph.ca> In-Reply-To: <51DD3C1F.1000609@shatow.net> Subject: Re: NFS panic: newnfs_copycred: negative nfsc_ngroups (client HEAD r253033, server 9.1-R) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs@FreeBSD.org, FreeBSD Current X-BeenThere: freebsd-fs@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 11 Jul 2013 23:30:53 -0000

Bryan Drewery wrote:
> I received this panic on the client while doing heavy parallel
> reads/writes over NFS. I only recently moved these files to NFS, so I
> don't know whether or not it's a recent regression.
>
> Client: HEAD r253033
> Server: 9.1-R
>
> core.txt: http://people.freebsd.org/~bdrewery/nfs.txt
>
> fstab of related paths:
>
> > tank:/tank/distfiles/freebsd /mnt/distfiles nfs rw,bg,noatime,intr,rsize=65536,wsize=65536,readahead=8,nfsv4 0 0
> > tank:/usr/packages/ /mnt/all-packages nfs rw,bg,noatime,soft,retrycnt=3,rsize=65536,wsize=65536,readahead=8,nfsv4 0 0

The mount options "soft" and "intr" should never be used for NFSv4.
If an RPC fails with ETIMEDOUT or EINTR it can leave the open state
in an undefined state. If you still get one of these crashes with
all hard mounts, email again, since that would imply a client bug.
(This is documented in the BUGS section of mount_nfs(8), but not very well. ;-)

I'm not sure if this undefined open state could cause the crash, but it
seems plausible, since the crash indicates garbage for the credentials in
the open state structure.
rick

>
> Server: params on these paths: -maproot=root -network 10.10.0.0/16
>
> tcpdump at the time:
>
> > 21:43:05.396585 IP 10.10.0.7.4180315003 > 10.10.0.5.2049: 168 getattr fh 0,4/2
> > 21:43:05.396589 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48265029:48266477, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396603 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48266477:48267925, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396605 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48266477, win 3916, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> > 21:43:05.396608 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48267925:48269373, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396621 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48269373:48270821, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396624 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48269373, win 3870, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> > 21:43:05.396641 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48270821:48272269, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396653 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48272269:48273717, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396656 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48272269, win 3825, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> > 21:43:05.396659 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48273717:48275165, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396671 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48275165:48276613, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396674 IP 10.10.0.7.946 > 10.10.0.5.2049: Flags [.], ack 48275165, win 3780, options [nop,nop,TS val 596674 ecr 1950216660], length 0
> > 21:43:05.396676 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48276613:48278061, ack 4394885, win 29124, options [nop,nop,TS val 1950216660 ecr 596674], length 1448
> > 21:43:05.396689 IP 10.10.0.5.2049 > 10.10.0.7.946: Flags [.], seq 48278061:48279509, ack 4394885, win 29124, options [nop,nop,TS val
> > Write failed: Broken pipe
>
> I have nfsuserd running on both client/server. nfscbd is running.
> nfs_client_enable=yes in rc.conf.
>
> User lookups seem to work fine:
>
> > -rw-r--r-- 1 root bryan 1554804 Jul 6 10:50 /mnt/distfiles/pkg-1.1.4.tar.xz
>
> I ran a find -ls on these paths and all files return a user/group. I am
> guessing there is a race condition with files being written and looking
> up the associated groups.
> > -- > Regards, > Bryan Drewery > bdrewery@freenode/EFNet > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Fri Jul 12 09:33:39 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id CD010AFF; Fri, 12 Jul 2013 09:33:39 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 683E11925; Fri, 12 Jul 2013 09:33:38 +0000 (UTC) Received: from server.rulingia.com (c220-239-237-213.belrs5.nsw.optusnet.com.au [220.239.237.213]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id r6C9XTti060623 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 12 Jul 2013 19:33:30 +1000 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.7/8.14.7) with ESMTP id r6C9XOpa048734 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Fri, 12 Jul 2013 19:33:24 +1000 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.7/8.14.7/Submit) id r6C9XMNk048732; Fri, 12 Jul 2013 19:33:22 +1000 (EST) (envelope-from peter) Date: Fri, 12 Jul 2013 19:33:22 +1000 From: Peter Jeremy To: Volodymyr Kostyrko Subject: Re: ZFS default compression algo for contemporary FreeBSD versions Message-ID: <20130712093322.GC23426@server.rulingia.com> References: <51D576E1.6030803@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="YiEDa0DAkWCtVeE4" Content-Disposition: inline In-Reply-To: 
<51D576E1.6030803@gmail.com> X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, Dmitry Morozovsky , avg@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jul 2013 09:33:39 -0000 --YiEDa0DAkWCtVeE4 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On 2013-Jul-04 16:21:37 +0300, Volodymyr Kostyrko wrote:
>04.07.2013 15:22, Dmitry Morozovsky wrote:
>> Colleagues,
>>
>> is it sane to just set 'zfs compression=on dataset' to achieve best algo on
>> fresh FreeBSD systems (-current and/or stable/9)?

You need to define what you mean by "best". Using gzip-9 generally gives
the smallest on-disk size but uses significant amounts of CPU time at
quite high priority.

> Default compression is still lzjb and

True

>bootloader can't boot off datasets compressed with lzjb.

What gave you that idea? I've been booting FreeBSD off lzjb-compressed
pools for something like 5 years. AFAIK, you can't use gzip (or you
couldn't when I last tried).
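To make the trade-off above concrete, here is a small sketch of setting the compression property explicitly rather than relying on compression=on. The dataset name tank/scratch and the DRY_RUN and DATASET variables are hypothetical, not from the thread. By default the script only prints the zfs commands so they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical placeholder dataset; substitute your own.
DATASET="${DATASET:-tank/scratch}"
# Commands are only printed unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Set the compression property, then show the setting and the achieved ratio.
# Only blocks written after the change use the new algorithm.
set_compression() {
    run zfs set "compression=$1" "$2"
    run zfs get compression,compressratio "$2"
}

# lzjb is cheap on CPU; gzip-9 trades CPU time for a smaller on-disk size.
set_compression lzjb "$DATASET"
```

Going by the reply above, gzip variants are probably best kept off any dataset the bootloader has to read.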
--=20 Peter Jeremy --YiEDa0DAkWCtVeE4 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iEYEARECAAYFAlHfzWIACgkQ/opHv/APuIf5tQCfdOHeXPVkMHs3OhZg+5sjo6aB yqIAoIm805ogYLoNY3w+MT0auU4aqUkR =5Lj4 -----END PGP SIGNATURE----- --YiEDa0DAkWCtVeE4-- From owner-freebsd-fs@FreeBSD.ORG Fri Jul 12 16:45:48 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1ABF52B9 for ; Fri, 12 Jul 2013 16:45:48 +0000 (UTC) (envelope-from healer@rpi.edu) Received: from smtp9.server.rpi.edu (gateway.canit.rpi.edu [128.113.2.229]) by mx1.freebsd.org (Postfix) with ESMTP id CC7681FDE for ; Fri, 12 Jul 2013 16:45:47 +0000 (UTC) Received: from smtp-auth1.server.rpi.edu (smtp-auth1.server.rpi.edu [128.113.2.231]) by smtp9.server.rpi.edu (8.14.3/8.14.3/Debian-9.4) with ESMTP id r6CGjf68019382 for ; Fri, 12 Jul 2013 12:45:41 -0400 Received: from smtp-auth1.server.rpi.edu (localhost [127.0.0.1]) by smtp-auth1.server.rpi.edu (Postfix) with ESMTP id F136758033 for ; Fri, 12 Jul 2013 12:45:40 -0400 (EDT) Received: from [128.113.210.26] (vpn-210-26.net.rpi.edu [128.113.210.26]) (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: healer) by smtp-auth1.server.rpi.edu (Postfix) with ESMTPSA id D478358020 for ; Fri, 12 Jul 2013 12:45:40 -0400 (EDT) Message-ID: <51E032B5.9080705@rpi.edu> Date: Fri, 12 Jul 2013 12:45:41 -0400 From: Bob Healey User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130620 Thunderbird/17.0.7 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Massive Problems with 10G, NFS, ZFS, and iSCSI Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV using ClamSMTP X-Bayes-Prob: 0.0001 (Score 0, tokens from: outgoing, @@RPTN) X-Spam-Score: 0.00 () [Hold at 10.10] 
T_RP_MATCHES_RCVD:-0.01,SPF(none:0) X-CanIt-Incident-Id: 02JXsJFff X-CanIt-Geo: ip=128.113.210.26; country=US; region=NY; city=Troy; postalcode=12180; latitude=42.7495; longitude=-73.5951; metrocode=532; areacode=518; http://maps.google.com/maps?q=42.7495,-73.5951&z=6 X-CanItPRO-Stream: outgoing X-Canit-Stats-ID: Bayes signature not available X-Scanned-By: CanIt (www . roaringpenguin . com) on 128.113.2.229 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jul 2013 16:45:48 -0000

I've been beating my head against a brick wall for a week with this and
5 similar systems.

My current major headache:
Dell Poweredge R610, dual quad core Xeon E5530 @ 2.4GHz, 24GB RAM, 4
onboard bce NICs, 1 mxge NIC, pair of 10K SAS drives on mpt (Dell MB SAS
controller), pair of 15 drive 1TB RAID 6 arrays on mfi (PERC 6).

The machine was originally installed with FreeBSD 7.2 and has been
upgraded through the years to 9.1. None of the issues I'm currently
seeing manifested themselves under 9.0. When under heavy NFS load, the
server currently becomes non-responsive on the network, unless the
packet payload is very small (ICMP ping packets with > 124 bytes payload
get dropped).

Current network config:
bce0: management network, connected to the 37 IPMI controllers in the
  rack, has conserver running SOL connections to each
bce1: link to outside world, everything in rack trying to reach outside
  is NATed through here
bce2: used for a direct host to host iSCSI link to another host in the
  rack to provide a hard drive for a virtual machine. This machine is the
  iSCSI target, and an 80GB zvol is the backing store.
mxge0/vlan1: connected to first 25 machines in rack
mxge0/vlan2: connected to remaining 12 machines in rack, plus a vm on
  host #25 on vlan 1

This is an HPC cluster, with all nodes running RHEL 5.
The landing pads (1 real, 1 virtual) are multihomed to both the internal
and external networks, so the only traffic that crosses the NAT is
software updates and job accounting information.

PF is used for firewalling and NAT. skip is enabled on all internal
interfaces.

Stuff I've tried: setting vfs.zfs.arc_max="20480M", disabling flow
control on the 10G NIC, moving the ZIL to some unused space on the boot
drive (RAID 1, mostly UFS).

I'm getting lots of "Limiting open port RST response from 32325 to 200
packets/sec" messages in the logs, iSCSI timeouts on the client, and
"NFS server not responding" errors. netstat -i is showing lots of input
errors on mxge, but I'm not seeing any errors on the switch (Dell
Powerconnect 6248). Myricom (NIC vendor) is at a loss too.

Any ideas on what I should try next? I'm at the point of throwing darts
blindfolded.

I've got 5 more similar misbehaving machines, 4 of which behave just
fine when using igb instead of mxge.

--
Bob Healey
Systems Administrator
Biocomputation and Bioinformatics Constellation
and Molecularium
healer@rpi.edu
(518) 276-4407

From owner-freebsd-fs@FreeBSD.ORG Fri Jul 12 19:15:45 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B5DC82C5; Fri, 12 Jul 2013 19:15:45 +0000 (UTC) (envelope-from artem.naluzhnyy@gmail.com) Received: from mail-wi0-x22d.google.com (mail-wi0-x22d.google.com [IPv6:2a00:1450:400c:c05::22d]) by mx1.freebsd.org (Postfix) with ESMTP id 040611A15; Fri, 12 Jul 2013 19:15:44 +0000 (UTC) Received: by mail-wi0-f173.google.com with SMTP id hq4so1038469wib.6 for ; Fri, 12 Jul 2013 12:15:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=+88swS6fEFYbjY/PJZ6w5Z5I9iLpjMWv67ummD29GBg=; b=FRCfri8cFg4xxuHSYBw9aqzFmv7Okow/FR2ck8d1gkCdowAqejOKSs78Vb2fz2RrJ0
/qB6C/tYKA4GtT6mXlgpkxGlJORHd35vyOVKFZdK3q9HaxEPh09cX45+Ja4okfRNoSTn bbIA0QOO+zB7zil3ZzyawzKfwVm7mVeewG0wMN2WXGxJmAVw8GW2763OMVHsyHr2QFZ8 rzQbnMjX9EftVhLvQ8XjrsMUKDvDe47NApDY4jFEfzyNmdtz54znkwKD7+suElQXhnoW FClU2B9FUbZeZHc7K0+uz9wYump5LJrd13zFDz5BDhxvroPAS1yJ82mvKobL8QwaqeuI QDhw== X-Received: by 10.180.36.107 with SMTP id p11mr2507885wij.31.1373656544076; Fri, 12 Jul 2013 12:15:44 -0700 (PDT) MIME-Version: 1.0 Received: by 10.216.203.68 with HTTP; Fri, 12 Jul 2013 12:15:03 -0700 (PDT) In-Reply-To: References: From: Artem Naluzhnyy Date: Fri, 12 Jul 2013 22:15:03 +0300 Message-ID: Subject: Re: RAID10 stripe size and PostgreSQL performance To: Ivan Voras Content-Type: text/plain; charset=UTF-8 Cc: freebsd-database@freebsd.org, freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jul 2013 19:15:45 -0000

On Fri, Jul 12, 2013 at 4:55 PM, Ivan Voras wrote:
> I just looked at your RAID configuration at http://pastebin.com/F8uZEZdm
> and you have a mirror of stripes (RAID-01), not a stripe of mirrors
> (RAID-10). And apparently, if I parse your configuration correctly, you
> have a 1M stripe in the MIRROR part of the RAID, and an unknown stripe
> size in the STRIPE part.

This is probably a bug in mfiutil output. There is no "RAID 01" option
in the controller configuration, and its documentation says
(http://goo.gl/6X5pe):

"RAID 10, a combination of RAID 0 and RAID 1, consists of striped data
across mirrored spans. A RAID 10 drive group is a spanned drive group
that creates a striped set from a series of mirrored drives. RAID 10
allows a maximum of eight spans. You must use an even number of drives
in each RAID virtual drive in the span. The RAID 1 virtual drives must
have the same stripe size."
There are also no options to configure a different stripe size for the
mirrors; I can only set it globally for the whole RAID 10 volume.

> Anyway, could you please do one more test:
>
> 1) create a large file with "dd if=/dev/zero of=file bs=1m count=48000"
> 2) install /usr/ports/benchmarks/randomio
> 3) run "randomio file 8 0.5 1 8192 10 10"
>
> ... and report the results.

See results at the end of http://pastebin.com/F8uZEZdm

There is yet another issue that (I guess) makes all previous benchmarks
kinda inaccurate and irrelevant: it looks like the UFS partitions are
not aligned properly:

$ gpart show
=>        63  1167966145  mfid0    MBR  (557G)
          63  1167957567        1  freebsd  [active]  (556G)
  1167957630        8578           - free -  (4.2M)

=>         0  1167957567  mfid0s1  BSD  (556G)
           0     4194304        1  freebsd-ufs  (2.0G)
     4194304    16777216        2  freebsd-swap  (8.0G)
    20971520  1130217472        5  freebsd-ufs  (539G)
  1151188992    16768575        4  freebsd-ufs  (8G)

I will also try to fix the alignment and run some tests.

--
Artem Naluzhnyy

From owner-freebsd-fs@FreeBSD.ORG Fri Jul 12 19:38:57 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 75EE46BD for ; Fri, 12 Jul 2013 19:38:57 +0000 (UTC) (envelope-from bofh@terranova.net) Received: from tog.net (tog.net [IPv6:2605:5a00::5]) by mx1.freebsd.org (Postfix) with ESMTP id 3E64C1B61 for ; Fri, 12 Jul 2013 19:38:57 +0000 (UTC) Received: from [IPv6:2605:5a00:ffff::face] (unknown [IPv6:2605:5a00:ffff::face]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by tog.net (Postfix) with ESMTPSA id 3bsPZl1xg6z5bH; Fri, 12 Jul 2013 15:38:55 -0400 (EDT) Message-ID: <51E05B48.60607@terranova.net> Date: Fri, 12 Jul 2013 15:38:48 -0400 From: Travis Mikalson Organization: TerraNovaNet Internet Services User-Agent: Thunderbird 2.0.0.24 (Windows/20100228) MIME-Version: 1.0 To: Bob Healey , freebsd-fs@freebsd.org Subject: Re:
Massive Problems with 10G, NFS, ZFS, and iSCSI References: <51E032B5.9080705@rpi.edu> In-Reply-To: <51E032B5.9080705@rpi.edu> X-Enigmail-Version: 0.96.0 OpenPGP: url=http://www.terranova.net/pgp/bofh Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Jul 2013 19:38:57 -0000 Bob Healey wrote: > I've been beating my head against a brick wall for a week with this and > 5 similar systems. > > My current major headache: > Dell Poweredge R610, dual quad core Xeon E5530 @ 2.4GHz, 24GB RAM 4 > onboard bce NICs, 1 mxge NIC, pair of 10K SAS drives on mpt (Dell MB SAS > controller), pair of 15 drive 1TB RAID 6 arrays on mfi (PERC 6). > > The machine was originally installed with FreeBSD 7.2 and has been > upgraded through the years to 9.1. None of the issues I'm currently > seeing manifested themselves under 9.0. When under heavy NFS load, the > server currently becomes non-responsive on the network, unless the > packet payload is very small (ICMP ping packets with > 124 bytes payload > get dropped). > > Current network config: > bce0: management network, connected to the 37 IPMI controllers in the > rack, has conserver running SOL connections to each > bce1: link to outside world, everything in rack trying to reach outside > is NATed through here > bce2: used for a direct host to host ISCSI link to another host in the > rack to provide a hard drive for a virtual machine. This machine is the > iscsi target, and an 80GB zvol is the backing store. > mxge0/vlan1: connected to first 25 machines in rack > mxge0/vlan2: connected to remaining 12 machines in rack, plus a vm on > host #25 on vlan 1 > > This is an HPC cluster, with all nodes running RHEL 5. 
> The landing pads (1 real, 1 virtual) are multihomed to both the internal
> and external networks, so the only traffic that crosses the NAT is
> software updates and job accounting information.
>
> PF is used for firewalling and NAT. skip is enabled on all internal
> interfaces.

I have zero experience with mxge NICs, and I expect others will have a
lot more to say, but the first thing I'd try in your shoes is complete
removal of pf from your kernel. Try replacing it with ipfw and see if it
helps any. Pf is generally not recommended above 1Gbit due to it still
working under a single mutex. I'm linking this for purposes of describing
pf's current performance limitations, not for the rest of the content of
the post:
http://forum.pfsense.org/index.php?topic=50812.0;wap2

> Stuff I've tried: setting vfs.zfs.arc_max="20480M", disabling flow
> control on the 10G NIC, moving the ZIL to some unused space on the boot
> drive (RAID 1, mostly UFS).
>
> I'm getting lots of "Limiting open port RST response from 32325 to 200
> packets/sec" messages in the logs, iSCSI timeouts on the client, and
> "NFS server not responding" errors. netstat -i is showing lots of input
> errors on mxge, but I'm not seeing any errors on the switch (Dell
> Powerconnect 6248). Myricom (NIC vendor) is at a loss too.
>
> Any ideas on what I should try next? I'm at the point of throwing darts
> blindfolded.
>
> I've got 5 more similar misbehaving machines, 4 of which behave just
> fine when using igb instead of mxge.

Again, I have no experience with mxge, good or bad, and I wouldn't rule
out the possibility of mxge driver performance either not being up to
snuff or requiring tuning.

Another thing that comes to mind that you haven't mentioned: have you
tuned your mbuf clusters upwards from default?
My /boot/loader.conf just for a loaded box with only gigabit NICs adjusts things upwards like so: kern.ipc.nmbclusters="262144" kern.ipc.nmbjumbop="262144" kern.ipc.nmbjumbo16="32000" kern.ipc.nmbjumbo9="64000" netstat -m can give you some insight on your mbuf cluster usage, and would be especially interesting to see during one of these fits you've described. -- TerraNovaNet Internet Services - Key Largo, FL Voice: (305)453-4011 x101 Fax: (305)451-5991 http://www.terranova.net/ PGP: 50091B3D ---------------------------------------------- Life's not fair, but the root password helps. From owner-freebsd-fs@FreeBSD.ORG Sat Jul 13 05:48:10 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7BBE9399 for ; Sat, 13 Jul 2013 05:48:10 +0000 (UTC) (envelope-from berend@pobox.com) Received: from smtp.pobox.com (b-pb-sasl-quonix.pobox.com [208.72.237.35]) by mx1.freebsd.org (Postfix) with ESMTP id 397C711FD for ; Sat, 13 Jul 2013 05:48:09 +0000 (UTC) Received: from smtp.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id E302929059; Sat, 13 Jul 2013 05:48:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=pobox.com; h=date :message-id:from:to:cc:subject:in-reply-to:references :mime-version:content-type:content-transfer-encoding; s=sasl; bh=hO+HMgauLC5l/3vjFK4WU64SqRI=; b=QDgD6ol/U0r6A7SCWEtRmflXKQej Z0JvxHlTqr10YUwng9aaMK+I3J6DIM0fGXCYePI+pPQZkJHpe7l5Bcz0Kw0Dne7I nvPC08EiasiiEwbBz/6oAuBJE1YGaboOoRkokYcZTQy8QeJTjQPDMyh6FJBSdprw C8DGA8kZiRioito= DomainKey-Signature: a=rsa-sha1; c=nofws; d=pobox.com; h=date:message-id :from:to:cc:subject:in-reply-to:references:mime-version :content-type:content-transfer-encoding; q=dns; s=sasl; b=gBv09m UJ+dacvAMWomC4/xVsbFLp1A8SMrqxrZj8vdA7jJ3L9dk4DOTc696sh+vb/KKUct SnA8Cb8kIjCgN2ZN1qjp55EHukEIeriL9pRSUFiwu7WgH4H6CrANolkdJrCcpLnh 0QqyeNjNwbQW9i9WbU3bvsOVAmM1iJn0YgHUM= Received: from 
b-pb-sasl-quonix.pobox.com (unknown [127.0.0.1]) by b-sasl-quonix.pobox.com (Postfix) with ESMTP id DA53729052; Sat, 13 Jul 2013 05:48:07 +0000 (UTC) Received: from bmach.nederware.nl (unknown [27.252.169.66]) by b-sasl-quonix.pobox.com (Postfix) with ESMTPA id 5400D29050; Sat, 13 Jul 2013 05:48:07 +0000 (UTC) Received: from quadrio.nederware.nl (quadrio.nederware.nl [192.168.33.13]) by bmach.nederware.nl (Postfix) with ESMTP id 228725C57; Sat, 13 Jul 2013 17:47:59 +1200 (NZST) Received: from quadrio.nederware.nl (quadrio.nederware.nl [127.0.0.1]) by quadrio.nederware.nl (Postfix) with ESMTP id 57B4A4355BBC; Sat, 13 Jul 2013 17:48:02 +1200 (NZST) Date: Sat, 13 Jul 2013 17:48:02 +1200 Message-ID: <8761wfvwml.wl%berend@pobox.com> From: Berend de Boer To: Rick Macklem Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + UFS/ZFS + AWS EC2 In-Reply-To: <818900293.3878290.1373413081112.JavaMail.root@uoguelph.ca> References: <877gh0yvhm.wl%berend@pobox.com> <818900293.3878290.1373413081112.JavaMail.root@uoguelph.ca> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 EasyPG/1.0.0 Emacs/24.3 (i686-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO) Organization: Xplain Technology Ltd MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue") Content-Type: multipart/signed; boundary="pgp-sign-Multipart_Sat_Jul_13_17:48:02_2013-1"; micalg=pgp-sha256; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit X-Pobox-Relay-ID: CAFDDF08-EB7F-11E2-A8CD-E84251E3A03C-48001098!b-pb-sasl-quonix.pobox.com Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jul 2013 05:48:10 -0000 --pgp-sign-Multipart_Sat_Jul_13_17:48:02_2013-1 Content-Type: text/plain; charset=US-ASCII >>>>> "Rick" == Rick Macklem writes: Rick> If you could do some testing where you export a 
UFS volume,
    Rick> the results might help to isolate the issue to ZFS vs nfsd.

Indeed! Have changed subject, as indeed ZFS is a red herring. Issue
shows up with UFS as well. Very high cpu for nfsd, about the same time
to do the operation.

Tried it with these settings:

  vfs.nfsd.tcphighwater=5000
  vfs.nfsd.tcpcachetimeo=300
  nfs_server_flags="-u -t -n 256"

On nfs3 + ufs everything back to normal. I.e. nfs4 was about 15
minutes, same operation was 241s with nfs3 and nfsd using no
cpu at all basically.

Per other reply, tried this too:

  vfs.nfsd.issue_delegations=1

Locks up the client at first write access. Ctrl+C doesn't work, need
to explicitly send a KILL signal from other terminal.

I think it locks up the server in some way as well. Doing an ls on an
exported path locks up. Ctrl+C won't work anymore. Process doesn't
react to any signal. In the end I rebooted the server to get rid of
this.

--
All the best,

Berend de Boer

------------------------------------------------------
Awesome Drupal hosting: https://www.xplainhosting.com/

--pgp-sign-Multipart_Sat_Jul_13_17:48:02_2013-1 Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit Content-Description: OpenPGP Digital Signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (GNU/Linux) iQIcBAABCAAGBQJR4OoSAAoJEKOfeD48G3g5dvkP/jXVqkXYVWlbpdrw5LLOkLua QQ55z7jvPzQ4oRGCZXAhgX2Xy3h9T2rQcbO6jYWQ+XM93aziBEi6n0nOGeRgQIm/ n/E6s9Vrdy1kaowL5DsdmovYweqnjEEPI0hH4rp5XfJVUCUwkFzZhNeXQ6uvlmmV jtAUsrcrS5k/xJ1vJ8dANcacFkKKGOQ51/ctNiGEK6j29yk6/fmXHEtnDZPm/Bs5 YINoGpP0QoUegb0SzsQI/7TY1e1ezAvaMZenmQHhKCdNpwyz4AP9XxnIFPUPITFf dKamuwxHT5RXcHCJYD5G/wYvuBDHY25pGp6bss5vvCaULNnyTZ4M5XFhYnT8OG47 q0vq44LjU7tzFuXFWovh/nEHkvZjBP82AHDD2d7nn974ju6RSrE4t5HnVeUKLa0Y m9nyPBOZL6CewiBKN8y7Tyn1SgWv8Vw8xeenoVleNcjaWb89dXTmA1m1GpR+pbrH M9VYSVoMYh9EalSSrPPlkiqtEB61tqNTrvKlykjewkQ0N2IWIM976URyvtvy621p 5wpGXU0pcKudhJXI5laXv3LbGPwye7+ipk64CIQRfjnCXKkjZCWLuQNvbw6RWj95 k8HPJc2EZa+m/XCkA06qrS3UMDAXG7RVwajW2jeV+LbgAHuPcvglRdLeh7C1zHBQ
qa+wHBP6pcZ4TWBjKqN9 =AehI -----END PGP SIGNATURE----- --pgp-sign-Multipart_Sat_Jul_13_17:48:02_2013-1-- From owner-freebsd-fs@FreeBSD.ORG Sat Jul 13 21:42:30 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 22E61737; Sat, 13 Jul 2013 21:42:30 +0000 (UTC) (envelope-from rmh.aybabtu@gmail.com) Received: from mail-qc0-x22e.google.com (mail-qc0-x22e.google.com [IPv6:2607:f8b0:400d:c01::22e]) by mx1.freebsd.org (Postfix) with ESMTP id CD0FC10EE; Sat, 13 Jul 2013 21:42:29 +0000 (UTC) Received: by mail-qc0-f174.google.com with SMTP id m15so5568529qcq.5 for ; Sat, 13 Jul 2013 14:42:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=mz5/7v1jVc/0UyYcHAAZQqsuUq7tvZY2Jz6uclNsnfo=; b=t81S5ML/YZf6HDsgJEQcC1xEY72cvssZtbx/7MaBvUBLXHUxsB76j+XKair52LSEFl 1k+5583f9+VFZeeAHG7kyDd8IZE4qx2GRrFOJZ+Q2738ZFGq1M3bQvvYrMT8I0ZRalsv 4QDDYqDEVmFT6SjO+G8gB9oVsxkeApkuc3kc9rMQtRa1g1HqpcLE8C2a6Rm3hNSmCwB8 4ksVo+D/WUy3XvtKlD/pk02fRP2JuhaNj26fpk2zx7BHJCp8IgmigHfbj+5ImUlImEA0 mPTqKf/g0WiBeyh2pkf0S5+1SbPrRWmN2tzk1BZiRk27sdEvxlrMU+lHpch/F/UXpgTB P8kA== MIME-Version: 1.0 X-Received: by 10.224.98.140 with SMTP id q12mr44028117qan.99.1373751749331; Sat, 13 Jul 2013 14:42:29 -0700 (PDT) Sender: rmh.aybabtu@gmail.com Received: by 10.49.26.193 with HTTP; Sat, 13 Jul 2013 14:42:29 -0700 (PDT) In-Reply-To: <201307111722.r6BHMohd099772@chez.mckusick.com> References: <201307111722.r6BHMohd099772@chez.mckusick.com> Date: Sat, 13 Jul 2013 23:42:29 +0200 X-Google-Sender-Auth: 2Jldk7cC4mURX94xzZJpXoreah8 Message-ID: Subject: Re: Compatibility options for mount(8) From: Robert Millan To: Kirk McKusick Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jul 2013 21:42:30 -0000 2013/7/11 Kirk McKusick : > I am fine with your proposed addition. I would favor changing the > manual page from > > +For compatibility with some other implementations; this flag is > > to > > +For compatibility with some Linux implementations; this flag is > > as it is (primarily) Linux compatibility and also reflects the comment > that you have added in the code. Well I'm not sure. There's only one implementation I'm aware of that accepts -n, the one in the Linux version of mount (i.e. the mount program from util-linux package). If we want to be more precise we could investigate, but I think it's overkill. It doesn't hurt to be vague on this IMHO. If you don't mind, I'd leave this with "some other" as initially proposed by Jeremy. -- Robert Millan From owner-freebsd-fs@FreeBSD.ORG Sat Jul 13 22:25:29 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 753E7DE5; Sat, 13 Jul 2013 22:25:29 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [IPv6:2001:5a8:4:7e72:4a5b:39ff:fe12:452]) by mx1.freebsd.org (Postfix) with ESMTP id 5706311F0; Sat, 13 Jul 2013 22:25:29 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id r6DMPP7p002100; Sat, 13 Jul 2013 15:25:25 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201307132225.r6DMPP7p002100@chez.mckusick.com> To: Robert Millan Subject: Re: Compatibility options for mount(8) In-reply-to: Date: Sat, 13 Jul 2013 15:25:25 -0700 From: Kirk McKusick Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jul 2013 22:25:29 -0000 > Date: Sat, 13 Jul 2013 23:42:29 +0200 > From: Robert Millan > To: Kirk McKusick > Subject: Re: Compatibility options for mount(8) > Cc: freebsd-fs@freebsd.org > > 2013/7/11 Kirk McKusick : > > > I am fine with your proposed addition. I would favor changing the > > manual page from > > > > +For compatibility with some other implementations; this flag is > > > > to > > > > +For compatibility with some Linux implementations; this flag is > > > > as it is (primarily) Linux compatibility and also reflects the comment > > that you have added in the code. > > Well I'm not sure. There's only one implementation I'm aware of that > accepts -n, the one in the Linux version of mount (i.e. the mount > program from util-linux package). > > If we want to be more precise we could investigate, but I think it's > overkill. It doesn't hurt to be vague on this IMHO. > > If you don't mind, I'd leave this with "some other" as initially > proposed by Jeremy. > > -- > Robert Millan > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" OK to leave it. 
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Sat Jul 13 22:45:47 2013 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E1556226 for ; Sat, 13 Jul 2013 22:45:47 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id ADB861277 for ; Sat, 13 Jul 2013 22:45:47 +0000 (UTC) X-IronPort-AV: E=Sophos;i="4.89,660,1367985600"; d="scan'208";a="40228784" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 13 Jul 2013 18:45:40 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A76E2B3F17; Sat, 13 Jul 2013 18:45:40 -0400 (EDT) Date: Sat, 13 Jul 2013 18:45:40 -0400 (EDT) From: Rick Macklem To: Berend de Boer Message-ID: <153512858.1034456.1373755540674.JavaMail.root@uoguelph.ca> In-Reply-To: <8761wfvwml.wl%berend@pobox.com> Subject: Re: Terrible NFS4 performance: FreeBSD 9.1 + UFS/ZFS + AWS EC2 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 7.2.1_GA_2790 (ZimbraWebClient - FF3.0 (Win)/7.2.1_GA_2790) Cc: freebsd-fs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 13 Jul 2013 22:45:47 -0000 Berend de Boer wrote: > >>>>> "Rick" == Rick Macklem writes: > > Rick> If you could do some testing where you export a UFS volume, > Rick> the results might help to isolate the issue to ZFS vs nfsd. > > Indeed! Have changed subject, as indeed ZFS is a red herring. Issue > shows up with UFS as well. 
> Very high cpu for nfsd, about the same time to do the operation.
>
> Tried it with these settings:
>
> vfs.nfsd.tcphighwater=5000
> vfs.nfsd.tcpcachetimeo=300
> nfs_server_flags="-u -t -n 256"
>
>
> On nfs3 + ufs everything back to normal. I.e. nfs4 was about 15
> minutes, same operation was 241s with nfs3 and nfsd using no
> cpu at all basically.
>
All I can suggest is capturing packets and then emailing me the
captured packet trace. You can use tcpdump to do the capture, since
wireshark will understand it:
# tcpdump -s 0 -w <file>.pcap host <client-host>
and then email me <file>.pcap.

I can take a look at the packet capture and maybe see what is going on.

I think you mentioned that you were using a Linux client, but not what
version. I'd suggest a recent kernel from kernel.org. (Fedora tracks
updates/fixes for NFSv4 pretty closely, so the newest Fedora release
should be pretty current.)

> Per other reply, tried this too:
>
> vfs.nfsd.issue_delegations=1
>
> Locks up the client at first write access. Ctrl+C doesn't work, need
> to explicitly send a KILL signal from other terminal.
>
> I think it locks up the server in some way as well. Doing an ls on an
> exported path locks up. Ctrl+C won't work anymore. Process doesn't
> react to any signal. In the end I rebooted the server to get rid of
> this.
>
Obviously broken, but without a lot more information, I can't say
anything. (Assuming you can still run commands on the server, you
could start with something like "ps axl" and "procstat -kk".)

rick

> --
> All the best,
>
> Berend de Boer
>
>
> ------------------------------------------------------
> Awesome Drupal hosting: https://www.xplainhosting.com/
>
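For anyone following along, the capture step suggested above could be wrapped roughly as follows. This is only a sketch. The file name nfs4-slow.pcap, the host client.example.org, and the DRY_RUN variable are hypothetical placeholders; only the tcpdump flags come from the message. By default it prints the command instead of running it, since a real capture needs root and the real client address.

```shell
#!/bin/sh
# Commands are only printed unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Capture full packets (-s 0) to a file wireshark can open, limited to
# traffic to or from one NFS client.
#   $1: output file   $2: client host name or address
capture_nfs() {
    run tcpdump -s 0 -w "$1" host "$2"
}

# Placeholder arguments; replace with a real file name and client host.
capture_nfs nfs4-slow.pcap client.example.org
```

Start the capture on the server just before reproducing the slow operation, and stop it with Ctrl+C afterwards so the trace stays small enough to mail.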