From owner-freebsd-fs@FreeBSD.ORG Sun Feb 22 00:15:44 2015
Date: Sat, 21 Feb 2015 19:15:36 -0500 (EST)
From: Rick Macklem
To: Rainer Duffner
Cc: freebsd-fs@freebsd.org, Jordan Hubbard, Christian Baer
Subject: Re: The magic of ZFS and NFS (2nd try)

Rainer Duffner wrote:
>
> > On 21.02.2015 at 19:23, Jordan Hubbard wrote:
> >
> >> On Feb 21, 2015, at 9:36 AM, Christian Baer wrote:
> >>
> >> But why shouldn't I use /etc/exports? I have read people writing
> >> this (don't use /etc/exports) in forums when searching for answers,
> >> however the current manpage for zfs says this:
> >
> > FreeNAS has more experience with sharing things from ZFS than
> > anyone else in the BSD community (that's not hyperbole, it's
> > simply fact). We don't use any of the zfs sharing flags. Those
> > were intended more for Solaris (sharesmb, for example - FreeBSD
> > lets you do that, but what does it *mean* when you don't have a
> > native CIFS service?). FreeBSD has never integrated ZFS's notion
> > of sharing or, for that matter, a number of other things like
> > drive hot sparing and automatic replacement, and you're seeing the
> > results of ZFS's Solaris roots still not lining up 100% with their
> > new FreeBSD home. That's all.
> >
> > I would simplify things, just as FreeNAS has (for good reasons),
> > and simply have ZFS be "a filesystem" from FreeBSD's perspective
> > and share it just as you would UFS.
>
> Interesting.
>
> I admit I don't use NFS v4.
> Is it much faster than NFS v3 these days?
>
Nope. If you are lucky, you'll be about performance neutral when
switching from v3 -> v4. If you access lots of files, you probably
won't be performance neutral, due to the extra overhead of Opens, etc.

NFSv4 isn't really a replacement for NFSv3, imho. It fills a different,
although somewhat overlapping, solution space. It provides better byte
range locking, ACLs and, when pNFS becomes commonly available, better
scalability for I/O performance on relatively large servers (especially
if the clients are accessing a fairly small number of large files).
If you don't need any of the above, you don't need/want NFSv4, again imho.

Sorry to wander off topic, but Rainer did ask;-)

rick
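For a quick side-by-side comparison on a FreeBSD client, the protocol
version is just a mount_nfs option; the server name and path below are
placeholders:

    # NFSv3
    mount -t nfs -o nfsv3 server:/export /mnt
    # NFSv4 against the same server
    mount -t nfs -o nfsv4 server:/export /mnt

A metadata-heavy workload (e.g. untarring a source tree onto each mount
in turn) makes the per-file Open overhead described above easy to see.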
> But I've always added the line from exports(5) into the sharenfs
> property like
>
> zfs get sharenfs datapool/nfs/ds3-documents
> NAME                        PROPERTY  VALUE                                                   SOURCE
> datapool/nfs/ds3-documents  sharenfs  -maproot=1003 -network 10.10.10.0 -mask 255.255.255.0  inherited from datapool/nfs
>
> These lines get written into /etc/zfs/exports
>
> I like it that way because if a filesystem is destroyed, I don't have
> to remember removing it from /etc/exports.
>
> I also admit I'm heavily influenced by Solaris on this particular
> setting…

From owner-freebsd-fs@FreeBSD.ORG Sun Feb 22 00:18:47 2015
Date: Sat, 21 Feb 2015 19:18:39 -0500 (EST)
From: Rick Macklem
To: Rainer Duffner
Cc: freebsd-fs@freebsd.org, Christian Baer, Jordan Hubbard
Subject: Re: The magic of ZFS and NFS (2nd try)

I wrote:
> [...]
> NFSv4 isn't really a replacement for NFSv3, imho. It fills a
> different, although somewhat overlapping, solution space.
> [...]
>
Oh, and NFSv4 allows clients to cross server mount point boundaries.
Some will find this a useful feature, others a hassle.

rick
From owner-freebsd-fs@FreeBSD.ORG Sun Feb 22 21:00:11 2015
Date: Sun, 22 Feb 2015 21:00:11 +0000
From: bugzilla-noreply@FreeBSD.org
To: freebsd-fs@FreeBSD.org
Subject: Problem reports for freebsd-fs@FreeBSD.org that need special attention

To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD
users, which need special attention. These represent problem reports
covering all versions including experimental development code and
obsolete releases.

Status      | Bug Id  | Description
------------+---------+---------------------------------------------------
Open        | 136470  | [nfs] Cannot mount / in read-only, over NFS
Open        | 139651  | [nfs] mount(8): read-only remount of NFS volume d
Open        | 144447  | [zfs] sharenfs fsunshare() & fsshare_main() non f

3 problems total for which you should take action.
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 10:06:21 2015
Date: Mon, 23 Feb 2015 10:06:06 +0000
From: "Ivailo A. Tanusheff"
To: Martin Simmons
Cc: "freebsd-fs@freebsd.org"
Subject: RE: ZVOL and snapshots length problem
Hi there,

But how may I fix this?
As far as I know, ZFS allows both names with spaces inside and names
with more than 63 symbols. So how may I make this work for me?

Regards,

Ivailo Tanusheff

-----Original Message-----
From: Martin Simmons [mailto:martin@lispworks.com]
Sent: Friday, February 20, 2015 6:13 PM
To: Ivailo A. Tanusheff
Cc: freebsd-fs@freebsd.org
Subject: Re: ZVOL and snapshots length problem

>>>>> On Fri, 20 Feb 2015 13:19:28 +0000, Ivailo A Tanusheff said:
>
> Dear all,
>
> I have some trouble creating and manipulating ZVOLs on my server. I am
> using FreeBSD 10 and I am creating a somewhat sophisticated structure,
> where I create several file systems and volumes inside them for easy
> manipulation and snapshot management.
> An example structure is:
> / /
> / /
>
> Whenever my ZVOL path exceeds 63 characters, both when creating a
> volume or a snapshot, I receive this error in my messages log:
>
> Feb 20 13:05:04 FreeBSD kernel: g_dev_taste: make_dev_p() failed
> (gp->name=zvol/TANK/Bank system/Core/DB@Daily_operations_2015-02-20-13:05, error=22)
> Feb 20 13:05:04 FreeBSD kernel: g_dev_taste: make_dev_p() failed
> (gp->name=zvol/TANK/Bank system/Core/FS@Daily_operations_2015-02-20-13:05, error=22)
> Feb 20 13:05:05 FreeBSD kernel: g_dev_taste: make_dev_p() failed
> (gp->name=zvol/TANK/Bank system/Core/Report@Daily_operations_2015-02-20-13:05, error=63)
>
> Feb 20 13:10:05 FreeBSD kernel: g_dev_taste: make_dev_p() failed
> (gp->name=zvol/TANK/Bank system/Core/DB@Daily_operations_2015-02-20-13:10, error=22)
> Feb 20 13:10:05 FreeBSD kernel: g_dev_taste: make_dev_p() failed
> (gp->name=zvol/TANK/Bank system/Core/FS@Daily_operations_2015-02-20-13:10, error=22)
> Feb 20 13:10:05 FreeBSD kernel: g_dev_taste: make_dev_p() failed
> (gp->name=zvol/TANK/Bank system/Core/Report@Daily_operations_2015-02-20-13:10, error=63)
>
> As far as I dug into it, this is due to the impossibility of creating
> the /dev/zvol/... node pointing to that volume or snapshot, while the
> volume/snapshot is still visible in the zfs list tree, although I am
> not quite sure I can use it.
>
> Is there any way to fix this behavior, or is this an implementation
> bug not described in the manual?
> If I create shorter names the problem disappears, but that is the
> opposite of what I needed, so it is not an acceptable solution.

I think this is a limitation in FreeBSD device naming.

error=22 is EINVAL, because you have a space in the name.

error=63 is ENAMETOOLONG, because the name is longer than SPECNAMELEN
(also 63 by coincidence).

__Martin
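Not a fix, but the two limits can at least be sanity-checked from sh
before creating a volume or snapshot; the name below is the one from the
log, and 63 is the SPECNAMELEN value mentioned above:

    name="zvol/TANK/Bank system/Core/Report@Daily_operations_2015-02-20-13:05"
    echo ${#name}    # 67 here, i.e. longer than 63 -> ENAMETOOLONG
    case "$name" in (*" "*) echo "contains a space -> EINVAL";; esac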
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 12:31:21 2015
Date: Mon, 23 Feb 2015 13:30:56 +0100
From: Christian Baer
To: freebsd-fs@freebsd.org
Subject: Re: The magic of ZFS and NFS (2nd try)

Rainer Duffner wrote:

> These lines get written into /etc/zfs/exports
>
> I like it that way because if a filesystem is destroyed, I don't have to
> remember removing it from /etc/exports.
>
> I also admit I'm heavily influenced by Solaris on this particular setting…

I didn't come from Solaris and I wasn't a big fan of it during my time at
university. It wasn't really a problem with the OS itself but with the
userland, which really sucked rocks at the time. We are talking SunOS 5.8
here.

I am guessing that in the future, ZFS will be far more important and UFS
will become more and more exotic. Then it would be fine to configure
everything the ZFS way. But currently, it seems pretty dumb to have to go
through a case list like:

case fs == ZFS then /etc/zfs/exports
     fs == $EXOTIC_OTHER_FS then goto wherever
     else goto /etc/exports

Couldn't help myself with the gotos there. :-D

On the other hand, if you can configure the same thing in a number of
files, chaos is predestined. That is one machine I would not want to take
care of.

Regards,
Christian
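In real sh, that dispatch would look something like the sketch below;
the fs variable is made up, and the point is just that the admin, not
the filesystem, has to remember which file is authoritative:

    case "$fs" in
    zfs) echo "managed via: zfs set sharenfs=... (lands in /etc/zfs/exports)" ;;
    *)   echo "edit /etc/exports" ;;
    esac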
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 12:35:08 2015
Date: Mon, 23 Feb 2015 13:32:31 +0100
From: Christian Baer
To: freebsd-fs@freebsd.org
Subject: Re: The magic of ZFS and NFS (2nd try)

Russell L. Carter wrote:

> When I was working out my own mounts, it seemed that sharenfs=on was
> required to make them work, but I just checked and indeed I can
> mount a zfs file system over NFS4.1 without it. So I would definitely
> agree about not complicating things.

The complicated thing about the options here is not so much where they
have to be set (although I am doing a pretty good job of messing that
up), but rather that it's a pain if a setting can be found in one of 10
configuration files. That is a real pain.

> Having both sharenfs and sharesmb *seem* to work does complicate
> figuring out how to make NFS work if you don't already know this,
> though.

Well, I always got NFS to work fine in the past and we use it quite
heavily at work. This is my first try at NFS with ZFS. And as I stated,
the directory /usr/archive (which is a subfolder of / and formatted with
UFSv2) can be read fine. So if I move or copy something from within
/usr/archive/Shared to /usr/archive, I can access that file or folder
tree. But let's say that is sub-optimal.

> Back to Christian's problem, I don't see nfsv4_server_enable="YES"
> in your rc.conf lines. I have it in mine, and NFSv4 works.
> See man(4) nfsv4.

I was actually expecting NFSv4 to be the default in FreeBSD 10.1. But OK,
I added the line and restarted the nfsd with

root@obelix:/usr/archive/Shared # /etc/rc.d/nfsd restart
Stopping nfsd.
Waiting for PIDS: 11247 11248, 11247.
Starting nfsd.

> You might have a look at /var/log/messages after restarting nfsd.

There was nothing in /var/log/messages about the nfsd restarting.
But there was something when I tried to mount the export:

Feb 23 13:05:19 obelix mountd[50070]: mount request denied from
192.168.100.8 for /usr/archive/Shared

Is there a way to get a message about what motivated this reaction?

The mount command wasn't helpful either:

root@falbala:/mnt # mount obelix:/usr/archive/Shared /mnt/Shared/
[tcp] obelix:/usr/archive/Shared: Permission denied

Note that this keeps going and I have to exit it via ctrl-c.

Kind regards,
Christian

From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 14:13:29 2015
Date: Mon, 23 Feb 2015 14:13:18 GMT
From: Martin Simmons
To: freebsd-fs@freebsd.org
Subject: Re: The magic of ZFS and NFS (2nd try)

>>>>> On Sat, 21 Feb 2015 18:34:12 +0100, Christian Baer said:
>
> Russell L. Carter wrote:
>
> > Post your /etc/exports, and the nfs*_enable bits of /etc/rc.conf. And
> > as Rainer noted you definitely need to check that uid/gid match on
> > both server and client.
>
> Pretty boring stuff...
>
> root@obelix:~ # cat /etc/exports
> V4: /usr/archive/Shared -alldirs -network 192.168.100/24
>
> I reduced the shares to one for the time being.

According to exports(5), that reduces it to zero:

    The third form has the string ``V4:'' followed by a single absolute
    path name, to specify the NFSv4 tree root. This line does not export
    any file system,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    but simply marks where the root of the server's directory tree is for
    NFSv4 clients. The exported file systems for NFSv4 are specified via
    the other lines in the exports file in the same way as for NFSv2 and
    NFSv3.

__Martin
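The fix, then, is to pair the V4: root with an ordinary export line. A
minimal sketch for this setup, assuming /usr/archive is the UFS mount
point (-alldirs is only valid on a mount point):

    /usr/archive -alldirs -network 192.168.100/24
    V4: /usr/archive/Shared -network 192.168.100/24

After editing the file, mountd needs a HUP so it rereads the exports.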
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 14:45:31 2015
Date: Mon, 23 Feb 2015 14:45:29 +0000
From: krad
To: Xīcò
Cc: FreeBSD FS
Subject: Re: ZFS root set to readonly=on temporary at boot

Either drop the vfs.root.mountfrom="zfs:zroot" or add
vfs.root.mountfrom.options=rw to the loader.conf. You might be half
overriding the default values picked up from the bootfs attribute.

Also make sure you have zfs_enable=yes set in your rc system. It is best
to use the following, as there are a multitude of places it can end up
getting unset these days:

# sysrc zfs_enable
zfs_enable: yes

On 20 February 2015 at 21:49, Xīcò wrote:

> On Thu, Feb 19, 2015 at 12:12:54PM +0000, krad wrote:
> > Check your bootfs and the / file system actually match up. It's quite
> > easy to get odd things happening if you have the bootfs set to a
> > filesystem that then has an fstab which then tells / is on another
> > fs. Also check the loader.conf on the bootfs. In my experience it's
> > safer to have clean fstabs, and nothing in the loader.conf (in
> > relation to this) and just rely on the bootfs pool property.
>
> Thanks for the tips. I still could not figure it out though.
> My partition scheme is as follows:
>
> # gpart show
> =>        34  3907029101  ada0  GPT  (1.8T)
>           34         128     1  freebsd-boot  (64K)
>          162     8388608     2  freebsd-swap  (4.0G)
>      8388770  3898640365     3  freebsd-zfs   (1.8T)
>
> with a bootcode installed by:
>
> # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
>
> The bootfs property of the zpool seems to be set to the correct root:
>
> # zpool get bootfs zroot
> NAME   PROPERTY  VALUE  SOURCE
> zroot  bootfs    zroot  local
> # zfs list zroot
> NAME   USED   AVAIL  REFER  MOUNTPOINT
> zroot  2.16G  1.75T  509M   /
>
> The /etc/fstab only lists the swap partition (and there are no other
> fstab files, nor any unmounted snapshots/partitions):
>
> # cat /etc/fstab
> /dev/label/swap  none  swap  sw  0  0
>
> My loader.conf appears to be required for the server to boot:
>
> # cat /boot/loader.conf
> zfs_load="YES"
> vfs.root.mountfrom="zfs:zroot"
>
> Also, the root is mounted in read/write when manually imported from
> another system:
>
> # zpool import -R /mnt zroot
> # zfs get readonly zroot
> NAME   PROPERTY  VALUE  SOURCE
> zroot  readonly  off    temporary
>
> Anyway, it is not that huge a problem, but it is quite inconvenient not
> to understand what could be the reason behind it. As far as I can tell,
> it may be related to the fact that my root was not created in its own
> dataset:
>
> # zfs list
> NAME          USED   AVAIL  REFER  MOUNTPOINT
> zroot         2.16G  1.75T  509M   /
> zroot/ezjail  984M   1.75T  27K    /usr/jails
> zroot/var     284M   1.75T  10.6M  /var
> […]
>
> I might check that later on.
>
> Best,
>
> Xīcò
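Concretely, krad's suggestion turns that loader.conf into one of two
variants; which one is right depends on whether the pool's bootfs
property is trusted to pick the root (a sketch, not tested against this
exact pool):

    # variant 1: rely entirely on the pool's bootfs property
    zfs_load="YES"

    # variant 2: keep the explicit root but force read/write options
    zfs_load="YES"
    vfs.root.mountfrom="zfs:zroot"
    vfs.root.mountfrom.options="rw"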
But there > was something when I tried to mount the export: > > Feb 23 13:05:19 obelix mountd[50070]: mount request denied from > 192.168.100.8 for /usr/archive/Shared This is a message from mountd, so make sure that it is restarted after changing /etc/exports (I'm not sure if /etc/rc.d/nfsd restart does that). __Martin From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 16:27:28 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 59A8ECDB for ; Mon, 23 Feb 2015 16:27:28 +0000 (UTC) Received: from mail-yh0-x230.google.com (mail-yh0-x230.google.com [IPv6:2607:f8b0:4002:c01::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 0EF55B7C for ; Mon, 23 Feb 2015 16:27:28 +0000 (UTC) Received: by yhoc41 with SMTP id c41so11112899yho.2 for ; Mon, 23 Feb 2015 08:27:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=w19ZN40RSR+FkMrA6USsLMEy5ITvrO8ZDD1/z4g7Xzs=; b=CFOrMwJVShJ2cMJ10FPEB+RLO9gueQNlYa50jj0TQok4eGieiUr2pCXJhP8qcs3ibq X2rz7cpc+dk08v9fRQH5O4q+x5+HPd0E17cu2CISO0jciJ39oHUJG0+t7Occr9TMqeRk etmpbVlYxB4b0uaOKnKhQao37Pzs6Q0ggR8yRVi+a6ky7+z0c7k3DbNFoit87ZS8aZKn epPXmmdGsBzt7yTb2cpovCg/aHvTcOodjjjGggXZLPa9CDVdfsHI3kSX88JPCerpp8af MnPY5CU7hnUJLTvsm8FGn1J/XHRvmXS42ycNNatF4UXBlBoqrUpOpUVahg95YHB308Nq lCtg== MIME-Version: 1.0 X-Received: by 10.170.142.85 with SMTP id j82mr7634593ykc.123.1424708847155; Mon, 23 Feb 2015 08:27:27 -0800 (PST) Received: by 10.170.79.87 with HTTP; Mon, 23 Feb 2015 08:27:27 -0800 (PST) In-Reply-To: <13471232.xpn8XerdpW@falbala.rz1.convenimus.net> References: <4257601.p3oiXZFr4n@falbala.rz1.convenimus.net> <12103095.viZFqgegqA@falbala.rz1.convenimus.net> <13471232.xpn8XerdpW@falbala.rz1.convenimus.net> Date: Mon, 23 Feb 2015 08:27:27 -0800 Message-ID: Subject: Re: The magic of ZFS and NFS (2nd try) From: Freddie Cash To: Christian Baer Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: FreeBSD Filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Feb 2015 16:27:28 -0000 On Mon, Feb 23, 2015 at 4:30 AM, Christian Baer < christian.baer@uni-dortmund.de> wrote: > Rainer Duffner wrote: > > > These lines get written into /etc/zfs/exports > > > > I like it that way because if a filesystem is destroyed, I don=E2=80=99= t have to > > remember removing it from /etc/exports. > > > > I also admit I=E2=80=99m heavily influenced by Solaris on this particul= ar > setting=E2=80=A6 > > I didn't come from Solaris and I wasn't a big fan of it during my time at > university. It wasn't the really a problem with the OS itself but with th= e > userland which really sucked rocks at the time. We are talking SunOS 5.8 > here. > > I am guessing that in the future, ZFS will be far more important and UFS > will become more and more exotic. Then it would be fine to config > everything > the ZFS-way. 
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 16:27:28 2015
Date: Mon, 23 Feb 2015 08:27:27 -0800
From: Freddie Cash
To: Christian Baer
Cc: FreeBSD Filesystems
Subject: Re: The magic of ZFS and NFS (2nd try)

On Mon, Feb 23, 2015 at 4:30 AM, Christian Baer
<christian.baer@uni-dortmund.de> wrote:

> But currently, it seems pretty dumb to have to go through a case list
> like:
> [...]

You don't. All NFS-related configuration stuff goes into /etc/exports by
default, including ZFS stuff. Just treat ZFS like any other filesystem
when it comes to NFS, and configure it just like any other filesystem.

If, and only if, you want to play with the ZFS property "sharenfs", then
you can. You aren't required to (and probably shouldn't, as the syntax
will be different for every OS and may cause issues with replication to
other pools). If you do, then anything you put into that property will be
automatically copied into /etc/zfs/exports.

You should never be manually editing /etc/zfs/exports, as any manual
changes you make will be lost the next time you edit any "sharenfs"
property on any ZFS filesystem.

--
Freddie Cash
fjwcash@gmail.com

From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 18:01:57 2015
Date: Mon, 23 Feb 2015 18:01:51 GMT
From: Martin Simmons
To: "Ivailo A. Tanusheff"
Cc: freebsd-fs@freebsd.org
Subject: Re: ZVOL and snapshots length problem

Sorry, I don't know any solution other than renaming the datasets.

__Martin

>>>>> On Mon, 23 Feb 2015 10:06:06 +0000, Ivailo A Tanusheff said:
>
> But how may I fix this?
> As far as I know, ZFS allows both names with spaces inside and names
> with more than 63 symbols. So how may I make this work for me?
> [...]
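For what it's worth, the renaming itself is one command per affected
dataset, and child datasets and snapshots follow the new path; a
hypothetical shortening of the name from the log:

    zfs rename "TANK/Bank system" TANK/banksys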
From owner-freebsd-fs@FreeBSD.ORG Mon Feb 23 23:01:59 2015
Date: Mon, 23 Feb 2015 18:01:29 -0500 (EST)
From: Rick Macklem
To: Martin Simmons
Cc: freebsd-fs@freebsd.org
Subject: Re: The magic of ZFS and NFS (2nd try)

Martin wrote:
> >>>>> On Sat, 21 Feb 2015 18:34:12 +0100, Christian Baer said:
> >
> > root@obelix:~ # cat /etc/exports
> > V4: /usr/archive/Shared -alldirs -network 192.168.100/24
> >
> > I reduced the shares to one for the time being.
>
As Martin notes below, the above does not export any file system.
Add a line for /usr/archive/Shared to /etc/exports just like you would
have for UFS and then send a HUP signal to mountd (kill -HUP <mountd's
pid>) to make the changes to /etc/exports take effect.
- After this, check for syslog messages, in case the exports syntax
isn't correct.

I know nothing about ZFS, but my understanding is that you can do
everything in /etc/exports just like you would for UFS. (I'm not even
sure what ZFS calls a file system, but every line you see when you type
"mount" with no arguments is a file system as far as FreeBSD is
concerned.) Every file system you want to access from client(s) must
have a line in exports (or in the ZFS stuff if you choose to do it that
way). The only difference for NFSv4 is that the client(s) don't have to
do separate mounts for each file system, but they all must be exported
to the client(s).

Hope this helps, rick
ps: You only have one V4: line because there is only one place in the
server's directory tree that can be a root for NFSv4.
> According to exports(5), that reduces it to zero:
> [...]

From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 02:36:39 2015
Date: Tue, 24 Feb 2015 13:36:34 +1100
From: Brett A Wiggins
To: freebsd-fs@freebsd.org
Subject: NFS, pefs panic: vputx: neg ref cnt

Hi,

I posted a message on the Storage section of the FreeBSD forums and have
been directed to this list. A link to the thread is below:

https://forums.freebsd.org/threads/nfs-server-crashes-panic-vputx-neg-ref-cnt.50511/

I am setting up a FreeBSD 10.1-RELEASE NFS server to use as a NAS on my
LAN. I have set everything up as per the handbook. The directory I am
sharing is using pefs encryption.
I am able to access the NFS share from OS X, but when I try to access it
via a Linux machine I get a kernel panic. The core dump is posted below:

http://pastebin.com/Da5bciWX

I'm not sure how to read a core dump (I'm not a developer) but there is
a line:

Panic: vputx: neg ref cnt

Any help would be appreciated; I am able to post additional information
if requested.

Brett.

--
"If you are new to UNIX, you may be used to clicking something and
seeing either an "OK" message, an error, nothing, or (all too often) a
pretty blue screen with nifty high-tech letters 'explaining' exactly
where the system crashed" - Michael Lucas

From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 15:37:44 2015
Date: Tue, 24 Feb 2015 10:37:42 -0500
From: Kurt Lidl
To: freebsd-fs@freebsd.org
Subject: Re: The magic of ZFS and NFS (2nd try)

> Rainer Duffner wrote:
>
>> You must use the syntax of exports(5) also with zfs set sharenfs=
>> AFAIK, you shouldn't use /etc/exports to do zfs exports but the above
>> command
>
> But why shouldn't I use /etc/exports? I have read people writing this
> (don't use /etc/exports) in forums when searching for answers, however
> the current manpage for zfs says this:
>
>     sharenfs=on | off | opts
>         Controls whether the file system is shared via NFS, and what
>         options are used. A file system with a sharenfs property of
>         off is managed the traditional way via exports(5). Otherwise,
>         the file system is automatically shared and unshared with the
>         "zfs share" and "zfs unshare" commands. If the property is set
>         to on, no NFS export options are used. Otherwise, NFS export
>         options are equivalent to the contents of this property. The
>         export options may be comma-separated. See exports(5) for a
>         list of valid options.
>
> To me this looks like I can choose whether I want to use the exports
> file or if I wish to set the sharenfs flag.
> I don't really mind *that* much, although I don't really think this is
> something that a file system should decide; it should be set from the
> NFS server.

Well, I would argue (putting aside Jordan's semi-rant) that the correct
thing to do is use the sharenfs flag for ZFS filesystems, as already
recommended. The mechanics of using it are straightforward:

  zfs set sharenfs="things you would put in /etc/exports" path/to/zfs

In a more concrete example, on one of my machines, I have:

root@hostname-134: zfs get all |grep sharenfs|grep zroot/fbsd
zroot/fbsd              sharenfs  off                     default
zroot/fbsd/10.0-amd64   sharenfs  -maproot=root           local
zroot/fbsd/9.1-amd64    sharenfs  -maproot=root -alldirs  local
zroot/fbsd/9.1-sparc64  sharenfs  -maproot=root -alldirs  local

Those were created via:

  zfs set sharenfs="-maproot=root" zroot/fbsd/10.0-amd64

After doing that, the file /etc/zfs/exports was automatically created
for me (look, no need to manually edit /etc/exports):

root@hostname-135: cat /etc/zfs/exports
# !!! DO NOT EDIT THIS FILE MANUALLY !!!
/fbsd/10.0-amd64	-maproot=root
/fbsd/9.1-amd64	-maproot=root -alldirs
/fbsd/9.1-sparc64	-maproot=root -alldirs

All I had to do after this was restart my mountd daemon:

root@hostname-136: service mountd restart
Stopping mountd.
Starting mountd.
root@hostname-137: ps -auxww | grep mountd
root 58779 0.0 0.1 10232 2312 ?? Ss 10:30AM 0:00.01 /usr/sbin/mountd -r /etc/exports /etc/zfs/exports

Make sure that you have the appropriate NFS daemons started in your
server's /etc/rc.conf:

# NFS server things
# (inetd is needed for tftpd)
inetd_enable="YES"
mountd_enable="YES"
mountd_flags="-r"
nfs_server_enable="YES"
nfs_server_flags="-u -t -n 4"
rarpd_enable="YES"
rpcbind_enable="YES"
rpc_lockd_enable="YES"
rpc_statd_enable="YES"

I enable inetd because I need to have tftpd running for diskless booting
of my sparcs when I play with them, along with some other, older
machines.

The examples above work fine for me on FreeBSD 9.1, and I've done the
same thing on FreeBSD 10.1 too - this has worked fine for a pretty long
time.

-Kurt
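A quick cross-check after restarting mountd is to ask the server what it
actually exports (hostname below is a placeholder):

    showmount -e hostname

If the entries from /etc/exports and /etc/zfs/exports don't show up
there, the client-side "Permission denied" errors discussed earlier in
the thread are to be expected.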
-Kurt From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 15:40:54 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 02EB95FD; Tue, 24 Feb 2015 15:40:54 +0000 (UTC) Received: from dmz-mailsec-scanner-4.mit.edu (dmz-mailsec-scanner-4.mit.edu [18.9.25.15]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 83BCEB57; Tue, 24 Feb 2015 15:40:53 +0000 (UTC) X-AuditID: 1209190f-f79546d000007593-4e-54ec9a4fa498 Received: from mailhub-auth-2.mit.edu ( [18.7.62.36]) (using TLS with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by dmz-mailsec-scanner-4.mit.edu (Symantec Messaging Gateway) with SMTP id 0C.4C.30099.05A9CE45; Tue, 24 Feb 2015 10:35:44 -0500 (EST) Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by mailhub-auth-2.mit.edu (8.13.8/8.9.2) with ESMTP id t1OFZhV1004641; Tue, 24 Feb 2015 10:35:43 -0500 Received: from multics.mit.edu (system-low-sipb.mit.edu [18.187.2.37]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id t1OFZe9i022377 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Tue, 24 Feb 2015 10:35:42 -0500 Received: (from kaduk@localhost) by multics.mit.edu (8.12.9.20060308) id t1OFZeWK029631; Tue, 24 Feb 2015 10:35:40 -0500 (EST) Date: Tue, 24 Feb 2015 10:35:40 -0500 (EST) From: Benjamin Kaduk To: Brett A Wiggins Subject: Re: NFS, pefs panic: vputx: neg ref cnt In-Reply-To: <54EBE3B2.6090508@gmail.com> Message-ID: References: <54EBE3B2.6090508@gmail.com> User-Agent: Alpine 1.10 (GSO 962 2008-03-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFvrEIsWRmVeSWpSXmKPExsUixG6nohsw602IwaIpbBb79hxhtDj2+Ceb xdy/+xkdmD1mfJrP4rFz1l32AKYoLpuU1JzMstQifbsErozTM5ayFiziqTi+rI25gXEKVxcj J4eEgInEz/u/GCFsMYkL99azdTFycQgJLGaSmHVyHguEs5FR4vKqlawgVUICh5gkdi3IgEg0 MEr8bZzLDJJgEdCW+PVpPROIzSagIjHzzUY2EFtEQE3iz5NnYDXMAgYSx5f8A7OFgey9386A 1XMKaEr87V4OtoBXwEFiwVKIGiEBDYkVr86wgNiiAjoSq/dPYYGoEZQ4OfMJC8RMLYnl07ex TGAUnIUkNQtJagEj0ypG2ZTcKt3cxMyc4tRk3eLkxLy81CJdE73czBK91JTSTYygoOWU5N/B +O2g0iFGAQ5GJR7eAzJvQoRYE8uKK3MPMUpyMCmJ8kpOBQrxJeWnVGYkFmfEF5XmpBYfYpTg YFYS4ZUDyfGmJFZWpRblw6SkOViUxHk3/eALERJITyxJzU5NLUgtgsnKcHAoSfD6zgRqFCxK TU+tSMvMKUFIM3FwggznARpeDVLDW1yQmFucmQ6RP8WoKCXOGweSEABJZJTmwfXCksorRnGg V4R5D4FU8QATElz3K6DBTECD9zx+BTK4JBEhJdXA6KxZlde88/7kBy9PGDflVvhdWjCt1olj anhg5OV1FienG63YMNHO0uKHj17MYacfa0R9fy/R5wqsmrJmCqNuU7TwM3Z/S6bUdYt2Tnqm UxDs1/7rX2lzZtidz7aWhb5VNv1VVZ9mBVRkGDuv37zUkfH1msUXPupXKUXeqKsPid8dd1xZ 8fV+JZbijERDLeai4kQAhf2BZAUDAAA= Cc: freebsd-fs@freebsd.org, rmacklem@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Feb 2015 15:40:54 -0000 On Mon, 23 Feb 2015, Brett A Wiggins wrote: > I am able to access the NFS share from OSX but when I try and access it > via a Linux machine I get a kernel panic. The core dump is posted below; > > http://pastebin.com/Da5bciWX > > I'm not sure how to read a core dump (I'm not a developer) but there is > a line; > > Panic: vputx: neg ref cnt This particular panic ~always means "programmer error". 
Looking at the backtrace:

KDB: stack backtrace:
#0 0xffffffff80963000 at kdb_backtrace+0x60
#1 0xffffffff80928125 at panic+0x155
#2 0xffffffff809c8b75 at vputx+0x2d5
#3 0xffffffff8195b3fb at pefs_reclaim+0xdb
#4 0xffffffff80e439a7 at VOP_RECLAIM_APV+0xa7
#5 0xffffffff809c9951 at vgonel+0x1c1
#6 0xffffffff809c9de9 at vrecycle+0x59
#7 0xffffffff8195b317 at pefs_inactive+0x87
#8 0xffffffff80e43897 at VOP_INACTIVE_APV+0xa7
#9 0xffffffff809c8722 at vinactive+0x102
#10 0xffffffff809c8b12 at vputx+0x272
#11 0xffffffff808811ee at nfsrvd_readdirplus+0x117e
#12 0xffffffff80863d9e at nfsrvd_dorpc+0x6de
#13 0xffffffff80872d94 at nfssvc_program+0x554
#14 0xffffffff80b27957 at svc_run_internal+0xc77
#15 0xffffffff80b26bee at svc_run+0x1de
#16 0xffffffff8087321a at nfsrvd_nfsd+0x1ca
#17 0xffffffff80883417 at nfssvc_nfsd+0x107

It seems that both NFS and PEFS are calling vputx on the same vnode.

Given the popularity of NFS, I would be more inclined to suspect PEFS than
NFS, but it could still be a bug that only appears when both are used
together.

Adding rmacklem to the cc list since he's the resident NFS expert.

-Ben

From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 16:43:59 2015
Return-Path: 
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0A6FC957 for ; Tue, 24 Feb 2015 16:43:59 +0000 (UTC)
Received: from cu01176a.smtpx.saremail.com (cu01176a.smtpx.saremail.com [195.16.150.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BACC6372 for ; Tue, 24 Feb 2015 16:43:58 +0000 (UTC)
Received: from [172.16.2.2] (izaro.sarenet.es [192.148.167.11]) by proxypop03.sare.net (Postfix) with ESMTPSA id 041B69DE0C7 for ; Tue, 24 Feb 2015 17:38:33 +0100 (CET)
From: Borja Marcos
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Proposal: enhancing zfs hold, atomic holds on create, snap and receive
Date: Tue, 24 Feb 2015 17:38:29 +0100
Message-Id: 
To: "freebsd-fs@freebsd.org Filesystems"
Mime-Version: 1.0 (Apple Message framework v1283)
X-Mailer: Apple Mail (2.1283)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Tue, 24 Feb 2015 16:43:59 -0000

Hi :)

I've been doing some incremental replication work with ZFS, and I am
using holds to prevent user errors. When someone destroys the wrong
snapshot, a dataset must be sent wholly because it's no longer possible
to perform an incremental send. A hold can prevent it, marking the
snapshot as "critical for incremental replication". Of course holds are
even better as you can assign several labelled holds to a single
snapshot, so that each hold can represent a different reason to keep it.

But there's a missing feature which would make them as perfect as they
can get: holds are somewhat of an afterthought, a second class citizen
compared to properties and, unlike properties, you can't (for example)
place a hold atomically on a snapshot when creating it.

ZFS has a nice feature that allows you to create an object (snapshot or
dataset) and *atomically* assign a property to it.
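As a minimal sketch of the difference (pool, dataset, and property names
hypothetical): the -o form sets a user property in the same operation
that creates the snapshot, while today a hold can only be added
afterwards:

  # atomic: snapshot and user property are created together
  zfs snapshot -o com.example:reason=replica tank/data@2015-02-24

  # not atomic: a second command is needed, leaving a window in which
  # the snapshot exists with no hold protecting it
  zfs snapshot tank/data@2015-02-25
  zfs hold replica tank/data@2015-02-25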
The same feature applies to create and clone, of course, although it
doesn't apply to receive, which might be useful.

So, the proposal is to add a "-h hold1,hold2,..holdN" option to "zfs
snap" and ideally zfs receive, so that a hold would be placed atomically
with the snapshot creation.

This feature would prevent some possible race conditions in snapshot
management, which would make them much more useful. I imagine that the
-o option was added with the same purpose.

What do you think?


Thanks!




Borja.

From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 17:52:59 2015
Return-Path: 
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4AB144A2 for ; Tue, 24 Feb 2015 17:52:59 +0000 (UTC)
Received: from mail-wi0-f175.google.com (mail-wi0-f175.google.com [209.85.212.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id D5A7BD9E for ; Tue, 24 Feb 2015 17:52:58 +0000 (UTC)
Received: by mail-wi0-f175.google.com with SMTP id r20so27494035wiv.2 for ; Tue, 24 Feb 2015 09:52:50 -0800 (PST)
Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170]) by mx.google.com with ESMTPSA id bd3sm13928499wib.17.2015.02.24.09.44.48 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 24 Feb 2015 09:44:48 -0800 (PST)
Message-ID: <54ECB88C.5060305@multiplay.co.uk>
Date: Tue, 24 Feb 2015 17:44:44 +0000
From: Steven Hartland
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
MIME-Version: 1.0
To: freebsd-fs@freebsd.org
Subject: Re: Proposal: enhancing zfs hold, atomic holds on create, snap and receive
References: 
In-Reply-To: 
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Tue, 24 Feb 2015 17:52:59 -0000

Bookmarks?

On 24/02/2015 16:38, Borja Marcos wrote:
> Hi :)
>
> I've been doing some incremental replication work with ZFS, and I am
> using holds to prevent user errors.
> When someone destroys the wrong snapshot, a dataset must be sent wholly
> because it's no longer possible to perform an incremental send. A hold
> can prevent it, marking the snapshot as "critical for incremental
> replication".
> Of course holds are even better as you can assign several labelled
> holds to a single snapshot, so that each hold can represent a different
> reason to keep it.
>
> But there's a missing feature which would make them as perfect as they
> can get: holds are somewhat of an afterthought, a second class citizen
> compared to properties and, unlike properties, you can't (for example)
> place a hold atomically on a snapshot when creating it.
>
> ZFS has a nice feature that allows you to create an object (snapshot or
> dataset) and, *atomically* assign a property to it.
> The same feature applies to create and clone, of course, although it
> doesn't to receive, which might be useful.
>
> So, the proposal is to add a "-h hold1,hold2,..holdN" option to "zfs
> snap" and ideally zfs receive, so that a hold would be placed
> atomically with the snapshot creation.
>
> This feature would prevent some possible race conditions in snapshot
> management, which would make them much more useful.
> I imagine that the -o option was added with the same purpose.
>
> What do you think?
>
>
> Thanks!
>
>
>
>
> Borja.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 19:58:00 2015
Return-Path: 
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 187B5D52 for ; Tue, 24 Feb 2015 19:58:00 +0000 (UTC)
Received: from mail.ijs.si (mail.ijs.si [IPv6:2001:1470:ff80::25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id BBB25E02 for ; Tue, 24 Feb 2015 19:57:59 +0000 (UTC)
Received: from amavis-proxy-ori.ijs.si (localhost [IPv6:::1]) by mail.ijs.si (Postfix) with ESMTP id 3ks9zR4vRFz6j for ; Tue, 24 Feb 2015 20:57:55 +0100 (CET)
X-Virus-Scanned: amavisd-new at ijs.si
Received: from mail.ijs.si ([IPv6:::1]) by amavis-proxy-ori.ijs.si (mail.ijs.si [IPv6:::1]) (amavisd-new, port 10012) with ESMTP id SzPkvbjbNnk1 for ; Tue, 24 Feb 2015 20:57:52 +0100 (CET)
Received: from mildred.ijs.si (mailbox.ijs.si [IPv6:2001:1470:ff80::143:1]) by mail.ijs.si (Postfix) with ESMTP for ; Tue, 24 Feb 2015 20:57:52 +0100 (CET)
Received: from neli.ijs.si (neli.ijs.si [IPv6:2001:1470:ff80:88:21c:c0ff:feb1:8c91]) by mildred.ijs.si (Postfix) with ESMTP id 3ks9zN1dTnzsW for ; Tue, 24 Feb 2015 20:57:52 +0100 (CET)
Received: from sleepy.ijs.si ([2001:1470:ff80:e001::1:1]) by neli.ijs.si with HTTP (HTTP/1.1 POST); Tue, 24 Feb 2015 20:57:52 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Date: Tue, 24 Feb 2015 20:57:52 +0100
From: Mark Martinec
To: freebsd-fs@freebsd.org
Subject: Re: Proposal: enhancing zfs hold, atomic holds on create, snap and receive
Organization: J. Stefan Institute
In-Reply-To: <54ECB88C.5060305@multiplay.co.uk>
References: <54ECB88C.5060305@multiplay.co.uk>
Message-ID: <355cc6d42f85b8aaff3ab6950b93c990@mailbox.ijs.si>
X-Sender: Mark.Martinec+freebsd@ijs.si
User-Agent: Roundcube Webmail/1.1.0
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Tue, 24 Feb 2015 19:58:00 -0000

Steven Hartland wrote:
> Bookmarks?

If only the sysutils/zxfer would put bookmarks to good use, life would
be easier (in the presence of automatic snapshot management like with
zfs-snapshot-mgmt) ...

  Mark

> On 24/02/2015 16:38, Borja Marcos wrote:
>> Hi :)
>>
>> I've been doing some incremental replication work with ZFS, and I am
>> using holds to prevent user errors.
>> When someone destroys the wrong snapshot, a dataset must be sent
>> wholly because it's no longer possible to perform an incremental send.
>> A hold can prevent it, marking the snapshot as "critical for
>> incremental replication". Of course holds are even better as you can
>> assign several labelled holds to a single snapshot, so that each hold
>> can represent a different reason to keep it.
>>
>> But there's a missing feature which would make them as perfect as they
>> can get: holds are somewhat of an afterthought, a second class citizen
>> compared to properties and, unlike properties, you can't (for example)
>> place a hold atomically on a snapshot when creating it.
>>
>> ZFS has a nice feature that allows you to create an object (snapshot
>> or dataset) and, *atomically* assign a property to it.
>> The same feature applies to create and clone, of course, although it
>> doesn't to receive, which might be useful.
>>
>> So, the proposal is to add a "-h hold1,hold2,..holdN" option to "zfs
>> snap" and ideally zfs receive, so that a hold would be placed
>> atomically with the snapshot creation.
>>
>> This feature would prevent some possible race conditions in snapshot
>> management, which would make them much more useful.
>> I imagine that the -o option was added with the same purpose.
>>
>> What do you think?
>>
>>
>> Thanks!
>> Borja.
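For reference, a rough sketch of the bookmark-based incremental cycle a
tool like zxfer would need to adopt (dataset and host names hypothetical;
assumes a pool with the bookmarks feature enabled):

  # replicate, bookmark the snapshot that was sent, then free its space
  zfs snapshot tank/data@sent
  zfs send tank/data@sent | ssh backuphost zfs receive backup/data
  zfs bookmark tank/data@sent tank/data#sent
  zfs destroy tank/data@sent        # the bookmark survives

  # later, the bookmark still serves as the incremental source
  zfs snapshot tank/data@now
  zfs send -i tank/data#sent tank/data@now | ssh backuphost zfs receive backup/data

Note that -I (sending the intermediate snapshots too) only works from a
snapshot source, not from a bookmark, which is the limitation Borja
raises below.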
From owner-freebsd-fs@FreeBSD.ORG Tue Feb 24 22:56:48 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4D22F9C6; Tue, 24 Feb 2015 22:56:48 +0000 (UTC) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 00485873; Tue, 24 Feb 2015 22:56:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2D7BABjAO1U/95baINbDoNKVQUEgwTAEgqFJ0kCgXIBAQEBAQF8hBABAQQBAQEgKx0DCxsYAgINGQIpAQkmBggHBAEcBIgOCAW7cJkVAQEBAQEFAQEBAQEBAQEagSGJcoJOgU8BARs0B4JogUMFik6IbYNGgzo5hTCGM4YEIoMxWyAxB4EEOX8BAQE X-IronPort-AV: E=Sophos;i="5.09,641,1418101200"; d="scan'208";a="194510132" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 24 Feb 2015 17:56:39 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 8011CB3F40; Tue, 24 Feb 2015 17:56:38 -0500 (EST) Date: Tue, 24 Feb 2015 17:56:38 -0500 (EST) From: Rick Macklem To: Benjamin Kaduk Message-ID: <883898483.10139396.1424818598510.JavaMail.root@uoguelph.ca> In-Reply-To: Subject: Re: NFS, pefs panic: vputx: neg ref cnt MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.10] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org, rmacklem@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Feb 2015 22:56:48 -0000 Benjamin Kaduk wrote: > On Mon, 23 Feb 2015, Brett A Wiggins wrote: > > > I am able to access the NFS share from OSX but when I try and > > access it > > via a Linux machine I get a kernel panic. The core dump is posted > > below; > > > > http://pastebin.com/Da5bciWX > > > > I'm not sure how to read a core dump (I'm not a developer) but > > there is > > a line; > > > > Panic: vputx: neg ref cnt > > This particular panic ~always means "programmer error". > > Looking at the backtrace: > > KDB: stack backtrace: > #0 0xffffffff80963000 at kdb_backtrace+0x60 > #1 0xffffffff80928125 at panic+0x155 It doesn't make sense to call vputx() from VOP_RECLAIM(), since the ref cnt is already 0. vputx() derefs the vnode, making the ref cnt. -1, I think? > #2 0xffffffff809c8b75 at vputx+0x2d5 > #3 0xffffffff8195b3fb at pefs_reclaim+0xdb > #4 0xffffffff80e439a7 at VOP_RECLAIM_APV+0xa7 > #5 0xffffffff809c9951 at vgonel+0x1c1 > #6 0xffffffff809c9de9 at vrecycle+0x59 This call to pefs_inactive() indicates that the ref cnt. would be 0 at this point. > #7 0xffffffff8195b317 at pefs_inactive+0x87 > #8 0xffffffff80e43897 at VOP_INACTIVE_APV+0xa7 > #9 0xffffffff809c8722 at vinactive+0x102 > #10 0xffffffff809c8b12 at vputx+0x272 > #11 0xffffffff808811ee at nfsrvd_readdirplus+0x117e > #12 0xffffffff80863d9e at nfsrvd_dorpc+0x6de > #13 0xffffffff80872d94 at nfssvc_program+0x554 > #14 0xffffffff80b27957 at svc_run_internal+0xc77 > #15 0xffffffff80b26bee at svc_run+0x1de > #16 0xffffffff8087321a at nfsrvd_nfsd+0x1ca > #17 0xffffffff80883417 at nfssvc_nfsd+0x107 > > It seems that both NFS and PEFS are calling vputx on the same vnode. 
> > Given the popularity of NFS, I would be more inclined to suspect PEFS > than > NFS, but it could still be a bug that only appears when both are used > together. > > Adding rmacklem to the cc list since he's the resident NFS expert. > Now for the dumb question...where is the pefs stuff? (I've never heard of it and a quick find/grep didn't locate it in the kernel source tree.) I suspect that there is a bogus vput() call in pefs_reclaim(). Neither pefs_inactive() nor pefs_reclaim() should have a vput() call in them, from what I know of the VFS. rick > -Ben > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 02:45:04 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 94E9B2FE; Wed, 25 Feb 2015 02:45:04 +0000 (UTC) Received: from hergotha.csail.mit.edu (wollman-1-pt.tunnel.tserv4.nyc4.ipv6.he.net [IPv6:2001:470:1f06:ccb::2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 3777826E; Wed, 25 Feb 2015 02:45:03 +0000 (UTC) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.14.9/8.14.9) with ESMTP id t1P2iukq094349; Tue, 24 Feb 2015 21:44:59 -0500 (EST) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.14.9/8.14.4/Submit) id t1P2iu6N094346; Tue, 24 Feb 2015 21:44:56 -0500 (EST) (envelope-from wollman) Date: Tue, 24 Feb 2015 21:44:56 -0500 (EST) From: Garrett Wollman Message-Id: <201502250244.t1P2iu6N094346@hergotha.csail.mit.edu> To: rmacklem@uoguelph.ca Subject: Re: NFS: kernel modules (loading/unloading) and scheduling References: <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca> Organization: none X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (hergotha.csail.mit.edu [127.0.0.1]); Tue, 24 Feb 2015 21:44:59 -0500 (EST) X-Spam-Status: No, score=-1.0 required=5.0 tests=ALL_TRUSTED autolearn=disabled version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on hergotha.csail.mit.edu Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 02:45:04 -0000 In article <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca>, rmacklem@uoguelph.ca writes: >I tend to think that a bias towards doing Getattr/Lookup over Read/Write >may help performance (the old "shortest job first" principal), I'm not >sure you'll have a big enough queue of outstanding RPCs under normal load >for this to make a real difference. I don't think this is a particularly relevant condition here. There are lots of ways RPCs can pile up where you really need to do better work-sharing than the current implementation does. One example is a client that issues lots of concurrent reads (e.g., a compute node running dozens of parallel jobs). 
Two such systems on gigabit NICs
can easily issue large reads fast enough to cause 64 nfsd service
threads to be blocked while waiting for the socket send buffer to drain.
Meanwhile, the file server is completely idle, but unable to respond
to incoming requests, and the other users get angry. Rather than
assigning new threads to requests from the slow clients, it would be
better to let the requests sit until the send buffer drains, and
process other clients' requests instead of letting the resources get
monopolized by a single user.

Lest you think this is purely hypothetical: we actually experienced
this problem today, and I verified with "procstat -kk" that all of the
nfsd threads were in fact blocked waiting for send buffer space to
open up. I was able to restore service immediately by increasing the
number of nfsd threads, but I'm unsure to what extent I can do this
without breaking other things or hitting other bottlenecks.[1] So I
have a user asking me why I haven't enabled fair-share scheduling for
NFS, and I'm going to have to tell him the answer is "no such thing".

-GAWollman

[1] What would the right number actually be? We could potentially
have many thousands of threads in a compute cluster all operating
simultaneously on the same filesystem, well within the I/O capacity of
the server, and we'd really like to degrade gracefully rather than
falling over when a single slow client soaks up all of the nfsd worker
threads.

From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 09:35:16 2015
Return-Path: 
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2DEE54C9 for ; Wed, 25 Feb 2015 09:35:16 +0000 (UTC)
Received: from cu01176a.smtpx.saremail.com (cu01176a.smtpx.saremail.com [195.16.150.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DCCD18BA for ; Wed, 25 Feb 2015 09:35:15 +0000 (UTC)
Received: from [172.16.2.2] (izaro.sarenet.es [192.148.167.11]) by proxypop03.sare.net (Postfix) with ESMTPSA id 5CF3D9DC718; Wed, 25 Feb 2015 10:35:12 +0100 (CET)
Subject: Re: Proposal: enhancing zfs hold, atomic holds on snap and receive
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=windows-1252
From: Borja Marcos
In-Reply-To: <54ECB88C.5060305@multiplay.co.uk>
Date: Wed, 25 Feb 2015 10:35:10 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: 
References: <54ECB88C.5060305@multiplay.co.uk>
To: Steven Hartland
X-Mailer: Apple Mail (2.1283)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Wed, 25 Feb 2015 09:35:16 -0000

On Feb 24, 2015, at 6:44 PM, Steven Hartland wrote:

> Bookmarks?

If you mean using bookmarks rather than snapshots, it's a lean solution
but with some shortcomings.

Bookmarks are fine if you just want to keep a copy of the data. And it
can save a lot of room; of course, it's still compatible with replication
to multiple targets, etc.

But replication with snapshots has an advantage. You have the option to
copy not just data changes between a bookmark and a snapshot, but also
all the intermediate snapshots in between.
If you keep some automated snapshot policy with the purpose of keeping
recovery points (a Time Machine of sorts), bookmark replication won't
keep all that information on the target, or at the very least it will
require the replication tool to work snapshot by snapshot. With
incremental snapshot replication you can use -I to send an incremental
stream including intermediate snapshots.

Snapshots, compared to bookmarks, also make replication code simpler: it
doesn't need to be explicitly "aware" of other snapshot management being
used concurrently.

Apart from this not so particular case, in my opinion holds in their
present state are an incomplete feature. Their usefulness is severely
limited by the potential to suffer a race condition between a snapshot
creation and placing a hold on it. Why can user properties, several,
even, be set atomically with a dataset or snapshot creation? Exactly the
same reasons apply to holds. Actually, holds in their present state are,
in my opinion, the poisoned candy: sweet enough to lure the unsuspecting
programmer into using them and suffering a race condition.

Of course you can resort to recipes such as creating a snapshot with a
temporary name, placing a hold, and renaming it afterwards in order to
protect it, as long as you keep some strict naming discipline, but it
won't be as foolproof.

I was wrong with the "-h hold1,hold2,..holdN" syntax, and I see that it
should be "-h hold1 .. -h holdN" instead, in the same way as user
properties. Actually, although this will prove much harder as the
send/receive format is supposed to be frozen, holds should be sent along
with the replication stream. Or at the very least it would be nice if
"zfs receive" could place a hold on the received snapshot. And I was
wrong in mentioning "create", sorry. Holds don't apply to datasets.

So, to summarize the changes proposed:

zfs snap [-h hold] would create a hold atomically with snapshot creation,
much like [-o property]. Multiple holds would be specified with multiple
-h options.

zfs send [-h] would make the stream contain holds information, working in
the same way as "-p". Extending -R to encompass holds may look like a
good thing, but it can be more problematic, as it would change its
behavior; I would rather require the user to specify -h in order to
explicitly send the holds.

zfs receive [-h hold] would be an ugly kludge in case it wasn't possible
to send holds as part of a replication stream. But it still would offer
the possibility of atomic protection of a just-received snapshot.

I've been looking at the code and it's not that hard, although it will
require some surgery down to the ioctl level!

Anyway, should I take this proposal to the ZFS mailing lists, or are
there enough of the relevant members lurking here? A change such as this
one should be done for all the ZFS implementations.

Cheers,

Borja.

that a feature such as the holds is per se incomplete when you can
suffer a race condition when creating a snapshot and placing a hold on
it. Why was it decided that properties, several even, could be added
atomically with
can properties be applied atomically with a snapshot creation? Exactly
for the same reason.
as it can coexist with your manual snapshot management (for example,
doing a snap right before an upgrade in order to have a recovery option)
and incremental replication of snapshots offers the possibility of
replicating all the snapshots in between.

: First, imagine that I have two servers: the active and the reserve at a
remote location and I am

>
> On 24/02/2015 16:38, Borja Marcos wrote:
>> Hi :)
>>
>> I've been doing some incremental replication work with ZFS, and I am
>> using holds to prevent user errors.
>> When someone destroys the wrong snapshot, a dataset must be sent
>> wholly because it's no longer possible to perform an incremental send.
>> A hold can prevent it, marking the snapshot as "critical for
>> incremental replication". Of course holds are even better as you can
>> assign several labelled holds to a single snapshot, so that each hold
>> can represent a different reason to keep it.
>>
>> But there's a missing feature which would make them as perfect as they
>> can get: holds are somewhat of an afterthought, a second class citizen
>> compared to properties and, unlike properties, you can't (for example)
>> place a hold atomically on a snapshot when creating it.
>>
>> ZFS has a nice feature that allows you to create an object (snapshot
>> or dataset) and, *atomically* assign a property to it.
>> The same feature applies to create and clone, of course, although it
>> doesn't to receive, which might be useful.
>>
>> So, the proposal is to add a "-h hold1,hold2,..holdN" option to "zfs
>> snap" and ideally zfs receive, so that a hold would be placed
>> atomically with the snapshot creation.
>>
>> This feature would prevent some possible race conditions in snapshot
>> management, which would make them much more useful.
>> I imagine that the -o option was added with the same purpose.
>>
>> What do you think?
>>
>>
>> Thanks!
>>
>>
>>
>>
>> Borja.
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 10:01:41 2015
Return-Path: 
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2E197D28 for ; Wed, 25 Feb 2015 10:01:41 +0000 (UTC)
Received: from cu01176a.smtpx.saremail.com (cu01176a.smtpx.saremail.com [195.16.150.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id DC5F3C09 for ; Wed, 25 Feb 2015 10:01:40 +0000 (UTC)
Received: from [172.16.2.2] (izaro.sarenet.es [192.148.167.11]) by proxypop03.sare.net (Postfix) with ESMTPSA id 8B6E09DD50A; Wed, 25 Feb 2015 11:01:38 +0100 (CET)
Subject: Re: Proposal: enhancing zfs hold, atomic holds on snap and receive
Mime-Version: 1.0 (Apple Message framework v1283)
Content-Type: text/plain; charset=windows-1252
From: Borja Marcos
In-Reply-To: 
Date: Wed, 25 Feb 2015 11:01:37 +0100
Content-Transfer-Encoding: quoted-printable
Message-Id: <3788338C-B63A-4ACD-8EDC-95EFD4A85A54@sarenet.es>
References: <54ECB88C.5060305@multiplay.co.uk>
To: Borja Marcos
X-Mailer: Apple Mail (2.1283)
Cc: freebsd-fs@freebsd.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.18-1
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive: 
List-Post: 
List-Help: 
List-Subscribe: ,
X-List-Received-Date: Wed, 25 Feb 2015 10:01:41 -0000

On Feb 25, 2015, at 10:35 AM, Borja Marcos wrote:

> First, imagine that I have two servers: the active and the reserve at a
> remote location and I am

My apologies for the salad at the end of the message. That's what happens
when the telephone interrupts you plenty of times while you try to keep
focus.

Borja.
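For concreteness, the workflow being proposed would look something like
this — every -h option below is hypothetical, proposed syntax that exists
in no ZFS implementation at the time of writing (names also hypothetical):

  # proposed: snapshot created with a hold in one atomic operation
  zfs snap -h replication tank/data@backup

  # proposed: send holds along with the stream, or let the receiver
  # place its own hold atomically upon receipt
  zfs send -h -i tank/data@prev tank/data@backup | \
      ssh backuphost zfs receive -h replication backup/data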
From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 10:21:47 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 93DD1907 for ; Wed, 25 Feb 2015 10:21:47 +0000 (UTC) Received: from mail-lb0-x22f.google.com (mail-lb0-x22f.google.com [IPv6:2a00:1450:4010:c04::22f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 13857D9D for ; Wed, 25 Feb 2015 10:21:47 +0000 (UTC) Received: by lbiz12 with SMTP id z12so2752068lbi.11 for ; Wed, 25 Feb 2015 02:21:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=yId9yNCPTrGhE6jSZArv0wQSww5s5x1DqWn0WhBueFY=; b=Y5hDfG8JG1Y6Txe+saCfVv9apsmUQK2buA/xrA87DkKY/K9E3ulAnYk3UhMHIlxHZK nJ26i4Z655PFjde3ZML21B906MZVAfI5GK5Ca8FIGWDM3PmKd3o6jHtvbSKj8rPwkJxH yhtdm1gL1jEnd/gttazJ8AxVLnN3Q+9UYsqa9TkkSYmBjd1sfUYDZlb14bjFN15sNs1q zsV5pPFOKcQWfhskvAd96zIPojnAoeQuKVrenMAAYoX0kLM2RvFknhjoeqQrlMzjqPFA qTnxEkSjuapIsccHAJ6cKj6/GjwaradqwR7SfIdl29MOgEa0A1xHUcuhi9gP+E4J42BD YZMw== MIME-Version: 1.0 X-Received: by 10.152.36.232 with SMTP id t8mr2127895laj.90.1424859705159; Wed, 25 Feb 2015 02:21:45 -0800 (PST) Received: by 10.25.24.224 with HTTP; Wed, 25 Feb 2015 02:21:45 -0800 (PST) In-Reply-To: References: Date: Wed, 25 Feb 2015 15:51:45 +0530 Message-ID: Subject: Re: Zoned Commands ZBS/ZAC, Shingled SMR/SFS, ZFS From: Shehbaz Jaffer To: grarpamp Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 10:21:47 -0000 Hi, I am interested in collaborating with you for bringing in ZFS in freeBSD. Couple of questions: 1. Can we emulate ZFS on Qemu or some other hypervisor or do we need SMR drives for development / integration? 2. How do you propose we use the open source ZFS libraries and include them in freeBSD? On Sat, Feb 21, 2015 at 3:04 AM, grarpamp wrote: > I typo ZBS in subject instead of correct ZBC, this MUA > couldn't fix it on reply without breaking thread refs, sorry. 
> _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- Shehbaz Jaffer MTS | Advanced Technology Group | NetApp From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 10:27:40 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 731BFB14 for ; Wed, 25 Feb 2015 10:27:40 +0000 (UTC) Received: from mail-wg0-f46.google.com (mail-wg0-f46.google.com [74.125.82.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 05ADDE7A for ; Wed, 25 Feb 2015 10:27:39 +0000 (UTC) Received: by wggz12 with SMTP id z12so2702727wgg.2 for ; Wed, 25 Feb 2015 02:27:32 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=+1fz61tIa11uC9edCwk1IaobTPOenKUDduJ/hXAwYXs=; b=Pyo0T5YgzH64I/IoT7LM6xNCFxrM6xxvUVVQbA4snHlYcyAiH46E/zp2lRBqEiOdXR 6C4JP34RqPbJO58WpZfcTcNtPekSfHUmWLI31SkLQM2dSi1jUjx/LPftAtgQBFZL+hAa E29YApKDJd/Hky7Q4EJ/aE8F2Ruzvh1wBBM6DvGStuaEydKnX5HjY1LPs9fNbU7aWM8q 6c/ngk+ciFHK47nB5zLVXUjRfuoGP+fx5AGlTfUItDaQHSFYKEra9UVEJE2Hi62PfLZE pzDttNyEl9e8VRtM7O+d9IMDwcxn6smWRbwB1dMctrT7DHm1I3JHd3KBnsZsUqvFw5vU 0OfA== X-Gm-Message-State: ALoCoQm4WyDKcZ4ljLdMrtW3MkYLgS5168nhv2ycvsSAQNLHuU1c8zXlwL1hXlfzYQR8IwuWIN+G X-Received: by 10.194.60.104 with SMTP id g8mr4840292wjr.96.1424860052157; Wed, 25 Feb 2015 02:27:32 -0800 (PST) Received: from [10.10.1.68] (82-69-141-170.dsl.in-addr.zen.co.uk. [82.69.141.170]) by mx.google.com with ESMTPSA id k6sm24480024wia.6.2015.02.25.02.27.30 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 25 Feb 2015 02:27:31 -0800 (PST) Message-ID: <54EDA38F.7000107@multiplay.co.uk> Date: Wed, 25 Feb 2015 10:27:27 +0000 From: Steven Hartland User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: Zoned Commands ZBS/ZAC, Shingled SMR/SFS, ZFS References: In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 10:27:40 -0000 On 25/02/2015 10:21, Shehbaz Jaffer wrote: > Hi, > > I am interested in collaborating with you for bringing in ZFS in > freeBSD. Couple of questions: > > 1. Can we emulate ZFS on Qemu or some other hypervisor or do we need > SMR drives for development / integration? > 2. How do you propose we use the open source ZFS libraries and include > them in freeBSD? 
Confused FreeBSD already has ZFS From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 10:49:26 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 730D5FCB for ; Wed, 25 Feb 2015 10:49:26 +0000 (UTC) Received: from mail-la0-x230.google.com (mail-la0-x230.google.com [IPv6:2a00:1450:4010:c03::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id E4A9C12C for ; Wed, 25 Feb 2015 10:49:25 +0000 (UTC) Received: by labgm9 with SMTP id gm9so3003896lab.2 for ; Wed, 25 Feb 2015 02:49:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=ATzHx4rHV5Jib75aiBU+1j4F9rj3EwtSa+Vof+WnjHI=; b=EXwWi4i4U9/0HoHhF1ZJtgKp79FUQgdXciWJYVoFAFMC3m+YSFqFfJ0Od661ccC/OT ++ls/wlINHY3a4UVfQ/zhIzgXGIwJyboj55u4EU5CIBji5EXMOfEAy/pk5Czgh4/d6EK ji8h/aVASeTvy7QSeD3qkFzGos7kbWHci/5t114c4YkRHlfbViawHvH3/89I4hPHHcVL pInTxGcUo6+Gn9WKmKTAmL1xwyI+QNOGENWvEmlNwTkyVNjW73d/90YxwvzybgMKuLwt zFlR8bMR45RysVfJbvPu4ehk8wf9h3ubFeQJ15TRTYgBYRPNy/wbF11FD5QUBGFkN0Ya vvEA== MIME-Version: 1.0 X-Received: by 10.152.179.172 with SMTP id dh12mr2182314lac.76.1424861363832; Wed, 25 Feb 2015 02:49:23 -0800 (PST) Received: by 10.25.24.224 with HTTP; Wed, 25 Feb 2015 02:49:23 -0800 (PST) In-Reply-To: <54EDA38F.7000107@multiplay.co.uk> References: <54EDA38F.7000107@multiplay.co.uk> Date: Wed, 25 Feb 2015 16:19:23 +0530 Message-ID: Subject: Re: Zoned Commands ZBS/ZAC, Shingled SMR/SFS, ZFS From: Shehbaz Jaffer To: Steven Hartland Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 10:49:26 -0000 Sorry for the confusion. I thought proposal was to implement FreeBSD- ZFS on SMR Drives. So rephrasing the question: 1. How will we emulate SMR Drives on QEMU? 2. How should we go about using open source SMR Drives and use these in freeBSD? Thanks, On Wed, Feb 25, 2015 at 3:57 PM, Steven Hartland wrote: > > On 25/02/2015 10:21, Shehbaz Jaffer wrote: >> >> Hi, >> >> I am interested in collaborating with you for bringing in ZFS in >> freeBSD. Couple of questions: >> >> 1. Can we emulate ZFS on Qemu or some other hypervisor or do we need >> SMR drives for development / integration? >> 2. How do you propose we use the open source ZFS libraries and include >> them in freeBSD? 
> > Confused FreeBSD already has ZFS > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 13:13:27 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id BE9E87E5 for ; Wed, 25 Feb 2015 13:13:27 +0000 (UTC) Received: from mail-wi0-f176.google.com (mail-wi0-f176.google.com [209.85.212.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 7392565C for ; Wed, 25 Feb 2015 13:13:27 +0000 (UTC) Received: by mail-wi0-f176.google.com with SMTP id h11so33196927wiw.3 for ; Wed, 25 Feb 2015 05:13:20 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=1Xo1tN1ljy6yh1CaW6EUtSDVe8uk1trCfOJYU3DFUtw=; b=IePOWhr5PSkaJYZiJ0E7nQY7fyqKSpEuvi4CjpjsaJCk563ZabZ6mW6h9dYFxBR3Sw Dk+RL+L9c5V6AdVNG7qsZY5VimRM3vSPmmVCu33G6MTK9cwTYJmeI3dx1PqHlOzSHvUN Sg3hwVnEWnmzfZLcXDZBRtT81iRzfB9Pw1fHof0fmFe8TZVzttwn86WR2EMegcYz9m0d 27lRyWM9FJjpWfEwWvaQsAGqZZAm41AL5wN/ha5gbUMZ1r968zVu59n0wy5I2yEQyWZO 3UmQ/MXptkhP2KPx6a7Rf/+YMwYwRQHTf4gwFO0u/Dac/SH8AVf5Xn7V60KQa4nApUd9 c2Lg== X-Gm-Message-State: ALoCoQkeg5txPRBofVFsMR1IXqa1Ftck1jId+muT24WfSeJOG4Ebuvj0Df7yZvyRTcQg2ybC86g3 X-Received: by 10.180.72.98 with SMTP id c2mr40341637wiv.87.1424869999061; Wed, 25 Feb 2015 05:13:19 -0800 (PST) MIME-Version: 1.0 Received: by 10.27.143.19 with HTTP; Wed, 25 Feb 2015 05:12:58 -0800 (PST) In-Reply-To: <201502250244.t1P2iu6N094346@hergotha.csail.mit.edu> References: <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca> <201502250244.t1P2iu6N094346@hergotha.csail.mit.edu> From: Tim Borgeaud Date: Wed, 25 Feb 2015 13:12:58 +0000 Message-ID: Subject: Re: NFS: kernel modules (loading/unloading) and scheduling To: Garrett Wollman Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.18-1 Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 13:13:27 -0000 Hi Rick, Garret & others, Thanks for the replies and useful info. I may take a look at enabling some kernel module reloading, but, if the usual approach is rebooting, I expect that this won't be an issue. Regarding the NFS functionality itself, I can give a bit more of an overview of our (Framestore) NFS use and what NFS server functionality we are considering. We have NFS storage systems accessed by both an HPC cluster and also multiple desktop users. Like many organizations, we want to have, and get the most from, high performance NFS storage, in terms of both total IO and low latency. One of the demands that we face is to allow users, as nodes of the cluster or more interactive desktops, to be provided with a reasonable level of service from heavily loaded systems. 
We do want to allow "users" to apply high load, but would like to avoid such requests from starving others. We are considering scheduling of NFS RPCs such that both client transports (xprt) and users themselves are given a fair share, including where a single user is using multiple transports. At the same time, we don't want hurt performance by loss of file handle affinity or similar (though it may be quite nice to remove responsibility from the RPC/NFS layer). We've already prototyped a 'fair' schedule when servicing the RPC requests, which appears to be an improvement. But, as Garrett pointed out, we may have moved onto bottlenecks in sending replies and, possibly, reading requests. It may be that, with such cases as slow clients, overall performance could also be improved. -- Tim Borgeaud Systems Developer On 25 February 2015 at 02:44, Garrett Wollman < wollman@hergotha.csail.mit.edu> wrote: > In article > <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca>, > rmacklem@uoguelph.ca writes: > > >I tend to think that a bias towards doing Getattr/Lookup over Read/Write > >may help performance (the old "shortest job first" principal), I'm not > >sure you'll have a big enough queue of outstanding RPCs under normal load > >for this to make a real difference. > > I don't think this is a particularly relevant condition here. There > are lots of ways RPCs can pile up where you really need to do better > work-sharing than the current implementation does. One example is a > client that issues lots of concurrent reads (e.g., a compute node > running dozens of parallel jobs). Two such systems on gigabit NICs > can easily issue large reads fast enough to cause 64 nfsd service > threads to blocked while waiting for the socket send buffer to drain. > Meanwhile, the file server is completely idle, but unable to respond > to incoming requests, and the other users get angry. Rather than > assigning new threads to requests from the slow clients, it would be > better to let the requests sit until the send buffer drains, and > process other clients' requests instead of letting the resources get > monopolized by a single user. > > Lest you think this is purely hypothetical: we actually experienced > this problem today, and I verified with "procstat -kk" that all of the > nfsd threads were in fact blocked waiting for send buffer space to > open up. I was able to restore service immediately by increasing the > number of nfsd threads, but I'm unsure to what extent I can do this > without breaking other things or hitting other bottlenecks.[1] So I > have a user asking me why I haven't enable fair-share scheduling for > NFS, and I'm going to have to tell him the answer is "no such thing". > > -GAWollman > > [1] What would the right number actually be? We could potentially > have many thousands of threads in a compute cluster all operating > simultaneously on the same filesystem, well within the I/O capacity of > the server, and we'd really like to degrade gracefully rather than > falling over when a single slow client soaks up all of the nfsd worker > threads. 
> _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 14:57:12 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 151BF832 for ; Wed, 25 Feb 2015 14:57:12 +0000 (UTC) Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com [66.111.4.25]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id D9918213 for ; Wed, 25 Feb 2015 14:57:11 +0000 (UTC) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.nyi.internal (Postfix) with ESMTP id 2C0B120DCD for ; Wed, 25 Feb 2015 09:57:03 -0500 (EST) Received: from web3 ([10.202.2.213]) by compute1.internal (MEProxy); Wed, 25 Feb 2015 09:57:03 -0500 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=message-id:x-sasl-enc:from:to :mime-version:content-transfer-encoding:content-type:in-reply-to :references:subject:date; s=smtpout; bh=zMD9SqVGanasr837ZDraY2z6 Dq0=; b=plgQ+YIixGUyADvFj0BquuPglAC1TFX2i2eiVCcNarpLQ/N6/Nv6a4qm MWr/7Lc5wOOxwSAbpjzBGykgI+UVe9GXZX29BsZmkpXGozqmRBTguwzoRYj1Du6r P9h5I80tvaW3gXCvwzrZTFFth4tEJfqKBUObKepVNinMDB/cFJE= Received: by web3.nyi.internal (Postfix, from userid 99) id D2EAE11B4A1; Wed, 25 Feb 2015 09:57:03 -0500 (EST) Message-Id: <1424876223.3248960.232271733.32363554@webmail.messagingengine.com> X-Sasl-Enc: hxf33gbwIe2pdLN54KpRdXIuow96vmxvc/JmEasIimQR 1424876223 From: Mark Felder To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain X-Mailer: MessagingEngine.com Webmail Interface - ajax-4ba7306c In-Reply-To: References: Subject: Re: Zoned Commands ZBS/ZAC, Shingled SMR/SFS, ZFS Date: Wed, 25 Feb 2015 08:57:03 -0600 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 14:57:12 -0000 On Fri, Feb 20, 2015, at 14:15, grarpamp wrote: > These may be of interest for possible integration... > Do I understand the scope of this correctly: - SMR drives are the future for increasing drive capacity - SMR drives are terrible at random IO - There is work in Linux to detect these SMR drives and alter ZFS' behavior Is that correct? If so, it's interesting... but sort of annoying that we are basically creating quirks and a wildly different codepath because the drive manufacturers haven't figured out another way to increase density and keep expected behavior. 
From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 22:08:25 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 779F468E; Wed, 25 Feb 2015 22:08:25 +0000 (UTC) Received: from khavrinen.csail.mit.edu (khavrinen.csail.mit.edu [IPv6:2001:470:8b2d:1e1c:21b:21ff:feb8:d7b0]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "khavrinen.csail.mit.edu", Issuer "Client CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 45106B1; Wed, 25 Feb 2015 22:08:25 +0000 (UTC) Received: from khavrinen.csail.mit.edu (localhost [127.0.0.1]) by khavrinen.csail.mit.edu (8.14.9/8.14.9) with ESMTP id t1PM8NSC060838 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL CN=khavrinen.csail.mit.edu issuer=Client+20CA); Wed, 25 Feb 2015 17:08:23 -0500 (EST) (envelope-from wollman@khavrinen.csail.mit.edu) Received: (from wollman@localhost) by khavrinen.csail.mit.edu (8.14.9/8.14.9/Submit) id t1PM8MGf060835; Wed, 25 Feb 2015 17:08:22 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21742.18390.976511.707403@khavrinen.csail.mit.edu> Date: Wed, 25 Feb 2015 17:08:22 -0500 From: Garrett Wollman To: freebsd-fs@freebsd.org Subject: Implementing backpressure in the NFS server X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (khavrinen.csail.mit.edu [127.0.0.1]); Wed, 25 Feb 2015 17:08:23 -0500 (EST) Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 22:08:25 -0000 Here's the scenario: 1) A small number of (Linux) clients run a large number of processes (compute jobs) that read large files sequentially out of an NFS filesystem. Each process is reading from a different file. 2) The clients are behind a network bottleneck. 3) The Linux NFS client will issue NFS3PROC_READ RPCs (potentially including read-ahead) independently for each process. 4) The network bottleneck does not serve to limit the rate at which read RPCs can be issued, because the requests are small (it's only the responses that are large). 5) Even if the responses are delayed, causing one process to block, there are sufficient other processes that are still runnable to allow more reads to be issued. 6) On the server side, because these are requests for different file handles, they will get steered to different NFS service threads by the generic RPC queueing code. 7) Each service thread will process the read to completion, and then block when the reply is transmitted because the socket buffer is full. 8) As more reads continue to be issued by the clients, more and more service threads are stuck waiting for the socket buffer until all of the nfsd threads are blocked. 9) The server is now almost completely idle. Incoming requests can only be serviced when one of the nfsd threads finally manages to put its pending reply on the socket send queue, at which point it can return to the RPC code and pick up one request -- which, because the incoming queues are full of pending reads from the problem clients, is likely to get stuck in the same place. 
Lather, rinse, repeat. What should happen here? As an administrator, I can certainly increase the number of NFS service threads until there are sufficient threads available to handle all of the offered load -- but the load varies widely over time, and it's likely that I would run into other resource constraints if I did this without limit. (Is 1000 threads practical? What happens when a different mix of RPCs comes in -- will it livelock the server?) I'm of the opinion that we need at least one of the following things to mitigate this issue, but I don't have a good knowledge of the RPC code to have an idea how feasible this is: a) Admission control. RPCs should not be removed from the receive queue if the transmit queue is over some high-water mark. This will ensure that a problem client behind a network bottleneck like this one will eventually feel backpressure via TCP window contraction if nothing else. This will also make it more likely that other clients will still get their RPCs processed even if most service threads are taken up by the problem clients. b) Fairness scheduling. There should be some parameter, configurable by the administrator, that restricts the number of nfsd threads any one client can occupy, independent of how many requests it has pending. A really advanced scheduler would allow bursting over the limit for some small number of requests. Does anyone else have thoughts, or even implementation ideas, on this? -GAWollman From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 23:26:14 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6A9906CF; Wed, 25 Feb 2015 23:26:14 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [IPv6:2001:470:1f05:b76::196]) by mx1.freebsd.org (Postfix) with ESMTP id 55B52C02; Wed, 25 Feb 2015 23:26:14 +0000 (UTC) Received: from AlfredMacbookAir.local (3.sub-70-208-71.myvzw.com [70.208.71.3]) by elvis.mu.org (Postfix) with ESMTPSA id DBE70341F912; Wed, 25 Feb 2015 15:26:12 -0800 (PST) Message-ID: <54EE5AE9.1000908@freebsd.org> Date: Wed, 25 Feb 2015 18:29:45 -0500 From: Alfred Perlstein Organization: FreeBSD User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Garrett Wollman , freebsd-fs@freebsd.org Subject: Re: Implementing backpressure in the NFS server References: <21742.18390.976511.707403@khavrinen.csail.mit.edu> In-Reply-To: <21742.18390.976511.707403@khavrinen.csail.mit.edu> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 23:26:14 -0000 On 2/25/15 5:08 PM, Garrett Wollman wrote: > Here's the scenario: > > 1) A small number of (Linux) clients run a large number of processes > (compute jobs) that read large files sequentially out of an NFS > filesystem. Each process is reading from a different file. > > 2) The clients are behind a network bottleneck. > > 3) The Linux NFS client will issue NFS3PROC_READ RPCs (potentially > including read-ahead) independently for each process. 
> > 4) The network bottleneck does not serve to limit the rate at which > read RPCs can be issued, because the requests are small (it's only the > responses that are large). > > 5) Even if the responses are delayed, causing one process to block, > there are sufficient other processes that are still runnable to allow > more reads to be issued. > > 6) On the server side, because these are requests for different file > handles, they will get steered to different NFS service threads by the > generic RPC queueing code. > > 7) Each service thread will process the read to completion, and then > block when the reply is transmitted because the socket buffer is full. > > 8) As more reads continue to be issued by the clients, more and more > service threads are stuck waiting for the socket buffer until all of > the nfsd threads are blocked. > > 9) The server is now almost completely idle. Incoming requests can > only be serviced when one of the nfsd threads finally manages to put > its pending reply on the socket send queue, at which point it can > return to the RPC code and pick up one request -- which, because the > incoming queues are full of pending reads from the problem clients, is > likely to get stuck in the same place. Lather, rinse, repeat. > > What should happen here? As an administrator, I can certainly > increase the number of NFS service threads until there are sufficient > threads available to handle all of the offered load -- but the load > varies widely over time, and it's likely that I would run into other > resource constraints if I did this without limit. (Is 1000 threads > practical? What happens when a different mix of RPCs comes in -- will > it livelock the server?) > > I'm of the opinion that we need at least one of the following things > to mitigate this issue, but I don't have a good knowledge of the RPC > code to have an idea how feasible this is: > > a) Admission control. RPCs should not be removed from the receive > queue if the transmit queue is over some high-water mark. This will > ensure that a problem client behind a network bottleneck like this one > will eventually feel backpressure via TCP window contraction if > nothing else. This will also make it more likely that other clients > will still get their RPCs processed even if most service threads are > taken up by the problem clients. > > b) Fairness scheduling. There should be some parameter, configurable > by the administrator, that restricts the number of nfsd threads any > one client can occupy, independent of how many requests it has > pending. A really advanced scheduler would allow bursting over the > limit for some small number of requests. > > Does anyone else have thoughts, or even implementation ideas, on this?
The default number of threads is insanely low; the only reason I didn't bump them to FreeNAS levels (or higher) was the inevitable bikeshed/cryfest about Alfred touching defaults, so I didn't bother. I kept them really small because, y'know, people whine, and they are capped at ncpu * 8; it really should be higher, imo. Just increase the number of nfsd threads to something higher; I think we were at 256 threads in FreeNAS and it did us just fine. Higher seemed OK, except we lost a bit of performance. The only problem you might see is on SMALL machines, where people will complain. So you probably want an arch-specific override, or perhaps a memory-based sliding scale. If that could become a FreeBSD default (with overrides for small-memory machines and arches), that would be even better.
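For anyone who wants to try it, a minimal sketch of the knobs involved (the values here are illustrative, not tuned recommendations): the thread count is the -n argument passed to nfsd, so in /etc/rc.conf something like

  nfs_server_enable="YES"
  nfs_server_flags="-u -t -n 256"

followed by a restart of nfsd raises the pool from the tiny default. Systems running the new NFS server should also have vfs.nfsd.minthreads and vfs.nfsd.maxthreads sysctls that let you adjust the pool at runtime, if your version provides them.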
I think your other suggestions are fine; however, the problem is that: 1) they seem complex for an edge case 2) turning them on may tank performance for no good reason if the heuristic is met but we're not in the bad situation That said, if you want to pursue those options, by all means please do. -Alfred From owner-freebsd-fs@FreeBSD.ORG Wed Feb 25 23:36:28 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9F3118F1; Wed, 25 Feb 2015 23:36:28 +0000 (UTC) Received: from khavrinen.csail.mit.edu (khavrinen.csail.mit.edu [IPv6:2001:470:8b2d:1e1c:21b:21ff:feb8:d7b0]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "khavrinen.csail.mit.edu", Issuer "Client CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 25F54CEC; Wed, 25 Feb 2015 23:36:28 +0000 (UTC) Received: from khavrinen.csail.mit.edu (localhost [127.0.0.1]) by khavrinen.csail.mit.edu (8.14.9/8.14.9) with ESMTP id t1PNaQ0b062094 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL CN=khavrinen.csail.mit.edu issuer=Client+20CA); Wed, 25 Feb 2015 18:36:26 -0500 (EST) (envelope-from wollman@khavrinen.csail.mit.edu) Received: (from wollman@localhost) by khavrinen.csail.mit.edu (8.14.9/8.14.9/Submit) id t1PNaQbl062091; Wed, 25 Feb 2015 18:36:26 -0500 (EST) (envelope-from wollman) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <21742.23674.220013.63261@khavrinen.csail.mit.edu> Date: Wed, 25 Feb 2015 18:36:26 -0500 From: Garrett Wollman To: Alfred Perlstein Subject: Re: Implementing backpressure in the NFS server In-Reply-To: <54EE5AE9.1000908@freebsd.org> References: <21742.18390.976511.707403@khavrinen.csail.mit.edu> <54EE5AE9.1000908@freebsd.org> X-Mailer: VM 7.17 under 21.4 (patch 22) "Instant Classic" XEmacs Lucid X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (khavrinen.csail.mit.edu [127.0.0.1]); Wed, 25 Feb 2015 18:36:26 -0500 (EST) Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 25 Feb 2015 23:36:28 -0000 < said: > I think your other suggestions are fine; however, the problem is that: > 1) they seem complex for an edge case > 2) turning them on may tank performance for no good reason if the > heuristic is met but we're not in the bad situation I'm OK with trading off performance for one user against availability for the other 800. I'm pleased to hear that FreeNAS was fine with 256 threads as the default; I'm going to try running with that to see if we encounter any other scaling issues as a result. (Our servers are all 12-core, 24-thread systems with buckets of memory, and I remember increasing the thread count as high as 128 previously, but I also remember having to back it down to 64 for some reason.)
-GAWollman From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 00:27:38 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CDF5342C; Thu, 26 Feb 2015 00:27:38 +0000 (UTC) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 77B9D214; Thu, 26 Feb 2015 00:27:37 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CSBQBbZe5U/95baINbg1RaBIMFwBUKhSdJAoFoAQEBAQEBfIQPAQEBAwEBAQEgBCcgCwUWGAICDRkCKQEJJgYIBwQBHASIBggNvEWZIQEBAQEBBQEBAQEBAQEbgSGIdH6EDAsFAgEbNAeCaIFDBYpQgw2FZoNGgzo5gmGCUUuIMoM+IoF9AgMcgW4gMQd7AQQbIn8BAQE X-IronPort-AV: E=Sophos;i="5.09,648,1418101200"; d="scan'208";a="194869451" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 25 Feb 2015 19:27:30 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 9642CB3F01; Wed, 25 Feb 2015 19:27:30 -0500 (EST) Date: Wed, 25 Feb 2015 19:27:30 -0500 (EST) From: Rick Macklem To: Alfred Perlstein Message-ID: <2109245074.408271.1424910450599.JavaMail.root@uoguelph.ca> In-Reply-To: <54EE5AE9.1000908@freebsd.org> Subject: Re: Implementing backpressure in the NFS server MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.12] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 00:27:38 -0000 Alfred Perlstein wrote: > > On 2/25/15 5:08 PM, Garrett Wollman wrote: > > Here's the scenario: > > > > 1) A small number of (Linux) clients run a large number of > > processes > > (compute jobs) that read large files sequentially out of an NFS > > filesystem. Each process is reading from a different file. > > > > 2) The clients are behind a network bottleneck. > > > > 3) The Linux NFS client will issue NFS3PROC_READ RPCs (potentially > > including read-ahead) independently for each process. > > > > 4) The network bottleneck does not serve to limit the rate at which > > read RPCs can be issued, because the requests are small (it's only > > the > > responses that are large). > > > > 5) Even if the responses are delayed, causing one process to block, > > there are sufficient other processes that are still runnable to > > allow > > more reads to be issued. > > > > 6) On the server side, because these are requests for different > > file > > handles, they will get steered to different NFS service threads by > > the > > generic RPC queueing code. > > > > 7) Each service thread will process the read to completion, and > > then > > block when the reply is transmitted because the socket buffer is > > full. > > > > 8) As more reads continue to be issued by the clients, more and > > more > > service threads are stuck waiting for the socket buffer until all > > of > > the nfsd threads are blocked. > > > > 9) The server is now almost completely idle. 
Incoming requests can > > only be serviced when one of the nfsd threads finally manages to > > put > > its pending reply on the socket send queue, at which point it can > > return to the RPC code and pick up one request -- which, because > > the > > incoming queues are full of pending reads from the problem clients, > > is > > likely to get stuck in the same place. Lather, rinse, repeat. > > > > What should happen here? As an administrator, I can certainly > > increase the number of NFS service threads until there are > > sufficient > > threads available to handle all of the offered load -- but the load > > varies widely over time, and it's likely that I would run into > > other > > resource constraints if I did this without limit. (Is 1000 threads > > practical? What happens when a different mix of RPCs comes in -- > > will > > it livelock the server?) > >
As far as I know, even 256 is an arbitrary limit left over from when a server was typically a single-core i386. Since they are just kernel threads, the idle ones add very little overhead. There is a 256 limit wired into the sources, but you can increase this and recompile. (MAXNFSDCNT in nfsd.c) I can't think of why 1000 threads isn't practical for server hardware of the size you run.
> > I'm of the opinion that we need at least one of the following > > things > > to mitigate this issue, but I don't have a good knowledge of the > > RPC > > code to have an idea how feasible this is: > > > > a) Admission control. RPCs should not be removed from the receive > > queue if the transmit queue is over some high-water mark. This > > will > > ensure that a problem client behind a network bottleneck like this > > one > > will eventually feel backpressure via TCP window contraction if > > nothing else. This will also make it more likely that other > > clients > > will still get their RPCs processed even if most service threads > > are > > taken up by the problem clients. > >
This sounds like a good idea in theory. However, I'm not sure how you can implement it. As you mention, Read requests are small. However, Write requests are fairly large. --> one 64K write request will result in as many bytes in the socket's receive queue as something like 700 Read requests. As such, using the socket receive queue's sb_cc isn't going to work. Since TCP doesn't preserve record boundaries, there is no way to know how many RPC requests are in the queue until they are parsed out. But, by the time the krpc has parsed out an RPC request from the socket's receive queue, it is pretty much "too late". For NFSv3, doing what the file handle affinity code does (pre-parsing the RPC request) and then putting it in some other queue instead of handing it to an nfsd right away might be feasible. Also, lots of Getattr and/or Lookup isn't the same as lots of Reads. Then you'd have to decide how long to delay the RPC. If you use TCP send queue length, then what about cases where there are a lot of TCP connections for the clients on the other side of the network bottleneck? (However, this doesn't work for NFSv4, since all NFSv4 RPCs are the same, i.e., Compound, and it is very hard to determine what an NFSv4 compound does without parsing it completely. The current code parses it as the Ops in it are performed.) Just having lots of nfsd threads and letting a bunch of them block on the socket send queue seems much simpler to me. One of the nice things about using TCP transport is that it will apply backpressure at this level. > > b) Fairness scheduling.
There should be some parameter, > > configurable > > by the administrator, that restricts the number of nfsd threads any > > one client can occupy, independent of how many requests it has > > pending. A really advanced scheduler would allow bursting over the > > limit for some small number of requests. > >
I've thought about this, and so long as you go with "per TCP connection" instead of per-client (which I think is close to the same thing in practice), it may be a good idea. I suspect the main problem with this is that it will negatively impact clients when most other clients are idle. (The worst case is when one client wants to do lots of reading when no other client mount is active.) I thought of something like a running estimate of "active clients" and then dividing that into the total # of nfsd threads, but estimating "active clients" will be a heuristic at best. Also, what about the case of many clients each doing some reads behind the network bottleneck, instead of fewer clients doing lots of reads behind the network bottleneck?
> > Does anyone else have thoughts, or even implementation ideas, on > > this? > The default number of threads is insanely low; the only reason I > didn't bump them to FreeNAS levels (or higher) was the inevitable > bikeshed/cryfest about Alfred touching defaults, so I didn't bother. I > kept them really small because, y'know, people whine, and they are > capped at ncpu * 8; it really should be higher, imo. >
Heh, heh. That's why I never change defaults. Also, just fyi, I got email complaining about this and asking that it be reverted back to "4" threads total by default. I just suggested they contact you;-)
> Just increase the number of nfsd threads to something higher; I think we were at > 256 threads in FreeNAS and it did us just fine. Higher seemed OK, > except we lost a bit of performance. >
Yep, that's what I'd suggest too. If you can't get this to work well, then look more closely at implementing one of your other suggestions. (I'd also recompile nfsd so that you can go past 256 if you need to.) Good luck with it, rick
> The only problem you might see is on SMALL machines, where people will > complain. So you probably want an arch-specific override, or perhaps a > memory-based sliding scale. > > If that could become a FreeBSD default (with overrides for small-memory > machines and arches), that would be even better. > > I think your other suggestions are fine; however, the problem is that: > 1) they seem complex for an edge case > 2) turning them on may tank performance for no good reason if the > heuristic is met but we're not in the bad situation > > That said, if you want to pursue those options, by all means please > do.
> > -Alfred > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 03:22:24 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C1468946 for ; Thu, 26 Feb 2015 03:22:24 +0000 (UTC) Received: from mwork.nabble.com (mwork.nabble.com [162.253.133.43]) by mx1.freebsd.org (Postfix) with ESMTP id A4ECDDA5 for ; Thu, 26 Feb 2015 03:22:24 +0000 (UTC) Received: from msam.nabble.com (unknown [162.253.133.85]) by mwork.nabble.com (Postfix) with ESMTP id 045AC1512959 for ; Wed, 25 Feb 2015 19:22:25 -0800 (PST) Date: Wed, 25 Feb 2015 20:22:24 -0700 (MST) From: andy zhang To: freebsd-fs@freebsd.org Message-ID: <1424920944349-5992079.post@n5.nabble.com> In-Reply-To: <20150216095410.GH34251@kib.kiev.ua> References: <54E1B90E.8050101@freebsd.org> <20150216095410.GH34251@kib.kiev.ua> Subject: Re: About Filesystem freeze/thaw in freebsd MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 03:22:24 -0000 Thanks, I have already tried the UFSSUSPEND/UFSRESUME ioctls on /dev/ufssuspend, and I found they actually do not work. In Linux, if I send the freeze ioctl, all write operations are blocked until I send the thaw ioctl. The UFSSUSPEND/UFSRESUME ioctls do not work that way; that is, if I send the UFSSUSPEND ioctl, I can still do write operations, like creating files, etc. I am still looking at the code of ufssuspend, and if it really does not work, I may implement this at my driver level. Thanks for any advice. -- View this message in context: http://freebsd.1045724.n5.nabble.com/About-Filesystem-freeze-thaw-in-freebsd-tp5989205p5992079.html Sent from the freebsd-fs mailing list archive at Nabble.com.
From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 03:38:33 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 1A8D5BFF for ; Thu, 26 Feb 2015 03:38:33 +0000 (UTC) Received: from mail-we0-x233.google.com (mail-we0-x233.google.com [IPv6:2a00:1450:400c:c03::233]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id AE660E91 for ; Thu, 26 Feb 2015 03:38:32 +0000 (UTC) Received: by wesu56 with SMTP id u56so7456527wes.10 for ; Wed, 25 Feb 2015 19:38:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=eoDIm89A6nsAbmdXw+p9Tabe/N2d857EOMENg+qgzxM=; b=KdUySYVBsofkmGKUtQVNPmULPcfC7y7IFlpiu3hQZEkTn5OJSs371797EwiKjrA3D4 79jbNtSrGTVMewn4x8DgxalCkIyVAkkRcj07ym+nQ7/FbeVOzWGnDRII2A8SnFTSukXx MpIE2csFOoca0pIFdA/pKOQp+tfetOit0YAaMH21tFLWGKD/XeSn9T4oDJJvjUvHF0BC uKjgDj5zngeLHkMwm2h07fOoHsVvfUVW2POBZRqFts7M1sUMeK8qFbtVHYOTqXFrM/Pe BtuFPPt8a82EdTK8bjxmiKE5/AFF/JSxeiJAQu42YExRngUSvzbKu0FF0iN77RsojgpX uZqQ== X-Received: by 10.194.23.39 with SMTP id j7mr12523286wjf.9.1424921911133; Wed, 25 Feb 2015 19:38:31 -0800 (PST) Received: from dft-labs.eu (n1x0n-1-pt.tunnel.tserv5.lon1.ipv6.he.net. [2001:470:1f08:1f7::2]) by mx.google.com with ESMTPSA id kj8sm67530009wjc.29.2015.02.25.19.38.29 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 25 Feb 2015 19:38:30 -0800 (PST) Date: Thu, 26 Feb 2015 04:38:27 +0100 From: Mateusz Guzik To: andy zhang Subject: Re: About Filesystem freeze/thaw in freebsd Message-ID: <20150226033827.GA3799@dft-labs.eu> References: <54E1B90E.8050101@freebsd.org> <20150216095410.GH34251@kib.kiev.ua> <1424920944349-5992079.post@n5.nabble.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1424920944349-5992079.post@n5.nabble.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 03:38:33 -0000 On Wed, Feb 25, 2015 at 08:22:24PM -0700, andy zhang wrote: > Thanks, I have already tried the UFSSUSPEND/UFSRESUME ioctls on > /dev/ufssuspend, and I found they actually do not work. In Linux, if I send > the freeze ioctl, all write operations are blocked until I send the thaw ioctl. > The UFSSUSPEND/UFSRESUME ioctls do not work that way; that is, > if I send the UFSSUSPEND ioctl, I can still do write operations, like creating > files, etc. > > I am still looking at the code of ufssuspend, and if it really does not > work, I may implement this at my driver level. Thanks for any advice. > Can you show your code? If you inspect ffs_susp_ioctl, you will see it expects an fsid as an argument. Unless stuff got horribly broken, you can also see that proper usage results in setting the MNTK_SUSPEND flag. Then, if you inspect the code creating files, writing, etc., you will see it checks for that flag.
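To make the fsid point concrete, here is a minimal sketch of what I believe the intended usage looks like (illustrative and untested here; the UFSSUSPEND/UFSRESUME definitions come from <ufs/ffs/fs.h>, and the suspension is also dropped when the descriptor is closed):

#include <sys/param.h>
#include <sys/mount.h>
#include <sys/ioctl.h>
#include <ufs/ffs/fs.h>
#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	struct statfs sfs;
	int fd;

	if (argc != 2)
		errx(1, "usage: ufsfreeze <mountpoint>");
	/* Look up the filesystem ID of the mounted UFS volume. */
	if (statfs(argv[1], &sfs) == -1)
		err(1, "statfs");
	if ((fd = open("/dev/ufssuspend", O_RDWR)) == -1)
		err(1, "open(/dev/ufssuspend)");
	/* UFSSUSPEND takes the fsid, not a path or a descriptor. */
	if (ioctl(fd, UFSSUSPEND, &sfs.f_fsid) == -1)
		err(1, "UFSSUSPEND");
	/* ... writers now block on MNTK_SUSPEND; do the frozen work ... */
	if (ioctl(fd, UFSRESUME) == -1)
		err(1, "UFSRESUME");
	close(fd);
	return (0);
}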
-- Mateusz Guzik From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 03:55:38 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E53F9E21; Thu, 26 Feb 2015 03:55:37 +0000 (UTC) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 81B78A3; Thu, 26 Feb 2015 03:55:36 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CSBQD3l+5U/95baINbDoNGWgSDBcAVCoUnSQKBbQEBAQEBAXyEEAEBBAEBASArIAsFFhgCAg0ZAikBCSYGCAcEARwEiA4NvBmZCAEBAQEBAQEDAQEBAQEBAQEagSGJcoQdAQEbNAeCaIFDBYpQiHODRoM6OYUyiH2DPiKDMVsgMQEBAQSBBDl/AQEB X-IronPort-AV: E=Sophos;i="5.09,650,1418101200"; d="scan'208";a="194891817" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 25 Feb 2015 22:55:35 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 8624BB3F85; Wed, 25 Feb 2015 22:55:35 -0500 (EST) Date: Wed, 25 Feb 2015 22:55:35 -0500 (EST) From: Rick Macklem To: Garrett Wollman Message-ID: <1930269117.485441.1424922935533.JavaMail.root@uoguelph.ca> In-Reply-To: <201502250244.t1P2iu6N094346@hergotha.csail.mit.edu> Subject: Re: NFS: kernel modules (loading/unloading) and scheduling MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.11] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 03:55:38 -0000 Garrett Wollman wrote: > In article > <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca>, > rmacklem@uoguelph.ca writes: > > >I tend to think that a bias towards doing Getattr/Lookup over > >Read/Write > >may help performance (the old "shortest job first" principal), I'm > >not > >sure you'll have a big enough queue of outstanding RPCs under normal > >load > >for this to make a real difference. > > I don't think this is a particularly relevant condition here. There > are lots of ways RPCs can pile up where you really need to do better > work-sharing than the current implementation does. One example is a > client that issues lots of concurrent reads (e.g., a compute node > running dozens of parallel jobs). Two such systems on gigabit NICs > can easily issue large reads fast enough to cause 64 nfsd service > threads to blocked while waiting for the socket send buffer to drain. > Meanwhile, the file server is completely idle, but unable to respond > to incoming requests, and the other users get angry. Rather than > assigning new threads to requests from the slow clients, it would be > better to let the requests sit until the send buffer drains, and > process other clients' requests instead of letting the resources get > monopolized by a single user. > > Lest you think this is purely hypothetical: we actually experienced > this problem today, and I verified with "procstat -kk" that all of > the > nfsd threads were in fact blocked waiting for send buffer space to > open up. 
I was able to restore service immediately by increasing the > number of nfsd threads, but I'm unsure to what extent I can do this > without breaking other things or hitting other bottlenecks.[1] So I > have a user asking me why I haven't enabled fair-share scheduling for > NFS, and I'm going to have to tell him the answer is "no such thing". > > -GAWollman > > [1] What would the right number actually be? We could potentially > have many thousands of threads in a compute cluster all operating > simultaneously on the same filesystem, well within the I/O capacity > of > the server, and we'd really like to degrade gracefully rather than > falling over when a single slow client soaks up all of the nfsd > worker > threads.
Well, each of these threads has two structures allocated to it. 1 - The kthread info (sched_sizeof_thread() <-- struct thread + the scheduler info one) 2 - A structure used by the krpc for each thread. Since allocating two moderate-sized structures isn't a lot of kernel memory, I would think a server like yours would be fine with several thousand nfsd threads.
What would be interesting would be the receive queue lengths for the sockets for NFS client TCP connections when the server is running normally. (This would be an indication of how many outstanding RPC requests any scheduling effort would select between.) I'll admit (given basic queuing theory) I would have expected these receive queues to be small unless the server is overloaded.
Oh, and I now realize my response related to your first idea "Admission" was way off and didn't make much sense. Somehow, I thought receive queue when you were talking about send queue. (Basically, just ignore my response.) However, given the different sizes of RPC replies, it might be hard to come up with a reasonable high-water mark for the send queue. Also, the networking code would have to do some sort of upcall to the krpc when the send queue shrinks. (So, still not trivial to implement, I think?)
I do agree with Alfred, in that I think you are experiencing nfsd thread starvation, and that increasing the number of nfsd threads a lot is the simple way to resolve this.
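In case it helps, one low-tech way to sample those queue lengths on a running server (assuming the standard NFS port 2049; note Recv-Q/Send-Q are byte counts, not RPC counts):

  netstat -an | grep '\.2049'

A handful of connections with persistently large Send-Q values while the disks sit idle would match the blocked-on-send-buffer picture described above.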
rick > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to > "freebsd-net-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 08:09:58 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0E9F2D06 for ; Thu, 26 Feb 2015 08:09:58 +0000 (UTC) Received: from mwork.nabble.com (mwork.nabble.com [162.253.133.43]) by mx1.freebsd.org (Postfix) with ESMTP id EA241C32 for ; Thu, 26 Feb 2015 08:09:57 +0000 (UTC) Received: from msam.nabble.com (unknown [162.253.133.85]) by mwork.nabble.com (Postfix) with ESMTP id C1984151892A for ; Thu, 26 Feb 2015 00:09:58 -0800 (PST) Date: Thu, 26 Feb 2015 01:09:56 -0700 (MST) From: andy zhang To: freebsd-fs@freebsd.org Message-ID: <1424938196983-5992105.post@n5.nabble.com> In-Reply-To: <20150226033827.GA3799@dft-labs.eu> References: <54E1B90E.8050101@freebsd.org> <20150216095410.GH34251@kib.kiev.ua> <1424920944349-5992079.post@n5.nabble.com> <20150226033827.GA3799@dft-labs.eu> Subject: Re: About Filesystem freeze/thaw in freebsd MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 08:09:58 -0000 Thanks for the reminder, I just got a wrong fsid in my ffs_susp_ioctl usage. Now it works well, that's awesome! Thanks again! -- View this message in context: http://freebsd.1045724.n5.nabble.com/About-Filesystem-freeze-thaw-in-freebsd-tp5989205p5992105.html Sent from the freebsd-fs mailing list archive at Nabble.com. 
From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 09:21:55 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 97C62D35; Thu, 26 Feb 2015 09:21:55 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 2078B63C; Thu, 26 Feb 2015 09:21:54 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t1Q9LmSe022578 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 26 Feb 2015 11:21:48 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t1Q9LmSe022578 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t1Q9LlRD022577; Thu, 26 Feb 2015 11:21:47 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 26 Feb 2015 11:21:47 +0200 From: Konstantin Belousov To: Rick Macklem Subject: Re: NFS: kernel modules (loading/unloading) and scheduling Message-ID: <20150226092147.GC2379@kib.kiev.ua> References: <201502250244.t1P2iu6N094346@hergotha.csail.mit.edu> <1930269117.485441.1424922935533.JavaMail.root@uoguelph.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1930269117.485441.1424922935533.JavaMail.root@uoguelph.ca> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 09:21:55 -0000 On Wed, Feb 25, 2015 at 10:55:35PM -0500, Rick Macklem wrote: > Garrett Wollman wrote: > > In article > > <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca>, > > rmacklem@uoguelph.ca writes: > > > > >I tend to think that a bias towards doing Getattr/Lookup over > > >Read/Write > > >may help performance (the old "shortest job first" principal), I'm > > >not > > >sure you'll have a big enough queue of outstanding RPCs under normal > > >load > > >for this to make a real difference. > > > > I don't think this is a particularly relevant condition here. There > > are lots of ways RPCs can pile up where you really need to do better > > work-sharing than the current implementation does. One example is a > > client that issues lots of concurrent reads (e.g., a compute node > > running dozens of parallel jobs). Two such systems on gigabit NICs > > can easily issue large reads fast enough to cause 64 nfsd service > > threads to blocked while waiting for the socket send buffer to drain. > > Meanwhile, the file server is completely idle, but unable to respond > > to incoming requests, and the other users get angry. 
Rather than > > assigning new threads to requests from the slow clients, it would be > > better to let the requests sit until the send buffer drains, and > > process other clients' requests instead of letting the resources get > > monopolized by a single user. > > > > Lest you think this is purely hypothetical: we actually experienced > > this problem today, and I verified with "procstat -kk" that all of > > the > > nfsd threads were in fact blocked waiting for send buffer space to > > open up. I was able to restore service immediately by increasing the > > number of nfsd threads, but I'm unsure to what extent I can do this > > without breaking other things or hitting other bottlenecks.[1] So I > > have a user asking me why I haven't enabled fair-share scheduling for > > NFS, and I'm going to have to tell him the answer is "no such > > thing". > > > > -GAWollman > > > > [1] What would the right number actually be? We could potentially > > have many thousands of threads in a compute cluster all operating > > simultaneously on the same filesystem, well within the I/O capacity > > of > > the server, and we'd really like to degrade gracefully rather than > > falling over when a single slow client soaks up all of the nfsd > > worker > > threads. > Well, each of these threads has two structures allocated to it. > 1 - The kthread info (sched_sizeof_thread() <-- struct thread + the scheduler info one) > 2 - A structure used by the krpc for each thread. > Since allocating two moderate-sized structures isn't a lot of kernel > memory, I would think a server like yours would be fine with several > thousand nfsd threads.
The biggest memory consumer for any thread, kernel or not, is the kernel thread stack. It consumes both physical memory and KVA; the latter is not too scarce for amd64.
> > What would be interesting would be the receive queue lengths for the > sockets for NFS client TCP connections when the server is running > normally. (This would be an indication of how many outstanding RPC > requests any scheduling effort would select between.) > I'll admit (given basic queuing theory) I would have expected these > receive queues to be small unless the server is overloaded. > > Oh, and I now realize my response related to your first idea > "Admission" was way off and didn't make much sense. Somehow, I > thought receive queue when you were talking about send queue. > (Basically, just ignore my response.) > However, given the different sizes of RPC replies, it might > be hard to come up with a reasonable high-water mark for the > send queue. Also, the networking code would have to do some > sort of upcall to the krpc when the send queue shrinks. > (So, still not trivial to implement, I think?) > > I do agree with Alfred, in that I think you are experiencing > nfsd thread starvation, and that increasing the number of nfsd threads > a lot is the simple way to resolve this.
This also increases indirect scheduler costs. Direct management of the runqueues costs proportionally to the number of runnable threads, but some rare operations have to account for all threads.
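To put rough numbers on the stack cost (back-of-the-envelope, assuming the stock amd64 configuration where kern.kstack_pages is 4, i.e. 16K of stack per thread plus one guard page of KVA):

  sysctl kern.kstack_pages
  1000 threads * 16K stack = ~16M of physical memory, ~20M of KVA

so a few thousand mostly-idle nfsd threads are cheap on a large amd64 box, and the scheduler bookkeeping mentioned above is the more interesting cost.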
From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 11:30:42 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A75BCF45; Thu, 26 Feb 2015 11:30:42 +0000 (UTC) Received: from mail-ie0-x230.google.com (mail-ie0-x230.google.com [IPv6:2607:f8b0:4001:c03::230]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 702F65F2; Thu, 26 Feb 2015 11:30:42 +0000 (UTC) Received: by iecrl12 with SMTP id rl12so13601181iec.2; Thu, 26 Feb 2015 03:30:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:cc:content-type; bh=ZhyJen6rMXNB3RtkmDfOykU6rXb0rH5KROASxcdWsvE=; b=Tg8yiwTPFwhvKA6k+UUUUJDVo/fY7TpDRagdKoRA7zPp/mCO/6fwdLnN0BYvZyONQQ aXNAFOiE9PSpZbS2qo7sZw35TPU626dfubNO/1T3EHM07+8ltr7b6egfEXIpiJsNdpyY nhKG69GGcgo8KNWPbMQZx3t/rmf8pKheUji6uVkVvpNjELDNu1uuXk0fmRMR6Gm8c+PE S8Ms10sumyktq1hu3nquFVT12pJR+uXhbsImw0tA/p+KQ/xSv1vQdU9h+miChaGrisv8 /ASMBHwO+xi4DmNm4bXoGgus0iuPDAYLkeXUXYXtkka8Qk3wZzHK5WXCllU/FPDfOTlw se+A== MIME-Version: 1.0 X-Received: by 10.50.62.110 with SMTP id x14mr33885594igr.2.1424950241690; Thu, 26 Feb 2015 03:30:41 -0800 (PST) Received: by 10.36.111.202 with HTTP; Thu, 26 Feb 2015 03:30:41 -0800 (PST) Date: Thu, 26 Feb 2015 06:30:41 -0500 Message-ID: Subject: Zoned Commands ZBC/ZAC, Shingled SMR drives, ZFS From: grarpamp To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Cc: developer@open-zfs.org, zfs-devel@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 11:30:42 -0000 > On Wed, 25 Feb 2015 16:19 Shehbaz Jaffer : > I thought proposal was to implement FreeBSD - ZFS on SMR Drives. That's right, the same way Linux and EXT4 were both adapted for SMR. > 1. How will we emulate SMR Drives on QEMU? Given that you can request drive samples for development from feldman@seagate, I'd think the QEMU question is best left to QEMU. > 2. How should we go about using open source SMR Drives and use > these in FreeBSD? Thanks for your interest; there are good people here to collaborate on development with. Also, the overall OS-agnostic home for ZFS is at http://www.open-zfs.org/ Have a look there as well. They have a pretty active ZFS list. There are enough links below to get people started. Are you still affiliated with NetApp, as suggested by the sig in your first email? > On Wed, 25 Feb 2015 Mark Felder : > Do I understand the scope of this correctly: I'll try, based on limited reading; refer to the links and video for authoritative answers. > - SMR drives are the future for increasing drive capacity So far, given spindle vs SSD applications, likely yes. Seagate just released an 8TB drive at the top of their "Archive" line for $280. Western Digital (Hitachi) has a 10TB model in the queue. SMR seems to be something FreeBSD and ZFS would want to incorporate in order to further compete in the storage space. > - SMR drives are terrible at random IO I haven't found any benchmarks yet. There may be some in these presentations or over in Linux land.
I think only random writes currently carry a performance caveat if you are not SMR host-aware. These drives present three types of management and have huge caches. The host-aware mode and filesystem mods are meant to address speed. Drives will do 150MB/s r/w avg. > - There is work in Linux to detect these SMR drives Yes, in their storage stacks. Without that, they work as dumb, non-optimized LBA devices. > and alter ZFS' behavior Linux is enhancing EXT4. The presentations in the links hope to enhance ZFS as well. ZFS's copy-on-write seems well suited to SMR. I'm not involved with this and don't know more than the links below; I just didn't initially see any FreeBSD + SMR + ZFS references out there. Have fun. ========= Newly added links ZFS Host-Aware_SMR https://www.youtube.com/watch?v=b1yqjV8qemU SMR Modifications to EXT4 (and other generic file systems) http://www.spinics.net/lists/linux-ext4/msg46868.html http://www.spinics.net/lists/linux-scsi/msg81950.html Linux has been adding support from the 3.18 kernel onwards... https://git.kernel.org/cgit/linux/kernel/git/hare/scsi-devel.git/log/?h=zac.v2 https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=7a14c1c3319608154da8712e4174d56ffb2f7b8d https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=9162c6579bf90b3f5ddb7e3a6c6fa946c1b4cbeb https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/commit/?id=f9ca5ab832e7ac5bc2b6fe0e82ad46d536f436f9 Drive specs http://www.seagate.com/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf http://www.seagate.com/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/100757960c.pdf http://www.hgst.com/science-of-storage/next-generation-data-centers/10tb-smr-helioseal-hdd ZBC/ZAC specs http://www.t10.org/members/w_zbc-.htm http://www.t13.org/Documents/MinutesDefault.aspx?keyword=di537 ========= Previous links for reference ZFS Host-Aware_SMR http://open-zfs.org/w/images/2/2a/Host-Aware_SMR-Tim_Feldman.pdf http://www.youtube.com/watch?v=b1yqjV8qemU ZFS on SMR Drives: Enabling Shingled Magnetic Recording (SMR) for Enterprises http://storageconference.us/2014/Presentations/Novak.pdf ZBC device manipulation library https://github.com/hgst/libzbc Seagate SMR Friendly File System - EXT4 https://github.com/Seagate/SMR_FS-EXT4 Initial ZAC support http://www.spinics.net/lists/linux-scsi/msg81545.html ZAC/ZBC Update http://www.spinics.net/lists/linux-scsi/msg80161.html libzbc - The Linux Foundation http://events.linuxfoundation.org/sites/events/files/slides/SMR-LinuxConUSA-2014.pdf Panel: Shingled Disk Drives - File System Vs.
Autonomous Block Device / SFS http://storageconference.us/2014/Presentations/Panel4.Bandic.pdf http://storageconference.us/2014/Presentations/Panel4.Amer.pdf http://storageconference.us/2014/Presentations/Panel4.Novak.pdf http://storageconference.us/2014/index.html http://storageconference.us/history.html http://www.opencompute.org/wiki/Storage/Dev http://www.snia.org/sites/default/files/Dunn-Feldman_SNIA_Tutorial_Shingled_Magnetic_Recording-r7_Final.pdf ========= From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 13:48:23 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0E48E4F7; Thu, 26 Feb 2015 13:48:23 +0000 (UTC) Received: from mail-pa0-x22b.google.com (mail-pa0-x22b.google.com [IPv6:2607:f8b0:400e:c03::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id CC7297AA; Thu, 26 Feb 2015 13:48:22 +0000 (UTC) Received: by padet14 with SMTP id et14so14042636pad.11; Thu, 26 Feb 2015 05:48:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=content-type:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=OcM4tpu5qPywbOcO1ntLaQj5BYeOtf7dUJTIQJ/Z5uo=; b=v8XbH5OFZmpaRhjd4ar6eom6XcXU6FuQBdsliHo/1ZvtPYKz7wXgGhcLvAq9tmf/Gc oplVW6znCwBMdGr3sDDYxprk3l2CKnoe3ymeM/fBxjgAaasVenMDnu+Bgljsd3V0eKhl ZX3N0+CUX5IrvzNx8aPBdD32g/5XuHwEapwmrUzQ3nj559v+gxmtCvQxa/zk8DXDwyo9 H3UURg681bnL2Y9/p57O7FVdtk7W/EPakMmx3r/PbV6PiR29mi3Mi9arnR8cirpxaUpW C36Nv36T3cLyHevgniZlttKWZUAV/Pv7jSbH6/DAmX0cM5vHrWw7HynpSSTAicBkBbfg bmqQ== X-Received: by 10.68.232.200 with SMTP id tq8mr14808050pbc.133.1424958502111; Thu, 26 Feb 2015 05:48:22 -0800 (PST) Received: from [10.128.85.22] (pa49-199-1-84.pa.vic.optusnet.com.au. [49.199.1.84]) by mx.google.com with ESMTPSA id f12sm1097956pat.43.2015.02.26.05.48.20 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 26 Feb 2015 05:48:21 -0800 (PST) Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (1.0) Subject: Re: NFS, pefs panic: vputx: neg ref cnt From: Brett Wiggins X-Mailer: iPad Mail (12B466) In-Reply-To: <883898483.10139396.1424818598510.JavaMail.root@uoguelph.ca> Date: Fri, 27 Feb 2015 00:48:19 +1100 Content-Transfer-Encoding: 7bit Message-Id: <2B730CB5-CA5E-4B6C-82CE-168484B8202C@gmail.com> References: <883898483.10139396.1424818598510.JavaMail.root@uoguelph.ca> To: Rick Macklem Cc: "freebsd-fs@freebsd.org" , "rmacklem@freebsd.org" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 13:48:23 -0000 > On 25 Feb 2015, at 9:56 am, Rick Macklem wrote: > > > Now for the dumb question...where is the pefs stuff? > (I've never heard of it and a quick find/grep didn't locate it in the > kernel source tree.) > Hi guys, pefs is a kernel module I got from the ports sysutils/pefs-kmod If it's in the ports does it make it an outside issue? 
Brett From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 16:04:59 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CE3E3D08; Thu, 26 Feb 2015 16:04:59 +0000 (UTC) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 83A4BBBF; Thu, 26 Feb 2015 16:04:59 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2A+BQCeQ+9U/95baINbhC4EgwXEbAKBcQEBAQEBAXyEEAEFI1YbDgoCAg0ZAlkGE4gvvH6ZbQEBAQEBAQQBAQEBAQEBARqBIYlyhDo0B4JogUMFiliiJSOEDCAxgUR/AQEB X-IronPort-AV: E=Sophos;i="5.09,653,1418101200"; d="scan'208";a="194983246" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 26 Feb 2015 11:04:42 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 4EE03B4110; Thu, 26 Feb 2015 11:04:42 -0500 (EST) Date: Thu, 26 Feb 2015 11:04:42 -0500 (EST) From: Rick Macklem To: Brett Wiggins Message-ID: <1210997275.821031.1424966682310.JavaMail.root@uoguelph.ca> In-Reply-To: <2B730CB5-CA5E-4B6C-82CE-168484B8202C@gmail.com> Subject: Re: NFS, pefs panic: vputx: neg ref cnt MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.12] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org, rmacklem@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 16:04:59 -0000 Brett Wiggins wrote: > > > > On 25 Feb 2015, at 9:56 am, Rick Macklem > > wrote: > > > > > > Now for the dumb question...where is the pefs stuff? > > (I've never heard of it and a quick find/grep didn't locate it in > > the > > kernel source tree.) > > > Hi guys, > > pefs is a kernel module I got from the ports > > sysutils/pefs-kmod > > If it's in the ports does it make it an outside issue? > > Brett > Well, I think this means it needs to be looked at by whoever maintains the code. If you are just looking for a workaround, you could try doing the Linux mount with whatever option disables use of ReaddirPlus instead of Readdir. (I recall you mentioned OSX mounts worked; like FreeBSD, they probably don't use ReaddirPlus by default.) Unfortunately I don't have a way to download the sources for this until April, so I can't take a look. (I'd guess it's something like an extra ref cnt being taken on the vnode for most things, but not by pefs_vget(), or something like that?)
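If it helps, I believe the Linux-side option in question is "nordirplus" (documented in nfs(5) on Linux); a hypothetical example, with the host and path made up:

  mount -t nfs -o nfsvers=3,nordirplus server:/export /mnt

That should make the Linux client fall back from ReaddirPlus to plain Readdir on NFSv3 mounts.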
rick From owner-freebsd-fs@FreeBSD.ORG Thu Feb 26 22:46:59 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 9A50E609; Thu, 26 Feb 2015 22:46:59 +0000 (UTC) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 2097F384; Thu, 26 Feb 2015 22:46:58 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2CKBQAGoe9U/95baINbhC4EgwXEbwKBdAEBAQEBAXyEEAEFIwRSGw4KAgINBBUCWQYTiC+9L5l6AQEBAQEBAQMBAQEBAQEBG4EhiXKEFyM0BwqCXoFDBYpYj3eFb4kBgz4jhAwgMQEBAYEAQX8BAQE X-IronPort-AV: E=Sophos;i="5.09,655,1418101200"; d="scan'208";a="193280988" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-jnhn.mail.uoguelph.ca with ESMTP; 26 Feb 2015 17:46:52 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 33EFBB3F01; Thu, 26 Feb 2015 17:46:51 -0500 (EST) Date: Thu, 26 Feb 2015 17:46:51 -0500 (EST) From: Rick Macklem To: Konstantin Belousov Message-ID: <422345651.1296741.1424990811199.JavaMail.root@uoguelph.ca> In-Reply-To: <20150226092147.GC2379@kib.kiev.ua> Subject: Re: NFS: kernel modules (loading/unloading) and scheduling MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.11] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org, freebsd-net@freebsd.org, Garrett Wollman X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 26 Feb 2015 22:46:59 -0000 Kostik wrote: > On Wed, Feb 25, 2015 at 10:55:35PM -0500, Rick Macklem wrote: > > Garrett Wollman wrote: > > > In article > > > <388835013.10159778.1424820357923.JavaMail.root@uoguelph.ca>, > > > rmacklem@uoguelph.ca writes: > > > > > > >I tend to think that a bias towards doing Getattr/Lookup over > > > >Read/Write > > > >may help performance (the old "shortest job first" principal), > > > >I'm > > > >not > > > >sure you'll have a big enough queue of outstanding RPCs under > > > >normal > > > >load > > > >for this to make a real difference. > > > > > > I don't think this is a particularly relevant condition here. > > > There > > > are lots of ways RPCs can pile up where you really need to do > > > better > > > work-sharing than the current implementation does. One example > > > is a > > > client that issues lots of concurrent reads (e.g., a compute node > > > running dozens of parallel jobs). Two such systems on gigabit > > > NICs > > > can easily issue large reads fast enough to cause 64 nfsd service > > > threads to blocked while waiting for the socket send buffer to > > > drain. > > > Meanwhile, the file server is completely idle, but unable to > > > respond > > > to incoming requests, and the other users get angry. Rather than > > > assigning new threads to requests from the slow clients, it would > > > be > > > better to let the requests sit until the send buffer drains, and > > > process other clients' requests instead of letting the resources > > > get > > > monopolized by a single user. 
> > > Lest you think this is purely hypothetical: we actually > > > experienced > > > this problem today, and I verified with "procstat -kk" that all > > > of > > > the > > > nfsd threads were in fact blocked waiting for send buffer space > > > to > > > open up. I was able to restore service immediately by increasing > > > the > > > number of nfsd threads, but I'm unsure to what extent I can do > > > this > > > without breaking other things or hitting other bottlenecks.[1] > > > So I > > > have a user asking me why I haven't enabled fair-share scheduling > > > for > > > NFS, and I'm going to have to tell him the answer is "no such > > > thing". > > > > > > -GAWollman > > > > > > [1] What would the right number actually be? We could > > > potentially > > > have many thousands of threads in a compute cluster all operating > > > simultaneously on the same filesystem, well within the I/O > > > capacity > > > of > > > the server, and we'd really like to degrade gracefully rather > > > than > > > falling over when a single slow client soaks up all of the nfsd > > > worker > > > threads. > > Well, each of these threads has two structures allocated to it. > > 1 - The kthread info (sched_sizeof_thread() <-- struct thread + the > > scheduler info one) > > 2 - A structure used by the krpc for each thread. > > Since allocating two moderate-sized structures isn't a lot of > > kernel > > memory, I would think a server like yours would be fine with > > several > > thousand nfsd threads. > The biggest memory consumer for any thread, kernel or not, is the > kernel thread stack. It consumes both physical memory and KVA; the > latter is not too scarce for amd64. >
Yes, thanks, I should have thought of that. - For amd64, it appears to be a 16K stack (plus a KVA page to catch stack overflows?) --> Figure around 16Mbytes for 1000 kernel threads. I don't think this would be an issue for a 64bit arch with quite a few Gbytes of RAM? - For i386, it appears to be an 8K stack (plus a KVA page to catch stack overflows?) --> Figure 8Mbytes for 1000 kernel threads. Still sounds ok to me, although I think the KVA limit is about 430Mbytes by default. I'd guess that KVA exhaustion due to mbuf and other malloc() allocations would occur long before the extra nfsd threads became a problem. I have succeeded in exhausting KVA with the server running on a 256Mbyte i386, but it took several days of heavy load to get it to happen and I never found a reliable way to do it.
> > > > What would be interesting would be the receive queue lengths for > > the > > sockets for NFS client TCP connections when the server is running > > normally. (This would be an indication of how many outstanding RPC > > requests any scheduling effort would select between.) > > I'll admit (given basic queuing theory) I would have expected these > > receive queues to be small unless the server is overloaded. > > > > Oh, and I now realize my response related to your first idea > > "Admission" was way off and didn't make much sense. Somehow, I > > thought receive queue when you were talking about send queue. > > (Basically, just ignore my response.) > > However, given the different sizes of RPC replies, it might > > be hard to come up with a reasonable high-water mark for the > > send queue. Also, the networking code would have to do some > > sort of upcall to the krpc when the send queue shrinks. > > (So, still not trivial to implement, I think?)
> > I do agree with Alfred, in that I think you are experiencing > > nfsd thread starvation, and that increasing the number of nfsd threads > > a lot is the simple way to resolve this. > > This also increases indirect scheduler costs. Direct management of > the runqueues costs proportionally to the number of runnable threads, > but some rare operations have to account for all threads. >
Since the extra nfsd threads won't be runnable, I'd guess that the rare operations will be the only ones affected. Thanks for pointing this out, rick ps: Does increasing MAXNFSDCNT to something like 2048 sound reasonable for FreeBSD11? (I'm not talking default, but the maximum a server can be set to.)
From owner-freebsd-fs@FreeBSD.ORG Fri Feb 27 21:56:07 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 7B0CFEFC for ; Fri, 27 Feb 2015 21:56:07 +0000 (UTC) Received: from plane.gmane.org (plane.gmane.org [80.91.229.3]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 33DF8F48 for ; Fri, 27 Feb 2015 21:56:06 +0000 (UTC) Received: from list by plane.gmane.org with local (Exim 4.69) (envelope-from ) id 1YRSsz-0004l2-Ug for freebsd-fs@freebsd.org; Fri, 27 Feb 2015 22:55:54 +0100 Received: from p4fddd7aa.dip0.t-ipconnect.de ([79.221.215.170]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 27 Feb 2015 22:55:53 +0100 Received: from christian.baer by p4fddd7aa.dip0.t-ipconnect.de with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 27 Feb 2015 22:55:53 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Christian Baer Subject: Re: The magic of ZFS and NFS (2nd try) Date: Fri, 27 Feb 2015 22:55:41 +0100 Lines: 30 Message-ID: <2401301.b3eZRBi7it@falbala.rz1.convenimus.net> References: <4257601.p3oiXZFr4n@falbala.rz1.convenimus.net> <54E7A2CF.60804@pinyon.org> <2437038.yvsE2IGTDZ@falbala.rz1.convenimus.net> <201502231413.t1NEDITT000687@higson.cam.lispworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7Bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: p4fddd7aa.dip0.t-ipconnect.de User-Agent: KNode/4.14.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Feb 2015 21:56:07 -0000 Martin Simmons wrote: > According to exports(5), that reduces it to zero: > The third form has the string ``V4:'' followed by a single absolute path > name, > to specify the NFSv4 tree root. This line does not export any file > system, > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > but simply marks where the root of the server's directory tree is for > NFSv4 > clients. The exported file systems for NFSv4 are specified via the other > lines in the exports file in the same way as for NFSv2 and NFSv3. I see the part in the manpage you are referring to. The way nfs reacts doesn't seem to be that way though. I have changed the contents of /etc/exports to /usr/archive/Shared -alldirs -network 192.168.100/24
If I let the path point to a ZFS file system, I get permission denied;
when it points to a path on UFS, it works fine.

The directories in question have the correct owner and group. Is there
some way that ZFS may have a different setting for this?

Kind regards,
Christian

From owner-freebsd-fs@FreeBSD.ORG Fri Feb 27 23:02:31 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 37DEC549 for ; Fri, 27 Feb 2015 23:02:31 +0000 (UTC) Received: from esa-annu.net.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id F2982956 for ; Fri, 27 Feb 2015 23:02:30 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A2A4BgBL9vBU/95baINbg1RaBIMGvw8KhGNESQKBcwEBAQEBAXyEDwEBAQMBAQEBICsgCwUWGAICDRkCKQEJJgYIBwQBHASIBggNvTCZcwEBAQEBAQQBAQEBAQEBARqBIYlxhB0BARs0B4JogUMFilqIe4NGJoMlKJIBI4ICHIFuIDEHgQQ5fwEBAQ X-IronPort-AV: E=Sophos;i="5.09,663,1418101200"; d="scan'208";a="195386783" Received: from muskoka.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.222]) by esa-annu.net.uoguelph.ca with ESMTP; 27 Feb 2015 18:02:25 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id EE78FB3F86; Fri, 27 Feb 2015 18:02:23 -0500 (EST) Date: Fri, 27 Feb 2015 18:02:23 -0500 (EST) From: Rick Macklem To: Christian Baer Message-ID: <69023850.2194024.1425078143967.JavaMail.root@uoguelph.ca> In-Reply-To: <2401301.b3eZRBi7it@falbala.rz1.convenimus.net> Subject: Re: The magic of ZFS and NFS (2nd try) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.95.10] X-Mailer: Zimbra 7.2.6_GA_2926 (ZimbraWebClient - FF3.0 (Win)/7.2.6_GA_2926) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 27 Feb 2015 23:02:31 -0000

Christian Baer wrote:
> Martin Simmons wrote:
>
> > According to exports(5), that reduces it to zero:
> > The third form has the string ``V4:'' followed by a single absolute path name,
> > to specify the NFSv4 tree root.  This line does not export any file system,
> >                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > but simply marks where the root of the server's directory tree is for NFSv4
> > clients.  The exported file systems for NFSv4 are specified via the other
> > lines in the exports file in the same way as for NFSv2 and NFSv3.
>
> I see the part in the manpage you are referring to. The way NFS reacts
> doesn't seem to match that, though. I have changed the contents of
> /etc/exports to
>
> /usr/archive/Shared -alldirs -network 192.168.100/24
>
You need both lines for an NFSv4 mount to work. For example:

V4: /usr/archive/Shared -network 192.168.100/24
/usr/archive/Shared -alldirs -network 192.168.100/24

Then the client mount command would look like:

mount -t nfs -o nfsv4 :/ /mnt

- Note that if the V4: line specifies /usr/archive/Shared as its root,
  then the client mounts that as "/".
If you want to mount the same dir as NFSv3, the mount would look like:

mount -t nfs -o nfsv3 :/usr/archive/Shared /mnt

> I still cannot mount that share.
>
> The V4: at the beginning of the line did not change anything I could
> notice.
It enables NFSv4 and tells the server where the client mount's "/" is.

> If I let the path point to a ZFS file system, I get permission denied;
> when it points to a path on UFS, it works fine.
>
> The directories in question have the correct owner and group. Is there
> some way that ZFS may have a different setting for this?
>
Not that I am aware, but I am not a ZFS guy, rick

> Kind regards,
> Christian
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 04:31:50 2015 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 614DF4AE; Sat, 28 Feb 2015 04:31:50 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id E25A3F5E; Sat, 28 Feb 2015 04:31:49 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t1S4ViX8082098 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 28 Feb 2015 06:31:44 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t1S4ViX8082098 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t1S4ViRX082097; Sat, 28 Feb 2015 06:31:44 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 28 Feb 2015 06:31:44 +0200 From: Konstantin Belousov To: fs@freebsd.org Subject: ZFS port and thread_exit() Message-ID: <20150228043144.GQ2379@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: threads@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 04:31:50 -0000

While looking for some change to thread_exit(), I noted that ZFS on
FreeBSD directly calls thread_exit(). It simply cannot work:
thread_exit() is an internal function which requires the thread and
process state to be prepared for the call. Among the most obvious
things, the process spin lock must be held, but also several cleanups
and accounting have to be done before the call.

I believe the function just happens to have the same name as the
Solaris counterpart, and for some reason it is never called. If this
is true, kthread_exit() should be used instead.

Also, I noted that the userspace port defines thread_exit() as
thr_exit(NULL). Again, the direct invocation of the syscall does not
look right. The libthr library must do some cleanups on the thread
exit, which are not done if the syscall is invoked directly by
application code. Also, the thread itself gets no destructor calls.

Could somebody interested in ZFS look into these issues?
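Mark Johnston's reply below settles the kernel-side question by looking at the compat shim and the compiled module. For anyone who wants to repeat that check, a sketch, assuming a standard /usr/src checkout and a zfs.ko in /boot/kernel (both paths are assumptions, adjust for your layout):

    # Is thread_exit remapped by the opensolaris compat layer at the
    # preprocessor level?
    grep -n 'thread_exit' /usr/src/sys/cddl/compat/opensolaris/sys/proc.h

    # Which symbol does the built module actually reference?
    nm /boot/kernel/zfs.ko | grep thread_exit

If the nm output shows only kthread_exit, as Mark reports, the #define is doing its job and the source-level calls never reach the raw thread_exit().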
From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 06:23:53 2015 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 3A6B5C59; Sat, 28 Feb 2015 06:23:53 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 2885EAC4; Sat, 28 Feb 2015 06:23:52 +0000 (UTC) Received: from AlfredMacbookAir.local (hudsonhotel209.h.subnet.rcn.com [207.237.151.136]) by elvis.mu.org (Postfix) with ESMTPSA id 6EEE4341F8AC; Fri, 27 Feb 2015 22:23:45 -0800 (PST) Message-ID: <54F15FC9.2090209@mu.org> Date: Sat, 28 Feb 2015 01:27:21 -0500 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Konstantin Belousov , fs@freebsd.org Subject: Re: ZFS port and thread_exit() References: <20150228043144.GQ2379@kib.kiev.ua> In-Reply-To: <20150228043144.GQ2379@kib.kiev.ua> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Cc: threads@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 06:23:53 -0000 On 2/27/15 11:31 PM, Konstantin Belousov wrote: > While looking for some change to thread_exit(), I noted that ZFS > on FreeBSD directly calls thread_exit(). It simply cannot work, > thread_exit() is the internal function which requires the thread and > process state prepared for it call. Among most obvious things, process > spin lock must be held, but also several cleanups and accounting have to > be done before the call. > > I believe the function just happens to have the same name as the Solaris > counterpart, and for some reasons it is never called. If this is true, > kthread_exit() should be used instead. > > Also, I noted that the userspace port defines thread_exit() as > thr_exit(NULL). Again, the direct invocation of the syscall does not > look right. The libthr library must do some cleanups on the thread exit, > which are not done if syscall is invoked by an application code. Also, > the thread itself gets no destructor calls. > > Could somebody interested in ZFS look into the issues ? this sounds v important, needs a bugzilla in case no one steps up now, should be marked high priority. 
From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 07:36:32 2015 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id EA35145C; Sat, 28 Feb 2015 07:36:32 +0000 (UTC) Received: from mail-pa0-x234.google.com (mail-pa0-x234.google.com [IPv6:2607:f8b0:400e:c03::234]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client CN "smtp.gmail.com", Issuer "Google Internet Authority G2" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id B6530C5; Sat, 28 Feb 2015 07:36:32 +0000 (UTC) Received: by padbj1 with SMTP id bj1so2908113pad.11; Fri, 27 Feb 2015 23:36:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=8lQMRl9788W2sRuLfOjF9I3SXtGuBMhdsPrhwY4rnkQ=; b=iR0Dwh3DRuvk+GPJ0CoNSD/gnM2yezis7olLaxIEkjDo9BhQst/Rxh4Sf3qRQJ8QKl orMRbL9m/Wxvdi49sEQ8wW9XmNGtj9zwGHSvaiq/7AUNd1E3+X9Ri7q57uHf3Fe86ERW 8vpb5GHwbCIMmmjhJKHxmALER7Xtjmn+U8+cuJ7eccv5D5dQ1+E/yoUGJQemtvc8m1Tc 6ARPjTOUSsm6AeEE7U0F1dMxij+st2v9BdEfspa/Esoyb+23LkjVWkwqdgMhe/iw0aLI yLO9z/bzW7iN6bfioQCCSPKQhrzQ9xSYgYIlf9Azup4Oa3ANtlLujJXLee+ODwFhOKj8 t25w== X-Received: by 10.67.7.10 with SMTP id cy10mr29448764pad.151.1425108992214; Fri, 27 Feb 2015 23:36:32 -0800 (PST) Received: from raichu (216-243-33-91.users.condointernet.net. [216.243.33.91]) by mx.google.com with ESMTPSA id qn14sm5961247pab.33.2015.02.27.23.36.30 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 27 Feb 2015 23:36:31 -0800 (PST) Sender: Mark Johnston Date: Fri, 27 Feb 2015 23:36:26 -0800 From: Mark Johnston To: Konstantin Belousov Subject: Re: ZFS port and thread_exit() Message-ID: <20150228073625.GA2627@raichu> References: <20150228043144.GQ2379@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150228043144.GQ2379@kib.kiev.ua> User-Agent: Mutt/1.5.23 (2014-03-12) Cc: threads@freebsd.org, fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 07:36:33 -0000 On Sat, Feb 28, 2015 at 06:31:44AM +0200, Konstantin Belousov wrote: > While looking for some change to thread_exit(), I noted that ZFS > on FreeBSD directly calls thread_exit(). It simply cannot work, > thread_exit() is the internal function which requires the thread and > process state prepared for it call. Among most obvious things, process > spin lock must be held, but also several cleanups and accounting have to > be done before the call. > > I believe the function just happens to have the same name as the Solaris > counterpart, and for some reasons it is never called. If this is true, > kthread_exit() should be used instead. I'm not very familiar with the ZFS code, but: The opensolaris compat proc.h #defines thread_exit to kthread_exit after FreeBSD's proc.h is included, so calls to thread_exit() in ZFS code should be replaced. Also, zfs.ko contains no references to thread_exit, but does reference kthread_exit. > > Also, I noted that the userspace port defines thread_exit() as > thr_exit(NULL). Again, the direct invocation of the syscall does not > look right. 
The libthr library must do some cleanups on the thread exit, > which are not done if syscall is invoked by an application code. Also, > the thread itself gets no destructor calls. This is done in zfs_context.h, which includes cddl/contrib/opensolaris/head/thread.h, which #defines thr_exit(r) to pthread_exit(r). There is no other thread.h in the src tree, and libzpool.so references only pthread_exit. So it seems to me that the compat layer is doing the right thing. > > Could somebody interested in ZFS look into the issues ? > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 11:19:33 2015 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0C3AEAA3; Sat, 28 Feb 2015 11:19:33 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 739FB846; Sat, 28 Feb 2015 11:19:32 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t1SBJQ0U071852 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 28 Feb 2015 13:19:26 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t1SBJQ0U071852 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t1SBJQQK071851; Sat, 28 Feb 2015 13:19:26 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 28 Feb 2015 13:19:26 +0200 From: Konstantin Belousov To: Mark Johnston Subject: Re: ZFS port and thread_exit() Message-ID: <20150228111926.GR2379@kib.kiev.ua> References: <20150228043144.GQ2379@kib.kiev.ua> <20150228073625.GA2627@raichu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150228073625.GA2627@raichu> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: threads@freebsd.org, fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 11:19:33 -0000 On Fri, Feb 27, 2015 at 11:36:26PM -0800, Mark Johnston wrote: > On Sat, Feb 28, 2015 at 06:31:44AM +0200, Konstantin Belousov wrote: > > While looking for some change to thread_exit(), I noted that ZFS > > on FreeBSD directly calls thread_exit(). It simply cannot work, > > thread_exit() is the internal function which requires the thread and > > process state prepared for it call. Among most obvious things, process > > spin lock must be held, but also several cleanups and accounting have to > > be done before the call. > > > > I believe the function just happens to have the same name as the Solaris > > counterpart, and for some reasons it is never called. 
If this is true,
> > kthread_exit() should be used instead.
>
> I'm not very familiar with the ZFS code, but:
>
> The opensolaris compat proc.h #defines thread_exit to kthread_exit after
> FreeBSD's proc.h is included, so calls to thread_exit() in ZFS code should
> be replaced. Also, zfs.ko contains no references to thread_exit, but does
> reference kthread_exit.
> >
> > > Also, I noted that the userspace port defines thread_exit() as
> > > thr_exit(NULL). Again, the direct invocation of the syscall does not
> > > look right. The libthr library must do some cleanups on the thread exit,
> > > which are not done if syscall is invoked by an application code. Also,
> > > the thread itself gets no destructor calls.
>
> This is done in zfs_context.h, which includes
> cddl/contrib/opensolaris/head/thread.h, which #defines thr_exit(r) to
> pthread_exit(r). There is no other thread.h in the src tree, and
> libzpool.so references only pthread_exit. So it seems to me that the
> compat layer is doing the right thing.

Thank you for the explanation. I realized why kthread_exit() did not
pop up in my reading. Still, it is very confusing, probably at the
same scale as our/Linux PAGE_MASK.

From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 17:20:21 2015 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4BFF8CC1 for ; Sat, 28 Feb 2015 17:20:21 +0000 (UTC) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id A85ECF63 for ; Sat, 28 Feb 2015 17:20:20 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id t1SHJvlc045047 for ; Sat, 28 Feb 2015 20:19:57 +0300 (MSK) (envelope-from marck@rinet.ru) Date: Sat, 28 Feb 2015 20:19:57 +0300 (MSK) From: Dmitry Morozovsky To: freebsd-fs@FreeBSD.org Subject: mounting sun UFS file system under FreeBSD stable/10 Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (woozle.rinet.ru [0.0.0.0]); Sat, 28 Feb 2015 20:19:57 +0300 (MSK) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 17:20:21 -0000

Dear colleagues,

I have (from one of my friends) a block copy of a dead disk apparently
used in an old Sun Solaris server:

root@moose:/ar/backup/komarov# file -s sun72g.img
sun72g.img: Unix Fast File system [v1] (big-endian), last mounted on /,
last written at Sun Jan 30 15:51:07 2011, clean flag 253, number of
blocks 10242144, number of data blocks 10086988, number of cylinder
groups 202, block size 8192, fragment size 1024, minimum percentage of
free blocks 1, rotational delay 0ms, disk rotational speed 167rps,
TIME optimization
root@moose:/ar/backup/komarov# dd if=sun72g.img count=1 | hd
1+0 records in
1+0 records out
512 bytes transferred in 0.000021 secs (24403223 bytes/sec)
00000000 53 55 4e 37 32 47 20 63 79 6c 20 31 34 30 38 37 |SUN72G cyl 14087|
00000010 20 61 6c 74 20 32 20 68 64 20 32 34 20 73 65 63 | alt 2 hd 24 sec|
00000020 20 34 32 34 00 00 00 00 00 00 00 00 00 00 00 00 | 424............|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000080 00 00 00 01 00 00 00 00 00 00 00 00 00 08 00 02 |................|
00000090 00 00 00 03 00 01 00 05 00 00 00 00 00 00 00 04 |................|
000000a0 00 00 00 07 00 00 00 08 00 00 00 00 00 00 00 00 |................|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 60 0d de ee |............`...|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001a0 00 00 00 00 27 29 37 09 00 00 00 00 00 00 00 01 |....')7.........|
000001b0 37 07 00 02 00 18 01 a8 00 00 00 00 00 00 00 00 |7...............|
000001c0 01 38 90 c0 00 00 07 dd 00 40 20 c0 00 00 00 00 |.8.......@ .....|
000001d0 08 8b 56 40 00 00 09 7a 01 38 90 c0 00 00 11 57 |..V@...z.8.....W|
000001e0 01 38 90 c0 00 00 19 34 01 38 90 c0 00 00 21 11 |.8.....4.8....!.|
000001f0 01 f4 22 c0 00 00 2d a6 01 74 cf c0 da be cc 21 |.."...-..t.....!|
00000200

However, trying to mount or fsck the image fails:

root@moose:/ar/backup/komarov# mdconfig -l -v
md0     swap    128M
md1     vnode   68G     /ar/backup/komarov/sun72g.img
root@moose:/ar/backup/komarov# fsck_ufs -n /dev/md1
** /dev/md1 (NO WRITE)
Cannot find file system superblock
ioctl (GCINFO): Inappropriate ioctl for device
fsck_ufs: /dev/md1: can't read disk label
root@moose:/ar/backup/komarov# mount -r /dev/md1 /mnt
mount: /dev/md1: Invalid argument

Any hints? Thanks!

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------

From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 17:49:29 2015 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 69869524 for ; Sat, 28 Feb 2015 17:49:29 +0000 (UTC) Received: from forward5l.mail.yandex.net (forward5l.mail.yandex.net [IPv6:2a02:6b8:0:1819::5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "forwards.mail.yandex.net", Issuer "Certum Level IV CA" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id EE701278 for ; Sat, 28 Feb 2015 17:49:28 +0000 (UTC) Received: from smtp17.mail.yandex.net (smtp17.mail.yandex.net [95.108.252.17]) by forward5l.mail.yandex.net (Yandex) with ESMTP id 327E2C40FC1; Sat, 28 Feb 2015 20:49:25 +0300 (MSK) Received: from smtp17.mail.yandex.net (localhost [127.0.0.1]) by smtp17.mail.yandex.net (Yandex) with ESMTP id B00C7190022B; Sat, 28 Feb 2015 20:49:24 +0300 (MSK) Received: from unknown (unknown [2a02:6b8:0:5::b3]) by smtp17.mail.yandex.net (nwsmtp/Yandex) with ESMTPSA id OC547UsQtU-nOSWNQWT; Sat, 28 Feb 2015 20:49:24 +0300 (using TLSv1.2 with cipher AES128-SHA (128/128 bits)) (Client certificate not present) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1425145764; bh=AemoWGrulRZuVuL2Ehsor84dYUE/jjzp0/f8DK6swLk=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:Subject: References:In-Reply-To:Content-Type; b=RxyuE4Sy3pmBJbXShku3fdhDQ8GrMTawZPNr1Ec1C8NAz08i98TSwFdK/YdGXoet6 kW6Mg3h+fY3ZyAENAv945leEEzkGbn+Y9teMiUYTVNCb6CYgqrk+PRMaOOECFQT8Bc TWM3Ct8xyT+DOrVmCeeNRy3AjmQSW8mN0nPZg0Uo= Authentication-Results: smtp17.mail.yandex.net; dkim=pass header.i=@yandex.ru
Message-ID: <54F1FF4A.6040905@yandex.ru> Date: Sat, 28 Feb 2015 20:47:54 +0300 From: "Andrey V. Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: Dmitry Morozovsky , freebsd-fs@FreeBSD.org Subject: Re: mounting sun UFS file system under FreeBSD stable/10 References: In-Reply-To: Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="nu0CGIUf8U0p5TqfpGuGUjsrnRFmGoOxo" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 17:49:29 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --nu0CGIUf8U0p5TqfpGuGUjsrnRFmGoOxo Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable

On 28.02.2015 20:19, Dmitry Morozovsky wrote:
> Dear colleagues,
>
> I have (from one of my friends) a block copy of a dead disk apparently
> used in an old Sun Solaris server:
>
> root@moose:/ar/backup/komarov# file -s sun72g.img
> sun72g.img: Unix Fast File system [v1] (big-endian), last mounted on /,
> last written at Sun Jan 30 15:51:07 2011, clean flag 253, number of
> blocks 10242144, number of data blocks 10086988, number of cylinder
> groups 202, block size 8192, fragment size 1024, minimum percentage of
> free blocks 1, rotational delay 0ms, disk rotational speed 167rps,
> TIME optimization
> root@moose:/ar/backup/komarov# dd if=sun72g.img count=1 | hd
> 1+0 records in
> 1+0 records out
> 512 bytes transferred in 0.000021 secs (24403223 bytes/sec)
> 00000000 53 55 4e 37 32 47 20 63 79 6c 20 31 34 30 38 37 |SUN72G cyl 14087|
> 00000010 20 61 6c 74 20 32 20 68 64 20 32 34 20 73 65 63 | alt 2 hd 24 sec|
> 00000020 20 34 32 34 00 00 00 00 00 00 00 00 00 00 00 00 | 424............|
> 00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 00000080 00 00 00 01 00 00 00 00 00 00 00 00 00 08 00 02 |................|
> 00000090 00 00 00 03 00 01 00 05 00 00 00 00 00 00 00 04 |................|
> 000000a0 00 00 00 07 00 00 00 08 00 00 00 00 00 00 00 00 |................|
> 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 60 0d de ee |............`...|
> 000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|

It looks like a VTOC disklabel; try to kldload geom_part_vtoc8.

--
WBR, Andrey V.
Elsukov --nu0CGIUf8U0p5TqfpGuGUjsrnRFmGoOxo Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJU8f9KAAoJEAHF6gQQyKF6QXEIAILau1maWpm1I9qIOHXJpsFI R26AsNLG4Nfnm+RyHsWCyJiWau/rQZ6NlI10mN30JCCge7l7axsv4dxQ4uTKJ0Gh 0sOVGXLD4hIg4TNp2zSpkp+b9/GUBpZnSDy/TRsW4t6qvxhsuRUwL0Jl293MTuBl uSLlpgdsz3S0/zpln7gcPWzo72C9DPLa63cge9SdewWma3h7+n8/gWTyk3N8BdMH Jo0akCOqA+Bg+pYpzZC05JmNIklV8NLPDdFpFziEDpOzrw2E05/A6qnPQ7mDvyrK efCOaIP2E1UtVpJvdYwX0nPHGhfZuRs/DHWWzTtspAFc/ZZbw60YP68bfySSh70= =QueT -----END PGP SIGNATURE----- --nu0CGIUf8U0p5TqfpGuGUjsrnRFmGoOxo-- From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 18:14:00 2015 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ACAA273B for ; Sat, 28 Feb 2015 18:14:00 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 1DB586E9 for ; Sat, 28 Feb 2015 18:13:59 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.9/8.14.9) with ESMTP id t1SIDp39071624 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 28 Feb 2015 20:13:51 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.9.2 kib.kiev.ua t1SIDp39071624 Received: (from kostik@localhost) by tom.home (8.14.9/8.14.9/Submit) id t1SIDobl071623; Sat, 28 Feb 2015 20:13:50 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 28 Feb 2015 20:13:50 +0200 From: Konstantin Belousov To: "Andrey V. Elsukov" Subject: Re: mounting sun UFS file system under FreeBSD stable/10 Message-ID: <20150228181350.GT2379@kib.kiev.ua> References: <54F1FF4A.6040905@yandex.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <54F1FF4A.6040905@yandex.ru> User-Agent: Mutt/1.5.23 (2014-03-12) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.0 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on tom.home Cc: freebsd-fs@FreeBSD.org, Dmitry Morozovsky X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 18:14:00 -0000 On Sat, Feb 28, 2015 at 08:47:54PM +0300, Andrey V. 
Elsukov wrote:
> On 28.02.2015 20:19, Dmitry Morozovsky wrote:
> > Dear colleagues,
> >
> > I have (from one of my friends) a block copy of a dead disk apparently
> > used in an old Sun Solaris server:
> >
> > root@moose:/ar/backup/komarov# file -s sun72g.img
> > sun72g.img: Unix Fast File system [v1] (big-endian), last mounted on /,
> > last written at Sun Jan 30 15:51:07 2011, clean flag 253, number of
> > blocks 10242144, number of data blocks 10086988, number of cylinder
> > groups 202, block size 8192, fragment size 1024, minimum percentage of
> > free blocks 1, rotational delay 0ms, disk rotational speed 167rps,
> > TIME optimization
> > root@moose:/ar/backup/komarov# dd if=sun72g.img count=1 | hd
> > 1+0 records in
> > 1+0 records out
> > 512 bytes transferred in 0.000021 secs (24403223 bytes/sec)
> > 00000000 53 55 4e 37 32 47 20 63 79 6c 20 31 34 30 38 37 |SUN72G cyl 14087|
> > 00000010 20 61 6c 74 20 32 20 68 64 20 32 34 20 73 65 63 | alt 2 hd 24 sec|
> > 00000020 20 34 32 34 00 00 00 00 00 00 00 00 00 00 00 00 | 424............|
> > 00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> > *
> > 00000080 00 00 00 01 00 00 00 00 00 00 00 00 00 08 00 02 |................|
> > 00000090 00 00 00 03 00 01 00 05 00 00 00 00 00 00 00 04 |................|
> > 000000a0 00 00 00 07 00 00 00 08 00 00 00 00 00 00 00 00 |................|
> > 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 60 0d de ee |............`...|
> > 000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> It looks like a VTOC disklabel; try to kldload geom_part_vtoc8.
>
That still won't help: note that the UFS image is big-endian, while
the host is most likely x86, which means little-endian. Our UFS does
not perform byte-order normalization. Also, I believe that Sun made
some changes to the filesystem layout, so it is not likely that it
would work even on a machine with the right endianness. The best
course of action is to use a Solaris live CD to tar up the volume.
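The endianness diagnosis is easy to double-check by hand. A sketch assuming the classic UFS1 on-disk layout (superblock 8192 bytes into the partition, fs_magic 1372 bytes into the superblock) and assuming the root slice starts at the very beginning of the image, which file -s finding the superblock there suggests:

    # FS_UFS1_MAGIC is 0x00011954. A big-endian (Sun) filesystem stores
    # it as "00 01 19 54"; a FreeBSD/x86-written one would show
    # "54 19 01 00" instead.
    dd if=sun72g.img bs=1 skip=$((8192 + 1372)) count=4 2>/dev/null | hd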
From owner-freebsd-fs@FreeBSD.ORG Sat Feb 28 20:59:32 2015 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id E2A44C89 for ; Sat, 28 Feb 2015 20:59:32 +0000 (UTC) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 5E4BB853 for ; Sat, 28 Feb 2015 20:59:31 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id t1SKqwnt047081; Sat, 28 Feb 2015 23:52:58 +0300 (MSK) (envelope-from marck@rinet.ru) Date: Sat, 28 Feb 2015 23:52:58 +0300 (MSK) From: Dmitry Morozovsky To: Konstantin Belousov Subject: Re: mounting sun UFS file system under FreeBSD stable/10 In-Reply-To: <20150228181350.GT2379@kib.kiev.ua> Message-ID: References: <54F1FF4A.6040905@yandex.ru> <20150228181350.GT2379@kib.kiev.ua> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (woozle.rinet.ru [0.0.0.0]); Sat, 28 Feb 2015 23:54:05 +0300 (MSK) Cc: freebsd-fs@freebsd.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 28 Feb 2015 20:59:33 -0000

On Sat, 28 Feb 2015, Konstantin Belousov wrote:

> > It looks like a VTOC disklabel; try to kldload geom_part_vtoc8.
> >
>
> That still won't help: note that the UFS image is big-endian, while
> the host is most likely x86, which means little-endian. Our UFS does
> not perform byte-order normalization.

I'm afraid of the same :(

> Also, I believe that Sun made some changes to the filesystem layout,
> so it is not likely that it would work even on a machine with the
> right endianness. The best course of action is to use a Solaris live
> CD to tar up the volume.

Side question: can a Solaris LiveCD use an iSCSI-exported volume? I
could then boot Solaris in a virtual machine and try to mount the .img
from there...

Thanks!

--
Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer: marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
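On the iSCSI side: stable/10 ships a native iSCSI target, ctld(8), so the image could be exported from the FreeBSD box itself and attached to a Solaris guest. A minimal, untested sketch modelled on the ctl.conf(5) examples; the IQN is made up and no authentication is configured, so this belongs on a lab network only:

    cat >> /etc/ctl.conf <<'EOF'
    portal-group pg0 {
            discovery-auth-group no-authentication
            listen 0.0.0.0
    }

    target iqn.2015-02.ru.rinet:sun72g {
            auth-group no-authentication
            portal-group pg0
            lun 0 {
                    # ctld should take the LUN size from the backing
                    # file; add an explicit "size 68G" if it complains.
                    path /ar/backup/komarov/sun72g.img
            }
    }
    EOF
    service ctld onestart

Then point the Solaris initiator (or the VM's iSCSI boot support) at the FreeBSD host's portal on the default port 3260.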