From owner-freebsd-fs@FreeBSD.ORG Sun Apr 12 03:52:07 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 17D94106566C for ; Sun, 12 Apr 2009 03:52:07 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (rrcs-24-73-246-106.sw.biz.rr.com [24.73.246.106]) by mx1.freebsd.org (Postfix) with ESMTP id BA1458FC12 for ; Sun, 12 Apr 2009 03:52:06 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id n3C3TbbM001217 for ; Sat, 11 Apr 2009 22:29:37 -0500 (CDT) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:subject: content-type:content-transfer-encoding; b=RZT1d3Jtiaaqs+lcx7ZRlEJiCDm6J47s1UYDP7P3b56PELUayAhKcfGQXF+Xt+adH 1BVOUW9/cIvI3QLxRYp6iVsWEfJyszcS2ySiTKMz47pTtAxOj3TCi1OdZsnggLtx5cB XBL6pGxVnDpozfisXkrwXj81lujylvKcbxRUu9Y= Message-ID: <49E16021.6040900@jrv.org> Date: Sat, 11 Apr 2009 22:29:37 -0500 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: turning off ZFS mountpoint property behavior? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 12 Apr 2009 03:52:07 -0000 Is there a knob to turn off ZFS's mounting of filesystems based on the mountpoint property? It is most unhelpful when receiving replicas of filesystems to have a received snapshot suddenly mounted over /usr. 

I have two systems, "prime" and "subprime", both of which have a large
ZFS pool and a small UFS partition for maintenance. They are
essentially the same except that /boot/loader.conf boots one into ZFS
and the other into UFS.

"prime" is the operational server using ZFS. "subprime" is essentially
a hot spare booting UFS whose ZFS pool is to be kept in sync with the
pool on "prime" via zfs send/recv replication. Should the pool on
"prime" fail, /boot/loader.conf on "subprime" can be changed to boot
its ZFS pool and the server is quickly available again, at the last
snapshot replicated.

Unfortunately, when zfs recv runs and receives a filesystem with
property mountpoint=/usr, it mounts that filesystem there. That's not
desirable in my situation nor, I suspect, in many others.

Is there a sysctl or some other way to disable the automatic mount
behavior?

From owner-freebsd-fs@FreeBSD.ORG Sun Apr 12 20:06:18 2009
Date: Sun, 12 Apr 2009 16:12:40 -0400 (EDT)
From: Rick Macklem
To: freebsd-fs@freebsd.org
Subject: changing semantics of the va_filerev (code review)

In summary, the nfsv4 server needs 3 changes to the FreeBSD kernel:
1 - Sharing of nfssvc(). (This was just checked into FreeBSD-CURRENT.)
2 - Some calls that recall delegations must be done before local
    VOP_RENAME() and VOP_ADVLOCK(). I am waiting for comments to a
    vague post on this before I mail my first stab at coding this.
3 - Support for the Change attribute, which is what this post is about.

Once the above 3 things are resolved, the code should drop in without
further changes outside of its subtree.

As background, I believe va_filerev/i_modrev was added for nqnfs
long long ago. Since it is not exposed to userland via the stat
structure, I don't believe anything outside of the kernel uses it.
Inside the kernel, the only thing that currently uses it is the nfs
server, which uses it as the cookie verifier. (It really doesn't use
it, since a client regurgitates it back to the server as opaque bits
in the next readdir rpc and the server then ignores those bits. This
is correct, since va_filerev is a bogus cookie verifier.) As such, I
don't believe changing the semantics of va_filerev will break anything
in FreeBSD.

I'd like to change the semantics of va_filerev so that it can be used
by the nfsv4 server as the Change attribute. To do this, it needs to
change in 2 ways:
- must change upon metadata changes as well as data changes
- must persist across server reboots (ie. be moved to spare space in
  the on-disk i-node instead of in memory i-node)

Here is the patch to ufs for the above, that I have been using for some
time. Please review and comment.

Thanks, rick
--- ufs patch to change va_filerev semantics ---
--- ufs/ufs/inode.h.sav 2009-04-12 02:29:05.000000000 -0400
+++ ufs/ufs/inode.h 2009-03-20 12:18:20.000000000 -0400
@@ -74,7 +74,6 @@
     struct fs *i_fs; /* Associated filesystem superblock. */
     struct dquot *i_dquot[MAXQUOTAS]; /* Dquot structures. */
-    u_quad_t i_modrev; /* Revision level for NFS lease. */
     /*
      * Side effects; used during directory lookup.
      */
--- ufs/ufs/dinode.h.sav 2009-04-12 02:29:40.000000000 -0400
+++ ufs/ufs/dinode.h 2008-08-25 17:31:55.000000000 -0400
@@ -145,7 +145,8 @@
     ufs2_daddr_t di_extb[NXADDR];/* 96: External attributes block. */
     ufs2_daddr_t di_db[NDADDR]; /* 112: Direct disk blocks. */
     ufs2_daddr_t di_ib[NIADDR]; /* 208: Indirect disk blocks. */
-    int64_t di_spare[3]; /* 232: Reserved; currently unused */
+    u_int64_t di_modrev; /* 232: i_modrev for NFSv4 */
+    int64_t di_spare[2]; /* 240: Reserved; currently unused */
 };

 /*
@@ -183,7 +184,7 @@
     int32_t di_gen; /* 108: Generation number. */
     u_int32_t di_uid; /* 112: File owner. */
     u_int32_t di_gid; /* 116: File group. */
-    int32_t di_spare[2]; /* 120: Reserved; currently unused */
+    u_int64_t di_modrev; /* 120: i_modrev for NFSv4 */
 };
 #define di_ogid di_u.oldids[1]
 #define di_ouid di_u.oldids[0]
--- ufs/ufs/ufs_vnops.c.sav 2009-04-12 02:28:41.000000000 -0400
+++ ufs/ufs/ufs_vnops.c 2009-03-10 16:47:11.000000000 -0400
@@ -157,11 +157,12 @@
     if (ip->i_flag & IN_UPDATE) {
         DIP_SET(ip, i_mtime, ts.tv_sec);
         DIP_SET(ip, i_mtimensec, ts.tv_nsec);
-        ip->i_modrev++;
+        DIP_SET(ip, i_modrev, DIP(ip, i_modrev) + 1);
     }
     if (ip->i_flag & IN_CHANGE) {
         DIP_SET(ip, i_ctime, ts.tv_sec);
         DIP_SET(ip, i_ctimensec, ts.tv_nsec);
+        DIP_SET(ip, i_modrev, DIP(ip, i_modrev) + 1);
     }

 out:
@@ -446,6 +447,7 @@
         vap->va_ctime.tv_sec = ip->i_din1->di_ctime;
         vap->va_ctime.tv_nsec = ip->i_din1->di_ctimensec;
         vap->va_bytes = dbtob((u_quad_t)ip->i_din1->di_blocks);
+        vap->va_filerev = ip->i_din1->di_modrev;
     } else {
         vap->va_rdev = ip->i_din2->di_rdev;
         vap->va_size = ip->i_din2->di_size;
@@ -456,12 +458,12 @@
         vap->va_birthtime.tv_sec = ip->i_din2->di_birthtime;
         vap->va_birthtime.tv_nsec = ip->i_din2->di_birthnsec;
         vap->va_bytes = dbtob((u_quad_t)ip->i_din2->di_blocks);
+        vap->va_filerev = ip->i_din2->di_modrev;
     }
     vap->va_flags = ip->i_flags;
     vap->va_gen = ip->i_gen;
     vap->va_blocksize = vp->v_mount->mnt_stat.f_iosize;
     vap->va_type = IFTOVT(ip->i_mode);
-    vap->va_filerev = ip->i_modrev;
     return (0);
 }
@@ -2223,7 +2225,6 @@
     ASSERT_VOP_LOCKED(vp, "ufs_vinit");
     if (ip->i_number == ROOTINO)
         vp->v_vflag |= VV_ROOT;
-    ip->i_modrev = init_va_filerev();
     *vpp = vp;
     return (0);
 }

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 01:28:09 2009
Date: Mon, 13 Apr 2009 03:03:44 +0200
To: freebsd-fs@freebsd.org
From: "Tobias C. Berner"
Subject: zfs and moving devices

Hi

I have a zfs pool

        NAME        STATE     READ WRITE CKSUM
        multimedia  ONLINE       0     0     0
          ad8       ONLINE       0     0     0
          ad10      ONLINE       0     0     0
          ad12      ONLINE       0     0     0
          ad14      ONLINE       0     0     0

Now, I need more SATA connectors.
If I activate another onboard controller, the device names move:

   ad8  -> ad14
   ad10 -> ad16
   ad12 -> ad18
   ad14 -> ad20

What is the proper way to handle this in zfs?

thanks, Tobias

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 04:23:30 2009
Date: Mon, 13 Apr 2009 06:54:12 +0300
Message-ID: <59adc1a0904122054j52cf9c60h6b3909379e04463@mail.gmail.com>
From: Dimitar Vasilev
To: "Tobias C. Berner"
Cc: freebsd-fs@freebsd.org
Subject: Re: zfs and moving devices

2009/4/13 Tobias C. Berner:
> Hi
>
> I have a zfs pool
>
>        NAME        STATE     READ WRITE CKSUM
>        multimedia  ONLINE       0     0     0
>          ad8       ONLINE       0     0     0
>          ad10      ONLINE       0     0     0
>          ad12      ONLINE       0     0     0
>          ad14      ONLINE       0     0     0
>
> Now, I need more SATA connectors. If I activate
> another onboard controller, the device names
> move:
>
>   ad8  -> ad14
>   ad10 -> ad16
>   ad12 -> ad18
>   ad14 -> ad20
>
>
> What is the proper way to handle this in zfs?
>
>
> thanks, Tobias
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

There was an option for ata_static_id's in $KERNCONF - you need to
enable this to keep the SATA device names from shifting. I don't
remember the exact magic instance - it should be somewhere in
LINT/hint/GENERIC. It should resemble something like ATA_STATIC_ID.

Cheers, Dimitar

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 05:05:51 2009
Date: Mon, 13 Apr 2009 01:05:50 -0400
From: Gary Palmer
To: freebsd-fs@freebsd.org
Message-ID: <20090413050550.GA44022@in-addr.com>
References: <59adc1a0904122054j52cf9c60h6b3909379e04463@mail.gmail.com>
In-Reply-To: <59adc1a0904122054j52cf9c60h6b3909379e04463@mail.gmail.com>
Subject: Re: zfs and moving devices

On Mon, Apr 13, 2009 at 06:54:12AM +0300, Dimitar Vasilev wrote:
> 2009/4/13 Tobias C. Berner:
> > Hi
> >
> > I have a zfs pool
> >
> >        NAME        STATE     READ WRITE CKSUM
> >        multimedia  ONLINE       0     0     0
> >          ad8       ONLINE       0     0     0
> >          ad10      ONLINE       0     0     0
> >          ad12      ONLINE       0     0     0
> >          ad14      ONLINE       0     0     0
> >
> > Now, I need more SATA connectors. If I activate
> > another onboard controller, the device names
> > move:
> >
> >   ad8  -> ad14
> >   ad10 -> ad16
> >   ad12 -> ad18
> >   ad14 -> ad20
> >
> >
> > What is the proper way to handle this in zfs?
> >
> >
> > thanks, Tobias
> >
>
> There was an option for ata_static_id's in $KERNCONF - you need to
> enable this to keep the SATA device names from shifting. I don't
> remember the exact magic instance - it should be somewhere in
> LINT/hint/GENERIC. It should resemble something like ATA_STATIC_ID.

% grep STATIC /sys/i386/conf/GENERIC
options         ATA_STATIC_ID   # Static device numbering

Regards, Gary

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 05:22:01 2009
From: Chris Ruiz
To: Tobias C. Berner
Date: Mon, 13 Apr 2009 00:17:43 -0500
Subject: Re: zfs and moving devices

On Apr 12, 2009, at 8:03 PM, Tobias C. Berner wrote:

> Hi
>
> I have a zfs pool
>
>        NAME        STATE     READ WRITE CKSUM
>        multimedia  ONLINE       0     0     0
>          ad8       ONLINE       0     0     0
>          ad10      ONLINE       0     0     0
>          ad12      ONLINE       0     0     0
>          ad14      ONLINE       0     0     0
>
> Now, I need more SATA connectors. If I activate
> another onboard controller, the device names
> move:
>
>   ad8  -> ad14
>   ad10 -> ad16
>   ad12 -> ad18
>   ad14 -> ad20
>
>
> What is the proper way to handle this in zfs?

ZFS should just find the pool even though the device names have changed.

Chris

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 07:34:48 2009
Date: Mon, 13 Apr 2009 02:34:21 -0500 (CDT)
From: Wes Morgan
To: "Tobias C. Berner"
Cc: freebsd-fs@freebsd.org
Subject: Re: zfs and moving devices

On Mon, 13 Apr 2009, Tobias C. Berner wrote:

> I have a zfs pool
>
>        NAME        STATE     READ WRITE CKSUM
>        multimedia  ONLINE       0     0     0
>          ad8       ONLINE       0     0     0
>          ad10      ONLINE       0     0     0
>          ad12      ONLINE       0     0     0
>          ad14      ONLINE       0     0     0
>
> Now, I need more SATA connectors. If I activate
> another onboard controller, the device names
> move:
>
>   ad8  -> ad14
>   ad10 -> ad16
>   ad12 -> ad18
>   ad14 -> ad20
>
>
> What is the proper way to handle this in zfs?

Export the pool before you make the change and it should work without
a problem. You may want to enable ATA_STATIC_ID as well so you won't
have to worry about it either.

On another note, that's a 4 device pool with no redundancy. Make sure
you have frequent backups! I lost my "multimedia" pool once during a
migration and was very sad. Now I use raidz2.

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 10:18:10 2009
Date: Mon, 13 Apr 2009 12:18:07 +0200
To: freebsd-fs@freebsd.org
From: "Tobias C. Berner"
Subject: Re: zfs and moving devices

On 13.04.2009, 09:34, Wes Morgan wrote:

> On Mon, 13 Apr 2009, Tobias C. Berner wrote:
>
>> I have a zfs pool
>>
>>        NAME        STATE     READ WRITE CKSUM
>>        multimedia  ONLINE       0     0     0
>>          ad8       ONLINE       0     0     0
>>          ad10      ONLINE       0     0     0
>>          ad12      ONLINE       0     0     0
>>          ad14      ONLINE       0     0     0
>>
>> Now, I need more SATA connectors. If I activate
>> another onboard controller, the device names
>> move:
>>
>>   ad8  -> ad14
>>   ad10 -> ad16
>>   ad12 -> ad18
>>   ad14 -> ad20
>>
>>
>> What is the proper way to handle this in zfs?
>
> Export the pool before you make the change and it should work without
> a problem.

Ok, I will try that,

> You may want to enable ATA_STATIC_ID as well so you won't have to worry
> about it either.

ATA_STATIC_ID is enabled:
options         ATA_STATIC_ID   # Static device numbering

thanks, Tobias

>
> On another note, that's a 4 device pool with no redundancy. Make sure you
> have frequent backups! I lost my "multimedia" pool once during a migration
> and was very sad. Now I use raidz2.
>

-- 
Created with Opera's revolutionary e-mail module: http://www.opera.com/mail/

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 10:52:40 2009
Date: Mon, 13 Apr 2009 20:52:37 +1000 (EST)
From: Bruce Evans
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Message-ID: <20090413193936.A52183@delplex.bde.org>
Subject: Re: changing semantics of the va_filerev (code review)

On Sun, 12 Apr 2009, Rick Macklem wrote:

> In summary, the nfsv4 server needs 3 changes to the FreeBSD kernel:
> ...
> 3 - Support for the Change attribute, which is what this post is about.
> ...
> As background, I believe va_filerev/i_modrev was added for nqnfs
> long long ago. Since it is not exposed to userland via the stat structure,

va_gen/i_gen/di_gen/st_gen seems to be even more suitable for this
purpose, but it isn't actually a file generation number like its
comments say (it is normally set to a random value on file creation
then never changed) and it is exposed to userland (st_gen).

> I don't believe anything outside of the kernel uses it. Inside the kernel,
> the only thing that currently uses it is the nfs server, which uses it as
> the cookie verifier. (It really doesn't use it, since a client regurgitates
> it back to the server as opaque bits in the next readdir rpc
> and the server then ignores those bits. This is correct, since va_filerev is
> a bogus cookie verifier.) As such, I don't believe changing the semantics of
> va_filerev will break anything in FreeBSD.

va_gen isn't used much either. In ext2fs, i_gen is a copy of the
on-disk field i_generation which is documented to be /* for NFS */ but
nfs doesn't use va_gen at all. nfsv3 (getattr, loadattrcache) doesn't
even initialize va_gen. nfsv2 initializes it to a non-random value
based on a timestamp. I'm not sure if it does this only on creation or
on every cache miss or on every call. I think the uninitialized va_gen
gives stack garbage in st_gen, but in tests I get 0 for both nfsv3 and
nfsv2 (as root -- st_gen is always 0 for non-root).

I don't understand the security issues for *_gen, but remember its
being changed for security. cvs history shows that it used to actually
be a generation number in at least ffs, but for ffs files and not for
individual file changes (or for individual ffs file systems or all file
systems).

> I'd like to change the semantics of va_filerev so that it can be used
> by the nfsv4 server as the Change attribute. To do this, it needs to
> change in 2 ways:
> - must change upon metadata changes as well as data changes
> - must persist across server reboots (ie. be moved to spare space in
>   the on-disk i-node instead of in memory i-node)

Many nonstandard file systems, e.g., msdosfs, have no space to spare.
Read-only file systems like cd9660 and udf probably don't need a
variable generation count (since they never change), but their current
handling of va_filerev and va_gen is wrong if these fields have any
other semantics. These file systems just initialize va_gen to 1 for
all files and va_filerev to 0 (with an XXX in udf only) for all files.

va_ctime should give what you want for all file systems, since it
should be increased whenever anything changes. However, most file
systems always set the nsec part to 0, so va_ctime doesn't track all
file changes. This is a problem for things like make(1) too, so if
nsec timestamps aren't available or take too long or are not
fine-grained enough, the nsec part should be abused as a generation
counter so that any change gives a strictly larger timestamp. The case
where someone sets the clock backwards is broken but won't happen often.

Many nonstandard file systems, e.g., msdosfs, have no space for an
on-disk ctime, so they fake va_ctime using an on-disk mtime. Since
such file systems don't have many attributes, only a few more cases are
broken.

> Here is the patch to ufs for the above, that I have been using for some
> time. Please review and comment.
> ...
> --- ufs/ufs/ufs_vnops.c.sav 2009-04-12 02:28:41.000000000 -0400
> +++ ufs/ufs/ufs_vnops.c 2009-03-10 16:47:11.000000000 -0400
> @@ -157,11 +157,12 @@
>      if (ip->i_flag & IN_UPDATE) {
>          DIP_SET(ip, i_mtime, ts.tv_sec);
>          DIP_SET(ip, i_mtimensec, ts.tv_nsec);
> -        ip->i_modrev++;
> +        DIP_SET(ip, i_modrev, DIP(ip, i_modrev) + 1);
>      }
>      if (ip->i_flag & IN_CHANGE) {
>          DIP_SET(ip, i_ctime, ts.tv_sec);
>          DIP_SET(ip, i_ctimensec, ts.tv_nsec);
> +        DIP_SET(ip, i_modrev, DIP(ip, i_modrev) + 1);
>      }

IN_UPDATE implies IN_CHANGE (unless there is a bug). Thus the above
gives an extra increment.
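
[Editorial illustration, not part of the original message.] The extra
increment noted above can be shown with a small user-space sketch. It
does not use the kernel's DIP()/DIP_SET() macros; the struct toy_inode
type, the helper names bump_per_flag() and bump_once(), and the flag
values are all hypothetical stand-ins chosen for this example. The
second helper is only one possible way to avoid the double bump, not
necessarily what the author would commit.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the ufs inode flags; values are illustrative. */
#define IN_CHANGE 0x0002 /* inode (metadata) was changed */
#define IN_UPDATE 0x0004 /* file data was modified */

/* Toy inode holding only the fields needed for the illustration. */
struct toy_inode {
    uint32_t flag;   /* pending IN_* flags */
    uint64_t modrev; /* stands in for di_modrev */
};

/* Same shape as the posted hunk: one bump for each flag that is set. */
static void
bump_per_flag(struct toy_inode *ip)
{
    if (ip->flag & IN_UPDATE)
        ip->modrev++;
    if (ip->flag & IN_CHANGE)
        ip->modrev++;
    ip->flag &= ~(uint32_t)(IN_UPDATE | IN_CHANGE);
}

/* Assumed alternative: bump exactly once if either flag is pending. */
static void
bump_once(struct toy_inode *ip)
{
    if (ip->flag & (IN_UPDATE | IN_CHANGE))
        ip->modrev++;
    ip->flag &= ~(uint32_t)(IN_UPDATE | IN_CHANGE);
}
```

With both flags pending, as happens on a data write, bump_per_flag()
advances the counter by two while bump_once() advances it by one; with
only IN_CHANGE pending the two behave the same. Either variant still
yields a value that changes on every modification, so the double bump
is wasteful rather than incorrect for a Change attribute.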

Strictly, if you want to track _all_ metadata changes, then you need an
increment for IN_ACCESS, and va_ctime will no longer be nearly usable
since it is not changed by read accesses. I hope you don't want this.

Bruce

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 11:06:53 2009
Date: Mon, 13 Apr 2009 11:06:52 GMT
Message-Id: <200904131106.n3DB6qDT084932@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/133614  fs         [smbfs] [panic] panic: ffs_truncate: read-only filesys
o kern/133373  fs         [zfs] umass attachment causes ZFS checksum errors, dat
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
o kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/133134  fs         [zfs] Missing ZFS zpool labels
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132551  fs         [zfs] ZFS locks up on extattr_list_link syscall
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132337  fs         [zfs] [panic] kernel panic in zfs_fuid_create_cred
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132145  fs         [panic] File System Hard Crashes
f kern/132068  fs         [zfs] page fault when using ZFS over NFS on 7.1-RELEAS
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/131086  fs         [ext2fs] [patch] mkfs.ext2 creates rotten partition
o kern/131084  fs         [xfs] xfs destroys itself after copying data
o kern/131081  fs         [zfs] User cannot delete a file when a ZFS dataset is
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o bin/130105   fs         [zfs] zfs send -R dumps core
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/128633  fs         [zfs] [lor] lock order reversal in zfs
o kern/128514  fs         [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad
f kern/128173  fs         [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs         [tmpfs] sendfile on tmpfs data corruption
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
f kern/125536  fs         [ext2fs] ext 2 mounts cleanly but fails on commands li
o kern/125149  fs         [nfs] [panic] changing into .zfs dir from nfs client c
f kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
o kern/122888  fs         [zfs] zfs hang w/ prefetch on, zil off while running t
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
o kern/89991   fs         [ufs] softupdates with mount -ur causes fs UNREFS
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc

57 problems total.

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 15:07:08 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F067F106566B for ; Mon, 13 Apr 2009 15:07:08 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from skerryvore.cs.uoguelph.ca (skerryvore.cs.uoguelph.ca [131.104.94.204]) by mx1.freebsd.org (Postfix) with ESMTP id AB9858FC19 for ; Mon, 13 Apr 2009 15:07:08 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by skerryvore.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id n3DF774G019406; Mon, 13 Apr 2009 11:07:07 -0400 Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id n3DFDXL03365; Mon, 13 Apr 2009 11:13:33 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Mon, 13 Apr 2009 11:13:33 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Bruce Evans In-Reply-To: <20090413193936.A52183@delplex.bde.org> Message-ID: References: <20090413193936.A52183@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.63 on 131.104.94.204 Cc: freebsd-fs@freebsd.org Subject: Re: changing semantics of the va_filerev (code review) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Apr 2009 15:07:09 -0000 On Mon, 13 Apr 2009, Bruce Evans wrote: > > va_gen/i_gen/di_gen/st_gen seems to be even more suitable for this > purpose, but it isn't actually a file generation number like its > comments say (it is normally set to a random value on file creation > then never changed) and it is
exposed to userland (st_gen). > i_gen is used by NFS to create T-stable (valid for a long time, including a long time after the file is removed) file handles. It is used by ffs_vptofh() to create the file handles for NFS that are recognized as representing removed files, even after an i-node gets reused such that the i-node number now represents another file. > > va_gen isn't used much either. In ext2fs, i_gen is a copy of the > on-disk field i_generation which is documented to be /* for NFS */ but > nfs doesn't use va_gen at all. nfs3 (getattr, loadattrcache) doesn't > even initialize va_gen. nfsv2 initializes it to a non-random value > based on a timestamp. I'm not sure if it does this only on creation > or on every cache miss or on every call. I think the uninitialized > va_gen gives stack garbage in st_gen, but in tests I get 0 for both > nfsv3 and nfsv2 (as root -- st_gen is always 0 for non-root). I don't > understand the security issues for *_gen, but remember its being changed > for security. cvs history shows that it used to actually be a generation > number in at least ffs, but for ffs files and not for individual file > changes (or for individual ffs file systems or all file systems). > Its initial value doesn't matter for it to work correctly. It should get incremented each time an i-node gets reused for a different file. (That's what the ESTALE magic is, the server reporting to the client that the file handle is for a file that has been removed.) The "security" business is a bit bogus to me. It's one of those security by obscurity tricks, imho. The problem was that a file handle was easy to fake when i_gen is initially 0. Initializing it to a non-zero value makes faking one a little harder, but... Part of the reason for doing this was that IP#s were only checked against exports at mount time on some systems (BSD has never been this way) and faking the one file handle for the root of the FS (root i-node#, i_gen == 0) bypassed exports and tah dah. 
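[Editor's illustration, not part of the original message.] The role of the generation number in detecting stale file handles, as described above, can be modelled roughly as follows. This is a simplified user-space sketch: struct sk_fh, struct sk_slot, and the sk_* functions are hypothetical and do not reproduce ffs_vptofh() or the real fhtovp code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical file handle: i-node number plus the generation seen at issue time. */
struct sk_fh {
	uint32_t ino;
	uint32_t gen;
};

/* Hypothetical on-disk i-node slot. */
struct sk_slot {
	uint32_t gen;	/* current generation of this i-node slot */
	int in_use;	/* nonzero while a file occupies the slot */
};

/* Removing the file frees the slot but keeps the generation. */
static void
sk_remove(struct sk_slot *s)
{
	s->in_use = 0;
}

/* Reusing the slot for a different file bumps the generation. */
static void
sk_reuse(struct sk_slot *s)
{
	s->gen++;
	s->in_use = 1;
}

/* Returns 0 if the handle still names the file in the slot, ESTALE otherwise. */
static int
sk_fh_check(const struct sk_slot *s, const struct sk_fh *fh)
{
	if (!s->in_use || s->gen != fh->gen)
		return (ESTALE);
	return (0);
}
```

Only inequality between the stored and current generation matters here, which is consistent with the point that the initial value is irrelevant for correctness; a non-zero or random starting value only makes a handle harder to guess.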
>> I'd like to change the semantics of va_filerev so that it can be used >> by the nfsv4 server as the Change attribute. To do this, it needs to >> change in 2 ways: >> - must change upon metadata changes as well as data changes >> - must persist across server reboots (ie. be moved to spare space in >> the on-disk i-node instead of in memory i-node) > > Many nonstandard file systems, e.g., msdosfs, have no space to spare. > If a file system can't support it correctly, faking it with something like modify time is about all you can do. Since Change is supposed to change on every file modification, this fails when multiple changes occur within the same tod clock time or the clock gets reset backwards, as you noted below. (Linux uses a modify time with a 1sec clock resolution for Change, which isn't correct and the Linux nfs server folks know that. Since this breaks the AIX nfsv4 client, the AIX folks tend to remind them:-) > Read-only file systems like cd9660 and udf probably don't need a a > variable generation count (since they never change), but their current > handling of va_filerev and va_gen is wrong if these fields have any > other semantics. These file systems just initialize va_gen to 1 for > all files and va_gen to 0 (with an XXX in udf only) for all files. > Since they only need to change for modifications and their initial values don't really matter, the above sounds fine to me. > va_ctime should give what you want for all file systems, since it > should be increased whenever anything changes. However, most file There are some places where IN_UPDATE gets set, but IN_CHANGE doesn't. Since the Change attribute must change for every file modification, I feel safer incrementing it for both IN_UPDATE and IN_CHANGE. (It's 64bits, so it won't wrap around for a little while.) > systems always set the nsec part to 0, so va_ctime doesn't track > all file changes. 
This is a problem for things like make(1) too, > so if nsec timestamps aren't available or are take too long or are > not fine-grained enough, the nsec part should be abused as a generation > counter so that any change gives a strictly larger timestamp. The > case where someone sets the clock backwards is broken but won't > happen often. > > Many nonstandard file systems, e.g., msdosfs, have > no space for an on-disk ctime, so they fake va_ctime using an on-disk > mtime. Since such file systems don't have many attributes, only a > few more cases are broken. > Yep, that's why ctime/mtime aren't sufficient. If a read/write file system doesn't have support for it, all you can do is fake it and hope the client works ok. I suspect the Linux folks will eventually start to add support for it to ext3fs etc, because of the above, but who knows. It seems that FreeBSD mostly uses FFS and ZFS (which should have support for it, since the Solaris folks are into NFSv4?), so at least we should be able to make those work correctly. 
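[Editor's illustration, not part of the original message.] The quoted idea of abusing the nsec part as a generation counter, so that every change yields a strictly larger timestamp, might look like the sketch below. The function name sk_next_stamp is hypothetical, and this is a user-space model rather than anything a file system currently does.

```c
#include <assert.h>
#include <time.h>

/*
 * Pick the timestamp to store for a new change.
 * If the clock reading is already later than the stored stamp, use it.
 * Otherwise (coarse clock, or clock stepped backwards) advance the
 * stored stamp by one nanosecond so the result is strictly larger.
 */
static struct timespec
sk_next_stamp(struct timespec prev, struct timespec now)
{
	if (now.tv_sec > prev.tv_sec ||
	    (now.tv_sec == prev.tv_sec && now.tv_nsec > prev.tv_nsec))
		return (now);

	prev.tv_nsec++;
	if (prev.tv_nsec >= 1000000000L) {
		/* Carry into the seconds field. */
		prev.tv_sec++;
		prev.tv_nsec = 0;
	}
	return (prev);
}
```

A design note: because the result never goes backwards, a clock that is set back leaves the stored stamps running ahead of real time until the clock catches up, which is one form of the "broken but rare" case mentioned in the quoted text.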
Have a good week, rick From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 17:33:43 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E73F106564A; Mon, 13 Apr 2009 17:33:43 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 0393F8FC12; Mon, 13 Apr 2009 17:33:43 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3DHXgM2020276; Mon, 13 Apr 2009 17:33:42 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3DHXglX020272; Mon, 13 Apr 2009 17:33:42 GMT (envelope-from linimon) Date: Mon, 13 Apr 2009 17:33:42 GMT Message-Id: <200904131733.n3DHXglX020272@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-amd64@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/133676: [smbfs] [panic] umount -f'ing a vnode-based memory disk from off a SMB share caused a reboot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Apr 2009 17:33:43 -0000 Old Synopsis: umount -f'ing a vnode-based memory disk from off a SMB share caused a reboot New Synopsis: [smbfs] [panic] umount -f'ing a vnode-based memory disk from off a SMB share caused a reboot Responsible-Changed-From-To: freebsd-amd64->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Apr 13 17:31:51 UTC 2009 Responsible-Changed-Why: Reclassify and reassign. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=133676 From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 17:55:37 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 16E80106566C for ; Mon, 13 Apr 2009 17:55:37 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 454AC8FC1A for ; Mon, 13 Apr 2009 17:55:36 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 76A7946B5C; Mon, 13 Apr 2009 13:55:34 -0400 (EDT) Received: from jhbbsd.hudson-trading.com (unknown [209.249.190.8]) by bigwig.baldwin.cx (Postfix) with ESMTPA id A903F8A04D; Mon, 13 Apr 2009 13:54:45 -0400 (EDT) From: John Baldwin To: freebsd-fs@freebsd.org Date: Mon, 13 Apr 2009 11:46:21 -0400 User-Agent: KMail/1.9.7 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200904131146.21640.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Mon, 13 Apr 2009 13:54:45 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=0.1 required=4.2 tests=RDNS_NONE autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Subject: Re: integrating nfsv4 locking with nlm and local locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Apr 2009 17:55:37 -0000 On Thursday 09 April 2009 3:04:37 pm Rick Macklem wrote: > My nfsv4 server currently does VOP_ADVLOCK() with the 
non-blocking F_SETLK > type and I had thought that was sufficient, but I now realize (thanks to > a recent post by Zachary Loafman) that this breaks when a delegation for > the file is issued to a client. (When a delegation for a file is issued > to a client, it can do byte range locking locally, and the server doesn't > know about these to do VOP_ADVLOCK() on the server machine.) > > I believe that Zachary would like to discuss a more general solution, > including how to handle Open/Share locks, but in the meantime I'd like to > solve this specific case in as simple a way as possible. > > Basically, I need a way to make sure delegations for a file don't exist > when local byte range locking or locking via the NLM is being done on > the file. > > The simplest thing I can think of is the following: > When VOP_ADVLOCK() is called for a file (outside of the nfsv4 server), > do two things: > 1 - Make sure any outstanding delegations are recalled. > I already have a function that does this, so it is a matter > of figuring out where to put the call(s). > 2 - Set a flag on the vnode, so that my nfsv4 server knows not to > issue another delegation for that file. > (I could test for locks via VOP_ADVLOCK() before issuing a > delegation, but that has two problems.) > 1 - Since the vnode is unlocked for VOP_ADVLOCK(), there could > be a race where the nfsv4 server issues a delegation > between the time outstanding delegations are recalled at > #1 above and the VOP_ADVLOCK() sets the lock that I would > see during the test. > 2 - It would have to keep checking for a lock and might issue > a delegation at a point where no lock is held, but one > will be acquired soon, forcing the delegation recall. > (It's much easier to not issue a delegation than recall > one.) > Once this flag is set, I think it would be ok if the flag > remains set until the vnode is recycled, since it seems > fairly likely that, once byte range locking is done on a > file, more will happen. 
> (If people were agreeable to the vnode flag, it looks like > a VV_xxx flag would make more sense than a VI_xxx one. I > think an atomic_set_int() would be sufficient to set it, > even though the vnode lock isn't held?) You have to hold the vnode lock to set a VV flag always. Even if you do an atomic operation to set your flag, another thread might be setting a flag at the same time using non-atomic ops and could clobber your change (if it does a read-modify-write and reads a value that pre-dates your atomic_set_int() but its write posts after your write). -- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 18:33:58 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BC683106566B for ; Mon, 13 Apr 2009 18:33:58 +0000 (UTC) (envelope-from BATV+13135baec0e70ef2caf4+2059+infradead.org+hch@bombadil.srs.infradead.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2001:4830:2446:ff00:214:51ff:fe65:c65c]) by mx1.freebsd.org (Postfix) with ESMTP id 653D98FC29 for ; Mon, 13 Apr 2009 18:33:58 +0000 (UTC) (envelope-from BATV+13135baec0e70ef2caf4+2059+infradead.org+hch@bombadil.srs.infradead.org) Received: from hch by bombadil.infradead.org with local (Exim 4.69 #1 (Red Hat Linux)) id 1LtQyd-0008Tf-IC; Mon, 13 Apr 2009 18:33:51 +0000 Date: Mon, 13 Apr 2009 14:33:51 -0400 From: Christoph Hellwig To: Rick Macklem Message-ID: <20090413183351.GA27610@infradead.org> References: <20090413193936.A52183@delplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) X-SRS-Rewrite: SMTP reverse-path rewritten from by bombadil.infradead.org See http://www.infradead.org/rpr.html Cc: freebsd-fs@freebsd.org Subject: Re: changing semantics of the va_filerev (code review) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Apr 2009 18:33:59 -0000 On Mon, Apr 13, 2009 at 11:13:33AM -0400, Rick Macklem wrote: > If a file system can't support it correctly, faking it with something > like modify time is about all you can do. Since Change is supposed to > change on every file modification, this fails when multiple changes > occur within the same tod clock time or the clock gets reset backwards, > as you noted below. (Linux uses a modify time with a 1sec clock > resolution for Change, which isn't correct and the Linux nfs server > folks know that. Since this breaks the AIX nfsv4 client, the AIX folks > tend to remind them:-) Linux uses whatever granularity the underlying filesystems support. For a lot of old designs that may be 1 second; for most recent filesystems it's better. > Yep, that's why ctime/mtime aren't sufficient. > If a read/write file system doesn't have support for it, all you > can do is fake it and hope the client works ok. I suspect the Linux > folks will eventually start to add support for it to ext3fs etc, because > of the above, but who knows. It seems that FreeBSD mostly uses FFS and > ZFS (which should have support for it, since the Solaris folks are into > NFSv4?), so at least we should be able to make those work correctly. Linux already has the changecount in ext4 but the NFS server doesn't yet use it. Also it's being implemented for XFS and others.
From owner-freebsd-fs@FreeBSD.ORG Mon Apr 13 18:37:36 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C8CF5106566C; Mon, 13 Apr 2009 18:37:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from acadia.cs.uoguelph.ca (acadia.cs.uoguelph.ca [131.104.94.221]) by mx1.freebsd.org (Postfix) with ESMTP id 6DFD48FC08; Mon, 13 Apr 2009 18:37:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by acadia.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id n3DIbZxQ031333; Mon, 13 Apr 2009 14:37:35 -0400 Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id n3DIi1n06133; Mon, 13 Apr 2009 14:44:02 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Mon, 13 Apr 2009 14:44:01 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: John Baldwin In-Reply-To: <200904131146.21640.jhb@freebsd.org> Message-ID: References: <200904131146.21640.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.63 on 131.104.94.221 Cc: freebsd-fs@freebsd.org Subject: Re: integrating nfsv4 locking with nlm and local locking X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Apr 2009 18:37:37 -0000 On Mon, 13 Apr 2009, John Baldwin wrote: > > You have to hold the vnode lock to set a VV flag always. 
Even if you do an > atomic operation to set your flag, another thread might be setting a flag at > the same time using non-atomic ops and could clobber your change (if it does > a read-modify-write and reads a value that pre-dates your atomic_set_int() > but its write posts after your write). > Righto, thanks. (I should have realized that.) I guess I'll have to use a VI_xxx flag or add a field to the vnode to make the scheme work. I am just trying to come up with a stopgap solution until something more comprehensive can be done w.r.t. handling delegations. VI_xxx are currently used for handling the vnode and it doesn't seem appropriate to add one of these to indicate "don't issue delegations". How do others feel w.r.t. adding a VI_xxx flag vs adding v_disabledelegate to the structure? There is always the fallback position of shipping an nfsv4 server with delegations disabled, until handling them when local VOPs are done, gets resolved. rick From owner-freebsd-fs@FreeBSD.ORG Tue Apr 14 10:08:56 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8B0C110656F0 for ; Tue, 14 Apr 2009 10:08:56 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail10.syd.optusnet.com.au (mail10.syd.optusnet.com.au [211.29.132.191]) by mx1.freebsd.org (Postfix) with ESMTP id 23D528FC08 for ; Tue, 14 Apr 2009 10:08:55 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c122-107-120-227.carlnfd1.nsw.optusnet.com.au (c122-107-120-227.carlnfd1.nsw.optusnet.com.au [122.107.120.227]) by mail10.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n3EA8pmG014702 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 14 Apr 2009 20:08:53 +1000 Date: Tue, 14 Apr 2009 20:08:51 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Rick Macklem In-Reply-To: Message-ID: <20090414180826.J53102@delplex.bde.org> References: 
<20090413193936.A52183@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: changing semantics of the va_filerev (code review) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Apr 2009 10:08:57 -0000 On Mon, 13 Apr 2009, Rick Macklem wrote: > On Mon, 13 Apr 2009, Bruce Evans wrote: >> va_gen/i_gen/di_gen/st_gen seems to be even more suitable for this >> purpose, but it isn't actually a file generation number like its >> comments say (it is normally set to a random value on file creation >> then never changed) and it is exposed to userland (st_gen). >> > i_gen is used by NFS to create T-stable (valid for a long time, including > a long time after the file is removed) file handles. It is used by > ffs_vptofh() to create the file handles for NFS that are recognized as > representing removed files, even after an i-node gets reused such that > the i-node number now represents another file. Oops, I missed that since nfs's use of i_gen is indirect. What does nfs do for file systems that don't detect removed files, e.g., msdosfs. vptofh and fhtovp routines seem to have too many differences. E.g., file systems based on ffs return ESTALE for removed files, but zfs_fhtovp() returns EINVAL. I just noticed than the increment of i_gen was slightly broken for ffs by a type mismatch in ffs2 (affects ffs1 too). Originally, i_gen had the same type as di_gen (int32_t). Now i_gen has type int64_t but in ffs1, di_gen of course still has type int32_t, and in ffs2, di_gen still has type int32_t (apparently there was insufficient space to expand it). This makes the overflow check in ffs_alloc.c (++ip->i_gen == 0) more broken than before. 
Previously it only gave undefined behaviour followed by a bogus check when overflow occurs for incrementing from INT32_T_MAX. Now it has no effect, since it takes 293 years of incrementing at a rate of 1GHz to reach overflow at INT64_T_MAX. Overflow now occurs on assignment to di_gen. The result of this bug is almost the same as removing the silly part of the security code -- the re-randomization on overflow. i_gen may grow larger than UINT32_T_MAX, but usually refresh from the dinode will keep it smaller. When it starts near UINT32_T_MAX and grows larger, the overflow on assignment and a subsequent refresh will make it nearly 0. Except, in 1 in every 2**32 cases, when the overflow makes di_gen exactly 0, the subsequent refresh will randomize i_gen. >> va_ctime should give what you want for all file systems, since it >> should be increased whenever anything changes. However, most file > There are some places where IN_UPDATE gets set, but IN_CHANGE doesn't. Are there? This would be a bug. I checked that ffs doesn't have this bug. > Since the Change attribute must change for every file modification, I > feel safer incrementing it for both IN_UPDATE and IN_CHANGE. (It's 64bits, > so it won't wrap around for a little while.) It would be a large and obvious bug to modify the file data (IN_UPDATE) without setting IN_CHANGE. >> systems always set the nsec part to 0, so va_ctime doesn't track >> all file changes. This is a problem for things like make(1) too, >> so if nsec timestamps aren't available or take too long or are >> not fine-grained enough, the nsec part should be abused as a generation >> counter so that any change gives a strictly larger timestamp. The >> case where someone sets the clock backwards is broken but won't >> happen often. >> >> Many nonstandard file systems, e.g., msdosfs, have >> no space for an on-disk ctime, so they fake va_ctime using an on-disk >> mtime.
Since such file systems don't have many attributes, only a >> few more cases are broken. >> > Yep, that's why ctime/mtime aren't sufficient. > If a read/write file system doesn't have support for it, all you > can do is fake it and hope the client works ok. I suspect the Linux They need to be fixed or faked well enough for make(1) too. When the dinode has no space to spare, something can be done by keeping state in the inode or vnode. This won't work across reboots of course (except by hashing a reboot counter into the generation counts or timestamps) but might be enough for all short-term uses. I'm not sure how much is safe here. Bruce From owner-freebsd-fs@FreeBSD.ORG Tue Apr 14 15:52:35 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 056D41065670 for ; Tue, 14 Apr 2009 15:52:35 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from acadia.cs.uoguelph.ca (acadia.cs.uoguelph.ca [131.104.94.221]) by mx1.freebsd.org (Postfix) with ESMTP id B63678FC12 for ; Tue, 14 Apr 2009 15:52:34 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.91.102]) by acadia.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id n3EFqXxR008878; Tue, 14 Apr 2009 11:52:33 -0400 Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id n3EFx2S20621; Tue, 14 Apr 2009 11:59:02 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Tue, 14 Apr 2009 11:59:02 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher.cs.uoguelph.ca To: Bruce Evans In-Reply-To: <20090414180826.J53102@delplex.bde.org> Message-ID: References: <20090413193936.A52183@delplex.bde.org> <20090414180826.J53102@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.63 on 
131.104.94.221 Cc: freebsd-fs@freebsd.org Subject: Re: changing semantics of the va_filerev (code review) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Apr 2009 15:52:35 -0000 On Tue, 14 Apr 2009, Bruce Evans wrote: [stuff snipped] > > Oops, I missed that since nfs's use of i_gen is indirect. What does > nfs do for file systems that don't detect removed files, e.g., msdosfs. > vptofh and fhtovp routines seem to have too many differences. E.g., An nfs client can always think that a file exists for a short period of time (until client side caches time out) after it has been removed locally or by another client, on the server. The more serious failure occurs when the i-node/directory entry gets reallocated. At that point, the client might access the attributes/data of the new file, thinking it was the old file. (In the worst case, this could persist until the client does a umount() of the file system.) However, typically, unless it has the file open when the file is removed locally on the server or by another client, nothing nasty will happen. (And I think if the client has name caching disabled, nothing nasty can happen.) At least, that's my best guess at an answer. > file systems based on ffs return ESTALE for removed files, but > zfs_fhtovp() returns EINVAL. > I don't know why zfs would choose a different errno, but I don't think that a different errno will have much effect. It's a terminal error in either case. (I can't think of anything clever that a client can do for ESTALE. I wouldn't be surprised if some clients end up translating ESTALE to EINVAL, since POSIX apps don't expect ESTALE.) I suppose someone could argue it violates the RFC, but only if they know that the server should generate NFS3ERR_STALE for that case. 
> I just noticed than the increment of i_gen was slightly broken for ffs > by a type mismatch in ffs2 (affects ffs1 too). Originally, i_gen had > the same type as di_gen (int32_t). Now i_gen has type int64_t but in > ffs1, di_gen of course still has type int32_t, and in ffs2, di_gen > still has type int32_t (apparently there was insufficient space to > expand it). This makes the overflow check in ffs_alloc.c (++ip->i_gen > == 0) more broken than before. Previously it only gave undefined > behaviour followed by a bogus check when overflow occurs for incrementing > from INT32_T_MAX. Now it has no effect, since it takes 293 years of > incrementing at a rate of 1GHz to reach overflow at INT64_T_MAX. > Overflow now occurs on assignment to di_gen. > > The result of this bug is almost the the same as removing the silly > part of the security code -- the re-randomization on overflow. i_gen > may grow larger than UINT32_T_MAX, but usually refresh from the dinode > will keep it smaller. When it starts near UINT32_T_MAX and grows > larger, the overflow on assignment and a subsequent refresh will make > it nearly 0. Except, in 1 in every 2**32 cases, when the overflow makes > di_gen exactly 0, the subsequent refresh will randomize i_gen. > Sounds like you have a better understanding of this than I. Since all nfs really cares about is that the value of i_gen has changed after the i-node is re-allocated, I doubt this causes grief in practice. Personally, I'd just leave it as a 32bit number and initialize it to some pseudo-random value in a range that is a small fraction of UINT32_T_MAX (maybe 1<->1000000) if it is 0, otherwise just increment it by a small value. (I've already noted that I'm not a big fan of security by obscurity anyhow:-) >>> va_ctime should give what you want for all file systems, since it >>> should be increased whenever anything changes. However, most file >> There are some places where IN_UPDATE gets set, but IN_CHANGE doesn't. > > Are there? 
This would be a bug. I checked that ffs doesn't have this > bug. > Oops, my mistake. I grep'd again and see it is IN_CHANGE that gets set without IN_UPDATE and not the other way around, which makes sense, since I can't think of how you can modify the data without modifying some attribute. So, the Change attribute only needs to change for IN_CHANGE (with all those uses of "change", it must be good:-). Thanks for pointing this out. > > They need to be fixed or faked well enough for make(1) too. > > When the dinode has no space to spare, something can be done by keeping > state in the inode or vnode. This won't work across reboots of course > (except by hashing a reboot counter into the generation counts or > timestamps) but might be enough for all short-term uses. I'm not sure > how much is safe here. > Yes, definitely. I think doing something like having an in-memory field for va_filerev/i_modrev where the high order bits are initialized by ctime (using whatever bits are valid, given tod clock resolution) when read in and then incrementing by 1 for each change, would be a good compromise. 
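The in-memory scheme described above can be sketched roughly as follows. This is only an illustration: struct modrev_state, the function names and the choice of 24 low-order counter bits are all assumptions and do not come from the FreeBSD source.

```c
#include <stdint.h>

/* Hypothetical in-memory state kept alongside the inode or vnode. */
struct modrev_state {
	uint64_t modrev;	/* value that would be reported as va_filerev */
};

/* Assumed split: low 24 bits count changes, high bits come from ctime. */
#define MODREV_COUNTER_BITS	24

/* Seed the value when the inode is read in from disk. */
static void
modrev_init(struct modrev_state *st, int64_t ctime_sec)
{
	st->modrev = (uint64_t)ctime_sec << MODREV_COUNTER_BITS;
}

/* Bump the value by 1 for each change (each time IN_CHANGE is set). */
static uint64_t
modrev_bump(struct modrev_state *st)
{
	return (++st->modrev);
}
```

One limitation of seeding from whole seconds: if the inode is reloaded and ctime has not advanced since the previous load, the seed repeats, so values handed out earlier can be seen again. Using more of the valid ctime bits would narrow that window.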
rick From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 11:47:54 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2ECDD106566B; Thu, 16 Apr 2009 11:47:54 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 049A68FC18; Thu, 16 Apr 2009 11:47:54 +0000 (UTC) (envelope-from gavin@FreeBSD.org) Received: from freefall.freebsd.org (gavin@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3GBlrWH082164; Thu, 16 Apr 2009 11:47:53 GMT (envelope-from gavin@freefall.freebsd.org) Received: (from gavin@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3GBlr9C082160; Thu, 16 Apr 2009 11:47:53 GMT (envelope-from gavin) Date: Thu, 16 Apr 2009 11:47:53 GMT Message-Id: <200904161147.n3GBlr9C082160@freefall.freebsd.org> To: gavin@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: gavin@FreeBSD.org Cc: Subject: Re: kern/65920: [nwfs] Mounted Netware filesystem behaves strange X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 11:47:54 -0000 Synopsis: [nwfs] Mounted Netware filesystem behaves strange Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: gavin Responsible-Changed-When: Thu Apr 16 11:47:06 UTC 2009 Responsible-Changed-Why: Over to maintainer(s) http://www.freebsd.org/cgi/query-pr.cgi?pr=65920 From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 12:34:49 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FF92106586C for ; Thu, 16 Apr 2009 12:34:49 +0000 (UTC) (envelope-from 
roberto@keltia.freenix.fr) Received: from keltia.freenix.fr (keltia.freenix.org [IPv6:2001:660:330f:f820:213:72ff:fe15:f44]) by mx1.freebsd.org (Postfix) with ESMTP id D73C78FC1C for ; Thu, 16 Apr 2009 12:34:48 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from localhost (localhost [127.0.0.1]) by keltia.freenix.fr (Postfix/TLS) with ESMTP id A6A243BD92 for ; Thu, 16 Apr 2009 14:34:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at keltia.freenix.fr Received: from keltia.freenix.fr ([127.0.0.1]) by localhost (keltia.freenix.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 61TbHhkGpwDL for ; Thu, 16 Apr 2009 14:34:47 +0200 (CEST) Received: by keltia.freenix.fr (Postfix/TLS, from userid 101) id 4A5213BC9C; Thu, 16 Apr 2009 14:34:47 +0200 (CEST) Date: Thu, 16 Apr 2009 14:34:47 +0200 From: Ollivier Robert To: freebsd-fs@freebsd.org Message-ID: <20090416123447.GB96263@keltia.freenix.fr> References: <49E16021.6040900@jrv.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <49E16021.6040900@jrv.org> X-Operating-System: MacOS X / Macbook Pro - FreeBSD 7 / Dell D820 SMP User-Agent: Mutt/1.5.19 (2009-01-05) Subject: Re: turning off ZFS mountpoint property behavior? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 12:34:54 -0000 According to James R. Van Artsdalen: > Unfortunately when zfs recv runs and it receive a filesystem with > property mountpoint=/usr it mounts that filesystem there. That's not > desirable in my situation nor I suspect many others. > > Is there a sysctl or some other way to disable the automatic mount behavior? Have you tried to use legacy? zfs set mountpoint=legacy tank/usr -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! 
-=- roberto@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/ From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 16:01:30 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 65C30106566C for ; Thu, 16 Apr 2009 16:01:30 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from keltia.freenix.fr (keltia.freenix.org [IPv6:2001:660:330f:f820:213:72ff:fe15:f44]) by mx1.freebsd.org (Postfix) with ESMTP id 03DAF8FC08 for ; Thu, 16 Apr 2009 16:01:30 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from localhost (localhost [127.0.0.1]) by keltia.freenix.fr (Postfix/TLS) with ESMTP id C6A3F3BDCA for ; Thu, 16 Apr 2009 18:01:28 +0200 (CEST) X-Virus-Scanned: amavisd-new at keltia.freenix.fr Received: from keltia.freenix.fr ([127.0.0.1]) by localhost (keltia.freenix.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id oanFZnedmVjU for ; Thu, 16 Apr 2009 18:01:28 +0200 (CEST) Received: by keltia.freenix.fr (Postfix/TLS, from userid 101) id 6FF7D3BDC7; Thu, 16 Apr 2009 18:01:28 +0200 (CEST) Date: Thu, 16 Apr 2009 18:01:28 +0200 From: Ollivier Robert To: freebsd-fs@freebsd.org Message-ID: <20090416160128.GA831@keltia.freenix.fr> References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Operating-System: MacOS X / Macbook Pro - FreeBSD 7 / Dell D820 SMP User-Agent: Mutt/1.5.19 (2009-01-05) Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 16:01:30 -0000 According to Stefan Bethke: > Created a GPT label and one partition on each of the three drives: > > gpart create -s gpt $1 > gpart add -b 34 -s 128 -t freebsd-boot $1 > 
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $1 > gpart add -b 512 -s 41900000 -t freebsd-zfs $1 > gpart list $1 Coming back to this thread, I'm playing with this setup (and the script mentioned in another thread). When I try to zpool set bootfs=tank with tank containing a raidz array, zpool refuses to set the property, saying it is not available. Using the same commandline on a mirror works. -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/ From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 16:40:56 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F2481065849 for ; Thu, 16 Apr 2009 16:40:56 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from keltia.freenix.fr (keltia.freenix.org [IPv6:2001:660:330f:f820:213:72ff:fe15:f44]) by mx1.freebsd.org (Postfix) with ESMTP id BF3AC8FC1E for ; Thu, 16 Apr 2009 16:40:55 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from localhost (localhost [127.0.0.1]) by keltia.freenix.fr (Postfix/TLS) with ESMTP id 57ADF3BDCA for ; Thu, 16 Apr 2009 18:40:54 +0200 (CEST) X-Virus-Scanned: amavisd-new at keltia.freenix.fr Received: from keltia.freenix.fr ([127.0.0.1]) by localhost (keltia.freenix.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id E5BPciY3aMAm for ; Thu, 16 Apr 2009 18:40:53 +0200 (CEST) Received: by keltia.freenix.fr (Postfix/TLS, from userid 101) id E013C3BDC7; Thu, 16 Apr 2009 18:40:53 +0200 (CEST) Date: Thu, 16 Apr 2009 18:40:53 +0200 From: Ollivier Robert To: freebsd-fs@freebsd.org Message-ID: <20090416164053.GA80978@keltia.freenix.fr> References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090416160128.GA831@keltia.freenix.fr> 
X-Operating-System: MacOS X / Macbook Pro - FreeBSD 7 / Dell D820 SMP User-Agent: Mutt/1.5.19 (2009-01-05) Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 16:40:57 -0000 According to Ollivier Robert: > with tank containing a raidz array, zpool refuses to set the property, > saying it is not available. Using the same commandline on a mirror works. BTW all messages I've found on this subject assume (like the script does) that one can do installworld/installkernel. I can setup the whole gpt thing from livefs, even extracting all dists on the newly zfs pool manually by playing with livefs/dvd1 but it can not boot afterwards because / can not be found. I must have missed something... I long for pcbsd setup with zfs support in fact I think :( -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/ From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 17:09:44 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 60C2F106566C for ; Thu, 16 Apr 2009 17:09:44 +0000 (UTC) (envelope-from hartzell@alerce.com) Received: from merlin.alerce.com (merlin.alerce.com [64.62.142.94]) by mx1.freebsd.org (Postfix) with ESMTP id 4BC2E8FC08 for ; Thu, 16 Apr 2009 17:09:44 +0000 (UTC) (envelope-from hartzell@alerce.com) Received: from merlin.alerce.com (localhost [127.0.0.1]) by merlin.alerce.com (Postfix) with ESMTP id B189033C62; Thu, 16 Apr 2009 09:52:16 -0700 (PDT) Received: from merlin.alerce.com (localhost [127.0.0.1]) by merlin.alerce.com (Postfix) with ESMTP id 675C233C5B; Thu, 16 Apr 2009 09:52:16 -0700 (PDT) From: George Hartzell MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Transfer-Encoding: 7bit Message-ID: <18919.25164.567669.809759@already.local> Date: Thu, 16 Apr 2009 09:52:28 -0700 To: Ollivier Robert In-Reply-To: <20090416160128.GA831@keltia.freenix.fr> References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> X-Mailer: VM 8.0.12 under 22.3.1 (i386-apple-darwin9.6.0) X-Virus-Scanned: ClamAV using ClamSMTP Cc: freebsd-fs@freebsd.org Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: hartzell@alerce.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 17:09:44 -0000 Ollivier Robert writes: > According to Stefan Bethke: > > Created a GPT label and one partition on each of the three drives: > > > > gpart create -s gpt $1 > > gpart add -b 34 -s 128 -t freebsd-boot $1 > > gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $1 > > gpart add -b 512 -s 41900000 -t freebsd-zfs $1 > > gpart list $1 > > Coming back to this thread, I'm playing with this setup (and the script > mentioned in another thread). When I try to > > zpool set bootfs=tank > > with tank containing a raidz array, zpool refuses to set the property, > saying it is not available. Using the same commandline on a mirror works. In Doug's original email announcing raidz boot support, http://kerneltrap.org/mailarchive/freebsd-fs/2008/12/17/4441084 he says: Currently the ZFS kernel code refuses to allow you to set the bootfs pool property on raidz pools (because Solaris can't boot from them). This means that you are limited to booting from the root filesystem of the pool for now (it shouldn't be hard to relax this restriction). The root filesystem of the pool should contain a directory /boot with the usual contents which must include a /boot/loader which was built with the 'LOADER_ZFS_SUPPORT' make option. 
Which just means that you need a populated boot directory at the top of the tank (e.g. /data/boot). If you're using the create-zfsboot-gpt.sh file that was posted here recently, you'll need to rework it a bit, since it puts the root dir at /data/ROOT/data. g. From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 18:25:43 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 99889106566B for ; Thu, 16 Apr 2009 18:25:43 +0000 (UTC) (envelope-from nhoyle@hoyletech.com) Received: from mout.perfora.net (mout.perfora.net [74.208.4.195]) by mx1.freebsd.org (Postfix) with ESMTP id 654A58FC16 for ; Thu, 16 Apr 2009 18:25:43 +0000 (UTC) (envelope-from nhoyle@hoyletech.com) Received: from [127.0.0.1] (pool-96-241-114-53.washdc.fios.verizon.net [96.241.114.53]) by mrelay.perfora.net (node=mrus0) with ESMTP (Nemesis) id 0MKp8S-1LuW4B1HnZ-000g1t; Thu, 16 Apr 2009 14:12:07 -0400 Message-ID: <49E774F0.1020706@hoyletech.com> Date: Thu, 16 Apr 2009 14:12:00 -0400 From: Nathanael Hoyle User-Agent: Thunderbird 2.0.0.21 (Windows/20090302) MIME-Version: 1.0 To: Ollivier Robert , freebsd-fs@freebsd.org References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> <20090416164053.GA80978@keltia.freenix.fr> In-Reply-To: <20090416164053.GA80978@keltia.freenix.fr> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Provags-ID: V01U2FsdGVkX19Fq8IE/aaHBE0f98eeO6zvpqFCnNfb2MAeXps gQhuvd1jN621wnaltoaokro4LwKEF6o/SkpRgjoSwocFfIRDqf 5rlaSZ83Y+cvoP3AUgqolAr5JcpdjEy Cc: Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 18:25:43 -0000 Ollivier Robert wrote: > According to Ollivier Robert: > >> with tank containing a 
raidz array, zpool refuses to set the property, >> saying it is not available. Using the same commandline on a mirror works. >> > > BTW all messages I've found on this subject assume (like the script does) > that one can do installworld/installkernel. > > I can setup the whole gpt thing from livefs, even extracting all dists on > the newly zfs pool manually by playing with livefs/dvd1 but it can not boot > afterwards because / can not be found. > > I must have missed something... I long for pcbsd setup with zfs support in > fact I think :( > To my knowledge, RAID-Z root (boot) pools are not supported. I know that this is true for upstream (Solaris) ZFS, and unless the FreeBSD folks implemented it when I wasn't looking, you can't do it on FreeBSD either. I believe the current implementation essentially reads "through" the mirror structure on a mirrored device and can find all of the data by "dumb" sequential reads on the first disk, just as it would with unpooled disks. In the case of RAID-Z the boot loader would have to be far more intelligent in locating where to read the next block from. It is my understanding that this is a planned future improvement (at least for upstream) but haven't heard any update on it in a while. 
-Nathanael From owner-freebsd-fs@FreeBSD.ORG Thu Apr 16 18:28:10 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DCA991065700 for ; Thu, 16 Apr 2009 18:28:10 +0000 (UTC) (envelope-from nhoyle@hoyletech.com) Received: from mout.perfora.net (mout.perfora.net [74.208.4.195]) by mx1.freebsd.org (Postfix) with ESMTP id A8C028FC1D for ; Thu, 16 Apr 2009 18:28:10 +0000 (UTC) (envelope-from nhoyle@hoyletech.com) Received: from [127.0.0.1] (pool-96-241-114-53.washdc.fios.verizon.net [96.241.114.53]) by mrelay.perfora.net (node=mrus1) with ESMTP (Nemesis) id 0MKpCa-1LuW6f3orG-000coF; Thu, 16 Apr 2009 14:14:41 -0400 Message-ID: <49E7758A.30400@hoyletech.com> Date: Thu, 16 Apr 2009 14:14:34 -0400 From: Nathanael Hoyle User-Agent: Thunderbird 2.0.0.21 (Windows/20090302) MIME-Version: 1.0 To: Ollivier Robert References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> <20090416164053.GA80978@keltia.freenix.fr> In-Reply-To: <20090416164053.GA80978@keltia.freenix.fr> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Provags-ID: V01U2FsdGVkX193hwmvpiZp0Tmu+oFrAxi/ik7K4s0uE6yqepX m6HskAiAaWoZCH4jL2bwiMVd3pZpL8Pn/Gsq09cJqn92Ssbw3E dBkqicQQg7d6I1YA5qRT9O1JeATJS4P Cc: freebsd-fs@freebsd.org Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Apr 2009 18:28:11 -0000 Ollivier Robert wrote: > According to Ollivier Robert: > >> with tank containing a raidz array, zpool refuses to set the property, >> saying it is not available. Using the same commandline on a mirror works. >> > > BTW all messages I've found on this subject assume (like the script does) > that one can do installworld/installkernel. 
> > I can setup the whole gpt thing from livefs, even extracting all dists on > the newly zfs pool manually by playing with livefs/dvd1 but it can not boot > afterwards because / can not be found. > > I must have missed something... I long for pcbsd setup with zfs support in > fact I think :( > Ok, I screwed up. Not on my usual workstation and my email client mis-threaded discussions. I now realize you were referring to the experimental capabilities that Doug has been working on; my apologies for jumping the gun with the "can't do that" response. -Nathanael From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 13:06:11 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66B3F1065674; Fri, 17 Apr 2009 13:06:11 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from redbull.bpaserver.net (redbullneu.bpaserver.net [213.198.78.217]) by mx1.freebsd.org (Postfix) with ESMTP id 18AD98FC08; Fri, 17 Apr 2009 13:06:11 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from outgoing.leidinger.net (pD9E2CE6F.dip.t-dialin.net [217.226.206.111]) by redbull.bpaserver.net (Postfix) with ESMTP id 946162E1FE; Fri, 17 Apr 2009 14:50:33 +0200 (CEST) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 37FE4C45FA; Fri, 17 Apr 2009 14:50:27 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1239972628; bh=QtyPvoJKtsu/Ftw9hNgW3aVEbmjdcncPb azcBUlFnXs=; h=Message-ID:Date:From:To:Cc:Subject:MIME-Version: Content-Type:Content-Transfer-Encoding; b=3gXL7MGtITMJxpMCu49Q7mIb o+PY0HIGf7Ev4BQo51P1OvLWeRAQDheDFbAb8sgdGOke/ewwKMyvBb8gg9ptaK1Z5tG 2+DQQgcDtH4LXI+yjtw2DyfNc+F4mCDhPbZNHB3zAzI3j4iyaD5mVDwdQStu2dcCrV5 Oc6XR3gH3URHK+hpMR5bvD1E3Y/mDGJVEGThcBxuffoqVEC5zzCDpbpI4oVydBf4sbV k8r4bZfSn7HPVvBdeSMsfD5PMwMSRgOByhHwppCkLLiN2pfO9womm5qXihe2H+05Vyc 
xK6fDdM9HS3FduB8TsYiYCyUf9cDt892pJuUu4RuL0VRAbw5bg== Received: (from www@localhost) by webmail.leidinger.net (8.14.3/8.13.8/Submit) id n3HCoP1o013531; Fri, 17 Apr 2009 14:50:25 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.cec.eu.int (pslux.cec.eu.int [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Fri, 17 Apr 2009 14:50:24 +0200 Message-ID: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> X-Priority: 3 (Normal) Date: Fri, 17 Apr 2009 14:50:24 +0200 From: Alexander Leidinger To: current@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Internet Messaging Program (IMP) H3 (4.3) / FreeBSD-8.0 X-BPAnet-MailScanner-Information: Please contact the ISP for more information X-MailScanner-ID: 946162E1FE.1BFC8 X-BPAnet-MailScanner: Found to be clean X-BPAnet-MailScanner-SpamCheck: not spam, ORDB-RBL, SpamAssassin (not cached, score=-14.9, required 6, BAYES_00 -15.00, DKIM_SIGNED 0.00, DKIM_VERIFIED -0.00, RDNS_DYNAMIC 0.10) X-BPAnet-MailScanner-From: alexander@leidinger.net X-Spam-Status: No Cc: fs@freebsd.org Subject: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 13:06:11 -0000 Hi, to fs@, please CC me, as I'm not subscribed. For a while I monitored (by hand) the sysctls kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size. Both grow way higher (at some point I've seen more than 500M) than what I have configured in vfs.zfs.arc_max (40M). After a while FS operations (e.g. pkgdb -F with about 900 packages...
my specific workload is the fixup of gnome packages after the removal of the obsolete libusb port) get very slow (in my specific example I let the pkgdb run several times overnight and it still is not finished). The big problem with this is that at some point in time the machine reboots (panic, page fault, page not present, during a fork1). I have the impression (beware, I have a watchdog configured, as I don't know if a triggered WD would cause the same panic, the following is just a guess) that I run out of memory of some kind (I have 1G RAM, i386, max kmem size 700M). I restarted pkgdb several times after a reboot, and it continues to process the libusb removal, but hey, this is annoying. Does someone see something similar to what I describe (mainly the growth of the arc cache way beyond what is configured)? Anyone with some ideas what to try? Bye, Alexander. -- When you go out to buy, don't show your silver. http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 13:14:46 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ADE561065670 for ; Fri, 17 Apr 2009 13:14:46 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from keltia.freenix.fr (keltia.freenix.org [IPv6:2001:660:330f:f820:213:72ff:fe15:f44]) by mx1.freebsd.org (Postfix) with ESMTP id 5AFEE8FC25 for ; Fri, 17 Apr 2009 13:14:46 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from localhost (localhost [127.0.0.1]) by keltia.freenix.fr (Postfix/TLS) with ESMTP id 49D223BDC6; Fri, 17 Apr 2009 15:14:44 +0200 (CEST) X-Virus-Scanned: amavisd-new at keltia.freenix.fr Received: from keltia.freenix.fr ([127.0.0.1]) by localhost (keltia.freenix.fr [127.0.0.1]) (amavisd-new, port 10024) 
with ESMTP id pZc4x4V3PJ7j; Fri, 17 Apr 2009 15:14:43 +0200 (CEST) Received: by keltia.freenix.fr (Postfix/TLS, from userid 101) id C2A483BD9D; Fri, 17 Apr 2009 15:14:43 +0200 (CEST) Date: Fri, 17 Apr 2009 15:14:43 +0200 From: Ollivier Robert To: George Hartzell Message-ID: <20090417131443.GD96263@keltia.freenix.fr> References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> <18919.25164.567669.809759@already.local> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <18919.25164.567669.809759@already.local> X-Operating-System: MacOS X / Macbook Pro - FreeBSD 7 / Dell D820 SMP User-Agent: Mutt/1.5.19 (2009-01-05) Cc: freebsd-fs@freebsd.org Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 13:14:47 -0000 According to George Hartzell: > Which jsut means that you need a populated boot directory at the top > of the tank (e.g. /data/boot). If you're using the > create-zfsboot-gpt.sh file that was posted here recently, you'll need > to rework it a bit, since it puts the root dir at /data/ROOT/data. OK, following this, I managed the boot code to find loader & loader.conf. It stops when it can't find the root I want it to boot from though. The ? prompt shows me all devices (da{0,1,2}, da{0,1,2}p{1,2} and label/swap) but trying to use zfs:whatever does not seem to work. loader.conf is very small: ----- zfs_load="YES" geom_label_load="YES" vfs.root.mountfrom="zfs:tank/ROOT/tank" ----- I did zfs set mountpoint=/tank/ROOT/tank tank/ROOT/tank (aka the real root) the other fs are in their usual place zfs set mountpoint=/usr tank/usr zfs set mountpoint=/var tank/var Any other ideas. I'll try to summarize here and on the wiki when I'm done. -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! 
-=- roberto@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/ From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 13:47:25 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7FCF8106564A for ; Fri, 17 Apr 2009 13:47:25 +0000 (UTC) (envelope-from hartzell@alerce.com) Received: from merlin.alerce.com (merlin.alerce.com [64.62.142.94]) by mx1.freebsd.org (Postfix) with ESMTP id 683628FC13 for ; Fri, 17 Apr 2009 13:47:25 +0000 (UTC) (envelope-from hartzell@alerce.com) Received: from merlin.alerce.com (localhost [127.0.0.1]) by merlin.alerce.com (Postfix) with ESMTP id E997533C62; Fri, 17 Apr 2009 06:47:24 -0700 (PDT) Received: from merlin.alerce.com (localhost [127.0.0.1]) by merlin.alerce.com (Postfix) with ESMTP id 5869E33C5B; Fri, 17 Apr 2009 06:47:24 -0700 (PDT) From: George Hartzell MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <18920.34924.2076.295983@already.local> Date: Fri, 17 Apr 2009 06:47:24 -0700 To: Ollivier Robert In-Reply-To: <20090417131443.GD96263@keltia.freenix.fr> References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> <18919.25164.567669.809759@already.local> <20090417131443.GD96263@keltia.freenix.fr> X-Mailer: VM 8.0.12 under 22.3.1 (i386-apple-darwin9.6.0) X-Virus-Scanned: ClamAV using ClamSMTP Cc: freebsd-fs@freebsd.org Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: hartzell@alerce.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 13:47:25 -0000 Ollivier Robert writes: > According to George Hartzell: > > Which jsut means that you need a populated boot directory at the top > > of the tank (e.g. /data/boot). 
If you're using the > > create-zfsboot-gpt.sh file that was posted here recently, you'll need > > to rework it a bit, since it puts the root dir at /data/ROOT/data. > > OK, following this, I managed the boot code to find loader & loader.conf. > It stops when it can't find the root I want it to boot from though. > > The ? prompt shows me all devices (da{0,1,2}, da{0,1,2}p{1,2} and > label/swap) but trying to use zfs:whatever does not seem to work. > > loader.conf is very small: > ----- > zfs_load="YES" > geom_label_load="YES" > vfs.root.mountfrom="zfs:tank/ROOT/tank" > ----- > > I did > zfs set mountpoint=/tank/ROOT/tank tank/ROOT/tank (aka the real root) > > the other fs are in their usual place > zfs set mountpoint=/usr tank/usr > zfs set mountpoint=/var tank/var > > Any other ideas. I'll try to summarize here and on the wiki when I'm done. Did you build the loader with LOADER_ZFS_SUPPORT=YES enabled? I just threw that line in my /etc/make.conf and rebuilt everything. g. From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 13:57:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 632361065672 for ; Fri, 17 Apr 2009 13:57:39 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from keltia.freenix.fr (keltia.freenix.org [IPv6:2001:660:330f:f820:213:72ff:fe15:f44]) by mx1.freebsd.org (Postfix) with ESMTP id 0F71F8FC13 for ; Fri, 17 Apr 2009 13:57:39 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr) Received: from localhost (localhost [127.0.0.1]) by keltia.freenix.fr (Postfix/TLS) with ESMTP id 280B83BDC7; Fri, 17 Apr 2009 15:57:38 +0200 (CEST) X-Virus-Scanned: amavisd-new at keltia.freenix.fr Received: from keltia.freenix.fr ([127.0.0.1]) by localhost (keltia.freenix.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 4rIfpui42FOn; Fri, 17 Apr 2009 15:57:37 +0200 (CEST) Received: by keltia.freenix.fr (Postfix/TLS, from userid 101) id 
999F93BDC6; Fri, 17 Apr 2009 15:57:37 +0200 (CEST) Date: Fri, 17 Apr 2009 15:57:37 +0200 From: Ollivier Robert To: George Hartzell Message-ID: <20090417135737.GE96263@keltia.freenix.fr> References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> <18919.25164.567669.809759@already.local> <20090417131443.GD96263@keltia.freenix.fr> <18920.34924.2076.295983@already.local> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <18920.34924.2076.295983@already.local> X-Operating-System: MacOS X / Macbook Pro - FreeBSD 7 / Dell D820 SMP User-Agent: Mutt/1.5.19 (2009-01-05) Cc: freebsd-fs@freebsd.org Subject: Re: Booting from ZFS raidz X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 13:57:40 -0000 According to George Hartzell: > Did you build the loader with LOADER_ZFS_SUPPORT=YES enabled? > > I just threw that line in my /etc/make.conf and rebuilt everything. Yes, I even reinstalled the gpart bootcode. -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! 
-=- roberto@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/ From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 14:20:14 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7CD9D1065674; Fri, 17 Apr 2009 14:20:14 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id A6CCB8FC1C; Fri, 17 Apr 2009 14:20:13 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from harkness.in.wanderview.com (harkness.in.wanderview.com [10.76.10.150]) (authenticated bits=0) by mail.wanderview.com (8.14.3/8.14.3) with ESMTP id n3HE4FBX003074 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Fri, 17 Apr 2009 14:04:15 GMT (envelope-from ben@wanderview.com) From: Ben Kelly To: Alexander Leidinger In-Reply-To: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> X-Priority: 3 (Normal) References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> Message-Id: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v930.3) Date: Fri, 17 Apr 2009 10:04:15 -0400 X-Mailer: Apple Mail (2.930.3) X-Spam-Score: -1.44 () ALL_TRUSTED X-Scanned-By: MIMEDefang 2.64 on 10.76.20.1 Cc: current@freebsd.org, fs@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 14:20:15 -0000 On Apr 17, 2009, at 8:50 AM, Alexander Leidinger wrote: > to fs@, please CC me, as I'm not subscribed. > > I monitored (by hand) a while the sysctls > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size. 
> Both grow way higher (at some point I've seen more than 500M) than > what I have configured in vfs.zfs.arc_max (40M). > > After a while FS operations (e.g. pkgdb -F with about 900 > packages... my specific workload is the fixup of gnome packages > after the removal of the obsolete libusb port) get very slow (in my > specific example I let the pkgdb run several times over night and it > still is not finished). > > The big problem with this is, that at some point in time the machine > reboots (panic, page fault, page not present, during a fork1). I > have the impression (beware, I have a watchdog configured, as I > don't know if a triggered WD would cause the same panic, the > following is just a guess) that I run out of memory of some kind (I > have 1G RAM, i386, max kmem size 700M). I restarted pkgdb several > times after a reboot, and it continues to process the libusb > removal, but hey, this is anoying. > > Does someone see something similar to what I describe (mainly the > growth of the arc cache way beyond what is configured)? Anyone with > some ideas what to try? Can you provide the rest of the arcstats from sysctl? Also, does your arc_reclaim_thread process get any cycles when this problem occurs? What happens if you kill the pkgdb -F manually before it completes? Does the arc cache size come back down or is it stuck at the abnormally high level? At first glance it looks like the tunable limits the value of the arc_c target value, but that appears to only be a soft limit. There is code in there to shrink an ARC that has exceeded its arc_c value. It looks like that code is supposed to run from the arc_reclaim_thread. 
- Ben From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 14:36:02 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F089106577B; Fri, 17 Apr 2009 14:36:02 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 120CD8FC13; Fri, 17 Apr 2009 14:36:01 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from cicely5.cicely.de ([10.1.1.7]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id n3HEIKtw047958 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 17 Apr 2009 16:18:20 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by cicely5.cicely.de (8.14.2/8.14.2) with ESMTP id n3HEIHqp018223 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 17 Apr 2009 16:18:17 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id n3HEIH5U015759; Fri, 17 Apr 2009 16:18:17 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id n3HEIHiI015758; Fri, 17 Apr 2009 16:18:17 +0200 (CEST) (envelope-from ticso) Date: Fri, 17 Apr 2009 16:18:17 +0200 From: Bernd Walter To: Alexander Leidinger Message-ID: <20090417141817.GR11551@cicely7.cicely.de> References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED=-1.8, AWL=0.000, BAYES_00=-2.599 autolearn=ham version=3.2.5 X-Spam-Checker-Version: 
SpamAssassin 3.2.5 (2008-06-10) on spamd.cicely.de
Cc: current@freebsd.org, fs@freebsd.org
Subject: Re: ZFS: unlimited arc cache growth?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: ticso@cicely.de
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Apr 2009 14:36:05 -0000

On Fri, Apr 17, 2009 at 02:50:24PM +0200, Alexander Leidinger wrote:
> Hi,
>
> to fs@, please CC me, as I'm not subscribed.
>
> I monitored (by hand) a while the sysctls kstat.zfs.misc.arcstats.size
> and kstat.zfs.misc.arcstats.hdr_size. Both grow way higher (at some
> point I've seen more than 500M) than what I have configured in
> vfs.zfs.arc_max (40M).

My understanding about this is the following:
vfs.zfs.arc_min/max are not used as min max values.
They are used as high/low watermarks.
If arc is more than max, a thread is triggered to reduce the
arc cache until min, but in the meantime other threads can still grow
arc so there is a race between them.

> After a while FS operations (e.g. pkgdb -F with about 900 packages...
> my specific workload is the fixup of gnome packages after the removal
> of the obsolete libusb port) get very slow (in my specific example I
> let the pkgdb run several times over night and it still is not
> finished).

I've seen many workloads where prefetching can saturate disks without
the prefetched data ever being used.
You might want to try disabling prefetch.
Of course prefetching also grows arc.

> The big problem with this is, that at some point in time the machine
> reboots (panic, page fault, page not present, during a fork1). I have
> the impression (beware, I have a watchdog configured, as I don't know
> if a triggered WD would cause the same panic, the following is just a
> guess) that I run out of memory of some kind (I have 1G RAM, i386, max
> kmem size 700M). I restarted pkgdb several times after a reboot, and
> it continues to process the libusb removal, but hey, this is annoying.

With just 700M kmem you should set arc values extremely small and
avoid anything which can quickly grow it.
Unfortunately accessing many small files is a known arc filling workload.
Activating vfs.zfs.cache_flush_disable can help speed up arc decreasing,
with the obvious risks of course...

> Does someone see something similar to what I describe (mainly the
> growth of the arc cache way beyond what is configured)? Anyone with
> some ideas what to try?

In my opinion the watermark mechanism can work as it is, but there should
be a forced max - currently there is no guaranteed limit at all.
Nevertheless it is up to the people who know the code to decide.

--
B.Walter http://www.bwct.de
Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 14:46:06 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EC4C91065B7F for ; Fri, 17 Apr 2009 14:46:06 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr)
Received: from keltia.freenix.fr (keltia.freenix.org [IPv6:2001:660:330f:f820:213:72ff:fe15:f44]) by mx1.freebsd.org (Postfix) with ESMTP id 9181F8FC20 for ; Fri, 17 Apr 2009 14:46:06 +0000 (UTC) (envelope-from roberto@keltia.freenix.fr)
Received: from localhost (localhost [127.0.0.1]) by keltia.freenix.fr (Postfix/TLS) with ESMTP id 92C8B3BDC5 for ; Fri, 17 Apr 2009 16:46:05 +0200 (CEST)
X-Virus-Scanned: amavisd-new at keltia.freenix.fr
Received: from keltia.freenix.fr ([127.0.0.1]) by localhost (keltia.freenix.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id n4cF79JbJVsh for ; Fri, 17 Apr 2009 16:46:05 +0200 (CEST)
Received: by keltia.freenix.fr (Postfix/TLS, from userid 101) id 1E92A3BDC4; Fri, 17 Apr 2009 16:46:05 +0200 (CEST)
Date: Fri, 17 Apr 2009 16:46:05 +0200
From: Ollivier Robert
To: freebsd-fs@freebsd.org
Message-ID: <20090417144605.GA2316@keltia.freenix.fr>
References: <9461581F-F354-486D-961D-3FD5B1EF007C@rabson.org> <20090416160128.GA831@keltia.freenix.fr> <18919.25164.567669.809759@already.local> <20090417131443.GD96263@keltia.freenix.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20090417131443.GD96263@keltia.freenix.fr>
X-Operating-System: MacOS X / Macbook Pro - FreeBSD 7 / Dell D820 SMP
User-Agent: Mutt/1.5.19 (2009-01-05)
Subject: Re: Booting from ZFS raidz
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Apr 2009 14:46:10 -0000

According to Ollivier Robert:
> I did
> zfs set mountpoint=/tank/ROOT/tank tank/ROOT/tank (aka the real root)
>
> the other fs are in their usual place
> zfs set mountpoint=/usr tank/usr
> zfs set mountpoint=/var tank/var

With a proper zpool.cache in the right place (it was not generated the
first time I tried), it gets further. I'm still missing some bits (/usr
apparently, although I did configure it...).

As this is all done in a vmware vm, I can redo everything whenever I want.

I wish sysinstall was in a higher level language than C, I could hack a
bit on it. Right now, like many others, I feel a bit overwhelmed by the
20k LOC...

--
Ollivier ROBERT -=- FreeBSD: The Power to Serve!
-=- roberto@keltia.freenix.fr
In memoriam to Ondine : http://ondine.keltia.net/

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 16:58:32 2009
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A3C88106564A; Fri, 17 Apr 2009 16:58:32 +0000 (UTC) (envelope-from marius@nuenneri.ch)
Received: from mail-fx0-f167.google.com (mail-fx0-f167.google.com [209.85.220.167]) by mx1.freebsd.org (Postfix) with ESMTP id DB9B18FC15; Fri, 17 Apr 2009 16:58:31 +0000 (UTC) (envelope-from marius@nuenneri.ch)
Received: by fxm11 with SMTP id 11so1004188fxm.43 for ; Fri, 17 Apr 2009 09:58:30 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.204.116.69 with SMTP id l5mr2456841bkq.102.1239985709072; Fri, 17 Apr 2009 09:28:29 -0700 (PDT)
In-Reply-To: <20090417141817.GR11551@cicely7.cicely.de>
References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090417141817.GR11551@cicely7.cicely.de>
Date: Fri, 17 Apr 2009 18:28:29 +0200
Message-ID:
From: =?ISO-8859-1?Q?Marius_N=FCnnerich?=
To: ticso@cicely.de
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Cc: Alexander Leidinger , current@freebsd.org, fs@freebsd.org
Subject: Re: ZFS: unlimited arc cache growth?
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Apr 2009 16:58:33 -0000

On Fri, Apr 17, 2009 at 16:18, Bernd Walter wrote:
> On Fri, Apr 17, 2009 at 02:50:24PM +0200, Alexander Leidinger wrote:
>> Hi,
>>
>> to fs@, please CC me, as I'm not subscribed.
>>
>> I monitored (by hand) a while the sysctls kstat.zfs.misc.arcstats.size
>> and kstat.zfs.misc.arcstats.hdr_size. Both grow way higher (at some
>> point I've seen more than 500M) than what I have configured in
>> vfs.zfs.arc_max (40M).
>
> My understanding about this is the following:
> vfs.zfs.arc_min/max are not used as min max values.
> They are used as high/low watermarks.
> If arc is more than max, a thread is triggered to reduce the
> arc cache until min, but in the meantime other threads can still grow
> arc so there is a race between them.

Hmm, if this is true the ARC size should go down to arc_min once it
did grow past arc_max and no new data is coming along but I do not
observe such a thing here. It simply stays near but below arc_max here
all the time. I have only /home on ZFS with moderate load.

>
>> After a while FS operations (e.g. pkgdb -F with about 900 packages...
>> my specific workload is the fixup of gnome packages after the removal
>> of the obsolete libusb port) get very slow (in my specific example I
>> let the pkgdb run several times over night and it still is not
>> finished).
>
> I've seen many workloads where prefetching can saturate disks without
> ever being used.
> You might want to try disabling prefetch.
> Of course prefetching also grows arc.
>
>> The big problem with this is, that at some point in time the machine
>> reboots (panic, page fault, page not present, during a fork1). I have
>> the impression (beware, I have a watchdog configured, as I don't know
>> if a triggered WD would cause the same panic, the following is just a
>> guess) that I run out of memory of some kind (I have 1G RAM, i386, max
>> kmem size 700M). I restarted pkgdb several times after a reboot, and
>> it continues to process the libusb removal, but hey, this is annoying.
>
> With just 700M kmem you should set arc values extremely small and
> avoid anything which can quickly grow it.
> Unfortunately accessing many small files is a known arc filling workload.
> Activating vfs.zfs.cache_flush_disable can help speeding up arc decreasing,
> with the obvious risks of course...
>
>> Does someone see something similar to what I describe (mainly the
>> growth of the arc cache way beyond what is configured)? Anyone with
>> some ideas what to try?
>
> In my opinion the watermark mechanism can work as it is, but there should
> be a forced max - currently there is no guaranteed limit at all.
> Nevertheless it is up to the people who know the code to decide.
>
> --
> B.Walter http://www.bwct.de
> Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 19:05:58 2009
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 31283106566B; Fri, 17 Apr 2009 19:05:58 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de)
Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 95F608FC16; Fri, 17 Apr 2009 19:05:57 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de)
Received: from cicely5.cicely.de ([10.1.1.7]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id n3HJ5tmF063212 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 17 Apr 2009 21:05:56 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de)
Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by cicely5.cicely.de (8.14.2/8.14.2) with ESMTP id n3HJ5q2p027708 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 17 Apr 2009 21:05:52 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de)
Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id n3HJ5qJX016460; Fri, 17 Apr 2009 21:05:52 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de)
Received: (from ticso@localhost) by
cicely7.cicely.de (8.14.2/8.14.2/Submit) id n3HJ5qeY016459; Fri, 17 Apr 2009 21:05:52 +0200 (CEST) (envelope-from ticso) Date: Fri, 17 Apr 2009 21:05:52 +0200 From: Bernd Walter To: Marius =?iso-8859-1?Q?N=FCnnerich?= Message-ID: <20090417190551.GT11551@cicely7.cicely.de> References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090417141817.GR11551@cicely7.cicely.de> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED=-1.8, AWL=0.000, BAYES_00=-2.599 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on spamd.cicely.de Cc: Alexander Leidinger , ticso@cicely.de, fs@freebsd.org, current@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 19:05:58 -0000 On Fri, Apr 17, 2009 at 06:28:29PM +0200, Marius Nünnerich wrote: > On Fri, Apr 17, 2009 at 16:18, Bernd Walter wrote: > > On Fri, Apr 17, 2009 at 02:50:24PM +0200, Alexander Leidinger wrote: > >> Hi, > >> > >> to fs@, please CC me, as I'm not subscribed. > >> > >> I monitored (by hand) a while the sysctls kstat.zfs.misc.arcstats.size > >> and kstat.zfs.misc.arcstats.hdr_size. Both grow way higher (at some > >> point I've seen more than 500M) than what I have configured in > >> vfs.zfs.arc_max (40M). > > > > My understanding about this is the following: > > vfs.zfs.arc_min/max are not used as min max values. > > They are used as high/low watermarks. 
> > If arc is more than max, a thread is triggered to reduce the
> > arc cache until min, but in the meantime other threads can still grow
> > arc so there is a race between them.
>
> Hmm, if this is true the ARC size should go down to arc_min once it
> did grow past arc_max and no new data is coming along but I do not
> observe such a thing here. It simply stays near but below arc_max here
> all the time. I have only /home on ZFS with moderate load.

I had a few ideas why this could be, but scanning the complete sys tree
showed no point at all where arc_min is used. There are formulas to set
this value, but that's all I found.

--
B.Walter http://www.bwct.de
Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm.

From owner-freebsd-fs@FreeBSD.ORG Fri Apr 17 21:44:05 2009
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 19E33106566B for ; Fri, 17 Apr 2009 21:44:05 +0000 (UTC) (envelope-from dan@dan.emsphone.com)
Received: from email1.allantgroup.com (email1.emsphone.com [199.67.51.115]) by mx1.freebsd.org (Postfix) with ESMTP id C0EA88FC19 for ; Fri, 17 Apr 2009 21:44:04 +0000 (UTC) (envelope-from dan@dan.emsphone.com)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by email1.allantgroup.com (8.14.0/8.14.0) with ESMTP id n3HLCaLH073548 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 17 Apr 2009 16:12:37 -0500 (CDT) (envelope-from dan@dan.emsphone.com)
Received: from dan.emsphone.com (smmsp@localhost [127.0.0.1]) by dan.emsphone.com (8.14.3/8.14.3) with ESMTP id n3HLCanh027321 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 17 Apr 2009 16:12:36 -0500 (CDT) (envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost) by dan.emsphone.com (8.14.3/8.14.3/Submit) id n3HKxtlu099641; Fri, 17 Apr 2009 15:59:55 -0500 (CDT) (envelope-from dan)
Date: Fri, 17 Apr 2009 15:59:55
-0500 From: Dan Nelson To: ticso@cicely.de Message-ID: <20090417205955.GK90152@dan.emsphone.com> References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090417141817.GR11551@cicely7.cicely.de> <20090417190551.GT11551@cicely7.cicely.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20090417190551.GT11551@cicely7.cicely.de> X-OS: FreeBSD 7.1-STABLE User-Agent: Mutt/1.5.19 (2009-01-05) X-Virus-Scanned: ClamAV version 0.94.1, clamav-milter version 0.94.1 on email1.allantgroup.com X-Virus-Status: Clean X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-2.0.2 (email1.allantgroup.com [199.67.51.78]); Fri, 17 Apr 2009 16:12:37 -0500 (CDT) X-Scanned-By: MIMEDefang 2.45 Cc: Alexander Leidinger , current@freebsd.org, fs@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Apr 2009 21:44:05 -0000 In the last episode (Apr 17), Bernd Walter said: > On Fri, Apr 17, 2009 at 06:28:29PM +0200, Marius Nünnerich wrote: > > On Fri, Apr 17, 2009 at 16:18, Bernd Walter wrote: > > > On Fri, Apr 17, 2009 at 02:50:24PM +0200, Alexander Leidinger wrote: > > >> I monitored (by hand) a while the sysctls > > >> kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size. > > >> Both grow way higher (at some point I've seen more than 500M) than > > >> what I have configured in vfs.zfs.arc_max (40M). > > > > > > My understanding about this is the following: vfs.zfs.arc_min/max are > > > not used as min max values. They are used as high/low watermarks. If > > > arc is more than max the arc a thread is triggered to reduce the arc > > > cache until min, but in the meantime other threads can still grow arc > > > so there is a race between them. 
> > > > Hmm, if this is true the ARC size should go down to arc_min once it did > > grow past arc_max and no new data is coming along but I do not observe > > such a thing here. It simply stays near but below arc_max here all the > > time. I have only /home on ZFS with moderate load. > > I had a few ideas why this could be, but scanning complete sys showed no > point at all where arc_min is used. There are formular to set this value, > but that's all I find. zfs_arc_{min,max} are just tunables. The real variables arc_c_{min,max} get autosized and then capped to {min,max} in uts/common/fs/zfs/arc.c:arc_init() . -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-fs@FreeBSD.ORG Sat Apr 18 07:39:13 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B4099106564A; Sat, 18 Apr 2009 07:39:13 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from redbull.bpaserver.net (redbullneu.bpaserver.net [213.198.78.217]) by mx1.freebsd.org (Postfix) with ESMTP id 6FAAB8FC08; Sat, 18 Apr 2009 07:39:13 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from outgoing.leidinger.net (pD9E2DC61.dip.t-dialin.net [217.226.220.97]) by redbull.bpaserver.net (Postfix) with ESMTP id 09C302E068; Sat, 18 Apr 2009 09:39:06 +0200 (CEST) Received: from unknown (IO.Leidinger.net [192.168.2.103]) by outgoing.leidinger.net (Postfix) with ESMTP id 2410FC2B67; Sat, 18 Apr 2009 09:38:59 +0200 (CEST) Date: Sat, 18 Apr 2009 09:38:57 +0200 From: Alexander Leidinger To: ticso@cicely.de Message-ID: <20090418093857.0000199a@unknown> In-Reply-To: <20090417141817.GR11551@cicely7.cicely.de> References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090417141817.GR11551@cicely7.cicely.de> X-Mailer: Claws Mail 3.7.1 (GTK+ 2.10.13; i586-pc-mingw32msvc) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit 
X-BPAnet-MailScanner-Information: Please contact the ISP for more information X-MailScanner-ID: 09C302E068.61563 X-BPAnet-MailScanner: Found to be clean X-BPAnet-MailScanner-SpamCheck: not spam, ORDB-RBL, SpamAssassin (not cached, score=-14.4, required 6, BAYES_00 -15.00, L_HELLO_ADDRESS 0.50, RDNS_DYNAMIC 0.10) X-BPAnet-MailScanner-From: alexander@leidinger.net X-Spam-Status: No Cc: current@freebsd.org, fs@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Apr 2009 07:39:14 -0000 On Fri, 17 Apr 2009 16:18:17 +0200 Bernd Walter wrote: > On Fri, Apr 17, 2009 at 02:50:24PM +0200, Alexander Leidinger wrote: > > Hi, > > > > to fs@, please CC me, as I'm not subscribed. > > > > I monitored (by hand) a while the sysctls > > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size. > > Both grow way higher (at some point I've seen more than 500M) than > > what I have configured in vfs.zfs.arc_max (40M). > > My understanding about this is the following: > vfs.zfs.arc_min/max are not used as min max values. > They are used as high/low watermarks. > If arc is more than max the arc a thread is triggered to reduce the > arc cache until min, but in the meantime other threads can still grow > arc so there is a race between them. 500M (more than 10 times my max) after a night seems to be a big race... > > After a while FS operations (e.g. pkgdb -F with about 900 > > packages... my specific workload is the fixup of gnome packages > > after the removal of the obsolete libusb port) get very slow (in my > > specific example I let the pkgdb run several times over night and > > it still is not finished). > > I've seen many workloads were prefetching can saturate disks without > ever being used. > You might want to try disabling prefetch. 
> Of course prefetching also grows arc.

Prefetching is already disabled in this case.

> > The big problem with this is, that at some point in time the
> > machine reboots (panic, page fault, page not present, during a
> > fork1). I have the impression (beware, I have a watchdog
> > configured, as I don't know if a triggered WD would cause the same
> > panic, the following is just a guess) that I run out of memory of
> > some kind (I have 1G RAM, i386, max kmem size 700M). I restarted
> > pkgdb several times after a reboot, and it continues to process the
> > libusb removal, but hey, this is annoying.
>
> With just 700M kmem you should set arc values extremely small and
> avoid anything which can quickly grow it.
> Unfortunately accessing many small files is a known arc filling
> workload. Activating vfs.zfs.cache_flush_disable can help speeding up
> arc decreasing, with the obvious risks of course...

I have this:
---snip---
vfs.zfs.prefetch_disable=1
vm.kmem_size="700M"
vm.kmem_size_max="700M"
vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"
vfs.zfs.vdev.cache.bshift="13" # device read ahead: 8k
vfs.zfs.vdev.max_pending="6" # concurrent requests to the device, + for NCQ
---snip---

Bye,
Alexander.
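The size suffixes in the loader.conf excerpt above can be turned into byte counts with a few lines of Python, which makes them easier to compare against what sysctl reports. This is only an illustrative sketch: `parse_size` is a hypothetical helper, not a FreeBSD tool, and it assumes the K/M/G suffixes mean powers of 1024.

```python
# Hypothetical helper: convert loader.conf-style sizes such as "40M"
# into bytes, assuming K/M/G are powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def parse_size(text):
    """Return the size in bytes for strings like '40M', '"700M"' or '13'."""
    text = text.strip().strip('"')
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)

# Values copied from the configuration quoted above.
tunables = {
    "vm.kmem_size": "700M",
    "vfs.zfs.arc_max": "40M",
    "vfs.zfs.vdev.cache.size": "5M",
}

for name, value in tunables.items():
    print(f"{name} = {parse_size(value)} bytes")
```

Under that assumption 40M works out to 41943040 bytes and 5M to 5242880 bytes, the same figures that vfs.zfs.arc_max and vfs.zfs.vdev.cache.size show in the sysctl output posted later in this thread.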
From owner-freebsd-fs@FreeBSD.ORG Sat Apr 18 07:48:31 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 353CD106564A; Sat, 18 Apr 2009 07:48:31 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from redbull.bpaserver.net (redbullneu.bpaserver.net [213.198.78.217]) by mx1.freebsd.org (Postfix) with ESMTP id AC7278FC0C; Sat, 18 Apr 2009 07:48:30 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from outgoing.leidinger.net (pD9E2DC61.dip.t-dialin.net [217.226.220.97]) by redbull.bpaserver.net (Postfix) with ESMTP id A1B1D2E0AD; Sat, 18 Apr 2009 09:48:26 +0200 (CEST) Received: from unknown (IO.Leidinger.net [192.168.2.103]) by outgoing.leidinger.net (Postfix) with ESMTP id F2787C2E1F; Sat, 18 Apr 2009 09:48:22 +0200 (CEST) Date: Sat, 18 Apr 2009 09:48:21 +0200 From: Alexander Leidinger To: Ben Kelly Message-ID: <20090418094821.00002e67@unknown> In-Reply-To: References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> X-Mailer: Claws Mail 3.7.1 (GTK+ 2.10.13; i586-pc-mingw32msvc) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-BPAnet-MailScanner-Information: Please contact the ISP for more information X-MailScanner-ID: A1B1D2E0AD.767EF X-BPAnet-MailScanner: Found to be clean X-BPAnet-MailScanner-SpamCheck: not spam, ORDB-RBL, SpamAssassin (not cached, score=-14.823, required 6, BAYES_00 -15.00, RDNS_DYNAMIC 0.10, TW_ZF 0.08) X-BPAnet-MailScanner-From: alexander@leidinger.net X-Spam-Status: No Cc: current@freebsd.org, fs@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Apr 2009 07:48:31 -0000 On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly wrote: > On Apr 17, 2009, at 8:50 AM, Alexander Leidinger wrote: > > to fs@, please CC me, as I'm not subscribed. > > > > I monitored (by hand) a while the sysctls > > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size. > > Both grow way higher (at some point I've seen more than 500M) than > > what I have configured in vfs.zfs.arc_max (40M). > > > > After a while FS operations (e.g. pkgdb -F with about 900 > > packages... my specific workload is the fixup of gnome packages > > after the removal of the obsolete libusb port) get very slow (in > > my specific example I let the pkgdb run several times over night > > and it still is not finished). > > > > The big problem with this is, that at some point in time the > > machine reboots (panic, page fault, page not present, during a > > fork1). I have the impression (beware, I have a watchdog > > configured, as I don't know if a triggered WD would cause the same > > panic, the following is just a guess) that I run out of memory of > > some kind (I have 1G RAM, i386, max kmem size 700M). I restarted > > pkgdb several times after a reboot, and it continues to process the > > libusb removal, but hey, this is anoying. > > > > Does someone see something similar to what I describe (mainly the > > growth of the arc cache way beyond what is configured)? Anyone > > with some ideas what to try? > > Can you provide the rest of the arcstats from sysctl? Also, does > your arc_reclaim_thread process get any cycles when this problem > occurs? What happens if you kill the pkgdb -F manually before it > completes? Does the arc cache size come back down or is it stuck at > the abnormally high level? 
I haven't tried killing pkgdb and looking at the stats, but on the idle
machine (reboot after the panic and 5h of no use by me... the machine
fetches my mails, has a webmail + mysql + imap interface and is a
fileserver) the size is double of my max value. Again there's no real
load at this time, just fetching my mails (most traffic from the
FreeBSD lists) and a little bit of SpamAssassin filtering of them. When
I logged in this morning the machine was rebooted about 5h ago by a
panic and no FS traffic was going on (100% idle).

Currently the arc_reclaim_thread has 0:12 of accumulated CPU time,
the wcpu is at 0%, but it is in the running state. The machine is
about 80% idle.

Here are all zfs sysctls as of now (pkgdb started 5min ago):
---snip---
# sysctl -a | grep zfs
vfs.zfs.arc_meta_limit: 10485760
vfs.zfs.arc_meta_used: 130211600
vfs.zfs.mdcomp_disable: 0
vfs.zfs.arc_min: 22937600
vfs.zfs.arc_max: 41943040
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.block_cap: 256
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 1
vfs.zfs.recover: 0
vfs.zfs.txg.synctime: 5
vfs.zfs.txg.timeout: 30
vfs.zfs.scrub_limit: 10
vfs.zfs.vdev.cache.bshift: 13
vfs.zfs.vdev.cache.size: 5242880
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.ramp_rate: 2
vfs.zfs.vdev.time_shift: 6
vfs.zfs.vdev.min_pending: 4
vfs.zfs.vdev.max_pending: 6
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.version.zpl: 3
vfs.zfs.version.vdev_boot: 1
vfs.zfs.version.spa: 13
vfs.zfs.version.dmu_backup_stream: 1
vfs.zfs.version.dmu_backup_header: 2
vfs.zfs.version.acl: 1
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0
kstat.zfs.misc.arcstats.hits: 2483157
kstat.zfs.misc.arcstats.misses: 604115
kstat.zfs.misc.arcstats.demand_data_hits: 187200
kstat.zfs.misc.arcstats.demand_data_misses: 78685
kstat.zfs.misc.arcstats.demand_metadata_hits: 2295957
kstat.zfs.misc.arcstats.demand_metadata_misses: 525430
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 0
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
kstat.zfs.misc.arcstats.mru_hits: 1621026
kstat.zfs.misc.arcstats.mru_ghost_hits: 32102
kstat.zfs.misc.arcstats.mfu_hits: 862131
kstat.zfs.misc.arcstats.mfu_ghost_hits: 18804
kstat.zfs.misc.arcstats.deleted: 550853
kstat.zfs.misc.arcstats.recycle_miss: 287993
kstat.zfs.misc.arcstats.mutex_miss: 2
kstat.zfs.misc.arcstats.evict_skip: 654418
kstat.zfs.misc.arcstats.hash_elements: 5363
kstat.zfs.misc.arcstats.hash_elements_max: 8569
kstat.zfs.misc.arcstats.hash_collisions: 133396
kstat.zfs.misc.arcstats.hash_chains: 739
kstat.zfs.misc.arcstats.hash_chain_max: 5
kstat.zfs.misc.arcstats.p: 41943040
kstat.zfs.misc.arcstats.c: 41943040
kstat.zfs.misc.arcstats.c_min: 22937600
kstat.zfs.misc.arcstats.c_max: 41943040
kstat.zfs.misc.arcstats.size: 130467088
kstat.zfs.misc.arcstats.hdr_size: 730456
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.vdev_cache_stats.delegations: 2728
kstat.zfs.misc.vdev_cache_stats.hits: 297326
kstat.zfs.misc.vdev_cache_stats.misses: 368918
---snip---

Bye,
Alexander.
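To put numbers on how far the ARC in the output above is over its limits, a small script can parse the "name: value" lines and compute the ratios. This is a rough sketch, not an existing tool: `parse_sysctl` and `overshoot` are hypothetical helpers, and the sample contains only four values copied from the output above.

```python
# Four values copied from the sysctl output above.
SAMPLE = """\
vfs.zfs.arc_meta_limit: 10485760
vfs.zfs.arc_meta_used: 130211600
kstat.zfs.misc.arcstats.c_max: 41943040
kstat.zfs.misc.arcstats.size: 130467088
"""

def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping only integer values."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if sep and value.strip().isdigit():
            stats[name.strip()] = int(value)
    return stats

def overshoot(used, limit):
    """Return 'used' as a multiple of 'limit'."""
    return used / limit

stats = parse_sysctl(SAMPLE)
arc_ratio = overshoot(stats["kstat.zfs.misc.arcstats.size"],
                      stats["kstat.zfs.misc.arcstats.c_max"])
meta_ratio = overshoot(stats["vfs.zfs.arc_meta_used"],
                       stats["vfs.zfs.arc_meta_limit"])

print(f"ARC size is {arc_ratio:.2f}x c_max")                  # 3.11x
print(f"ARC metadata is {meta_ratio:.2f}x arc_meta_limit")    # 12.42x
```

For this snapshot the ARC is about 3.11 times c_max, and arc_meta_used (130211600) accounts for nearly all of arcstats.size (130467088) while sitting at roughly 12.42 times arc_meta_limit. The same functions should work on the full output of `sysctl -a | grep zfs`, since lines with non-numeric values are skipped.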
From owner-freebsd-fs@FreeBSD.ORG Sat Apr 18 12:58:38 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AE88A1065C57; Sat, 18 Apr 2009 12:58:38 +0000 (UTC) (envelope-from marius@nuenneri.ch) Received: from mail-fx0-f167.google.com (mail-fx0-f167.google.com [209.85.220.167]) by mx1.freebsd.org (Postfix) with ESMTP id E20738FC16; Sat, 18 Apr 2009 12:58:37 +0000 (UTC) (envelope-from marius@nuenneri.ch) Received: by fxm11 with SMTP id 11so1268416fxm.43 for ; Sat, 18 Apr 2009 05:58:37 -0700 (PDT) MIME-Version: 1.0 Received: by 10.204.55.142 with SMTP id u14mr3348482bkg.121.1240059516812; Sat, 18 Apr 2009 05:58:36 -0700 (PDT) In-Reply-To: <20090418094821.00002e67@unknown> References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090418094821.00002e67@unknown> Date: Sat, 18 Apr 2009 14:58:36 +0200 Message-ID: From: =?ISO-8859-1?Q?Marius_N=FCnnerich?= To: Alexander Leidinger Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: fs@freebsd.org, current@freebsd.org, Ben Kelly Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Apr 2009 12:58:48 -0000

On Sat, Apr 18, 2009 at 09:48, Alexander Leidinger wrote:
> On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly wrote:
>
>> On Apr 17, 2009, at 8:50 AM, Alexander Leidinger wrote:
>> > to fs@, please CC me, as I'm not subscribed.
>> >
>> > I monitored (by hand) a while the sysctls
>> > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size.
>> > Both grow way higher (at some point I've seen more than 500M) than
>> > what I have configured in vfs.zfs.arc_max (40M).
>> >
>> > After a while FS operations (e.g. pkgdb -F with about 900
>> > packages... my specific workload is the fixup of gnome packages
>> > after the removal of the obsolete libusb port) get very slow (in
>> > my specific example I let the pkgdb run several times over night
>> > and it still is not finished).
>> >
>> > The big problem with this is, that at some point in time the
>> > machine reboots (panic, page fault, page not present, during a
>> > fork1). I have the impression (beware, I have a watchdog
>> > configured, as I don't know if a triggered WD would cause the same
>> > panic, the following is just a guess) that I run out of memory of
>> > some kind (I have 1G RAM, i386, max kmem size 700M). I restarted
>> > pkgdb several times after a reboot, and it continues to process the
>> > libusb removal, but hey, this is annoying.
>> >
>> > Does someone see something similar to what I describe (mainly the
>> > growth of the arc cache way beyond what is configured)? Anyone
>> > with some ideas what to try?
>>
>> Can you provide the rest of the arcstats from sysctl? Also, does
>> your arc_reclaim_thread process get any cycles when this problem
>> occurs? What happens if you kill the pkgdb -F manually before it
>> completes? Does the arc cache size come back down or is it stuck at
>> the abnormally high level?
>
> I haven't tried killing pkgdb and looking at the stats, but on the idle
> machine (reboot after the panic and 5h of no use by me... the machine
> fetches my mails, has a webmail + mysql + imap interface and is a
> fileserver) the size is double of my max value. Again there's no real
> load at this time, just fetching my mails (most traffic from the
> FreeBSD lists) and a little bit of SpamAssassin filtering of them. When
> I logged in this morning the machine was rebooted about 5h ago by a
> panic and no FS traffic was going on (100% idle).
>
> Currently the arc_reclaim_thread has 0:12 of accumulated CPU time,
> the wcpu is at 0%, but it is in the running state. The machine is
> about 80% idle.
> [snip] How about adding a few DTrace probes into arc_reclaim_thread and see what it does? From owner-freebsd-fs@FreeBSD.ORG Sat Apr 18 21:17:04 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 77FA0106576E; Sat, 18 Apr 2009 21:17:04 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id 001308FC13; Sat, 18 Apr 2009 21:17:03 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from harkness.in.wanderview.com (harkness.in.wanderview.com [10.76.10.150]) (authenticated bits=0) by mail.wanderview.com (8.14.3/8.14.3) with ESMTP id n3ILH0Dk003279 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Sat, 18 Apr 2009 21:17:01 GMT (envelope-from ben@wanderview.com) Message-Id: <6535218D-6292-4F84-A8BA-FFA9B2E47F80@wanderview.com> From: Ben Kelly To: Alexander Leidinger In-Reply-To: <20090418094821.00002e67@unknown> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v930.3) Date: Sat, 18 Apr 2009 17:17:00 -0400 References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090418094821.00002e67@unknown> X-Mailer: Apple Mail (2.930.3) X-Spam-Score: -1.44 () ALL_TRUSTED X-Scanned-By: MIMEDefang 2.64 on 10.76.20.1 Cc: current@freebsd.org, fs@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Apr 2009 21:17:04 -0000

On Apr 18, 2009, at 3:48 AM, Alexander Leidinger wrote:
> On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly wrote:
> I haven't tried killing pkgdb and looking at the stats, but on the idle
> machine (reboot after the panic and 5h of no use by me... the machine
> fetches my mails, has a webmail + mysql + imap interface and is a
> fileserver) the size is double of my max value. Again there's no real
> load at this time, just fetching my mails (most traffic from the
> FreeBSD lists) and a little bit of SpamAssassin filtering of them. When
> I logged in this morning the machine was rebooted about 5h ago by a
> panic and no FS traffic was going on (100% idle).

From looking at the code, it's not too surprising it settles out at 2x
your zfs_arc_max tunable. It looks like under normal conditions the
arc_reclaim_thread only tries to evict buffers when the arc_size plus
any ghost buffers exceeds twice the value of arc_c:

    if (needfree ||
        (2 * arc_c < arc_size +
        arc_mru_ghost->arcs_size + arc_mfu_ghost->arcs_size))
            arc_adjust();

(The needfree flag is only set when the system lowmem event is fired.)
The arc_reclaim_thread checks this once a second. Perhaps this limit
should be a tunable. Also, it might make sense to have a separate
limit check for the ghost buffers.

I was able to reproduce similar arc_size growth on my machine by
running my rsync backup. After instrumenting the code it appeared that
buffers were not being evicted because they were "indirect" and had
been in the cache less than a second. The "indirect" flag is set based
on the on-disk level field. When you see the arcstats.evict_skip
sysctl going up this is probably what is happening. The comments in
the code say this check is only for prefetch data, but it also
triggers for indirect.
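To make the "settles at 2x" observation concrete, the once-a-second check quoted in this message can be modelled outside the kernel. This is a rough sketch under stated assumptions, not the arc.c implementation: struct arc_snapshot and arc_wants_adjust are hypothetical names invented here, and the fields simply mirror the variables that appear in the quoted condition.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical snapshot of the values the quoted condition reads.
 * The field names follow the variables in the quoted snippet; the
 * struct itself does not exist in ZFS.
 */
struct arc_snapshot {
    uint64_t arc_c;           /* current cache target size */
    uint64_t arc_size;        /* bytes currently held by the cache */
    uint64_t mru_ghost_size;  /* arc_mru_ghost->arcs_size */
    uint64_t mfu_ghost_size;  /* arc_mfu_ghost->arcs_size */
    bool     needfree;        /* set when the lowmem event has fired */
};

/*
 * Model of the quoted test: adjust when memory is needed, or when the
 * cache plus both ghost lists is strictly larger than twice the target.
 */
bool arc_wants_adjust(const struct arc_snapshot *s)
{
    return s->needfree ||
        (2 * s->arc_c <
         s->arc_size + s->mru_ghost_size + s->mfu_ghost_size);
}
```

With arc_c at 41943040 (the 40M limit from the thread) and empty ghost lists, an arc_size of exactly 83886080 does not satisfy the condition, while the reported 130467088 does. That matches the description of a cache that is left alone until it passes roughly twice its target, unless needfree is set.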
I'm hesitant to make it really only affect prefetch buffers. Perhaps
we could make the timeout a tunable or dynamic based on how far the
cache is over its target.

After the rsync completed my machine slowly evicts buffers until it's
back down to about twice arc_c. There was one case, however, where I
saw it stop at about four times arc_c. In that case it was failing to
evict buffers due to a missed lock. It's not clear yet if it was a
buffer lock or hash lock. When this happens you'll see the
arcstats.mutex_miss sysctl go up. I'm going to see if I can track
down why this is occurring under idle conditions. That seems
suspicious to me.

Hope that helps. I'll let you know if I find anything else.

- Ben

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 18 21:25:22 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A30AD106566C; Sat, 18 Apr 2009 21:25:22 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id 2C6338FC25; Sat, 18 Apr 2009 21:25:21 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from harkness.in.wanderview.com (harkness.in.wanderview.com [10.76.10.150]) (authenticated bits=0) by mail.wanderview.com (8.14.3/8.14.3) with ESMTP id n3ILPHve003379 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Sat, 18 Apr 2009 21:25:18 GMT (envelope-from ben@wanderview.com) Message-Id: <6FBF637A-6D96-4117-85C5-F205989DCCC1@wanderview.com> From: Ben Kelly To: =?ISO-8859-1?Q?Marius_N=FCnnerich?= In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Apple Message framework v930.3) Date: Sat, 18 Apr 2009 17:25:17 -0400 References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090417141817.GR11551@cicely7.cicely.de> X-Mailer: Apple Mail (2.930.3)
X-Spam-Score: -1.44 () ALL_TRUSTED X-Scanned-By: MIMEDefang 2.64 on 10.76.20.1 Cc: Alexander Leidinger , ticso@cicely.de, fs@freebsd.org, current@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Apr 2009 21:25:22 -0000

On Apr 17, 2009, at 12:28 PM, Marius Nünnerich wrote:
> On Fri, Apr 17, 2009 at 16:18, Bernd Walter wrote:
>> On Fri, Apr 17, 2009 at 02:50:24PM +0200, Alexander Leidinger wrote:
>>> Hi,
>>>
>>> to fs@, please CC me, as I'm not subscribed.
>>>
>>> I monitored (by hand) a while the sysctls kstat.zfs.misc.arcstats.size
>>> and kstat.zfs.misc.arcstats.hdr_size. Both grow way higher (at some
>>> point I've seen more than 500M) than what I have configured in
>>> vfs.zfs.arc_max (40M).
>>
>> My understanding about this is the following:
>> vfs.zfs.arc_min/max are not used as min max values.
>> They are used as high/low watermarks.
>> If arc is more than max the arc a thread is triggered to reduce the
>> arc cache until min, but in the meantime other threads can still grow
>> arc so there is a race between them.
>
> Hmm, if this is true the ARC size should go down to arc_min once it
> did grow past arc_max and no new data is coming along but I do not
> observe such a thing here. It simply stays near but below arc_max here
> all the time. I have only /home on ZFS with moderate load.

It appears arc_reclaim_thread only shrinks from arc_max when the
system vm_lowmem event is fired or more than 75% of max kmem is in use
by the system.

If you want to make it try to shrink the arc all the time you could
try the patch below. This worked to reduce arc_c on my system, but it
was unable to reduce arc_size to match due to an apparent mutex miss.
I'm still trying to track that down.

Hope that helps.
- Ben

Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(revision 205)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(working copy)
@@ -1963,7 +1963,7 @@
 		if (needfree ||
 		    (2 * arc_c < arc_size +
 		    arc_mru_ghost->arcs_size + arc_mfu_ghost->arcs_size))
-			arc_adjust();
+			arc_shrink();
 
 		if (arc_eviction_list != NULL)
 			arc_do_user_evicts();

From owner-freebsd-fs@FreeBSD.ORG Sat Apr 18 22:57:01 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 405E4106566B for ; Sat, 18 Apr 2009 22:57:01 +0000 (UTC) (envelope-from morganw@chemikals.org) Received: from warped.bluecherry.net (unknown [IPv6:2001:440:eeee:fffb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 0537E8FC17 for ; Sat, 18 Apr 2009 22:57:01 +0000 (UTC) (envelope-from morganw@chemikals.org) Received: from volatile.chemikals.org (adsl-67-215-2.shv.bellsouth.net [98.67.215.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by warped.bluecherry.net (Postfix) with ESMTPSA id A8C388004D04 for ; Sat, 18 Apr 2009 17:56:58 -0500 (CDT) Received: from localhost (morganw@localhost [127.0.0.1]) by volatile.chemikals.org (8.14.3/8.14.3) with ESMTP id n3IMuqli041194 for ; Sat, 18 Apr 2009 17:56:53 -0500 (CDT) (envelope-from morganw@chemikals.org) Date: Sat, 18 Apr 2009 17:56:52 -0500 (CDT) From: Wes Morgan To: freebsd-fs@freebsd.org Message-ID: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Subject: Marvell 88SE6480 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list
List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Apr 2009 22:57:01 -0000 Saw this on zfs-discuss: http://supermicro.com/products/accessories/addon/AOC-SASLP-MV8.cfm Has a Marvell 88SE6480 chipset on it. Looks like a good controller for zfs arrays. It doesn't appear to be supported by FreeBSD (yet). Anyone know more about it?