From owner-freebsd-fs@FreeBSD.ORG Mon Jul 21 11:06:55 2008
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 19B24106566C for ; Mon, 21 Jul 2008 11:06:55 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id EC9BB8FC2A for ; Mon, 21 Jul 2008 11:06:54 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org)
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id m6LB6slZ031860 for ; Mon, 21 Jul 2008 11:06:54 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Received: (from gnats@localhost) by freefall.freebsd.org (8.14.2/8.14.1/Submit) id m6LB6sea031856 for freebsd-fs@FreeBSD.org; Mon, 21 Jul 2008 11:06:54 GMT (envelope-from owner-bugmaster@FreeBSD.org)
Date: Mon, 21 Jul 2008 11:06:54 GMT
Message-Id: <200807211106.m6LB6sea031856@freefall.freebsd.org>
X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Cc:
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jul 2008 11:06:55 -0000

Current FreeBSD problem reports

Critical problems

Serious problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o kern/122888  fs         [zfs] zfs hang w/ prefetch on, zil off while running t

7 problems total.

Non-critical problems

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/124621  fs         [ext3] Cannot mount ext2fs partition
o kern/125536  fs         [ext2fs] ext 2 mounts cleanly but fails on commands li

8 problems total.
From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 08:16:56 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E71F106567A for ; Tue, 22 Jul 2008 08:16:56 +0000 (UTC) (envelope-from jh@saunalahti.fi)
Received: from emh04.mail.saunalahti.fi (emh04.mail.saunalahti.fi [62.142.5.110]) by mx1.freebsd.org (Postfix) with ESMTP id 21F0D8FC25 for ; Tue, 22 Jul 2008 08:16:55 +0000 (UTC) (envelope-from jh@saunalahti.fi)
Received: from saunalahti-vams (vs3-11.mail.saunalahti.fi [62.142.5.95]) by emh04-2.mail.saunalahti.fi (Postfix) with SMTP id A85AE13C2C0; Tue, 22 Jul 2008 10:57:22 +0300 (EEST)
Received: from emh04.mail.saunalahti.fi ([62.142.5.110]) by vs3-11.mail.saunalahti.fi ([62.142.5.95]) with SMTP (gateway) id A00774A47FF; Tue, 22 Jul 2008 10:57:22 +0300
Received: from a91-153-120-204.elisa-laajakaista.fi (a91-153-120-204.elisa-laajakaista.fi [91.153.120.204]) by emh04.mail.saunalahti.fi (Postfix) with SMTP id 7CCB441C66; Tue, 22 Jul 2008 10:57:19 +0300 (EEST)
Date: Tue, 22 Jul 2008 10:57:19 +0300
From: Jaakko Heinonen
To: Bruce Evans , freebsd-fs@freebsd.org
Message-ID: <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi>
References: <200806020800.m528038T072838@freefall.freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200806020800.m528038T072838@freefall.freebsd.org>
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Antivirus: VAMS
Cc: ighighi@gmail.com
Subject: birthtime initialization
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 08:16:56 -0000

On 2008-06-02, Bruce Evans wrote:
[about patch for ext2fs in PR kern/122047]
> % +		vap->va_birthtime.tv_sec = 0;
> % +		vap->va_birthtime.tv_nsec = 0;
>
> This is unrelated and should be handled centrally. Almost all file
> systems get this wrong. Most fail to set va_birthtime, so stat()
> returns kernel stack garbage for st_birthtime. ffs1 does the same
> as the above. msdosfs does the above correctly, by setting tv_sec to
> (time_t)-1 in unsupported cases.

How about this patch?

%%%
Index: sys/kern/vfs_vnops.c
===================================================================
--- sys/kern/vfs_vnops.c	(revision 180588)
+++ sys/kern/vfs_vnops.c	(working copy)
@@ -703,6 +703,9 @@ vn_stat(vp, sb, active_cred, file_cred,
 #endif
 
 	vap = &vattr;
+	/* Not all file systems initialize birthtime. */
+	VATTR_NULL(vap);
+
 	error = VOP_GETATTR(vp, vap, active_cred, td);
 	if (error)
 		return (error);
Index: sys/ufs/ufs/ufs_vnops.c
===================================================================
--- sys/ufs/ufs/ufs_vnops.c	(revision 180588)
+++ sys/ufs/ufs/ufs_vnops.c	(working copy)
@@ -410,8 +410,8 @@ ufs_getattr(ap)
 		vap->va_mtime.tv_nsec = ip->i_din1->di_mtimensec;
 		vap->va_ctime.tv_sec = ip->i_din1->di_ctime;
 		vap->va_ctime.tv_nsec = ip->i_din1->di_ctimensec;
-		vap->va_birthtime.tv_sec = 0;
-		vap->va_birthtime.tv_nsec = 0;
+		vap->va_birthtime.tv_sec = (time_t)-1;
+		vap->va_birthtime.tv_nsec = -1;
 		vap->va_bytes = dbtob((u_quad_t)ip->i_din1->di_blocks);
 	} else {
 		vap->va_rdev = ip->i_din2->di_rdev;
Index: sys/fs/msdosfs/msdosfs_vnops.c
===================================================================
--- sys/fs/msdosfs/msdosfs_vnops.c	(revision 180588)
+++ sys/fs/msdosfs/msdosfs_vnops.c	(working copy)
@@ -345,8 +345,8 @@ msdosfs_getattr(ap)
 		    0, &vap->va_birthtime);
 	} else {
 		vap->va_atime = vap->va_mtime;
-		vap->va_birthtime.tv_sec = -1;
-		vap->va_birthtime.tv_nsec = 0;
+		vap->va_birthtime.tv_sec = (time_t)-1;
+		vap->va_birthtime.tv_nsec = -1;
 	}
 	vap->va_flags = 0;
 	if ((dep->de_Attributes & ATTR_ARCHIVE) == 0)
Index: sys/nfsclient/nfs_subs.c
===================================================================
--- sys/nfsclient/nfs_subs.c	(revision 180588)
+++ sys/nfsclient/nfs_subs.c	(working copy)
@@ -628,6 +628,8 @@ nfs_loadattrcache(struct vnode **vpp, st
 	vap->va_rdev = rdev;
 	mtime_save = vap->va_mtime;
 	vap->va_mtime = mtime;
+	vap->va_birthtime.tv_sec = (time_t)-1;
+	vap->va_birthtime.tv_nsec = -1;
 	vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
 	if (v3) {
 		vap->va_nlink = fxdr_unsigned(u_short, fp->fa_nlink);
%%%

The patch adds VATTR_NULL() call to vn_stat() to initialize the vattr
structure before VOP_GETATTR() call. VATTR_NULL() initializes
va_birthtime.tv_sec and va_birthtime.tv_nsec to -1 (VNOVAL). I also
changed UFS1 and msdosfs to use consistent values. NFS needs explicit
initialization because otherwise values would be set to 0 due to memory
obtained with M_ZERO flag.

I have tested the patch with UFS2, UFS1, cd9660, nfs, ext2fs and smbfs.

(There's also more information about the problem in this message:
http://lists.freebsd.org/pipermail/freebsd-bugs/2008-March/029682.html)

-- 
Jaakko

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 11:41:17 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0FCEF106568D for ; Tue, 22 Jul 2008 11:41:17 +0000 (UTC) (envelope-from phk@critter.freebsd.dk)
Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id E33C08FC0C for ; Tue, 22 Jul 2008 11:41:16 +0000 (UTC) (envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (unknown [192.168.64.3]) by phk.freebsd.dk (Postfix) with ESMTP id 2AD2B170E5; Tue, 22 Jul 2008 11:16:59 +0000 (UTC)
Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.14.2/8.14.2) with ESMTP id m6MBGvwd037899; Tue, 22 Jul 2008 11:16:58 GMT (envelope-from phk@critter.freebsd.dk)
To: Jaakko Heinonen
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Tue, 22 Jul 2008 10:57:19 +0300." <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi>
Date: Tue, 22 Jul 2008 11:16:57 +0000
Message-ID: <37898.1216725417@critter.freebsd.dk>
Sender: phk@critter.freebsd.dk
Cc: freebsd-fs@freebsd.org, ighighi@gmail.com
Subject: Re: birthtime initialization
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 11:41:17 -0000

In message <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi>, Jaakko Heinonen writes:
>On 2008-06-02, Bruce Evans wrote:
>[about patch for ext2fs in PR kern/122047]
>> % +		vap->va_birthtime.tv_sec = 0;
>> % +		vap->va_birthtime.tv_nsec = 0;
>>
>> This is unrelated and should be handled centrally. Almost all file
>> systems get this wrong. Most fail to set va_birthtime, so stat()
>> returns kernel stack garbage for st_birthtime. ffs1 does the same
>> as the above. msdosfs does the above correctly, by setting tv_sec to
>> (time_t)-1 in unsupported cases.
>
>How about this patch?

Looks like something Kirk forgot to me.

We want to macroize the NOVAL for timespec instead of spreading
-1 casts all over.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
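[Editorial sketch, not part of the thread.] One way to read the "macroize the NOVAL for timespec" suggestion is shown below as plain userland C. The names TIMESPEC_NOVAL_SEC, timespec_set_noval() and timespec_is_noval() are hypothetical and do not exist in the FreeBSD tree. Using 0 for tv_nsec is also an assumption; the patch above uses -1 for both fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical marker: the tv_sec value meaning "timestamp not supported". */
#define TIMESPEC_NOVAL_SEC	((time_t)-1)

/* Mark a timespec as carrying no value (assumed convention: { -1, 0 }). */
static inline void
timespec_set_noval(struct timespec *ts)
{
	assert(ts != NULL);
	ts->tv_sec = TIMESPEC_NOVAL_SEC;
	ts->tv_nsec = 0;
}

/* Report whether a timespec carries the "no value" marker. */
static inline bool
timespec_is_noval(const struct timespec *ts)
{
	assert(ts != NULL);
	return (ts->tv_sec == TIMESPEC_NOVAL_SEC);
}
```

With helpers like these, each file system's getattr routine would call the setter once instead of repeating the cast, and a consumer of stat() data could test the marker without knowing its representation.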
From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 14:56:57 2008
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E54BF106564A for ; Tue, 22 Jul 2008 14:56:57 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from fallbackmx06.syd.optusnet.com.au (fallbackmx06.syd.optusnet.com.au [211.29.132.8]) by mx1.freebsd.org (Postfix) with ESMTP id 7EF8C8FC25 for ; Tue, 22 Jul 2008 14:56:57 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by fallbackmx06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m6MCfH99015144 for ; Tue, 22 Jul 2008 22:41:17 +1000
Received: from c220-239-252-11.carlnfd3.nsw.optusnet.com.au (c220-239-252-11.carlnfd3.nsw.optusnet.com.au [220.239.252.11]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m6MCfCn3024753 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 22 Jul 2008 22:41:13 +1000
Date: Tue, 22 Jul 2008 22:41:12 +1000 (EST)
From: Bruce Evans
X-X-Sender: bde@delplex.bde.org
To: Jaakko Heinonen
In-Reply-To: <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi>
Message-ID: <20080722215249.K17453@delplex.bde.org>
References: <200806020800.m528038T072838@freefall.freebsd.org> <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@FreeBSD.org, ighighi@gmail.com
Subject: Re: birthtime initialization
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 14:56:58 -0000

On Tue, 22 Jul 2008, Jaakko Heinonen wrote:

> On 2008-06-02, Bruce Evans wrote:
> [about patch for ext2fs in PR kern/122047]
>> % +		vap->va_birthtime.tv_sec = 0;
>> % +		vap->va_birthtime.tv_nsec = 0;
>>
>> This is unrelated and should be handled centrally. Almost all file
>> systems get this wrong. Most fail to set va_birthtime, so stat()
>> returns kernel stack garbage for st_birthtime. ffs1 does the same
>> as the above. msdosfs does the above correctly, by setting tv_sec to
>> (time_t)-1 in unsupported cases.
>
> How about this patch?
>
> %%%
> Index: sys/kern/vfs_vnops.c
> ===================================================================
> --- sys/kern/vfs_vnops.c	(revision 180588)
> +++ sys/kern/vfs_vnops.c	(working copy)
> @@ -703,6 +703,9 @@ vn_stat(vp, sb, active_cred, file_cred,
>  #endif
>
>  	vap = &vattr;
> +	/* Not all file systems initialize birthtime. */
> +	VATTR_NULL(vap);
> +
>  	error = VOP_GETATTR(vp, vap, active_cred, td);
>  	if (error)
>  		return (error);

I want to initialize va_birthtime to { -1, 0 } here only. Don't
initialize the whole vattr here. VOP_GETATTR() is supposed to initialize
everything, but doesn't for va_birthtime. If there are any other fields
that VOP_GETATTR() doesn't initialize, then these should be searched for
and fixed instead of setting them to the garbage value given by
vattr_null. Similarly, if there are any fields that aren't supported by
most file systems, then they should be searched for and defaulted like
va_birthtime instead of requiring individual file systems to invent a
default value for them.

> Index: sys/ufs/ufs/ufs_vnops.c
> ...
> Index: sys/fs/msdosfs/msdosfs_vnops.c
> ...
> Index: sys/nfsclient/nfs_subs.c

There are probably more file systems that have missing or slightly
incorrect (all zero) settings of va_birthtime.

> The patch adds VATTR_NULL() call to vn_stat() to initialize the vattr
> structure before VOP_GETATTR() call. VATTR_NULL() initializes
> va_birthtime.tv_sec and va_birthtime.tv_nsec to -1 (VNOVAL). I also
> changed UFS1 and msdosfs to use consistent values. NFS needs explicit
> initialization because otherwise values would be set to 0 due to memory
> obtained with M_ZERO flag.

VNOVAL = -1 only accidentally gives the correct value for
va_birthtime.tv_sec. It gives a wrong value for va_birthtime.tv_nsec.
It is better to set va_birthtime.tv_sec explicitly to -1. This -1 is
only accidentally equal to VNOVAL. Fortunately, this accident doesn't
prevent VOP_GETATTR() from setting va_birthtime, since VNOVAL is only
magic for VOP_SETATTR().

phk replied (but didn't quote enough, so I merged this manually):

>> Looks like something Kirk forgot to me.
>> We want to macroize the NOVAL for timespec instead of spreading
>> -1 casts all over.

This isn't a problem for the "GET" interface since VNOVAL doesn't apply
to it. Also, the casts of -1 aren't really needed. ufs_setattr()
doesn't have them for time_t's, and vattr_null() doesn't have them for
anything. The correctness of this depends on the type of time_t (and
the other va field types). In userland we're supposed to cast -1 to
time_t for error detection in mktime() etc. In userland, time_t can be
any arithmetic type so it is possible for (time_t)-1 != -1. Even there,
I think there is only a problem if time_t is an unsigned integral type
shorter than int. Compilers may warn about other cases.

ufs_setattr() has the casts for va_bytes (bogus cast of va_bytes to
int, which breaks its value), va_uid, va_gid and va_mode. For va_mode,
there is a problem -- the same one as in my example for time_t above --
va_mode is u_short so it cannot equal -1 (after the default promotions)
except on exotic systems. For va_uid and va_gid, the casts were needed
15 years ago when uid_t and gid_t were 16 bits. I can't see any problem
with omitting the cast for va_bytes -- va_bytes is u_quad_t, which is
certainly at least as large as int, so it can equal VNOVAL = -1 after
the default promotions though it cannot represent any negative value
(now C's conversion rules require (u_quad_t)-1 == -1, and it would be a
compiler bug to warn about expressions that depend on these rules).

In vattr_null(), the assignments go the other way and VNOVAL = -1 always
gets converted to the intended value (which is not always -1). C's
conversion rules are depended on even more here to do something
reasonable with (foo_t)-1.

I wouldn't like VNOVAL being replaced by VNOTIMESPECVAL, VNOUIDVAL, ...
etc. Recently I noticed a commit that replaced (struct foo *)0 by NULL
together with less contentious replacements of plain 0 by NULL. Old
code that tries to be careful uses (struct foo *)0 (or a macro NULLFOO
for this) too much. Now that NULL is Standard we can just use plain
NULL. Similarly for plain VNOVAL except in a few cases where -1 doesn't
get converted right.
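[Editorial sketch, not part of the thread.] The promotion cases described above can be checked with a small userland C program. VNOVAL is defined locally as -1 to match the value stated in this thread, and the helper names are hypothetical. The sketch assumes int is wider than unsigned short, which the static_assert enforces; a compiler may warn that the first comparison is always false, which is exactly the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel constant; the thread states VNOVAL = -1. */
#define VNOVAL	(-1)

/* The u_short case below only fails to match when int is wider than short. */
static_assert(sizeof(int) > sizeof(unsigned short),
    "sketch assumes int is wider than unsigned short");

/*
 * An unsigned short is promoted to int before the comparison, so 0xffff
 * becomes 65535 and never equals -1.
 */
static bool
ushort_equals_vnoval(unsigned short v)
{
	return (v == VNOVAL);
}

/* Casting VNOVAL to the field's own type makes the comparison work. */
static bool
ushort_equals_vnoval_cast(unsigned short v)
{
	return (v == (unsigned short)VNOVAL);
}

/*
 * A 64-bit unsigned value is at least as wide as int, so VNOVAL is
 * converted to UINT64_MAX and the uncast comparison succeeds.
 */
static bool
u64_equals_vnoval(uint64_t v)
{
	return (v == VNOVAL);
}
```

The u_short helper corresponds to the va_mode case and the 64-bit helper to the va_bytes case; only the former needs the cast.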

Bruce

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 15:45:17 2008
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 18B86106566B for ; Tue, 22 Jul 2008 15:45:17 +0000 (UTC) (envelope-from lists@jnielsen.net)
Received: from ns1.jnielsen.net (ns1.jnielsen.net [69.55.238.237]) by mx1.freebsd.org (Postfix) with ESMTP id DB8FA8FC17 for ; Tue, 22 Jul 2008 15:45:16 +0000 (UTC) (envelope-from lists@jnielsen.net)
Received: from [172.17.2.20] (rrcs-74-218-226-253.se.biz.rr.com [74.218.226.253]) (authenticated bits=0) by ns1.jnielsen.net (8.12.9p2/8.12.9) with ESMTP id m6MFSVJP025402; Tue, 22 Jul 2008 11:28:32 -0400 (EDT) (envelope-from lists@jnielsen.net)
From: John Nielsen
To: current@freebsd.org, fs@freebsd.org
Date: Tue, 22 Jul 2008 11:28:27 -0400
User-Agent: KMail/1.9.7
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
Message-Id: <200807221128.27592.lists@jnielsen.net>
X-Virus-Scanned: ClamAV version 0.88.4, clamav-milter version 0.88.4 on ns1.jnielsen.net
X-Virus-Status: Clean
Cc:
Subject: NFS writes and ZFS
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 15:45:17 -0000

I have a FreeBSD server (which I use as a NAS device, among other things)
and a FreeBSD desktop. The desktop is running 7-STABLE from a couple days
ago and the server is running 8-CURRENT from yesterday. The server has
several NFS-exported ZFS'es which I mount from the desktop. Since moving
the shares to ZFS I've been having trouble writing to them from the
desktop--the mount hangs after the first or second attempt. This is
similar if not identical to what's described in the thread
(from -current) I partially copied below.

Today I discovered that the problem seems to go away if I change the NFS
mount options on the desktop. The following is a summary/timeline of what
I've tried:

7-STABLE client, no NFS options (defaults); 7-STABLE server, UFS; works
7-STABLE client, no NFS options (defaults); 7-STABLE server, ZFS; broken
7-STABLE client, no NFS options (defaults); 8-CURRENT server, ZFS; broken
7-STABLE client, tcp,nfsv3,-r32768,-w32768; 8-CURRENT server, ZFS; works

My litmus test is to run fetch in the NFS directory a couple times since
in my typical usage the failure is most apparent when fetching distfiles
to the shared ports tree.

I didn't do a thorough search but I don't see any open PRs about this
issue (though I remember the thread below and other discussions about the
same time). Should I submit one?

Other than that I just wanted to report that 1) this is apparently (still)
an issue and 2) the NFS flags above seem like a good workaround so far.

Thanks,

JN

> Newsgroups: muc.lists.freebsd.current
> From: d...@des.no (Dag-Erling Smørgrav)
> Date: Sun, 07 Oct 2007 10:48:49 +0200
> Local: Sun, Oct 7 2007 4:48 am
> Subject: Re: ZFS & NFS integration...
>
> Darren Reed writes:
> > Dag-Erling Smørgrav wrote:
> > > Darren Reed writes:
> > > > Whats the planned status for ZFS+NFS with 7.0?
> > > Don't Do It, basically.
> > This sounds like a "shoot yourself in the foot" comment.
>
> > Why?
>
> I haven't figured out the exact details yet, but apparently when the
> client closes a file that was opened read / write, the server stops
> responding to that client.
>
> DES

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 16:15:12 2008
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9E5A1065670 for ; Tue, 22 Jul 2008 16:15:12 +0000 (UTC) (envelope-from pfgshield-freebsd@yahoo.com)
Received: from web32706.mail.mud.yahoo.com (web32706.mail.mud.yahoo.com [68.142.207.250]) by mx1.freebsd.org (Postfix) with SMTP id 88EB08FC14 for ; Tue, 22 Jul 2008 16:15:12 +0000 (UTC) (envelope-from pfgshield-freebsd@yahoo.com)
Received: (qmail 39477 invoked by uid 60001); 22 Jul 2008 15:48:32 -0000
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=jAeWONZUt1cA81N2KUBUgqMyzV/Q64lUN8PVAx3JvDWtiJxNggXoXGfVe2+3DJaFmECP7GxqgZnq3GCtY4Kb6ZY4wk6SlcaPH3RJ9bOviFh/S/V3Yge5tc5NCbtErmKj01zK6pW+lZFMxZ1GW6o1EFDsx4xPuIZTVQG7+uqHg6U=;
Received: from [190.156.49.247] by web32706.mail.mud.yahoo.com via HTTP; Tue, 22 Jul 2008 08:48:31 PDT
X-Mailer: YahooMailWebService/0.7.218
Date: Tue, 22 Jul 2008 08:48:31 -0700 (PDT)
From: Pedro Giffuni
To: freebsd-fs@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Message-ID: <984489.39243.qm@web32706.mail.mud.yahoo.com>
X-Mailman-Approved-At: Tue, 22 Jul 2008 16:22:12 +0000
Cc:
Subject: Re: birthtime initialization
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: pfgshield-freebsd@yahoo.com
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 16:15:12 -0000

Hi;

Tim has some patches I made to add support for birthtime in libarchive (only in extended pax format) as a LIBARCHIVE.creationtime attribute.

Since birthtime is set by modifying mtime twice with utimes(2), the only criteria I used to determine if birthtime should be stored is if it was less than mtime. I hope something can be done to make that behavior consistent with UFS2 in all other filesystems.

cheers,

Pedro.

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 17:31:45 2008
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7F20B1065674 for ; Tue, 22 Jul 2008 17:31:45 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 0A9C58FC14 for ; Tue, 22 Jul 2008 17:31:44 +0000 (UTC) (envelope-from brde@optusnet.com.au)
Received: from c220-239-252-11.carlnfd3.nsw.optusnet.com.au (c220-239-252-11.carlnfd3.nsw.optusnet.com.au [220.239.252.11]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m6MHVfpJ012910 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 23 Jul 2008 03:31:42 +1000
Date: Wed, 23 Jul 2008 03:31:41 +1000 (EST)
From: Bruce Evans
X-X-Sender: bde@delplex.bde.org
To: Pedro Giffuni
In-Reply-To: <984489.39243.qm@web32706.mail.mud.yahoo.com>
Message-ID: <20080723032929.F18594@delplex.bde.org>
References: <984489.39243.qm@web32706.mail.mud.yahoo.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-fs@FreeBSD.org
Subject: Re: birthtime initialization
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 17:31:45 -0000

On Tue, 22 Jul 2008, Pedro Giffuni wrote:

> Tim has some patches I made to add support for birthtime in libarchive (only in extended pax format) as a LIBARCHIVE.creationtime attribute.
>
> Since birthtime is set by modifying mtime twice with utimes(2), the only criteria I used to determine if birthtime should be stored is if it was less than mtime. I hope something can be done to make that behavior consistent with UFS2 in all other filesystems.

Can't it check for st_birthtime.tv_sec being != 0 or -1? The erroneous
default of 0 might interact badly with file systems written by buggy
versions of tar that set times to 0.

Bruce

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 18:11:47 2008
Return-Path:
Delivered-To: freebsd-fs@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 45324106566C for ; Tue, 22 Jul 2008 18:11:47 +0000 (UTC) (envelope-from pfgshield-freebsd@yahoo.com)
Received: from web32706.mail.mud.yahoo.com (web32706.mail.mud.yahoo.com [68.142.207.250]) by mx1.freebsd.org (Postfix) with SMTP id ECB488FC16 for ; Tue, 22 Jul 2008 18:11:46 +0000 (UTC) (envelope-from pfgshield-freebsd@yahoo.com)
Received: (qmail 60325 invoked by uid 60001); 22 Jul 2008 18:11:46 -0000
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Received:X-Mailer:Date:From:Reply-To:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=YAq6ORVr0cxzre+K1o3MDYxVy+qYacmEz+/F2yOLqgxd3vuAsUg03UFCStDc7XRh9c6UmIv/Nf/RRwjIvakqzklQ8NmuljBEWCXiA4w1Fvv47JyHL2sjNqBQkJ1A7t7RddJRQJjdotEYvHhtfXu9IlIvM6uspKsZfEdcrfBYa2g=;
Received: from [190.156.49.247] by web32706.mail.mud.yahoo.com via HTTP; Tue, 22 Jul 2008 11:11:46 PDT
X-Mailer: YahooMailWebService/0.7.218
Date: Tue, 22 Jul 2008 11:11:46 -0700 (PDT)
From: Pedro Giffuni
To: Bruce Evans
In-Reply-To: <20080723032929.F18594@delplex.bde.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Message-ID: <232373.60220.qm@web32706.mail.mud.yahoo.com>
Cc: freebsd-fs@FreeBSD.org
Subject: Re: birthtime initialization
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: pfgshield-freebsd@yahoo.com
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 18:11:47 -0000

--- Tue 22/7/08, Bruce Evans wrote:

...
>
> > Tim has some patches I made to add support for
> birthtime in libarchive (only in extended pax format) as a
> LIBARCHIVE.creationtime attribute.
> >
> > Since birthtime is set by modifying mtime twice with
> utimes(2), the only criteria I used to determine if
> birthtime should be stored is if it was less than mtime. I
> hope something can be done to make that behavior consistent
> with UFS2 in all other filesystems.
>
> Can't it check for st_birthtime.tv_sec being != 0 or
> -1?

OK, I can do that. In fact I had it like that originally, but strictly speaking those values are valid and I had to check for birthtime==mtime anyway. Admittedly no BSD system was available before Jan 1st 1970, so I will modify the check to avoid those times.

Pedro.

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 21:25:45 2008
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 662D91065673 for ; Tue, 22 Jul 2008 21:25:45 +0000 (UTC) (envelope-from matt@corp.spry.com)
Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.174]) by mx1.freebsd.org (Postfix) with ESMTP id 505588FC0A for ; Tue, 22 Jul 2008 21:25:45 +0000 (UTC) (envelope-from matt@corp.spry.com)
Received: by wf-out-1314.google.com with SMTP id 24so1436328wfg.7 for ; Tue, 22 Jul 2008 14:25:45 -0700 (PDT)
Received: by 10.142.253.21 with SMTP id a21mr2019227wfi.254.1216760257618; Tue, 22 Jul 2008 13:57:37 -0700 (PDT)
Received: from matts.spry.com ( [207.178.4.6]) by mx.google.com with ESMTPS id 29sm5991582wfg.0.2008.07.22.13.57.33 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 22 Jul 2008 13:57:36 -0700 (PDT)
Message-Id: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com>
From: Matt Simerson
To: freebsd-fs@freebsd.org
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v928.1)
Date: Tue, 22 Jul 2008 13:57:27 -0700
X-Mailer: Apple Mail (2.928.1)
Cc: pjd@freebsd.org
Subject: ZFS hang issue and prefetch_disable
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jul 2008 21:25:45 -0000

Symptoms

Deadlocks under heavy IO load on the ZFS file system with
prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results in a
stable system.

Configuration

Two machines. Identically built. Both exhibit identical behavior.
8 cores (2 x E5420) x 2.5GHz, 16 GB RAM, 24 x 1TB disks.
FreeBSD 7.0 amd64
dmesg: http://matt.simerson.net/computing/zfs/dmesg.txt

Boot disk is a read only 1GB compact flash

# cat /etc/fstab
/dev/ad0s1a  /  ufs  ro,noatime  2  2

# df -h /
Filesystem     1K-blocks    Used    Avail Capacity  Mounted on
/dev/ad0s1a         939M    555M     309M    64%    /

RAM has been boosted as suggested in ZFS Tuning Guide

# cat /boot/loader.conf
vm.kmem_size= 1610612736
vm.kmem_size_max= 1610612736
vfs.zfs.prefetch_disable=1

I haven't mucked much with the other memory settings as I'm using
amd64 and according to the FreeBSD ZFS wiki, that isn't necessary.
I've tried higher settings for kmem but that resulted in a failed
boot. I have ample RAM and would love to use as much as possible for
network and disk I/O buffers as that's principally all this system
does.

Disks & ZFS options

Sun's "Best Practices" suggests limiting the number of disks in a
raidz pool to no more than 6-10, IIRC. ZFS is configured as shown:
http://matt.simerson.net/computing/zfs/zpool.txt

I'm using all of the ZFS default properties except: atime=off,
compression=on.

Environment

I'm using these machines as backup servers. I wrote an application
that generates a list of the thousands of VPS accounts we host. For
each host, it generates a rsnapshot configuration file and backs up
their VPS to these systems via rsync. The application manages
concurrency and will spawn additional rsync processes if system i/o
load is below a defined threshold. Which is to say, I can crank up or
down the amount of network and disk IO the system sees.

With vfs.zfs.prefetch_disable=1, a hang will occur within a few hours
(no more than a day). If I keep the i/o load (measured via iostat) down
to a low level (< 200 iops) then I still get hangs but less frequently
(1-6 days). The only way I have found to prevent the hangs is by
setting vfs.zfs.prefetch_disable=1.
Matt Simerson From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 22:08:22 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 537D11065674 for ; Tue, 22 Jul 2008 22:08:22 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [91.103.162.4]) by mx1.freebsd.org (Postfix) with ESMTP id 1F4E08FC12 for ; Tue, 22 Jul 2008 22:08:21 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from localhost (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 1ACBE19E023; Wed, 23 Jul 2008 00:08:20 +0200 (CEST) Received: from [192.168.1.2] (r5bb235.net.upc.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 2108219E019; Wed, 23 Jul 2008 00:08:18 +0200 (CEST) Message-ID: <48865A68.1010504@quip.cz> Date: Wed, 23 Jul 2008 00:08:40 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 X-Accept-Language: cz, cs, en, en-us MIME-Version: 1.0 To: Matt Simerson References: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com> In-Reply-To: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS hang issue and prefetch_disable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jul 2008 22:08:22 -0000 Matt Simerson wrote: > Symptoms > > Deadlocks under heavy IO load on the ZFS file system with > prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results in a > stable system. [...] 
> With vfs.zfs.prefetch_disable=1, a hang will occur within a few hours > (no more than a day). If I keep the i/o load (measured via iostat) down > to a low level (< 200 iops) then I still get hangs but less frequently > (1-6 days). The only way I have found to prevent the hangs is by > setting vfs.zfs.prefetch_disable=1. "With vfs.zfs.prefetch_disable=1, a hang will occur within...", did you realy mean prefetch_disable=1 in this sentence? Your whole e-mail seems that prefetch_disable=1 is good workaround, so I expect you have prefetch_disable=0 previously which causes hangs... Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Tue Jul 22 22:22:24 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0C046106564A for ; Tue, 22 Jul 2008 22:22:24 +0000 (UTC) (envelope-from matt@corp.spry.com) Received: from wf-out-1314.google.com (wf-out-1314.google.com [209.85.200.169]) by mx1.freebsd.org (Postfix) with ESMTP id EA4BC8FC18 for ; Tue, 22 Jul 2008 22:22:23 +0000 (UTC) (envelope-from matt@corp.spry.com) Received: by wf-out-1314.google.com with SMTP id 24so1452883wfg.7 for ; Tue, 22 Jul 2008 15:22:23 -0700 (PDT) Received: by 10.142.49.4 with SMTP id w4mr2052907wfw.201.1216765343102; Tue, 22 Jul 2008 15:22:23 -0700 (PDT) Received: from matts.spry.com ( [207.178.4.6]) by mx.google.com with ESMTPS id 24sm6763863wff.17.2008.07.22.15.22.21 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 22 Jul 2008 15:22:22 -0700 (PDT) Message-Id: From: Matt Simerson To: freebsd-fs@freebsd.org In-Reply-To: <48865A68.1010504@quip.cz> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v928.1) Date: Tue, 22 Jul 2008 15:22:17 -0700 References: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com> <48865A68.1010504@quip.cz> X-Mailer: Apple Mail (2.928.1) Subject: Re: ZFS hang issue and 
prefetch_disable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jul 2008 22:22:24 -0000

On Jul 22, 2008, at 3:08 PM, Miroslav Lachman wrote:

> Matt Simerson wrote:
>> Symptoms
>> Deadlocks under heavy IO load on the ZFS file system with
>> prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results in
>> a stable system.
>
> [...]
>
>> With vfs.zfs.prefetch_disable=1, a hang will occur within a few
>> hours (no more than a day). If I keep the i/o load (measured via
>> iostat) down to a low level (< 200 iops) then I still get hangs
>> but less frequently (1-6 days). The only way I have found to
>> prevent the hangs is by setting vfs.zfs.prefetch_disable=1.
>
> "With vfs.zfs.prefetch_disable=1, a hang will occur within...", did
> you realy mean prefetch_disable=1 in this sentence? Your whole e-mail
> seems that prefetch_disable=1 is good workaround, so I expect
> you have prefetch_disable=0 previously which causes hangs...

Aye. That is exactly what I meant. With vfs.zfs.prefetch_disable=1, I get a stable system. With vfs.zfs.prefetch_disable=0 (the default) I have frequent deadlocks.

Matt

Rant: I really wish that variable wasn't named in the negative, creating a double negative when prefetch_disable=0. I.e., it should be named vfs.zfs.prefetch_enable instead. It's much easier to express in English that prefetch_enable=1 means ON and prefetch_enable=0 means OFF. There's also the matter that in some languages, a double or triple negative still means the negative case. %-\. I'd rather not have to guess what prefetch_disable=1 means.
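
To make the double negative above concrete, here is a minimal C sketch that reads the tunable and reports it in positive terms. It is only an illustration: the helper names (prefetch_is_enabled, read_prefetch_disable, report_prefetch) are made up for this note, and it assumes the sysctl is exported as a plain int. On systems without sysctlbyname(3) or without ZFS loaded it simply reports the value as unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Hypothetical helper: turn the negatively named knob into a positive answer. */
int prefetch_is_enabled(int prefetch_disable)
{
	return prefetch_disable == 0;
}

/*
 * Hypothetical helper: read vfs.zfs.prefetch_disable.
 * Returns 0 on success and -1 when the value cannot be read.
 * Assumption: the tunable is exported as a plain int.
 */
int read_prefetch_disable(int *value)
{
	assert(value != NULL);
#ifdef __FreeBSD__
	size_t len = sizeof(*value);

	return sysctlbyname("vfs.zfs.prefetch_disable", value, &len, NULL, 0);
#else
	return -1;	/* no such sysctl on this platform */
#endif
}

/* Print the current state using ON/OFF instead of the disable flag. */
void report_prefetch(void)
{
	int disable;

	if (read_prefetch_disable(&disable) != 0) {
		puts("vfs.zfs.prefetch_disable: not available");
		return;
	}
	printf("ZFS prefetch is %s (prefetch_disable=%d)\n",
	    prefetch_is_enabled(disable) ? "ON" : "OFF", disable);
}
```

With the settings discussed in this thread, prefetch_disable=1 would be reported as OFF (the stable configuration) and prefetch_disable=0 as ON (the configuration that deadlocks).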
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 07:50:39 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B19FB106564A for ; Wed, 23 Jul 2008 07:50:39 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206045140.chello.pl [87.206.45.140]) by mx1.freebsd.org (Postfix) with ESMTP id 336088FC2D for ; Wed, 23 Jul 2008 07:50:38 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 33AD445CA6; Wed, 23 Jul 2008 09:50:37 +0200 (CEST) Received: from localhost (pjd.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 81AC6456AB; Wed, 23 Jul 2008 09:50:24 +0200 (CEST) Date: Wed, 23 Jul 2008 09:50:30 +0200 From: Pawel Jakub Dawidek To: Matt Simerson Message-ID: <20080723075030.GA3603@garage.freebsd.pl> References: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ew6BAiZeqk4r7MaW" Content-Disposition: inline In-Reply-To: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=3.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS hang issue and prefetch_disable X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 07:50:39 -0000 --ew6BAiZeqk4r7MaW Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Jul 22, 2008 at 01:57:27PM -0700, Matt Simerson wrote:
> Symptoms
>
> Deadlocks under heavy IO load on the ZFS file system with
> prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results in a
> stable system.
>
> Configuration
>
> Two machines. Identically built. Both exhibit identical behavior.
> 8 cores (2 x E5420) x 2.5GHz, 16 GB RAM, 24 x 1TB disks.
> FreeBSD 7.0 amd64
> dmesg: http://matt.simerson.net/computing/zfs/dmesg.txt

Very nice:)

> Boot disk is a read only 1GB compact flash
> # cat /etc/fstab
> /dev/ad0s1a / ufs ro,noatime 2 2
>
> # df -h /
> Filesystem 1K-blocks Used Avail Capacity Mounted on
> /dev/ad0s1a 939M 555M 309M 64% /
>
> RAM has been boosted as suggested in ZFS Tuning Guide
> # cat /boot/loader.conf
> vm.kmem_size= 1610612736
> vm.kmem_size_max= 1610612736
> vfs.zfs.prefetch_disable=1
>
> I haven't mucked much with the other memory settings as I'm using
> amd64 and according to the FreeBSD ZFS wiki, that isn't necessary.
> I've tried higher settings for kmem but that resulted in a failed
> boot. I have ample RAM And would love to use as much as possible for
> network and disk I/O buffers as that's principally all this system does.
>
> Disks & ZFS options
>
> Sun's "Best Practices" suggests limiting the number of disks in a
> raidz pool to no more than 6-10, IIRC. ZFS is configured as shown:
> http://matt.simerson.net/computing/zfs/zpool.txt
>
> I'm using all of the ZFS default properties except: atime=off,
> compression=on.
>
> Environment
>
> I'm using these machines as backup servers. I wrote an application
> that generates a list of the thousands of VPS accounts we host. For
> each host, it generates a rsnapshot configuration file and backs up up
> their VPS to these systems via rsync.
> The application manages
> concurrency and will span additional rsync processes if system i/o
> load is below a defined thresh-hold. Which is to say, I can crank up
> or down the amount of network and disk IO the system sees.
>
> With vfs.zfs.prefetch_disable=1, a hang will occur within a few hours

I guess you wanted '0' here?

> (no more than a day). If I keep the i/o load (measured via iostat)
> down to a low level (< 200 iops) then I still get hangs but less
> frequently (1-6 days). The only way I have found to prevent the hangs
> is by setting vfs.zfs.prefetch_disable=1.

This is more or less a known problem. It is related to low memory/kva conditions. Alan Cox is working on the vm.kmem_size limitation. I saw Kris using ZFS with some very large vm.kmem_size. Not sure if all the code is already committed, but this would be something you should definitely try on your hardware.

I also have the most recent ZFS version in Perforce, which is being tested by a few other guys, and I'd like to commit it to HEAD soon (depends on test results of course). There are plenty of improvements and some may fix your problem too.

BTW, do you find prefetch helpful for your workloads? I always turn it off on my systems, because it has a negative impact on performance, but maybe my hardware is too weak to take advantage of it.

One more thing. There was a small bug in the prefetch code, but I've no idea if it is related to the hangs you are seeing. If that's not a problem for you, can you try this patch:

http://people.freebsd.org/~pjd/patches/dmu_zfetch.c.patch

If you want to play with tuning ZFS prefetch, you might find these patches useful (taken from the Perforce version):

http://people.freebsd.org/~pjd/patches/dmu_zfetch.c.2.patch
http://people.freebsd.org/~pjd/patches/quad.patch

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
--ew6BAiZeqk4r7MaW Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFIhuLFForvXbEpPzQRAgpCAJ0cXQnUcpq4Rnp6muBk0HS0iVEGNgCeL69/ TDT9zL1T0cpNKUSWuOqzz2Y= =Zblm -----END PGP SIGNATURE----- --ew6BAiZeqk4r7MaW-- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 08:19:33 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C35D4106567D for ; Wed, 23 Jul 2008 08:19:33 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206045140.chello.pl [87.206.45.140]) by mx1.freebsd.org (Postfix) with ESMTP id 360528FC33 for ; Wed, 23 Jul 2008 08:19:33 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 03D5545B36; Wed, 23 Jul 2008 10:19:28 +0200 (CEST) Received: from localhost (pjd.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 22EA2456AB; Wed, 23 Jul 2008 10:19:25 +0200 (CEST) Date: Wed, 23 Jul 2008 10:19:30 +0200 From: Pawel Jakub Dawidek To: Matt Simerson Message-ID: <20080723081930.GB3603@garage.freebsd.pl> References: <5E8D64DE-EC9B-4B11-BCB4-17BA63650BB7@corp.spry.com> <48865A68.1010504@quip.cz> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="zx4FCpZtqtKETZ7O" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=3.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS hang issue and prefetch_disable X-BeenThere: 
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 08:19:33 -0000

--zx4FCpZtqtKETZ7O
Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Jul 22, 2008 at 03:22:17PM -0700, Matt Simerson wrote:
>
> On Jul 22, 2008, at 3:08 PM, Miroslav Lachman wrote:
>
> >Matt Simerson wrote:
> >>Symptoms
> >>Deadlocks under heavy IO load on the ZFS file system with
> >>prefetch_disable=0. Setting vfs.zfs.prefetch_disable=1 results in
> >>a stable system.
> >
> >[...]
> >
> >>With vfs.zfs.prefetch_disable=1, a hang will occur within a few
> >>hours (no more than a day). If I keep the i/o load (measured via
> >>iostat) down to a low level (< 200 iops) then I still get hangs
> >>but less frequently (1-6 days). The only way I have found to
> >>prevent the hangs is by setting vfs.zfs.prefetch_disable=1.
> >
> >"With vfs.zfs.prefetch_disable=1, a hang will occur within...", did
> >you realy mean prefetch_disable=1 in this sentence? Your whole e-mail
> >seems that prefetch_disable=1 is good workaround, so I expect
> >you have prefetch_disable=0 previously which causes hangs...
>
> Aye. That is exactly what I meant. With vfs.zfs.prefetch_disable=1,
> I get a stable system. With vfs.zfs.prefetch_disable=0 (the default) I
> have frequent deadlocks.
>
> Matt
>
> Rant: I really wish that variable wasn't named in the negative,
> creating a double negative when prefetch_disable=0. IE, it should be
> named vfs.zfs.prefetch_enable instead. It's much easier to express in
> English that prefetch_enable=1 means ON and prefetch_enable=0 means
> OFF. There's also the matter than in some languages, a double or
> triple negative still means the negative case. %-\.
> I'd rather not
> have to guess what prefetch_disable=1 means.

I agree. AFAIR we even discussed in the past that sysctl names should use exactly the 'enable', not the 'disable', variants. However, I want to track Solaris as closely as possible, which is why I'm using what they have. The intent is to make it easier for people to use ZFS on both Solaris and FreeBSD by not introducing small but annoying differences.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--zx4FCpZtqtKETZ7O
Content-Type: application/pgp-signature Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFIhumSForvXbEpPzQRAiQYAKDw3lAwoYgMPGCkihOV+1tNI8+CYwCgyjiK
pFuRtO5DF/gOiBGVweoehJ0=
=EyyP
-----END PGP SIGNATURE-----

--zx4FCpZtqtKETZ7O--

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 08:43:43 2008 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3BCFF1065681 for ; Wed, 23 Jul 2008 08:43:43 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello087206045140.chello.pl [87.206.45.140]) by mx1.freebsd.org (Postfix) with ESMTP id 879868FC14 for ; Wed, 23 Jul 2008 08:43:42 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id CB36045CA0; Wed, 23 Jul 2008 10:24:03 +0200 (CEST) Received: from localhost (pjd.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id C2A69456B1; Wed, 23 Jul 2008 10:23:55 +0200 (CEST) Date: Wed, 23 Jul 2008 10:24:01 +0200 From: Pawel Jakub Dawidek To: John Nielsen Message-ID: <20080723082401.GC3603@garage.freebsd.pl> References: <200807221128.27592.lists@jnielsen.net> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1;
protocol="application/pgp-signature"; boundary="rQ2U398070+RC21q" Content-Disposition: inline In-Reply-To: <200807221128.27592.lists@jnielsen.net> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=3.0 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: current@freebsd.org, fs@freebsd.org Subject: Re: NFS writes and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 08:43:43 -0000

--rQ2U398070+RC21q
Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, Jul 22, 2008 at 11:28:27AM -0400, John Nielsen wrote:
> I have a FreeBSD server (which I use as a NAS device, among other things)
> and a FreeBSD deskop. The desktop is running 7-STABLE from a couple days
> ago and the server is running 8-CURRENT from yesterday. The server has
> several NFS-exported ZFS'es which I mount from the desktop. Since moving
> the shares to ZFS I've been having trouble writing to them from the
> desktop--the mount hangs after the first or second attempt. This is
> similar if not identical to what's described in the thread
> (from -current) I partially copied below.
>
> Today I discovered that the problem seems to go away if I change the NFS
> mount options on the desktop.
> The following is a summary/timeline of what
> I've tried:
>
> 7-STABLE client, no NFS options (defaults); 7-STABLE server, UFS; works
> 7-STABLE client, no NFS options (defaults); 7-STABLE server, ZFS; broken
> 7-STABLE client, no NFS options (defaults); 8-CURRENT server, ZFS; broken
> 7-STABLE client, tcp,nfsv3,-r32768,-w32768; 8-CURRENT server, ZFS, works

Do you need all the options here? If not, could you try to find the smallest subset of options that are needed to make ZFS work? Maybe 'nfsv3' is all that is needed, or 'tcp' alone fixes it? At work we use many NFS exported ZFS file systems, mostly accessed from MacOS X and we see no problems.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--rQ2U398070+RC21q
Content-Type: application/pgp-signature Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFIhuqeForvXbEpPzQRAs4EAKDGY1A+IgVfv39uNEejIE+EsWBmiQCgqTkh
WFx1jU696o+AKZJyf1jKD1U=
=0PAa
-----END PGP SIGNATURE-----

--rQ2U398070+RC21q--

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 09:33:26 2008 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0AABA106564A for ; Wed, 23 Jul 2008 09:33:26 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id 961358FC14 for ; Wed, 23 Jul 2008 09:33:25 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from cicely5.cicely.de ([10.1.1.7]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id m6N94wr8048538 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 23 Jul 2008 11:04:59 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by cicely5.cicely.de (8.14.2/8.14.2) with ESMTP id m6N94q1h087198
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 23 Jul 2008 11:04:52 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id m6N94qNX064591; Wed, 23 Jul 2008 11:04:52 +0200 (CEST) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id m6N94pvX064590; Wed, 23 Jul 2008 11:04:51 +0200 (CEST) (envelope-from ticso) Date: Wed, 23 Jul 2008 11:04:51 +0200 From: Bernd Walter To: Pawel Jakub Dawidek Message-ID: <20080723090450.GV58113@cicely7.cicely.de> References: <200807221128.27592.lists@jnielsen.net> <20080723082401.GC3603@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080723082401.GC3603@garage.freebsd.pl> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-4.3 required=5.0 tests=ALL_TRUSTED=-1.8, AWL=0.141, BAYES_00=-2.599 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on spamd.cicely.de Cc: John Nielsen , current@freebsd.org, fs@freebsd.org Subject: Re: NFS writes and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 09:33:26 -0000 On Wed, Jul 23, 2008 at 10:24:01AM +0200, Pawel Jakub Dawidek wrote: > On Tue, Jul 22, 2008 at 11:28:27AM -0400, John Nielsen wrote: > > I have a FreeBSD server (which I use as a NAS device, among other things) > > and a FreeBSD deskop. The desktop is running 7-STABLE from a couple days > > ago and the server is running 8-CURRENT from yesterday. The server has > > several NFS-exported ZFS'es which I mount from the desktop. 
> > Since moving
> > the shares to ZFS I've been having trouble writing to them from the
> > desktop--the mount hangs after the first or second attempt. This is
> > similar if not identical to what's described in the thread
> > (from -current) I partially copied below.
> >
> > Today I discovered that the problem seems to go away if I change the NFS
> > mount options on the desktop. The following is a summary/timeline of what
> > I've tried:
> >
> > 7-STABLE client, no NFS options (defaults); 7-STABLE server, UFS; works
> > 7-STABLE client, no NFS options (defaults); 7-STABLE server, ZFS; broken
> > 7-STABLE client, no NFS options (defaults); 8-CURRENT server, ZFS; broken
> > 7-STABLE client, tcp,nfsv3,-r32768,-w32768; 8-CURRENT server, ZFS, works
>
> Do you need all the options here? If not, could you try to find the
> smallest subset of options that are needed to make ZFS work? Maybe
> 'nfsv3' is all that is needed, or 'tcp' alone fixes it? At work we use
> many NFS exported ZFS file systems, mostly accessed from MacOS X and
> we see no problems.

Whenever changing NFS transport options has an influence on reliability, my first task is to verify the network. In particular, there have often been hardware problems with some NICs lately, some of which have been worked around in the drivers and some not. Disabling TSO and checksum offloading typically helps. This kind of problem is typical on both the client and the server, but also on routers. Of course, network problems can also lie in any cable or switch in between, but those are less likely to produce complete NFS hangs.

-- 
B.Walter http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and much more.
From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 10:27:45 2008 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B50B7106566C for ; Wed, 23 Jul 2008 10:27:45 +0000 (UTC) (envelope-from kometen@gmail.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.156]) by mx1.freebsd.org (Postfix) with ESMTP id E44018FC24 for ; Wed, 23 Jul 2008 10:27:44 +0000 (UTC) (envelope-from kometen@gmail.com) Received: by fg-out-1718.google.com with SMTP id l26so1424056fgb.35 for ; Wed, 23 Jul 2008 03:27:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type :content-transfer-encoding:content-disposition:references; bh=G19tZSpOr44gHBnPoBJqpKWnEwp0Tt4km2hGxdDYOvY=; b=hWgOoCENNVarWHmuk63JqGfxhGebYDmeQ36gXYwHH5XxBMwJSZ3bDbQZS5fLY3xbDp iSPyvEaeNBG1YqEQDX45FvmJyQupL46rHneqTd86U75cJh+rty4myuiqAACRI25f5ePk 0qHuybABjM5QNzkTtm6Jq2d/F4xS0DY8Xpel0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:content-transfer-encoding:content-disposition :references; b=uJRwqKgsxpWEA+fcCXFOWnCXYWNj8SsBfpMtDlCygeIIMeIwyikC3U0fHOfarlbBSK v0TyMMrAdrnhBJf+X8jn1CBcG5zZ8JBEJ5hHx/MrsXBDTJpVKSP17IUSMDdH4upZZGjQ M22kd6XxbZSbfz4AJ+5eiWDOTQ6sVDnXCLD6o= Received: by 10.86.70.8 with SMTP id s8mr1434641fga.50.1216807287377; Wed, 23 Jul 2008 03:01:27 -0700 (PDT) Received: by 10.86.79.5 with HTTP; Wed, 23 Jul 2008 03:01:27 -0700 (PDT) Message-ID: Date: Wed, 23 Jul 2008 12:01:27 +0200 From: "Claus Guttesen" To: "Pawel Jakub Dawidek" In-Reply-To: <20080723082401.GC3603@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <200807221128.27592.lists@jnielsen.net> 
<20080723082401.GC3603@garage.freebsd.pl> Cc: John Nielsen , current@freebsd.org, fs@freebsd.org Subject: Re: NFS writes and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 10:27:45 -0000 >> Today I discovered that the problem seems to go away if I change the NFS >> mount options on the desktop. The following is a summary/timeline of what >> I've tried: >> >> 7-STABLE client, no NFS options (defaults); 7-STABLE server, UFS; works >> 7-STABLE client, no NFS options (defaults); 7-STABLE server, ZFS; broken >> 7-STABLE client, no NFS options (defaults); 8-CURRENT server, ZFS; broken >> 7-STABLE client, tcp,nfsv3,-r32768,-w32768; 8-CURRENT server, ZFS, works > > Do you need all the options here? If not, could you try to find the > smallest subset of options that are needed to make ZFS work? Maybe > 'nfsv3' is all that is needed, or 'tcp' alone fixes it? At work we use > many NFS exported ZFS file systems, mostly accessed from MacOS X and > we see no problems. Good to hear. I've just started testing a setup with an areca arc-1680 sas card and an external sas-cabinet. It currently has a zpool with ten 1 TB-drives in raidz2. It may grow to a two-digit TB system if the testing goes fine. It will nfs-share the partitions to my FreeBSD webservers. I'm using nfs v.3 and tcp with a read- and write-size at 32768. -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. 
Shakespeare From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 10:34:31 2008 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 55EE4106567B for ; Wed, 23 Jul 2008 10:34:31 +0000 (UTC) (envelope-from jh@saunalahti.fi) Received: from emh07.mail.saunalahti.fi (emh07.mail.saunalahti.fi [62.142.5.117]) by mx1.freebsd.org (Postfix) with ESMTP id D2C058FC0A for ; Wed, 23 Jul 2008 10:34:30 +0000 (UTC) (envelope-from jh@saunalahti.fi) Received: from saunalahti-vams (vs3-12.mail.saunalahti.fi [62.142.5.96]) by emh07-2.mail.saunalahti.fi (Postfix) with SMTP id 9D44961A49; Wed, 23 Jul 2008 13:34:29 +0300 (EEST) Received: from emh04.mail.saunalahti.fi ([62.142.5.110]) by vs3-12.mail.saunalahti.fi ([62.142.5.96]) with SMTP (gateway) id A06BFC935DF; Wed, 23 Jul 2008 13:34:29 +0300 Received: from a91-153-120-204.elisa-laajakaista.fi (a91-153-120-204.elisa-laajakaista.fi [91.153.120.204]) by emh04.mail.saunalahti.fi (Postfix) with SMTP id 3827C41C9B; Wed, 23 Jul 2008 13:34:25 +0300 (EEST) Date: Wed, 23 Jul 2008 13:34:25 +0300 From: Jaakko Heinonen To: Bruce Evans Message-ID: <20080723103424.GA1856@a91-153-120-204.elisa-laajakaista.fi> References: <200806020800.m528038T072838@freefall.freebsd.org> <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi> <20080722215249.K17453@delplex.bde.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080722215249.K17453@delplex.bde.org> User-Agent: Mutt/1.5.18 (2008-05-17) X-Antivirus: VAMS Cc: freebsd-fs@FreeBSD.org, ighighi@gmail.com Subject: Re: birthtime initialization X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 10:34:31 -0000 On 2008-07-22, Bruce Evans wrote: > > + VATTR_NULL(vap); > > I want to 
> initialize va_birthtime to { -1, 0 } here only. Don't
> initialize the whole vattr here. VOP_GETTATR() is supposed to initalize
> everything, but doesn't for va_birthtime. If there any other fields
> that VOP_GETTATR() doesn't initialize, then these should be searched
> for and fixed instead of setting them to the garbage value given by
> vattr_null.

At least xfs gets it wrong for several fields.

	/*
	 * Fields with no direct equivalent in XFS
	 * leave initialized by VATTR_NULL
	 */
#if 0
	vap->va_filerev = 0;
	vap->va_birthtime = va.va_ctime;
	vap->va_vaflags = 0;
	vap->va_flags = 0;
	vap->va_spare = 0;
#endif

> > Index: sys/ufs/ufs/ufs_vnops.c
> > ...
> > Index: sys/fs/msdosfs/msdosfs_vnops.c
> > ...
> > Index: sys/nfsclient/nfs_subs.c
>
> There are a probably more file systems that have missing or slightly
> incorrect (all zero) settings of va_birthtime.

Many file systems fail to set va_birthtime. That's the reason why I initialized it in vn_stat(). I have seen four types of initialization:

1) Support and set birthtime.
   (UFS2, tmpfs, msdosfs (not all variants of msdosfs support birthtime), nfs4?)
2) Set birthtime to zero.
   (UFS1, nfs (nfs zeroes the vattr structure))
3) Initialize vattr with VATTR_NULL() but don't set birthtime explicitly. Thus tv_sec and tv_nsec are set to -1 (VNOVAL).
   (devfs, xfs, portalfs, pseudofs)
4) Don't initialize birthtime at all. Those would be fixed by initializing the birthtime in vn_stat().
   (cd9660, hpfs, ntfs, smbfs, udf, ext2fs, reiserfs) I couldn't test it, but I suspect that coda also belongs to this group.

So I see two ways to fix this:
- initialize birthtime in vn_stat() and add/fix the explicit setting for group 2 and 3 file systems, or
- add explicit initialization to all file systems missing it (groups 3 and 4) and fix group 2 to initialize birthtime to the correct value

> I wouldn't like VNOVAL being replaced by VNOTIMESPECVAL, VNOUIDVAL,
> ... etc.

I agree with this.

I have updated the patch per your comments and checked more file systems.
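
For readers following the { -1, 0 } convention discussed here, a small userland C sketch of how the different initialization styles could be told apart. This is only an illustration and not part of the patch: the function and enum names are hypothetical, and it uses a plain struct timespec rather than the kernel's struct vattr.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical labels for the values a caller might see in a birthtime. */
enum birthtime_state {
	BT_UNSUPPORTED,	/* { -1, 0 }: the "not supported" convention */
	BT_VNOVAL,	/* { -1, -1 }: left over from VATTR_NULL() */
	BT_ZERO,	/* { 0, 0 }: all-zero setting */
	BT_REAL		/* anything else: treated as a real birthtime */
};

/* Hypothetical stand-in for the pre-initialization done before VOP_GETATTR(). */
void birthtime_set_unsupported(struct timespec *bt)
{
	assert(bt != NULL);
	bt->tv_sec = -1;
	bt->tv_nsec = 0;
}

/* Classify a birthtime value according to the cases listed in the mail. */
enum birthtime_state birthtime_classify(const struct timespec *bt)
{
	assert(bt != NULL);
	if (bt->tv_sec == -1 && bt->tv_nsec == 0)
		return BT_UNSUPPORTED;
	if (bt->tv_sec == -1 && bt->tv_nsec == -1)
		return BT_VNOVAL;
	if (bt->tv_sec == 0 && bt->tv_nsec == 0)
		return BT_ZERO;	/* cannot be told apart from a real epoch timestamp */
	return BT_REAL;
}
```

The sketch shows why the all-zero and VNOVAL settings are awkward for callers: only { -1, 0 } is an unambiguous "not supported" marker under the convention proposed in this thread.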
I have verified that with this patch these file systems return correct birthtime values (real birthtime or {-1, 0} if not supported): UFS2, UFS1, cd9660, nfs, ext2fs, smbfs, reiserfs, xfs, ntfs, devfs, procfs, linprocfs, tmpfs, msdosfs, portalfs, udf.

For pseudofs I set birthtime to current time.

%%%
Index: sys/kern/vfs_vnops.c
===================================================================
--- sys/kern/vfs_vnops.c	(revision 180729)
+++ sys/kern/vfs_vnops.c	(working copy)
@@ -703,6 +703,13 @@ vn_stat(vp, sb, active_cred, file_cred,
 #endif
 
 	vap = &vattr;
+
+	/*
+	 * Not all file systems initialize birthtime.
+	 */
+	vap->va_birthtime.tv_sec = -1;
+	vap->va_birthtime.tv_nsec = 0;
+
 	error = VOP_GETATTR(vp, vap, active_cred, td);
 	if (error)
 		return (error);
Index: sys/ufs/ufs/ufs_vnops.c
===================================================================
--- sys/ufs/ufs/ufs_vnops.c	(revision 180729)
+++ sys/ufs/ufs/ufs_vnops.c	(working copy)
@@ -410,7 +410,7 @@ ufs_getattr(ap)
 		vap->va_mtime.tv_nsec = ip->i_din1->di_mtimensec;
 		vap->va_ctime.tv_sec = ip->i_din1->di_ctime;
 		vap->va_ctime.tv_nsec = ip->i_din1->di_ctimensec;
-		vap->va_birthtime.tv_sec = 0;
+		vap->va_birthtime.tv_sec = -1;
 		vap->va_birthtime.tv_nsec = 0;
 		vap->va_bytes = dbtob((u_quad_t)ip->i_din1->di_blocks);
 	} else {
Index: sys/nfsclient/nfs_subs.c
===================================================================
--- sys/nfsclient/nfs_subs.c	(revision 180729)
+++ sys/nfsclient/nfs_subs.c	(working copy)
@@ -628,6 +628,8 @@ nfs_loadattrcache(struct vnode **vpp, st
 	vap->va_rdev = rdev;
 	mtime_save = vap->va_mtime;
 	vap->va_mtime = mtime;
+	vap->va_birthtime.tv_sec = -1;
+	vap->va_birthtime.tv_nsec = 0;
 	vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
 	if (v3) {
 		vap->va_nlink = fxdr_unsigned(u_short, fp->fa_nlink);
Index: sys/fs/pseudofs/pseudofs_vnops.c
===================================================================
--- sys/fs/pseudofs/pseudofs_vnops.c	(revision 180729)
+++ sys/fs/pseudofs/pseudofs_vnops.c	(working copy)
@@ -200,7 +200,7 @@ pfs_getattr(struct vop_getattr_args *va
 	vap->va_fsid = vn->v_mount->mnt_stat.f_fsid.val[0];
 	vap->va_nlink = 1;
 	nanotime(&vap->va_ctime);
-	vap->va_atime = vap->va_mtime = vap->va_ctime;
+	vap->va_atime = vap->va_mtime = vap->va_birthtime = vap->va_ctime;
 
 	switch (pn->pn_type) {
 	case pfstype_procdir:
Index: sys/fs/portalfs/portal_vnops.c
===================================================================
--- sys/fs/portalfs/portal_vnops.c	(revision 180729)
+++ sys/fs/portalfs/portal_vnops.c	(working copy)
@@ -462,6 +462,8 @@ portal_getattr(ap)
 	nanotime(&vap->va_atime);
 	vap->va_mtime = vap->va_atime;
 	vap->va_ctime = vap->va_mtime;
+	vap->va_birthtime.tv_sec = -1;
+	vap->va_birthtime.tv_nsec = 0;
 	vap->va_gen = 0;
 	vap->va_flags = 0;
 	vap->va_rdev = 0;
Index: sys/fs/devfs/devfs_vnops.c
===================================================================
--- sys/fs/devfs/devfs_vnops.c	(revision 180729)
+++ sys/fs/devfs/devfs_vnops.c	(working copy)
@@ -543,6 +543,8 @@ devfs_getattr(struct vop_getattr_args *a
 
 		vap->va_rdev = cdev2priv(dev)->cdp_inode;
 	}
+	vap->va_birthtime.tv_sec = -1;
+	vap->va_birthtime.tv_nsec = 0;
 	vap->va_gen = 0;
 	vap->va_flags = 0;
 	vap->va_nlink = de->de_links;
Index: sys/gnu/fs/xfs/FreeBSD/xfs_vnops.c
===================================================================
--- sys/gnu/fs/xfs/FreeBSD/xfs_vnops.c	(revision 180729)
+++ sys/gnu/fs/xfs/FreeBSD/xfs_vnops.c	(working copy)
@@ -263,6 +263,8 @@ _xfs_getattr(
 	vap->va_atime = va.va_atime;
 	vap->va_mtime = va.va_mtime;
 	vap->va_ctime = va.va_ctime;
+	vap->va_birthtime.tv_sec = -1;
+	vap->va_birthtime.tv_nsec = 0;
 	vap->va_gen = va.va_gen;
 	vap->va_rdev = va.va_rdev;
 	vap->va_bytes = (va.va_nblocks << BBSHIFT);
%%%

-- 
Jaakko

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 15:42:54 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B8E8E1065678 for ; Wed, 23
Jul 2008 15:42:54 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail01.syd.optusnet.com.au (mail01.syd.optusnet.com.au [211.29.132.182]) by mx1.freebsd.org (Postfix) with ESMTP id 50F208FC0C for ; Wed, 23 Jul 2008 15:42:53 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c220-239-252-11.carlnfd3.nsw.optusnet.com.au [220.239.252.11]) by mail01.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id m6NFgpMJ023245 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 24 Jul 2008 01:42:51 +1000 Date: Thu, 24 Jul 2008 01:42:51 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Jaakko Heinonen In-Reply-To: <20080723103424.GA1856@a91-153-120-204.elisa-laajakaista.fi> Message-ID: <20080724000618.Q16961@besplex.bde.org> References: <200806020800.m528038T072838@freefall.freebsd.org> <20080722075718.GA1881@a91-153-120-204.elisa-laajakaista.fi> <20080722215249.K17453@delplex.bde.org> <20080723103424.GA1856@a91-153-120-204.elisa-laajakaista.fi> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org, ighighi@gmail.com Subject: Re: birthtime initialization X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jul 2008 15:42:54 -0000 On Wed, 23 Jul 2008, Jaakko Heinonen wrote: > On 2008-07-22, Bruce Evans wrote: >>> + VATTR_NULL(vap); >> >> I want to initialize va_birthtime to { -1, 0 } here only. Don't >> initialize the whole vattr here. VOP_GETTATR() is supposed to initalize >> everything, but doesn't for va_birthtime. If there any other fields >> that VOP_GETTATR() doesn't initialize, then these should be searched >> for and fixed instead of setting them to the garbage value given by >> vattr_null. > > At least xfs gets it wrong for several fields. 
>
> 	/*
> 	 * Fields with no direct equivalent in XFS
> 	 * leave initialized by VATTR_NULL
> 	 */
> #if 0
> 	vap->va_filerev = 0;
> 	vap->va_birthtime = va.va_ctime;
> 	vap->va_vaflags = 0;
> 	vap->va_flags = 0;
> 	vap->va_spare = 0;
> #endif

That's amazingly bad.

First, the fields shouldn't be initialized using VATTR_NULL() in
VOP_GETATTR().  Doing so breaks the preinitialization that we want
to add (maybe also layering).  This bug seems to be present in only the
following file systems:
     fdescfs, mqfs, pseudofs, tmpfs, xfs
The uninitialized fields should give stack garbage.

Second, VNOVAL is an extremely bogus default value.  For va_flags, it gives
all flags set, so ls -lo output would be weird (and wrong since the
flags aren't actually there).

Third, va_vaflags and va_spare aren't part of the VOP_GETATTR() API.
va_vaflags is an input parameter for VOP_SETATTR().  va_spare is just
spare (unused).  VATTR_NULL() initializes va_vaflags to 0, not VNOVAL
(as is required for the usual case in VOP_SETATTR()), and it knows
better than to initialize unused fields (it also doesn't initialize
unnamed padding -- stack garbage in this is OK since vattr is never
copied directly to userland).

After deleting the bogus initializations, we're left with va_filerev,
va_birthtime and va_flags.  Most file systems don't support these, so
they could usefully all be handled by defaulting them as in the proposed
changes for va_birthtime.

>>> Index: sys/ufs/ufs/ufs_vnops.c
>>> ...
>>> Index: sys/fs/msdosfs/msdosfs_vnops.c
>>> ...
>>> Index: sys/nfsclient/nfs_subs.c
>>
>> There are probably more file systems that have missing or slightly
>> incorrect (all zero) settings of va_birthtime.
>
> Many file systems miss settings of va_birthtime. That's the reason why
> I initialized it in vn_stat(). I have seen four types of
> initializations:
>
> 1) Support and set birthtime. (UFS2, tmpfs, msdosfs (not all
>    variants of msdosfs support birthtime), nfs4?)
>
> 2) Set birthtime to zero.
>    (UFS1, nfs (nfs zeroes the vattr structure))

Zeroing it is almost as bad as VATTR_NULL()ing it.

> 3) Initialize vattr with VATTR_NULL() but not birthtime explicitly. Thus
>    tv_sec and tv_nsec are set to -1 (VNOVAL). (devfs, xfs, portalfs,
>    pseudofs)
>
> 4) Not initialize birthtime at all. Those would be fixed by initializing the
>    birthtime in vn_stat(). (cd9660, hpfs, ntfs, smbfs, udf, ext2fs,
>    reiserfs)
>    I couldn't test but I suspect that also coda belongs to this group.
>
> So I see two ways to fix:
>
> - initialize birthtime in vn_stat() and add/fix explicit setting for group 2
>   and 3 file systems or
> - add explicit initialization to all file systems missing it
>   (groups 3 and 4) and fix group 2 to initialize birthtime to correct value

(3) and (4) are only different due to bugs.  I want to initialize
va_birthtime and maybe a couple of other fields in vn_stat(), and depend
on this and not initialize to the same or a worse value in case (3).
This requires removing VATTR_NULL() or zeroing in some cases and
checking that everything is still initialized.  All old fields should
be handled by explicit initialization as in ffs1, and all new fields
should have defaults.

> I have updated the patch per your comments and checked more file
> systems. I have verified that with this patch these file systems return
> correct birthtime values (real birthtime or {-1, 0} if not supported):
>
> UFS2, UFS1, cd9660, nfs, ext2fs, smbfs, reiserfs, xfs, ntfs, devfs,
> procfs, linprocfs, tmpfs, msdosfs, portalfs, udf.

I don't want the case (3).  Otherwise good.

>
> For pseudofs I set birthtime to current time.

I don't like this.  birthtime should be <= all other file times.  If a
file system doesn't support birthtime, then it could also set birthtime
= mtime, but that isn't useful either.  Better set it to -1 as in ffs1
(except ffs1 set it to 0).
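
A minimal userland sketch of the preset-then-call idea discussed above.
Everything in it is a hypothetical stand-in: struct model_vattr for struct
vattr, the model_getattr_* functions for a file system's VOP_GETATTR()
implementation, and model_stat() for vn_stat().  It only illustrates that a
file system which never touches birthtime ends up reporting { -1, 0 }, while
one that supports birthtime overrides the default.

```c
#include <time.h>

/* Hypothetical stand-in for the kernel's struct vattr (two times only). */
struct model_vattr {
	struct timespec va_mtime;
	struct timespec va_birthtime;
};

/* Hypothetical per-file-system getattr hook; returns 0 on success. */
typedef int (*model_getattr_fn)(struct model_vattr *);

/* File system without birthtime support: never touches va_birthtime. */
int
model_getattr_nobirth(struct model_vattr *vap)
{
	vap->va_mtime.tv_sec = 1000;
	vap->va_mtime.tv_nsec = 0;
	return (0);
}

/* File system with birthtime support: overrides the caller's default. */
int
model_getattr_birth(struct model_vattr *vap)
{
	vap->va_mtime.tv_sec = 1000;
	vap->va_mtime.tv_nsec = 0;
	vap->va_birthtime.tv_sec = 500;
	vap->va_birthtime.tv_nsec = 0;
	return (0);
}

/* Models the proposed vn_stat() change: preset { -1, 0 }, then ask the fs. */
int
model_stat(model_getattr_fn getattr, struct model_vattr *vap)
{
	vap->va_birthtime.tv_sec = -1;
	vap->va_birthtime.tv_nsec = 0;
	return (getattr(vap));
}

/* Convenience helper: birthtime seconds as seen by the caller. */
long
model_birth_sec(model_getattr_fn getattr)
{
	struct model_vattr va;

	if (model_stat(getattr, &va) != 0)
		return (-2);	/* arbitrary marker for "getattr failed" */
	return ((long)va.va_birthtime.tv_sec);
}
```

In this model, model_birth_sec(model_getattr_nobirth) yields the -1 default
and model_birth_sec(model_getattr_birth) yields the value the file system
supplied, which also stays <= its mtime.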
>
> %%%
> Index: sys/kern/vfs_vnops.c
> ===================================================================
> --- sys/kern/vfs_vnops.c	(revision 180729)
> +++ sys/kern/vfs_vnops.c	(working copy)
> @@ -703,6 +703,13 @@ vn_stat(vp, sb, active_cred, file_cred,
>  #endif
>
>  	vap = &vattr;
> +
> +	/*
> +	 * Not all file systems initialize birthtime.
> +	 */

Change to something like:

 	/*
 	 * Initialize defaults for new and/or unusual fields, so that file
 	 * systems which don't support these fields don't need to know
 	 * about them.
 	 */

> +	vap->va_birthtime.tv_sec = -1;
> +	vap->va_birthtime.tv_nsec = 0;
> +
>  	error = VOP_GETATTR(vp, vap, active_cred, td);
>  	if (error)
>  		return (error);
> Index: sys/ufs/ufs/ufs_vnops.c
> ===================================================================
> --- sys/ufs/ufs/ufs_vnops.c	(revision 180729)
> +++ sys/ufs/ufs/ufs_vnops.c	(working copy)
> @@ -410,7 +410,7 @@ ufs_getattr(ap)
>  		vap->va_mtime.tv_nsec = ip->i_din1->di_mtimensec;
>  		vap->va_ctime.tv_sec = ip->i_din1->di_ctime;
>  		vap->va_ctime.tv_nsec = ip->i_din1->di_ctimensec;
> -		vap->va_birthtime.tv_sec = 0;
> +		vap->va_birthtime.tv_sec = -1;
>  		vap->va_birthtime.tv_nsec = 0;
>  		vap->va_bytes = dbtob((u_quad_t)ip->i_din1->di_blocks);
>  	} else {

Can just delete birthtime references here.  Unless I've missed a bzero.

> Index: sys/nfsclient/nfs_subs.c
> ===================================================================
> --- sys/nfsclient/nfs_subs.c	(revision 180729)
> +++ sys/nfsclient/nfs_subs.c	(working copy)
> @@ -628,6 +628,8 @@ nfs_loadattrcache(struct vnode **vpp, st
>  	vap->va_rdev = rdev;
>  	mtime_save = vap->va_mtime;
>  	vap->va_mtime = mtime;
> +	vap->va_birthtime.tv_sec = -1;
> +	vap->va_birthtime.tv_nsec = 0;
>  	vap->va_fsid = vp->v_mount->mnt_stat.f_fsid.val[0];
>  	if (v3) {
>  		vap->va_nlink = fxdr_unsigned(u_short, fp->fa_nlink);

Need to remove the zeroing and check other fields before defaulting
birthtime here.
> Index: sys/fs/pseudofs/pseudofs_vnops.c
> ===================================================================
> --- sys/fs/pseudofs/pseudofs_vnops.c	(revision 180729)
> +++ sys/fs/pseudofs/pseudofs_vnops.c	(working copy)
> @@ -200,7 +200,7 @@ pfs_getattr(struct vop_getattr_args *va)
>  	vap->va_fsid = vn->v_mount->mnt_stat.f_fsid.val[0];
>  	vap->va_nlink = 1;
>  	nanotime(&vap->va_ctime);
> -	vap->va_atime = vap->va_mtime = vap->va_ctime;
> +	vap->va_atime = vap->va_mtime = vap->va_birthtime = vap->va_ctime;
>
>  	switch (pn->pn_type) {
>  	case pfstype_procdir:

I don't understand why it doesn't have _any_ persistent storage for times.

> Index: sys/fs/portalfs/portal_vnops.c
> ===================================================================
> --- sys/fs/portalfs/portal_vnops.c	(revision 180729)
> +++ sys/fs/portalfs/portal_vnops.c	(working copy)
> @@ -462,6 +462,8 @@ portal_getattr(ap)
>  	nanotime(&vap->va_atime);
>  	vap->va_mtime = vap->va_atime;
>  	vap->va_ctime = vap->va_mtime;
> +	vap->va_birthtime.tv_sec = -1;
> +	vap->va_birthtime.tv_nsec = 0;
>  	vap->va_gen = 0;
>  	vap->va_flags = 0;
>  	vap->va_rdev = 0;

This uses both bzero() and vattr_null().  Oops, I only grepped for use of
VATTR_NULL() when I looked for bogus initializations above.  VATTR_NULL()
is the public interface and vattr_null() is an implementation detail.
Add the following file systems to the list of file systems with bogusly
initialized vattr's in VOP_GETATTR():
     devfs, portalfs
These both misuse bzero() and vattr_null().  There are no other misuses
of vattr_null().
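
Why a VATTR_NULL()-style wipe inside a getattr routine defeats a default set
by the caller can be shown with a small userland model.  This is only a
sketch: MODEL_VNOVAL, struct model_vattr, model_vattr_null() and the
model_getattr_* functions are hypothetical stand-ins, not the kernel's
definitions.  The wiping variant hands back { -1, -1 } for birthtime (the
VNOVAL case (3) from the thread) instead of the caller's { -1, 0 }.

```c
#include <time.h>

#define MODEL_VNOVAL	(-1)	/* the thread states VNOVAL = -1 */

/* Hypothetical stand-in for struct vattr (two times only). */
struct model_vattr {
	struct timespec va_ctime;
	struct timespec va_birthtime;
};

/* Hypothetical analogue of VATTR_NULL(): every field becomes VNOVAL. */
void
model_vattr_null(struct model_vattr *vap)
{
	vap->va_ctime.tv_sec = MODEL_VNOVAL;
	vap->va_ctime.tv_nsec = MODEL_VNOVAL;
	vap->va_birthtime.tv_sec = MODEL_VNOVAL;
	vap->va_birthtime.tv_nsec = MODEL_VNOVAL;
}

/* getattr that wipes the structure first, then fills what it supports. */
int
model_getattr_wiping(struct model_vattr *vap)
{
	model_vattr_null(vap);
	vap->va_ctime.tv_sec = 1000;
	vap->va_ctime.tv_nsec = 0;
	return (0);
}

/* getattr that only assigns the fields it supports. */
int
model_getattr_plain(struct model_vattr *vap)
{
	vap->va_ctime.tv_sec = 1000;
	vap->va_ctime.tv_nsec = 0;
	return (0);
}

/* Preset birthtime to { -1, 0 }, call getattr, report the tv_nsec seen. */
long
model_birth_nsec_after(int (*getattr)(struct model_vattr *))
{
	struct model_vattr va;

	va.va_birthtime.tv_sec = -1;
	va.va_birthtime.tv_nsec = 0;
	(void)getattr(&va);
	return ((long)va.va_birthtime.tv_nsec);
}
```

With the plain variant the caller's tv_nsec of 0 survives; with the wiping
variant it comes back as -1, so the preinitialization is lost.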
> Index: sys/fs/devfs/devfs_vnops.c
> ===================================================================
> --- sys/fs/devfs/devfs_vnops.c	(revision 180729)
> +++ sys/fs/devfs/devfs_vnops.c	(working copy)
> @@ -543,6 +543,8 @@ devfs_getattr(struct vop_getattr_args *a
>
>  		vap->va_rdev = cdev2priv(dev)->cdp_inode;
>  	}
> +	vap->va_birthtime.tv_sec = -1;
> +	vap->va_birthtime.tv_nsec = 0;
>  	vap->va_gen = 0;
>  	vap->va_flags = 0;
>  	vap->va_nlink = de->de_links;

See above.

> Index: sys/gnu/fs/xfs/FreeBSD/xfs_vnops.c
> ===================================================================
> --- sys/gnu/fs/xfs/FreeBSD/xfs_vnops.c	(revision 180729)
> +++ sys/gnu/fs/xfs/FreeBSD/xfs_vnops.c	(working copy)
> @@ -263,6 +263,8 @@ _xfs_getattr(
>  	vap->va_atime = va.va_atime;
>  	vap->va_mtime = va.va_mtime;
>  	vap->va_ctime = va.va_ctime;
> +	vap->va_birthtime.tv_sec = -1;
> +	vap->va_birthtime.tv_nsec = 0;
>  	vap->va_gen = va.va_gen;
>  	vap->va_rdev = va.va_rdev;
>  	vap->va_bytes = (va.va_nblocks << BBSHIFT);

See above (need to do something about the VATTR_NULL()).

> %%%

Grepping for va_.*flags in only sys/fs/ shows the following problems in
VOP_SETATTR():
- coda: sets va_vaflags in a macro but never uses va_vaflags (needed for
   layering?)
- ntfs: sets va_flags to ip->i_flag -- nonsense -- i_flag is an internal flag
   that has nothing to do with va_flags
- nwfs: sets va_vaflags in nwfs_attr_cacheenter(), but I think nothing uses
   this setting.
- smbfs: sets va_vaflags in smbfs_attrcachelookup() ...
- tmpfs: sets va_vaflags and also va_spare.

and the following non-problems:
- all except msdosfs set va_flags to 0, so defaulting va_flags to 0 and
   deleting most settings of it would work well.
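
The earlier remark that VNOVAL "gives all flags set" when left in va_flags,
and the suggestion to default va_flags to 0 instead, can be checked with a
small userland sketch.  MODEL_VNOVAL, the MODEL_FLAG_* bits and the 32-bit
flag width are assumptions made for the illustration, not the kernel's
definitions.

```c
#include <stdint.h>

#define MODEL_VNOVAL		(-1)		/* the thread states VNOVAL = -1 */
#define MODEL_FLAG_NODUMP	0x00000001u	/* hypothetical flag bit */
#define MODEL_FLAG_IMMUTABLE	0x00000002u	/* hypothetical flag bit */

/* Count how many bits are set in a va_flags-like 32-bit value. */
int
model_count_flags(uint32_t flags)
{
	int n = 0;

	while (flags != 0) {
		n += (int)(flags & 1u);
		flags >>= 1;
	}
	return (n);
}

/* Nonzero if the given flag bit appears to be set. */
int
model_has_flag(uint32_t flags, uint32_t bit)
{
	return ((flags & bit) != 0);
}
```

Converting MODEL_VNOVAL to the unsigned flag type sets all 32 bits, so every
flag (including the hypothetical immutable bit) looks set, whereas a default
of 0 reports no flags at all.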

Bruce

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 17:23:21 2008
From: John Nielsen
To: freebsd-current@freebsd.org, ticso@cicely.de
Date: Wed, 23 Jul 2008 13:23:37 -0400
In-Reply-To: <20080723090450.GV58113@cicely7.cicely.de>
Message-Id: <200807231323.37358.lists@jnielsen.net>
Cc: Pawel Jakub Dawidek, current@freebsd.org, fs@freebsd.org
Subject: Re: NFS writes and ZFS

On Wednesday 23 July 2008, Bernd Walter wrote:
> On Wed, Jul 23, 2008 at 10:24:01AM +0200, Pawel Jakub Dawidek wrote:
> > On Tue, Jul 22, 2008 at 11:28:27AM -0400, John Nielsen wrote:
> > > I have a FreeBSD server (which I use as a NAS device, among other
> > > things) and a FreeBSD desktop. The desktop is running 7-STABLE from a
> > > couple days ago and the server is running 8-CURRENT from yesterday.
> > > The server has several NFS-exported ZFS'es which I mount from the
> > > desktop. Since moving the shares to ZFS I've been having trouble
> > > writing to them from the desktop--the mount hangs after the first or
> > > second attempt. This is similar if not identical to what's described
> > > in the thread
> > > (from -current) I partially copied below.
> > >
> > > Today I discovered that the problem seems to go away if I change the
> > > NFS mount options on the desktop. The following is a summary/timeline
> > > of what I've tried:
> > >
> > > 7-STABLE client, no NFS options (defaults); 7-STABLE server, UFS; works
> > > 7-STABLE client, no NFS options (defaults); 7-STABLE server, ZFS; broken
> > > 7-STABLE client, no NFS options (defaults); 8-CURRENT server, ZFS; broken
> > > 7-STABLE client, tcp,nfsv3,-r32768,-w32768; 8-CURRENT server, ZFS; works
> >
> > Do you need all the options here? If not, could you try to find the
> > smallest subset of options that are needed to make ZFS work? Maybe
> > 'nfsv3' is all that is needed, or 'tcp' alone fixes it? At work we use
> > many NFS exported ZFS file systems, mostly accessed from MacOS X and
> > we see no problems.
>
> Whenever changing NFS transport options has an influence on reliability
> my first task is to verify the network.
> Especially there were often hardware problems with some NIC lately,
> of which some have worked around in the drivers and some not.
> Disabling TSO and checksum offloading typically helps.
> This kind of problem is typical on both the client and server, but also
> on routers.
> Of course network problems can also be on any cable, switch in between
> as well, but are less typical to produce complete NFS hangs.

A good strategy I'm sure.
However in this case the whole network is within
arm's reach, the switch and cables are brand new and I haven't had any
other issues that would point to a network fault. Further, I saw the exact
same behavior on a completely different set of hardware around the time of
7-BETA. In both cases the NFS shares worked fine prior to my moving the
shared ports tree to ZFS.

PJD- I'll try to narrow the options needed this afternoon or tomorrow and
let you know what I find.

JN

From owner-freebsd-fs@FreeBSD.ORG Wed Jul 23 18:02:40 2008
From: John Nielsen
To: freebsd-current@freebsd.org
Date: Wed, 23 Jul 2008 14:02:59 -0400
In-Reply-To: <20080723082401.GC3603@garage.freebsd.pl>
Message-Id: <200807231402.59893.lists@jnielsen.net>
Cc: Pawel Jakub Dawidek, fs@freebsd.org
Subject: Re: NFS writes and ZFS

On Wednesday 23 July 2008, Pawel Jakub Dawidek wrote:
> On Tue, Jul 22, 2008 at 11:28:27AM -0400, John Nielsen wrote:
> > I have a FreeBSD server (which I use as a NAS device, among other
> > things) and a FreeBSD desktop. The desktop is running 7-STABLE from a
> > couple days ago and the server is running 8-CURRENT from yesterday. The
> > server has several NFS-exported ZFS'es which I mount from the desktop.
> > Since moving the shares to ZFS I've been having trouble writing to them
> > from the desktop--the mount hangs after the first or second attempt.
> > This is similar if not identical to what's described in the thread
> > (from -current) I partially copied below.
> >
> > Today I discovered that the problem seems to go away if I change the
> > NFS mount options on the desktop. The following is a summary/timeline
> > of what I've tried:
> >
> > 7-STABLE client, no NFS options (defaults); 7-STABLE server, UFS; works
> > 7-STABLE client, no NFS options (defaults); 7-STABLE server, ZFS; broken
> > 7-STABLE client, no NFS options (defaults); 8-CURRENT server, ZFS; broken
> > 7-STABLE client, tcp,nfsv3,-r32768,-w32768; 8-CURRENT server, ZFS; works
>
> Do you need all the options here? If not, could you try to find the
> smallest subset of options that are needed to make ZFS work? Maybe
> 'nfsv3' is all that is needed, or 'tcp' alone fixes it? At work we use
> many NFS exported ZFS file systems, mostly accessed from MacOS X and
> we see no problems.

No. "tcp" alone fixes it. That's not too surprising since nfsv3 should be a
no-op. With everything _but_ "tcp" it took only slightly longer to hang the
mount (not scientifically measured).
With the default NFS mount mode changed to TCP in -CURRENT the workaround
is already in place for FreeBSD clients, and the issue apparently never
popped up on other clients--there are a few people (yourself included) who
say they've never had a problem with Mac OS, for example. I haven't come
across reports either way about Solaris or Linux. Are we the last ones to
use UDP by default?

Anyway, I hope this is helpful. Let me know if I should file a PR or
anything.

Thanks,

JN

From owner-freebsd-fs@FreeBSD.ORG Fri Jul 25 07:23:19 2008
Date: Fri, 25 Jul 2008 10:23:15 +0300
From: Jaakko Heinonen
To: Bruce Evans
Message-ID: <20080725072314.GA807@a91-153-120-204.elisa-laajakaista.fi>
In-Reply-To: <20080724000618.Q16961@besplex.bde.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: birthtime initialization

On 2008-07-24, Bruce Evans wrote:
> First, the fields shouldn't be initialized using VATTR_NULL() in
> VOP_GETATTR().

> Second, VNOVAL is an extremely bogus default value.

Except for va_fsid because there's this check in vn_stat():

	if (vap->va_fsid != VNOVAL)
		sb->st_dev = vap->va_fsid;
	else
		sb->st_dev = vp->v_mount->mnt_stat.f_fsid.val[0];

What do you think is a proper default value for va_rdev? Some file
systems set it to 0 and some to VNOVAL.

> After deleting the bogus initializations, we're left with va_filerev,
> va_birthtime and va_flags.  Most file systems don't support these, so
> they could usefully all be handled by defaulting them as in the proposed
> changes for va_birthtime.

Unfortunately moving initializations to vn_stat() breaks things. For
example vm_mmap_vnode() uses VOP_GETATTR() to determine which file flags
are set. Thus moving va_flags initialization to vn_stat() breaks
mmap. In theory this could be a potential problem for birthtime too.

> > 3) Initialize vattr with VATTR_NULL() but not birthtime explicitly. Thus
> >    tv_sec and tv_nsec are set to -1 (VNOVAL). (devfs, xfs, portalfs,
> >    pseudofs)
>
> I don't want the case (3).  Otherwise good.

Thank you for your valuable comments. I will try to update the patch.

-- 
Jaakko

From owner-freebsd-fs@FreeBSD.ORG Fri Jul 25 10:00:19 2008
Date: Fri, 25 Jul 2008 20:00:01 +1000 (EST)
From: Bruce Evans
To: Jaakko Heinonen
In-Reply-To: <20080725072314.GA807@a91-153-120-204.elisa-laajakaista.fi>
Message-ID: <20080725192315.D27178@delplex.bde.org>
Cc: freebsd-fs@freebsd.org
Subject: Re: birthtime initialization

On Fri, 25 Jul 2008, Jaakko Heinonen wrote:

> On 2008-07-24, Bruce Evans wrote:
>> First, the fields shouldn't be initialized using VATTR_NULL() in
>> VOP_GETATTR().
>
>> Second, VNOVAL is an extremely bogus default value.
>
> Except for va_fsid because there's this check in vn_stat():
>
> 	if (vap->va_fsid != VNOVAL)
> 		sb->st_dev = vap->va_fsid;
> 	else
> 		sb->st_dev = vp->v_mount->mnt_stat.f_fsid.val[0];

Hmm, this is remarkably broken too.  In VOP_GETATTR() for file systems
under sys/fs:
- the following file systems set va_fsid to dev2udev() (and thus defeat the
   better default above):
     cd9660, hpfs, msdosfs, ntfs, udf, unionfs
- the following file systems don't seem to set va_fsid (and thus set st_dev
   to stack garbage)
- the following file systems set va_fsid to VNOVAL via VATTR_NULL():
     fdescfs
- the following file systems set va_fsid to VNOVAL via vattr_null():
     devfs, portalfs
- the following file systems set va_fsid to VNOVAL via obscure means:
     coda (?)
- the following file systems set va_fsid to mnt_stat.f_fsid.val[0] directly:
     nullfs, nwfs (?), pseudofs, smbfs (?), tmpfs

> What do you think is a proper default value for va_rdev? Some file
> systems set it to 0 and some to VNOVAL.

Either NODEV or VNOVAL explicitly translated late to NODEV.  NODEV is
(dev_t)(-1) (bug: this has parentheses in all the wrong places -- it
should be ((dev_t)-1)), so VNOVAL = -1 assigned to va_rdev of type dev_t
equals NODEV and the identity translation works.

>> After deleting the bogus initializations, we're left with va_filerev,
>> va_birthtime and va_flags.  Most file systems don't support these, so
>> they could usefully all be handled by defaulting them as in the proposed
>> changes for va_birthtime.
>
> Unfortunately moving initializations to vn_stat() breaks things. For
> example vm_mmap_vnode() uses VOP_GETATTR() to determine which file flags
> are set. Thus moving va_flags initialization to vn_stat() breaks
> mmap.

Oops.

> In theory this could be a potential problem for birthtime too.

It's a bit dangerous, but most callers to VOP_GETATTR() except vn_stat()
only want a couple of fields, and hopefully none want new fields.
Maybe the public interface should be vop_getattr() which sets defaults
and calls VOP_GETATTR().  Does this work with layering?  There is
negative point to inlining most VOPs, and for VOP_GETATTR(), no one
cares about the much higher overhead of setting all fields in it when
only a couple are wanted.

Bruce

From owner-freebsd-fs@FreeBSD.ORG Sat Jul 26 18:22:28 2008
From: Peter Schuller
To: freebsd-fs@freebsd.org
Date: Sat, 26 Jul 2008 20:05:46 +0200
Message-Id: <200807262005.54235.peter.schuller@infidyne.com>
Subject: Asynchronous writing to zvols (ZFS)

Hello,

I was finally playing around with iSCSI, having never used it before. For
convenience, and also because it may be a future use case for real use, I
used zvols for my targets.

I could not get write speed above roughly 1 MB/second even in simple cases
like dd:ing with an 8 MB block size, with the zvol:s on a 6-disk raidz2. The
individual disk utilization of constituent drives remains small (< 3%).
Switching to a memory disk target yielded expected performance
characteristics.

I notice that there were confirmed issues with writes to zvol:s:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6496356
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6496344

The problem is that I'm not really sure how to translate "snv_59" into
something that I can compare with the version of ZFS in FreeBSD. Do the
above "bugs" still apply to the ZFS version in FreeBSD, or am I hitting
something else?

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

From owner-freebsd-fs@FreeBSD.ORG Sat Jul 26 19:20:03 2008
Message-ID: <488B78DE.108@psg.com>
Date: Sat, 26 Jul 2008 20:19:58 +0100
From: Randy Bush
To: FreeBSD Current
Cc: freebsd-fs@freebsd.org
Subject: work0 drive

hardware problem caused looping page fault problems.  scrub cored

i zpool remove'd the offending drive for the moment and the system
works.

randy

---

acd0: CDROM at ata0-slave UDMA33
ad4: 305245MB at ata2-master SATA150
ad6: 305245MB at ata3-master SATA150
ad8: 305245MB at ata4-master SATA150
ad10: 305245MB at ata5-master SATA150
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device mirror/boot launched (1/2).
GEOM_MIRROR: Device boot: rebuilding provider ad4s1.
Trying to mount root from ufs:/dev/mirror/boota
WARNING: / was not properly dismounted
Loading configuration files.
dumpon: /dev/ad4s1b: No such file or directory
Entropy harvesting: interrupts ethernet point_to_point kickstart.
swapon: adding /dev/mirror/bootb as swap device
Starting file system checks:
/dev/mirror/boota: 2690 files, 185913 used, 3875150 free (3686 frags, 483933 blocks, 0.1% fragmentation)
Setting hostuuid: 7634a964-b127-0430-c299-00304891d708.
Setting hostid: 0xeebf67d9.
Mounting local file systems:.
ad6: FAILURE - READ_DMA status=51 error=40

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x0
fault code              = supervisor read data, page not present
instruction pointer     = 0x8:0xffffffff801aaa06
stack pointer           = 0x10:0xffffffff80a64bc0
frame pointer           = 0x10:0x51
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 3 (g_up)
trap number             = 12
panic: page fault
cpuid = 1

From owner-freebsd-fs@FreeBSD.ORG Sat Jul 26 20:51:20 2008
Date: Sat, 26 Jul 2008
22:51:18 +0200 From: Pawel Jakub Dawidek To: Peter Schuller Message-ID: <20080726205118.GB1345@garage.freebsd.pl> References: <200807262005.54235.peter.schuller@infidyne.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="VywGB/WGlW4DM4P8" Content-Disposition: inline In-Reply-To: <200807262005.54235.peter.schuller@infidyne.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Asynchronous writing to zvols (ZFS) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Jul 2008 20:51:21 -0000 --VywGB/WGlW4DM4P8 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Jul 26, 2008 at 08:05:46PM +0200, Peter Schuller wrote: > Hello, > > I was finally playing around with iSCSI, having never used it before. For > convenience, and also because it may be a future use case for real use, I > used zvols for my targets. > > I could not get write speed above roughly 1 MB/second even in simple cases > like dd:ing with an 8 MB block size, with the zvol:s on a 6-disk raidz2. The > individual disk utilization of constituent drives remains small (< 3%). > Switching to a memory disk target yielded expected performance > characteristics.
> > I notice that there were confirmed issues with writes to zvol:s: > > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6496356 > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6496344 > > The problem is that I'm not really sure how to translate "snv_59" into > something that I can compare with the version of ZFS in FreeBSD. Do the > above "bugs" still apply to the ZFS version in FreeBSD, or am I hitting > something else? The problem is that we don't distinguish between async and sync I/O requests at the GEOM level, which is why I decided to commit the ZIL log after each write, which wasn't very smart, it seems. This is handled differently in the version I have in Perforce. Could you try the patch below and see how it performs now? http://people.freebsd.org/~pjd/patches/zvol.c.patch -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --VywGB/WGlW4DM4P8 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFIi45GForvXbEpPzQRAizYAJ9CBIcCWCJpTeOkBiqNTLAGJX+CggCfd9nx rePeFIxubC3Hou73zMatFnU= =HOCB -----END PGP SIGNATURE----- --VywGB/WGlW4DM4P8-- From owner-freebsd-fs@FreeBSD.ORG Sat Jul 26 21:27:02 2008 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F00AB1065671 for ; Sat, 26 Jul 2008 21:27:02 +0000 (UTC) (envelope-from tom.hurst@clara.net) Received: from spork.qfe3.net (spork.qfe3.net [212.13.207.101]) by mx1.freebsd.org (Postfix) with ESMTP id B11388FC18 for ; Sat, 26 Jul 2008 21:27:02 +0000 (UTC) (envelope-from tom.hurst@clara.net) Received: from [81.104.123.28] (helo=voi.aagh.net) by spork.qfe3.net with esmtp (Exim 4.66 (FreeBSD)) (envelope-from ) id 1KMqv9-000Gz6-D4; Sat, 26 Jul 2008 22:03:19 +0100 Received: from freaky by voi.aagh.net with local (Exim 4.69 (FreeBSD))
(envelope-from ) id 1KMqv9-000GjW-3z; Sat, 26 Jul 2008 22:03:19 +0100 Date: Sat, 26 Jul 2008 22:03:19 +0100 From: Thomas Hurst To: Peter Schuller Message-ID: <20080726210319.GA57383@voi.aagh.net> Mail-Followup-To: Peter Schuller , freebsd-fs@freebsd.org References: <200807262005.54235.peter.schuller@infidyne.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ikeVEW9yuYc//A+q" Content-Disposition: inline In-Reply-To: <200807262005.54235.peter.schuller@infidyne.com> Organization: Not much. User-Agent: Mutt/1.5.18 (2008-05-17) Sender: Thomas Hurst Cc: freebsd-fs@freebsd.org Subject: Re: Asynchronous writing to zvols (ZFS) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Jul 2008 21:27:03 -0000 --ikeVEW9yuYc//A+q Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable * Peter Schuller (peter.schuller@infidyne.com) wrote: > I notice that there were confirmed issues with writes to zvol:s: > > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6496356 > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6496344 > > The problem is that I'm not really sure how to translate "snv_59" into > something that I can compare with the version of ZFS in FreeBSD. Do > the above "bugs" still apply to the ZFS version in FreeBSD, or am I > hitting something else? WARNING: ZFS is considered to be an experimental feature in FreeBSD. ZFS filesystem version 6 http://opensolaris.org/os/community/zfs/version/6/ This feature is available in: Solaris Nevada Build 62 However, some of this looks familiar from recent Perforce activity: This means that zvol needs to support this special command (DKIOCFLUSHWRITECACHE), and that it should save zil_commit() for only the times it's necessary.
http://perforce.freebsd.org/changeList.cgi?CMD=changes&FSPC=//depot/user/pjd/zfs/... Change 145289 2008/07/15 by pjd@pjd_zoo Improve ZVOL performance by only committing ZIL on BIO_FLUSH request, not on every BIO_WRITE request. Which looks like a good candidate. You could see if ZIL is the problem by setting vfs.zfs.zil_disable=1 -- Thomas 'Freaky' Hurst http://hur.st/ --ikeVEW9yuYc//A+q Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (FreeBSD) iEYEARECAAYFAkiLkRUACgkQNBBHZ542MwRGrQCdFD1l/ibRPLAZ6ORvstu7lE5s H0EAoMnwI6NNTCIOk++e9PIjTalmUS9l =LrIE -----END PGP SIGNATURE----- --ikeVEW9yuYc//A+q--
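To illustrate why the change described in this thread (committing the ZIL only on BIO_FLUSH rather than on every BIO_WRITE) could matter for throughput, here is a rough toy model. It is only a sketch and is not taken from zvol.c: the ToyZvol class, its fields, and the count_commits() helper are hypothetical names made up for illustration, and the model counts simulated commits only, it does not measure real I/O.

```python
from dataclasses import dataclass

# Request types named after the GEOM commands mentioned in the thread.
BIO_WRITE = "BIO_WRITE"
BIO_FLUSH = "BIO_FLUSH"


@dataclass
class ToyZvol:
    """Hypothetical toy model of a zvol; not FreeBSD's actual zvol code."""
    commit_on_every_write: bool
    pending: int = 0   # writes accepted but not yet committed
    commits: int = 0   # number of simulated zil_commit() calls

    def zil_commit(self) -> None:
        # Stand-in for the (expensive) synchronous log commit.
        self.commits += 1
        self.pending = 0

    def handle(self, cmd: str) -> None:
        if cmd == BIO_WRITE:
            self.pending += 1
            if self.commit_on_every_write:
                self.zil_commit()
        elif cmd == BIO_FLUSH:
            # Commit only if something is actually outstanding.
            if self.pending:
                self.zil_commit()


def count_commits(requests, commit_on_every_write: bool) -> int:
    """Replay a request stream and return how many commits it caused."""
    vol = ToyZvol(commit_on_every_write=commit_on_every_write)
    for cmd in requests:
        vol.handle(cmd)
    return vol.commits


if __name__ == "__main__":
    # 1000 writes followed by a single flush.
    stream = [BIO_WRITE] * 1000 + [BIO_FLUSH]
    print("commit per write:", count_commits(stream, True))
    print("commit on flush: ", count_commits(stream, False))
```

Under these assumptions, a stream of many writes followed by one flush triggers one commit per write with the old policy and a single commit with the new one, which is consistent with the low disk utilization reported above if each commit is slow.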