From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 01:10:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FDAD106566B; Sun, 27 Nov 2011 01:10:00 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 39AA58FC0C; Sun, 27 Nov 2011 01:09:59 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAJiM0U6DaFvO/2dsb2JhbABEhQOnBIFyAQEFIzIkGxgCAg0ZAlkGrEyQcoEwgi+FbYEWBIghjCmSKQ X-IronPort-AV: E=Sophos;i="4.69,577,1315195200"; d="scan'208";a="147187645" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 26 Nov 2011 20:09:59 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 6E382B3F0A; Sat, 26 Nov 2011 20:09:59 -0500 (EST) Date: Sat, 26 Nov 2011 20:09:59 -0500 (EST) From: Rick Macklem To: Pawel Jakub Dawidek Message-ID: <1145879532.430614.1322356199403.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20111126165823.GD8794@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@FreeBSD.org Subject: Re: NFS corruption in recent HEAD. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 01:10:00 -0000 Pawel Jakub Dawidek wrote: > Hi. 
> > I'm booting a machine over the network using the new NFS client and I'm > getting those warnings on boot: > > /etc/rc.subr: 666: Syntax error: "(" unexpected (expecting ";;") > > I inspected the /etc/rc.subr file on the client and here is the > problem. > > At offset 16384 the file on the client contains data from offset 32768 > from the server. It contains exactly 7599 bytes from the wrong place > at > this offset. All next bytes up to offset 32768 are all zeros. > > So the data is identical for ranges <0-16384) and <32768-40367) (to > the > end of the file). > > Then range <16384-23983) contains data from <32768-40367) and > <23984-32768) is all zeros. Probably if the file were bigger there > would > be no zeros, but more data from the wrong block. > > It seems that the client is asking for the third block where it should ask > for the second block (or the server is providing the wrong block). > > Server is running '8.2-STABLE #17: Wed Sep 28 10:30:02 EDT 2011'. > > BTW. When I copy the file on the client using cp(1), the copy is not > corrupted (cp(1) is using mmap(2)?). But when I do > 'cat /etc/rc.subr > /foo' the corruption is visible in the new file too. > Hmm, the only thing I've done recently (r227493) was make the "readahead" command line option actually work. Prior to r227493, the readahead was always set to 1 and the command line option was ignored (overwritten by a default assignment of 1 after the command options were processed). Do you happen to specify "readahead=" for this case? If so, try taking the command line option off and see if that fixes the problem. (If it does, there must be a bug for readaheads != 1.) Other than that, I can't think of any recent change that might be related to the above. rick > -- > Pawel Jakub Dawidek http://www.wheelsystems.com > FreeBSD committer http://www.FreeBSD.org > Am I Evil? Yes, I Am! 
http://yomoli.com From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 02:03:47 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87963106566C; Sun, 27 Nov 2011 02:03:47 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 2C9B18FC16; Sun, 27 Nov 2011 02:03:46 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAIeZ0U6DaFvO/2dsb2JhbABEhQOnBIFyAQEFIzIkGxgCAg0ZAlkGrESQcoEwgi+FbYEWBIghjCmSKQ X-IronPort-AV: E=Sophos;i="4.69,577,1315195200"; d="scan'208";a="147189245" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 26 Nov 2011 21:03:46 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 0BD5FB3FE6; Sat, 26 Nov 2011 21:03:46 -0500 (EST) Date: Sat, 26 Nov 2011 21:03:46 -0500 (EST) From: Rick Macklem To: Pawel Jakub Dawidek Message-ID: <1798569802.431412.1322359425997.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20111126165823.GD8794@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@FreeBSD.org Subject: Re: NFS corruption in recent HEAD. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 02:03:47 -0000 Pawel Jakub Dawidek wrote: > Hi. 
> > I'm booting a machine over the network using the new NFS client and I'm > getting those warnings on boot: > > /etc/rc.subr: 666: Syntax error: "(" unexpected (expecting ";;") > > I inspected the /etc/rc.subr file on the client and here is the > problem. > > At offset 16384 the file on the client contains data from offset 32768 > from the server. It contains exactly 7599 bytes from the wrong place > at > this offset. All next bytes up to offset 32768 are all zeros. > > So the data is identical for ranges <0-16384) and <32768-40367) (to > the > end of the file). > > Then range <16384-23983) contains data from <32768-40367) and > <23984-32768) is all zeros. Probably if the file were bigger there > would > be no zeros, but more data from the wrong block. > > It seems that the client is asking for the third block where it should ask > for the second block (or the server is providing the wrong block). > > Server is running '8.2-STABLE #17: Wed Sep 28 10:30:02 EDT 2011'. > > BTW. When I copy the file on the client using cp(1), the copy is not > corrupted (cp(1) is using mmap(2)?). But when I do > 'cat /etc/rc.subr > /foo' the corruption is visible in the new file too. > Oh, and maybe you could try reverting r227543 in the client (assuming the client is post-r227543). Maybe that file's vnode type isn't set to VREG early in the diskless booting and needs the ncl_flush() for some reason. I don't actually have a bug that needs r227543 to fix it. It just seemed incorrect to flush non-VREG files (particularly VDIR). As such, reverting it wouldn't be a big deal. rick > -- > Pawel Jakub Dawidek http://www.wheelsystems.com > FreeBSD committer http://www.FreeBSD.org > Am I Evil? Yes, I Am! 
http://yomoli.com From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 04:58:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 894AA1065670 for ; Sun, 27 Nov 2011 04:58:49 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta12.westchester.pa.mail.comcast.net (qmta12.westchester.pa.mail.comcast.net [76.96.59.227]) by mx1.freebsd.org (Postfix) with ESMTP id 49F9C8FC08 for ; Sun, 27 Nov 2011 04:58:49 +0000 (UTC) Received: from omta22.westchester.pa.mail.comcast.net ([76.96.62.73]) by qmta12.westchester.pa.mail.comcast.net with comcast id 24uY1i0041ap0As5C4ypA7; Sun, 27 Nov 2011 04:58:49 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta22.westchester.pa.mail.comcast.net with comcast id 24yn1i00d1t3BNj3i4yovM; Sun, 27 Nov 2011 04:58:49 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6BC60102C19; Sat, 26 Nov 2011 20:58:46 -0800 (PST) Date: Sat, 26 Nov 2011 20:58:46 -0800 From: Jeremy Chadwick To: Mark Felder Message-ID: <20111127045846.GA54467@icarus.home.lan> References: <95d00c1b714837aa32e7da72bc4afd03@feld.me> <20111126104840.GA8794@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: zfs i/o hangs on 9-PRERELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 04:58:49 -0000 On Sat, Nov 26, 2011 at 04:47:35PM -0600, Mark Felder wrote: > It appears that I'm mistaken about those messages then . However this does both happen on my AMD x6 and Intel Atom machines with different hard drives, controllers, etc. I feel it would be unlikely to be hardware. 
> > Unfortunately the procstat command is probably of no use because I can't interact with the console or ssh for the periods of time when it is hanging (sometimes in excess of a minute). Zpool scrubs come up clean and I never see any errors reported. I've been running this hardware for 2 years and v28 for quite some time. It doesn't seem like it started happening until I upgraded to a build past RC1. I don't know where to find RC1 media and I don't know the svn revision of RC1 so I haven't tried. The kernel backtrace you provided indicates a problem in pf(4), not ZFS. What piece am I missing? -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 11:24:24 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C6986106566C for ; Sun, 27 Nov 2011 11:24:24 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 639AC8FC14 for ; Sun, 27 Nov 2011 11:24:24 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:5974:a369:b987:bc4d]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 78B214AC1C; Sun, 27 Nov 2011 15:24:22 +0400 (MSK) Date: Sun, 27 Nov 2011 15:24:14 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1381381670.20111127152414@serebryakov.spb.ru> To: Kostik Belousov In-Reply-To: <20111126084151.GH50300@deviant.kiev.zoral.com.ua> References: <20111123194444.GE50300@deviant.kiev.zoral.com.ua> <201111260725.pAQ7PDow056289@chez.mckusick.com> <20111126080351.GD50300@deviant.kiev.zoral.com.ua> 
<1961318852.20111126121354@serebryakov.spb.ru> <20111126084151.GH50300@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 11:24:24 -0000 Hello, Kostik. You wrote 26 November 2011, 12:41:51: > on the operation end. In fact, there is inherited ugliness due to async > nature, namely, the kernel-owned buffer locks. Getting rid of them would > be much more useful than breaking UFS. Why do you call it breaking? How could an additional piece of meta-information break UFS? > The non-broken driver must not return the 'completed' bio into the up > queue until the write is sent to hardware and the hardware reported the completion. So, holding a bio without completion for, say, 5 minutes would be OK? > Raid controllers which aggressively cache the writes use nvram or > battery backups, and do not allow to turn on write cache if battery is > non-functional. I had not seen SU inconsistencies on RAID 6 on mfi(4), That is not always true, and it may not be true for network-attached storage, where there are too many variables in the equation. Yes, a good controller should do this, I could not agree more. But it is not always possible, unfortunately. > despite one our machine has unfortunate habit of dropping boot disk over > SATA channel each week, for 2 years. Great! But even a battery-backed (read: UPS) software implementation is not protected from OS crashes. So, right now it is impossible to implement a software RAID5 that plays nicely with UFS in case of a crash (as long as there is no crash, everything is perfect). 
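The driver contract Kostik describes above can be sketched as a toy model (plain Python purely for illustration, not FreeBSD code; all names here are invented). A driver that fires the bio "done" callback while the data still sits in a volatile write cache can lose writes it already acknowledged; a driver that defers the ack until the hardware reports completion cannot:

```python
# Toy model of the "do not ack before stable storage" contract.
class Driver:
    def __init__(self, ack_early):
        self.ack_early = ack_early
        self.cache = []   # writes accepted but not yet on the platter
        self.disk = []    # stable storage
        self.acked = []   # writes whose bio was marked "done"

    def write(self, data):
        self.cache.append(data)
        if self.ack_early:            # broken: ack from write-back cache
            self.acked.append(data)

    def hardware_complete(self):      # hardware drains its cache
        while self.cache:
            data = self.cache.pop(0)
            self.disk.append(data)
            if not self.ack_early:    # correct: ack only now
                self.acked.append(data)

    def crash(self):                  # power loss: volatile cache is gone
        self.cache = []

broken, correct = Driver(ack_early=True), Driver(ack_early=False)
for d in (broken, correct):
    d.write("A")
    d.hardware_complete()             # "A" reaches the platter
    d.write("B")
    d.crash()                         # "B" was still in the cache

# The broken driver told the filesystem "B" was done, then lost it.
lost = [w for w in broken.acked if w not in broken.disk]
print("broken driver lost acked writes:", lost)
print("correct driver lost acked writes:",
      [w for w in correct.acked if w not in correct.disk])
```

The point of the model: holding a bio uncompleted for minutes (the correct driver) only delays dependencies, while completing it early (the broken driver) silently breaks the guarantee SU builds on.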
Ok, you could say ``we don't need it at all,'' but I could not agree with this statement. Yes, I'm biased here. But, really, I see some interest in software RAID5 on FreeBSD now. > You again missed the point - if metadata is not reorderable, but user > data is, you get security issues. They are similar (but inverse) to what > I described in the previous paragraph. In case of a crash -- yes. But, IMHO, in case of a crash there could be a scenario where some information is leaked in any case. If there is no crash, you have no security issues, because every read will return the actual information, either from the write cache or from the platters. An inconsistent cache implementation is a bad thing, for sure, but that is a question orthogonal to what we discuss here. -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 11:27:47 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B56F8106564A for ; Sun, 27 Nov 2011 11:27:47 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7BAB48FC0A for ; Sun, 27 Nov 2011 11:27:47 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:5974:a369:b987:bc4d]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 4801B4AC1C; Sun, 27 Nov 2011 15:27:46 +0400 (MSK) Date: Sun, 27 Nov 2011 15:27:39 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <469709203.20111127152739@serebryakov.spb.ru> To: Kirk McKusick In-Reply-To: <201111261712.pAQHCY8G081783@chez.mckusick.com> References: <147455115.20111126115248@serebryakov.spb.ru> <201111261712.pAQHCY8G081783@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 
Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 11:27:47 -0000 Hello, Kirk. You wrote 26 November 2011, 21:12:34: > Kostik has it right. The requirement for SU and SU+J is simply > that the underlying I/O subsystem not issue bio_done on a write > until it is on stable store. If the I/O subsystem wants to cache > it for a while (multiple seconds) before writing it to disk that > is fine (SU thinks in terms of 30-second intervals). The only Ok, these "multiple seconds" are good news. > thing that SU requires is that the subsystem NOT lie by issuing > the bio_done before it has committed the data to disk. Perhaps > what we need is a "delay acknowledgement until done" flag to make > this clear. Ok, such a flag (and the "30 seconds is Ok" statement) will be enough for me (RAID5) to implement robust but high-performance write queuing. But an FSYNC flag would be nice and useful too, IMHO. 
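At the application level, the guarantee such a strict flag would carry down the stack is the one fsync(2) already promises. A minimal sketch (illustrative Python; os.fsync() maps to fsync(2)) of the commit pattern a transactional database relies on:

```python
# Durability pattern: report a transaction as committed only after the
# data has been pushed to stable storage.  os.fsync() issues fsync(2);
# this is the request that must survive, unreordered and unlied-about,
# all the way down through GEOM to the hardware.
import os
import tempfile

def commit(path, record):
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record)
        os.fsync(fd)   # must not return before the data is stable
    finally:
        os.close(fd)
    return True        # only now may we tell the client "committed"

fd0, path = tempfile.mkstemp()
os.close(fd0)
commit(path, b"txn-1\n")
commit(path, b"txn-2\n")
with open(path, "rb") as f:
    data = f.read()
os.unlink(path)
print(data)            # both records were stable before commit() returned
```

If any layer below completes the fsync-tagged write before it is on the platter, the "committed" reply above becomes a lie in exactly the way discussed in this thread.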
-- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 11:28:02 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 04B22106564A; Sun, 27 Nov 2011 11:28:02 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id CFEC08FC17; Sun, 27 Nov 2011 11:28:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id pARBS1cW081929; Sun, 27 Nov 2011 11:28:01 GMT (envelope-from jh@freefall.freebsd.org) Received: (from jh@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id pARBS1iP081925; Sun, 27 Nov 2011 11:28:01 GMT (envelope-from jh) Date: Sun, 27 Nov 2011 11:28:01 GMT Message-Id: <201111271128.pARBS1iP081925@freefall.freebsd.org> To: yalur2mail.ru@FreeBSD.org, jh@FreeBSD.org, freebsd-fs@FreeBSD.org From: jh@FreeBSD.org Cc: Subject: Re: kern/127375: [zfs] If vm.kmem_size_max>"1073741823" then write speed to ZFS pool decrease 3-5 times. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 11:28:02 -0000 Synopsis: [zfs] If vm.kmem_size_max>"1073741823" then write speed to ZFS pool decrease 3-5 times. State-Changed-From-To: feedback->closed State-Changed-By: jh State-Changed-When: Sun Nov 27 11:28:01 UTC 2011 State-Changed-Why: Feedback timeout. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=127375 From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 11:32:25 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5ED521065672 for ; Sun, 27 Nov 2011 11:32:25 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2593F8FC18 for ; Sun, 27 Nov 2011 11:32:25 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:5974:a369:b987:bc4d]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 5C25F4AC31; Sun, 27 Nov 2011 15:32:24 +0400 (MSK) Date: Sun, 27 Nov 2011 15:32:17 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1356091030.20111127153217@serebryakov.spb.ru> To: Kirk McKusick In-Reply-To: <201111261712.pAQHCY8G081783@chez.mckusick.com> References: <147455115.20111126115248@serebryakov.spb.ru> <201111261712.pAQHCY8G081783@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 11:32:25 -0000 Hello, Kirk. You wrote 26 November 2011, 21:12:34: > and other issues raised, this is clearly a bad idea. So, my question > to you is how can we reliably get the underlying systems to not lie > to us about the stability of our I/O request? BTW, a strict FSYNC flag would be very useful for transactional databases. 
It is not good that GEOM currently does not know about such writes: it could delay them, and it cannot mark them for the underlying hardware. -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 11:37:17 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6C42C106567B for ; Sun, 27 Nov 2011 11:37:17 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 1FD688FC24 for ; Sun, 27 Nov 2011 11:37:16 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id 3D6427E7; Sun, 27 Nov 2011 12:00:23 +0100 (CET) Date: Sun, 27 Nov 2011 11:59:13 +0100 From: Pawel Jakub Dawidek To: Rick Macklem Message-ID: <20111127105913.GH8794@garage.freebsd.pl> References: <20111126165823.GD8794@garage.freebsd.pl> <1798569802.431412.1322359425997.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Dx9iWuMxHO1cCoFc" Content-Disposition: inline In-Reply-To: <1798569802.431412.1322359425997.JavaMail.root@erie.cs.uoguelph.ca> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@FreeBSD.org Subject: Re: NFS corruption in recent HEAD. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 11:37:17 -0000 --Dx9iWuMxHO1cCoFc Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Nov 26, 2011 at 09:03:46PM -0500, Rick Macklem wrote: > Pawel Jakub Dawidek wrote: > > Hi. 
> > > > I'm booting a machine over the network using the new NFS client and I'm > > getting those warnings on boot: > > > > /etc/rc.subr: 666: Syntax error: "(" unexpected (expecting ";;") [...] > Oh, and maybe you could try reverting r227543 in the client (assuming > the client is post-r227543). Maybe that file's vnode type isn't set to > VREG early in the diskless booting and needs the ncl_flush() for some > reason. > > I don't actually have a bug that needs r227543 to fix it. It just seemed > incorrect to flush non-VREG files (particularly VDIR). As such, reverting > it wouldn't be a big deal. I haven't tried reverting anything yet, but I think I was able to reproduce this with the old NFS client as well. The problem goes away when I comment out the root mount point from /etc/fstab or remove mntudp from the mount options. The NFS root is mounted using TCP, AFAIK, and it probably happens when the startup scripts (rc.d/mountcritremote) remount root with the mntudp flag. The rc.subr warning starts to appear just after mountcritremote is called. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! 
http://yomoli.com --Dx9iWuMxHO1cCoFc Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk7SGAEACgkQForvXbEpPzR0jgCfSk7qlWbtlihOUj/0Cw2osf0T baMAoL+2PaO1uOzI02VySRccGJuqw0/Q =eelX -----END PGP SIGNATURE----- --Dx9iWuMxHO1cCoFc-- From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 14:27:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 86973106564A; Sun, 27 Nov 2011 14:27:12 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id E56A08FC0A; Sun, 27 Nov 2011 14:27:11 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAChI0k6DaFvO/2dsb2JhbABEhQOnBIFyAQEFIwQuJBsYAgINGQJZBqwokGeBMIIvhW2BFgSIIYwpkik X-IronPort-AV: E=Sophos;i="4.69,579,1315195200"; d="scan'208";a="147216786" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 27 Nov 2011 09:27:10 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id DBEC3B3F54; Sun, 27 Nov 2011 09:27:10 -0500 (EST) Date: Sun, 27 Nov 2011 09:27:10 -0500 (EST) From: Rick Macklem To: Pawel Jakub Dawidek Message-ID: <1308447178.437372.1322404030886.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20111127105913.GH8794@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@FreeBSD.org Subject: Re: NFS corruption in recent HEAD. 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 14:27:12 -0000 Pawel Jakub Dawidek wrote: > On Sat, Nov 26, 2011 at 09:03:46PM -0500, Rick Macklem wrote: > > Pawel Jakub Dawidek wrote: > > > Hi. > > > > > > I'm booting machine over the network using new NFS client and I'm > > > getting those warnings on boot: > > > > > > /etc/rc.subr: 666: Syntax error: "(" unexpected (expecting ";;") > [...] > > Oh, and maybe you could try reverting r227543 in the client > > (assuming > > the client is post-r227543). Maybe that file's vnode type isn't set > > to > > VREG early in the diskless booting and needs the ncl_flush() for > > some > > reason. > > > > I don't actually have a bug that needs r227543 to fix it. It just > > seemed > > incorrect to flush non-VREG files (particularily VDIR). As such, > > reverting > > it wouldn't be a big deal. > > I haven't tried reverting anything yet, but I think I was able to > reproduce this with old NFS client as well. The problem goes away when > I > comment out root mount point from /etc/fstab or remove mntudp from > mount > options. NFS root is mounted using TCP, AFAIK and it probably happens > when startup scripts (rc.d/mountcritremote) remounts root with mntudp > flag. The rc.subr warning starts to appear just after mountcritremote > is > called. > Ok, I'm not surprised that the recent commits I've done weren't related, since I couldn't think how they would have been. (Although the "readahead" option can now override the default of 1, the readahead code hasn't changed in ages. The new NFS code is just a clone of the old stuff.) And I doubt fsync() was being called for the file (plus it should be VREG), so I can't think how that might have affected it, either. It sounds like the mount update that changed TCP->UDP caused it. I'll take a closer look at that code. 
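For reference, the misplaced-block pattern Pawel reported can be checked mechanically. A small sketch (illustrative Python only, not an existing FreeBSD tool; block size and offsets taken from the report) that asks, for each 16 KiB block of a corrupted copy, where its content actually lives in the original:

```python
# Checker for the misplaced-block corruption described in this thread:
# split the corrupted copy into 16 KiB blocks and locate each block's
# non-zero content in the original file.
import os

BS = 16384  # block size from the report

def classify(original, corrupted):
    """Return, per 16 KiB block of `corrupted`, the matching block
    index in `original`, 'zeros' for zero padding, or 'unknown'."""
    out = []
    for i in range(0, len(corrupted), BS):
        blk = corrupted[i:i + BS].rstrip(b"\0")
        if not blk:
            out.append("zeros")
            continue
        pos = original.find(blk)
        out.append(pos // BS if pos >= 0 else "unknown")
    return out

# Synthesize the exact corruption described: a 40367-byte file whose
# second block holds the 7599-byte tail from offset 32768, then zeros.
original = os.urandom(40367)
tail = original[32768:]                                  # 7599 bytes
corrupted = (original[:16384] + tail +
             b"\0" * (16384 - len(tail)) + original[32768:])
print(classify(original, corrupted))   # block 1 maps to block 2
```

On the synthetic data this reports block 0 intact and block 1 filled with block 2's data, matching the observation that the client fetched the third block where the second was expected.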
Maybe shutting down the TCP socket results in a read failing, leaving an invalid buffer cache block that isn't marked invalid properly (or something like that)? Still seems to be a mystery to me. Thanks for digging into this and please let me know if you figure out more, rick > -- > Pawel Jakub Dawidek http://www.wheelsystems.com > FreeBSD committer http://www.FreeBSD.org > Am I Evil? Yes, I Am! http://yomoli.com From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 15:47:32 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 232951065673; Sun, 27 Nov 2011 15:47:32 +0000 (UTC) (envelope-from kevlo@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id EF23C8FC19; Sun, 27 Nov 2011 15:47:31 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id pARFlVal025191; Sun, 27 Nov 2011 15:47:31 GMT (envelope-from kevlo@freefall.freebsd.org) Received: (from kevlo@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id pARFlVvW025186; Sun, 27 Nov 2011 15:47:31 GMT (envelope-from kevlo) Date: Sun, 27 Nov 2011 15:47:31 GMT Message-Id: <201111271547.pARFlVvW025186@freefall.freebsd.org> To: yuri@tsoft.com, kevlo@FreeBSD.org, freebsd-fs@FreeBSD.org From: kevlo@FreeBSD.org Cc: Subject: Re: kern/133174: [msdosfs] [patch] msdosfs must support multibyte international characters in file names X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 15:47:32 -0000 Synopsis: [msdosfs] [patch] msdosfs must support multibyte international characters in file names State-Changed-From-To: open->closed State-Changed-By: kevlo State-Changed-When: Sun Nov 27 15:45:18 UTC 
2011 State-Changed-Why: Fixed. Committed to HEAD (r227650 and r228023). http://www.freebsd.org/cgi/query-pr.cgi?pr=133174 From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 15:50:37 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E12D106566B; Sun, 27 Nov 2011 15:50:37 +0000 (UTC) (envelope-from kevlo@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 363178FC14; Sun, 27 Nov 2011 15:50:37 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id pARFobWB027737; Sun, 27 Nov 2011 15:50:37 GMT (envelope-from kevlo@freefall.freebsd.org) Received: (from kevlo@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id pARFoaGK027640; Sun, 27 Nov 2011 15:50:36 GMT (envelope-from kevlo) Date: Sun, 27 Nov 2011 15:50:36 GMT Message-Id: <201111271550.pARFoaGK027640@freefall.freebsd.org> To: m.meelis@easybow.com, kevlo@FreeBSD.org, freebsd-fs@FreeBSD.org From: kevlo@FreeBSD.org Cc: Subject: Re: kern/151845: [smbfs] [patch] smbfs should be upgraded to support Unicode X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 15:50:37 -0000 Synopsis: [smbfs] [patch] smbfs should be upgraded to support Unicode State-Changed-From-To: open->closed State-Changed-By: kevlo State-Changed-When: Sun Nov 27 15:49:55 UTC 2011 State-Changed-Why: Fixed. Committed to HEAD (r227650 and r228023). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=151845 From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 18:41:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1FCFE106567B; Sun, 27 Nov 2011 18:41:26 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 9664B8FC1A; Sun, 27 Nov 2011 18:41:25 +0000 (UTC) Received: from alf.home (alf.kiev.zoral.com.ua [10.1.1.177]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id pARIfLRs051807 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 27 Nov 2011 20:41:21 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from alf.home (kostik@localhost [127.0.0.1]) by alf.home (8.14.5/8.14.5) with ESMTP id pARIfLWu065382; Sun, 27 Nov 2011 20:41:21 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by alf.home (8.14.5/8.14.5/Submit) id pARIfKG6065381; Sun, 27 Nov 2011 20:41:20 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: alf.home: kostik set sender to kostikbel@gmail.com using -f Date: Sun, 27 Nov 2011 20:41:20 +0200 From: Kostik Belousov To: Lev Serebryakov Message-ID: <20111127184120.GT50300@deviant.kiev.zoral.com.ua> References: <20111123194444.GE50300@deviant.kiev.zoral.com.ua> <201111260725.pAQ7PDow056289@chez.mckusick.com> <20111126080351.GD50300@deviant.kiev.zoral.com.ua> <1961318852.20111126121354@serebryakov.spb.ru> <20111126084151.GH50300@deviant.kiev.zoral.com.ua> <1381381670.20111127152414@serebryakov.spb.ru> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="M3MVXBHeTEnycIo5" Content-Disposition: inline In-Reply-To: <1381381670.20111127152414@serebryakov.spb.ru> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua 
X-Virus-Status: Clean X-Spam-Status: No, score=-3.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 18:41:26 -0000 --M3MVXBHeTEnycIo5 Content-Type: text/plain; charset=koi8-r Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Nov 27, 2011 at 03:24:14PM +0400, Lev Serebryakov wrote: > Hello, Kostik. > You wrote on 26 November 2011, 12:41:51: > > > on the operation end. In fact, there is inherent ugliness due to the async > > nature, namely, the kernel-owned buffer locks. Getting rid of them would > > be much more useful than breaking UFS. > Why do you call it breaking? How could an additional piece of meta-information break > UFS? Because disabling reordering of the writes issued by UFS slows it down by a factor of 3-10 times. > > > The non-broken driver must not return the 'completed' bio into the up > queue until the write is sent to the hardware and the hardware has reported the completion. > So, holding a bio without completion for, say, 5 minutes would be Ok? It is up to the users of your driver to decide whether that is Ok or not. For UFS/SU, the only consequence will be the accumulation in memory of the workitems that track dependencies of other metadata buffers on the delayed one. For UFS/SU+J, if some buffer is delayed indefinitely, the journal might overflow. > > > RAID controllers which aggressively cache the writes use NVRAM or > battery backups, and do not allow turning on the write cache if the battery is > non-functional. I had not seen SU inconsistencies on RAID 6 on mfi(4), > It is not always true.
And it could be not true for network > attached storage, as there are too many variables in the equation in such > a case. Yes, a good controller should do this, I could not agree more. But > it is not always possible, unfortunately. Your claims are not backed by any facts. Please inform us of the models and revisions of the firmware for the devices you declare are broken in the described ways. Also, please reference the documentation which states that the devices behave in such a way. At least I would know what to avoid. > > > despite one of our machines has the unfortunate habit of dropping the boot disk over > SATA channel each week, for 2 years. > Great! But even a battery-backed (read: UPS) software realization is > not protected from OS crashes. So, it is impossible to implement > software RAID5, which plays nicely with UFS (in case of crash -- > until there is no crash, everything is perfect), now. Ok, you could > say ``we don't need it at all,'' but I could not agree with this > statement. Yes, I'm biased here. But, really, I see some interest in > software RAID5 on FreeBSD now. Software RAID5 might lose the checksum block due to a kernel or power failure. This is not different from RAID1 being declared inconsistent after an unclean stop. Your claim is not backed by facts, again. > > > You again missed the point - if metadata is not reorderable, but user > data is, you get security issues. They are similar (but inverse) to what > I described in the previous paragraph. > In case of a crash -- yes. But, IMHO, in case of a crash there could be > a scenario when some information is leaked in any case. If there is no > crash, you have no security issues, because every read will return > the actual information, either from the write cache or from the platters. > An inconsistent cache implementation is a bad thing, for sure, but it is > an orthogonal question to what we discuss here. I cannot understand how your answer is related to my statement.
--M3MVXBHeTEnycIo5 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAk7ShFAACgkQC3+MBN1Mb4ispQCfa9fVFO8CZ6dNcpeqxxVWbxqQ Q0oAoN8Mhg+VlkLLhSbx4xooATs6l80g =AHnQ -----END PGP SIGNATURE----- --M3MVXBHeTEnycIo5-- From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 19:09:50 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0C8CF1065782 for ; Sun, 27 Nov 2011 19:09:50 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id BA5E88FC08 for ; Sun, 27 Nov 2011 19:09:49 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1RUk6m-0000Bt-97 for freebsd-fs@freebsd.org; Sun, 27 Nov 2011 20:09:48 +0100 Received: from cpe-188-129-112-68.dynamic.amis.hr ([188.129.112.68]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 27 Nov 2011 20:09:48 +0100 Received: from ivoras by cpe-188-129-112-68.dynamic.amis.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 27 Nov 2011 20:09:48 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Sun, 27 Nov 2011 20:09:34 +0100 Lines: 16 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: cpe-188-129-112-68.dynamic.amis.hr User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20111105 Thunderbird/8.0 Subject: lseek() scalability and PostgreSQL X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 19:09:50 -0000 Interesting tidbit 
from PostgreSQL optimization: http://rhaas.blogspot.com/2011/11/linux-lseek-scalability.html """ PostgreSQL calls lseek quite frequently (to determine the file length, not to actually move the file pointer), and due to the performance enhancements in 9.2devel, it's now much easier to hit the contention problems that can be caused by frequently acquiring and releasing the inode mutex. But it looks like this should be fixed in Linux 3.2, which is now at rc1, and therefore on track to be released well before PostgreSQL 9.2. """ It looks like this might cause some of the things I've observed on many-core servers. From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 19:11:43 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A3024106566C for ; Sun, 27 Nov 2011 19:11:43 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2EFC38FC1A for ; Sun, 27 Nov 2011 19:11:43 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:5974:a369:b987:bc4d]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 3FFCC4AC1C; Sun, 27 Nov 2011 23:11:41 +0400 (MSK) Date: Sun, 27 Nov 2011 23:11:33 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <733839967.20111127231133@serebryakov.spb.ru> To: Kostik Belousov In-Reply-To: <20111127184120.GT50300@deviant.kiev.zoral.com.ua> References: <20111123194444.GE50300@deviant.kiev.zoral.com.ua> <201111260725.pAQ7PDow056289@chez.mckusick.com> <20111126080351.GD50300@deviant.kiev.zoral.com.ua> <1961318852.20111126121354@serebryakov.spb.ru> <20111126084151.GH50300@deviant.kiev.zoral.com.ua> <1381381670.20111127152414@serebryakov.spb.ru> 
<20111127184120.GT50300@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: quoted-printable Cc: Kirk McKusick , freebsd-fs@freebsd.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 19:11:43 -0000 Hello, Kostik. You wrote on 27 November 2011, 22:41:20: >> UFS? > Because disabling reordering of the writes issued by UFS slows it down > by a factor of 3-10 times. Any writes, or only SU/SU+J-related ones? Ok, let's mark them with a "NOCACHE" flag, which will hint that they should not be delayed too much. BTW, I'm lost here. Above, you claim that disabling reordering of the writes impacts performance. Below, you claim that reordering of writes could lead to security problems. >> It is not always true. And it could be not true for network >> attached storage, as there are too many variables in the equation in such >> a case. Yes, a good controller should do this, I could not agree more. But >> it is not always possible, unfortunately. > Your claims are not backed by any facts. Please inform us of the models > and revisions of the firmware for the devices you declare are broken > in the described ways. Also, please reference the documentation which > states that the devices behave in such a way. Some RAID controllers ALLOW enabling the write cache without batteries, with big warnings that it is unsafe. I could not name exact models (Ok, I don't work with this hardware much and I don't have infinite memory, unfortunately), and they don't do it by default. >> > despite one of our machines has the unfortunate habit of dropping the boot disk over >> > SATA channel each week, for 2 years. >> Great!
But even a battery-backed (read: UPS) software realization is >> not protected from OS crashes. So, it is impossible to implement >> software RAID5, which plays nicely with UFS (in case of crash -- >> until there is no crash, everything is perfect), now. Ok, you could >> say ``we don't need it at all,'' but I could not agree with this >> statement. Yes, I'm biased here. But, really, I see some interest in >> software RAID5 on FreeBSD now. > Software RAID5 might lose the checksum block due to a kernel or power > failure. This is not different from RAID1 being declared inconsistent after > an unclean stop. > Your claim is not backed by facts, again. I am speaking about another aspect: performance. If you want decent write performance, you need to have a write cache, or you'll end up with N-2 reads for each write, and such a RAID5 will be almost unusable (I've performed many tests here; in most cases a few minutes in the write cache almost guarantees, on many workloads, that a whole set of stripes is gathered and no reads to recalculate checksums will be needed). To support a decent write cache, the implementation needs to report cached data as written. >> > You again missed the point - if metadata is not reorderable, but user >> > data is, you get security issues. They are similar (but inverse) to what >> > I described in the previous paragraph. >> In case of a crash -- yes. But, IMHO, in case of a crash there could be >> a scenario when some information is leaked in any case. If there is no >> crash, you have no security issues, because every read will return >> the actual information, either from the write cache or from the platters. >> An inconsistent cache implementation is a bad thing, for sure, but it is >> an orthogonal question to what we discuss here. > I cannot understand how your answer is related to my statement. In case of a crash, the HDD might not be able to flush its hardware cache, and this security issue could arise.
Without a crash, a software cache that reorders writes is perfectly Ok, like any other cache (from a security point of view). -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Sun Nov 27 20:12:03 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx2.freebsd.org (mx2.freebsd.org [IPv6:2001:4f8:fff6::35]) by hub.freebsd.org (Postfix) with ESMTP id 4A92F1065670; Sun, 27 Nov 2011 20:12:03 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from [127.0.0.1] (hub.freebsd.org [IPv6:2001:4f8:fff6::36]) by mx2.freebsd.org (Postfix) with ESMTP id 2AED415616C; Sun, 27 Nov 2011 20:11:32 +0000 (UTC) Message-ID: <4ED29973.2040604@FreeBSD.org> Date: Sun, 27 Nov 2011 12:11:31 -0800 From: Doug Barton Organization: http://www.FreeBSD.org/ User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20111105 Thunderbird/8.0 MIME-Version: 1.0 To: kevlo@FreeBSD.org References: <201111271547.pARFlVvW025186@freefall.freebsd.org> In-Reply-To: <201111271547.pARFlVvW025186@freefall.freebsd.org> X-Enigmail-Version: 1.3.3 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, yuri@tsoft.com Subject: Re: kern/133174: [msdosfs] [patch] msdosfs must support multibyte international characters in file names X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Nov 2011 20:12:03 -0000 On 11/27/2011 7:47 AM, kevlo@FreeBSD.org wrote: > Synopsis: [msdosfs] [patch] msdosfs must support multibyte international characters in file names > > State-Changed-From-To: open->closed > State-Changed-By: kevlo > State-Changed-When: Sun Nov 27 15:45:18 UTC 2011 > State-Changed-Why: > Fixed. Committed to HEAD(r227650 and r228023).
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=133174 Traditionally we either wait to close the PR until the change has been MFC'ed, or we indicate that the change won't be MFC'ed in the PR. Doug -- "We could put the whole Internet into a book." "Too practical." Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/ From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 02:21:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A94B31065670; Mon, 28 Nov 2011 02:21:26 +0000 (UTC) (envelope-from feld@feld.me) Received: from mwi1.coffeenet.org (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 5E23E8FC12; Mon, 28 Nov 2011 02:21:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=Message-Id:References:In-Reply-To:Subject:To:From:Date:Content-Type:Mime-Version; bh=nu5wCuFJB2x6UzTAvYyJpklGSW8NmseEG6Dk1u/zq+g=; b=YiLnjfIhRY3zod1avQDQ1gYGDef4XaNfrbRCuslWIOiRauxp911nqpYiqWnYHHeXLzIWClDOeOaVoJgzOYHDN+95SMDztj9QT78rpsxjzMgEL6/C3EeHmZj99DThLP0+; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by mwi1.coffeenet.org with esmtp (Exim 4.77 (FreeBSD)) (envelope-from ) id 1RUqqS-00086j-Ch; Sun, 27 Nov 2011 20:21:25 -0600 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpsa id 1322446878-1863-1862/5/4; Mon, 28 Nov 2011 02:21:18 +0000 Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Date: Sun, 27 Nov 2011 20:21:18 -0600 From: Mark Felder To: freebsd-fs@freebsd.org, freebsd-current@freebsd.org In-Reply-To: <881f876f-6f27-49fd-b6c7-edbe6493ec75@email.android.com> References: <95d00c1b714837aa32e7da72bc4afd03@feld.me> <20111126104840.GA8794@garage.freebsd.pl> <881f876f-6f27-49fd-b6c7-edbe6493ec75@email.android.com> Message-Id: 
<820791346f600ea50ff9ebd68e30c059@feld.me> X-Sender: feld@feld.me User-Agent: Roundcube Webmail/0.6 X-SA-Score: -1.0 Cc: Subject: Re: zfs i/o hangs on 9-PRERELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 02:21:26 -0000 After many hours of testing, reproducing, and testing again I've finally been able to narrow down what the real issue is and it's not ZFS as I suspected. After completely turning off all NFS functionality and serving my files over Samba I haven't had a single issue. It seems there is something going on with the new NFS code (I serve out over v4, but reproduced it last week with v3) and my media player box, a Popcorn Hour A-200 which is running Linux. If I can cobble some hardware together and place it between so I can do some tcpdumps I will provide that data so perhaps someone can understand what's going on. If this is due to a badly behaving client this is potentially a DoS on the server. 
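The capture setup described could be as simple as a host placed between the client and the server running tcpdump; a sketch, where the interface name, client address, and output path are all assumptions:

```shell
# Capture the NFS conversation (NFSv3/v4 on port 2049, plus rpcbind on 111)
# between the server and the media player for later analysis in wireshark.
# em0 and 192.168.1.50 are hypothetical; adjust for the actual setup.
tcpdump -i em0 -s 0 -w /var/tmp/nfs-hang.pcap \
    'host 192.168.1.50 and (port 2049 or port 111)'
```

Running the capture on a separate bridge host, as proposed above, avoids disturbing timing on the server itself.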
Regards, Mark From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 03:27:34 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AE9F41065670; Mon, 28 Nov 2011 03:27:34 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8620C8FC0A; Mon, 28 Nov 2011 03:27:34 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id pAS3RYIV020706; Mon, 28 Nov 2011 03:27:34 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id pAS3RYSH020702; Mon, 28 Nov 2011 03:27:34 GMT (envelope-from linimon) Date: Mon, 28 Nov 2011 03:27:34 GMT Message-Id: <201111280327.pAS3RYSH020702@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/162860: [zfs] Cannot share ZFS filesystem to hosts with a hyphen in hostname X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 03:27:34 -0000 Old Synopsis: Cannot share ZFS filesystem to hosts with a hyphen in hostname New Synopsis: [zfs] Cannot share ZFS filesystem to hosts with a hyphen in hostname Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Nov 28 03:27:17 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=162860 From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 07:48:34 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE4F2106566B; Mon, 28 Nov 2011 07:48:34 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:150:6101::4]) by mx1.freebsd.org (Postfix) with ESMTP id 6AD238FC0C; Mon, 28 Nov 2011 07:48:34 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.2]) by mail.vx.sk (Postfix) with ESMTP id 9ED3215C59; Mon, 28 Nov 2011 08:48:33 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk by core.vx.sk (amavisd-new, unix socket) with LMTP id REYqUcXXQMKW; Mon, 28 Nov 2011 08:48:31 +0100 (CET) Received: from [10.9.8.1] (188-167-78-15.dynamic.chello.sk [188.167.78.15]) by mail.vx.sk (Postfix) with ESMTPSA id 83CC615C52; Mon, 28 Nov 2011 08:48:29 +0100 (CET) Message-ID: <4ED33CCC.1050405@FreeBSD.org> Date: Mon, 28 Nov 2011 08:48:28 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20111105 Thunderbird/8.0 MIME-Version: 1.0 To: Mark Felder References: <95d00c1b714837aa32e7da72bc4afd03@feld.me> <20111126104840.GA8794@garage.freebsd.pl> <881f876f-6f27-49fd-b6c7-edbe6493ec75@email.android.com> <820791346f600ea50ff9ebd68e30c059@feld.me> In-Reply-To: <820791346f600ea50ff9ebd68e30c059@feld.me> X-Enigmail-Version: 1.3.3 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: zfs i/o hangs on 9-PRERELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 07:48:34 -0000 On 28.11.2011 3:21, Mark Felder wrote: > After many hours of testing, reproducing, and testing again 
I've > finally been able to narrow down what the real issue is and it's not > ZFS as I suspected. After completely turning off all NFS functionality > and serving my files over Samba I haven't had a single issue. It seems > there is something going on with the new NFS code (I serve out over > v4, but reproduced it last week with v3) and my media player box, a > Popcorn Hour A-200 which is running Linux. If I can cobble some > hardware together and place it between so I can do some tcpdumps I > will provide that data so perhaps someone can understand what's going > on. If this is due to a badly behaving client this is potentially a > DoS on the server. > > > Regards, > > > > Mark Hi Mark, judging by the output you have posted, this seems to be a pf problem. Could you try the same situation with pf(4) disabled? If you are not able to reproduce this hang with pf(4) disabled, it would be very nice to have a PR submitted. -- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 09:23:54 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B06D106566C for ; Mon, 28 Nov 2011 09:23:54 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5FA4B8FC19 for ; Mon, 28 Nov 2011 09:23:52 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA29067; Mon, 28 Nov 2011 11:23:50 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1RUxRG-000NLj-Cb; Mon, 28 Nov 2011 11:23:50 +0200 Message-ID: <4ED35326.80402@FreeBSD.org> Date: Mon, 28 Nov 2011 11:23:50 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:8.0)
Gecko/20111108 Thunderbird/8.0 MIME-Version: 1.0 To: Florian Wagner References: <20111015214347.09f68e4e@naclador.mos32.de> <4E9ACA9F.5090308@FreeBSD.org> <20111019082139.1661868e@auedv3.syscomp.de> <4E9EEF45.9020404@FreeBSD.org> <20111019182130.27446750@naclador.mos32.de> <4EB98E05.4070900@FreeBSD.org> <20111119211921.7ffa9953@naclador.mos32.de> <4EC8CD14.4040600@FreeBSD.org> <20111120121248.5e9773c8@naclador.mos32.de> <4EC91B36.7060107@FreeBSD.org> <20111120191018.1aa4e882@naclador.mos32.de> <4ECA2DBD.5040701@FreeBSD.org> <20111121201332.03ecadf1@naclador.mos32.de> <4ECAC272.5080500@FreeBSD.org> <4ECEBD44.6090900@FreeBSD.org> <20111125224722.6cf3a299@naclador.mos32.de> <4ED0CFF9.4030503@FreeBSD.org> <20111126134927.60fe5097@naclador.mos32.de> In-Reply-To: <20111126134927.60fe5097@naclador.mos32.de> X-Enigmail-Version: undefined Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: Extending zfsboot.c to allow selecting filesystem from boot.config X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 09:23:54 -0000 on 26/11/2011 14:49 Florian Wagner said the following: > > I'll try applying your patches to head instead of stable/8 in the next > days and test that. To make matters easier, can you tell me which > revision of head they are based on? I have finally updated my source repo and rebased the patches upon the recent CURRENT. Please see https://gitorious.org/~avg/freebsd/avgbsd/commits/devel-20111127_1. The interesting commits are all near the HEAD.
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 11:07:19 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 795D81065673 for ; Mon, 28 Nov 2011 11:07:19 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 66ACC8FC1A for ; Mon, 28 Nov 2011 11:07:19 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id pASB7JoA042560 for ; Mon, 28 Nov 2011 11:07:19 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id pASB7ISQ042543 for freebsd-fs@FreeBSD.org; Mon, 28 Nov 2011 11:07:18 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 28 Nov 2011 11:07:18 GMT Message-Id: <201111281107.pASB7ISQ042543@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 11:07:19 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/162860 fs [zfs] Cannot share ZFS filesystem to hosts with a hyph o kern/162751 fs [zfs] [panic] kernel panics during file operations o kern/162591 fs [nullfs] cross-filesystem nullfs does not work as expe o kern/162564 fs [ext2fs][patch] fs/ext2fs: Add include guard o kern/162519 fs [zfs] "zpool import" relies on buggy realpath() behavi o kern/162362 fs [snapshots] [panic] ufs with snapshot(s) panics when g o kern/162083 fs [zfs] [panic] zfs unmount -f pool o kern/161968 fs [zfs] [hang] renaming snapshot with -r including a zvo o kern/161897 fs [zfs] [patch] zfs partition probing causing long delay o kern/161864 fs [ufs] removing journaling from UFS partition fails on o bin/161807 fs [patch] add option for explicitly specifying metadata o kern/161674 fs [ufs] snapshot on journaled ufs doesn't work o kern/161579 fs [smbfs] FreeBSD sometimes panics when an smb share is o kern/161533 fs [zfs] [panic] zfs receive panic: system ioctl returnin o kern/161511 fs [unionfs] Filesystem deadlocks when using multiple uni o kern/161438 fs [zfs] [panic] recursed on non-recursive spa_namespace_ o kern/161424 fs [nullfs] __getcwd() calls fail when used on nullfs mou o kern/161280 fs [zfs] Stack overflow in gptzfsboot o kern/161205 fs [nfs] [pfsync] [regression] [build] Bug report freebsd o kern/161169 fs [zfs] [panic] ZFS causes kernel panic in dbuf_dirty o kern/161112 fs [ufs] [lor] filesystem LOR in FreeBSD 9.0-BETA3 o kern/160893 fs [zfs] [panic] 9.0-BETA2 kernel panic o kern/160860 fs Random UFS root filesystem corruption with SU+J [regre o kern/160801 fs [zfs] zfsboot on 8.2-RELEASE fails to boot from root-o o kern/160790 fs [fusefs] [panic] VPUTX: negative ref count with FUSE o kern/160777 fs [zfs] [hang] RAID-Z3 causes fatal hang upon scrub/impo o kern/160706 fs [zfs] zfs bootloader fails when a non-root vdev exists o kern/160591 fs [zfs] Fail to boot on zfs root with degraded 
raidz2 [r o kern/160410 fs [smbfs] [hang] smbfs hangs when transferring large fil o kern/160283 fs [zfs] [patch] 'zfs list' does abort in make_dataset_ha o kern/159971 fs [ffs] [panic] panic with soft updates journaling durin o kern/159930 fs [ufs] [panic] kernel core o kern/159418 fs [tmpfs] [panic] [patch] tmpfs kernel panic: recursing o kern/159402 fs [zfs][loader] symlinks cause I/O errors o kern/159357 fs [zfs] ZFS MAXNAMELEN macro has confusing name (off-by- o kern/159356 fs [zfs] [patch] ZFS NAME_ERR_DISKLIKE check is Solaris-s o kern/159351 fs [nfs] [patch] - divide by zero in mountnfs() o kern/159251 fs [zfs] [request]: add FLETCHER4 as DEDUP hash option o kern/159233 fs [ext2fs] [patch] fs/ext2fs: finish reallocblk implemen o kern/159232 fs [ext2fs] [patch] fs/ext2fs: merge ext2_readwrite into o kern/159077 fs [zfs] Can't cd .. with latest zfs version o kern/159048 fs [smbfs] smb mount corrupts large files o kern/159045 fs [zfs] [hang] ZFS scrub freezes system o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs amd(8) ICMP storm and unkillable process. 
o kern/158711 fs [ffs] [panic] panic in ffs_blkfree and ffs_valloc o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157722 fs [geli] unable to newfs a geli encrypted partition o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs f kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153847 fs [nfs] [panic] Kernel panic from incorrect m_free in nf o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153520 fs [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 
fs [zfs] booting from a gzip-compressed dataset doesn't w o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/151111 fs [zfs] vnodes leakage during zfs unmount o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel 
Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an f bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files f sparc/123566 fs [zfs] zpool import issue: EOVERFLOW o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD 
and F o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] 
Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 259 problems total. 
From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 12:55:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B9A79106564A for ; Mon, 28 Nov 2011 12:55:30 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 7638A8FC19 for ; Mon, 28 Nov 2011 12:55:30 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1RV0k5-0007jY-I4 for freebsd-fs@freebsd.org; Mon, 28 Nov 2011 13:55:29 +0100 Received: from ib-jtotz.ib.ic.ac.uk ([155.198.110.220]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 28 Nov 2011 13:55:29 +0100 Received: from johannes by ib-jtotz.ib.ic.ac.uk with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Mon, 28 Nov 2011 13:55:29 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Johannes Totz Date: Mon, 28 Nov 2011 12:55:17 +0000 Lines: 39 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: ib-jtotz.ib.ic.ac.uk User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20111105 Thunderbird/8.0 Subject: panic: solaris assert X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 12:55:30 -0000 Hi! Just got a panic trying to create a new dataset on a test pool. No dump; this is transcribed off the screen:
panic: solaris assert: zfs_get_zplprop(os, ZFS_PROP_NORMALIZE, &norm) == 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c, line: 2819
cpuid = 0
KDB: stack backtrace:
#0 ... kdb_backtrace
#1 ... panic
#2 ... zfs_fill_zplprops_impl
#3 ... zfs_ioc_create
#4 ... zfsdev_ioctl
#5 ... devfs_ioctl_f
#6 ... kern_ioctl
#7 ... sys_ioctl
#8 ... amd64_syscall
#9 ... Xfast_syscall
I don't have the pool config anymore. The only curious thing was that zpool status reported lots of checksum errors but nonetheless regarded the pool as "healthy"... This is on: FreeBSD XXX 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #0 r227793: Mon Nov 21 20:19:04 GMT 2011 root@XXX:/usr/obj/usr/src/sys/GENERIC amd64 I checked the code for the assert. Instead of just assert'ing the return value, it might be better to return an error to the caller and fail dataset creation. Johannes From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 14:19:28 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E2C52106566C; Mon, 28 Nov 2011 14:19:28 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 9553F8FC0C; Mon, 28 Nov 2011 14:19:28 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id D80C13AF; Sat, 26 Nov 2011 11:49:47 +0100 (CET) Date: Sat, 26 Nov 2011 11:48:41 +0100 From: Pawel Jakub Dawidek To: Mark Felder Message-ID: <20111126104840.GA8794@garage.freebsd.pl> References: <95d00c1b714837aa32e7da72bc4afd03@feld.me> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ikeVEW9yuYc//A+q" Content-Disposition: inline In-Reply-To: <95d00c1b714837aa32e7da72bc4afd03@feld.me> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: zfs i/o hangs on 9-PRERELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post:
List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 14:19:29 -0000 --ikeVEW9yuYc//A+q Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Nov 25, 2011 at 01:20:01PM -0600, Mark Felder wrote: > 13:14:32 nas:~ > uname -a > FreeBSD nas.feld.me 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #3 r227971M: > Fri Nov 25 10:07:48 CST 2011 > root@nas.feld.me:/usr/obj/tank/svn/sys/GENERIC amd64 > > This seemed to start happening sometime after RC1. I tried 8-STABLE and > it's happening there too right now. I think whatever caused this was > MFC'd. I've also reproduced this on completely different hardware > running a single disk ZFS pool. > > I'm getting this output in dmesg after these hangs I keep seeing. Mark, those backtraces are not related to ZFS, but to PF. Not sure if they are related to your hangs at all. Most cases where ZFS I/O seems to hang are hardware problems, where I/O requests are not completed. 'procstat -kk -a' output might be useful once the hang happens.
> uma_zalloc_arg: zone "pfrktable" with the following non-sleepable locks held:
> exclusive sleep mutex pf task mtx (pf task mtx) r = 0 (0xffffffff8199af20) locked @ /tank/svn/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1589
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> _witness_debugger() at _witness_debugger+0x2e
> witness_warn() at witness_warn+0x2c4
> uma_zalloc_arg() at uma_zalloc_arg+0x335
> pfr_create_ktable() at pfr_create_ktable+0xd8
> pfr_ina_define() at pfr_ina_define+0x12b
> pfioctl() at pfioctl+0x1c5a
> devfs_ioctl_f() at devfs_ioctl_f+0x7a
> kern_ioctl() at kern_ioctl+0xcd
> sys_ioctl() at sys_ioctl+0xfd
> amd64_syscall() at amd64_syscall+0x3ac
> Xfast_syscall() at Xfast_syscall+0xf7
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800da711c, rsp = 0x7fffffff9d28, rbp = 0x7fffffffa1f0 ---
-- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am!
http://yomoli.com --ikeVEW9yuYc//A+q Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk7QxAgACgkQForvXbEpPzR0VgCfR/mF7sxZOaNYoHcsvOIDTljh Re0AnR9RoDZr4yLmuwSqGrEaaLDu4B1E =pCIh -----END PGP SIGNATURE----- --ikeVEW9yuYc//A+q-- From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 14:29:28 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6060F106564A for ; Mon, 28 Nov 2011 14:29:28 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 15E608FC13 for ; Mon, 28 Nov 2011 14:29:28 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id 8F0443E7; Sat, 26 Nov 2011 13:08:30 +0100 (CET) Date: Sat, 26 Nov 2011 13:07:28 +0100 From: Pawel Jakub Dawidek To: Johan Hendriks Message-ID: <20111126120727.GB8794@garage.freebsd.pl> References: <4ED0949A.8080602@gmail.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="3lcZGd9BuhuYXNfi" Content-Disposition: inline In-Reply-To: <4ED0949A.8080602@gmail.com> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: zvol and zfs send no file on the receiving side. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 14:29:28 -0000 --3lcZGd9BuhuYXNfi Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sat, Nov 26, 2011 at 08:26:18AM +0100, Johan Hendriks wrote: > Hello all! 
> After some reluctance about using ZFS, I decided it was time to just use it, > and it makes my life a lot easier. > Thank you all! > > One thing I cannot get to work, however, is sending and receiving a ZFS > volume. > > I have a master, so to say, and a backup server. > On the master the pool is named storage; on the backup machine the pool > is called tank. > > On the master I did the following:
> # zfs create -V10G storage/iscsitest
> If I do a zfs list -t volume, I see that the volume is there:
> # zfs list -t volume
> NAME USED AVAIL REFER MOUNTPOINT
> storage/iscsitest 10.3G 65.9G 16K -
> If I go to the directory /storage and do an ls -al, > I see my zvol file:
> # ls -al
> total 1901416
> drwxr-xr-x 4 root wheel 5 Nov 25 17:44 .
> drwxr-xr-x 19 root wheel 1024 Nov 25 11:33 ..
> -rw-r--r-- 1 root wheel 10737418240 Nov 25 20:46 iscsitest
Very interesting that you have this file here. It surely wasn't created by ZFS. A ZVOL is not a regular file; it is a GEOM provider (a disk-like device) and can be found in /dev/zvol/storage/iscsitest. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am!
http://yomoli.com --3lcZGd9BuhuYXNfi Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk7Q1n8ACgkQForvXbEpPzQhnACbBuB874983gYwNOQFAOJd2mly bHMAoPa7JJOmyC9TW+qHOhX4GJUJg18r =ZWKd -----END PGP SIGNATURE----- --3lcZGd9BuhuYXNfi-- From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 14:34:28 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED0DE106566C for ; Mon, 28 Nov 2011 14:34:28 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id A4CF58FC0C for ; Mon, 28 Nov 2011 14:34:28 +0000 (UTC) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) by mail.dawidek.net (Postfix) with ESMTPSA id 610BA4C2; Sat, 26 Nov 2011 17:59:31 +0100 (CET) Date: Sat, 26 Nov 2011 17:58:23 +0100 From: Pawel Jakub Dawidek To: freebsd-fs@FreeBSD.org Message-ID: <20111126165823.GD8794@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="PHCdUe6m4AxPMzOu" Content-Disposition: inline X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Subject: NFS corruption in recent HEAD. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 14:34:29 -0000 --PHCdUe6m4AxPMzOu Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi. I'm booting machine over the network using new NFS client and I'm getting those warnings on boot: /etc/rc.subr: 666: Syntax error: "(" unexpected (expecting ";;") I inspected the /etc/rc.subr file on the client and here is the problem. 
At offset 16384 the file on the client contains data from offset 32768 from the server. It contains exactly 7599 bytes from the wrong place at this offset. All subsequent bytes up to offset 32768 are zeros. So the data is identical for ranges <0-16384) and <32768-40367) (to the end of the file). Then range <16384-23983) contains data from <32768-40367) and <23984-32768) is all zeros. Probably, if the file were bigger, there would be no zeros, just more data from the wrong block. It seems that the client is asking for the third block where it should ask for the second block (or the server is providing the wrong block). Server is running '8.2-STABLE #17: Wed Sep 28 10:30:02 EDT 2011'. BTW, when I copy the file on the client using cp(1), the copy is not corrupted (cp(1) is using mmap(2)?). But when I do 'cat /etc/rc.subr > /foo' the corruption is visible in the new file too. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com --PHCdUe6m4AxPMzOu Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk7RGq8ACgkQForvXbEpPzTXjQCghLqwfZa5mwoa1ZtpXcwuGp8Q qesAnR665r+tDN/hdR1o3GFmAiw06iUQ =GyUF -----END PGP SIGNATURE----- --PHCdUe6m4AxPMzOu-- From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 14:39:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B6851065673; Mon, 28 Nov 2011 14:39:27 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 8FBAC8FC16; Mon, 28 Nov 2011 14:39:26 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAPOb006DaFvO/2dsb2JhbAA7CIUDpwCBcgEBBSMyJBsYAgINGQJZBhOtGpFXgTCCL4NSghuBFgSIIYwpkik X-IronPort-AV: E=Sophos;i="4.69,584,1315195200"; d="scan'208";a="147318916"
Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 28 Nov 2011 09:39:25 -0500 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 66ED0B406C; Mon, 28 Nov 2011 09:39:25 -0500 (EST) Date: Mon, 28 Nov 2011 09:39:25 -0500 (EST) From: Rick Macklem To: Pawel Jakub Dawidek Message-ID: <2111049507.479178.1322491165407.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20111126104840.GA8794@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, freebsd-current@freebsd.org Subject: Re: zfs i/o hangs on 9-PRERELEASE X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 14:39:27 -0000 Pawel Jakub Dawidek wrote: > On Fri, Nov 25, 2011 at 01:20:01PM -0600, Mark Felder wrote: > > 13:14:32 nas:~ > uname -a > > FreeBSD nas.feld.me 9.0-PRERELEASE FreeBSD 9.0-PRERELEASE #3 > > r227971M: > > Fri Nov 25 10:07:48 CST 2011 > > root@nas.feld.me:/usr/obj/tank/svn/sys/GENERIC amd64 > > > > This seemed to start happening sometime after RC1. I tried 8-STABLE > > and > > it's happening there too right now. I think whatever caused this was > > MFC'd. I've also reproduced this on completely different hardware > > running a single disk ZFS pool. > > > > > > I'm getting this output in dmesg after these hangs I keep seeing. > > Mark, those backtrace are not related to ZFS, but to PF. Not sure if > they are at all related to your hangs. Most cases where ZFS I/O seems > to > hang are hardware problems, where I/O requests are not completed. > He recently posted that his hangs went away when he stopped using NFS. 
NFS does use uma_zalloc() and there are several places in pfioctl() where uma_zalloc(...M_WAITOK...) is called (hidden under pool_get()) when a mutex (the PF_LOCK() one) is held. I've emailed bz@ related to this. I'm also not sure if they could be related to his hangs, but it seems that if uma_zalloc() decides to sleep with the mutex held, something may break and a broken uma_zalloc() would impact NFS. rick > 'procstat -kk -a' output might be useful once the hang happens. > > > uma_zalloc_arg: zone "pfrktable" with the following non-sleepable > > locks > > held: > > exclusive sleep mutex pf task mtx (pf task mtx) r = 0 > > (0xffffffff8199af20) locked @ > > /tank/svn/sys/modules/pf/../../contrib/pf/net/pf_ioctl.c:1589 > > KDB: stack backtrace: > > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > > kdb_backtrace() at kdb_backtrace+0x37 > > _witness_debugger() at _witness_debugger+0x2e > > witness_warn() at witness_warn+0x2c4 > > uma_zalloc_arg() at uma_zalloc_arg+0x335 > > pfr_create_ktable() at pfr_create_ktable+0xd8 > > pfr_ina_define() at pfr_ina_define+0x12b > > pfioctl() at pfioctl+0x1c5a > > devfs_ioctl_f() at devfs_ioctl_f+0x7a > > kern_ioctl() at kern_ioctl+0xcd > > sys_ioctl() at sys_ioctl+0xfd > > amd64_syscall() at amd64_syscall+0x3ac > > Xfast_syscall() at Xfast_syscall+0xf7 > > --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800da711c, rsp = > > 0x7fffffff9d28, rbp = 0x7fffffffa1f0 --- > > -- > Pawel Jakub Dawidek http://www.wheelsystems.com > FreeBSD committer http://www.FreeBSD.org > Am I Evil? Yes, I Am! 
http://yomoli.com From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 15:04:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8D9A1065673; Mon, 28 Nov 2011 15:04:27 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 8D3E68FC13; Mon, 28 Nov 2011 15:04:27 +0000 (UTC) Received: from localhost (58.wheelsystems.com [83.12.187.58]) by mail.dawidek.net (Postfix) with ESMTPSA id CFEE6EEB; Fri, 25 Nov 2011 12:03:40 +0100 (CET) Date: Fri, 25 Nov 2011 12:02:35 +0100 From: Pawel Jakub Dawidek To: Kostik Belousov Message-ID: <20111125110235.GB1642@garage.freebsd.pl> References: <1957615267.20111123230026@serebryakov.spb.ru> <20111123194444.GE50300@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="OwLcNYc0lM97+oe1" Content-Disposition: inline In-Reply-To: <20111123194444.GE50300@deviant.kiev.zoral.com.ua> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 15:04:27 -0000 --OwLcNYc0lM97+oe1 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Nov 23, 2011 at 09:44:44PM +0200, Kostik Belousov wrote: > On Wed, Nov 23, 2011 at 11:00:26PM +0400, Lev Serebryakov wrote: > > Hello, Freebsd-fs. > >=20 > > Does UFS2 with softupdates (without journal) issues BIO_FLUSH to > > GEOM layer when it need to ensure consistency on on-disk metadata? > No. 
Softupdates do not need flushes. Well, they do, for two reasons:
1. To properly handle sync operations (fsync(2), O_SYNC).
2. To maintain consistent on-disk structures.
The second point is there because BIO_FLUSH is the only way to avoid reordering (apart from turning off the disk write cache). SU assumes no I/O reordering will happen, which is a very weak assumption. -- Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com --OwLcNYc0lM97+oe1 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk7PdcsACgkQForvXbEpPzR8WgCffIE47sDfnjN+O411ELBT/hAV NRcAoKzWvT5wiAqg6reIdvqJqtAq5/30 =kLyd -----END PGP SIGNATURE----- --OwLcNYc0lM97+oe1-- From owner-freebsd-fs@FreeBSD.ORG Mon Nov 28 23:23:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0BDA7106564A for ; Mon, 28 Nov 2011 23:23:49 +0000 (UTC) (envelope-from techchavez@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 9E0DF8FC08 for ; Mon, 28 Nov 2011 23:23:48 +0000 (UTC) Received: by eaai12 with SMTP id i12so3504606eaa.13 for ; Mon, 28 Nov 2011 15:23:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; bh=9zmPTOY+NwTTsJks4wZkc03GSUlvQPuzlYVD2zdMqaM=; b=QFzSEa0gQeSV2ReYIhNY+dyEmLUIi1ho5MLPyQ8cjWEl4sPS7YhryOoyrRk7ZFnznz Rm0yjtHzC0rSrDwusoGWBS28gi2E0ZmvuPw9I2VTOsPICRsRj6xyKo9CNDu9ylenYm3K l85HN71Xqt9aABOndVg9/RPV0aZ2kJniYAav0= MIME-Version: 1.0 Received: by 10.180.103.131 with SMTP id fw3mr46785734wib.57.1322521316727; Mon, 28 Nov 2011 15:01:56 -0800 (PST) Received: by 10.180.94.197 with HTTP; Mon, 28 Nov 2011 15:01:56 -0800 (PST) Date: Mon, 28 Nov 2011 16:01:56 -0700 Message-ID: From:
Techie To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Subject: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Nov 2011 23:23:49 -0000 Hi all, Are there any plans to implement sharing of the ZFS DDT dedup table, or to make ZFS aware of the duplicate blocks present at a remote destination? From how I understand it, the zfs send/recv stream does not know about the duplicated blocks on the receiving side when using zfs send -D -i to send only incremental changes. So take for example: I have an application that I back up each night to a ZFS file system. I want to replicate this every night to my remote site. Each night that I back up, I create a tar file on the ZFS data file system. When I go to send an incremental stream, it sends the entire tar file to the destination even though over 90% of those blocks already exist at the destination. Are there any plans to make ZFS aware of what already exists at the destination site, to eliminate the need to send duplicate blocks over the wire? zfs send -D, I believe, only eliminates the duplicate blocks within the stream. Perhaps I am wrong.
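[Editorial note: the stream-level limitation described in the question can be illustrated with a toy model. This is purely illustrative Python, not ZFS code; real ZFS consults its on-disk deduplication table (DDT), and the destination-aware variant below is the hypothetical feature being asked about.]

```python
import hashlib

BLOCK = 4096  # toy block size

def blocks(data):
    """Split a byte string into fixed-size blocks."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def dedup_send(data, dest_hashes=None):
    """Return the blocks a deduplicated send stream would transmit.

    With dest_hashes=None this models `zfs send -D`: only duplicates
    *within the stream itself* are elided.  Passing the receiver's set
    of block hashes models a (hypothetical) destination-aware send.
    """
    seen = set() if dest_hashes is None else set(dest_hashes)
    sent = []
    for b in blocks(data):
        h = hashlib.sha256(b).digest()
        if h not in seen:
            seen.add(h)
            sent.append(b)
    return sent

# Nightly tar archive: 9 of its 10 blocks are identical to data the
# remote site already holds; only one block is genuinely new.
old = b"".join(bytes([i]) * BLOCK for i in range(10))
new = old[:9 * BLOCK] + b"x" * BLOCK

remote = {hashlib.sha256(b).digest() for b in blocks(old)}

print(len(dedup_send(new)))          # 10 -- stream-level dedup resends everything
print(len(dedup_send(new, remote)))  # 1  -- only the genuinely new block
```

An incremental snapshot send (-i) only avoids resending blocks unchanged since the common snapshot; a freshly written tar file counts as all-new blocks even when its contents duplicate older data, which is exactly the gap described above.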
Thanks
Jimmy

From owner-freebsd-fs@FreeBSD.ORG Tue Nov 29 09:58:28 2011
Date: Tue, 29 Nov 2011 13:58:16 +0400
From: Lev Serebryakov
Organization: FreeBSD Project
To: Kirk McKusick
Cc: freebsd-fs@freebsd.org
Message-ID: <1401883450.20111129135816@serebryakov.spb.ru>
In-Reply-To: <201111290639.pAT6dWbq073934@chez.mckusick.com>
Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)?

Hello, Kirk.
You wrote on 29 November 2011 at 10:39:32:

> The FSYNC flag is useful. Most write requests are being done in
> background, so the delay is not an issue. But there are some write
> requests where the application is waiting and in those cases
> minimizing the delay is necessary and important.

Yep.
So, maybe we should record this as the conclusion of the discussion, and the FS folks (you? Kostik?) could add this (an FSYNC flag for "struct buf" in the case of a true FSYNC request) to the TODO list? It doesn't look like much work for a person who understands the FFS codebase well.

> That said, delaying for 5 minutes would be unacceptable. Users do
> expect their data to be stable within a "reasonable" amount of time.
> Historically "reasonable" has been defined as within 30 seconds of
> having been written. Thus delaying a write acknowledgement for more
> than about 30 seconds would not be considered acceptable (though SU
> would be able to handle such a delay).

Ok, I got it. 30 seconds is much better than nothing :) And, I think, it would be possible to let the user set this timeout higher, with a BIG WARNING in the manual page, since the SU code will not break immediately.

It is sad that it is impossible to build a true write cache, one which reports requests as complete right after copying them into the cache, even with the help of a UPS, but that seems to be the sad truth :( I had understood that it is impossible to make a software cache robust data-wise (a crash could kill data in the cache, for sure), but it is news to me that it is impossible to guarantee easily recoverable FS structures in such a case even in cooperation with the FS itself :(

--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Tue Nov 29 10:01:25 2011
Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id pATA1OOL031913; Tue, 29 Nov 2011 10:01:24 GMT (envelope-from 
linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.5/8.14.5/Submit) id pATA1OZZ031903; Tue, 29 Nov 2011 10:01:24 GMT (envelope-from linimon) Date: Tue, 29 Nov 2011 10:01:24 GMT Message-Id: <201111291001.pATA1OZZ031903@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/162944: [coda] Coda file system module looks broken in 9.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 29 Nov 2011 10:01:25 -0000 Old Synopsis: Coda file system module looks broken in 9.0 New Synopsis: [coda] Coda file system module looks broken in 9.0 Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Tue Nov 29 10:00:33 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=162944

From owner-freebsd-fs@FreeBSD.ORG Wed Nov 30 16:52:32 2011
Date: Wed, 30 Nov 2011 20:52:15 +0400
From: Lev Serebryakov
Organization: FreeBSD Project
To: Kirk McKusick
Cc: freebsd-fs@freebsd.org
Message-ID: <1293164635.20111130205215@serebryakov.spb.ru>
In-Reply-To: <201111292154.pATLsgIk073730@chez.mckusick.com>
References: <1401883450.20111129135816@serebryakov.spb.ru> <201111292154.pATLsgIk073730@chez.mckusick.com>
Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)?

Hello, Kirk.
You wrote on 30 November 2011 at 1:54:42:

>> > The FSYNC flag is useful. Most write requests are being done in
>> > background, so the delay is not an issue. But there are some write
>> > requests where the application is waiting and in those cases
>> > minimizing the delay is necessary and important.
>>
>> Yep.
>> So, maybe we record this as conclusion from discussion and FS
>> guys (you? Kostik?) add this (FSYNC flag fro "struct buf" in case of
>> true FSYNC request) to TODO list? It doesn't look as much work for
>> person, who understand FFS codebase well.

> When you say you want an FSYNC flag added to struct buf, what you
> ultimately want is that the BIO_FLUSH flag be set on the request
> handed down to the GEOM layer?

BIO_FLUSH is an additional transaction, not a flag. If there were an FSYNC flag on "struct buf" (passed down from the VFS level), GEOM could either add such a transaction after the BIO_WRITE one, or add its own flag (which does not exist now) to the BIO_WRITE transaction. The latter is better, because it would be faster, would have smaller latency, and would not need special logic in drivers: such a flag could be translated directly into the corresponding hardware flags by the drivers, whereas BIO_FLUSH is processed separately, which could be less effective. Right now GEOM doesn't have such a flag, because it has had no reason to :)

P.S. You didn't add a CC: to the fs@ mailing list -- is that intentional?
--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Wed Nov 30 17:39:34 2011
Date: Wed, 30 Nov 2011 09:12:25 -0800
From: Marty Rosenberg
To: freebsd-fs@freebsd.org
Subject: extattr on zfs is corrupted

Yesterday, I tried to run zfs destroy on the filesystem that was mounted on /local2/media, and a few seconds later the machine kernel panicked (ran out of kernel memory).
Upon restarting the machine, everything seems sane, but gluster no longer works, because it attempts to set an extended attribute on the directory that it is referencing (in this case, /local). Upon investigating, the attribute trusted.glusterfs.test looks like it exists, but is completely inaccessible. [root@memoryalpha /]# lsextattr user /local /local trusted.posix1.gen trusted.glusterfs.dht trusted.glusterfs.test [root@memoryalpha /]# getextattr user trusted.glusterfs.test /local getextattr: /local: failed: Attribute not found [root@memoryalpha /]# rmextattr user trusted.glusterfs.test /local rmextattr: /local: failed: Attribute not found [root@memoryalpha /]# setextattr user trusted.glusterfs.test value /local setextattr: /local: failed: No such file or directory Any ideas on how to fix this? the system is kind of old: FreeBSD memoryalpha.rosenbridge 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #0: Thu Dec 9 16:04:49 UTC 2010 mjrosenb1@memoryalpha.rosenbridge:/usr/obj/usr/src/sys/GENERIC amd64 [root@memoryalpha /]# zpool list -o version store VERSION 15 Thanks --Marty From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 10:44:17 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 721001065673 for ; Thu, 1 Dec 2011 10:44:17 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id 199E18FC18 for ; Thu, 1 Dec 2011 10:44:16 +0000 (UTC) Received: by yenq9 with SMTP id q9so2424129yen.13 for ; Thu, 01 Dec 2011 02:44:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=2viTJKtsN3j+G/+Igln7Ptp8jLhTosc/Q6MGxE4Lu0c=; b=DdvKa6XY40PKUb5vtXYffDZo1zZ8T2FNycnIjOm43b/X08B4f/y0EUoy7tw5rp7Yq2 
SD9HzW7+1ocOlWb2Dddg4oVecdRJ7MXCcEiJ6qktj8VueK7MWJU23KQ0QyNgon1jLQnl iRWPN/qBFOcvZJrQ5ks5xtN5CYOtCx8ksiAwI=
Date: Thu, 1 Dec 2011 10:20:18 +0000
From: krad
To: Techie
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup and replication

On 28 November 2011 23:01, Techie wrote:
> [full quote of the original question trimmed]

Why tar up the stuff? Just do a zfs snap and then you bypass the whole issue?

From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 13:15:42 2011
Message-ID: <4ED77B09.1090709@brockmann-consult.de>
Date: Thu, 01 Dec 2011 14:03:05 +0100
From: Peter Maloney
To: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup and replication
List-Subscribe: , 
X-List-Received-Date: Thu, 01 Dec 2011 13:15:42 -0000

On 12/01/2011 11:20 AM, krad wrote:
> On 28 November 2011 23:01, Techie wrote:
>> [full quote trimmed]
>
> Why tar up the stuff? Just do a zfs snap and then you bypass the whole
> issue?
I was thinking the same thing when I read his message. I don't understand it either.
On my system with 12 TiB used up, what I do in a script is basically:

- generate a snap name
- make a recursive snapshot
- ssh to the remote server and compare snapshots (find the latest common snapshot, to use as an incremental reference point)
- if a usable reference point exists, start the incremental send like this (which wipes all changes on the remote system without confirmation):
    zfs send -R -I ${destLastSnap} ${srcLastSnap} | ssh ${destHost} zfs recv -d -F -v ${destPool}
- and if no usable reference point existed, then do a full send, non-incremental:
    zfs send -R ${srcLastSnap} | ssh ${destHost} zfs recv -F -v ${destDataSet}

The part about finding the reference snapshot is the most complicated part of my script, and missing from anything else I found online when I was looking for a good solution. For example this script:
http://blogs.sun.com/clive/resource/zfs_repl.ksh
found on this page:
http://blogs.oracle.com/clive/entry/replication_using_zfs
was found to be quite terrible, and would fail completely when there was a new dataset, or a snapshot missing for some reason. So I suggest you look at that one, but write your own.

The only time my script failed was when there was a zfs bug; the same one seen here:
http://serverfault.com/questions/66414/cannot-destroy-zfs-snapshot-dataset-already-exists
so I just deleted the clone manually and it worked again.

I thought gzip could save a small amount of time, e.g. I compared the speed of "zfs send .... | ssh zfs recv ..." to "zfs send ... | gzip -c | ssh 'gunzip -c | zfs recv...'" and found little or no difference. But I have no idea why you would use tar.

And just to confirm, I have the same problems with dedup causing severe bottlenecks on many things, especially zfs recv and scrub, even though I have 48 GB of memory installed and 44 available to ZFS.

But I find incremental sends to be very efficient, taking much less than a minute (depending on how much data was changed) when it runs every hour.
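[Editor's note: the snapshot-matching step described above — picking the newest snapshot present on both sides as the incremental reference, and falling back to a full send when there is none — can be sketched in plain sh. This is only an illustration of the logic, not the author's actual script: the function name is made up, the snapshot lists are hard-coded stand-ins for what `zfs list -t snapshot` would print locally and over ssh, and `desthost`/`destpool` are placeholder names.]

```shell
#!/bin/sh
# Given local and remote snapshot lists for one dataset (oldest first,
# as "zfs list -t snapshot -o name -s creation" would print them),
# pick the newest snapshot that exists on BOTH sides.
latest_common_snap() {
    local_snaps=$1    # whitespace-separated names, oldest first
    remote_snaps=$2
    common=""
    for snap in $local_snaps; do
        for rsnap in $remote_snaps; do
            # later matches overwrite earlier ones, so the newest wins
            [ "$snap" = "$rsnap" ] && common=$snap
        done
    done
    echo "$common"    # empty output means: no reference point, full send
}

# Example: the remote side is missing the newest local snapshot.
LOCAL="tank@2011-11-29 tank@2011-11-30 tank@2011-12-01"
REMOTE="tank@2011-11-29 tank@2011-11-30"

ref=$(latest_common_snap "$LOCAL" "$REMOTE")
if [ -n "$ref" ]; then
    echo "incremental: zfs send -R -I $ref tank@2011-12-01 | ssh desthost zfs recv -d -F destpool"
else
    echo "full send:   zfs send -R tank@2011-12-01 | ssh desthost zfs recv -F destpool"
fi
```

The useful property is that the fallback is automatic: if the remote side has no snapshot in common with the local side (new dataset, snapshots destroyed), the function returns nothing and the script drops down to a full, non-incremental send.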
And unless your bandwidth is slow and precious, I recommend sending more than daily, because it is very fast if done often enough. I send hourly because I didn't have time to work on some scripts to clean up the old snapshots. Otherwise I would do it every 15 min or maybe 15 seconds ;) > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- -------------------------------------------- Peter Maloney Brockmann Consult Max-Planck-Str. 2 21502 Geesthacht Germany Tel: +49 4152 889 300 Fax: +49 4152 889 333 E-mail: peter.maloney@brockmann-consult.de Internet: http://www.brockmann-consult.de -------------------------------------------- From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 16:42:55 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 837BA106566B for ; Thu, 1 Dec 2011 16:42:55 +0000 (UTC) (envelope-from kraduk@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 410D68FC0C for ; Thu, 1 Dec 2011 16:42:55 +0000 (UTC) Received: by ywp17 with SMTP id 17so3093683ywp.13 for ; Thu, 01 Dec 2011 08:42:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=LNZJycUjaUobWf311XcpMHgLfeXgzFrn9nSrTa28Nws=; b=D05LnwLUJUepbw7wcCIRmzcYtJKhC6J/HEcwTI6ai3EGKPQDK/HDrgfaFW+M5u66ID v4Q34UkZnS8iPCEYQ4gW5Z/43V5DmT7FjkEEoBY2J2EsR3GHAVyBWHYmmZkmd2ytk1H+ o4ovHOJIOvP/kDoysfHGxL9Gfop2EyAXXrgcE= MIME-Version: 1.0 Received: by 10.236.192.233 with SMTP id i69mr13065899yhn.60.1322757774657; Thu, 01 Dec 2011 08:42:54 -0800 (PST) Received: by 10.236.95.41 with HTTP; Thu, 1 Dec 2011 08:42:54 -0800 (PST) In-Reply-To: 
<4ED77B09.1090709@brockmann-consult.de> References: <4ED77B09.1090709@brockmann-consult.de>
Date: Thu, 1 Dec 2011 16:42:54 +0000
From: krad
To: Peter Maloney
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup and replication

On 1 December 2011 13:03, Peter Maloney wrote:
> [full quote of the two previous messages trimmed]

sounds like we have been through very similar experiences

From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 17:19:28 2011
Message-ID: <4ED7B713.200@sentex.net>
Date: Thu, 01 Dec 2011 12:19:15 -0500
From: Mike Tancsa
Organization: Sentex Communications
To: freebsd-fs@freebsd.org
Subject: mount_smbfs re-exported via samba not working

Hi, I am trying to export a 
series of shares from one Windows XP box, to another so I can better control and log access, and I am having problems seeing the files. I am not sure if this is a samba issue or a mount_smbfs issue.

The setup is

[XP-Server] -------- [FreeBSD]------[DMZ Clients]

So the FreeBSD box mounts via mount_smbfs from [XP-Server] and serves up the share under samba so that the DMZ clients can see it.

I can mount the Windows XP file system shares no problem, e.g.

mount_smbfs -c u -d 777 -f 777 -O xpuser:xpuser -N //xpuser@xpserver/pricelist /export

This shows up, and I can see and create files no problem from the FreeBSD box:

# mount -t smbfs
//XPUSER@XPSERVER/PRICELIST on /export (smbfs)

# ls -l /export
total 17
drwxrwxrwx 1 xpuser xpuser - 16384 Dec 31 1969 .
drwxrwxrwx 7 xpuser xpuser - 512 Dec 1 08:04 ..
-rwxrwxrwx 1 xpuser xpuser - 95 Dec 1 10:29 dd
-rwxrwxrwx 1 xpuser xpuser - 92 Dec 1 10:58 n
-rwxrwxrwx 1 xpuser xpuser - 92 Dec 1 10:33 new
-rwxrwxrwx 1 xpuser xpuser - 95 Dec 1 10:29 test
-rwxrwxrwx 1 xpuser xpuser - 436 Dec 1 10:34 test2
-rwxrwxrwx 1 xpuser xpuser - 15 Dec 1 10:22 this-is-a-test.txt

Now the problem is when I try to re-export that using samba. The files never show up on the Windows PC. E.g., on the other PC attached to the DMZ NIC of the FreeBSD server, I try something like

net use m: \\192.168.1.1\pl /user:someothersmbuser

I can map the drive, but doing a dir shows no files. If I do something like dir > test, it does actually create the file, but I still cannot see it.

If I use smbclient from another FreeBSD box, also in the DMZ:

%smbclient -U somesmbuser //192.168.1.1/pl
Enter somesmbuser's password:
Domain=[DMZ] OS=[Unix] Server=[Samba 3.6.1]
smb: \> dir
NT_STATUS_INVALID_HANDLE listing \*

But... I can actually read the files that I know are there, and make and change into directories ??!!
smb: \> get dd
getting file \dd of size 95 as dd (1.2 KiloBytes/sec) (average 1.2 KiloBytes/sec)
smb: \> get test2
getting file \test2 of size 436 as test2 (5.6 KiloBytes/sec) (average 3.4 KiloBytes/sec)
smb: \>
smb: \> mkdir testdir
smb: \> cd testdir
smb: \testdir\> dir
NT_STATUS_INVALID_HANDLE listing \testdir\*
smb: \testdir\>

Any idea why it's not working? It seems directory listings are the only things not working, even though ls sees the files on the FreeBSD box.

If I export a directory in samba that just has a normal UFS file system, all works just fine. It's only when I try to export the smbfs system that it does not work. I also tried exporting a nullfs-mounted file system and that worked, but again, only if the underlying file system was UFS, not smbfs.

---Mike

--
-------------------
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/

From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 19:48:58 2011
Message-ID: <4ED7DA15.1030108@brockmann-consult.de>
Date: Thu, 01 Dec 2011 20:48:37 +0100
From: Peter Maloney
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0) 
Gecko/20111105 Thunderbird/8.0
To: freebsd-fs@freebsd.org
In-Reply-To: <4ED7B713.200@sentex.net>
Subject: Re: mount_smbfs re-exported via samba not working

On 01.12.2011 18:19, Mike Tancsa wrote:
> Hi,
>
> I am trying to export a series of shares from one Windows XP box, to another so I can better control and log access and am having problems seeing the files. I am not sure if this is a samba issue or an mount_smbfs issue
>
> The setup is
>
> [XP-Server] -------- [FreeBSD]------[DMZ Clients]

I am doing something like this:

FreeBSD zfs -----nfs mount -------> Linux machine -------smb mount-------> whichever client

Note I am using NFS from FreeBSD instead of samba. And before I added this to the linux machine's smb.conf global section, I would get total lockups and nobody could list directories:

strict locking = no
blocking locks = no

So give that a try. With the first setting, any tests I tried worked fine, but others reported a similar problem, maybe when there are a few concurrent users. With the second setting, I haven't had the same problem since. (But if I share my .zfs directory over NFS, listing that hangs the dataset)

>
> So FreeBSD box mount_smbfs from [XP-Server], servs up the share under samab so that the DMZ clients can see it. 
> > > I can mount the windows XP file system shares no problem > eg > > mount_smbfs -c u -d 777 -f 777 -O xpuser:xpuser -N //xpuser@xpserver/pricelist /export > > This shows up, and I can see and create files no problem from the FreeBSD box > > # mount -t smbfs > //XPUSER@XPSERVER/PRICELIST on /export (smbfs) > > > > # ls -l /export > total 17 > drwxrwxrwx 1 xpuser xpuser - 16384 Dec 31 1969 . > drwxrwxrwx 7 xpuser xpuser - 512 Dec 1 08:04 .. > -rwxrwxrwx 1 xpuser xpuser - 95 Dec 1 10:29 dd > -rwxrwxrwx 1 xpuser xpuser - 92 Dec 1 10:58 n > -rwxrwxrwx 1 xpuser xpuser - 92 Dec 1 10:33 new > -rwxrwxrwx 1 xpuser xpuser - 95 Dec 1 10:29 test > -rwxrwxrwx 1 xpuser xpuser - 436 Dec 1 10:34 test2 > -rwxrwxrwx 1 xpuser xpuser - 15 Dec 1 10:22 this-is-a-test.txt > > > Now the problem is when I try and re-export that using samba. The files never show up on the windows PC. e.g on the other PC attached to the DMZ NIC of the FreeBSD server, I try and do something like > > net use m: \\192.168.1.1\pl /user:someothersmbuser > > I can map the drive, but doing a dir shows no files. if I do something like dir > test, it does actually create the file, but I still cannot see it > > If I use smbclient from another FreeBSD box, also in the DMZ > > %smbclient -U somesmbuser //192.168.1.1/pl > > Enter somesmbuser's password: > Domain=[DMZ] OS=[Unix] Server=[Samba 3.6.1] > smb: \> dir > NT_STATUS_INVALID_HANDLE listing \* > > > But... I can actually read the files that I know are there and make and change into directories ??!! 
> > smb: \> get dd > getting file \dd of size 95 as dd (1.2 KiloBytes/sec) (average 1.2 KiloBytes/sec) > smb: \> get test2 > getting file \test2 of size 436 as test2 (5.6 KiloBytes/sec) (average 3.4 KiloBytes/sec) > smb: \> > smb: \> mkdir testdir > smb: \> cd testdir > smb: \testdir\> dir > NT_STATUS_INVALID_HANDLE listing \testdir\* > smb: \testdir\> > > Any idea why its not working > > It seems directly listings are the only things not working, even though ls sees them on the FreeBSD box. > > If I export a directory in samba that just has a normal UFS file system all works just fine. Its only when I try and export the smbfs system that it does not work. I also try exporting a nullfs mounted file system and that worked, but again, only if the underlying file system was UFS, not smbfs > > > > ---Mike > > From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 20:14:15 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF704106566B for ; Thu, 1 Dec 2011 20:14:15 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id 4A78A8FC0A for ; Thu, 1 Dec 2011 20:14:14 +0000 (UTC) Received: from [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a] (saphire3.sentex.ca [IPv6:2607:f3e0:0:4:f025:8813:7603:7e4a]) by smarthost1.sentex.ca (8.14.5/8.14.4) with ESMTP id pB1KE94J013407; Thu, 1 Dec 2011 15:14:09 -0500 (EST) (envelope-from mike@sentex.net) Message-ID: <4ED7E007.1060600@sentex.net> Date: Thu, 01 Dec 2011 15:13:59 -0500 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7 MIME-Version: 1.0 To: Peter Maloney References: <4ED7B713.200@sentex.net> <4ED7DA15.1030108@brockmann-consult.de> In-Reply-To: <4ED7DA15.1030108@brockmann-consult.de> X-Enigmail-Version: 
1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.71 on IPv6:2607:f3e0:0:1::12 Cc: freebsd-fs@freebsd.org Subject: Re: mount_smbfs re-exported via samba not working X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Dec 2011 20:14:15 -0000 On 12/1/2011 2:48 PM, Peter Maloney wrote: > > And before I added this to the linux machine's smb.conf global section, > I would get total lockups and nobody could list directories. > > strict locking = no > blocking locks = no > > So give that a try. Hi, Thanks for the response, but it does not seem to help :( Running a session through tshark sees the same sort of response (STATUS_INVALID_HANDLE) on the request to list the directory structure 89 2.884570 192.168.1.23 -> 192.168.1.1 TCP 44 52479 > microsoft-ds [ACK] Seq=2558 Ack=1651 Win=65375 Len=0 90 2.999964 192.168.1.1 -> 192.168.1.23 TCP 56 netbios-ssn > 56189 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1408 WS=8 SACK_PERM=1 91 7.590425 192.168.1.23 -> 192.168.1.1 SMB 146 NT Create AndX Request, Path: \test2 92 7.594526 192.168.1.1 -> 192.168.1.23 SMB 183 NT Create AndX Response, FID: 0x15cc 93 7.611333 192.168.1.23 -> 192.168.1.1 SMB 120 Trans2 Request, QUERY_FILE_INFO, FID: 0x15cc, Query File Internal Info 94 7.611725 192.168.1.1 -> 192.168.1.23 SMB 116 Trans2 Response, FID: 0x15cc, QUERY_FILE_INFO 95 7.630035 192.168.1.23 -> 192.168.1.1 SMB 118 Trans2 Request, QUERY_FS_INFO, Query FS Attribute Info 96 7.630368 192.168.1.1 -> 192.168.1.23 SMB 124 Trans2 Response, QUERY_FS_INFO 97 7.646549 192.168.1.23 -> 192.168.1.1 SMB 128 Trans2 Request, QUERY_FS_INFO, Info Allocation 98 7.647546 192.168.1.1 -> 192.168.1.23 SMB 122 Trans2 Response, QUERY_FS_INFO 99 7.663806 192.168.1.23 -> 192.168.1.1 SMB 96 Write Request, FID: 0x15cc, 0 bytes at offset 33 100 7.666261 192.168.1.1 
-> 192.168.1.23 SMB 85 Write Response, FID: 0x15cc, 0 bytes 101 7.687203 192.168.1.23 -> 192.168.1.1 SMB 145 Write AndX Request, FID: 0x15cc, 33 bytes at offset 0 102 7.687910 192.168.1.1 -> 192.168.1.23 SMB 95 Write AndX Response, FID: 0x15cc, 33 bytes 103 7.704950 192.168.1.23 -> 192.168.1.1 SMB 148 Write AndX Request, FID: 0x15cc, 36 bytes at offset 33 104 7.705559 192.168.1.1 -> 192.168.1.23 SMB 95 Write AndX Response, FID: 0x15cc, 36 bytes 105 7.722181 192.168.1.23 -> 192.168.1.1 SMB 114 Write AndX Request, FID: 0x15cc, 2 bytes at offset 69 106 7.722782 192.168.1.1 -> 192.168.1.23 SMB 95 Write AndX Response, FID: 0x15cc, 2 bytes 107 7.738991 192.168.1.23 -> 192.168.1.1 SMB 133 Write AndX Request, FID: 0x15cc, 21 bytes at offset 71 108 7.739616 192.168.1.1 -> 192.168.1.23 SMB 95 Write AndX Response, FID: 0x15cc, 21 bytes 109 7.757164 192.168.1.23 -> 192.168.1.1 SMB 134 Trans2 Request, FIND_FIRST2, Pattern: \* 110 7.758395 192.168.1.1 -> 192.168.1.23 SMB 83 Trans2 Response, FIND_FIRST2, Error: STATUS_INVALID_HANDLE 111 7.775637 192.168.1.23 -> 192.168.1.1 SMB 89 Close Request, FID: 0x15cc 112 7.778383 192.168.1.1 -> 192.168.1.23 SMB 83 Close Response, FID: 0x15cc 113 7.913181 192.168.1.23 -> 192.168.1.1 TCP 44 52479 > microsoft-ds [ACK] Seq=3445 Ack=2343 Win=64683 Len=0 114 8.999957 192.168.1.1 -> 192.168.1.23 TCP 56 netbios-ssn > 56189 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1408 WS=8 SACK_PERM=1 115 9.134924 192.168.1.23 -> 192.168.1.1 SMB 118 Trans2 Request, QUERY_FS_INFO, Query FS Attribute Info 116 9.135305 192.168.1.1 -> 192.168.1.23 SMB 124 Trans2 Response, QUERY_FS_INFO 117 9.156648 192.168.1.23 -> 192.168.1.1 SMB 134 Trans2 Request, FIND_FIRST2, Pattern: \* 118 9.157972 192.168.1.1 -> 192.168.1.23 SMB 83 Trans2 Response, FIND_FIRST2, Error: STATUS_INVALID_HANDLE 119 9.321949 192.168.1.23 -> 192.168.1.1 TCP 44 52479 > microsoft-ds [ACK] Seq=3609 Ack=2462 Win=64564 Len=0 120 10.328172 192.168.1.23 -> 192.168.1.1 SMB 87 Logoff AndX Request 121 10.328562 
192.168.1.1 -> 192.168.1.23 SMB 87 Logoff AndX Response 122 10.344914 192.168.1.23 -> 192.168.1.1 SMB 83 Tree Disconnect Request 123 10.345600 192.168.1.1 -> 192.168.1.23 SMB 83 Tree Disconnect Response 124 10.527992 192.168.1.23 -> 192.168.1.1 TCP 44 52479 > microsoft-ds [ACK] Seq=3691 Ack=2544 Win=64482 Len=0 ---Mike -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-fs@FreeBSD.ORG Thu Dec 1 21:35:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CD2F3106566C for ; Thu, 1 Dec 2011 21:35:27 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 602AB8FC0A for ; Thu, 1 Dec 2011 21:35:27 +0000 (UTC) Received: by faak28 with SMTP id k28so2510235faa.13 for ; Thu, 01 Dec 2011 13:35:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=ZR4d5xLiScddCXaXDcA7FKPJWJqvoAz1wgxiYMWO0pQ=; b=oZTRu0YmMcb9iIDMPRFUqZ6k9pJE26ebCQCXWqkfuYUYp9w/rphRTIJs9RbViVtFAf x6R1uP2zNwB8/yKaA6nO5usbfKP6XZ4e0ttlJiSVqFjyfb/r6wV2xWZ9lz8qFeOLXCs5 YskV0sTiVsTRQ9VRifx7i/GVcKkHBkBSuTHks= MIME-Version: 1.0 Received: by 10.204.152.83 with SMTP id f19mr9256042bkw.90.1322773915741; Thu, 01 Dec 2011 13:11:55 -0800 (PST) Received: by 10.204.39.141 with HTTP; Thu, 1 Dec 2011 13:11:55 -0800 (PST) In-Reply-To: References: <4ED77B09.1090709@brockmann-consult.de> Date: Thu, 1 Dec 2011 16:11:55 -0500 Message-ID: From: Zaphod Beeblebrox To: krad Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org 
Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 01 Dec 2011 21:35:27 -0000 On Thu, Dec 1, 2011 at 11:42 AM, krad wrote: > On 1 December 2011 13:03, Peter Maloney > wrote: > >> On 12/01/2011 11:20 AM, krad wrote: >> On my system with 12 TiB used up, what I do in a script is basically: >> -generate a snap name >> -make a recursive snapshot >> -ssh to the remote server and compare snapshots (find the latest common >> snapshot, to find an incremental reference point) >> -if a usable reference point exists, start the incremental send like >> this (which wipes all changes on the remote system without confirmation)= : >> =A0 =A0 =A0 =A0zfs send -R -I ${destLastSnap} ${srcLastSnap} | ssh ${des= tHost} >> zfs recv -d -F -v ${destPool} >> -and if no usable reference point existed, then do a full send, >> non-incremental: >> =A0 =A0 =A0 =A0zfs send -R ${srcLastSnap} | ssh ${destHost} zfs recv -F = -v >> ${destDataSet} >> The part about finding the reference snapshot is the most complicated >> part of my script, and missing from anything else I found online when I >> was looking for a good solution. For example this script: >> http://blogs.sun.com/clive/resource/zfs_repl.ksh >> found on this page: >> http://blogs.oracle.com/clive/entry/replication_using_zfs Not that everyone doesn't enjoy inventing their own wheel, but would you mind sharing your snapshot parser? 
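[Editor's note: the send-selection logic quoted above -- find the newest snapshot both sides have, then choose incremental vs. full send -- can be sketched in plain sh. This is a hedged sketch, not Peter's actual script: the pool, host, and snapshot names are made up, and error handling and escaping are omitted:

```shell
#!/bin/sh
# Sketch of the "find the latest common snapshot" step described above.
# Each input file holds snapshot names (the part after '@'), one per
# line, oldest first -- the order 'zfs list -t snapshot' prints them in.

latest_common_snap() {
    # $1: source snapshot list, $2: destination snapshot list.
    # Prints the newest name present in both, or nothing if none match.
    awk 'NR==FNR { seen[$0]=1; next }
         $0 in seen { last=$0 }
         END { if (last != "") print last }' "$2" "$1"
}

# Demo with made-up snapshot names instead of live 'zfs list' output:
printf 'daily-01\ndaily-02\ndaily-03\n' > /tmp/src_snaps.$$
printf 'daily-01\ndaily-02\n'           > /tmp/dst_snaps.$$

ref=$(latest_common_snap /tmp/src_snaps.$$ /tmp/dst_snaps.$$)
if [ -n "$ref" ]; then
    echo "incremental from @$ref"    # prints: incremental from @daily-02
    # zfs send -R -I tank/data@$ref tank/data@daily-03 | \
    #     ssh backuphost zfs recv -d -F backup
else
    echo "full send"
    # zfs send -R tank/data@daily-03 | ssh backuphost zfs recv -F backup/data
fi
rm -f /tmp/src_snaps.$$ /tmp/dst_snaps.$$
```

In a real script the two lists would come from something like `zfs list -H -o name -t snapshot -r tank/data | sed 's/.*@//'`, run once locally and once over ssh on the destination.]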
From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 02:36:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7BB7C106564A; Fri, 2 Dec 2011 02:36:30 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [70.36.157.235]) by mx1.freebsd.org (Postfix) with ESMTP id 416FA8FC17; Fri, 2 Dec 2011 02:36:30 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id pB22aQi6059579; Thu, 1 Dec 2011 18:36:26 -0800 (PST) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201112020236.pB22aQi6059579@chez.mckusick.com> To: Pawel Jakub Dawidek In-reply-to: <20111125110235.GB1642@garage.freebsd.pl> Date: Thu, 01 Dec 2011 18:36:26 -0800 From: Kirk McKusick X-Spam-Status: No, score=0.0 required=5.0 tests=MISSING_MID, UNPARSEABLE_RELAY autolearn=failed version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: freebsd-fs@freebsd.org Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 02:36:30 -0000 > Date: Fri, 25 Nov 2011 12:02:35 +0100 > From: Pawel Jakub Dawidek > To: Kostik Belousov > Cc: freebsd-fs@freebsd.org > Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with > softupdates)? > > On Wed, Nov 23, 2011 at 09:44:44PM +0200, Kostik Belousov wrote: > > On Wed, Nov 23, 2011 at 11:00:26PM +0400, Lev Serebryakov wrote: > > > Hello, Freebsd-fs. > > >=20 > > > Does UFS2 with softupdates (without journal) issues BIO_FLUSH to > > > GEOM layer when it need to ensure consistency on on-disk metadata? > > No. Softupdates do not need flushes. 
> > Well, they do for two reasons: > 1. To properly handle sync operations (fsync(2), O_SYNC). > 2. To maintain consistent on-disk structures. SU does not use synchronous writes to maintain consistency (see below). It rarely uses synchronous writes even to implement fsync. Instead it issues async I/O requests for all the blocks necessary to ensure that the inode in question is stable. It then waits until all of those writes have been acknowledged. When they have been acknowledged the fsync returns. If the subsystem acknowledges them before they are truly on stable store, then SU and correspondingly the caller of fsync is screwed. The one place where a synchronous write (bwrite) is used is when the completion of a particular I/O is needed to make progress on many other things (usually an update to the SUJ log) in which case SU will force a flush on that I/O (e.g., bwrite it) to hasten it along. > The second point is there, because BIO_FLUSH is the only way to avoid > reordering (apart from turning off disk write cache). > > SU assumes no I/O reordering will happen, which is a very weak assumption. > > -- > Pawel Jakub Dawidek http://www.wheelsystems.com > FreeBSD committer http://www.FreeBSD.org > Am I Evil? Yes, I Am! SU has no problems with reordering. It makes no assumptions on the order in which its I/O operations are done. Its only requirement is that it not be told that something is stable before it truly is stable. That is because the way it maintains consistency is to not issue a write on something before it knows that the things it depends on for consistency are in fact stable. But it has no problem with any of its outstanding writes being done in any particular order.
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 07:35:05 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DBF7E1065670 for ; Fri, 2 Dec 2011 07:35:05 +0000 (UTC) (envelope-from peter.maloney@brockmann-consult.de) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.186]) by mx1.freebsd.org (Postfix) with ESMTP id 5DDAD8FC1B for ; Fri, 2 Dec 2011 07:35:05 +0000 (UTC) Received: from [10.3.0.26] ([141.4.215.32]) by mrelayeu.kundenserver.de (node=mrbap2) with ESMTP (Nemesis) id 0MXWgY-1RKhkH1XnS-00WVoK; Fri, 02 Dec 2011 08:35:03 +0100 Message-ID: <4ED87FA6.6010408@brockmann-consult.de> Date: Fri, 02 Dec 2011 08:35:02 +0100 From: Peter Maloney User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110922 Thunderbird/3.1.15 MIME-Version: 1.0 To: Zaphod Beeblebrox References: <4ED77B09.1090709@brockmann-consult.de> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: multipart/mixed; boundary="------------060108000304030705050005" X-Provags-ID: V02:K0:ExgzHgEcOX9h2vJAO57Hj79+GOScYfVvDOzjY3rC+BC /VlLKM8Sj0S6iYXCZbeD73pYU008oxLd/EaTBfdPUBkdfw2Mhq 5cHtzNu9NOX0G+Z/dgA4qdsTWB6GgrWerVf7Ww43AxKOHNGB1I 3K1QCRa7zUi3qlXdvcR49rGfFP0/ah/WhZfiW9JdBqDQOqrC+c miWc5wgJdWU2m2v+tD7+LvgkeN/7o6gpgsGwyy+r5u5Uy13QNc WKfzObinMDM4kKHyX4clcEAYhBFEr3jEQoefT2HF8pAGIIIifB UADOO8+8/cUld5hBGUsxcs4Z0zu+CdO+q9xSRWOYTjrAgyQyzt kL0/iVajkmHzVPiMHl8t6nu9jzbKSp6k1rzYufqyO X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 07:35:05 -0000 This is a multi-part message in MIME format. 
--------------060108000304030705050005 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit On 12/01/2011 10:11 PM, Zaphod Beeblebrox wrote: > On Thu, Dec 1, 2011 at 11:42 AM, krad wrote: >> On 1 December 2011 13:03, Peter Maloney >> wrote: >>> On 12/01/2011 11:20 AM, krad wrote: >>> On my system with 12 TiB used up, what I do in a script is basically: >>> -generate a snap name >>> -make a recursive snapshot >>> -ssh to the remote server and compare snapshots (find the latest common >>> snapshot, to find an incremental reference point) >>> -if a usable reference point exists, start the incremental send like >>> this (which wipes all changes on the remote system without confirmation): >>> zfs send -R -I ${destLastSnap} ${srcLastSnap} | ssh ${destHost} >>> zfs recv -d -F -v ${destPool} >>> -and if no usable reference point existed, then do a full send, >>> non-incremental: >>> zfs send -R ${srcLastSnap} | ssh ${destHost} zfs recv -F -v >>> ${destDataSet} >>> The part about finding the reference snapshot is the most complicated >>> part of my script, and missing from anything else I found online when I >>> was looking for a good solution. For example this script: >>> http://blogs.sun.com/clive/resource/zfs_repl.ksh >>> found on this page: >>> http://blogs.oracle.com/clive/entry/replication_using_zfs > Not that everyone doesn't enjoy inventing their own wheel, but would > you mind sharing your snapshot parser? Sure. Here are a bunch of my zfs scripts. (attached) Disclaimer/notes: -provided as is... might destroy your system, furthermore, I am not responsible for bodily injury nor nuclear war that may result from misuse -there are no unit tests, and no documentation other than a few comments that are possibly only coherent when I read them. 
For example, it says that it does it recursively and rolls back the destination dataset, but there are a few undocumented cases I can't remember when I needed to do something manual like delete a snapshot, or destroy a dataset. Maybe that is all in the past. I don't know. -the zfs_repl2.bash is the one that makes snapshots and replicates which I wrote myself. The other ksh one is the Oracle one I linked above, and the .sh version of it was just what I was working on to try to make it work reliably, before redoing it all myself (reinventing the wheel is indeed fun). -especially beware of the deleteOldSnapshots.bash which is not well tested and not used yet (and deleteEmptySnapshots.bash which does not work and I believe cannot work). -granted transferable your choice of any present or future version of the BSD or GPL license and another note, I meant to study these which might be better versions of the same thing, or something different, but never got around to it: /usr/ports/sysutils/zfs-replicate/ /usr/ports/sysutils/zfsnap/ /usr/ports/sysutils/zfs-periodic Enjoy! -- -------------------------------------------- Peter Maloney Brockmann Consult Max-Planck-Str. 
2 21502 Geesthacht Germany Tel: +49 4152 889 300 Fax: +49 4152 889 333 E-mail: peter.maloney@brockmann-consult.de Internet: http://www.brockmann-consult.de -------------------------------------------- --------------060108000304030705050005-- From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 08:43:06 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 59755106564A; Fri, 2 Dec 2011 08:43:06 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id EE97B8FC13; Fri, 2 Dec 2011 08:43:05 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:69b8:2555:9d19:7f7b]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id D930A4AC31; Fri, 2 Dec 2011 12:43:03 +0400 (MSK) Date: Fri, 2 Dec 2011 12:42:52 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <12310118271.20111202124252@serebryakov.spb.ru> To: Kirk McKusick In-Reply-To: <201112020236.pB22aQi6059579@chez.mckusick.com> References: <20111125110235.GB1642@garage.freebsd.pl> <201112020236.pB22aQi6059579@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: Does UFS2 send BIO_FLUSH to GEOM when update metadata (with softupdates)? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 08:43:06 -0000 Hello, Kirk. 
You wrote, 2 December 2011, 6:36:26: > The one place where a synchronous write (bwrite) is used is when the > completion of a particular I/O is needed to make progress on many other > things (usually an update to the SUJ log) in which case SU will force a > flush on that I/O (e.g., bwrite it) to hasten it along. So it would be great if SU additionally marked such writes :) It could help sometimes -- even if the lower layers do not lie about completed writes, it looks like SUJ could work better if such really-critical writes were not delayed at all (even for 30 seconds, which is OK for all other SU writes, as you stated earlier). -- // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 14:17:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 32BB510656A8 for ; Fri, 2 Dec 2011 14:17:18 +0000 (UTC) (envelope-from mattblists@icritical.com) Received: from mail1.icritical.com (mail1.icritical.com [93.95.13.41]) by mx1.freebsd.org (Postfix) with SMTP id 8021B8FC13 for ; Fri, 2 Dec 2011 14:17:17 +0000 (UTC) Received: (qmail 28669 invoked from network); 2 Dec 2011 13:50:33 -0000 Received: from localhost (127.0.0.1) by mail1.icritical.com with SMTP; 2 Dec 2011 13:50:33 -0000 Received: (qmail 28662 invoked by uid 599); 2 Dec 2011 13:50:32 -0000 Received: from unknown (HELO icritical.com) (212.57.254.146) by mail1.icritical.com (qpsmtpd/0.28) with ESMTP; Fri, 02 Dec 2011 13:50:32 +0000 Message-ID: <4ED8D7A5.7090700@icritical.com> Date: Fri, 02 Dec 2011 13:50:29 +0000 From: Matt Burke User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110403 Thunderbird/3.1.9 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 02 Dec 2011 13:50:29.0604 (UTC) FILETIME=[5A592A40:01CCB0F9] X-Virus-Scanned: by
iCritical at mail1.icritical.com Subject: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 14:17:18 -0000 Can someone enlighten me as to how to get 'iostat -Id' or 'iostat -Idx' style counters for zpools? I've read through the man pages, but all I can see is 'zpool iostat' which gives values which appear to be averaged over an unspecified time period. With a 30-disk zpool, I can't fathom out how to get any meaningful data from the individual disk stats, and keeping a daemon running 'zpool iostat N' just to parse its output seems hugely inefficient and hacky... Thanks. -- The information contained in this message is confidential and is intended for the addressee only. If you have received this message in error or there are any problems please notify the originator immediately. The unauthorised use, disclosure, copying or alteration of this message is strictly forbidden. Critical Software Ltd. reserves the right to monitor and record e-mail messages sent to and from this address for the purposes of investigating or detecting any unauthorised use of its system and ensuring its effective operation. Critical Software Ltd. registered in England, 04909220. Registered Office: IC2, Keele Science Park, Keele, Staffordshire, ST5 5NH. ------------------------------------------------------------ This message has been scanned for security threats by iCritical. 
For further information, please visit www.icritical.com ------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 14:47:28 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 25AEC106566C for ; Fri, 2 Dec 2011 14:47:28 +0000 (UTC) (envelope-from miconof80.list@gmail.com) Received: from mailhost.math.cnrs.fr (margauxlyon.mathrice.fr [194.167.215.28]) by mx1.freebsd.org (Postfix) with ESMTP id A57CF8FC0C for ; Fri, 2 Dec 2011 14:47:27 +0000 (UTC) Received: from localhost (localhost.localdomain [127.0.0.1]) by mailhost.math.cnrs.fr (Postfix) with ESMTP id 5ECA9440B8; Fri, 2 Dec 2011 15:27:09 +0100 (CET) X-Virus-Scanned: amavisd-new at math.cnrs.fr Received: from mailhost.math.cnrs.fr ([127.0.0.1]) by localhost (margaux.math.cnrs.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id P78zzXlX-VQr; Fri, 2 Dec 2011 15:27:08 +0100 (CET) Received: from e4310 (e4310.lsv.ens-cachan.fr [138.231.81.249]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mailhost.math.cnrs.fr (Postfix) with ESMTP id 2E593440B5; Fri, 2 Dec 2011 15:27:08 +0100 (CET) Date: Fri, 2 Dec 2011 15:27:03 +0100 From: Michel Le Cocq To: Peter Maloney Message-ID: <20111202142656.GA7104@e4310> References: <4ED77B09.1090709@brockmann-consult.de> <4ED87FA6.6010408@brockmann-consult.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4ED87FA6.6010408@brockmann-consult.de> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 14:47:28 -0000 Is it just me, or is there no attachment?
I've also got a script like that, but it seems not as good as the one you describe. It's based on zfs-periodic. -- Michel > Sure. Here are a bunch of my zfs scripts. (attached) > > Disclaimer/notes: > -provided as is... might destroy your system, furthermore, I am not > responsible for bodily injury nor nuclear war that may result from misuse > -there are no unit tests, and no documentation other than a few comments > that are possibly only coherent when I read them. For example, it says > that it does it recursively and rolls back the destination dataset, but > there are a few undocumented cases I can't remember when I needed to do > something manual like delete a snapshot, or destroy a dataset. Maybe > that is all in the past. I don't know. > -the zfs_repl2.bash is the one that makes snapshots and replicates which > I wrote myself. The other ksh one is the Oracle one I linked above, and > the .sh version of it was just what I was working on to try to make it > work reliably, before redoing it all myself (reinventing the wheel is > indeed fun). > -especially beware of the deleteOldSnapshots.bash which is not well > tested and not used yet (and deleteEmptySnapshots.bash which does not > work and I believe cannot work). > -granted transferable your choice of any present or future version of > the BSD or GPL license > > and another note, I meant to study these which might be better versions > of the same thing, or something different, but never got around to it: > /usr/ports/sysutils/zfs-replicate/ > /usr/ports/sysutils/zfsnap/ > /usr/ports/sysutils/zfs-periodic > > > Enjoy!
From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 14:47:44 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7948E106566C for ; Fri, 2 Dec 2011 14:47:44 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smtp-out1.tiscali.nl (smtp-out1.tiscali.nl [195.241.79.176]) by mx1.freebsd.org (Postfix) with ESMTP id 3C0E18FC19 for ; Fri, 2 Dec 2011 14:47:44 +0000 (UTC) Received: from [212.182.167.131] (helo=sjakie.klop.ws) by smtp-out1.tiscali.nl with esmtp (Exim) (envelope-from ) id 1RWUOt-00017h-51 for freebsd-fs@freebsd.org; Fri, 02 Dec 2011 15:47:43 +0100 Received: from 212-182-167-131.ip.telfort.nl (localhost [127.0.0.1]) by sjakie.klop.ws (Postfix) with ESMTP id 81E4810527 for ; Fri, 2 Dec 2011 15:47:39 +0100 (CET) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-fs@freebsd.org References: <4ED8D7A5.7090700@icritical.com> Date: Fri, 02 Dec 2011 15:47:38 +0100 MIME-Version: 1.0 From: "Ronald Klop" Message-ID: In-Reply-To: <4ED8D7A5.7090700@icritical.com> User-Agent: Opera Mail/11.52 (FreeBSD) Content-Transfer-Encoding: quoted-printable Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 14:47:44 -0000 On Fri, 02 Dec 2011 14:50:29 +0100, Matt Burke wrote: > Can someone enlighten me as to how to get 'iostat -Id' or 'iostat -Idx' > style counters for zpools? > > I've read through the man pages, but all I can see is 'zpool iostat' > which > gives values which appear to be averaged over an unspecified time period.
> > > With a 30-disk zpool, I can't fathom out how to get any meaningful data > from the individual disk stats, and keeping a daemon running 'zpool > iostat > N' just to parse its output seems hugely inefficient and hacky... > > > > Thanks. > while true; do gstat -b -I 1s; done From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 14:49:31 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5D4FF106566B for ; Fri, 2 Dec 2011 14:49:31 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.westchester.pa.mail.comcast.net (qmta15.westchester.pa.mail.comcast.net [76.96.59.228]) by mx1.freebsd.org (Postfix) with ESMTP id 0C3E38FC18 for ; Fri, 2 Dec 2011 14:49:30 +0000 (UTC) Received: from omta15.westchester.pa.mail.comcast.net ([76.96.62.87]) by qmta15.westchester.pa.mail.comcast.net with comcast id 4EHF1i0011swQuc5FEpX8v; Fri, 02 Dec 2011 14:49:31 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta15.westchester.pa.mail.comcast.net with comcast id 4EpW1i00L1t3BNj3bEpWd5; Fri, 02 Dec 2011 14:49:31 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 0F998102C1D; Fri, 2 Dec 2011 06:49:29 -0800 (PST) Date: Fri, 2 Dec 2011 06:49:29 -0800 From: Jeremy Chadwick To: Matt Burke Message-ID: <20111202144929.GA27319@icarus.home.lan> References: <4ED8D7A5.7090700@icritical.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4ED8D7A5.7090700@icritical.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 14:49:31 -0000 On Fri, Dec 02, 2011 at 01:50:29PM +0000, Matt Burke wrote: > Can someone enlighten me as to how to get
'iostat -Id' or 'iostat -Idx' > style counters for zpools? > > I've read through the man pages, but all I can see is 'zpool iostat' which > gives values which appear to be averaged over an unspecified time period. > > With a 30-disk zpool, I can't fathom out how to get any meaningful data > from the individual disk stats, and keeping a daemon running 'zpool iostat > N' just to parse its output seems hugely inefficient and hacky... What exactly are you wanting? It sounds to me like what you want are incrementing counters, not averages, but I've re-read your mail a few times and really am not sure. iostat -Id and iostat -Idx, without any interval argument (e.g. "iostat -Id" and not "iostat -Id 1") will give you, according to the man page: The first statistics that are printed are averaged over the system uptime. ...which is still an average, just over the entire uptime of the system. IMO, that's not very helpful either. This is why most people use iostat with an interval parameter. "zpool iostat" offers the latter, but not the former (meaning it does not offer "statistics shown averaged over the system uptime"). If you're effectively wanting counters: sadly this information is not available through any means that I know of. The only two "frameworks" I can think of are libzfs and libzpool, but I can't find documentation for either of them (probably my fault). Solaris is in the same boat here as FreeBSD, just for the record. The best you're going to get is either X-second averages (e.g. "zpool iostat -v 1" -- note the -v will show you those averages on a per-pool *and* per-vdev *and* per-device basis) for ZFS, or non-ZFS-related counters (e.g. pure device counters, which are available through devstat(3), but you will not get ZFS bits through this). Re: "keeping a daemon running 'zpool iostat N' to parse its output seems hackish" -- this is exactly what many programs on many OSes do, actually. E.g.
a perl script that does open(FH, "zpool iostat N |") and has to handle things appropriately. We use this model at work on Solaris for parsing iostat and mpstat data and working it into a monitoring script that runs indefinitely, hooked into (sort of) Nagios. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 14:50:10 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D5ED91065675 for ; Fri, 2 Dec 2011 14:50:10 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.westchester.pa.mail.comcast.net (qmta10.westchester.pa.mail.comcast.net [76.96.62.17]) by mx1.freebsd.org (Postfix) with ESMTP id 7F0698FC1A for ; Fri, 2 Dec 2011 14:50:10 +0000 (UTC) Received: from omta04.westchester.pa.mail.comcast.net ([76.96.62.35]) by qmta10.westchester.pa.mail.comcast.net with comcast id 4Epr1i0020ldTLk5AEqAdm; Fri, 02 Dec 2011 14:50:10 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta04.westchester.pa.mail.comcast.net with comcast id 4Eq91i01L1t3BNj3QEqARA; Fri, 02 Dec 2011 14:50:10 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6BDA4102C1D; Fri, 2 Dec 2011 06:50:08 -0800 (PST) Date: Fri, 2 Dec 2011 06:50:08 -0800 From: Jeremy Chadwick To: Michel Le Cocq Message-ID: <20111202145008.GA27853@icarus.home.lan> References: <4ED77B09.1090709@brockmann-consult.de> <4ED87FA6.6010408@brockmann-consult.de> <20111202142656.GA7104@e4310> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20111202142656.GA7104@e4310> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 14:50:10 -0000 On Fri, Dec 02, 2011 at 03:27:03PM +0100, Michel Le Cocq wrote: > it's just me or there is no attachment ? The mailing list stripped the attachment. The previous individual will need to put it up on the web somewhere. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 15:07:22 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DC0B1106566B for ; Fri, 2 Dec 2011 15:07:22 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.westchester.pa.mail.comcast.net (qmta10.westchester.pa.mail.comcast.net [76.96.62.17]) by mx1.freebsd.org (Postfix) with ESMTP id 8A1CB8FC14 for ; Fri, 2 Dec 2011 15:07:22 +0000 (UTC) Received: from omta19.westchester.pa.mail.comcast.net ([76.96.62.98]) by qmta10.westchester.pa.mail.comcast.net with comcast id 4CZz1i00327AodY5AF7NJr; Fri, 02 Dec 2011 15:07:22 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta19.westchester.pa.mail.comcast.net with comcast id 4F7M1i01c1t3BNj3fF7NAt; Fri, 02 Dec 2011 15:07:22 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id B24F5102C1D; Fri, 2 Dec 2011 07:07:20 -0800 (PST) Date: Fri, 2 Dec 2011 07:07:20 -0800 From: Jeremy Chadwick To: Ronald Klop Message-ID: <20111202150720.GA28016@icarus.home.lan> References: <4ED8D7A5.7090700@icritical.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 15:07:22 -0000 On Fri, Dec 02, 2011 at 03:47:38PM +0100, Ronald Klop wrote: > On Fri, 02 Dec 2011 14:50:29 +0100, Matt Burke > wrote: > > >Can someone enlighten me as to how to get 'iostat -Id' or 'iostat -Idx' > >style counters for zpools? > > > >I've read through the man pages, but all I can see is 'zpool > >iostat' which > >gives values which appear to be averaged over an unspecified time period. > > > > > >With a 30-disk zpool, I can't fathom out how to get any meaningful data > >from the individual disk stats, and keeping a daemon running > >'zpool iostat > >N' just to parse its output seems hugely inefficient and hacky... > > > > > > > >Thanks. > > > > while true; do gstat -b -I 1s; done ...which is valid, but is still an "averaged" value. Additionally, this does not get the OP correlation between devices and ZFS-related bits (e.g. pool, vdev, etc.). "zpool iostat -v N" or "zpool iostat -v pool N" would be how to accomplish that -- except 1) this does not behave like gstat -b, and 2) does not provide the amount of device-level granularity the OP may want. The -v flag will show both device and vdev statistics. One problem with "zpool iostat" is that it only gives you human-readable numbers; if you're using this in a script you not only have to handle parsing unit types (requiring you to go look at the C code for it, which I believe lives in libzfs or libzpool, I forget -- I have done it), but worse, you lose accuracy given rounding and unit conversion. There's no code to make it show integers or floats, at least not last time I looked (last year). A combination of two utilities may be needed -- something that parses "zpool status" to learn what devices are associated with what pool/vdev, and then runs "gstat -b 1s" and gets the actual "low-level" details desired. 
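The two-utility combination described above (parse "zpool status" for the pool/vdev-to-device mapping, then watch those devices with gstat) can be roughed out in a few lines of sh/awk. This is only a sketch: the device-name pattern (mfidN/daN/adaN) and the sample "zpool status" output below are assumptions standing in for a live pool, not anything taken from the thread.

```shell
# Sketch: derive a gstat(8) device filter from `zpool status` output.
# The here-variable stands in for a real pool; on a live system you
# would capture `zpool status tank` instead.
status='  pool: tank
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            mfid0   ONLINE       0     0     0
            mfid1   ONLINE       0     0     0
            mfid2   ONLINE       0     0     0'

# Keep only leaf device names (mfidN, daN, adaN); vdev header lines
# such as "raidz1-0" and the pool name itself do not match the pattern.
devs=$(printf '%s\n' "$status" | awk '$1 ~ /^(mfid|da|ada)[0-9]+$/ { print $1 }')

# Build an egrep-style filter such as ^(mfid0|mfid1|mfid2)$
filter="^($(printf '%s' "$devs" | tr '\n' '|'))\$"
echo "$filter"
# On a live system: gstat -b -I 1s -f "$filter"
```

The regex handed to gstat -f restricts output to the pool's members, which is the per-device granularity half; the pool/vdev grouping still has to come from the parsed "zpool status" structure.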
This should be a not-too-difficult job for any nominal perl programmer (at work I wrote our check_zpool_status Nagios-like check, which parses "zpool status" for example; lots of nuances though, especially between ZFS versions since the output actually has changed over time). But again, depending on what the OP wants, these are still "averages over short periods of time" (gstat example = 1 second) rather than incrementing counters or what iostat without an interval argument provides (average over entire system uptime). -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 15:19:57 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EF07A1065679 for ; Fri, 2 Dec 2011 15:19:57 +0000 (UTC) (envelope-from mattblists@icritical.com) Received: from mail1.icritical.com (mail1.icritical.com [93.95.13.41]) by mx1.freebsd.org (Postfix) with SMTP id 4EC148FC18 for ; Fri, 2 Dec 2011 15:19:56 +0000 (UTC) Received: (qmail 12557 invoked from network); 2 Dec 2011 15:19:59 -0000 Received: from localhost (127.0.0.1) by mail1.icritical.com with SMTP; 2 Dec 2011 15:19:59 -0000 Received: (qmail 12549 invoked by uid 599); 2 Dec 2011 15:19:59 -0000 Received: from unknown (HELO icritical.com) (212.57.254.146) by mail1.icritical.com (qpsmtpd/0.28) with ESMTP; Fri, 02 Dec 2011 15:19:59 +0000 Message-ID: <4ED8EC9A.2080706@icritical.com> Date: Fri, 02 Dec 2011 15:19:54 +0000 From: Matt Burke User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15) Gecko/20110403 Thunderbird/3.1.9 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4ED8D7A5.7090700@icritical.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit 
X-OriginalArrivalTime: 02 Dec 2011 15:19:54.0628 (UTC) FILETIME=[D826DC40:01CCB105] X-Virus-Scanned: by iCritical at mail1.icritical.com Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 15:19:58 -0000 On 12/02/11 14:47, Ronald Klop wrote: > while true; do gstat -b -I 1s; done Looks like I wasn't clear about what I'm after - sorry. I want to see how many bytes or KB have been read and written to a given zpool since creation (as in the newer of uptime or zpool creation) on the system. For instance I want this data: # time iostat -Idx extended device statistics device r/i w/i kr/i kw/i wait svc_t %b mfid0 284807.0 5469251.0 4452202.0 116634996.0 0 0.8 0 mfid1 284576.0 5466322.0 4474976.5 116510280.0 0 0.8 0 mfid2 278686.0 5450269.0 4418703.0 116511709.0 0 0.8 0 mfid3 281673.0 5452757.0 4439770.5 116560910.5 0 0.8 0 mfid4 279549.0 5472177.0 4440227.0 116609067.0 0 0.8 0 mfid5 282625.0 5464261.0 4503257.5 116608801.5 0 0.8 0 mfid6 275635.0 5470654.0 4433529.0 116616131.5 0 0.8 0 ... mfid27 302950.0 5464880.0 4434398.0 116542100.0 0 0.7 0 mfid28 281464.0 5459410.0 4461678.5 116595780.5 0 0.8 0 mfid29 277535.0 5468784.0 4443352.5 116642932.0 0 0.8 0 ... real 0m0.003s user 0m0.000s sys 0m0.007s For the zpool as a singular entity (or even by zfs filesystem), but not for the individual disks. Hope this clarifies my request a bit -- The information contained in this message is confidential and is intended for the addressee only. If you have received this message in error or there are any problems please notify the originator immediately. The unauthorised use, disclosure, copying or alteration of this message is strictly forbidden. Critical Software Ltd.
reserves the right to monitor and record e-mail messages sent to and from this address for the purposes of investigating or detecting any unauthorised use of its system and ensuring its effective operation. Critical Software Ltd. registered in England, 04909220. Registered Office: IC2, Keele Science Park, Keele, Staffordshire, ST5 5NH. ------------------------------------------------------------ This message has been scanned for security threats by iCritical. For further information, please visit www.icritical.com ------------------------------------------------------------ From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 15:30:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BBC8A1065676 for ; Fri, 2 Dec 2011 15:30:33 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 3D39C8FC15 for ; Fri, 2 Dec 2011 15:30:32 +0000 (UTC) Received: by wgbdr11 with SMTP id dr11so1367299wgb.31 for ; Fri, 02 Dec 2011 07:30:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to; bh=1FLhD2jdBjCOEk1OWplaq/O7WToB9299ul6qg23HrVM=; b=FIF+wr2sDf/rChugOtBe8oVwQgax99iU3TW0QkRJ6TKgwBbPv1OAt78eYaA05EXyNO g+312vJgOc4Re7ck03zpf1fr6U4W6qKaXsT4eICFJa9nRia94cggF1e4ELikB7OzOZ3/ FQlfWJ3u+ZMNvPfpsQlCcAixQyI1aCqO7PbTQ= Received: by 10.227.206.144 with SMTP id fu16mr6307570wbb.23.1322838365231; Fri, 02 Dec 2011 07:06:05 -0800 (PST) Received: from DataIX.net (ppp-21.233.dialinfree.com. 
[209.172.21.233]) by mx.google.com with ESMTPS id z35sm10539328wbm.12.2011.12.02.07.06.01 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 02 Dec 2011 07:06:04 -0800 (PST) Sender: Jason Hellenthal Received: from DataIX.net (localhost [127.0.0.1]) by DataIX.net (8.14.5/8.14.5) with ESMTP id pB2F5uad026553 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 2 Dec 2011 10:05:56 -0500 (EST) (envelope-from jhell@DataIX.net) Received: (from jhell@localhost) by DataIX.net (8.14.5/8.14.5/Submit) id pB2F5pWp026552; Fri, 2 Dec 2011 10:05:51 -0500 (EST) (envelope-from jhell@DataIX.net) Date: Fri, 2 Dec 2011 10:05:51 -0500 From: Jason Hellenthal To: Matt Burke Message-ID: <20111202150551.GA26344@DataIX.net> References: <4ED8D7A5.7090700@icritical.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="dTy3Mrz/UPE2dbVg" Content-Disposition: inline In-Reply-To: <4ED8D7A5.7090700@icritical.com> Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 15:30:33 -0000 --dTy3Mrz/UPE2dbVg Content-Type: multipart/mixed; boundary="IS0zKkzwUGydFO0o" Content-Disposition: inline --IS0zKkzwUGydFO0o Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable You will need to add this patch to your kernel and recompile. Things like top -m io and iostat will start working properly after. On Fri, Dec 02, 2011 at 01:50:29PM +0000, Matt Burke wrote: > Can someone enlighten me as to how to get 'iostat -Id' or 'iostat -Idx' > style counters for zpools? > > I've read through the man pages, but all I can see is 'zpool iostat' which > gives values which appear to be averaged over an unspecified time period.
> > > With a 30-disk zpool, I can't fathom out how to get any meaningful data > from the individual disk stats, and keeping a daemon running 'zpool iostat > N' just to parse its output seems hugely inefficient and hacky... > > > > Thanks. > > -- > > > The information contained in this message is confidential and is intended for the addressee only. If you have received this message in error or there are any problems please notify the originator immediately. The unauthorised use, disclosure, copying or alteration of this message is strictly forbidden. > > Critical Software Ltd. reserves the right to monitor and record e-mail messages sent to and from this address for the purposes of investigating or detecting any unauthorised use of its system and ensuring its effective operation. > > Critical Software Ltd. registered in England, 04909220. Registered Office: IC2, Keele Science Park, Keele, Staffordshire, ST5 5NH. > > ------------------------------------------------------------ > This message has been scanned for security threats by iCritical.
> For further information, please visit www.icritical.com > ------------------------------------------------------------ > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" --IS0zKkzwUGydFO0o Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="td_ru.ru_inblock+oublock.patch" Content-Transfer-Encoding: quoted-printable # HG changeset patch # Parent 34d97359838e3296c9a2f2070c8a4f731daefc58 Add the capability to use top -m io to ZFS diff -r 34d97359838e sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c @@ -625,6 +625,9 @@ rw_exit(&dn->dn_struct_rwlock); DB_DNODE_EXIT(db); } else if (db->db_state == DB_UNCACHED) { +#ifdef _KERNEL + curthread->td_ru.ru_inblock++; +#endif spa_t *spa = dn->dn_objset->os_spa; if (zio == NULL) diff -r 34d97359838e sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c --- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c +++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c @@ -400,6 +400,10 @@ if (read) { (void) dbuf_read(db, zio, dbuf_flags); } +#ifdef _KERNEL + else + curthread->td_ru.ru_oublock++; +#endif dbp[i] = &db->db; } rw_exit(&dn->dn_struct_rwlock); --IS0zKkzwUGydFO0o-- --dTy3Mrz/UPE2dbVg Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- iQEcBAEBAgAGBQJO2OlPAAoJEJBXh4mJ2FR+8aQIAJtNnlTEdcToDlzHjw2x70UL CAS2wF8Beo6BhiV2WTYjVp1d0TJ0+7NY/R46tA9KkWuDcfKfg9d/IB5t3m+PEVj7 hCKTdnIDccNnv6C9HQZxYozx/0P+5rdnOS6lhi4i5DRzbHydKeGsM5LdbfXLZ5P4 Pc3Z59i5ft9KYVJRcTTR135wzzc5zlJBTfCfC449BqotR6YeQu1H3P7GQN3xc5Qc CVEFdL98SVOxNyxyucRJB7CrHxoVupiFkydFYAvSkmEzKAJaKe5z29HbPdjp8oUj 71X9FX2D93K6WZl+rqF4g7bMGJ/K45sMf6I12N8gjIsnfugLzKu/M6a0Wf6SlwM= =kUH3 -----END PGP SIGNATURE-----
--dTy3Mrz/UPE2dbVg-- From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 15:36:26 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F9931065670 for ; Fri, 2 Dec 2011 15:36:26 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta05.westchester.pa.mail.comcast.net (qmta05.westchester.pa.mail.comcast.net [76.96.62.48]) by mx1.freebsd.org (Postfix) with ESMTP id 3D47D8FC14 for ; Fri, 2 Dec 2011 15:36:26 +0000 (UTC) Received: from omta24.westchester.pa.mail.comcast.net ([76.96.62.76]) by qmta05.westchester.pa.mail.comcast.net with comcast id 4Cku1i0081ei1Bg55FcSVX; Fri, 02 Dec 2011 15:36:26 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta24.westchester.pa.mail.comcast.net with comcast id 4FcR1i01F1t3BNj3kFcS49; Fri, 02 Dec 2011 15:36:26 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6FCA6102C1D; Fri, 2 Dec 2011 07:36:24 -0800 (PST) Date: Fri, 2 Dec 2011 07:36:24 -0800 From: Jeremy Chadwick To: Matt Burke Message-ID: <20111202153624.GA28715@icarus.home.lan> References: <4ED8D7A5.7090700@icritical.com> <4ED8EC9A.2080706@icritical.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4ED8EC9A.2080706@icritical.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 15:36:26 -0000 On Fri, Dec 02, 2011 at 03:19:54PM +0000, Matt Burke wrote: > On 12/02/11 14:47, Ronald Klop wrote: > > while true; do gstat -b -I 1s; done > > Looks like I wasn't clear about what I'm after - sorry. 
> > I want to see how many bytes or KB have been read and written to a given > zpool since creation (as in the newer of uptime or zpool creation) on the > system. > > For instance I want this data: > > # time iostat -Idx > extended device statistics > device r/i w/i kr/i kw/i wait svc_t %b > mfid0 284807.0 5469251.0 4452202.0 116634996.0 0 0.8 0 > mfid1 284576.0 5466322.0 4474976.5 116510280.0 0 0.8 0 > mfid2 278686.0 5450269.0 4418703.0 116511709.0 0 0.8 0 > mfid3 281673.0 5452757.0 4439770.5 116560910.5 0 0.8 0 > mfid4 279549.0 5472177.0 4440227.0 116609067.0 0 0.8 0 > mfid5 282625.0 5464261.0 4503257.5 116608801.5 0 0.8 0 > mfid6 275635.0 5470654.0 4433529.0 116616131.5 0 0.8 0 > ... > mfid27 302950.0 5464880.0 4434398.0 116542100.0 0 0.7 0 > mfid28 281464.0 5459410.0 4461678.5 116595780.5 0 0.8 0 > mfid29 277535.0 5468784.0 4443352.5 116642932.0 0 0.8 0 > ... > real 0m0.003s > user 0m0.000s > sys 0m0.007s > > > For the zpool as a singular entitiy (or even by zfs filesystem), but not > for the individual disks. > > Hope this clarifies my request a bit To my knowledge this kind of data is not kept/available in ZFS (FreeBSD or Solaris). What you're wanting (truly) are counters rather than averages, and you can do the averaging yourself (if wanted). "zpool iostat" does not do this. With most utilities like iostat, mpstat, zpool iostat, gstat, vmstat, and others of this nature, the established method/model/norm is that you always provide an interval and you ignore the first sample/set of data shown. In iostat's case on FreeBSD, it provides you an average over the entire system uptime. Other utilities do not work this way. Even if "zpool iostat" behaved like your above iostat example, you'd still run into the problem I described in my other mail (which is that you get human-readable output, not actual integers/floats, and you therefore have to do math to turn the values into integers, which sounds easy but isn't, and you lose granularity/accuracy too). 
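The unit-conversion problem described above can be illustrated with a small awk helper. This is a hypothetical sketch, not part of any ZFS tooling: it maps the human-readable suffixes zpool iostat prints ("1.21K", "3.4M") back to approximate byte counts, and as noted, the precision lost to rounding cannot be recovered.

```shell
# Sketch: undo zpool iostat's human-readable formatting.  The result is
# only an approximation of the original counter; digits rounded away by
# the formatter are gone for good.
to_bytes() {
    echo "$1" | awk '
        /K$/ { printf "%.0f\n", $0 * 1024; next }
        /M$/ { printf "%.0f\n", $0 * 1048576; next }
        /G$/ { printf "%.0f\n", $0 * 1073741824; next }
        /T$/ { printf "%.0f\n", $0 * 1099511627776; next }
             { printf "%.0f\n", $0 }'
}

to_bytes 1.21K    # prints 1239
to_bytes 3.4M     # prints 3565158
```

awk's numeric coercion conveniently stops at the first non-numeric character, so "1.21K" evaluates as 1.21; the suffix only selects the multiplier.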
I cannot explain why "zpool iostat" (note no interval argument!) shows some reads/writes. For example, on my systems, the following loop: while true; do zpool iostat; done ...literally returns the same data over and over, no matter what is going on with the pools (reads or writes). I'm sure someone can explain this behaviour, but it reminds me of systems where running "vmstat 1" shows "crazy" values for the first interval, but the 2nd and onward are accurate. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 15:45:07 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A2C70106567C for ; Fri, 2 Dec 2011 15:45:07 +0000 (UTC) (envelope-from peter.maloney@brockmann-consult.de) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.186]) by mx1.freebsd.org (Postfix) with ESMTP id 4F5708FC19 for ; Fri, 2 Dec 2011 15:45:06 +0000 (UTC) Received: from [10.3.0.26] ([141.4.215.32]) by mrelayeu.kundenserver.de (node=mrbap4) with ESMTP (Nemesis) id 0M5xYf-1QZAW83vKr-00xv4N; Fri, 02 Dec 2011 16:45:06 +0100 Message-ID: <4ED8F281.10903@brockmann-consult.de> Date: Fri, 02 Dec 2011 16:45:05 +0100 From: Peter Maloney User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.23) Gecko/20110922 Thunderbird/3.1.15 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <4ED8D7A5.7090700@icritical.com> <4ED8EC9A.2080706@icritical.com> <20111202153624.GA28715@icarus.home.lan> In-Reply-To: <20111202153624.GA28715@icarus.home.lan> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:J/8HlKlCEGATAlAWFqBFylobQP4IKSnctfdj4vh0b+8 j4h6htmygRWwqcMbdJJAvCvmBKI0/kwW2jgtmp/Oq/yttvCCom
WnOOeegNU4SZxrUXQO9fIAh8ltYOyGh3ecJU/rqxlw1+bDQsC4 xTqMI+H9j6oZ5BiuQxj0zmucrFCiyiB0ErOjonbnK9E1P7/RF8 dd0QA/TQQj/fuGY48bomNo4I63ytAZYSUu0mSpLrPJkqWuiKOL yNaQCvK2HLMEwOS3m7piaraR8tXN/QAspXDjMv2fMQt3iKv7wO w95XI9lCJQEq8sBI52JDYYhr+Nv5imrDZYRrVtYXVdO5O/VrW7 cy34+rvWdLyOrZtH/E16phF45HoFFrTjUd62zQ/Gi Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 15:45:07 -0000 On 12/02/2011 04:36 PM, Jeremy Chadwick wrote: > On Fri, Dec 02, 2011 at 03:19:54PM +0000, Matt Burke wrote: >> On 12/02/11 14:47, Ronald Klop wrote: >>> while true; do gstat -b -I 1s; done >> Looks like I wasn't clear about what I'm after - sorry. >> >> I want to see how many bytes or KB have been read and written to a given >> zpool since creation (as in the newer of uptime or zpool creation) on the >> system. >> >> For instance I want this data: >> >> # time iostat -Idx >> extended device statistics >> device r/i w/i kr/i kw/i wait svc_t %b >> mfid0 284807.0 5469251.0 4452202.0 116634996.0 0 0.8 0 >> mfid1 284576.0 5466322.0 4474976.5 116510280.0 0 0.8 0 >> mfid2 278686.0 5450269.0 4418703.0 116511709.0 0 0.8 0 >> mfid3 281673.0 5452757.0 4439770.5 116560910.5 0 0.8 0 >> mfid4 279549.0 5472177.0 4440227.0 116609067.0 0 0.8 0 >> mfid5 282625.0 5464261.0 4503257.5 116608801.5 0 0.8 0 >> mfid6 275635.0 5470654.0 4433529.0 116616131.5 0 0.8 0 >> ... >> mfid27 302950.0 5464880.0 4434398.0 116542100.0 0 0.7 0 >> mfid28 281464.0 5459410.0 4461678.5 116595780.5 0 0.8 0 >> mfid29 277535.0 5468784.0 4443352.5 116642932.0 0 0.8 0 >> ... >> real 0m0.003s >> user 0m0.000s >> sys 0m0.007s >> >> >> For the zpool as a singular entitiy (or even by zfs filesystem), but not >> for the individual disks. >> >> Hope this clarifies my request a bit > To my knowledge this kind of data is not kept/available in ZFS (FreeBSD > or Solaris). 
What you're wanting (truly) are counters rather than > averages, and you can do the averaging yourself (if wanted). "zpool > iostat" does not do this. Couldn't something like this be added easily to the zfs kernel module, so it can be viewed with something like "sysctl -a | grep kstat.zfs.vdevstats"? > With most utilities like iostat, mpstat, zpool iostat, gstat, vmstat, > and others of this nature, the established method/model/norm is that you > always provide an interval and you ignore the first sample/set of data > shown. In iostat's case on FreeBSD, it provides you an average over the > entire system uptime. Other utilities do not work this way. > > Even if "zpool iostat" behaved like your above iostat example, you'd > still run into the problem I described in my other mail (which is that > you get human-readable output, not actual integers/floats, and you > therefore have to do math to turn the values into integers, which sounds > easy but isn't, and you lose granularity/accuracy too). > > I cannot explain why "zpool iostat" (note no interval argument!) shows > some reads/writes. For example, on my systems, the following loop: > > while true; do zpool iostat; done > > ...literally returns the same data over and over, no matter what is > going on with he pools (reads or writes). I'm sure someone can explain > this behaviour, but it reminds me of systems where running "vmstat 1" > shows "crazy" values for the first interval, but the 2nd and onward > are accurate. > -- -------------------------------------------- Peter Maloney Brockmann Consult Max-Planck-Str. 
2 21502 Geesthacht Germany Tel: +49 4152 889 300 Fax: +49 4152 889 333 E-mail: peter.maloney@brockmann-consult.de Internet: http://www.brockmann-consult.de -------------------------------------------- From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 17:37:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 773C7106566C for ; Fri, 2 Dec 2011 17:37:30 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 316A88FC0C for ; Fri, 2 Dec 2011 17:37:29 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id pB2HbSlj002102; Fri, 2 Dec 2011 11:37:29 -0600 (CST) Date: Fri, 2 Dec 2011 11:37:28 -0600 (CST) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Matt Burke In-Reply-To: <4ED8EC9A.2080706@icritical.com> Message-ID: References: <4ED8D7A5.7090700@icritical.com> <4ED8EC9A.2080706@icritical.com> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Fri, 02 Dec 2011 11:37:29 -0600 (CST) Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 17:37:30 -0000 On Fri, 2 Dec 2011, Matt Burke wrote: > On 12/02/11 14:47, Ronald Klop wrote: >> while true; do gstat -b -I 1s; done > > Looks like I wasn't clear about what I'm after - sorry. 
> > I want to see how many bytes or KB have been read and written to a given > zpool since creation (as in the newer of uptime or zpool creation) on the > system. This implies that these statistics would need to be stored in the pool itself, which implies that the statistics need to be periodically written (e.g. in each transaction group) as a form of metadata. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 18:16:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0524F1065670 for ; Fri, 2 Dec 2011 18:16:54 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8AECA8FC08 for ; Fri, 2 Dec 2011 18:16:53 +0000 (UTC) Received: by faak28 with SMTP id k28so3313595faa.13 for ; Fri, 02 Dec 2011 10:16:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=1vYKxs9VK7LLUeeEGvzEWFQHdIZocO3iIPlgOLq3HTw=; b=jEpw7xG6i9DFq7EkLEMef1YYqOiJ+tnS64H5iLVoU08OszvwRFUkh75zygFnQYo5in svjmTUoyNmsq7Uvo6sJLCuKrRKeQ6EzYXKkHrCwJ22O23yrfGwyO6bFsqjXwPzmTZxtE xen3n+BPaGqywzkCkW/YzwKer88rWdlWHDd9A= MIME-Version: 1.0 Received: by 10.180.108.114 with SMTP id hj18mr11438154wib.2.1322847937649; Fri, 02 Dec 2011 09:45:37 -0800 (PST) Received: by 10.223.83.14 with HTTP; Fri, 2 Dec 2011 09:45:37 -0800 (PST) In-Reply-To: <20111202150551.GA26344@DataIX.net> References: <4ED8D7A5.7090700@icritical.com> <20111202150551.GA26344@DataIX.net> Date: Fri, 2 Dec 2011 11:45:37 -0600 Message-ID: From: Adam Vande More To: Jason Hellenthal Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: 
Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 18:16:54 -0000 On Fri, Dec 2, 2011 at 9:05 AM, Jason Hellenthal wrote: > > You will need to add this patch to your kernel and recompile. Things like > top -m io and iostat will start working properly after. > Is there some reason this hasn't made it into the src yet? -- Adam Vande More From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 19:04:38 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F36E106564A for ; Fri, 2 Dec 2011 19:04:38 +0000 (UTC) (envelope-from peter.maloney@brockmann-consult.de) Received: from mo-p05-ob6.rzone.de (mo-p05-ob6.rzone.de [IPv6:2a01:238:20a:202:53f5::1]) by mx1.freebsd.org (Postfix) with ESMTP id 86F4E8FC0C for ; Fri, 2 Dec 2011 19:04:37 +0000 (UTC) X-RZG-AUTH: :LWIKdA2leu0bPbLmhzXgqn0MTG6qiKEwQRWfNxSw4HzYIwjsnvdDt2oX8drk23mufl0QIA== X-RZG-CLASS-ID: mo05 Received: from [192.168.179.42] (hmbg-5f760a09.pool.mediaWays.net [95.118.10.9]) by smtp.strato.de (cohen mo30) (RZmta 26.10 AUTH) with (DHE-RSA-AES256-SHA encrypted) ESMTPA id n04b34nB2GbFDn ; Fri, 2 Dec 2011 20:04:26 +0100 (MET) Message-ID: <4ED92139.6010900@brockmann-consult.de> Date: Fri, 02 Dec 2011 20:04:25 +0100 From: Peter Maloney User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20111105 Thunderbird/8.0 MIME-Version: 1.0 To: Jeremy Chadwick References: <4ED77B09.1090709@brockmann-consult.de> <4ED87FA6.6010408@brockmann-consult.de> <20111202142656.GA7104@e4310> <20111202145008.GA27853@icarus.home.lan> In-Reply-To: <20111202145008.GA27853@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 
2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 19:04:38 -0000 Am 02.12.2011 15:50, schrieb Jeremy Chadwick: > On Fri, Dec 02, 2011 at 03:27:03PM +0100, Michel Le Cocq wrote: >> it's just me or there is no attachment ? > The mailing list stripped the attachment. The previous individual will > need to put it up on the web somewhere. > It is possible that I forgot to attach it. I assumed it would be stripped off but the ones in the to/cc would get it. Here it is on the company website: http://www.brockmann-consult.de/peter2/zfs.tgz Disclaimer/notes: -provided as is... might destroy your system, furthermore, I am not responsible for bodily injury nor nuclear war that may result from misuse -there are no unit tests, and no documentation other than a few comments that are possibly only coherent when I read them. For example, it says that it does it recursively and rolls back the destination dataset, but there are a few undocumented cases I can't remember when I needed to do something manual like delete a snapshot, or destroy a dataset. Maybe that is all in the past. I don't know. -the zfs_repl2.bash is the one that makes snapshots and replicates which I wrote myself. The other ksh one is the Oracle one I linked above, and the .sh version of it was just what I was working on to try to make it work reliably, before redoing it all myself (reinventing the wheel is indeed fun). -especially beware of the deleteOldSnapshots.bash which is not well tested and not used yet (and deleteEmptySnapshots.bash which does not work and I believe cannot work). 
-granted transferable your choice of any present or future version of the BSD or GPL license and another note, I meant to study these which might be better versions of the same thing, or something different, but never got around to it: /usr/ports/sysutils/zfs-replicate/ /usr/ports/sysutils/zfsnap/ /usr/ports/sysutils/zfs-periodic From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 19:36:50 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F2D25106564A for ; Fri, 2 Dec 2011 19:36:49 +0000 (UTC) (envelope-from martin@lispworks.com) Received: from lwfs1-cam.cam.lispworks.com (mail.lispworks.com [193.34.186.230]) by mx1.freebsd.org (Postfix) with ESMTP id 9D7A28FC13 for ; Fri, 2 Dec 2011 19:36:49 +0000 (UTC) Received: from higson.cam.lispworks.com (higson [192.168.1.7]) by lwfs1-cam.cam.lispworks.com (8.14.3/8.14.3) with ESMTP id pB2JaitY009005; Fri, 2 Dec 2011 19:36:44 GMT (envelope-from martin@lispworks.com) Received: from higson.cam.lispworks.com (localhost.localdomain [127.0.0.1]) by higson.cam.lispworks.com (8.14.4) id pB2JaiOj012448; Fri, 2 Dec 2011 19:36:44 GMT Received: (from martin@localhost) by higson.cam.lispworks.com (8.14.4/8.14.4/Submit) id pB2Jaiuw012444; Fri, 2 Dec 2011 19:36:44 GMT Date: Fri, 2 Dec 2011 19:36:44 GMT Message-Id: <201112021936.pB2Jaiuw012444@higson.cam.lispworks.com> From: Martin Simmons To: freebsd-fs@freebsd.org In-reply-to: <20111202153624.GA28715@icarus.home.lan> (message from Jeremy Chadwick on Fri, 2 Dec 2011 07:36:24 -0800) References: <4ED8D7A5.7090700@icritical.com> <4ED8EC9A.2080706@icritical.com> <20111202153624.GA28715@icarus.home.lan> Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 19:36:50 -0000 >>>>> On Fri, 2 Dec 2011 
07:36:24 -0800, Jeremy Chadwick said: > > On Fri, Dec 02, 2011 at 03:19:54PM +0000, Matt Burke wrote: > > On 12/02/11 14:47, Ronald Klop wrote: > > > while true; do gstat -b -I 1s; done > > > > Looks like I wasn't clear about what I'm after - sorry. > > > > I want to see how many bytes or KB have been read and written to a given > > zpool since creation (as in the newer of uptime or zpool creation) on the > > system. > > > > For instance I want this data: > > > > # time iostat -Idx > > extended device statistics > > device r/i w/i kr/i kw/i wait svc_t %b > > mfid0 284807.0 5469251.0 4452202.0 116634996.0 0 0.8 0 > > mfid1 284576.0 5466322.0 4474976.5 116510280.0 0 0.8 0 > > mfid2 278686.0 5450269.0 4418703.0 116511709.0 0 0.8 0 > > mfid3 281673.0 5452757.0 4439770.5 116560910.5 0 0.8 0 > > mfid4 279549.0 5472177.0 4440227.0 116609067.0 0 0.8 0 > > mfid5 282625.0 5464261.0 4503257.5 116608801.5 0 0.8 0 > > mfid6 275635.0 5470654.0 4433529.0 116616131.5 0 0.8 0 > > ... > > mfid27 302950.0 5464880.0 4434398.0 116542100.0 0 0.7 0 > > mfid28 281464.0 5459410.0 4461678.5 116595780.5 0 0.8 0 > > mfid29 277535.0 5468784.0 4443352.5 116642932.0 0 0.8 0 > > ... > > real 0m0.003s > > user 0m0.000s > > sys 0m0.007s > > > > > > For the zpool as a singular entitiy (or even by zfs filesystem), but not > > for the individual disks. > > > > Hope this clarifies my request a bit > > To my knowledge this kind of data is not kept/available in ZFS (FreeBSD > or Solaris). What you're wanting (truly) are counters rather than > averages, and you can do the averaging yourself (if wanted). "zpool > iostat" does not do this. > > With most utilities like iostat, mpstat, zpool iostat, gstat, vmstat, > and others of this nature, the established method/model/norm is that you > always provide an interval and you ignore the first sample/set of data > shown. In iostat's case on FreeBSD, it provides you an average over the > entire system uptime. Other utilities do not work this way. 
> > Even if "zpool iostat" behaved like your above iostat example, you'd > still run into the problem I described in my other mail (which is that > you get human-readable output, not actual integers/floats, and you > therefore have to do math to turn the values into integers, which sounds > easy but isn't, and you lose granularity/accuracy too). > > I cannot explain why "zpool iostat" (note no interval argument!) shows > some reads/writes. For example, on my systems, the following loop: > > while true; do zpool iostat; done > > ...literally returns the same data over and over, no matter what is > going on with the pools (reads or writes). I'm sure someone can explain > this behaviour, but it reminds me of systems where running "vmstat 1" > shows "crazy" values for the first interval, but the 2nd and onward > are accurate. It looks like the first set of numbers are the averages between now and the time that the vdev was loaded into the kernel (see print_vdev_stats). It should be easy to write a function that prints the unscaled raw values. 
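Pending such a function, a rough workaround is to convert the human-readable figures from that first since-load sample back into bytes. A minimal sketch in POSIX sh/awk follows; the to_bytes helper is invented for illustration (it is not part of any FreeBSD tool), and the round trip cannot recover the precision the scaled output already dropped:

```shell
# to_bytes: convert a zpool-iostat style figure (512, 1.5K, 20M, 3G, 2T)
# back into an integer byte count.  Hypothetical helper, not part of any
# FreeBSD utility; values that zpool already rounded stay rounded.
to_bytes() {
    echo "$1" | awk '{
        n = $0 + 0                        # numeric prefix, e.g. 1.5
        u = substr($0, length($0), 1)     # trailing unit letter, if any
        mult = 1
        if      (u == "K") mult = 1024
        else if (u == "M") mult = 1024 ^ 2
        else if (u == "G") mult = 1024 ^ 3
        else if (u == "T") mult = 1024 ^ 4
        printf "%.0f\n", n * mult
    }'
}

# Example use: print the since-load bandwidth columns for pool "tank"
# in bytes/s (column positions assumed from the default output layout):
#   zpool iostat tank | awk '$1 == "tank" { print $5, $6 }' | \
#       while read r w; do
#           echo "read $(to_bytes "$r") write $(to_bytes "$w")"
#       done
```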
__Martin From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 19:56:29 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D6811065677 for ; Fri, 2 Dec 2011 19:56:29 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 062378FC17 for ; Fri, 2 Dec 2011 19:56:28 +0000 (UTC) Received: by wgbdr11 with SMTP id dr11so1794849wgb.31 for ; Fri, 02 Dec 2011 11:56:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to; bh=lW0zEptH86K13toQDpblysysz62XqGOc95cc3rlk+OA=; b=myzDUYRfgFu90g/+s/JIcOPXc/5rXYJx6Egx3mcM8kAiIAFHfAGUn2AzIHM5wFSWDa YgbViGSc+ZnRo68lZ/zi1yeDPkfAAF01QBinIfnCUVpWrghOnAawbLvk8RIcnm1NFcNQ TQuAmPCwxMrAkCvua0rZy4eRCMcogkML4Gd3A= Received: by 10.227.206.6 with SMTP id fs6mr6985752wbb.20.1322855788041; Fri, 02 Dec 2011 11:56:28 -0800 (PST) Received: from DataIX.net (ppp-21.139.dialinfree.com. 
[209.172.21.139]) by mx.google.com with ESMTPS id fk3sm11791012wbb.10.2011.12.02.11.56.25 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 02 Dec 2011 11:56:27 -0800 (PST) Sender: Jason Hellenthal Received: from DataIX.net (localhost [127.0.0.1]) by DataIX.net (8.14.5/8.14.5) with ESMTP id pB2JuKFR038968 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 2 Dec 2011 14:56:20 -0500 (EST) (envelope-from jhell@DataIX.net) Received: (from jhell@localhost) by DataIX.net (8.14.5/8.14.5/Submit) id pB2JuE7V038967; Fri, 2 Dec 2011 14:56:14 -0500 (EST) (envelope-from jhell@DataIX.net) Date: Fri, 2 Dec 2011 14:56:14 -0500 From: Jason Hellenthal To: Adam Vande More Message-ID: <20111202195614.GA38299@DataIX.net> References: <4ED8D7A5.7090700@icritical.com> <20111202150551.GA26344@DataIX.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 19:56:29 -0000 On Fri, Dec 02, 2011 at 11:45:37AM -0600, Adam Vande More wrote: > On Fri, Dec 2, 2011 at 9:05 AM, Jason Hellenthal wrote: > > > > > You will need to add this patch to your kernel and recompile. Things like > > top -m io and iostat will start working properly after. > > > > Is there some reason this hasn't made it into the src yet? > As far as I know it very well could be in 9.0; I have not checked. I do know it is not in 8.2-STABLE. For what reason is beyond me, but this has been lying around for quite some time. 
>=1y From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 20:02:58 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12EE21065698 for ; Fri, 2 Dec 2011 20:02:58 +0000 (UTC) (envelope-from jhellenthal@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8ED378FC14 for ; Fri, 2 Dec 2011 20:02:57 +0000 (UTC) Received: by bkat2 with SMTP id t2so5015471bka.13 for ; Fri, 02 Dec 2011 12:02:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to; bh=wDSxHpqTZf5GPao/rmLBRkMYJwYiFcVratv48caXZ3c=; b=J040BbZ9fiCo172ZKse7VJVjDh3m6N6p5LIju7yOXy/NIMGYI1KvQy5lVoCw0nFdKT R54TfUMFvXH2J8v/CK0IdppRpejcXeDGYj98RjJ7+e0k0T925RXfQN1V+FXnhmpK+f9w dQlNZkzYtFfpRyzIP+j5aoh8esKhpsg0GuAf8= Received: by 10.180.7.164 with SMTP id k4mr11706491wia.52.1322856176259; Fri, 02 Dec 2011 12:02:56 -0800 (PST) Received: from DataIX.net (ppp-21.139.dialinfree.com. 
[209.172.21.139]) by mx.google.com with ESMTPS id em4sm11801877wbb.20.2011.12.02.12.02.52 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 02 Dec 2011 12:02:55 -0800 (PST) Sender: Jason Hellenthal Received: from DataIX.net (localhost [127.0.0.1]) by DataIX.net (8.14.5/8.14.5) with ESMTP id pB2K2l5F039350 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 2 Dec 2011 15:02:47 -0500 (EST) (envelope-from jhell@DataIX.net) Received: (from jhell@localhost) by DataIX.net (8.14.5/8.14.5/Submit) id pB2K2gXW039349; Fri, 2 Dec 2011 15:02:42 -0500 (EST) (envelope-from jhell@DataIX.net) Date: Fri, 2 Dec 2011 15:02:41 -0500 From: Jason Hellenthal To: Bob Friesenhahn Message-ID: <20111202200241.GA38979@DataIX.net> References: <4ED8D7A5.7090700@icritical.com> <4ED8EC9A.2080706@icritical.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 20:02:58 -0000 On Fri, Dec 02, 2011 at 11:37:28AM -0600, Bob Friesenhahn wrote: > On Fri, 2 Dec 2011, Matt Burke wrote: > > > On 12/02/11 14:47, Ronald Klop wrote: > >> while true; do gstat -b -I 1s; done > > > > Looks like I wasn't clear about what I'm after - sorry. > > > > I want to see how many bytes or KB have been read and written to a given > > zpool since creation (as in the newer of uptime or zpool creation) on the > > system. > > This implies that these statistics would need to be stored in the pool > itself, which implies that the statistics need to be periodically > written (e.g. in each transaction group) as a form of metadata. > I thought this was ( df -[h,b,m,g]) output... ;) zfs list ... 
From owner-freebsd-fs@FreeBSD.ORG Fri Dec 2 20:15:15 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A569106564A for ; Fri, 2 Dec 2011 20:15:15 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 203288FC08 for ; Fri, 2 Dec 2011 20:15:14 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id pB2KF5PB002648; Fri, 2 Dec 2011 14:15:05 -0600 (CST) Date: Fri, 2 Dec 2011 14:15:05 -0600 (CST) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Jason Hellenthal In-Reply-To: <20111202200241.GA38979@DataIX.net> Message-ID: References: <4ED8D7A5.7090700@icritical.com> <4ED8EC9A.2080706@icritical.com> <20111202200241.GA38979@DataIX.net> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Fri, 02 Dec 2011 14:15:05 -0600 (CST) Cc: freebsd-fs@freebsd.org Subject: Re: Monitoring ZFS IO X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 Dec 2011 20:15:15 -0000 On Fri, 2 Dec 2011, Jason Hellenthal wrote: >>> >>> I want to see how many bytes or KB have been read and written to a given >>> zpool since creation (as in the newer of uptime or zpool creation) on the >>> system. >> >> This implies that these statistics would need to be stored in the pool >> itself, which implies that the statistics need to be periodically >> written (e.g. in each transaction group) as a form of metadata. 
>> > > I thought this was ( df -[h,b,m,g]) output... ;) Apparently the term "creation" was meant to mean "since boot" rather than since filesystem creation. Regardless, it is possible to write thousands of times more data to a filesystem than it contains. This means that 'df' won't do the job. It would be nice if the zfs pool could store its own performance data since the dawn of time. Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Sat Dec 3 05:29:31 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 16679106566C for ; Sat, 3 Dec 2011 05:29:31 +0000 (UTC) (envelope-from techchavez@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 94E908FC0C for ; Sat, 3 Dec 2011 05:29:30 +0000 (UTC) Received: by wgbdr11 with SMTP id dr11so2694947wgb.31 for ; Fri, 02 Dec 2011 21:29:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=h3tUU5558abFVwA20wQMjvcuQUYGC5a++JCA/sFK82k=; b=sXF8lahFCWSkYIW163nvFuOyweXoTQ/FNXV+zZ6ax6F+Q6zIlLpFID5snnvV9KJU6s UbJ0akNJo+M6pl8404rCwFSOqrqFEOkY0TWiWSXv5XGVlnQe1R7ZbmW6XSfaiMoxHpGw rRFY8C5ANivQO0u/kzvLzVH6ZEh+gdbE0rOtg= MIME-Version: 1.0 Received: by 10.216.14.37 with SMTP id c37mr242177wec.86.1322890169281; Fri, 02 Dec 2011 21:29:29 -0800 (PST) Received: by 10.180.94.197 with HTTP; Fri, 2 Dec 2011 21:29:29 -0800 (PST) In-Reply-To: <4ED92139.6010900@brockmann-consult.de> References: <4ED77B09.1090709@brockmann-consult.de> <4ED87FA6.6010408@brockmann-consult.de> <20111202142656.GA7104@e4310> <20111202145008.GA27853@icarus.home.lan> 
<4ED92139.6010900@brockmann-consult.de> Date: Fri, 2 Dec 2011 22:29:29 -0700 Message-ID: From: Techie To: Peter Maloney Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 03 Dec 2011 05:29:31 -0000 Hey Peter, Thanks for your informative response. This is odd, I have been waiting for a response to this question for a few days and these messages just came through..I am glad they did. Anyhow please allow me to explain the whole "tar" thing. I regret it because no one even addressed the DDT, Dedup Table part of it. You see I wanted to use ZFS as a deduplication disk target for my backup applications and use the native replication capabilities of ZFS to replicate the virtual backup cartridges. All modern backup apps leverage disk as a backup target but some don't do replication. My idea was to use ZFS to do this. However after testing I came to the realization that ZFS deduplication is NOT ideal for "deduping" third party backup streams. From what I read this is due to the fact that backup applications put their own metadata in the streams and throw off the block alignment. Products like Data Domain and Quantum DXi use a variable length block and are developed towards deduplicating backup application streams. ZFS does OK but nothing in comparison to the dedup ratio seen on these aforementioned appliances. I used tar as an example and should have been more specific. I understand what you are saying about replicating every 15 minutes etc.. However since backup applications create huge files, an incremental send would need to send the newly created huge file..At least that is how I understand it; I may be incorrect. 
In my testing this was the case but perhaps my syntax was not correct. In any case deduplication appliances when replicating only send the changed blocks that don't exist on the target side. To do this they have to have knowledge of what exists in the target side "block pool", "dedup hash table", or whatever it may be called. From what I understand a ZFS file system on the source side has no idea of what exists on the target side. I also understand, maybe incorrectly, that the zfs send -D only eliminates duplicate blocks in the stream it is sending and does not account for a block that may already exist at the target. As an example let's say I am using a backup app like Amanda.. I do a full backup every day to a ZFS based disk target..Every day after the backup completes I do a -- "zfs send -D -i {snap_yesterday} {snap_today} | ssh DESTINATION zfs recv DEST_FS". Now each day's full backup will only have maybe a 1% change rate and this will be reflected on the source side file system. So if I had 5 days of 2 GB full backups, the source file system will show maybe 3GB Alloc in the zpool list output. However since the source does not know about duplicate blocks on the target side from yesterday's backup, it sends the entire 2GB full backup from today only removing any duplicate blocks that exist in the stream it is sending. The difference with a dedup appliance is that it is aware of duplicate blocks on the target side and won't send them. This is the reason my original question was asking if there were any plans to implement a "global DDT" or dedup table to make the source aware of the duplicate blocks already at the destination so that only unique blocks are transferred. Am I incorrect in my understanding of the ZFS DDT being unique to each ZFS file system/pool? 
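For concreteness, the daily cycle I am describing boils down to roughly the following sketch. The pool, dataset and host names are placeholders I made up, and the zfs/ssh part is untested and only runs when the script is invoked with "run":

```shell
#!/bin/sh
# Sketch of the daily incremental, deduplicated replication cycle
# discussed above.  POOL, DEST and REPLICA are placeholder names.
POOL=tank/backups
DEST=backuphost
REPLICA=tank/backups-replica

# snapname: build a snapshot name of the form daily-YYYY-MM-DD
snapname() {
    echo "daily-$1"
}

replicate() {
    today=$1; yesterday=$2
    zfs snapshot "${POOL}@$(snapname "$today")"
    # -D dedups blocks only within this one stream; it does not consult
    # the receiver's DDT, which is exactly the limitation in question.
    zfs send -D -i "$(snapname "$yesterday")" "${POOL}@$(snapname "$today")" \
        | ssh "$DEST" zfs recv -F "$REPLICA"
}

if [ "$1" = "run" ]; then
    # FreeBSD date(1): -v-1d steps back one day
    replicate "$(date +%Y-%m-%d)" "$(date -v-1d +%Y-%m-%d)"
fi
```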
Thanks Jimmy On Fri, Dec 2, 2011 at 12:04 PM, Peter Maloney wrote: > Am 02.12.2011 15:50, schrieb Jeremy Chadwick: >> On Fri, Dec 02, 2011 at 03:27:03PM +0100, Michel Le Cocq wrote: >>> it's just me or there is no attachment ? >> The mailing list stripped the attachment. =C2=A0The previous individual = will >> need to put it up on the web somewhere. >> > It is possible that I forgot to attach it. I assumed it would be > stripped off but the ones in the to/cc would get it. > > Here it is on the company website: > > http://www.brockmann-consult.de/peter2/zfs.tgz > > > > Disclaimer/notes: > -provided as is... might destroy your system, furthermore, I am not > responsible for bodily injury nor nuclear war that may result from misuse > -there are no unit tests, and no documentation other than a few comments > that are possibly only coherent when I read them. For example, it says > that it does it recursively and rolls back the destination dataset, but > there are a few undocumented cases I can't remember when I needed to do > something manual like delete a snapshot, or destroy a dataset. Maybe > that is all in the past. I don't know. > -the zfs_repl2.bash is the one that makes snapshots and replicates which > I wrote myself. The other ksh one is the Oracle one I linked above, and > the .sh version of it was just what I was working on to try to make it > work reliably, before redoing it all myself (reinventing the wheel is > indeed fun). > -especially beware of the deleteOldSnapshots.bash which is not well > tested and not used yet (and deleteEmptySnapshots.bash which does not > work and I believe cannot work). 
> -granted transferable your choice of any present or future version of > the BSD or GPL license > > and another note, I meant to study these which might be better versions > of the same thing, or something different, but never got around to it: > =C2=A0 =C2=A0/usr/ports/sysutils/zfs-replicate/ > =C2=A0 =C2=A0/usr/ports/sysutils/zfsnap/ > =C2=A0 =C2=A0/usr/ports/sysutils/zfs-periodic > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Sat Dec 3 09:53:53 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9BBEE106566C for ; Sat, 3 Dec 2011 09:53:53 +0000 (UTC) (envelope-from joh.hendriks@gmail.com) Received: from mail-ee0-f54.google.com (mail-ee0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 314DB8FC08 for ; Sat, 3 Dec 2011 09:53:52 +0000 (UTC) Received: by eekc13 with SMTP id c13so2856646eek.13 for ; Sat, 03 Dec 2011 01:53:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:subject :content-type:content-transfer-encoding; bh=HbKUMzkIynlWB1OsglW1RjIfZ3Y2LS/9tA4Vysw8rVI=; b=KYNPESCYAYnxdAwT0nQV8C38xj5YWwfKnt7l7NXSEvC63lTIiH9y75V6wUrF0c8/tn jBA2j8h4yG+ILyzBg9piAjl+Y3lQcxEniV8asQYN4CKVsF9wZzwDbOzY/1pQxBxeO2B0 jeIGkMdUwZP0eENToY2sSRnLi4+QTtq7iOnsU= Received: by 10.14.30.134 with SMTP id k6mr135198eea.59.1322906032065; Sat, 03 Dec 2011 01:53:52 -0800 (PST) Received: from [192.168.1.12] (5ED0E470.cm-7-1d.dynamic.ziggo.nl. 
[94.208.228.112]) by mx.google.com with ESMTPS id 65sm11639482eeg.8.2011.12.03.01.53.51 (version=SSLv3 cipher=OTHER); Sat, 03 Dec 2011 01:53:51 -0800 (PST) Message-ID: <4ED9F1A9.2070103@gmail.com> Date: Sat, 03 Dec 2011 10:53:45 +0100 From: Johan Hendriks User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20111105 Thunderbird/8.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: ZFS and autoreplace. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 03 Dec 2011 09:53:53 -0000 Hello all. I noticed that the autoreplace function is not working in FreeBSD. So after some searching on the net I believe this has not been implemented. Should ZFS not warn users that put a spare in the pool that this is NOT a hot spare, and that human intervention is needed to put the spare in place? I think that a lot of users could get a false sense of security. In the case of autoreplace, I think the pool must not insert the spare when an administrator sets a drive offline! Only when a drive fails and goes to the UNAVAIL or REMOVED state. For me, and I think for others, having hot spares is almost a normal thing, especially with an advanced system such as ZFS. Also, does anyone have a script that can do this? My scripting capabilities are close to non-existent. Secondly, devd. Are the entries in /etc/devd.conf accurate? If I look at the entries in /var/run/devd.pipe, they all look like: !system=ZFS subsystem=ZFS type=misc.fs.zfs.config_sync. The entries in devd.conf do not contain the subsystem match. notify 10 { match "system" "ZFS"; match "type" "data"; action "logger -p kern.warn 'ZFS: zpool I/O failure, zpool=$pool error=$zio_err'"; }; Should they not also contain the following line?
match "subsystem" "ZFS" If I convert the above example to detect a state change, no entry appears in the log. notify 10 { match "system" "ZFS"; match "type" "resource.fs.zfs.statechange"; action "logger -p kern.warn 'ZFS: State has changed on zpool=$pool'"; }; If I add the line match "subsystem" "ZFS" then I get the warning in my /var/log/messages. notify 10 { match "system" "ZFS"; match "subsystem" "ZFS"; match "type" "resource.fs.zfs.statechange"; action "logger -p kern.warn 'ZFS: State has changed on zpool=$pool'"; }; And which match type is there for a removed or unavailable drive? Thanks for your patience. Regards, Johan Hendriks From owner-freebsd-fs@FreeBSD.ORG Sat Dec 3 10:12:34 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D4BCC106564A for ; Sat, 3 Dec 2011 10:12:34 +0000 (UTC) (envelope-from peter.maloney@brockmann-consult.de) Received: from mo-p05-ob6.rzone.de (mo-p05-ob6.rzone.de [IPv6:2a01:238:20a:202:53f5::1]) by mx1.freebsd.org (Postfix) with ESMTP id 17B5B8FC08 for ; Sat, 3 Dec 2011 10:12:33 +0000 (UTC) X-RZG-AUTH: :LWIKdA2leu0bPbLmhzXgqn0MTG6qiKEwQRWfNxSw4HzYIwjsnvdDt2QV8d370m6rGVRSdw== X-RZG-CLASS-ID: mo05 Received: from [192.168.179.42] (hmbg-4d06c233.pool.mediaWays.net [77.6.194.51]) by post.strato.de (mrclete mo37) (RZmta 26.10 AUTH) with (DHE-RSA-AES128-SHA encrypted) ESMTPA id 402607nB39P2Sj ; Sat, 3 Dec 2011 11:12:14 +0100 (MET) Message-ID: <4ED9F5FC.5040400@brockmann-consult.de> Date: Sat, 03 Dec 2011 11:12:12 +0100 From: Peter Maloney User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20111105 Thunderbird/8.0 MIME-Version: 1.0 To: Techie References: <4ED77B09.1090709@brockmann-consult.de> <4ED87FA6.6010408@brockmann-consult.de> <20111202142656.GA7104@e4310> <20111202145008.GA27853@icarus.home.lan> <4ED92139.6010900@brockmann-consult.de> In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit 
Cc: freebsd-fs@freebsd.org Subject: Re: ZFS dedup and replication X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 03 Dec 2011 10:12:34 -0000 So I'm still not sure I know exactly what you want to do from a high level. But I think it isn't the normal way to do things in zfs. Surely making the sender aware of the receiver's dedup table would be nice, but you shouldn't need it so badly. Comments below... (basically describing how I think you should do things instead, but I don't know if it meets your needs.) Am 03.12.2011 06:29, schrieb Techie: > Hey Peter, > > Thanks for your informative response. > > This is odd, I have been waiting for a response to this question for a > few days and these messages just came through..I am glad they did. > > Anyhow please allow me to explain the whole "tar" thing. I regret it > because no one even addressed the DDT, Dedup Table part of it. > > You see I wanted to use ZFS as a deduplication disk target for my > backup applications and use the native replication capabilities of ZFS > to replicate the virtual backup cartridges. All modern backup apps > leverage disk as a backup target but some don't do replication. > > My idea was to use ZFS to do this. However after testing I came to the > realization that ZFS deduplication is NOT ideal for "deduping" third > party backup streams. Probably true, but why would you have duplicates in those streams? Why don't you use incremental backup streams? > From what I read this is due to the fact that > backup applications put their own metadata in the streams and throw > off the block alignment. Products like Data Domain and Quantum DXi use > a variable length block and are developed towards deduplicating backup > application streams. ZFS does OK but nothing in comparison to the > dedup ratio seen on these aforementioned appliances. 
I used tar as an > example and should have been more specific. I understand what you are > saying about replicating every 15 minutes etc.. However since backup > application create huge files, It should only be huge when you do a full backup... this statement makes it sound like you are doing a full backup every time. > an incremental send would need to send > the newly created huge file..At least that is how I understand, I may > be correctly. In my testing this was the case but perhaps my syntax > was not correct. > > In any case deduplication appliances when replicating only send the > changed blocks that don't exist on the target side. To do this they > have to have knowledge of what exists in the target side "block pool", > "dedup hash table",or whatever it may be called. With zfs replication, it probably doesn't have access to the reciever's dedup table, but incremental streams that only send the changes (not the whole disk in a newly created tar) would do this for you. So I think this should be your goal. > From what I understand a ZFS file system on the source side has no > idea of what exists on the target side. I also understand and maybe > incorrectly, that the zfs send -D only eliminates duplicate blocks in > the stream it is sending and does not account for a block that may > already exist at the target. > > As an example let's say I am using a backup app like Amanda.. I do a > full backup every day to a ZFS based disk target. Again you say full backup. I really think you should be doing incremental backups most of the time. But ideally, instead of using backup software to restore, you would just choose a snapshot from the backup system, and restore the files directly. "incremental" sends would be done, but incremental files would not exist... instead, you would be using the copy-on-write ability controlled by snapshots. 
In my reading, this seems to be "the zfs way", and the only disadvantage I ran into is that zfs send does not understand what to do when the end of a tape is hit, so it would just fail instead of letting you insert a second tape. Since you are not using tapes, this doesn't matter. What I think you should do to restore files: e.g. recover some files me@somenonzfscomputer # scp -r me@mybackuphost:/somedataset/.zfs/snapshot/daily-2011-11-04T00:00:00/ /somewheretoputit or recover the whole zfs snapshot with all metadata, etc., or even snapshots if you do it recursively me@somezfscomputer # ssh me@mybackuphost "zfs send ... somedataset@daily-2011-11-04T00:00:00" | zfs recv ... somedatasettoputit > .Every day after the > backup completes I do a -- "zfs send -D -i {snap_yesterday} > {snap_today} | ssh DESTINATION zfs recv DEST_FS". Now each day's full > backup will only have maybe a 1% change rate and this will be > reflected on the source side file system. So if I had 5 days of 2 GB > full backups, the source file system will show maybe 3GB Alloc in the > zpool list output. However since the source does not know about > duplicate blocks on the target side from yesterday's backup, it sends > the entire 2GB full backup from today only removing any duplicate > blocks that exist in the stream it is sending. The difference with a > dedup appliance is that it is aware of duplicate blocks on the target > side and won't send them. 
>
> Thanks
> Jimmy
>
> On Fri, Dec 2, 2011 at 12:04 PM, Peter Maloney wrote:
>> On 02.12.2011 15:50, Jeremy Chadwick wrote:
>>> On Fri, Dec 02, 2011 at 03:27:03PM +0100, Michel Le Cocq wrote:
>>>> it's just me or there is no attachment ?
>>> The mailing list stripped the attachment. The previous individual will
>>> need to put it up on the web somewhere.
>>>
>> It is possible that I forgot to attach it. I assumed it would be
>> stripped off but the ones in the to/cc would get it.
>>
>> Here it is on the company website:
>>
>> http://www.brockmann-consult.de/peter2/zfs.tgz
>>
>> Disclaimer/notes:
>> -provided as is... might destroy your system; furthermore, I am not
>> responsible for bodily injury nor nuclear war that may result from misuse
>> -there are no unit tests, and no documentation other than a few comments
>> that are possibly only coherent when I read them. For example, it says
>> that it does it recursively and rolls back the destination dataset, but
>> there are a few undocumented cases I can't remember, when I needed to do
>> something manual like delete a snapshot or destroy a dataset. Maybe
>> that is all in the past. I don't know.
>> -zfs_repl2.bash is the one that makes snapshots and replicates, which
>> I wrote myself. The other, ksh one is the Oracle one I linked above, and
>> the .sh version of it was just what I was working on to try to make it
>> work reliably, before redoing it all myself (reinventing the wheel is
>> indeed fun).
>> -especially beware of deleteOldSnapshots.bash, which is not well
>> tested and not used yet (and deleteEmptySnapshots.bash, which does not
>> work and I believe cannot work).
>> -granted, transferable, your choice of any present or future version
>> of the BSD or GPL license
>>
>> and another note: I meant to study these, which might be better versions
>> of the same thing, or something different, but never got around to it:
>> /usr/ports/sysutils/zfs-replicate/
>> /usr/ports/sysutils/zfsnap/
>> /usr/ports/sysutils/zfs-periodic
>>
>> _______________________________________________
>> freebsd-fs@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@FreeBSD.ORG Sat Dec 3 18:16:03 2011
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
 by hub.freebsd.org (Postfix) with ESMTP id 5AE07106564A
 for ; Sat, 3 Dec 2011 18:16:03 +0000 (UTC)
 (envelope-from kraduk@gmail.com)
Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182])
 by mx1.freebsd.org (Postfix) with ESMTP id 105158FC14
 for ; Sat, 3 Dec 2011 18:16:02 +0000 (UTC)
Received: by ghbg20 with SMTP id g20so5439347ghb.13
 for ; Sat, 03 Dec 2011 10:16:02 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :cc:content-type;
 bh=kjAFI8NzvTCiEwPIDziV+4HiB5q0JrokMZyKKtR/tmg=;
 b=UW4BGYUVfF+YldQsCeEHv7o8aS+2F85IAsJ9H0aNfd8ndC1ygIJVyyW6JJ4wPz0Iqk
 Fi5w0DenmbPVvPDGb7tcevI5VmxJ6yGndOW9u136V57pij1Qjj0WSm3zulwbelS//lfy
 T+4fsQSLYz7LSS/l9IYnwG1FSm5IPn9zXPQIc=
MIME-Version: 1.0
Received: by 10.236.192.233 with SMTP id i69mr3971013yhn.60.1322936160555;
 Sat, 03 Dec 2011 10:16:00 -0800 (PST)
Received: by 10.236.95.41 with HTTP; Sat, 3 Dec 2011 10:16:00 -0800 (PST)
In-Reply-To:
References: <4ED77B09.1090709@brockmann-consult.de>
 <4ED87FA6.6010408@brockmann-consult.de>
 <20111202142656.GA7104@e4310>
 <20111202145008.GA27853@icarus.home.lan>
 <4ED92139.6010900@brockmann-consult.de>
Date: Sat, 3 Dec
 2011 18:16:00 +0000
Message-ID:
From: krad
To: Techie
Content-Type: text/plain; charset=ISO-8859-1
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-fs@freebsd.org
Subject: Re: ZFS dedup and replication
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 03 Dec 2011 18:16:03 -0000

On 3 December 2011 05:29, Techie wrote:
> Hey Peter,
>
> Thanks for your informative response.
>
> This is odd, I have been waiting for a response to this question for
> a few days and these messages just came through.. I am glad they did.
>
> Anyhow, please allow me to explain the whole "tar" thing. I regret it
> because no one even addressed the DDT (Dedup Table) part of it.
>
> You see, I wanted to use ZFS as a deduplication disk target for my
> backup applications and use the native replication capabilities of
> ZFS to replicate the virtual backup cartridges. All modern backup
> apps leverage disk as a backup target, but some don't do replication.
>
> My idea was to use ZFS to do this. However, after testing I came to
> the realization that ZFS deduplication is NOT ideal for "deduping"
> third party backup streams. From what I read, this is due to the fact
> that backup applications put their own metadata in the streams and
> throw off the block alignment. Products like Data Domain and Quantum
> DXi use a variable length block and are developed towards
> deduplicating backup application streams. ZFS does OK, but nothing in
> comparison to the dedup ratio seen on these aforementioned
> appliances. I used tar as an example and should have been more
> specific. I understand what you are saying about replicating every 15
> minutes etc. However, since backup applications create huge files, an
> incremental send would need to send the newly created huge file.. At
> least that is how I understand it; I may be incorrect.
> In my testing this was the case, but perhaps my syntax was not
> correct.
>
> In any case, deduplication appliances when replicating only send the
> changed blocks that don't exist on the target side. To do this they
> have to have knowledge of what exists in the target side "block
> pool", "dedup hash table", or whatever it may be called.
>
> From what I understand, a ZFS file system on the source side has no
> idea of what exists on the target side. I also understand, and maybe
> incorrectly, that zfs send -D only eliminates duplicate blocks in the
> stream it is sending and does not account for a block that may
> already exist at the target.
>
> As an example, let's say I am using a backup app like Amanda. I do a
> full backup every day to a ZFS based disk target. Every day after the
> backup completes I do a -- "zfs send -D -i {snap_yesterday}
> {snap_today} | ssh DESTINATION zfs recv DEST_FS". Now each day's full
> backup will only have maybe a 1% change rate, and this will be
> reflected on the source side file system. So if I had 5 days of 2 GB
> full backups, the source file system will show maybe 3GB Alloc in the
> zpool list output. However, since the source does not know about
> duplicate blocks on the target side from yesterday's backup, it sends
> the entire 2GB full backup from today, only removing any duplicate
> blocks that exist in the stream it is sending. The difference with a
> dedup appliance is that it is aware of duplicate blocks on the target
> side and won't send them.
>
> This is the reason my original question was asking if there were any
> plans to implement a "global DDT" or dedup table to make the target
> aware of the destination duplicate blocks so that only unique blocks
> are transferred.
>
> Am I incorrect in my understanding of the ZFS DDT being unique to
> each ZFS file system/pool?
>
> Thanks
> Jimmy
>
> On Fri, Dec 2, 2011 at 12:04 PM, Peter Maloney wrote:
> > On 02.12.2011 15:50, Jeremy Chadwick wrote:
> >> On Fri, Dec 02, 2011 at 03:27:03PM +0100, Michel Le Cocq wrote:
> >>> it's just me or there is no attachment ?
> >> The mailing list stripped the attachment. The previous individual will
> >> need to put it up on the web somewhere.
> >>
> > It is possible that I forgot to attach it. I assumed it would be
> > stripped off but the ones in the to/cc would get it.
> >
> > Here it is on the company website:
> >
> > http://www.brockmann-consult.de/peter2/zfs.tgz
> >
> > Disclaimer/notes:
> > -provided as is... might destroy your system; furthermore, I am not
> > responsible for bodily injury nor nuclear war that may result from misuse
> > -there are no unit tests, and no documentation other than a few comments
> > that are possibly only coherent when I read them. For example, it says
> > that it does it recursively and rolls back the destination dataset, but
> > there are a few undocumented cases I can't remember, when I needed to do
> > something manual like delete a snapshot or destroy a dataset. Maybe
> > that is all in the past. I don't know.
> > -zfs_repl2.bash is the one that makes snapshots and replicates, which
> > I wrote myself. The other, ksh one is the Oracle one I linked above, and
> > the .sh version of it was just what I was working on to try to make it
> > work reliably, before redoing it all myself (reinventing the wheel is
> > indeed fun).
> > -especially beware of deleteOldSnapshots.bash, which is not well
> > tested and not used yet (and deleteEmptySnapshots.bash, which does not
> > work and I believe cannot work).
> > -granted, transferable, your choice of any present or future version
> > of the BSD or GPL license
> >
> > and another note: I meant to study these, which might be better versions
> > of the same thing, or something different, but never got around to it:
> > /usr/ports/sysutils/zfs-replicate/
> > /usr/ports/sysutils/zfsnap/
> > /usr/ports/sysutils/zfs-periodic
> >
> > _______________________________________________
> > freebsd-fs@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>

What we do at work is keep everything simple: we rsync all files off
the box to the archiver box, and at the end of every rsync run we snap
the filesystem for that backup client. This provides a very efficient
incremental-forever solution and works with all unix hosts, zfs aware
or not.

There are some cases where we don't use rsync, though. One of these is
mysql. Rsyncing the data dir would be pointless, as it would result in
inconsistent backups. As all our mysql servers are zfs based, we have
a script that either stops the slave or issues a global write lock,
depending on whether the database is a slave or a master; it then
flushes, snaps the mysql-data fs, and removes the lock or restarts the
slave. We then just do a zfs incremental send on the data filesystem.
Again, this is very efficient.

More importantly, though, this solution allowed us to reduce our
netbackup costs considerably, as we reduced the number of clients from
many to not many: the only clients afterwards were the archivers,
which we dumped to tape. This was more for insurance purposes than for
any operational reason.
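The rsync-then-snapshot loop described above might look roughly like
the sketch below. The client names and the "archive" pool are invented
for illustration, and the script only prints the commands (RUN=echo)
so nothing is actually copied or snapshotted here.

```shell
#!/bin/sh
# Sketch of an incremental-forever backup loop: pull each client's files
# with rsync, then snapshot that client's dataset so every run is kept.
# "web1 db1 mail1" and the "archive" pool are made-up example names.
RUN="echo"   # dry run: print commands; set RUN="" on a real archiver
BACKUP_POOL="archive"
STAMP="$(date -u +%Y-%m-%dT%H:%M)"

for client in web1 db1 mail1; do
    dataset="${BACKUP_POOL}/${client}"
    # --delete keeps the copy exact, so each snapshot reflects the
    # client's state at that moment (pseudo-filesystems excluded).
    $RUN rsync -a --delete --exclude /proc --exclude /sys --exclude /dev \
        "root@${client}:/" "/${dataset}/"
    # One snapshot per rsync run gives the incremental-forever history.
    $RUN zfs snapshot "${dataset}@${STAMP}"
done
```

One dataset per client keeps each client's snapshot history
independent, which matches the per-client snapshots described above.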
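The mysql quiesce-and-snapshot step could be sketched as below. This
is a guess at the shape of such a script, not the poster's actual one:
the dataset name and the hard-coded role are placeholders, and the
final command is printed rather than executed. One subtlety worth
noting: FLUSH TABLES WITH READ LOCK only holds while the issuing
client session stays open, so a real script must take the snapshot
from within that same session (e.g. via the mysql client's "system"
command, as sketched here).

```shell
#!/bin/sh
# Hypothetical sketch of the procedure described above: quiesce mysql,
# snapshot the data filesystem, resume. Names are invented placeholders,
# and echo is used so nothing actually runs here.
DATASET="tank/mysql-data"                      # placeholder dataset name
SNAP="${DATASET}@backup-$(date -u +%Y%m%d%H%M)"
ROLE="master"                                  # a real script detects this

if [ "$ROLE" = "slave" ]; then
    QUIESCE="STOP SLAVE SQL_THREAD"            # stop applying replication
    RESUME="START SLAVE SQL_THREAD"
else
    QUIESCE="FLUSH TABLES WITH READ LOCK"      # global write lock on master
    RESUME="UNLOCK TABLES"
fi

# Lock, snapshot and unlock must all happen in ONE mysql session,
# because the read lock is dropped when the session ends:
echo mysql -e "\"${QUIESCE}; SYSTEM zfs snapshot ${SNAP}; ${RESUME}\""
# Afterwards the snapshot can be replicated with an incremental zfs send,
# as elsewhere in this thread.
```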