From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 00:00:31 2011
Date: Sat, 1 Jan 2011 11:00:19 +1100
From: Peter Jeremy
To: Adam Stylinski
Cc: freebsd-stable@freebsd.org
Subject: Re: slow ZFS on FreeBSD 8.1
Message-ID: <20110101000019.GC48579@server.vk2pj.dyndns.org>
In-Reply-To: <20101230073130.GA55431@zephyr.adamsnet>
List-Id: Production branch of FreeBSD source code

On 2010-Dec-30 02:31:30 -0500, Adam Stylinski wrote:
>I can tell you what the problem is right now, actually.  ZFS performs
>very poorly on low performance CPUs (i.e. your Atom N330).

I would disagree.  In this case, the op's most serious problem is a bug in
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:arc_memory_throttle()
which is leading to ARC starvation.  The direct effect of this is very
poor ZFS I/O performance.  It can be identified by very high "inactive"
and possibly "cache" memory (as reported by 'systat -v' or top) as well
as very high kstat.zfs.misc.arcstats.memory_throttle_count

This bug was fixed in r210427 on -current, r211599 on 8.x and r211623
on 7.x.

> Try the
>same system with a different CPU and you'll get a different result.

Not until the above bug is fixed.  That said, ZFS is far more CPU
intensive than UFS and a more powerful CPU may help - especially if
you want gzip compression and/or sha256 checksumming.
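
The counter named above can be read with sysctl(8).  A minimal sketch of
such a check follows.  The sysctl name is the one given in this message;
the helper name arc_throttle_report is hypothetical, and treating any
nonzero count as worth a closer look is an assumption (the message only
says "very high", without giving a threshold).

```sh
#!/bin/sh
# arc_throttle_report COUNT  (hypothetical helper, not part of FreeBSD)
# Print a one-line verdict for a given value of
# kstat.zfs.misc.arcstats.memory_throttle_count.
# Assumption: any nonzero count is reported; how high is "very high"
# has to be judged against uptime and workload.
arc_throttle_report() {
    count=$1
    if [ "$count" -gt 0 ]; then
        echo "memory_throttle_count=$count: throttling has occurred"
    else
        echo "memory_throttle_count=$count: no throttling recorded"
    fi
}

# On a FreeBSD machine with ZFS loaded, feed it the live counter:
#   arc_throttle_report "$(sysctl -n kstat.zfs.misc.arcstats.memory_throttle_count)"
```

Comparing the value before and after a slow I/O run is likely more
telling than a single reading, since the counter only ever increases.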
-- 
Peter Jeremy

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 00:01:23 2011
From: Chris Forgeron
To: "freebsd-stable@freebsd.org"
Date: Fri, 31 Dec 2010 20:01:17 -0400
Thread-topic: ZFS v28 and zil_disable
Subject: RE: ZFS v28 and zil_disable

Oh, and then I read what I post, and notice that the zil_disable parts are in the .orig files, from the patch. :-)

Oh well, I guess I'll just have to invest in a proper ZIL device.

--
Christopher Forgeron, B.Sc., CCS, A+, N+
ACSI Consulting, Inc / Aardvark Computer Solutions, Inc.
email: chris@acsi.ca
2070 Oxford Street, Suite 100, Halifax NS B3L-2T2
Tel: 902-425-2686  Fax: 902-484-7909

-----Original Message-----
From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Chris Forgeron
Sent: December-31-10 6:01 PM
To: freebsd-stable@freebsd.org
Subject: ZFS v28 and zil_disable

BTW, I'm noticing the removal of vfs.zfs.zil_disable as well - It's not listed as a sysctl when I check vfs.zfs, but I see it's still in the source code;

In usr/src/sys/cddl/ :
# grep -r zil_disable *
cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h.orig:extern int zil_disable;
cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:int zil_disable = 0; /* disable intent logging */
cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:TUNABLE_INT("vfs.zfs.zil_disable", &zil_disable);
cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:SYSCTL_INT(_vfs_zfs, OID_AUTO, zil_disable, CTLFLAG_RW, &zil_disable, 0,
cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c.orig: if (zil_disable) {
cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c.orig: if (bp->bio_cmd == BIO_FLUSH && !zil_disable)

I know Sun was trying to move away from allowing people to disable
the ZIL, but was this by design in the FreeBSD port, or are we just missing some code to link the sysctl up with the code to easily disable the ZIL?

I'll try setting zil_disable=1 in the source tomorrow and recompile to see if it works. It's such a huge speed increase for some operations (80MB/sec with ZIL, 450 MB/sec without ZIL) that I still use zil_disable.

I'll also have to check my 9.0-CUR v28 patch, although I assume it's the same.

-----Original Message-----
From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Jean-Yves Avenard
Sent: December-27-10 1:31 AM
To: jhell
Cc: freebsd-stable@freebsd.org
Subject: Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic

Jean-Yves
PS: saving my 5MB files over the network, went from 40-55s with v15 to a constant 16s with v28... I can't test with ZIL completely disabled, it seems that vfs.zfs.zil_disable has been removed, and so did vfs.zfs.write_limit_override

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 00:17:10 2011
Date: Fri, 31 Dec 2010 17:54:23 -0600
From: Scot Hetzel
To: Chris Forgeron
Cc: "freebsd-stable@freebsd.org"
Subject: Re: ZFS v28 and zil_disable

On Fri, Dec 31, 2010 at 4:00 PM, Chris Forgeron wrote:
> BTW, I'm noticing the removal of vfs.zfs.zil_disable as well - It's not listed as a sysctl when I check vfs.zfs, but I see it's still in the source code;
>
> In usr/src/sys/cddl/ :
> # grep -r zil_disable *
> cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h.orig:extern int zil_disable;
> cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:int zil_disable = 0;      /* disable intent logging */
> cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:TUNABLE_INT("vfs.zfs.zil_disable", &zil_disable);
> cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:SYSCTL_INT(_vfs_zfs, OID_AUTO, zil_disable, CTLFLAG_RW, &zil_disable, 0,
> cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c.orig:   if (zil_disable) {
> cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c.orig:         if (bp->bio_cmd == BIO_FLUSH && !zil_disable)
>
>

All the files above show that the original files (*.orig) have the
zil_disable.  Your grep of the sources shows that zil_disable was
removed from zil.h, zil.c, zfs_vfsops.c and zvol.c.

I looked at pjd's perforce repository and found that zil_disable was
renamed to zil_replay_disable.

Scot

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 00:32:48 2011
Date: Fri, 31 Dec 2010 16:32:43 -0800
From: Jeremy Chadwick
To: Chris Forgeron
Cc: "freebsd-stable@freebsd.org"
Subject: Re: ZFS v28 and zil_disable
Message-ID: <20110101003243.GA9124@icarus.home.lan>

On Fri, Dec 31, 2010 at 08:01:17PM -0400, Chris Forgeron wrote:
> Oh, and then I read what I post, and notice that the zil_disable parts are in the .orig files, from the patch. :-)
>
> Oh well, I guess I'll just have to invest in a proper ZIL device.
>
> -----Original Message-----
> From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Chris Forgeron
> Sent: December-31-10 6:01 PM
> To: freebsd-stable@freebsd.org
> Subject: ZFS v28 and zil_disable
>
> BTW, I'm noticing the removal of vfs.zfs.zil_disable as well - It's not listed as a sysctl when I check vfs.zfs, but I see it's still in the source code;
>
> In usr/src/sys/cddl/ :
> # grep -r zil_disable *
> cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zil.h.orig:extern int zil_disable;
> cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:int zil_disable = 0; /* disable intent logging */
> cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:TUNABLE_INT("vfs.zfs.zil_disable", &zil_disable);
> cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c.orig:SYSCTL_INT(_vfs_zfs, OID_AUTO, zil_disable, CTLFLAG_RW, &zil_disable, 0,
> cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c.orig: if (zil_disable) {
> cddl/contrib/opensolaris/uts/common/fs/zfs/zvol.c.orig: if (bp->bio_cmd == BIO_FLUSH && !zil_disable)
>
>
> I know Sun was trying to move away from allowing people to disable the ZIL, but was this by design in the FreeBSD port, or are we just missing some code to link the sysctl up with the code to easily disable the ZIL?
>
> I'll try setting zil_disable=1 in the source tomorrow and recompile to see if it works. It's such a huge speed increase for some operations (80MB/sec with ZIL, 450 MB/sec without ZIL) that I still use zil_disable.
>
> I'll also have to check my 9.0-CUR v28 patch, although I assume it's the same.
>
>
> -----Original Message-----
> From: owner-freebsd-stable@freebsd.org [mailto:owner-freebsd-stable@freebsd.org] On Behalf Of Jean-Yves Avenard
> Sent: December-27-10 1:31 AM
> To: jhell
> Cc: freebsd-stable@freebsd.org
> Subject: Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
>
> Jean-Yves
> PS: saving my 5MB files over the network, went from 40-55s with v15 to a constant 16s with v28... I can't test with ZIL completely disabled, it seems that vfs.zfs.zil_disable has been removed, and so did vfs.zfs.write_limit_override

You shouldn't disable the ZIL[1], and you don't need a "proper ZIL
device" for the ZIL to work.

FreeBSD ZFS, like its Solaris counterpart, offers the ability for the
ZIL to be associated with one or more dedicated devices[1].  These are
referred to as "log" devices (not to be confused with "cache" devices).

If you use a dedicated device for your ZIL, be aware that you should
probably use two[2] devices (or, if a single physical device, two
slices); otherwise you risk data integrity problems.

Switching over to a brief mention of "cache" devices, there is one
case[3] of someone experiencing high CPU usage when using either a USB
flash drive or an SSD[4] as a cache device, where rebooting the system
apparently solved the problem (we do not know if this was the case
permanently or temporarily).

[1]: http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29
[2]: http://forums.freebsd.org/showthread.php?t=18221
[3]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-November/060014.html
[4]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-November/060076.html

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 03:02:39 2011
Date: Sat, 1 Jan 2011 14:02:28 +1100
From: Peter Jeremy
To: Jeremy Chadwick
Message-ID: <20110101030228.GD48579@server.vk2pj.dyndns.org>
In-Reply-To: <20101231234747.GA8171@icarus.home.lan>
Cc: freebsd-stable@freebsd.org
Subject: Re: slow ZFS on FreeBSD 8.1

On 2010-Dec-31 15:47:47 -0800, Jeremy Chadwick wrote:
>Is your ZFS root filesystem associated with a pool that's mirrored or
>using raidzX?

Currently, mirrored.  I'm considering raidz1 at home.  Note that my
work system is a single pool, whereas I'll use a separate pool for
root at home.

> What about mismatched /boot content (ZFS vs. UFS)?

Can you give me an example of what you mean here?

> What about booting into single-user mode?

I haven't run into any problems here, though I agree that starting
ZFS in single-user mode is a lot messier than starting UFS.

>error/mistake).  Is it worth the risk?  Most administrators don't have
>the tolerance for stuff like that in the middle of a system upgrade or
>what not; they should be able to follow exactly what's in the handbook,
>to a tee.

I've been using FreeBSD for long enough that I'm confident to upgrade
or similar without blindly following a process.  But I agree that
FreeBSD should be usable without needing to be a guru.

>There's a link to www.dan.me.uk at the bottom of the above Wiki page
>that outlines "the madness" that's required to configure the setup, all
>of which has to be done by hand.  I don't know many administrators who
>are going to tolerate this when deploying numerous machines, especially
>when compounded by the complexities mentioned above.

Root on ZFS is still very bespoke.  I agree there's no way you could
roll it out across lots of machines at present but I'm happy to
hand-craft installs on a few machines.
Hopefully, son-of-sysinstall will support ZFS installs (one
prerequisite is someone being willing to do the work).

>The mmap(2) and sendfile(2) complexities will bite a junior or
>mid-level SA in the butt too -- they won't know why software starts
>failing or behaving oddly (FreeBSD ftpd is a good example).  It just so
>happens that Apache, out-of-the-box, comes with mmap and sendfile use
>disabled.

mmap(2) is a design problem with ZFS - it's present on Solaris as
well.  IMHO, it's the biggest flaw in ZFS.  The sendfile(2) issues
haven't bitten me so I haven't studied them as much but I'm aware
that some fixes were committed recently.

Oh and one root-on-ZFS gotcha that I missed is the lack of gzip
support.  I spent about ½ day tracking that down - not helped by the
lack of any documentation or a useful error message (though there is
a comment in the code when you eventually track it down).

-- 
Peter Jeremy

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 05:11:16 2011
From: Vladimir Vasilenko aka jeltoesolnce
To: freebsd-stable@freebsd.org
Organization: Consultant
Date: Sat, 01 Jan 2011 06:39:53 +0200
Message-ID: <1293856793.1768.2.camel@gray.homenetwork>
Reply-To: vladimir@shumbely.com
Subject: Happy New Year!

Subj.)

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 12:41:54 2011
Date: Sat, 1 Jan 2011 23:41:49 +1100
From: Peter Jeremy
To: freebsd-stable@freebsd.org
Message-ID: <20110101124149.GE48579@server.vk2pj.dyndns.org>
Subject: Specifying root mount options on diskless boot.

[I'm not sure if -stable is the best list for this but anyway...]

I'm trying to convert an old laptop running FreeBSD 8.0 into a diskless
client (since its internal HDD is growing bad spots faster than I can
repair them).  I have it pxebooting nicely and running with an NFS root
but it then reports locking problems: devd, syslogd, moused (and maybe
others) lock their PID file to protect against multiple instances.
Unfortunately, these daemons all start before statd/lockd and so the
locking fails and reports "operation not supported".

It's not practical to reorder the startup sequence to make lockd start
early enough (I've tried).

Since the filesystem is reserved for this client, there's no real need
to forward lock requests across the wire and so specifying "nolockd"
would be another solution.  Looking through sys/nfsclient/bootp_subr.c,
DHCP option 130 should allow NFS mount options to be specified (though
it's not clear that the relevant code path is actually followed because
I don't see the associated printf()s anywhere on the console).  After
getting isc-dhcpd to forward this option (made more difficult because
its documentation is incorrect), it still doesn't work.

Understanding all this isn't helped by kenv(8) reporting three
different sets of root filesystem options:
boot.nfsroot.path="/tank/m3"
boot.nfsroot.server="192.168.123.200"
dhcp.option-130="nolockd"
dhcp.root-path="192.168.123.200:/tank/m3"
vfs.root.mountfrom="nfs:server:/tank/m3"
vfs.root.mountfrom.options="rw,tcp,nolockd"

And the console also reports conflicting root definitions:
Trying to mount root from nfs:server:/tank/m3
NFS ROOT: 192.168.123.200:/tank/m3

Working through all these:
boot.nfsroot.* appears to be initialised by sys/boot/i386/libi386/pxe.c
but, whilst nfsclient/nfs_diskless.c can parse boot.nfsroot.options,
there's no code to initialise that kenv name in pxe.c

dhcp.* appears to be initialised by lib/libstand/bootp.c - which does
include code to populate boot.nfsroot.options (using vendor specific
DHCP option 20) but this code is not compiled in.  Further studying of
bootp.c shows that it's possible to initialise arbitrary kenv's using
DHCP options 246-254 - but the DHCPDISCOVER packets do not request
these options so they don't work without special DHCP server
configuration (to forward options that aren't requested).

vfs.root.* is parsed out of /etc/fstab but, other than being reported
in the console message above, it doesn't appear to be used in this
environment (it looks like the root entry can be commented out of
/etc/fstab without problem).

My final solution was to specify 'boot.nfsroot.options="nolockd"' in
loader.conf - and this seems to actually work.

It seems rather unfortunate that FreeBSD has code to allow NFS root
mount options to be specified via DHCP (admittedly in several
incompatible ways) but none actually work.  A quick look at -current
suggests that the situation there remains equally broken.

Has anyone else tried to use any of this?  And would anyone be
interested in trying to make it actually work?
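
For anyone wanting to confirm that the loader.conf setting described
above actually reached the kernel environment, here is a minimal
sketch.  The variable name and value are the ones from this message;
the helper name nfsroot_options is hypothetical, and the sketch assumes
kenv(1) prints one NAME="value" pair per line, as in the output quoted
above.

```sh
#!/bin/sh
# The loader.conf line reported to work in this message:
#   boot.nfsroot.options="nolockd"

# nfsroot_options  (hypothetical helper)
# Read kenv-style NAME="value" lines on stdin and print the value of
# boot.nfsroot.options; print nothing if the variable is not set.
nfsroot_options() {
    sed -n 's/^boot\.nfsroot\.options="\(.*\)"$/\1/p'
}

# On the diskless client:
#   kenv | nfsroot_options
```

Empty output would suggest the loader never set the variable, which is
the situation described above for the DHCP-based methods.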
-- 
Peter Jeremy

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 13:56:30 2011
Message-ID: <4D1F3287.6080708@quip.cz>
Date: Sat, 01 Jan 2011 14:56:23 +0100
From: Miroslav Lachman <000.fbsd@quip.cz>
To: John Baldwin
In-Reply-To: <201012281123.53669.jhb@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: /libexec/ld-elf.so.1: Cannot execute objects on /

List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Jan 2011 13:56:30 -0000 John Baldwin wrote: > On Saturday, December 25, 2010 6:43:25 am Miroslav Lachman wrote: >> John Baldwin wrote: >>> On Saturday, December 11, 2010 11:51:41 am Miroslav Lachman wrote: >>>> Miroslav Lachman wrote: >>>>> Garrett Cooper wrote: >>>>>> 2010/4/20 Miroslav Lachman<000.fbsd@quip.cz>: >>>>>>> I have large storage partition (/vol0) mounted as noexec and nosuid. >>>>>>> Then >>>>>>> one directory from this partition is mounted by nullfs as "exec and >>>>>>> suid" so >>>>>>> anything on it can be executed. >>>>>>> >>>>>>> The directory contains full installation of jail. Jail is running >>>>>>> fine, but >>>>>>> some ports (PHP for example) cannot be compiled inside the jail with >>>>>>> message: >>>>>>> >>>>>>> /libexec/ld-elf.so.1: Cannot execute objects on / >>>>>>> >>>>>>> The same apply to executing of apxs >>>>>>> >>>>>>> root@rainnew ~/# /usr/local/sbin/apxs -q MPM_NAME >>>>>>> /libexec/ld-elf.so.1: Cannot execute objects on / >>>>>>> >>>>>>> apxs:Error: Sorry, no shared object support for Apache. >>>>>>> apxs:Error: available under your platform. Make sure. >>>>>>> apxs:Error: the Apache module mod_so is compiled into. >>>>>>> apxs:Error: your server binary '/usr/local/sbin/httpd'.. >>>>>>> >>>>>>> (it should return "prefork") >>>>>>> >>>>>>> So I think there is some bug in checking the mountpoint options, >>>>>>> where the >>>>>>> check is made on "parent" of the nullfs instead of the nullfs target >>>>>>> mountpoint. >>>>>>> >>>>>>> It is on 6.4-RELEASE i386 GENERIC. I did not test it on another release. 
>>>>>>> >>>>>>> This is list of related mount points: >>>>>>> >>>>>>> /dev/mirror/gm0s2d on /vol0 (ufs, local, noexec, nosuid, soft-updates) >>>>>>> /vol0/jail/.nullfs/rain on /vol0/jail/rain_new (nullfs, local) >>>>>>> /usr/ports on /vol0/jail/rain_new/usr/ports (nullfs, local) >>>>>>> devfs on /vol0/jail/rain_new/dev (devfs, local) >>>>>>> >>>>>>> If I changed /vol0 options to (ufs, local, soft-updates) the above >>>>>>> error is >>>>>>> gone and apxs / compilation works fine. >>>>>>> >>>>>>> Can somebody look at this problem? >>>>>> >>>>>> Can you please provide output from ktrace / truss for the issue? >>>>> >>>>> I did >>>>> # ktrace /usr/local/sbin/apxs -q MPM_NAME >>>>> >>>>> The output is here http://freebsd.quip.cz/ld-elf/ktrace.out >>>>> >>>>> Let me know if you need something else. >>>>> >>>>> Thank you for your interest! >>>> >>>> The problem is still there in FreeBSD 8.1-RELEASE amd64 GENERIC (and in >>>> 7.x). >>>> >>>> Can somebody say if this is a bug or an expected "feature"? >>> >>> I think this is the expected behavior as nullfs is simply re-exposing /vol0 >>> and it shouldn't be able to create a more privileged mount than the underlying >>> mount I think (e.g. a read/write nullfs mount of a read-only filesystem would >>> not make the underlying files read/write). It can be used to provide less >>> privilege (e.g. a readonly nullfs mount of a read/write filesystem does not >>> allow writes via the nullfs layer). >>> >>> That said, I'm not sure exactly where the permission check is failing. >>> execve() only checks MNT_NOEXEC on the "upper" vnode's mountpoint (i.e. the >>> nullfs mountpoint) and the VOP_ACCESS(.., V_EXEC) check does not look at >>> MNT_NOEXEC either. >>> >>> I do think there might be bugs in that a nullfs mount that specifies noexec or >>> nosuid might not enforce the noexec or nosuid bits if the underlying mount >>> point does not have them set (from what I can see). >> >> Thank you for your explanation. 
>> Then it is strange, that there is bug,
>> that allows execution on originally non executable mountpoint.
>> It should be mentioned in the bugs section of the mount_nullfs man page.
>>
>> It would be useful, if 'mount' output shows inherited options for nullfs.
>>
>> If parent is:
>> /dev/mirror/gm0s2d on /vol0 (ufs, local, noexec, nosuid, soft-updates)
>>
>> Then nullfs line will be:
>> /vol0/jail/.nullfs/rain on /vol0/jail/rain_new (nullfs, local, noexec,
>> nosuid)
>>
>> instead of just
>> /vol0/jail/.nullfs/rain on /vol0/jail/rain_new (nullfs, local)
>>
>>
>> Then I can understand what is expected behavior, but our current state
>> is half working, if I can execute scripts and binaries, run jail on it,
>> but can't execute "apxs -q MPM_NAME" and few others.
>
> Hmm, so I was a bit mistaken. The kernel is not failing to exec the binary.
> Instead, rtld is reporting the error here:
>
> static Obj_Entry *
> do_load_object(int fd, const char *name, char *path, struct stat *sbp,
>     int flags)
> {
>     Obj_Entry *obj;
>     struct statfs fs;
>
>     /*
>      * but first, make sure that environment variables haven't been
>      * used to circumvent the noexec flag on a filesystem.
>      */
>     if (dangerous_ld_env) {
>         if (fstatfs(fd, &fs) != 0) {
>             _rtld_error("Cannot fstatfs \"%s\"", path);
>             return NULL;
>         }
>         if (fs.f_flags & MNT_NOEXEC) {
>             _rtld_error("Cannot execute objects on %s\n", fs.f_mntonname);
>             return NULL;
>         }
>     }
>
> I wonder if the fstatfs is falling down to the original mount rather than
> being caught by nullfs.
>
> Hmm, nullfs' statfs method returns the flags for the underlying mount, not
> the flags for the nullfs mount. This is possibly broken, but it is the
> behavior nullfs has always had and the behavior it still has on other BSDs.

I am sorry, I am not a programmer, so the code doesn't tell me much.
Does it mean "we must leave it in current state" (for compatibility with
other BSDs) or can it be fixed in the future?
I can't tell if it would be better to disable all exec operations if the
parent mount is noexec, or to allow all exec operations. I just think that
the current state is broken if something can be executed and something
can't.

And again, thank you for your time, explanation and interest in this
problem!

Miroslav Lachman

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 15:11:49 2011
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8F8E106566C for ; Sat, 1 Jan 2011 15:11:49 +0000 (UTC) (envelope-from ml@my.gd)
Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 76BCD8FC12 for ; Sat, 1 Jan 2011 15:11:47 +0000 (UTC)
Received: by wwf26 with SMTP id 26so12123736wwf.31 for ; Sat, 01 Jan 2011 07:11:47 -0800 (PST)
Received: by 10.227.146.9 with SMTP id f9mr7067090wbv.209.1293894707081; Sat, 01 Jan 2011 07:11:47 -0800 (PST)
Received: from dfleuriot.local (did75-17-88-165-130-96.fbx.proxad.net [88.165.130.96]) by mx.google.com with ESMTPS id m13sm12620611wbz.9.2011.01.01.07.11.44 (version=SSLv3 cipher=RC4-MD5); Sat, 01 Jan 2011 07:11:46 -0800 (PST)
Message-ID: <4D1F442F.8060801@my.gd>
Date: Sat, 01 Jan 2011 16:11:43 +0100
From: Damien Fleuriot
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7
MIME-Version: 1.0
To: freebsd-stable@freebsd.org
References: <4D1C6F90.3080206@my.gd>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 01 Jan 2011 15:11:50 -0000

This is a home machine so I am afraid I
won't have backups in place, if only because I just won't have another machine with as much disk space. The data is nothing critically important anyway, movies, music mostly. My objective here is getting more used to ZFS and seeing how performance gets. I remember getting rather average performance on v14 but Jean-Yves reported good performance boosts from upgrading to v15. Will try this out when the disks arrive :) Thanks for the pointers guys. On 12/30/10 6:49 PM, Ronald Klop wrote: > On Thu, 30 Dec 2010 12:40:00 +0100, Damien Fleuriot wrote: > >> Hello list, >> >> >> >> I currently have a ZFS zraid1 with 4x 1.5TB drives. >> The system is a zfs-only FreeBSD 8.1 with zfs version 14. >> >> I am concerned that in the event a drive fails, I won't be able to >> repair the disks in time before another actually fails. >> >> >> >> >> I wish to reinstall the OS on a dedicated drive (possibly SSD, doesn't >> matter, likely UFS) and dedicate the 1.5tb disks to storage only. >> >> I have ordered 5x new drives and would like to create a new zraid2 >> mirrored pool. >> >> Then I plan on moving data from pool1 to pool2, removing drives from >> pool1 and adding them to pool2. >> >> >> >> My questions are as follows: >> >> With a total of 9x 1.5TB drives, should I be using zraid3 instead of >> zraid2 ? I will not be able to add any more drives so unnecessary parity >> drives = less storage room. >> >> What are the steps for properly removing my drives from the zraid1 pool >> and inserting them in the zraid2 pool ? >> >> >> Regards, >> >> >> dfl > > Make sure you have spare drives so you can swap in a new one quickly and > have off-line backups in case disaster strikes. Extra backups are always > nice. Disks are not the only parts which can break and damage your data. > > Ronald. 
> _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 17:28:27 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AB7371065693 for ; Sat, 1 Jan 2011 17:28:27 +0000 (UTC) (envelope-from jyavenard@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id 7314F8FC0A for ; Sat, 1 Jan 2011 17:28:27 +0000 (UTC) Received: by iyb26 with SMTP id 26so11554221iyb.13 for ; Sat, 01 Jan 2011 09:28:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=NxSVcd5JmgwPf77rMTreH5LchZDDbfZWaKS5QPiYGhc=; b=sqdcNRXFZE3jGDZ13wIP+/Mq/pjYsQ8QYmKTlJQ25IMAi36TybbXENwislmiBYRqcM OUJ9xdwyAcNwqWCL8GDWahQXx0JXAq0wPAOB9+zcJAdG97R+2aV6gppyfV5EQYSuz+0R k+jmraaYw3hdhKEKThv4Jp3hcpI750pHfrC84= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=uwzIqNo9IeLXpUjMi5ncJdhKuUJM8cFbUWFID42FXr/0XXW16R8LfvJG7CfHi09deY 9vyiF74gAcH/i/CqNkjFy50qU/0yBXDEzIKZRgQJjdd+zMUccZtXzQuft3qgBgBeAQdT MdnnHaZnJxdxHuKGLHkH5jqyJcLgMMa5zECNo= MIME-Version: 1.0 Received: by 10.42.177.196 with SMTP id bj4mr19038648icb.129.1293902907109; Sat, 01 Jan 2011 09:28:27 -0800 (PST) Received: by 10.42.172.69 with HTTP; Sat, 1 Jan 2011 09:28:27 -0800 (PST) In-Reply-To: <4D1F442F.8060801@my.gd> References: <4D1C6F90.3080206@my.gd> <4D1F442F.8060801@my.gd> Date: Sun, 2 Jan 2011 04:28:27 +1100 Message-ID: From: Jean-Yves Avenard To: Damien Fleuriot Content-Type: 
text/plain; charset=ISO-8859-1 Cc: freebsd-stable@freebsd.org Subject: Re: ZFS - moving from a zraid1 to zraid2 pool with 1.5tb disks X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Jan 2011 17:28:27 -0000 On 2 January 2011 02:11, Damien Fleuriot wrote: > I remember getting rather average performance on v14 but Jean-Yves > reported good performance boosts from upgrading to v15. that was v28 :) saw no major difference between v14 and v15. JY From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 18:36:43 2011 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D648106566C; Sat, 1 Jan 2011 18:36:43 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id E2A588FC0C; Sat, 1 Jan 2011 18:36:42 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 241586B3B10; Sat, 1 Jan 2011 19:18:53 +0100 (CET) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.2 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 13.8405] X-CRM114-CacheID: sfid-20110101_19185_C62F655B X-CRM114-Status: Good ( pR: 13.8405 ) X-Spambayes-Classification: ham; 0.00 Message-ID: <4D1F7008.3050506@fsn.hu> Date: Sat, 01 Jan 2011 19:18:48 +0100 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Martin Matuska References: <4D0A09AF.3040005@FreeBSD.org> In-Reply-To: <4D0A09AF.3040005@FreeBSD.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org, freebsd-stable@FreeBSD.org Subject: Re: New ZFSv28 patchset for 
 8-STABLE
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 01 Jan 2011 18:36:43 -0000

On 12/16/2010 01:44 PM, Martin Matuska wrote:
> Link to the patch:
>
> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>
>
I've used this:
http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz
on an amd64 server with 8 GB RAM, acting as a file server for
ftp/http/rsync. The content is mounted read-only with nullfs in jails,
and the daemons use sendfile (ftp and http).

The effects can be seen here:
http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
The exact moment of the switch can be seen on zfs_mem-week.png, where the
L2 ARC has been discarded.

What I see:
- increased CPU load
- decreased L2 ARC hit rate, decreased SSD (ad[46]), therefore increased
hard disk load (IOPS graph)

Maybe I could accept the higher system load as normal, because a lot of
things changed between v15 and v28 (although I was hoping that with the
same feature set it would require less CPU), but the L2ARC hit rate
dropping so radically seems to be a major issue somewhere.
As you can see from the memory stats, I have enough kernel memory to hold
the L2 headers, so the L2 devices got filled up to their maximum capacity.

Any ideas on what could cause these? I haven't upgraded the pool version
and nothing was changed in the pool or in the file system.
Thanks, From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 19:09:33 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A9A0E106564A; Sat, 1 Jan 2011 19:09:33 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-ww0-f42.google.com (mail-ww0-f42.google.com [74.125.82.42]) by mx1.freebsd.org (Postfix) with ESMTP id DFD5F8FC15; Sat, 1 Jan 2011 19:09:32 +0000 (UTC) Received: by wwi17 with SMTP id 17so12843527wwi.1 for ; Sat, 01 Jan 2011 11:09:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type; bh=K8i/jWVzO1r5x2IgYWNKM5hWXmxpDl7KRJTmSQMIPhM=; b=OL7gJgtgYQnCt86aUuCVQ4fUZl+lSzH1L8u7PwIAHk3iItaFDHadf2epnbnKsuqusH mzILAjLp9m5msX1nanlDXanuKxfBVdDat/MBwJvgc3UGux4kLE9XjJ6q9rCGlW/b0auN M8zQ4GZNUh20G027MDsB/4QC84y5yEBlbYofQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=iJC3ei/nOpKzH6Dwo7T9P8uzyzTU58iPhsnuSzzoBfXyBZCFXsz0MhqWFFoaMFv+Cf 2eAAKyDuxc7ZZ7BmujBXLPWumvkuuMBLVVyuidlRoN5yodr4jEBgCDM35YbirmdUCc93 Ad+S/SPn0YmM9AL0Or9udlFNxScS1FyZjmmVc= MIME-Version: 1.0 Received: by 10.227.137.203 with SMTP id x11mr10989261wbt.80.1293908971710; Sat, 01 Jan 2011 11:09:31 -0800 (PST) Sender: artemb@gmail.com Received: by 10.227.129.6 with HTTP; Sat, 1 Jan 2011 11:09:31 -0800 (PST) In-Reply-To: <4D1F7008.3050506@fsn.hu> References: <4D0A09AF.3040005@FreeBSD.org> <4D1F7008.3050506@fsn.hu> Date: Sat, 1 Jan 2011 11:09:31 -0800 X-Google-Sender-Auth: YVPJG4s-m-QcEYPXj1NRZyvNvRg Message-ID: From: Artem Belevich To: Attila Nagy Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Martin Matuska 
Subject: Re: New ZFSv28 patchset for 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Jan 2011 19:09:33 -0000 On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy wrote: > What I see: > - increased CPU load > - decreased L2 ARC hit rate, decreased SSD (ad[46]), therefore increased > hard disk load (IOPS graph) > ... > Any ideas on what could cause these? I haven't upgraded the pool version and > nothing was changed in the pool or in the file system. The fact that L2 ARC is full does not mean that it contains the right data. Initial L2ARC warm up happens at a much higher rate than the rate L2ARC is updated after it's been filled initially. Even accelerated warm-up took almost a day in your case. In order for L2ARC to warm up properly you may have to wait quite a bit longer. My guess is that it should slowly improve over the next few days as data goes through L2ARC and those bits that are hit more often take residence there. The larger your data set, the longer it will take for L2ARC to catch the right data. Do you have similar graphs from pre-patch system just after reboot? I suspect that it may show similarly abysmal L2ARC hit rates initially, too. 
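For what it's worth, here is a rough sketch of how a hit rate over an
interval could be computed from two samples of the cumulative L2ARC
hit/miss counters, so that warm-up progress can be compared day to day.
I'm assuming the counters are read from something like the
kstat.zfs.misc.arcstats.l2_hits and l2_misses sysctls; the struct and
function names below are made up purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* One sample of the two cumulative L2ARC counters (hypothetical type). */
struct l2arc_sample {
    uint64_t hits;      /* assumed source: kstat.zfs.misc.arcstats.l2_hits */
    uint64_t misses;    /* assumed source: kstat.zfs.misc.arcstats.l2_misses */
};

/*
 * Hit rate (0.0 .. 1.0) for the interval between two samples.
 * Returns -1.0 if a counter went backwards (e.g. a reboot between
 * samples) or if there were no L2ARC lookups in the interval.
 */
double
l2arc_interval_hit_rate(struct l2arc_sample before, struct l2arc_sample after)
{
    uint64_t hits, misses, total;
    double rate;

    if (after.hits < before.hits || after.misses < before.misses)
        return (-1.0);
    hits = after.hits - before.hits;
    misses = after.misses - before.misses;
    total = hits + misses;
    if (total == 0)
        return (-1.0);
    rate = (double)hits / (double)total;
    assert(rate >= 0.0 && rate <= 1.0);    /* sanity check on the result */
    return (rate);
}
```

Feeding consecutive samples (say, one per hour) into this gives the
per-interval rate rather than the since-boot average, which is the number
to watch while the cache warms up.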
--Artem From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 19:23:11 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3EAE106564A; Sat, 1 Jan 2011 19:23:11 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id B8A648FC18; Sat, 1 Jan 2011 19:23:10 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id 382B76B3E1A; Sat, 1 Jan 2011 20:23:09 +0100 (CET) X-Bogosity: Ham, tests=bogofilter, spamicity=0.001600, version=1.2.2 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 13.8090] X-CRM114-CacheID: sfid-20110101_20230_F10AFF58 X-CRM114-Status: Good ( pR: 13.8090 ) X-Spambayes-Classification: ham; 0.00 Message-ID: <4D1F7F1C.9090106@fsn.hu> Date: Sat, 01 Jan 2011 20:23:08 +0100 From: Attila Nagy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.23) Gecko/20090817 Thunderbird/2.0.0.23 Mnenhy/0.7.6.0 MIME-Version: 1.0 To: Artem Belevich References: <4D0A09AF.3040005@FreeBSD.org> <4D1F7008.3050506@fsn.hu> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Martin Matuska Subject: Re: New ZFSv28 patchset for 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Jan 2011 19:23:11 -0000 On 01/01/2011 08:09 PM, Artem Belevich wrote: > On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy wrote: >> What I see: >> - increased CPU load >> - decreased L2 ARC hit rate, decreased SSD (ad[46]), therefore increased >> hard disk load (IOPS graph) >> > ... >> Any ideas on what could cause these? 
>> I haven't upgraded the pool version and
>> nothing was changed in the pool or in the file system.
> The fact that L2 ARC is full does not mean that it contains the right
> data. Initial L2ARC warm up happens at a much higher rate than the
> rate L2ARC is updated after it's been filled initially. Even
> accelerated warm-up took almost a day in your case. In order for L2ARC
> to warm up properly you may have to wait quite a bit longer. My guess
> is that it should slowly improve over the next few days as data goes
> through L2ARC and those bits that are hit more often take residence
> there. The larger your data set, the longer it will take for L2ARC to
> catch the right data.
>
> Do you have similar graphs from pre-patch system just after reboot? I
> suspect that it may show similarly abysmal L2ARC hit rates initially,
> too.
>
>
Sadly no, but I remember seeing increasing hit rates as the cache grew,
which is why I wrote the email after one and a half days.
Currently it's at the same level as it was right after the reboot...

We'll see after a few days.

From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 20:48:39 2011
Return-Path:
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DB5D0106564A for ; Sat, 1 Jan 2011 20:48:39 +0000 (UTC) (envelope-from mikes@siralan.org)
Received: from mail.suso.org (mail.suso.org [66.244.94.5]) by mx1.freebsd.org (Postfix) with ESMTP id A63E28FC12 for ; Sat, 1 Jan 2011 20:48:39 +0000 (UTC)
Received: from c-71-194-154-137.hsd1.in.comcast.net (c-71-194-154-137.hsd1.in.comcast.net [71.194.154.137]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.suso.org (Postfix) with ESMTP id 632EC1B08A; Sat, 1 Jan 2011 20:48:39 +0000 (GMT)
Date: Sat, 1 Jan 2011 15:43:09 -0500 (EST)
From: "Michael L.
Squires" X-X-Sender: mikes@familysquires.net To: Jeremy Chadwick In-Reply-To: <20101230235022.GA78602@icarus.home.lan> Message-ID: <20110101153603.X21482@familysquires.net> References: <20101127184952.E90087@familysquires.net> <20101128160043.W11452@familysquires.net> <20101129023531.GB1380@michelle.cdnetworks.com> <20101220181005.P32987@familysquires.net> <20101221180543.GB5236@michelle.cdnetworks.com> <20101222135420.C47756@familysquires.net> <20101230181239.N2744@familysquires.net> <20101230235022.GA78602@icarus.home.lan> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Pyun YongHyeon , freebsd-stable@freebsd.org Subject: Re: bge driver regression in 7.4-PRERELEASE, Tyan S4881 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 01 Jan 2011 20:48:39 -0000 On Thu, 30 Dec 2010, Jeremy Chadwick wrote: > Please provide output from the following command, as root: > > pciconf -lbvc > > And only include the bge1 and bge0 devices in your output. Thanks. > This is the output, as root, using the kernel with the 10/7/2010 bge code (which works for me). I can provide the code with the 7.4-PRERELEASE kernel if you want that. OS is compiled as amd64. 
bge0@pci0:17:2:0: class=0x020000 card=0x164814e4 chip=0x164814e4 rev=0x03 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xd0110000, size 65536, enabled
    bar   [18] = type Memory, range 64, base 0xd0100000, size 65536, enabled
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
    cap 01[48] = powerspec 2 supports D0 D3 current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit
bge1@pci0:17:2:1: class=0x020000 card=0x164814e4 chip=0x164814e4 rev=0x03 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'NetXtreme Dual Gigabit Adapter (BCM5704)'
    class      = network
    subclass   = ethernet
    bar   [10] = type Memory, range 64, base 0xd0130000, size 65536, enabled
    bar   [18] = type Memory, range 64, base 0xd0120000, size 65536, enabled
    cap 07[40] = PCI-X 64-bit supports 133MHz, 2048 burst read, 1 split transaction
    cap 01[48] = powerspec 2 supports D0 D3 current D0
    cap 03[50] = VPD
    cap 05[58] = MSI supports 8 messages, 64 bit

This is a hobby system supporting a home server, so it's not
"mission-critical" and my current hack is working properly.

Thanks to both of you for your assistance.
Mike Squires mikes@siralan.org UN*X at home since 1986 From owner-freebsd-stable@FreeBSD.ORG Sat Jan 1 23:22:32 2011 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C6522106564A for ; Sat, 1 Jan 2011 23:22:32 +0000 (UTC) (envelope-from miyamoto.31b@gmail.com) Received: from mail-ww0-f42.google.com (mail-ww0-f42.google.com [74.125.82.42]) by mx1.freebsd.org (Postfix) with ESMTP id 5FB718FC14 for ; Sat, 1 Jan 2011 23:22:31 +0000 (UTC) Received: by wwi17 with SMTP id 17so12927977wwi.1 for ; Sat, 01 Jan 2011 15:22:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=inRjvygVI7+ec2y0OIOkpkig+N2qThNz8cT4Clhlucw=; b=ZAd5oV/eD+cvaYcOUKue9GL6wSAIjkqGXvOON9JIIY82Vc+jiJftKJ34X5rPJg8epB JVEL/+9U0DGEGSub4DTm8SDaO3kR7GHPPWeZS54A8nB0WwqHmpLmtqWv47wa9FNbbUKW kLalvMntRRrXfYAzT1KIgoUpcvGLfCCEJb03U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=b253Dy9nAFFcG8MQkgPSVOlNHkGRTKQgXWeN/rI3hamaz9gO/uJlNRvqJad11s493E 69W3nmnoLcautLTePhGIz/vwpfhQYcmVA4H3jqDMa2uE3oJI+5m1nsWcO5dI7/78mVHo XFUsiJiPkY2syJIOuD4XZf2UKQmhv1Q4+c2Rk= MIME-Version: 1.0 Received: by 10.227.107.99 with SMTP id a35mr10635039wbp.156.1293922335900; Sat, 01 Jan 2011 14:52:15 -0800 (PST) Received: by 10.227.13.143 with HTTP; Sat, 1 Jan 2011 14:52:15 -0800 (PST) Date: Sat, 1 Jan 2011 22:52:15 +0000 Message-ID: From: miyamoto moesasji To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: tmpfs runs out of space on 8.2pre-release, zfs related? 
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 01 Jan 2011 23:22:32 -0000

In setting up tmpfs (so not tmpmfs) on a machine that is using zfs (v15,
zfs v4) on 8.2-PRERELEASE, I run out of space on the tmpfs when copying a
~4.6 GB file from the zfs filesystem to the memory disk. This machine has
8GB of memory backed by swap on the hard disk, so I expected the file to
copy to memory without problems.

Here is what happens in detail.

After rebooting the machine, the tmpfs has 8GB available, as can be seen
below:
---
hge@PulsarX4:~/ > df -hi /tmp
Filesystem    Size    Used   Avail Capacity iused ifree %iused  Mounted on
tmpfs         8.2G     12K    8.2G     0%      19   39M    0%   /tmp
---

Subsequently, copying a ~4.6GB file from a location in the zfs pool to the
memory filesystem fails with a "No space left on device" message:
---
hge@PulsarX4:~/ > cp ~/temp/large.iso /tmp/large_file
cp: /tmp/large_file: No space left on device
---

After this the tmpfs has shrunk to just 2.7G, obviously much less than the
8.2G available before the copy operation. At the same time there are still
free inodes left, so that does not appear to be the problem. Output of df
after the copy:
---
hge@PulsarX4:~/ > df -hi /tmp
Filesystem    Size    Used   Avail Capacity iused ifree %iused  Mounted on
tmpfs         2.7G    2.7G    1.4M   100%      19  6.4k    0%   /tmp
---

A quick search shows the following bug report for Solaris:
http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=e4ae9c32983000ef651e38edbba1?bug_id=6804661
This appears closely related, as here I also try to copy a file larger
than 50% of memory to the tmpfs, and the way to reproduce it appears
identical to what I did here.

As it might help spot the problem, below is the information on the zfs ARC
size obtained from the output of zfs-stats.
This gives:

Before the copy:
---
System Memory Statistics:
        Physical Memory:                        8161.74M
        Kernel Memory:                          511.64M
        DATA:                           94.27%  482.31M
        TEXT:                           5.73%   29.33M

ARC Size:
        Current Size (arcsize):         5.88%   404.38M
        Target Size (Adaptive, c):      100.00% 6874.44M
        Min Size (Hard Limit, c_min):   12.50%  859.31M
        Max Size (High Water, c_max):   ~8:1    6874.44M
---

After the copy:
---
System Memory Statistics:
        Physical Memory:                        8161.74M
        Kernel Memory:                          3326.98M
        DATA:                           99.12%  3297.65M
        TEXT:                           0.88%   29.33M

ARC Size:
        Current Size (arcsize):         46.99%  3230.55M
        Target Size (Adaptive, c):      100.00% 6874.44M
        Min Size (Hard Limit, c_min):   12.50%  859.31M
        Max Size (High Water, c_max):   ~8:1    6874.44M
---

Unfortunately I have difficulty interpreting this any further, so
suggestions on how to prevent this behavior (or how to troubleshoot it
further) would be appreciated, as my feeling is that this should not
happen.