From owner-freebsd-fs@FreeBSD.ORG Mon Apr 30 10:20:10 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4161316A403 for ; Mon, 30 Apr 2007 10:20:10 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from transport.cksoft.de (transport.cksoft.de [62.111.66.27]) by mx1.freebsd.org (Postfix) with ESMTP id 0173113C4BC for ; Mon, 30 Apr 2007 10:20:09 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from transport.cksoft.de (localhost [127.0.0.1]) by transport.cksoft.de (Postfix) with ESMTP id D2E3E1FFE79; Mon, 30 Apr 2007 12:00:09 +0200 (CEST) Received: by transport.cksoft.de (Postfix, from userid 66) id BA4031FFE72; Mon, 30 Apr 2007 12:00:05 +0200 (CEST) Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net [10.111.66.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.int.zabbadoz.net (Postfix) with ESMTP id 0CFE4444885; Mon, 30 Apr 2007 09:56:00 +0000 (UTC) Date: Mon, 30 Apr 2007 09:55:59 +0000 (UTC) From: "Bjoern A. Zeeb" X-X-Sender: bz@maildrop.int.zabbadoz.net To: freebsd-current@freebsd.org In-Reply-To: <461B46E5.5070509@yandex.ru> Message-ID: <20070430095208.L36917@maildrop.int.zabbadoz.net> References: <20070406025700.GB98545@garage.freebsd.pl> <461B1DDC.8050009@yandex.ru> <20070410061006.GA42711@xor.obsecurity.org> <6eb82e0704092335h31df5be5qd7cee8f7234b1539@mail.gmail.com> <20070410063817.GA43061@xor.obsecurity.org> <461B36D5.3020307@yandex.ru> <20070410070628.GA43365@xor.obsecurity.org> <461B46E5.5070509@yandex.ru> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: by AMaViS cksoft-s20020300-20031204bz on transport.cksoft.de Cc: freebsd-fs@freebsd.org Subject: Re: ZFS committed to the FreeBSD base. 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Apr 2007 10:20:10 -0000

On Tue, 10 Apr 2007, Andrey V. Elsukov wrote:

> Kris Kennaway wrote:
>> \o/
>>
>> You might need to recompile with DEBUG_LOCKS and DEBUG_VFS_LOCKS and
>> do 'show lockedvnods', but maybe this is trivially reproducible.
>
> I've rolled back and destroyed this snapshot and now don't have this
> problem. But I have several LORs.

The two new ones have been added to 'The LOR page':

>lock order reversal:
> 1st 0xc2be9154 zfs:&db->db_mtx (zfs:&db->db_mtx) @
> sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode.c:318
> 2nd 0xc2c94b20 zfs:&zp->z_lock (zfs:&zp->z_lock) @
> sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:73

LOR ID #209
http://sources.zabbadoz.net/freebsd/lor.html#209

>lock order reversal:
> 1st 0xc2c3e818 zfs:&ds->ds_deadlist.bpl_lock (zfs:&ds->ds_deadlist.bpl_lock) @
> sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/bplist.c:154
> 2nd 0xc2be63a0 zfs:&dn->dn_struct_rwlock (zfs:&dn->dn_struct_rwlock) @
> sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode.c:571

LOR ID #210
http://sources.zabbadoz.net/freebsd/lor.html#210

--
Bjoern A.
Zeeb bzeeb at Zabbadoz dot NeT From owner-freebsd-fs@FreeBSD.ORG Mon Apr 30 21:28:34 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 38CCF16A402; Mon, 30 Apr 2007 21:28:34 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from mxfep01.bredband.com (mxfep01.bredband.com [195.54.107.70]) by mx1.freebsd.org (Postfix) with ESMTP id 34AEB13C487; Mon, 30 Apr 2007 21:28:32 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from ironport2.bredband.com ([195.54.107.84] [195.54.107.84]) by mxfep01.bredband.com with ESMTP id <20070430212831.VMRK28445.mxfep01.bredband.com@ironport2.bredband.com>; Mon, 30 Apr 2007 23:28:31 +0200 Received: from c-5416e555.03-51-73746f3.cust.bredbandsbolaget.se (HELO scode.mine.nu) ([85.229.22.84]) by ironport2.bredband.com with ESMTP; 30 Apr 2007 23:28:31 +0200 Received: from [127.0.0.1] (localhost [127.0.0.1]) by scode.mine.nu (Postfix) with ESMTP id 294F51B0A; Mon, 30 Apr 2007 23:28:31 +0200 (CEST) Message-ID: <46365F76.7090708@infidyne.com> Date: Mon, 30 Apr 2007 23:28:22 +0200 From: Peter Schuller User-Agent: Thunderbird 1.5.0.10 (X11/20070417) MIME-Version: 1.0 To: Craig Boston , freebsd-current@FreeBSD.org, Pawel Jakub Dawidek , freebsd-fs@FreeBSD.org References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de> <20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> In-Reply-To: <20070410003505.GA8189@nowhere> X-Enigmail-Version: 0.94.3.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigF04CD83DED20F6D94E36C095" Cc: Subject: Re: ZFS committed 
to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Apr 2007 21:28:34 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigF04CD83DED20F6D94E36C095
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

> Hi, just wanted to chime in that I'm experiencing the same panic with
> a fresh -CURRENT.

I am also/still seeing the "kmem_map too small" panic on a tree cvsup:ed
around April 27.

I can consistently trigger it by doing "rsync -a /usr/ports
/somepool/somepath", with both /usr and /somepool being on ZFS
(different pools). This is on a machine with 1 GB memory, with the kmem
size being 320 MB as per default.

The kstat.zfs.misc.arcstats.size never jumps; the "solaris" memory usage
never significantly jumps - stays between about 80 MB and 150 MB at all
times. It does not even consistently increase in size within this range
- it goes up and down.

In terms of absolute sizes, nothing in the output of vmstat -m, except
solaris, is even approaching the sizes we are talking about (everything
is a handful of megs at most).

Watching "top" during the rsync I can see wired memory steadily
increasing. Starting at about 110 megs or so after startup (including
parts of my desktop), it begins consistently increasing when I run the
rsync. In this case it started to approach 200 megs. When rsync was done
(ran it with -v this time) reading the source directory and began
copying files, the growth of wired memory increased significantly in
speed (it was up to 280 MB or so in under 30 seconds).

Killing rsync did not cause the wired total to go down.

Any suggestions on whether there is something else to monitor to find
out what is using all the memory?

zfs_kmem_alloc() always allocates with M_SOLARIS.
kmem_cache_{create,alloc} don't, but they seem to be allocating very
small amounts of memory (could there be leakage of a huge number of
these?). Is it expected that ZFS would allocate a significant amount of
memory that is not categorized as M_SOLARIS?

Could there be fragmentation going on? Are there very large allocations,
relative to the 320 MB kmem size, intermixed with small allocations?

Anything I can do in terms of testing that would help debug this, beyond
what has already been done and reported on -current?

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller '
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org

--------------enigF04CD83DED20F6D94E36C095
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature
Content-Disposition: attachment; filename="signature.asc"

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (FreeBSD)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGNl9+DNor2+l1i30RCCk+AJ9CyO17c6ESWugBGb0aerpN7VtTjACgxkUK
wuINyXFIiLUa88294i9S77M=
=NDfk
-----END PGP SIGNATURE-----

--------------enigF04CD83DED20F6D94E36C095--

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 30 21:31:19 2007 Return-Path: X-Original-To: freebsd-fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B072116A401; Mon, 30 Apr 2007 21:31:19 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 1326613C483; Mon, 30 Apr 2007 21:31:16 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 36E7445B26; Mon, 30 Apr 2007 23:31:15 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA
(256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 0DAC545681; Mon, 30 Apr 2007 23:31:08 +0200 (CEST) Date: Mon, 30 Apr 2007 23:30:43 +0200 From: Pawel Jakub Dawidek To: Peter Schuller Message-ID: <20070430213043.GF67738@garage.freebsd.pl> References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de> <20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="cz6wLo+OExbGG7q/" Content-Disposition: inline In-Reply-To: <46365F76.7090708@infidyne.com> X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 User-Agent: mutt-ng/devel-r804 (FreeBSD) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@FreeBSD.org, Craig Boston , freebsd-current@FreeBSD.org Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Apr 2007 21:31:19 -0000

--cz6wLo+OExbGG7q/
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Apr 30, 2007 at 11:28:22PM +0200, Peter Schuller wrote:
> > Hi, just wanted to chime in that I'm experiencing the same panic with
> > a fresh -CURRENT.
>
> I am also/still seeing the "kmem_map too small" panic on a tree cvsup:ed
> around April 27.
>
> I can consistently trigger it by doing "rsync -a /usr/ports
> /somepool/somepath", with both /usr and /somepool being on ZFS
> (different pools). This is on a machine with 1 GB memory, with the kmem
> size being 320 MB as per default.
>
> The kstat.zfs.misc.arcstats.size never jumps; the "solaris" memory usage
> never significantly jumps - stays between about 80 MB and 150 MB at all
> times. It does not even consistently increase in size within this range
> - it goes up and down.
>
> In terms of absolute sizes, nothing in the output of vmstat -m, except
> solaris, is even approaching the sizes we are talking about (everything
> is a handful of megs at most).
>
> Watching "top" during the rsync I can see wired memory steadily
> increasing. Starting at about 110 megs or so after startup (including
> parts of my desktop), it begins consistently increasing when I run the
> rsync. In this case it started to approach 200 megs. When rsync was done
> (ran it with -v this time) reading the source directory and began
> copying files, the growth of wired memory increased significantly in
> speed (it was up to 280 MB or so in under 30 seconds).
>
> Killing rsync did not cause the wired total to go down.
>
> Any suggestions on whether there is something else to monitor to find
> out what is using all the memory?
>
> zfs_kmem_alloc() always allocates with M_SOLARIS.
> kmem_cache_{create,alloc} don't, but they seem to be allocating very
> small amounts of memory (could there be leakage of a huge number of
> these?). Is it expected that ZFS would allocate a significant amount of
> memory that is not categorized as M_SOLARIS?
>
> Could there be fragmentation going on? Are there very large allocations,
> relative to the 320 MB kmem size, intermixed with small allocations?
>
> Anything I can do in terms of testing that would help debug this, beyond
> what has already been done and reported on -current?

What you're seeing is probably another problem, which was described
already. Try tuning kern.maxvnodes down to 3/4 of the current value,
see if that helps and please report back.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

--cz6wLo+OExbGG7q/
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (FreeBSD)

iD8DBQFGNmADForvXbEpPzQRAoSIAKC/vuP7XWzvba1h2MJOTI5iDTBZ6wCeLoDz
isJ4AqbPWD6IYslc6ByUQPo=
=Kopf
-----END PGP SIGNATURE-----

--cz6wLo+OExbGG7q/--

From owner-freebsd-fs@FreeBSD.ORG Mon Apr 30 21:56:15 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5E7E116A401; Mon, 30 Apr 2007 21:56:15 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from mxfep03.bredband.com (mxfep03.bredband.com [195.54.107.76]) by mx1.freebsd.org (Postfix) with ESMTP id 5E7A913C480; Mon, 30 Apr 2007 21:56:14 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from ironport2.bredband.com ([195.54.107.84] [195.54.107.84]) by mxfep03.bredband.com with ESMTP id <20070430215612.UVTE23113.mxfep03.bredband.com@ironport2.bredband.com>; Mon, 30 Apr 2007 23:56:12 +0200 Received: from c-5416e555.03-51-73746f3.cust.bredbandsbolaget.se (HELO scode.mine.nu) ([85.229.22.84]) by ironport2.bredband.com with ESMTP; 30 Apr 2007 23:56:12 +0200 Received: from [127.0.0.1] (localhost [127.0.0.1]) by scode.mine.nu (Postfix) with ESMTP id 010DBBD; Mon, 30 Apr 2007 23:56:11 +0200 (CEST) Message-ID: <463665F2.8090605@infidyne.com> Date: Mon, 30 Apr 2007 23:56:02 +0200 From: Peter Schuller User-Agent: Thunderbird 1.5.0.10 (X11/20070417) MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de>
<20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> In-Reply-To: <20070430213043.GF67738@garage.freebsd.pl> X-Enigmail-Version: 0.94.3.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigA12CB248266FB71C4FF0A25E" Cc: freebsd-fs@FreeBSD.org, Craig Boston , freebsd-current@FreeBSD.org Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 30 Apr 2007 21:56:15 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigA12CB248266FB71C4FF0A25E
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

> What you're seeing is probably another problem, which was described
> already.

My apologies. I see this was mentioned on -fs (but for some reason
doesn't show up in Google). I'll subscribe to that and remember to check
the archive before generating more noise in the future.

> Try tuning kern.maxvnodes down to 3/4 of the current
> value, see if that helps and please report back.

This does seem to eliminate the problem here too. Again, apologies for
the noise, and thank you very much.
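
The workaround quoted above comes down to simple arithmetic on kern.maxvnodes. A minimal sketch in Python of that calculation follows; the function names and the starting value of 100000 are assumptions made for illustration only, and the real current value would have to be read on the affected machine (for example with `sysctl -n kern.maxvnodes`).

```python
def scaled_maxvnodes(current: int, numerator: int = 3, denominator: int = 4) -> int:
    """Return current * numerator / denominator, rounded down.

    The default ratio of 3/4 is the one suggested in the thread.
    """
    if current <= 0:
        raise ValueError("current must be positive")
    return current * numerator // denominator


def sysctl_command(value: int) -> str:
    """Format the sysctl(8) command line that would apply the new limit."""
    return f"sysctl kern.maxvnodes={value}"


if __name__ == "__main__":
    # Hypothetical starting value, not taken from any machine in the thread.
    current = 100000
    print(sysctl_command(scaled_maxvnodes(current)))
```

The helper only builds the command string and does not run it, so the sketch can be tried on any system; applying the value still means running the printed command as root on the FreeBSD host.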
--=20 / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller ' Key retrieval: Send an E-Mail to getpgpkey@scode.org E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org --------------enigA12CB248266FB71C4FF0A25E Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGNmX6DNor2+l1i30RCNdFAKCuYGT3HiISktJEky6VeRNA4A+bnwCeP0eZ Vewh5CUn0fU4BsaWk0tjlAQ= =HsO4 -----END PGP SIGNATURE----- --------------enigA12CB248266FB71C4FF0A25E-- From owner-freebsd-fs@FreeBSD.ORG Tue May 1 07:42:42 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7987E16A407 for ; Tue, 1 May 2007 07:42:42 +0000 (UTC) (envelope-from garyj@jennejohn.org) Received: from mail08b.verio.de (mail08b.verio.de [213.198.55.74]) by mx1.freebsd.org (Postfix) with SMTP id BCFC213C45A for ; Tue, 1 May 2007 07:42:41 +0000 (UTC) (envelope-from garyj@jennejohn.org) Received: from mx64.stngva01.us.mxservers.net (204.202.242.134) by mail08b.verio.de (RS ver 1.0.95vs) with SMTP id 3-0371325358 for ; Tue, 1 May 2007 09:42:39 +0200 (CEST) Received: from mmm808.verio.de [213.198.55.120] (EHLO mmm808.verio.de) by mx64.stngva01.us.mxservers.net (mxl_mta-1.3.8-10p4) with ESMTP id 57be6364.7545.329.mx64.stngva01.us.mxservers.net; Tue, 01 May 2007 03:25:41 -0400 (EDT) Received: (qmail 72312 invoked from network); 1 May 2007 07:42:37 -0000 Received: from unknown (HELO peedub.jennejohn.org) (89.59.20.133) by with SMTP; 1 May 2007 07:42:37 -0000 Received: from jennejohn.org (localhost [127.0.0.1]) by peedub.jennejohn.org (8.14.1/8.14.1) with ESMTP id l417gZ58002245; Tue, 1 May 2007 09:42:35 +0200 (CEST) (envelope-from 
garyj@jennejohn.org) Message-Id: <200705010742.l417gZ58002245@peedub.jennejohn.org> X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.0.4 To: Pawel Jakub Dawidek In-Reply-To: Message from Pawel Jakub Dawidek of "Mon, 30 Apr 2007 23:30:43 +0200." <20070430213043.GF67738@garage.freebsd.pl> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 01 May 2007 09:42:35 +0200 From: Gary Jennejohn X-Spam: [F=0.7966601863; heur=0.500(-20400); stat=0.500; spamtraq-heur=0.796(2007022501)] X-MAIL-FROM: X-SOURCE-IP: [213.198.55.120] X-SF-Loop: 1 Cc: freebsd-fs@FreeBSD.org, Craig Boston , freebsd-current@FreeBSD.org Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 07:42:42 -0000

> > On Mon, Apr 30, 2007 at 11:28:22PM +0200, Peter Schuller wrote:
[big snip - report of "kmem_map too small" panic]

Pawel Jakub Dawidek writes:
> What you're seeing is probably another problem, which was described
> already. Try tuning kern.maxvnodes down to 3/4 of the current value,
> see if that helps and please report back.
>

I was seeing this panic too and can confirm that Pawel's suggestion
fixed it for me.

---
Gary Jennejohn / garyjATjennejohnDOTorg gjATfreebsdDOTorg garyjATdenxDOTde

From owner-freebsd-fs@FreeBSD.ORG Tue May 1 08:50:29 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5C2EE16A403 for ; Tue, 1 May 2007 08:50:29 +0000 (UTC) (envelope-from kvs@binarysolutions.dk) Received: from solow.pil.dk (relay.pil.dk [195.41.47.164]) by mx1.freebsd.org (Postfix) with ESMTP id 2DE9513C45A for ; Tue, 1 May 2007 08:50:29 +0000 (UTC) (envelope-from kvs@binarysolutions.dk) Received: from coruscant.local (naboo.binarysolutions.dk [80.196.17.173]) by solow.pil.dk (Postfix) with ESMTP id 241761CC0F6 for ; Tue, 1 May 2007 10:33:12 +0200 (CEST) Received: by coruscant.local (Postfix, from userid 502) id 39E5F2FEE76; Tue, 1 May 2007 10:33:10 +0200 (CEST) To: freebsd-fs@freebsd.org From: Kenneth Vestergaard Schmidt Date: Tue, 01 May 2007 10:33:10 +0200 Message-ID: User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.96 (darwin) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Subject: Sun Fire X4500, FreeBSD and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 08:50:29 -0000

Mjello.

Just thought I'd say that we've got a Sun Fire X4500 running with
-CURRENT as of yesterday and ZFS. Works beautifully, after we disabled
MSI and increased VM_KMEM_SIZE_MAX.

Without the increased VM_KMEM_SIZE_MAX, we got the usual panic (kmem_map
too small). I haven't tried adjusting maxvnodes - that might also have
helped. However, the machine has 16 GB RAM, so it might as well be used
for something.

I'm not quite sure how to tweak the box efficiently, but for now the
bottleneck is our network, so we're going to upgrade some pieces and try
again.
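
For the drive layout and the pool figures listed just below, the usable capacity can be estimated from the raw pool size. The sketch below is a rough Python calculation, assuming that `zpool list` reports raw capacity including parity for RAID-Z pools and that all disks are the same size; the function name is hypothetical and the result is an estimate, not a reproduction of what ZFS itself computes.

```python
def raidz_usable(raw_total: float, vdevs: int, disks_per_vdev: int, parity: int) -> float:
    """Estimate usable capacity of a pool made of identical RAID-Z vdevs.

    raw_total is the raw pool size (parity included), in any unit;
    the result is in the same unit.
    """
    total_disks = vdevs * disks_per_vdev
    data_disks = vdevs * (disks_per_vdev - parity)
    per_disk = raw_total / total_disks
    return data_disks * per_disk


if __name__ == "__main__":
    # Figures from the message: 5 RAIDZ2 vdevs of 9 disks each, 20.3T raw.
    estimate = raidz_usable(20.3, vdevs=5, disks_per_vdev=9, parity=2)
    print(f"estimated usable capacity: {estimate:.2f}T")
    # Drive count check: 2 boot disks + 45 pool disks + 1 hot spare.
    print("drives accounted for:", 2 + 5 * 9 + 1)
```

The estimate comes out a little above the 15.5T that `zfs list` reports as available, which is plausible given that the simple ratio ignores metadata and other overhead.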

We configured the 48 drives as follows:

- ad52 and ad60 are magic - the BIOS is hardcoded to boot from them,
  so we put them in a gmirror
- 5 RAIDZ2's, each with 9 disks, which leaves 7 disks' worth of usable
  space per array
- one global hotspare

# zpool list
NAME    SIZE    USED   AVAIL    CAP  HEALTH  ALTROOT
void   20.3T   62.1G   20.3T     0%  ONLINE  -

# zfs list
NAME    USED  AVAIL  REFER  MOUNTPOINT
void   48.2G  15.5T  41.9K  /void

All in all, a fun little toy :)

--
Best Regards
Kenneth Schmidt

From owner-freebsd-fs@FreeBSD.ORG Tue May 1 13:12:25 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0590216A401; Tue, 1 May 2007 13:12:25 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from mxfep01.bredband.com (mxfep01.bredband.com [195.54.107.70]) by mx1.freebsd.org (Postfix) with ESMTP id 0441213C455; Tue, 1 May 2007 13:12:23 +0000 (UTC) (envelope-from peter.schuller@infidyne.com) Received: from ironport.bredband.com ([195.54.107.82] [195.54.107.82]) by mxfep01.bredband.com with ESMTP id <20070501131222.CIDM28445.mxfep01.bredband.com@ironport.bredband.com>; Tue, 1 May 2007 15:12:22 +0200 Received: from c-5416e555.03-51-73746f3.cust.bredbandsbolaget.se (HELO scode.mine.nu) ([85.229.22.84]) by ironport.bredband.com with ESMTP; 01 May 2007 15:12:22 +0200 Received: from [127.0.0.1] (localhost [127.0.0.1]) by scode.mine.nu (Postfix) with ESMTP id BEDE412F; Tue, 1 May 2007 15:12:21 +0200 (CEST) Message-ID: <46373CAD.6000502@infidyne.com> Date: Tue, 01 May 2007 15:12:13 +0200 From: Peter Schuller User-Agent: Thunderbird 1.5.0.10 (X11/20070417) MIME-Version: 1.0 To: Peter Schuller References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de> <20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de>
<20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> In-Reply-To: <463665F2.8090605@infidyne.com> X-Enigmail-Version: 0.94.3.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="------------enigCA26013F7A39243E7B6C30CE" Cc: freebsd-fs@FreeBSD.org, Craig Boston , freebsd-current@FreeBSD.org, Pawel Jakub Dawidek Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 13:12:25 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enigCA26013F7A39243E7B6C30CE
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

> This does seem to eliminate the problem here too.

It appears the problem persists, but is more difficult to trigger.

I had a reboot again during building of ports. I decreased maxvnodes
further (to about 2/3 of the default, instead of the recommended 3/4 of
the default). Even after that, I had another reboot just now, also
during building of ports.

It takes on the order of several hours to trigger it.

Note that I say "reboot" because that's what it was; it appears that
when this happens under X (which I suspect is the triggering
difference) a reboot is triggered immediately, while at the console I
get the kernel debugger.

As a result I cannot say for certain that I am still seeing the same
problem; it is only an assumption at this point. Because swap is on a
glabel I have crashdumps turned off, as I was not sure whether it was
safe (i.e., I don't want crashdumps to accidentally write to the wrong
partition).
--=20 / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller ' Key retrieval: Send an E-Mail to getpgpkey@scode.org E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org --------------enigCA26013F7A39243E7B6C30CE Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFGNzy1DNor2+l1i30RCPRvAKDUUCmMoL3gtROUx+8L+s/O8K0h/ACfZwC9 B9lZeKebbe1uBpKdMHEU+eg= =YJd4 -----END PGP SIGNATURE----- --------------enigCA26013F7A39243E7B6C30CE-- From owner-freebsd-fs@FreeBSD.ORG Tue May 1 14:40:13 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D935C16A402; Tue, 1 May 2007 14:40:13 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from gigi.cs.uoguelph.ca (gigi.cs.uoguelph.ca [131.104.94.210]) by mx1.freebsd.org (Postfix) with ESMTP id 9564513C459; Tue, 1 May 2007 14:40:13 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.96.170]) by gigi.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id l41Ee8Fu025696; Tue, 1 May 2007 10:40:08 -0400 Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id l41EfB025438; Tue, 1 May 2007 10:41:11 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Tue, 1 May 2007 10:41:10 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher To: Peter Schuller In-Reply-To: <46373CAD.6000502@infidyne.com> Message-ID: References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de> <20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> 
<20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.57 on 131.104.94.210 Cc: freebsd-fs@freebsd.org, Craig Boston , freebsd-current@freebsd.org, Pawel Jakub Dawidek Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 14:40:14 -0000

On Tue, 1 May 2007, Peter Schuller wrote:

>> This does seem to eliminate the problem here too.
>
> It appears the problem persists, but is more difficult to trigger.
[stuff snipped]
> It takes on the order of several hours to trigger it.

I don't know if it is relevant, but I've seen "kmem_map: too small" panics
when testing my NFSv4 server, ever since about FreeBSD5.4. There is no
problem running the same server code on FreeBSD4 (which is what I still
run in production mode) or OpenBSD3 or 4. If I increase the size of the
map, I can delay the panic for up to about two weeks of hard testing,
but it never goes away. I don't see any evidence of a memory leak during
the several days of testing leading up to the panic. (NFSv4 uses
MALLOC/FREE extensively for state related structures.)

So, I'm wondering if maybe there is some subtle bug in MALLOC/FREE (maybe
i386 specific, since that's what I test on)?

Anyhow, good luck with it, rick From owner-freebsd-fs@FreeBSD.ORG Tue May 1 14:41:07 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8182016A401 for ; Tue, 1 May 2007 14:41:07 +0000 (UTC) (envelope-from ml.freebsd-fs@ledisez.net) Received: from ledisez.net (ledisez.net [80.247.230.138]) by mx1.freebsd.org (Postfix) with ESMTP id 12D2413C45D for ; Tue, 1 May 2007 14:41:06 +0000 (UTC) (envelope-from ml.freebsd-fs@ledisez.net) Received: from webmail.ledisez.net (localhost.localdomain [80.247.230.138]) by ledisez.net (Postfix) with ESMTP id 96EA045AEE8; Tue, 1 May 2007 16:41:05 +0200 (CEST) Received: from 84.102.41.193 (SquirrelMail authenticated user romain) by webmail.ledisez.net with HTTP; Tue, 1 May 2007 16:41:05 +0200 (CEST) Message-ID: <51864.84.102.41.193.1178030465.squirrel@webmail.ledisez.net> In-Reply-To: <46373CAD.6000502@infidyne.com> References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de> <20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> Date: Tue, 1 May 2007 16:41:05 +0200 (CEST) From: "Romain LE DISEZ" To: "Peter Schuller" User-Agent: SquirrelMail/1.4.9a MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-fs@freebsd.org, Craig Boston , freebsd-current@freebsd.org, Pawel Jakub Dawidek Subject: Re: ZFS committed to the FreeBSD base. 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 14:41:07 -0000

> It appears the problem persists, but is more difficult to trigger.

I had the same problem, but after decreasing kern.maxvnodes to 3/4 of
the original value it seems to be solved.

> I had a reboot again during building of ports. I decreased maxvnodes
> further (to about 2/3 of the default, instead of the recommended 3/4 of
> the default). Even after that, I had another reboot just now, also
> during building of ports.
>
> It takes on the order of several hours to trigger it.
>
> Note that I say "reboot" because that's what it was; it appears that
> when this happens when in X (which I suspect is the triggering
> difference) a reboot is triggered immediately, while at the console I
> get the kernel debugger.

I'm currently running Xorg 7.2 from the experimental ports tree (see
freebsd-x11@) and I'm compiling all the ports of gnome2. The
compilation started about 8 hours ago (pfff... that's too long) => no
problems so far.

In this compilation, I'm using different file systems: the ports tree
is on ZFS+compression and the work dir is on UFS.

-- Romain LE DISEZ 06.78.77.99.18 http://www.ledisez.net/ From owner-freebsd-fs@FreeBSD.ORG Tue May 1 16:02:14 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9412C16A401; Tue, 1 May 2007 16:02:14 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 7AFF513C458; Tue, 1 May 2007 16:02:14 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id 051181A4D98; Tue, 1 May 2007 09:02:47 -0700 (PDT) Received: by obsecurity.dyndns.org (Postfix, from userid 1000) id C7059513AE; Tue, 1 May 2007 12:02:13 -0400 (EDT) Date: Tue, 1 May 2007 12:02:13 -0400 From: Kris Kennaway To: Rick Macklem Message-ID: <20070501160213.GA496@xor.obsecurity.org> References: <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="W/nzBZO5zC0uMSeA" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.2i Cc: freebsd-fs@freebsd.org, Craig Boston , freebsd-current@freebsd.org, Pawel Jakub Dawidek Subject: Re: ZFS committed to the FreeBSD base. 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 16:02:14 -0000 --W/nzBZO5zC0uMSeA Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Tue, May 01, 2007 at 10:41:10AM -0400, Rick Macklem wrote:
>
>
> On Tue, 1 May 2007, Peter Schuller wrote:
>
> >>This does seem to eliminate the problem here too.
> >
> >It appears the problem persists, but is more difficult to trigger.
> [stuff snipped]
> >It takes on the order of several hours to trigger it.
>
> I don't know if it relevent, but I've seen "kmem_map: too small" panics
> when testing my NFSv4 server, ever since about FreeBSD5.4. There is no
> problem running the same server code on FreeBSD4 (which is what I still
> run in production mode) or OpenBSD3 or 4. If I increase the size of the
> map, I can delay the panic for up to about two weeks of hard testing,
> but it never goes away. I don't see any evidence of a memory leak during
> the several days of testing leading up to the panic. (NFSv4 uses
> MALLOC/FREE extensively for state related structures.)

Sounds exactly like a memory leak to me. How did you rule it out?

> So, I'm wondering if maybe there is some subtle bug in MALLOC/FREE (maybe
> i386 specific, since that's what I test on)?

That would be unlikely.
Kris --W/nzBZO5zC0uMSeA Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) iD8DBQFGN2SFWry0BWjoQKURAmvbAKDWlLGyQ+f+AUu07xQMmy5eVDkaKwCgyB8r qzARjqlVzO3sRv7tDeOUkKY= =zFtH -----END PGP SIGNATURE----- --W/nzBZO5zC0uMSeA-- From owner-freebsd-fs@FreeBSD.ORG Tue May 1 20:38:13 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BE24916A409; Tue, 1 May 2007 20:38:13 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from moe.cs.uoguelph.ca (moe.cs.uoguelph.ca [131.104.94.198]) by mx1.freebsd.org (Postfix) with ESMTP id 7887913C4BC; Tue, 1 May 2007 20:38:13 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.96.170]) by moe.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id l41Kc6ta012580; Tue, 1 May 2007 16:38:06 -0400 Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id l41Kd9P16727; Tue, 1 May 2007 16:39:10 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Tue, 1 May 2007 16:39:09 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher To: Kris Kennaway In-Reply-To: <20070501160213.GA496@xor.obsecurity.org> Message-ID: References: <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> <20070501160213.GA496@xor.obsecurity.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.57 on 131.104.94.198 Cc: freebsd-fs@freebsd.org, Craig Boston 
, freebsd-current@freebsd.org, Pawel Jakub Dawidek Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 20:38:13 -0000 On Tue, 1 May 2007, Kris Kennaway wrote: >> I don't know if it relevent, but I've seen "kmem_map: too small" panics >> when testing my NFSv4 server, ever since about FreeBSD5.4. There is no >> problem running the same server code on FreeBSD4 (which is what I still >> run in production mode) or OpenBSD3 or 4. If I increase the size of the >> map, I can delay the panic for up to about two weeks of hard testing, >> but it never goes away. I don't see any evidence of a memory leak during >> the several days of testing leading up to the panic. (NFSv4 uses >> MALLOC/FREE extensively for state related structures.) > > Sounds exactly like a memory leak to me. How did you rule it out? Well, I had a little program running on the server that grabbed the mti_stats[] out of the kernel and logged them. I had one client mounted running thousands of passes of the Connectathon basic tests (one client, same activity over and over and over again). For a week, the stats don't show any increase in allocation for any type (alloc - free doesn't get unreasonably big), then..."panic: kmem_map too small". How many days it took to happen would vary with the size of the kernel map, but no evidence of a leak prior to the crash. It seemed to be based on the number of times MALLOC and FREE were called. Also, the same server code (except for the port changes, which have nothing to do with the state handling where MALLOC/FREE get called a lot), works fine for months on FreeBSD4 and OpenBSD3.9. So, I won't say a "memory leak is ruled out", but if there was a leak why wouldn't it bite FreeBSD4 or show up in mti_stats[]? 
I first saw it on FreeBSD6.0, but went back to FreeBSD5.4 and tried the same test and got the same result. rick From owner-freebsd-fs@FreeBSD.ORG Tue May 1 20:42:42 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9B59416A403 for ; Tue, 1 May 2007 20:42:42 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from mh2.centtech.com (moat3.centtech.com [64.129.166.50]) by mx1.freebsd.org (Postfix) with ESMTP id 6967813C457 for ; Tue, 1 May 2007 20:42:42 +0000 (UTC) (envelope-from anderson@freebsd.org) Received: from neutrino.centtech.com (neutrino.centtech.com [10.177.171.220]) by mh2.centtech.com (8.13.8/8.13.8) with ESMTP id l41KgeV5082837; Tue, 1 May 2007 15:42:41 -0500 (CDT) (envelope-from anderson@freebsd.org) Message-ID: <4637A640.6050700@freebsd.org> Date: Tue, 01 May 2007 15:42:40 -0500 From: Eric Anderson User-Agent: Thunderbird 2.0.0.0 (X11/20070420) MIME-Version: 1.0 To: Cristian KLEIN References: <59558.86.125.188.48.1177802342.squirrel@intranet.utcluj.ro> In-Reply-To: <59558.86.125.188.48.1177802342.squirrel@intranet.utcluj.ro> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.88.4/3189/Tue May 1 11:02:13 2007 on mh2.centtech.com X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=8.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.6 X-Spam-Checker-Version: SpamAssassin 3.1.6 (2006-10-03) on mh2.centtech.com Cc: freebsd-fs@freebsd.org Subject: Re: panic: softdep_setup_inomapdep: found inode already exists in 6.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 20:42:42 -0000 On 04/28/07 18:19, Cristian KLEIN wrote: > Hi everybody, > > I am running a FreeBSD 6.2-p3, on 
which I am experiencing exactly the same > simtoms as one item of the TODO list of 6.0: > http://www.freebsd.org/releases/6.0R/todo.html > > panic: softdep_setup_inomapdep: found inode Needs testing Tor Egge > Found by stress tests at > http://www.holm.cc/stress/log/cons138.html > > Does anybody know whether this bug should have been solved in 6.2? Should > I file a PR? Sorry if I missed it, but were you able to provide a backtrace? If you can, you should compile your kernel with debugging, so at least you could make a little more out of the crash. See the handbook if you need help on that. Eric From owner-freebsd-fs@FreeBSD.ORG Tue May 1 21:49:17 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2ECCB16A400 for ; Tue, 1 May 2007 21:49:17 +0000 (UTC) (envelope-from lists@stringsutils.com) Received: from zoraida.natserv.net (p65-147.acedsl.com [66.114.65.147]) by mx1.freebsd.org (Postfix) with ESMTP id EB54813C455 for ; Tue, 1 May 2007 21:49:16 +0000 (UTC) (envelope-from lists@stringsutils.com) Received: by zoraida.natserv.net (Postfix, from userid 58) id 05D6AC2F2; Tue, 1 May 2007 17:49:16 -0400 (EDT) X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on zoraida.natserv.net X-Spam-Level: X-Spam-Status: No, score=-1.4 required=4.0 tests=ALL_TRUSTED, DK_POLICY_SIGNSOME autolearn=disabled version=3.1.8 X-Spam-Report: * 0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs some mails * -1.4 ALL_TRUSTED Passed through trusted hosts only via SMTP Received: from 35st.simplicato.com (static-71-249-233-130.nycmny.east.verizon.net [71.249.233.130]) by zoraida.natserv.net (Postfix) with ESMTP id BF2A9C2F4; Tue, 1 May 2007 17:49:13 -0400 (EDT) References: <20070422124731.GA20548@harmless.hu> Message-ID: X-Mailer: http://www.courier-mta.org/cone/ From: Francisco Reyes To: Greg Troxel Date: Tue, 01 May 2007 17:49:13 
-0400 Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset="US-ASCII" Content-Disposition: inline Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: distributed filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 21:49:17 -0000 Greg Troxel writes: > Coda (http://www.coda.cs.cmu.edu/) works well on NetBSD-current, in > which I just fixed the kernel module to conform to updated/simplified .. > There's also arla (afs working client, and server that I'm not sure of > the status). >From a performance perspective would you recommend Coda or Arla? Are distributed filesystems fast enough to handle something like a mailstore for a busy Imap/pop3 server? From owner-freebsd-fs@FreeBSD.ORG Tue May 1 22:20:17 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8E96416A407; Tue, 1 May 2007 22:20:17 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 73F7413C447; Tue, 1 May 2007 22:20:17 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id 533851A4D9E; Tue, 1 May 2007 15:20:50 -0700 (PDT) Received: by obsecurity.dyndns.org (Postfix, from userid 1000) id B013451451; Tue, 1 May 2007 18:20:16 -0400 (EDT) Date: Tue, 1 May 2007 18:20:16 -0400 From: Kris Kennaway To: Rick Macklem Message-ID: <20070501222016.GA6713@xor.obsecurity.org> References: <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> 
<463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> <20070501160213.GA496@xor.obsecurity.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.2i Cc: freebsd-fs@freebsd.org, Craig Boston , freebsd-current@freebsd.org, Pawel Jakub Dawidek , Kris Kennaway Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 22:20:17 -0000 On Tue, May 01, 2007 at 04:39:09PM -0400, Rick Macklem wrote: > > > On Tue, 1 May 2007, Kris Kennaway wrote: > >>I don't know if it relevent, but I've seen "kmem_map: too small" panics > >>when testing my NFSv4 server, ever since about FreeBSD5.4. There is no > >>problem running the same server code on FreeBSD4 (which is what I still > >>run in production mode) or OpenBSD3 or 4. If I increase the size of the > >>map, I can delay the panic for up to about two weeks of hard testing, > >>but it never goes away. I don't see any evidence of a memory leak during > >>the several days of testing leading up to the panic. (NFSv4 uses > >>MALLOC/FREE extensively for state related structures.) > > > >Sounds exactly like a memory leak to me. How did you rule it out? > Well, I had a little program running on the server that grabbed the > mti_stats[] out of the kernel and logged them. I had one client mounted > running thousands of passes of the Connectathon basic tests (one client, > same activity over and over and over again). For a week, the stats don't > show any increase in allocation for any type (alloc - free doesn't get > unreasonably big), then..."panic: kmem_map too small". How many days it > took to happen would vary with the size of the kernel map, but no evidence > of a leak prior to the crash. 
It seemed to be based on the number of times > MALLOC and FREE were called. Or something else is leaking. Really, if there was a problem with MALLOC/FREE we'd see it. Kris From owner-freebsd-fs@FreeBSD.ORG Tue May 1 22:20:45 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BD83A16A400 for ; Tue, 1 May 2007 22:20:45 +0000 (UTC) (envelope-from gore_jarold@yahoo.com) Received: from web63010.mail.re1.yahoo.com (web63010.mail.re1.yahoo.com [69.147.96.221]) by mx1.freebsd.org (Postfix) with SMTP id 6F60313C46A for ; Tue, 1 May 2007 22:20:45 +0000 (UTC) (envelope-from gore_jarold@yahoo.com) Received: (qmail 70165 invoked by uid 60001); 1 May 2007 22:06:44 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=5Ku3sr2XnHOz5vhFzERd/mu8pVKX3lQj+SsLqsxsC+u5SLa1Do0tu7xQHEe2BDwZUEebSMGIlAtz4D6PxzKMtdtqJp18EaLns1hWFAjZuitkzGpBmqD0DVrB43ZqUHHJjd/5wfBy+acrvMZSlO6+iL5JWA+B5I37zBVkTbOpesg=; X-YMail-OSG: LZEbA9UVM1lo5wDESPDHD5.QEEfucyDUss8YHg.PJQ2UGFUO5OPyp5.f8__1Vy10elkA3NRCSOck4G9TUQf4yBJlr17IswLeO1QGspIrG1dVxhlC0Btl6xBygdq7HawK Received: from [75.72.230.91] by web63010.mail.re1.yahoo.com via HTTP; Tue, 01 May 2007 15:06:43 PDT Date: Tue, 1 May 2007 15:06:43 -0700 (PDT) From: Gore Jarold To: freebsd-fs@freebsd.org MIME-Version: 1.0 Message-ID: <7303.68459.qm@web63010.mail.re1.yahoo.com> Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: infrequently used filesystem gets VERY slow X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 May 2007 22:20:45 -0000 Two machines, almost 
identical (same 3ware 9650SE SATA RAID card, same 500 GB Hitachi drives, etc.)

The first machine is running 6.1-RELEASE ... never noticed anything odd. The second machine is running 6.2-RELEASE.

These machines are very simple office fileservers. So there is a / partition, a /var, and then a big monster RAID-5 array with all the rest of the space. It is therefore very rare that anyone actually runs any interactive commands in the shell or accesses files on the / or /var filesystems. The VAST majority of all usage is on the big array.

On the first machine (6.1) there is no noticeable lag when accessing / and /var ... but on the second machine (6.2) there is a very long wait - sometimes 20 seconds or more - for commands that deal with / or /var. For instance, if I run:

grep username /etc/passwd

or:

tail -200 /var/log/messages

those commands can take 10-20 seconds to complete. But then, if you run subsequent commands, they are quick. Once you do some work on / or /var, their speed picks back up.

So it would appear that files in / and /var are just nowhere in the cache at all ... which isn't so odd, but what is odd is that the problem is not seen on the first system.

So were there any major differences between 6.1 and 6.2 in filesystem caching and strategies that would account for this difference? Any comments at all on what I am seeing?
From owner-freebsd-fs@FreeBSD.ORG Wed May 2 01:58:27 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 553B016A400 for ; Wed, 2 May 2007 01:58:27 +0000 (UTC) (envelope-from staalebk@ifi.uio.no) Received: from smtp.bluecom.no (smtp.bluecom.no [193.75.75.28]) by mx1.freebsd.org (Postfix) with ESMTP id C968713C44C for ; Wed, 2 May 2007 01:58:26 +0000 (UTC) (envelope-from staalebk@ifi.uio.no) Received: from eschew.pusen.org (unknown [193.69.145.10]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by smtp.bluecom.no (Postfix) with ESMTP id 70DEC12C0D9 for ; Wed, 2 May 2007 03:58:25 +0200 (CEST) Received: from chiller by eschew.pusen.org with local (Exim 4.50) id 1Hj46u-0001nB-1x for freebsd-fs@freebsd.org; Wed, 02 May 2007 03:58:28 +0200 Date: Wed, 2 May 2007 03:58:27 +0200 From: =?iso-8859-1?Q?St=E5le?= Kristoffersen To: freebsd-fs@freebsd.org Message-ID: <20070502015827.GA5924@eschew.pusen.org> References: <20070427202222.GA26824@eschew.pusen.org> <20070428003115.GA1003@eschew.pusen.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20070428003115.GA1003@eschew.pusen.org> User-Agent: Mutt/1.5.13 (2006-08-11) Subject: Re: ZFS performance X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 01:58:27 -0000 On 2007-04-28 at 02:31, Ståle Kristoffersen wrote: > cvsup'ed, and buildt a new kernel and now the problem is gone, sorry about > the noise. Not entirely true. I still experience very strange (to me) issues. I'm trying to stream something off the server at about 100 KB/s. It's a video-file located on zfs. 
I can see that the data pushed across the network is about the correct speed (fluctuating around 100 KB/s). Now if I start up 'zpool iostat -v 1' things start to look strange. It first reads a couple of hundred KB, then about 1 MB the next second, then 3 MB, then 10 MB, then 20 MB, before the cycle restarts at around 300-400 KB again. The numbers differ from run to run but the overall pattern is the same.

This makes serving several streams at once impossible, as the server is not able to deliver enough data. (It should easily push 5 streams of 100 KB/s, right?)

I'm using samba, but the problem is also there when transferring using proftpd (limiting the transfer on the receiver side to 100 KB/s). Although the "pattern" is not as clear, it still reads _way_ more data than is transferred.

I'm not sure how to debug this, I'm willing to try anything!

Below is a snip from 'zpool iostat -v 1'. Notice that the file being streamed is located on the last disc of the pool (the pool was full, I added one disc, and therefore all new data ends up on that disc).
It completes two "cycles":

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      0      0      0      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      0      0      0      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      4      0   639K      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      4      0   639K      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      2      0   384K      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      2      0   384K      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      7      0  1023K      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      7      0  1023K      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G     27      0  3.50M      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G     27      0  3.50M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G    101      0  12.6M      0
  ad14       298G  87.4M      0      0  5.99K      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G    100      0  12.6M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G    127      0  16.0M      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G    127      0  16.0M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      2      0   384K      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      2      0   384K      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G     12      0  1.62M      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G     12      0  1.62M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G     27      0  3.50M      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G     27      0  3.50M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G     27      0  3.50M      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G     27      0  3.50M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G     73      0  9.12M      0
  ad14       298G  87.4M      0      0  5.99K      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G     72      0  9.12M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G    128      0  16.1M      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G    128      0  16.1M      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      5      0   767K      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      5      0   767K      0
----------  -----  -----  -----  -----  -----  -----

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
stash       1.42T  76.3G      7      0  1023K      0
  ad14       298G  87.4M      0      0      0      0
  ad15       298G   174M      0      0      0      0
  ad8        298G   497M      0      0      0      0
  ad10s1d    340G   441M      0      0      0      0
  ad16       223G  75.1G      7      0  1023K      0
----------  -----  -----  -----  -----  -----  -----

--
Ståle Kristoffersen
staalebk@ifi.uio.no

From owner-freebsd-fs@FreeBSD.ORG Wed May 2 05:22:43 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9583016A400 for ; Wed, 2 May 2007 05:22:43 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from mail.bitblocks.com (ns1.bitblocks.com [64.142.15.60]) by mx1.freebsd.org (Postfix) with ESMTP id 8602513C45B for ; Wed, 2 May 2007 05:22:43 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from bitblocks.com (localhost.bitblocks.com [127.0.0.1]) by mail.bitblocks.com (Postfix) with ESMTP id 485FE5B51 for ; Tue, 1 May 2007 22:22:43 -0700 (PDT) To: freebsd-fs@freebsd.org Date: Tue, 01 May 2007 22:22:43 -0700 From: Bakul Shah Message-Id: <20070502052243.485FE5B51@mail.bitblocks.com> Subject: ZFS vs UFS2 overhead and may be a bug? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 05:22:43 -0000

Here is a surprising result for ZFS.
I ran the following script on both ZFS and UFS2 filesystems.

$ dd SPACY                        # 10G zero bytes allocated
$ truncate -s 10G HOLEY           # no space allocated

$ time dd <SPACY >/dev/null bs=1m # A1
$ time dd <HOLEY >/dev/null bs=1m # A2
$ time cat SPACY >/dev/null       # B1
$ time cat HOLEY >/dev/null       # B2
$ time md5 SPACY                  # C1
$ time md5 HOLEY                  # C2

I have summarized the results below (times in seconds).

                        ZFS               UFS2
                  Elapsed  System   Elapsed  System   Test
dd SPACY bs=1m     110.26   22.52    340.38   19.11   A1
dd HOLEY bs=1m      22.44   22.41     24.24   24.13   A2

cat SPACY          119.64   33.04    342.77   17.30   B1
cat HOLEY          222.85  222.08     22.91   22.41   B2

md5 SPACY          210.01   77.46    337.51   25.54   C1
md5 HOLEY          856.39  801.21     82.11   28.31   C2

A1, A2:
Numbers are more or less as expected. When doing large reads, reading from "holes" takes far less time than reading from a real disk. We also see that the UFS2 disk is about 3 times slower for sequential reads.

B1, B2:
UFS2 numbers are as expected, but ZFS numbers for the HOLEY file are much too high. Why should *not* going to a real disk cost more? We also see that UFS2 handles holey files 10 times more efficiently than ZFS!

C1, C2:
Again, the UFS2 numbers and the C1 numbers for ZFS are as expected, but the C2 numbers for ZFS are very high. md5 uses BLKSIZ (== 1k) size reads and makes hardly any other system calls. For ZFS each syscall takes 76.4 microseconds, while UFS2 syscalls are 2.7 us each! zpool iostat shows there is no IO to the real disk, so this implies that for the HOLEY case zfs read calls have a significantly higher overhead, or there is a bug.

Basically the C tests just confirm what we find in the B tests.
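
For anyone who wants to check the per-syscall figures quoted for the C2 test, here is a rough sketch of the arithmetic in Python. It assumes that "10G" means 10 * 2^30 bytes, that md5 issues one read per 1k block, and that all of the reported system time is spent in those reads; none of this is measured here, it only reworks the numbers from the table.

```python
# Rough check of the per-read overhead implied by the md5 HOLEY (C2) row.
# Assumptions (not measured): the file is 10 * 2**30 bytes, md5 issues one
# read per 1 KiB block, and all system time is attributed to those reads.
FILE_SIZE = 10 * 1024**3   # 10G test file
READ_SIZE = 1024           # BLKSIZ == 1k, as stated in the message

reads = FILE_SIZE // READ_SIZE  # number of read calls needed for the whole file


def per_read_us(system_seconds: float) -> float:
    """System CPU time per read call, in microseconds."""
    return system_seconds / reads * 1e6


zfs_us = per_read_us(801.21)   # ZFS, md5 HOLEY, system time from the table
ufs2_us = per_read_us(28.31)   # UFS2, md5 HOLEY, system time from the table

print(f"reads: {reads}")
print(f"ZFS:  {zfs_us:.1f} us per read")
print(f"UFS2: {ufs2_us:.1f} us per read")
print(f"ratio: {zfs_us / ufs2_us:.1f}x")
```

Under those assumptions the result agrees with the 76.4 us and 2.7 us figures in the message, i.e. the ZFS read path for a holey file costs roughly 28 times more system time per call than UFS2.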
From owner-freebsd-fs@FreeBSD.ORG Wed May 2 06:06:07 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 33D2F16A400 for ; Wed, 2 May 2007 06:06:07 +0000 (UTC) (envelope-from arne_woerner@yahoo.com) Received: from web30310.mail.mud.yahoo.com (web30310.mail.mud.yahoo.com [209.191.69.72]) by mx1.freebsd.org (Postfix) with SMTP id E1A2A13C45D for ; Wed, 2 May 2007 06:06:04 +0000 (UTC) (envelope-from arne_woerner@yahoo.com) Received: (qmail 30947 invoked by uid 60001); 2 May 2007 06:06:04 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Subject:To:Cc:In-Reply-To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=CQGQSOtfRjo0PB3Fk3EORyEVHewOrLBE4BqMokGLAYFdppmBY61csyzimpRXIsjeN8Pomt0iDd8YkG2gtrggtP0wMbvBX6p44/+/zrdrHGi6BIaNgZvwHj2BTqHnzYIodGgksuEo02Xp6JNk+xsPmibaNx1kCbFiQOv0XZYSWyI=; X-YMail-OSG: m1.W_2QVM1myJ51BwEuGEVtlbom4m1_wVHsCFqf7x4G0yxpBMPgqspy3CBZWwsFUmyq7dczDWfyKO8q2nKh.MpsKWRoj9nX3YZ1HS.MmQbxhoK4nKFV5pP8S3FLtvg-- Received: from [213.54.0.160] by web30310.mail.mud.yahoo.com via HTTP; Tue, 01 May 2007 23:06:04 PDT Date: Tue, 1 May 2007 23:06:04 -0700 (PDT) From: Arne "Wörner" To: Francisco Reyes In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <75116.30487.qm@web30310.mail.mud.yahoo.com> Cc: freebsd-fs@freebsd.org Subject: Re: distributed filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 06:06:07 -0000 --- Francisco Reyes wrote: > Greg Troxel writes: > > Coda (http://www.coda.cs.cmu.edu/) works well on NetBSD-current, in > > which I just fixed the kernel module to conform to updated/simplified > .. 
> > There's also arla (afs working client, and server that I'm not sure of > > the status). > From a performance perspective would you recommend Coda or Arla? > > Are distributed filesystems fast enough to handle something like a mailstore > for a busy Imap/pop3 server? > Depends... Since Imap/pop3 sounds like services that are limited by network bandwidth, you just have to take care that the network connection between the file servers is fast enough. Then you should just have a little delay (when the data is sent a second time through the network) but no contention. Theoretically: If the fs does lazy updates (just getting a lock on another server and transferring the data later from a local mirror -- as described earlier in a change request for gmirror), it can do updates as fast as it can transfer data to the other server. Reading should be a lot faster, if the write-locks are handled intelligently. -Arne
From owner-freebsd-fs@FreeBSD.ORG Wed May 2 07:20:07 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E227F16A408; Wed, 2 May 2007 07:20:07 +0000 (UTC) (envelope-from ml.freebsd-fs@ledisez.net) Received: from ledisez.net (ledisez.net [80.247.230.138]) by mx1.freebsd.org (Postfix) with ESMTP id 5AEA313C455; Wed, 2 May 2007 07:20:07 +0000 (UTC) (envelope-from ml.freebsd-fs@ledisez.net) Received: from webmail.ledisez.net (localhost.localdomain [80.247.230.138]) by ledisez.net (Postfix) with ESMTP id 2479845AEE8; Wed, 2 May 2007 09:20:05 +0200 (CEST) Received: from 62.212.122.219 (SquirrelMail authenticated user romain) by webmail.ledisez.net with HTTP; Wed, 2 May 2007 09:20:06 +0200 (CEST) Message-ID: <64371.62.212.122.219.1178090406.squirrel@webmail.ledisez.net> In-Reply-To: <51864.84.102.41.193.1178030465.squirrel@webmail.ledisez.net> References: <20070406025700.GB98545@garage.freebsd.pl> <20070407025644.GC8831@cicely12.cicely.de> <20070407131353.GE63916@garage.freebsd.pl> <4617A3A6.60804@kasimir.com> <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> <51864.84.102.41.193.1178030465.squirrel@webmail.ledisez.net> Date: Wed, 2 May 2007 09:20:06 +0200 (CEST) From: "Romain LE DISEZ" To: freebsd-fs@freebsd.org User-Agent: SquirrelMail/1.4.9a MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: Craig Boston , freebsd-current@freebsd.org, Pawel Jakub Dawidek Subject: Re: ZFS committed to the 
FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 07:20:08 -0000 >> It appears the problem persists, but is more difficult to trigger. > > I had the same problem but with the solution of decreasing kern.maxvnodes > to 3/4 of the original value, the problem seems solved. Finally, it seems you're right: "the problem persists, but is more difficult to trigger." >> I had a reboot again during building of ports. I decreased maxvnodes >> further (to about 2/3 of the default, instead of the recommended 3/4 of >> the default). Even after that, I had another reboot just now, also >> during building of ports. >> >> It takes on the order of several hours to trigger it. >> >> Note that I say "reboot" because that's what it was; it appears that >> when this happens when in X (which I suspect is the triggering >> difference) a reboot is triggered immediately, while at the console I >> get the kernel debugger. > > I'm currently running Xorg 7.2 from the experimental port tree (see > freebsd-x11@) and I'm compiling all the ports of gnome2. The compilation > started about 8 hours ago (pfff... that's too long) => no problems. In this > compilation, I'm using different filesystems: port tree is on ZFS+compression and > work-dir is on UFS. When generating the ports database (portsdb -Uu), I got a reboot. From the boot messages, I see it was a "kmem_map too small" panic. I don't know if there is a link, but I was running X.org (7.2). 
From owner-freebsd-fs@FreeBSD.ORG Wed May 2 12:30:45 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1B27916A400 for ; Wed, 2 May 2007 12:30:45 +0000 (UTC) (envelope-from gdt@ir.bbn.com) Received: from fnord.ir.bbn.com (fnord.ir.bbn.com [192.1.100.210]) by mx1.freebsd.org (Postfix) with ESMTP id CED1E13C465 for ; Wed, 2 May 2007 12:30:44 +0000 (UTC) (envelope-from gdt@ir.bbn.com) Received: by fnord.ir.bbn.com (Postfix, from userid 10853) id ADCE952A9; Wed, 2 May 2007 08:30:43 -0400 (EDT) From: Greg Troxel To: Arne Wörner References: <75116.30487.qm@web30310.mail.mud.yahoo.com> X-Hashcash: 1:20:070502:arne_woerner@yahoo.com::LLyNKVtI42B3J/6C:0000000000000000000000000000000000000001Vok X-Hashcash: 1:20:070502:freebsd-fs@freebsd.org::LLyNKVtI42B3J/6C:0000000000000000000000000000000000000000vzU X-Hashcash: 1:20:070502:lists@stringsutils.com::LLyNKVtI42B3J/6C:00000000000000000000000000000000000000019On Date: Wed, 02 May 2007 08:30:37 -0400 In-Reply-To: <75116.30487.qm@web30310.mail.mud.yahoo.com> (arne_woerner@yahoo.com's message of "Tue\, 1 May 2007 23\:06\:04 -0700 \(PDT\)") Message-ID: User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (berkeley-unix) MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature" Cc: freebsd-fs@freebsd.org Subject: Re: distributed filesystems X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 12:30:45 -0000 --=-=-= Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Arne "Wörner" writes: > --- Francisco Reyes wrote: >> Greg Troxel writes: >> > Coda (http://www.coda.cs.cmu.edu/) works well on NetBSD-current, in 
>> > which I just fixed the kernel module to conform to updated/simplified >> .. >> > There's also arla (afs working client, and server that I'm not sure of >> > the status). >> From a performance perspective would you recommend Coda or Arla? >> >> Are distributed filesystems fast enough to handle something like a mailstore >> for a busy Imap/pop3 server? >> > Depends... > > Since Imap/pop3 sounds like services that are limited by network > bandwidth, you just have to take care that the network connection between the file > servers is fast enough. Then you should just have a little delay (when the data > is sent a second time through the network) but no contention. > > Theoretically: > If the fs does lazy updates (just getting a lock on another server and > transferring the data later from a local mirror -- as described earlier in a > change request for gmirror), it can do updates as fast as it can transfer data > to the other server. > Reading should be a lot faster, if the write-locks are handled intelligently. Coda (or any system like it) needs to be used in a way that does not regularly produce conflicts. So having multiple clients write is difficult, especially with the now-normal disconnected mode. If there's only one writer, and others read, it will probably be ok. Some people put CVS repositories in Coda, but I would never do that - I use remote CVS to the server. That means no CVS while disconnected, but it also means no fs-level conflicts in the repository ,v files. 
--=-=-= Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (NetBSD) iD8DBQFGOIRt+vesoDJhHiURApg4AKCSq8/rvU92jg8YMGiGYwITwZXuOQCcCCX+ f1O4q4ZRw9xl2ijQw8OXaIA= =NISp -----END PGP SIGNATURE----- --=-=-=-- From owner-freebsd-fs@FreeBSD.ORG Wed May 2 14:53:51 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4825716A401; Wed, 2 May 2007 14:53:51 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id C7B0C13C45D; Wed, 2 May 2007 14:53:50 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 1A9F546F96; Wed, 2 May 2007 10:53:50 -0400 (EDT) Date: Wed, 2 May 2007 15:53:50 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Rick Macklem In-Reply-To: Message-ID: <20070502154934.E30345@fledge.watson.org> References: <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> <20070501160213.GA496@xor.obsecurity.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Craig Boston , Pawel Jakub Dawidek , freebsd-fs@freebsd.org, freebsd-current@freebsd.org, Kris Kennaway Subject: Re: ZFS committed to the FreeBSD base. 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 14:53:51 -0000 On Tue, 1 May 2007, Rick Macklem wrote: > On Tue, 1 May 2007, Kris Kennaway wrote: > >>> I don't know if it is relevant, but I've seen "kmem_map: too small" panics >>> when testing my NFSv4 server, ever since about FreeBSD5.4. There is no >>> problem running the same server code on FreeBSD4 (which is what I still >>> run in production mode) or OpenBSD3 or 4. If I increase the size of the >>> map, I can delay the panic for up to about two weeks of hard testing, but >>> it never goes away. I don't see any evidence of a memory leak during the >>> several days of testing leading up to the panic. (NFSv4 uses MALLOC/FREE >>> extensively for state related structures.) >> >> Sounds exactly like a memory leak to me. How did you rule it out? > Well, I had a little program running on the server that grabbed the > mti_stats[] out of the kernel and logged them. I had one client mounted > running thousands of passes of the Connectathon basic tests (one client, > same activity over and over and over again). For a week, the stats don't > show any increase in allocation for any type (alloc - free doesn't get > unreasonably big), then..."panic: kmem_map too small". How many days it took > to happen would vary with the size of the kernel map, but no evidence of a > leak prior to the crash. It seemed to be based on the number of times MALLOC > and FREE were called. > > Also, the same server code (except for the port changes, which have nothing > to do with the state handling where MALLOC/FREE get called a lot), works > fine for months on FreeBSD4 and OpenBSD3.9. > > So, I won't say a "memory leak is ruled out", but if there was a leak why > wouldn't it bite FreeBSD4 or show up in mti_stats[]? 
> > I first saw it on FreeBSD6.0, but went back to FreeBSD5.4 and tried the same > test and got the same result. Historically, such panics have been a result of one of two things: (1) An immediate resource leak in UMA(9) or malloc(9) allocated memory. (2) Mis-tuning of a resource limit, perhaps due to sizing the limit based solely on physical memory size, not taking available kernel address space into account. mti_stats reports only on malloc(9); you also need to look at uma(9), since many frequently allocated types are allocated directly with the slab allocator, and not from kernel malloc. Take a look at the output of "show uma" or "show malloc" in DDB, or respectively "vmstat -z" and "vmstat -m" on a core or on a live system. malloc(9) is actually implemented using two different back-ends: UMA-managed fixed size memory buckets for small allocations, and direct page allocation for large allocations. The most frequent example of (2) is mis-tuning in the maximum vnode limit of the system, resulting in the vnode cache exceeding available address space. Try tuning down that limit. Note that vnodes, inodes, and most frequently used file system allocation data types are allocated using uma(9) and not malloc(9). 
Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-fs@FreeBSD.ORG Wed May 2 21:27:01 2007 Return-Path: X-Original-To: freebsd-fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BC08116A402; Wed, 2 May 2007 21:27:01 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from galileo.cs.uoguelph.ca (galileo.cs.uoguelph.ca [131.104.94.215]) by mx1.freebsd.org (Postfix) with ESMTP id 74F5F13C45B; Wed, 2 May 2007 21:27:01 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from muncher.cs.uoguelph.ca (muncher.cs.uoguelph.ca [131.104.96.170]) by galileo.cs.uoguelph.ca (8.13.1/8.13.1) with ESMTP id l42LQwXE019553; Wed, 2 May 2007 17:26:59 -0400 Received: from localhost (rmacklem@localhost) by muncher.cs.uoguelph.ca (8.11.7p3+Sun/8.11.6) with ESMTP id l42LS4n15290; Wed, 2 May 2007 17:28:04 -0400 (EDT) X-Authentication-Warning: muncher.cs.uoguelph.ca: rmacklem owned process doing -bs Date: Wed, 2 May 2007 17:28:04 -0400 (EDT) From: Rick Macklem X-X-Sender: rmacklem@muncher To: Robert Watson In-Reply-To: <20070502154934.E30345@fledge.watson.org> Message-ID: References: <20070407165759.GG8831@cicely12.cicely.de> <20070407180319.GH8831@cicely12.cicely.de> <20070407191517.GN63916@garage.freebsd.pl> <20070407212413.GK8831@cicely12.cicely.de> <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> <20070501160213.GA496@xor.obsecurity.org> <20070502154934.E30345@fledge.watson.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Scanned-By: MIMEDefang 2.57 on 131.104.94.215 Cc: Craig Boston , Pawel Jakub Dawidek , freebsd-fs@FreeBSD.org, freebsd-current@FreeBSD.org, Kris Kennaway Subject: Re: ZFS committed to the FreeBSD base. 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 21:27:01 -0000 On Wed, 2 May 2007, Robert Watson wrote: [stuff snipped] > > Historically, such panics have been a result of one of two things: > > (1) An immediate resource leak in UMA(9) or malloc(9) allocated memory. > > (2) Mis-tuning of a resource limit, perhaps due to sizing the limit based on > solely physical memory size, not taking available kernel address space > into account. > > mti_stats reports only on malloc(9), you need to also look at uma(9), since > many frequently allocated types are allocated directly with the slab > allocator, and not from kernel malloc. Take a look at the output of "show > uma" or "show malloc" in DDB, or respectively "vmstat -z" and "vmstat -m" on > a core or on a live system. malloc(9) is actually implemented using two > different back-ends: UMA-managed fixed size memory buckets for small > allocations, and direct page allocation for large allocations. Ok, it does appear I'm leaking NAMEIs. "vmstat -z", which I didn't know about, was the trick. Handling lookup name buffers is also port specific, so it wouldn't have shown up in the other ports. So, forget what I said w.r.t. a MALLOC bug and thanks for the help. I should be able to locate the leak pretty easily with "vmstat -z". 
Thanks, rick From owner-freebsd-fs@FreeBSD.ORG Wed May 2 22:24:00 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B8CE316A402 for ; Wed, 2 May 2007 22:24:00 +0000 (UTC) (envelope-from cristi@net.utcluj.ro) Received: from bavaria.utcluj.ro (bavaria.utcluj.ro [193.226.5.35]) by mx1.freebsd.org (Postfix) with ESMTP id 24C0813C45B for ; Wed, 2 May 2007 22:24:00 +0000 (UTC) (envelope-from cristi@net.utcluj.ro) Received: from localhost (localhost [127.0.0.1]) by bavaria.utcluj.ro (Postfix) with ESMTP id 8B37C50834 for ; Thu, 3 May 2007 01:23:58 +0300 (EEST) X-Virus-Scanned: by the daemon playing with your mail on bavaria.utcluj.ro Received: from bavaria.utcluj.ro ([127.0.0.1]) by localhost (bavaria.utcluj.ro [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id n6Pt-r1OHo5S for ; Thu, 3 May 2007 01:23:54 +0300 (EEST) Received: from [172.27.2.200] (c7.campus.utcluj.ro [193.226.6.226]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by bavaria.utcluj.ro (Postfix) with ESMTP id 981DA5082B for ; Thu, 3 May 2007 01:23:54 +0300 (EEST) Message-ID: <46390F78.5080206@net.utcluj.ro> Date: Thu, 03 May 2007 01:23:52 +0300 From: Cristian KLEIN Organization: Data Communication Center - Technical University of Cluj-Napoca User-Agent: Thunderbird 1.5.0.10 (X11/20070306) MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <59558.86.125.188.48.1177802342.squirrel@intranet.utcluj.ro> <4637A640.6050700@freebsd.org> In-Reply-To: <4637A640.6050700@freebsd.org> X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=iso-8859-2 Content-Transfer-Encoding: 7bit Subject: Re: panic: softdep_setup_inomapdep: found inode already exists in 6.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Wed, 02 May 2007 22:24:00 -0000 On Tue, May 1, 2007 11:42 pm, Eric Anderson wrote: > On 04/28/07 18:19, Cristian KLEIN wrote: > >> Hi everybody, >> >> >> I am running FreeBSD 6.2-p3, on which I am experiencing exactly the >> same symptoms as one item of the TODO list of 6.0: >> http://www.freebsd.org/releases/6.0R/todo.html >> >> >> panic: softdep_setup_inomapdep: found inode Needs testing Tor Egge >> Found by stress tests at >> http://www.holm.cc/stress/log/cons138.html >> >> >> Does anybody know whether this bug should have been solved in 6.2? >> Should >> I file a PR? >> > > > Sorry if I missed it, but were you able to provide a backtrace? If you > can, you should compile your kernel with debugging, so at least you could > make a little more out of the crash. See the handbook if you need help on > that. Hi, I haven't mentioned any technical details yet, as I wasn't sure whether this issue is known or not. First, let me tell you what I did. I wanted to better protect data on /jail/mail/home by doing daily snapshots, saved like /jail/mail/home/.snap/2007-04-03-03-22-02. I mounted one of these snapshots (not the latest) in /mnt/home and rsync'd it to a server. I might have rsync'd while taking a new snapshot. The server started crashing randomly, either while rsync-ing, or in the morning, during heavy load. I have now removed the snapshots and the system is stable. Other random information that might be useful: I am using gmirror and jail. Disabling SMP has no effect; WITNESS didn't say anything. The filesystem has user quotas. The filesystem stores maildir and has about 1.6M inodes. Config is GENERIC + SMP + QUOTA - unused hardware devices. 
root# uname -a FreeBSD bavaria.xxxxxx 6.2-RELEASE-p3 FreeBSD 6.2-RELEASE-p3 #5: Fri Apr 27 20:01:20 EEST 2007 cristi@bavaria.xxxxxx:/usr/obj/usr/src/sys/BAVARIA-SMP i386 root# kgdb kernel.debug /var/crash/vmcore.3 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"] GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-marcel-freebsd". Unread portion of the kernel message buffer: panic: softdep_setup_inomapdep: found inode already exists cpuid = 0 KDB: enter: panic Uptime: 5h42m12s Dumping 1023 MB (2 chunks) chunk 0: 1MB (159 pages) ... ok chunk 1: 1023MB (261837 pages) 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143 127 111 95 79 63 47 31 15 #0 doadump () at pcpu.h:165 165 __asm __volatile("movl %%fs:0,%0" : "=r" (td)); (kgdb) bt #0 doadump () at pcpu.h:165 #1 0xc0545e18 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409 #2 0xc0546149 in panic (fmt=0xc075660c "softdep_setup_inomapdep: found inode already exists") at /usr/src/sys/kern/kern_shutdown.c:565 #3 0xc068c64a in softdep_setup_inomapdep (bp=0xd8c2ba68, ip=0x0, newinum=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1527 #4 0xc067d4dd in ffs_nodealloccg (ip=0xc68e818c, cg=0, ipref=Unhandled dwarf expression opcode 0x93 ) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1762 #5 0xc067bb83 in ffs_hashalloc (ip=0xc68e818c, cg=0, pref=Unhandled dwarf expression opcode 0x93 ) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1248 #6 0xc067b232 in ffs_valloc (pvp=0xc6883440, mode=33024, 
cred=0xc50d3a80, vpp=0xe733d4e8) at /usr/src/sys/ufs/ffs/ffs_alloc.c:932 #7 0xc06a6bd7 in ufs_makeinode (mode=33024, dvp=0xc6883440, vpp=0xe733d8f0, cnp=0xe733d904) at /usr/src/sys/ufs/ufs/ufs_vnops.c:2220 #8 0xc06a348f in ufs_create (ap=0x0) at /usr/src/sys/ufs/ufs/ufs_vnops.c:189 #9 0xc071e879 in VOP_CREATE_APV (vop=0x0, a=0xe733d7fc) at vnode_if.c:204 #10 0xc0684018 in ffs_snapshot (mp=0xc4e28000, snapfile=0xc699c200 "/jail/mail/home/.snap/2007-04-28-03-34-31") at vnode_if.h:111 #11 0xc06964d7 in ffs_mount (mp=0xc4e28000, td=0xc50e1300) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:312 #12 0xc059eb6e in vfs_domount (td=0xc50e1300, fstype=0xc4efac90 "ufs", fspath=0xc4efa670 "/jail/mail/home", fsflags=16842752, fsdata=0xc4efaca0) at /usr/src/sys/kern/vfs_mount.c:928 #13 0xc059e1da in vfs_donmount (td=0x0, fsflags=16842752, fsoptions=0x0) at /usr/src/sys/kern/vfs_mount.c:676 #14 0xc05a0f84 in kernel_mount (ma=0xc4efac50, flags=0) at pcpu.h:162 #15 0xc06966fb in ffs_cmount (ma=0xc4efac50, data=0x0, flags=0, td=0xc50e1300) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:392 #16 0xc059e3f0 in mount (td=0xc50e1300, uap=0xe733dd04) at /usr/src/sys/kern/vfs_mount.c:742 #17 0xc070b5bb in syscall (frame= {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = -1077944004, tf_esi = -1077941276, tf_ebp = -1077943864, tf_isp = -416031388, tf_ebx = -1077943824, tf_edx = -1, tf_ecx = -1077940507, tf_eax = 21, tf_trapno = 12, tf_err = 2, tf_eip = 671885143, tf_cs = 51, tf_eflags = 582, tf_esp = -1077944036, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:983 #18 0xc06f60df in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200 #19 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) Now that I have KDB on serial console I might be able to crash (or even better, test patches) at night on that system. Please don't hesitate to request further information. I will try to elaborate a recipe for crashing the system. 
From owner-freebsd-fs@FreeBSD.ORG Thu May 3 04:09:17 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5736F16A402; Thu, 3 May 2007 04:09:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from relay02.kiev.sovam.com (relay02.kiev.sovam.com [62.64.120.197]) by mx1.freebsd.org (Postfix) with ESMTP id DEEC613C44C; Thu, 3 May 2007 04:09:16 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from [212.82.216.227] (helo=fw.zoral.com.ua) by relay02.kiev.sovam.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.60) (envelope-from ) id 1HjSd0-0003lQ-Kt; Thu, 03 May 2007 07:09:15 +0300 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by fw.zoral.com.ua (8.13.4/8.13.4) with ESMTP id l4349Atb017194 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 3 May 2007 07:09:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.1/8.14.1) with ESMTP id l4349Ai2018654; Thu, 3 May 2007 07:09:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.1/8.14.1/Submit) id l43497nK018653; Thu, 3 May 2007 07:09:07 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 3 May 2007 07:09:07 +0300 From: Kostik Belousov To: Rick Macklem Message-ID: <20070503040907.GK2441@deviant.kiev.zoral.com.ua> References: <20070410003505.GA8189@nowhere> <46365F76.7090708@infidyne.com> <20070430213043.GF67738@garage.freebsd.pl> <463665F2.8090605@infidyne.com> <46373CAD.6000502@infidyne.com> <20070501160213.GA496@xor.obsecurity.org> <20070502154934.E30345@fledge.watson.org> Mime-Version: 1.0 Content-Type: multipart/signed; 
micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Vwmj/TXzE7NEH899" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.2i X-Virus-Scanned: ClamAV version 0.88.7, clamav-milter version 0.88.7 on fw.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-0.1 required=5.0 tests=ALL_TRUSTED,SPF_NEUTRAL autolearn=failed version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on fw.zoral.com.ua X-Scanner-Signature: fa718887f43e9e3967a9d10f43843ba4 X-DrWeb-checked: yes X-SpamTest-Envelope-From: kostikbel@gmail.com X-SpamTest-Group-ID: 00000000 X-SpamTest-Info: Profiles 1016 [May 02 2007] X-SpamTest-Info: helo_type=3 X-SpamTest-Info: {received from trusted relay: not dialup} X-SpamTest-Method: none X-SpamTest-Method: Local Lists X-SpamTest-Rate: 0 X-SpamTest-Status: Not detected X-SpamTest-Status-Extended: not_detected X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0255], KAS30/Release Cc: Craig Boston , Pawel Jakub Dawidek , freebsd-fs@freebsd.org, freebsd-current@freebsd.org, Robert Watson , Kris Kennaway Subject: Re: ZFS committed to the FreeBSD base. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 04:09:17 -0000 --Vwmj/TXzE7NEH899 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, May 02, 2007 at 05:28:04PM -0400, Rick Macklem wrote: > > > On Wed, 2 May 2007, Robert Watson wrote: > [stuff snipped] > > > >Historically, such panics have been a result of one of two things: > > > >(1) An immediate resource leak in UMA(9) or malloc(9) allocated memory. > > > >(2) Mis-tuning of a resource limit, perhaps due to sizing the limit based > >on > > solely physical memory size, not taking available kernel address space > > into account. 
> > > >mti_stats reports only on malloc(9), you need to also look at uma(9), > >since many frequently allocated types are allocated directly with the slab > >allocator, and not from kernel malloc. Take a look at the output of "show > >uma" or "show malloc" in DDB, or respectively "vmstat -z" and "vmstat -m" > >on a core or on a live system. malloc(9) is actually implemented using > >two different back-ends: UMA-managed fixed size memory buckets for small > >allocations, and direct page allocation for large allocations. > > Ok, it does appear I'm leaking NAMEIs. "vmstat -z", which I didn't know > about, was the trick. Handling lookup name buffers is also port specific, > so it wouldn't have shown up in the other ports. > > So, forget what I said w.r.t. a MALLOC bug and thanks for the help. I > should be able to locate the leak pretty easily with "vmstat -z". I fixed two NAMEI zone leaks in the last 2-3 months. One was in the NFS server (the fix should be present in 6.2-RELEASE, AFAIR); the second was in the UFS snapshotting code and was MFCed several days ago. 
--Vwmj/TXzE7NEH899 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) iD8DBQFGOWBjC3+MBN1Mb4gRArLlAJwLnxwBeDgtpPM02z46i/XXKE3wqQCfZWqG 8R3zc+4s7voa0bqTtixr5yY= =OghD -----END PGP SIGNATURE----- --Vwmj/TXzE7NEH899-- From owner-freebsd-fs@FreeBSD.ORG Thu May 3 19:59:43 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9D6C616A400; Thu, 3 May 2007 19:59:43 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from mail.bitblocks.com (ns1.bitblocks.com [64.142.15.60]) by mx1.freebsd.org (Postfix) with ESMTP id 7EA1213C457; Thu, 3 May 2007 19:59:43 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from bitblocks.com (localhost.bitblocks.com [127.0.0.1]) by mail.bitblocks.com (Postfix) with ESMTP id 47C3A5B2E; Thu, 3 May 2007 12:59:43 -0700 (PDT) To: Pawel Jakub Dawidek In-reply-to: Your message of "Fri, 27 Apr 2007 16:26:06 +0200." <20070427142606.GK49413@garage.freebsd.pl> Date: Thu, 03 May 2007 12:59:43 -0700 From: Bakul Shah Message-Id: <20070503195943.47C3A5B2E@mail.bitblocks.com> Cc: freebsd-fs@freebsd.org Subject: Re: ZFS: kmem_map too small panic again X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 19:59:43 -0000 > On Fri, Apr 27, 2007 at 12:35:35AM +0200, Pawel Jakub Dawidek wrote: > > On Thu, Apr 26, 2007 at 02:33:47PM -0700, Bakul Shah wrote: > > > An update: > > > > > > I reverted sources to Apr 24 16:49 UTC and rebuilt the > > > kernel and the bug goes away -- I was able to restore 53GB > > > (840K+ inodes) and do a bunch of du with no problems. > > > > > > But the bug remains on a kernel with the latest zfs changes. 
> > > All I have to do is run du a couple of times in the restored > > > tree to crash the system. There is no crash with multiple du > > > on a similarly sized UFS2, only on ZFS. This is on an > > > Athlon64 X2 Dual Core Processor 3800+ running in 32 bit mode. > > > The exact message is: > > > > > > panic: kmem_malloc(98304): kmem_map too small: 335478784 total allocated > > > > I can reproduce it and I'm working on it. > > The problem is that kern.maxvnodes is tuned based on vnode+UFS_inode > size. In case of ZFS, the size of vnode+ZFS_znode_dnode+dmu_buf is > larger. As a work-around just decrease kern.maxvnodes to something like > 3/4 of the current value.
Pawel, thank you for this fix; I have been running -current with it for a few days but as others have reported, this does not fix the problem, only makes it much less likely -- or maybe there is another problem. At least now I can get a crash dump!
I have two filesystems in one pool with about 1.28M inodes altogether. Based on a few trials it seems it is necessary to walk them both before triggering this panic (or maybe it is a function of how many inodes are statted).
Every second I sent the output of vmstat -z to another machine during testing. Nothing pops out as obviously wrong but here are some things worth looking at:
$ grep ^512 vmstream | firstlast
512: 512, 0, 466, 38,
512: 512, 0, 131491, 3669,
$ grep ' Slab' vmstream | firstlast
UMA Slabs: 64, 0, 2516, 21,
UMA Slabs: 64, 0, 23083, 222,
$ grep dmu vmstream | firstlast
dmu_buf_impl_t: 140, 0, 912, 68,
dmu_buf_impl_t: 140, 0, 136034, 3938,
$ grep znode vmstream | firstlast
zfs_znode_cache: 236, 0, 261, 43,
zfs_znode_cache: 236, 0, 64900, 5596,
# firstlast displays the first and last line of its input.
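[Editorial sketch, not part of the original message.] The first/last comparison of vmstat -z samples quoted above could be done roughly as follows. This is only a minimal sketch under stated assumptions: each sample line is assumed to have the form "name: size, limit, used, free," (the column meaning is taken from the vmstat -z header of that era and is an assumption here), and the helper names firstlast, parse_zone and used_bytes are hypothetical; the original firstlast script was not posted.

```python
def firstlast(lines):
    """Return the first and last item of a sequence (hypothetical stand-in
    for the 'firstlast' filter mentioned in the message)."""
    lines = list(lines)
    if len(lines) <= 1:
        return lines
    return [lines[0], lines[-1]]


def parse_zone(line):
    """Split 'name: size, limit, used, free,' into a tuple.

    Assumption: the first four numeric columns are SIZE, LIMIT, USED, FREE.
    """
    name, _, rest = line.partition(":")
    fields = [int(f) for f in rest.split(",") if f.strip()]
    size, limit, used, free = fields[:4]
    return name.strip(), size, limit, used, free


def used_bytes(line):
    """Approximate memory held by in-use items of a zone: SIZE * USED."""
    _, size, _, used, _ = parse_zone(line)
    return size * used


# The two dmu_buf_impl_t samples quoted in the message.
samples = [
    "dmu_buf_impl_t: 140, 0, 912, 68,",
    "dmu_buf_impl_t: 140, 0, 136034, 3938,",
]
first, last = firstlast(samples)
growth = used_bytes(last) - used_bytes(first)
print("dmu_buf_impl_t grew by about %.1f MB" % (growth / 1e6))
```

Applied to the other zones in the same way, this gives a rough per-zone footprint to set against the "335478784 total allocated" figure in the panic message; it ignores free items and slab overhead, so it is a lower bound at best.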
From owner-freebsd-fs@FreeBSD.ORG Thu May 3 20:09:15 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B268116A403 for ; Thu, 3 May 2007 20:09:15 +0000 (UTC) (envelope-from cristi@net.utcluj.ro) Received: from bavaria.utcluj.ro (bavaria.utcluj.ro [193.226.5.35]) by mx1.freebsd.org (Postfix) with ESMTP id 6BA9B13C45A for ; Thu, 3 May 2007 20:09:15 +0000 (UTC) (envelope-from cristi@net.utcluj.ro) Received: from localhost (localhost [127.0.0.1]) by bavaria.utcluj.ro (Postfix) with ESMTP id 88AE75085B for ; Thu, 3 May 2007 23:09:08 +0300 (EEST) X-Virus-Scanned: by the daemon playing with your mail on bavaria.utcluj.ro Received: from bavaria.utcluj.ro ([127.0.0.1]) by localhost (bavaria.utcluj.ro [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6Vffo4rGufMa for ; Thu, 3 May 2007 23:09:04 +0300 (EEST) Received: from [172.27.2.200] (c7.campus.utcluj.ro [193.226.6.226]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by bavaria.utcluj.ro (Postfix) with ESMTP id AA58050834 for ; Thu, 3 May 2007 23:09:04 +0300 (EEST) Message-ID: <463A4165.3020307@net.utcluj.ro> Date: Thu, 03 May 2007 23:09:09 +0300 From: Cristian KLEIN Organization: Data Communication Center - Technical University of Cluj-Napoca User-Agent: Thunderbird 1.5.0.10 (X11/20070306) MIME-Version: 1.0 To: freebsd-fs@freebsd.org X-Enigmail-Version: 0.94.0.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Quotas not working in Jail X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 20:09:15 -0000 Hi everybody, As many persons might have mentioned, non-superusers running inside a jail are unable to query their quota, unless 
they have direct access to "quota.user". This is inconvenient for two reasons:

1) Users will be able to read other users' quotas as well.
2) Many applications (for example dovecot) use quotactl() to retrieve the
   user's quota, so giving access to quota.user won't help.

All this is caused by the following piece of code in vfs_syscalls.c:

--- cut here ---
/* XXX PRISON: could be per prison flag */
static int prison_quotas;
#if 0
SYSCTL_INT(_kern_prison, OID_AUTO, quotas, CTLFLAG_RW, &prison_quotas, 0, "");
#endif
--- and here ---

Does anybody know why '#if 0' is there? Considering that sysctls defined
this way cannot be written from inside a jail, and that the default is
zero, I think it is safe to remove the '#if 0'. I changed 'prison_quotas'
from KDB on one of my servers, and everything seems to be fine. I can
finally show quotas to my users. :)

Another thing that bothers me: why does quota(1) read the quota.* files
directly when quotactl(2) fails? As I understood from a post, quota
caching within the kernel might render writing to quota.* useless. In
addition, quota handling should strictly be the kernel's job. Userspace
should neither know nor care how the kernel stores its quotas.

My $0.02.
From owner-freebsd-fs@FreeBSD.ORG Thu May 3 20:09:37 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8F38E16A403 for ; Thu, 3 May 2007 20:09:37 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 15DE513C484 for ; Thu, 3 May 2007 20:09:36 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 25D4348A32; Thu, 3 May 2007 21:10:31 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id BF186487FB; Thu, 3 May 2007 21:10:25 +0200 (CEST) Date: Thu, 3 May 2007 21:09:56 +0200 From: Pawel Jakub Dawidek To: Kenneth Vestergaard Schmidt Message-ID: <20070503190956.GC7177@garage.freebsd.pl> References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="rQ2U398070+RC21q" Content-Disposition: inline In-Reply-To: X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 User-Agent: mutt-ng/devel-r804 (FreeBSD) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Sun Fire X4500, FreeBSD and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 20:09:37 -0000 --rQ2U398070+RC21q Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline 
Content-Transfer-Encoding: quoted-printable On Tue, May 01, 2007 at 10:33:10AM +0200, Kenneth Vestergaard Schmidt wrote: > Mjello. >=20 > Just thought I'd say that we've got a Sun Fire X4500 running with > -CURRENT as of yesterday and ZFS. Works beautifully, after we disabled > MSI and increased VM_KMEM_SIZE_MAX. >=20 > Without the increased VM_KMEM_SIZE_MAX, we got the usual panic (kmem_map > too small). I haven't tried adjusting maxvnodes - that might also have > helped. However, the machine has 16 GB RAM, so it might as well be used > for something. I'm not quite sure how to tweak the box efficiently, but > for now the bottleneck is our network, so we're going to upgrade some > pieces and try again. >=20 > We configured the 48 drives as follows: >=20 > - ad52 and ad60 are magic - the BIOS is hardcoded to boot from them, so > we put them in a gmirror > - 5 RAIDZ2's, each with 9 disks, for a usable total of 7 per array > - one global hotspare >=20 > # zpool list > NAME SIZE USED AVAIL CAP HEALTH ALTROOT > void 20.3T 62.1G 20.3T 0% ONLINE - >=20 > # zfs list > NAME USED AVAIL REFER MOUNTPOINT > void 48.2G 15.5T 41.9K /void >=20 > All in all, a fun little toy :) If it's just a little toy for your, maybe you want to replace it with my teddy bear?:) Great to hear that this beast works with FreeBSD!! Any chance we can trick you into performance comparsion between Solaris/ZFS and FreeBSD/ZFS?:) --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
--rQ2U398070+RC21q Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFGOjOEForvXbEpPzQRAuaRAJ9ltDAeW/DIFLaK1j+JBov820ArawCgrvx/ lpeOBRYJdErnr8Fp0qOXD3Q= =yviu -----END PGP SIGNATURE----- --rQ2U398070+RC21q-- From owner-freebsd-fs@FreeBSD.ORG Thu May 3 20:09:37 2007 Return-Path: X-Original-To: freebsd-fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D722016A407 for ; Thu, 3 May 2007 20:09:37 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 3D95013C487 for ; Thu, 3 May 2007 20:09:36 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 04F7D48A2F; Thu, 3 May 2007 21:12:12 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 9E0C248A2E; Thu, 3 May 2007 21:12:07 +0200 (CEST) Date: Thu, 3 May 2007 21:11:39 +0200 From: Pawel Jakub Dawidek To: St?le Kristoffersen Message-ID: <20070503191139.GD7177@garage.freebsd.pl> References: <46205338.3090803@barryp.org> <20070415111955.GB16971@garage.freebsd.pl> <46224706.4010704@barryp.org> <20070422212019.GJ52622@garage.freebsd.pl> <20070423105619.GA14400@eschew.pusen.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="LTeJQqWS0MN7I/qa" Content-Disposition: inline In-Reply-To: <20070423105619.GA14400@eschew.pusen.org> X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 User-Agent: mutt-ng/devel-r804 (FreeBSD) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl 
X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@FreeBSD.org Subject: Re: ZFS raidz device replacement problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 20:09:37 -0000 --LTeJQqWS0MN7I/qa Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Apr 23, 2007 at 12:56:19PM +0200, St?le Kristoffersen wrote: > On 2007-04-22 at 23:20, Pawel Jakub Dawidek wrote: > > I just committed a fix for 'zpool status -v'. It should now show > > actually file names if corruption is related to file's data. >=20 > I tried the fix and it did work. However, when I deleted the file and > restored it from backup it still showed that the pool had one error. > (Showing only 0x62b as filename again). Shouldn't ZFS automatically clear > the error when deleting the file? >=20 > I tried exporting and importing the pool, didn't change anything, I then > ran a scrub and it fixed it: > scrub: scrub completed with 0 errors on Mon Apr 23 09:45:36 2007 > errors: No known data errors >=20 > I would have preferred if I didn't have to scrub the pool, would it be ha= rd > to fix it on delete? 'zpool clear ' should do the trick. --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
--LTeJQqWS0MN7I/qa Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFGOjPqForvXbEpPzQRAlXdAJ0azudI3iZCIdTpSOAQs8mVQyFPwQCgjlo8 j9uu4fflE6XNl/iCKvAwy04= =kyIQ -----END PGP SIGNATURE----- --LTeJQqWS0MN7I/qa-- From owner-freebsd-fs@FreeBSD.ORG Thu May 3 20:26:17 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9B0B616A400 for ; Thu, 3 May 2007 20:26:17 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id CF79113C484 for ; Thu, 3 May 2007 20:26:16 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id B13CF48803; Thu, 3 May 2007 21:07:01 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 4C21645B26; Thu, 3 May 2007 21:06:55 +0200 (CEST) Date: Thu, 3 May 2007 21:06:26 +0200 From: Pawel Jakub Dawidek To: Bakul Shah Message-ID: <20070503190626.GB7177@garage.freebsd.pl> References: <20070502052243.485FE5B51@mail.bitblocks.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="zx4FCpZtqtKETZ7O" Content-Disposition: inline In-Reply-To: <20070502052243.485FE5B51@mail.bitblocks.com> X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 User-Agent: mutt-ng/devel-r804 (FreeBSD) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS vs 
UFS2 overhead and may be a bug? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 20:26:17 -0000 --zx4FCpZtqtKETZ7O Content-Type: text/plain; charset=iso-8859-2 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, May 01, 2007 at 10:22:43PM -0700, Bakul Shah wrote: > Here is a surprising result for ZFS. >=20 > I ran the following script on both ZFS and UF2 filesystems. >=20 > $ dd SPACY# 10G zero bytes allocated > $ truncate -s 10G HOLEY # no space allocated >=20 > $ time dd /dev/null bs=3D1m # A1 > $ time dd /dev/null bs=3D1m # A2 > $ time cat SPACY >/dev/null # B1 > $ time cat HOLEY >/dev/null # B2 > $ time md5 SPACY # C1 > $ time md5 HOLEY # C2 >=20 > I have summarized the results below. >=20 > ZFS UFS2 > Elapsed System Elapsed System Test > dd SPACY bs=3D1m 110.26 22.52 340.38 19.11 A1 > dd HOLEY bs=3D1m 22.44 22.41 24.24 24.13 A2 >=20 > cat SPACY 119.64 33.04 342.77 17.30 B1 > cat HOLEY 222.85 222.08 22.91 22.41 B2 >=20 > md5 SPACY 210.01 77.46 337.51 25.54 C1=09 > md5 HOLEY 856.39 801.21 82.11 28.31 C2 >=20 >=20 > A1, A2: > Numbers are more or less as expected. When doing large > reads, reading from "holes" takes far less time than from a > real disk. We also see that UFS2 disk is about 3 times > slower for sequential reads. >=20 > B1, B2: > UFS2 numbers are as expected but ZFS numbers for the HOLEY > file are much too high. Why should *not* going to a real > disk cost more? We also see that UFS2 handles holey files 10 > times more efficiently than ZFS! >=20 > C1, C2: > Again UFS2 numbers and C1 numbers for ZFS are as expected. > but C2 numbers for ZFS are very high. md5 uses BLKSIZ (=3D=3D > 1k) size reads and does hardly any other system calls. For > ZFS each syscall takes 76.4 microseconds while UFS2 syscalls > are 2.7 us each! 
zpool iostat shows there is no IO to the > real disk so this implies that for the HOLEY case zfs read > calls have a significantly higher overhead or there is a bug. >=20 > Basically C tests just confirm what we find in B tests. Interesting. There are two problems. First is that cat(1) uses st_blksize to find out best size of I/O request and we force it to PAGE_SIZE, which is very, very wrong for ZFS - it should be equal to recordsize. I need to find discussion about this: /* * According to www.opengroup.org, the meaning of st_blksize is=20 * "a filesystem-specific preferred I/O block size for this=20 * object. In some filesystem types, this may vary from file * to file" * Default to PAGE_SIZE after much discussion. * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more * correct. */ sb->st_blksize =3D PAGE_SIZE; For example cp(1) just uses MAXBSIZE, which is also not really good, but at least MAXBSIZE is much bigger than PAGE_SIZE (it's 64kB). So bascially what you observed with cat(1) is equivalent of running dd(1) with bs=3D4k. I tested it on Solaris and this is not FreeBSD-specific problem, the same is on Solaris. Is there a chance you could send your observations to zfs-discuss@opensolaris.org, but just comparsion between dd(1) with bs=3D128k and bs=3D4k (the other tests might be confusing). --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
--zx4FCpZtqtKETZ7O Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFGOjKyForvXbEpPzQRAoQwAKDgChwpzr9EYsqBHvY4hqU+Mx1CJQCgy/py mvP2jD6v75vTaL1Cge4kHns= =iCzX -----END PGP SIGNATURE----- --zx4FCpZtqtKETZ7O-- From owner-freebsd-fs@FreeBSD.ORG Thu May 3 20:28:47 2007 Return-Path: X-Original-To: freebsd-fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BC34016A406 for ; Thu, 3 May 2007 20:28:47 +0000 (UTC) (envelope-from staalebk@ifi.uio.no) Received: from smtp.bluecom.no (smtp.bluecom.no [193.75.75.28]) by mx1.freebsd.org (Postfix) with ESMTP id 5F88313C447 for ; Thu, 3 May 2007 20:28:47 +0000 (UTC) (envelope-from staalebk@ifi.uio.no) Received: from eschew.pusen.org (unknown [193.69.145.10]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by smtp.bluecom.no (Postfix) with ESMTP id 0315A16F7E6; Thu, 3 May 2007 22:28:46 +0200 (CEST) Received: from chiller by eschew.pusen.org with local (Exim 4.50) id 1Hjhv1-0001lA-01; Thu, 03 May 2007 22:28:51 +0200 Date: Thu, 3 May 2007 22:28:50 +0200 From: =?iso-8859-1?Q?St=E5le?= Kristoffersen To: Pawel Jakub Dawidek Message-ID: <20070503202850.GA28808@eschew.pusen.org> References: <46205338.3090803@barryp.org> <20070415111955.GB16971@garage.freebsd.pl> <46224706.4010704@barryp.org> <20070422212019.GJ52622@garage.freebsd.pl> <20070423105619.GA14400@eschew.pusen.org> <20070503191139.GD7177@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20070503191139.GD7177@garage.freebsd.pl> User-Agent: Mutt/1.5.13 (2006-08-11) Cc: freebsd-fs@FreeBSD.org Subject: Re: ZFS raidz device replacement problem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 20:28:47 -0000 On 2007-05-03 at 21:11, Pawel Jakub Dawidek wrote: > On Mon, Apr 23, 2007 at 12:56:19PM +0200, St?le Kristoffersen wrote: > > On 2007-04-22 at 23:20, Pawel Jakub Dawidek wrote: > > > I just committed a fix for 'zpool status -v'. It should now show > > > actually file names if corruption is related to file's data. > > > > I tried the fix and it did work. However, when I deleted the file and > > restored it from backup it still showed that the pool had one error. > > (Showing only 0x62b as filename again). Shouldn't ZFS automatically clear > > the error when deleting the file? > > > > I tried exporting and importing the pool, didn't change anything, I then > > ran a scrub and it fixed it: > > scrub: scrub completed with 0 errors on Mon Apr 23 09:45:36 2007 > > errors: No known data errors > > > > I would have preferred if I didn't have to scrub the pool, would it be hard > > to fix it on delete? > > 'zpool clear ' should do the trick. I forgot to mention that I did try that, and that seems like it only clears the numbers. -- Ståle Kristoffersen staalebk@ifi.uio.no From owner-freebsd-fs@FreeBSD.ORG Thu May 3 21:15:03 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AD7B116A401; Thu, 3 May 2007 21:15:03 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from mail.bitblocks.com (mail.bitblocks.com [64.142.15.60]) by mx1.freebsd.org (Postfix) with ESMTP id 923C813C447; Thu, 3 May 2007 21:15:03 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from bitblocks.com (localhost.bitblocks.com [127.0.0.1]) by mail.bitblocks.com (Postfix) with ESMTP id 56DAF5B2E; Thu, 3 May 2007 14:15:03 -0700 (PDT) To: Pawel Jakub Dawidek In-reply-to: Your message of "Thu, 03 May 2007 21:06:26 +0200." 
<20070503190626.GB7177@garage.freebsd.pl> Date: Thu, 03 May 2007 14:15:03 -0700 From: Bakul Shah Message-Id: <20070503211503.56DAF5B2E@mail.bitblocks.com> Cc: freebsd-fs@freebsd.org Subject: Re: ZFS vs UFS2 overhead and may be a bug? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 03 May 2007 21:15:03 -0000 > Interesting. There are two problems. First is that cat(1) uses > st_blksize to find out best size of I/O request and we force it to > PAGE_SIZE, which is very, very wrong for ZFS - it should be equal to > recordsize. I need to find discussion about this: > > /* > * According to www.opengroup.org, the meaning of st_blksize is > * "a filesystem-specific preferred I/O block size for this > * object. In some filesystem types, this may vary from file > * to file" > * Default to PAGE_SIZE after much discussion. > * XXX: min(PAGE_SIZE, vp->v_bufobj.bo_bsize) may be more > * correct. > */ > > sb->st_blksize = PAGE_SIZE; This does seem suboptimal. Almost always one reads an entire file and the overhead of going to the disk is high enough that one may as well read small files in one syscall. Apps that want to keep lots and lots of files open can always adjust the buffer size. Since disk seek access time is the largest cost component, ideally contiguously allocated data should be read in one access in order to avoid any extra seeks. At the very least st_blksize should be as large as the minimum unit of contiguous allocation (== filesystem block size). Even V7 unix had this! > I tested it on Solaris and this is not FreeBSD-specific problem, the > same is on Solaris. Is there a chance you could send your observations > to zfs-discuss@opensolaris.org, but just comparsion between dd(1) with > bs=128k and bs=4k (the other tests might be confusing). I just did so. 
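
To make the st_blksize point concrete, here is a small Python sketch. It is not what cat(1) does internally; it only shows how a program can ask for the preferred I/O size via stat and how the number of read() calls depends on the buffer size it picks. The helper names are hypothetical, and whether the temporary file is really sparse depends on the filesystem holding the temp directory.

```python
import os
import tempfile


def preferred_io_size(path):
    """Return st_blksize, the I/O size the filesystem reports for this file."""
    return os.stat(path).st_blksize


def count_reads(path, bufsize):
    """Read the whole file with raw read() calls of bufsize bytes.

    Returns (bytes read, number of read() calls that returned data).
    """
    total = calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, bufsize)
            if not chunk:
                break
            total += len(chunk)
            calls += 1
    finally:
        os.close(fd)
    return total, calls


def make_sparse_file(size):
    """Create a temporary file of `size` bytes without writing any data,
    similar in spirit to `truncate -s`."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.truncate(size)
        return f.name


if __name__ == "__main__":
    path = make_sparse_file(1 << 20)
    try:
        print("st_blksize:", preferred_io_size(path))
        for bufsize in (1024, 4096, 128 * 1024):
            print(bufsize, count_reads(path, bufsize))
    finally:
        os.unlink(path)
```

A program that sizes its buffer from st_blksize, as cat(1) is described as doing, inherits whatever the kernel reports there, which is why a PAGE_SIZE answer on ZFS turns into many small reads.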
From owner-freebsd-fs@FreeBSD.ORG Fri May 4 05:45:30 2007 Return-Path: X-Original-To: freebsd-fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 362B416A403; Fri, 4 May 2007 05:45:30 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2-3.pacific.net.au [61.8.2.226]) by mx1.freebsd.org (Postfix) with ESMTP id F40E013C45D; Fri, 4 May 2007 05:45:29 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.2.162]) by mailout2.pacific.net.au (Postfix) with ESMTP id 90147209268; Fri, 4 May 2007 15:45:10 +1000 (EST) Received: from besplex.bde.org (katana.zip.com.au [61.8.7.246]) by mailproxy1.pacific.net.au (Postfix) with ESMTP id 50E438C08; Fri, 4 May 2007 15:45:16 +1000 (EST) Date: Fri, 4 May 2007 15:45:15 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Bakul Shah In-Reply-To: <20070503211503.56DAF5B2E@mail.bitblocks.com> Message-ID: <20070504153155.H37499@besplex.bde.org> References: <20070503211503.56DAF5B2E@mail.bitblocks.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org, Pawel Jakub Dawidek Subject: Re: ZFS vs UFS2 overhead and may be a bug? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 May 2007 05:45:30 -0000 On Thu, 3 May 2007, Bakul Shah wrote: >> Interesting. There are two problems. First is that cat(1) uses >> st_blksize to find out best size of I/O request and we force it to >> PAGE_SIZE, which is very, very wrong for ZFS - it should be equal to >> recordsize. I need to find discussion about this: >> ... >> sb->st_blksize = PAGE_SIZE; > > This does seem suboptimal. Almost always one reads an entire It's just broken. 
> file and the overhead of going to the disk is high enough > that one may as well read small files in one syscall. Apps > that want to keep lots and lots of files open can always > adjust the buffer size. > > Since disk seek access time is the largest cost component, > ideally contiguously allocated data should be read in one > access in order to avoid any extra seeks. At the very least Buffering makes the userland i/o size have almost no effect on physical disk accesses. Perhaps even for zfs, since IIRC your benchmark showed anamolies for the case of sparse files where no disk accesses are involved. > st_blksize should be as large as the minimum unit of > contiguous allocation (== filesystem block size). Even V7 > unix had this! FreeBSD-1.1-FreeBSD-5 also have this (== filesystem block size > filesystem frag size for ffs). Bruce From owner-freebsd-fs@FreeBSD.ORG Fri May 4 12:30:23 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CB94D16A403; Fri, 4 May 2007 12:30:23 +0000 (UTC) (envelope-from daichi@freebsd.org) Received: from natial.ongs.co.jp (natial.ongs.co.jp [202.216.232.58]) by mx1.freebsd.org (Postfix) with ESMTP id 8CD4213C448; Fri, 4 May 2007 12:30:23 +0000 (UTC) (envelope-from daichi@freebsd.org) Received: from parancell.ongs.co.jp (dullmdaler.ongs.co.jp [202.216.232.62]) by natial.ongs.co.jp (Postfix) with ESMTP id DA58A244C2C; Fri, 4 May 2007 20:58:44 +0900 (JST) Message-ID: <463B1FF4.40508@freebsd.org> Date: Fri, 04 May 2007 20:58:44 +0900 From: Daichi GOTO User-Agent: Thunderbird 2.0.0.0 (X11/20070424) MIME-Version: 1.0 To: FreeBSD Hackers , FreeBSD Current , freebsd-fs@freebsd.org, Craig Rodrigues Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: Stanislav Sedov , Ed Schouten , Daichi GOTO , Kris Kennaway Subject: [ANN] unionfs patchset-19-20070504 release, it 
is now MPSAFE and transparent mode as default X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 May 2007 12:30:23 -0000

Hi Guys

It is my pleasure and honor to announce the availability of the unionfs
patchset-19-20070504. p19 is the second patchset since unionfs was merged
into FreeBSD. Our work on improving unionfs is proceeding step by step,
and p19 is a milestone release.

Patchset-19-20070504:
  For 7-current
    http://people.freebsd.org/~daichi/unionfs/unionfs-p19-20070504.diff
  For 6-stable
    http://people.freebsd.org/~daichi/unionfs/unionfs6-p19-20070504.diff

Changes in unionfs-p19-20070504.diff
  - It is now MPSAFE.
  - The default copy mode has been changed from traditional-mode to
    transparent-mode. Several people who reported issues solved them by
    using transparent mode, so we think it is time to change the default.
    Transparent-mode is the best choice in most situations.
  - Fixed the kern/111262 issue.
  - Added vfs_cache support to unionfs. As a result, you can use
    applications that use procfs on unionfs.
  - Removed the unionfs internal cache mechanism, because vfs_cache
    support replaces it. As a result, the unionfs code is simpler.
  - Added a whiteout behavior option. ``-o whiteout=always'' is the
    default mode (the established practice) and ``-o whiteout=whenneeded''
    is a mode that uses less disk space, intended especially for
    resource-restricted environments such as embedded systems.
    (Contributed by Ed Schouten. Thanks)
  - Fixed a mtx lock issue that occurred with nullfs.
  - Fixed lock issues around unionfs.
  - Added NULL checks for problems flagged by Coverity.
    (Pointed out by Stanislav Sedov.
    Thanks)

Documentation for these unionfs patches:
  http://people.freebsd.org/~daichi/unionfs/ (English)
  http://people.freebsd.org/~daichi/unionfs/index-ja.html (Japanese)

Request for Test:
  Unionfs lovers, including FreeSBIE developers, ports cluster managers,
  heavy memory-fs users and anyone else who uses unionfs: could you try
  p19, please?

Merge plan:
  I plan to commit unionfs-p19-20070504.diff to -current after receiving
  responses from unionfs users.

Thanks

P.S. I am going to attend BSDCan 2007. Let's meet in Ottawa, Canada :)

-- 
Daichi GOTO, http://people.freebsd.org/~daichi

From owner-freebsd-fs@FreeBSD.ORG Fri May 4 16:26:17 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1AD1416A403 for ; Fri, 4 May 2007 16:26:17 +0000 (UTC) (envelope-from kvs@binarysolutions.dk) Received: from solow.pil.dk (relay.pil.dk [195.41.47.164]) by mx1.freebsd.org (Postfix) with ESMTP id D6E2113C45B for ; Fri, 4 May 2007 16:26:16 +0000 (UTC) (envelope-from kvs@binarysolutions.dk) Received: from coruscant.local (naboo.binarysolutions.dk [80.196.17.173]) by solow.pil.dk (Postfix) with ESMTP id C42DB1CC0C9; Fri, 4 May 2007 18:26:15 +0200 (CEST) Received: by coruscant.local (Postfix, from userid 502) id D10F1317D3E; Fri, 4 May 2007 18:26:14 +0200 (CEST) To: Pawel Jakub Dawidek References: <20070503190956.GC7177@garage.freebsd.pl> From: Kenneth Vestergaard Schmidt Date: Fri, 04 May 2007 18:26:14 +0200 In-Reply-To: <20070503190956.GC7177@garage.freebsd.pl> (Pawel Jakub Dawidek's message of "Thu\, 3 May 2007 21\:09\:56 +0200") Message-ID: User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.0.96 (darwin) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: freebsd-fs@freebsd.org Subject: Re: Sun Fire X4500, FreeBSD and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive:
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 May 2007 16:26:17 -0000 Pawel Jakub Dawidek writes: >> All in all, a fun little toy :) > > If it's just a little toy for your, maybe you want to replace it with my > teddy bear?:) Is your teddy bear rack-mountable, and does it have a dedicated management processor? If so, we'll talk :) > Great to hear that this beast works with FreeBSD!! Any chance we can > trick you into performance comparsion between Solaris/ZFS and > FreeBSD/ZFS?:) At the very least, we want to do some heavy testing with FreeBSD. If we get the time, it would be fun to contrast them to Solaris. It does require that I reinstall Solaris, though :) How would you go about tweaking the machine for maximum efficiency? We have around 13 GB RAM free at the moment, just sitting doing nothing, and that might just as well be used for caching. -- Kenneth Schmidt From owner-freebsd-fs@FreeBSD.ORG Fri May 4 19:42:46 2007 Return-Path: X-Original-To: freebsd-fs@FreeBSD.org Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8931116A401; Fri, 4 May 2007 19:42:46 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from mail.bitblocks.com (bitblocks.com [64.142.15.60]) by mx1.freebsd.org (Postfix) with ESMTP id 6B5AB13C458; Fri, 4 May 2007 19:42:46 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from bitblocks.com (localhost.bitblocks.com [127.0.0.1]) by mail.bitblocks.com (Postfix) with ESMTP id 228C15B51; Fri, 4 May 2007 12:42:46 -0700 (PDT) To: Bruce Evans In-reply-to: Your message of "Fri, 04 May 2007 15:45:15 +1000." <20070504153155.H37499@besplex.bde.org> Date: Fri, 04 May 2007 12:42:46 -0700 From: Bakul Shah Message-Id: <20070504194246.228C15B51@mail.bitblocks.com> Cc: freebsd-fs@FreeBSD.org, Pawel Jakub Dawidek Subject: Re: ZFS vs UFS2 overhead and may be a bug? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 04 May 2007 19:42:46 -0000

> >> Interesting. There are two problems. First is that cat(1) uses
> >> st_blksize to find out best size of I/O request and we force it to
> >> PAGE_SIZE, which is very, very wrong for ZFS - it should be equal to
> >> recordsize. I need to find discussion about this:
>
> >> ...
> >> sb->st_blksize = PAGE_SIZE;
> >
> > This does seem suboptimal. Almost always one reads an entire
>
> It's just broken.

What should it be?

> > file and the overhead of going to the disk is high enough
> > that one may as well read small files in one syscall. Apps
> > that want to keep lots and lots of files open can always
> > adjust the buffer size.
> >
> > Since disk seek access time is the largest cost component,
> > ideally contiguously allocated data should be read in one
> > access in order to avoid any extra seeks. At the very least
>
> Buffering makes the userland i/o size have almost no effect on
> physical disk accesses. Perhaps even for zfs, since IIRC your
> benchmark showed anamolies for the case of sparse files where
> no disk accesses are involved.

This is perhaps a separate problem from that of sparse file access. In
my tests on regular (not sparse) files, ZFS took 8% more time to read a
10G file when a 4K buffer was used, and 90% more time with a 1K buffer.
Maybe it is simply ZFS overhead, but since the size of the read() buffer
has a non-negligible effect, something needs to be done.
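
The arithmetic behind the buffer-size argument can be sketched as follows. This is a back-of-the-envelope calculation, not a measurement: it counts how many read() calls a 10G file needs at a given buffer size, and divides the system times reported for md5 on the HOLEY file earlier in this thread (801.21 s on ZFS, 28.31 s on UFS2, with 1K reads) by that count. It assumes "10G" means 10 GiB and that all system time is spent in read(); the function names are made up for illustration.

```python
GIB = 1024 ** 3


def read_calls(file_size, buf_size):
    """Number of read() calls needed to consume file_size bytes
    with a buffer of buf_size bytes (ceiling division)."""
    return -(-file_size // buf_size)


def per_call_us(system_seconds, calls):
    """Average system time per call, in microseconds."""
    return system_seconds / calls * 1e6


if __name__ == "__main__":
    size = 10 * GIB
    for buf in (1024, 4096, 128 * 1024):
        print(f"bs={buf:>6}: {read_calls(size, buf)} read() calls")

    calls_1k = read_calls(size, 1024)
    print("ZFS  us/call:", round(per_call_us(801.21, calls_1k), 1))
    print("UFS2 us/call:", round(per_call_us(28.31, calls_1k), 1))
```

Going from 1K to 128K buffers cuts the call count by a factor of 128, so any fixed per-call cost in the filesystem's read path is amplified by small buffers.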
From owner-freebsd-fs@FreeBSD.ORG Sat May 5 19:39:53 2007 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id D723C16A401 for ; Sat, 5 May 2007 19:39:53 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (arm132.internetdsl.tpnet.pl [83.17.198.132]) by mx1.freebsd.org (Postfix) with ESMTP id 2D5FB13C457 for ; Sat, 5 May 2007 19:39:42 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id F2EF545CD9; Sat, 5 May 2007 18:12:08 +0200 (CEST) Received: from localhost (154.81.datacomsa.pl [195.34.81.154]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id A66FF45684; Sat, 5 May 2007 18:12:02 +0200 (CEST) Date: Sat, 5 May 2007 18:11:32 +0200 From: Pawel Jakub Dawidek To: Kenneth Vestergaard Schmidt Message-ID: <20070505161132.GB16398@garage.freebsd.pl> References: <20070503190956.GC7177@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="v9Ux+11Zm5mwPlX6" Content-Disposition: inline In-Reply-To: X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 7.0-CURRENT i386 User-Agent: mutt-ng/devel-r804 (FreeBSD) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-2.6 required=3.0 tests=BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: Sun Fire X4500, FreeBSD and ZFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 05 May 2007 19:39:53 -0000 --v9Ux+11Zm5mwPlX6 Content-Type: text/plain; charset=iso-8859-2 
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, May 04, 2007 at 06:26:14PM +0200, Kenneth Vestergaard Schmidt wrote:
> At the very least, we want to do some heavy testing with FreeBSD. If we
> get the time, it would be fun to contrast them to Solaris. It does
> require that I reinstall Solaris, though :)
>
> How would you go about tweaking the machine for maximum efficiency? We
> have around 13 GB RAM free at the moment, just sitting doing nothing,
> and that might just as well be used for caching.

There is a lot of memory and quite a few silly limits in the kernel.
Currently we allocate most memory for ZFS via malloc(9), so the memory
is allocated from kmem_map. This should change in the future, but until
then, you need to adjust vm.kmem_size, etc. to make the memory usable
with ZFS. You'd also want to tune vfs.zfs.arc_max and vfs.zfs.arc_min,
because the current auto-tuning won't be any good with that much RAM.
I'd start by setting vfs.zfs.arc_max to the kmem_map size minus 3GB.
You can tune all of them (vfs.zfs.arc_max, vfs.zfs.arc_min,
vm.kmem_size and vm.kmem_size_max) from /boot/loader.conf.

Another old limit is the maximum number of vnodes in the system. It is
only auto-tuned up to 100000, which you may want to change. I'm not
sure if you can do it from the kernel config, so just change the
MAXVNODES_MAX define in sys/kern/vfs_subr.c.

Be aware that most debugging options like INVARIANTS, WITNESS, etc.
have a very negative impact on ZFS performance in FreeBSD.

PS. Even if you don't want to install Solaris in there, there are
plenty of benchmarks on the net for Solaris/ZFS on this very machine.
It would be good to know what results we get.

Good luck!

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
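
The starting point suggested above (vfs.zfs.arc_max set to the kmem
size minus 3GB) can be sketched as a small C helper that formats the
matching /boot/loader.conf lines. This is only an illustration under
stated assumptions: the function names are made up, "3GB" is read as
3 * 2^30 bytes, vm.kmem_size_max is simply set equal to vm.kmem_size,
and the kmem size itself is an input you have to choose for your
machine. vfs.zfs.arc_min is left out because no value is suggested for
it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GiB (UINT64_C(1) << 30)

/*
 * Starting value for vfs.zfs.arc_max: kmem size minus 3 GiB.
 * Returns 0 when kmem_size is too small to leave that reserve.
 */
static uint64_t
suggest_arc_max(uint64_t kmem_size)
{
	const uint64_t reserve = 3 * GiB;

	return (kmem_size > reserve ? kmem_size - reserve : 0);
}

/*
 * Format loader.conf lines (name="value", sizes in bytes) into buf.
 * Returns what snprintf(3) returns.  Setting vm.kmem_size_max equal
 * to vm.kmem_size is an assumption of this sketch.
 */
static int
format_loader_conf(char *buf, size_t len, uint64_t kmem_size)
{
	assert(buf != NULL);
	return (snprintf(buf, len,
	    "vm.kmem_size=\"%llu\"\n"
	    "vm.kmem_size_max=\"%llu\"\n"
	    "vfs.zfs.arc_max=\"%llu\"\n",
	    (unsigned long long)kmem_size,
	    (unsigned long long)kmem_size,
	    (unsigned long long)suggest_arc_max(kmem_size)));
}
```

For a hypothetical kmem size of 12 GiB, suggest_arc_max() yields
9 GiB (9663676416 bytes). Whether a given kmem size is usable at all
depends on the platform and kernel, so treat the output as a first
guess to test, not a recommendation.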

--v9Ux+11Zm5mwPlX6
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (FreeBSD)

iD8DBQFGPKy0ForvXbEpPzQRAsY7AKCmvMQaXf2VTp+V/O/wE65fG60CrQCgnKBS
M1iOGRy3vNLjmLM4hCcJeoc=
=6a5R
-----END PGP SIGNATURE-----

--v9Ux+11Zm5mwPlX6--