Date:      Thu, 21 Apr 2022 09:38:35 -0700
From:      Doug Ambrisko <ambrisko@ambrisko.com>
To:        Alexander Leidinger <Alexander@leidinger.net>
Cc:        Mateusz Guzik <mjguzik@gmail.com>, freebsd-current@freebsd.org
Subject:   Re: nullfs and ZFS issues
Message-ID:  <YmGIiwQen0Fq6lRN@ambrisko.com>
In-Reply-To: <20220421154402.Horde.I6m2Om_fxqMtDMUqpiZAxtP@webmail.leidinger.net>
References:  <Yl31Frx6HyLVl4tE@ambrisko.com> <20220420113944.Horde.5qBL80-ikDLIWDIFVJ4VgzX@webmail.leidinger.net> <YmAy0ZNZv9Cqs7X%2B@ambrisko.com> <20220421083310.Horde.r7YT8777_AvGU_6GO1cC90G@webmail.leidinger.net> <CAGudoHEyCK4kWuJybD4jzCHbGAw46CQkPx_yrPpmRJg3m10sdQ@mail.gmail.com> <20220421154402.Horde.I6m2Om_fxqMtDMUqpiZAxtP@webmail.leidinger.net>


On Thu, Apr 21, 2022 at 03:44:02PM +0200, Alexander Leidinger wrote:
| Quoting Mateusz Guzik <mjguzik@gmail.com> (from Thu, 21 Apr 2022  
| 14:50:42 +0200):
| 
| > On 4/21/22, Alexander Leidinger <Alexander@leidinger.net> wrote:
| >> I tried nocache on a system with a lot of jails which use nullfs,
| >> which showed very slow behavior in the daily periodic runs (12h runs
| >> in the night after boot, 24h or more in subsequent nights). Now the
| >> first nightly run after boot was finished after 4h.
| >>
| >> What is the benefit of not disabling the cache in nullfs? I would
| >> expect zfs (or ufs) to cache the (meta)data anyway.
| >>
| >
| > does the poor performance show up with
| > https://people.freebsd.org/~mjg/vnlru_free_pick.diff ?
| 
| I would like to have all the 22 jails run the periodic scripts a  
| second night in a row before trying this.
| 
| > if the long runs are still there, can you get some profiling from it?
| > sysctl -a before and after would be a start.
| >
| > My guess is that you are at the vnode limit and bumping into the 1-second sleep.
| 
| That would explain the behavior I see since I added the last jail,
| which seems to have crossed a threshold that triggers the slow
| behavior.
| 
| Current status (with the 112 nullfs mounts with nocache):
| kern.maxvnodes:               10485760
| kern.numvnodes:                3791064
| kern.freevnodes:               3613694
| kern.cache.stats.heldvnodes:    151707
| kern.vnodes_created:         260288639
| 
| The maxvnodes value is already increased by 10 times compared to the  
| default value on this system.
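
(Those counters, and the limit itself, can be checked and bumped at
run time with sysctl; the value below is only an example:

  sysctl kern.maxvnodes kern.numvnodes kern.freevnodes
  sysctl kern.maxvnodes=10485760
)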

I've attached mount.patch, which should make mount -v show the vnode
usage per filesystem.  Note that the problem I was running into was
that after some operations arc_prune and arc_evict would consume 100%
of 2 cores and make ZFS really slow.  If you are not running into
that issue then nocache etc. shouldn't be needed.  On my laptop I set
ARC to 1G since I don't use swap and in the past ARC would consume
too much memory and things would die.  When nullfs holds a bunch of
vnodes, ZFS can't release them.
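
For reference, with the patch applied the extra counters show up at
the end of the flags list in the mount -v output, roughly like this
(the filesystem and numbers below are made up):

  tank/usr on /usr (zfs, local, noatime, nfsv4acls, vnodes: count 1234 lazy 0)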

FYI, on my laptop with nocache and limited vnodes I haven't run into
this problem.  I haven't tried the patch that lets ZFS free its own
and nullfs's vnodes on my laptop; I have only tried it in a bhyve
test.  I use bhyve and an md drive to avoid wearing out my SSD, and
it's faster for testing.  I had found that git, tar, make world,
etc. could trigger the issue before, but I haven't had any issues
with nocache and capped vnodes.
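
The md drive for that kind of throwaway testing is just a swap-backed
memory disk; something along these lines works (unit and size are
arbitrary):

  mdconfig -a -t swap -s 20g -u 1   # attaches /dev/md1, backed by RAM/swap
  # hand /dev/md1 to the bhyve guest as its disk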

Thanks,

Doug A.

[Attachment: mount.patch]

diff --git a/sbin/mount/mount.c b/sbin/mount/mount.c
index 79d9d6cb0ca..00eefb3a5e0 100644
--- a/sbin/mount/mount.c
+++ b/sbin/mount/mount.c
@@ -692,6 +692,13 @@ prmount(struct statfs *sfp)
 			xo_emit("{D:, }{Lw:fsid}{:fsid}", fsidbuf);
 			free(fsidbuf);
 		}
+		if (sfp->f_nvnodelistsize != 0 || sfp->f_lazyvnodelistsize != 0) {
+			xo_open_container("vnodes");
+				xo_emit("{D:, }{Lwc:vnodes}{Lw:count}{w:count/%ju}{Lw:lazy}{:lazy/%ju}",
+				    (uintmax_t)sfp->f_nvnodelistsize,
+				    (uintmax_t)sfp->f_lazyvnodelistsize);
+			xo_close_container("vnodes");
+		}
 	}
 	xo_emit("{D:)}\n");
 }
diff --git a/sys/kern/vfs_mount.c b/sys/kern/vfs_mount.c
index a495ad86ac4..3648ef8d080 100644
--- a/sys/kern/vfs_mount.c
+++ b/sys/kern/vfs_mount.c
@@ -2625,6 +2626,8 @@ __vfs_statfs(struct mount *mp, struct statfs *sbp)
 	sbp->f_version = STATFS_VERSION;
 	sbp->f_namemax = NAME_MAX;
 	sbp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK;
+	sbp->f_nvnodelistsize = mp->mnt_nvnodelistsize;
+	sbp->f_lazyvnodelistsize = mp->mnt_lazyvnodelistsize;
 
 	return (mp->mnt_op->vfs_statfs(mp, sbp));
 }
diff --git a/sys/sys/mount.h b/sys/sys/mount.h
index 3383bfe8f43..95dd3c76ae5 100644
--- a/sys/sys/mount.h
+++ b/sys/sys/mount.h
@@ -91,7 +91,9 @@ struct statfs {
 	uint64_t f_asyncwrites;		/* count of async writes since mount */
 	uint64_t f_syncreads;		/* count of sync reads since mount */
 	uint64_t f_asyncreads;		/* count of async reads since mount */
-	uint64_t f_spare[10];		/* unused spare */
+	uint32_t f_nvnodelistsize;	    /* (i) # of vnodes */
+	uint32_t f_lazyvnodelistsize;    /* (l) # of lazy vnodes */
+	uint64_t f_spare[9];		/* unused spare */
 	uint32_t f_namemax;		/* maximum filename length */
 	uid_t	  f_owner;		/* user that mounted the filesystem */
 	fsid_t	  f_fsid;		/* filesystem id */
