Date: Thu, 21 Apr 2022 09:38:35 -0700
From: Doug Ambrisko <ambrisko@ambrisko.com>
To: Alexander Leidinger <Alexander@leidinger.net>
Cc: Mateusz Guzik <mjguzik@gmail.com>, freebsd-current@freebsd.org
Subject: Re: nullfs and ZFS issues
Message-ID: <YmGIiwQen0Fq6lRN@ambrisko.com>
In-Reply-To: <20220421154402.Horde.I6m2Om_fxqMtDMUqpiZAxtP@webmail.leidinger.net>
References: <Yl31Frx6HyLVl4tE@ambrisko.com>
 <20220420113944.Horde.5qBL80-ikDLIWDIFVJ4VgzX@webmail.leidinger.net>
 <YmAy0ZNZv9Cqs7X%2B@ambrisko.com>
 <20220421083310.Horde.r7YT8777_AvGU_6GO1cC90G@webmail.leidinger.net>
 <CAGudoHEyCK4kWuJybD4jzCHbGAw46CQkPx_yrPpmRJg3m10sdQ@mail.gmail.com>
 <20220421154402.Horde.I6m2Om_fxqMtDMUqpiZAxtP@webmail.leidinger.net>
--HAyyfbF3IpFncfdR
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Apr 21, 2022 at 03:44:02PM +0200, Alexander Leidinger wrote:
| Quoting Mateusz Guzik <mjguzik@gmail.com> (from Thu, 21 Apr 2022
| 14:50:42 +0200):
|
| > On 4/21/22, Alexander Leidinger <Alexander@leidinger.net> wrote:
| >> I tried nocache on a system with a lot of jails which use nullfs,
| >> which showed very slow behavior in the daily periodic runs (12h runs
| >> in the night after boot, 24h or more in subsequent nights). Now the
| >> first nightly run after boot was finished after 4h.
| >>
| >> What is the benefit of not disabling the cache in nullfs? I would
| >> expect zfs (or ufs) to cache the (meta)data anyway.
| >>
| >
| > does the poor performance show up with
| > https://people.freebsd.org/~mjg/vnlru_free_pick.diff ?
|
| I would like to have all the 22 jails run the periodic scripts a
| second night in a row before trying this.
|
| > if the long runs are still there, can you get some profiling from it?
| > sysctl -a before and after would be a start.
| >
| > My guess is that you are at the vnode limit and bumping into the 1 second sleep.
|
| That would explain the behavior I see since I added the last jail
| which seems to have crossed a threshold which triggers the slow
| behavior.
|
| Current status (with the 112 nullfs mounts with nocache):
| kern.maxvnodes: 10485760
| kern.numvnodes: 3791064
| kern.freevnodes: 3613694
| kern.cache.stats.heldvnodes: 151707
| kern.vnodes_created: 260288639
|
| The maxvnodes value is already increased by 10 times compared to the
| default value on this system.

I've attached mount.patch, which should make mount -v show the vnode
usage per filesystem. Note that the problem I was running into was
that, after some operations, arc_prune and arc_evict would consume 100%
of 2 cores and make ZFS really slow. If you are not running into that
issue then nocache etc. shouldn't be needed.
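As a rough way to read the sysctl snapshot quoted above, the arithmetic can be sketched in C as below. This is only an illustration and not part of the original message: the struct and function names are hypothetical, and treating numvnodes minus freevnodes as "vnodes not on the free list" is an assumption about how the two counters relate.

```c
#include <stdint.h>

/* Hypothetical container for the three counters quoted in the thread. */
struct vnode_stats {
	uint64_t maxvnodes;	/* kern.maxvnodes */
	uint64_t numvnodes;	/* kern.numvnodes */
	uint64_t freevnodes;	/* kern.freevnodes */
};

/* Allocated vnodes that are not on the free list (assumed relation). */
static uint64_t
vnodes_not_free(const struct vnode_stats *s)
{
	if (s->numvnodes < s->freevnodes)
		return (0);
	return (s->numvnodes - s->freevnodes);
}

/* Allocated vnodes as a percentage of the limit, rounded down. */
static unsigned
vnodes_pct_of_max(const struct vnode_stats *s)
{
	if (s->maxvnodes == 0)
		return (0);
	return ((unsigned)(s->numvnodes * 100 / s->maxvnodes));
}
```

With the quoted values (10485760, 3791064, 3613694) this gives 177370 vnodes not on the free list and numvnodes at 36% of maxvnodes. A single snapshot like this says nothing about peaks during the periodic run itself.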

On my laptop I set ARC to 1G since I don't use swap, and in the past
ARC would consume too much memory and things would die. When nullfs
holds a bunch of vnodes, ZFS can't release them.

FYI, on my laptop with nocache and limited vnodes I haven't run into
this problem. I haven't tried the patch to let ZFS free its and
nullfs vnodes on my laptop. I have only tried it via a bhyve test. I
use bhyve and an md drive to avoid wearing out my SSD, and it's faster
to test. I have found that git, tar, make world etc. could trigger the
issue before, but I haven't had any issues with nocache and capping
vnodes.

Thanks,

Doug A.

--HAyyfbF3IpFncfdR
Content-Type: text/x-diff; charset=us-ascii
Content-Disposition: attachment; filename="mount.patch"

diff --git a/sbin/mount/mount.c b/sbin/mount/mount.c
index 79d9d6cb0ca..00eefb3a5e0 100644
--- a/sbin/mount/mount.c
+++ b/sbin/mount/mount.c
@@ -692,6 +692,13 @@ prmount(struct statfs *sfp)
 			xo_emit("{D:, }{Lw:fsid}{:fsid}", fsidbuf);
 			free(fsidbuf);
 		}
+		if (sfp->f_nvnodelistsize != 0 || sfp->f_lazyvnodelistsize != 0) {
+			xo_open_container("vnodes");
+			xo_emit("{D:, }{Lwc:vnodes}{Lw:count}{w:count/%ju}{Lw:lazy}{:lazy/%ju}",
+			    (uintmax_t)sfp->f_nvnodelistsize,
+			    (uintmax_t)sfp->f_lazyvnodelistsize);
+			xo_close_container("vnodes");
+		}
 	}
 	xo_emit("{D:)}\n");
 }
diff --git a/sys/kern/vfs_mount.c b/sys/kern/vfs_mount.c
index a495ad86ac4..3648ef8d080 100644
--- a/sys/kern/vfs_mount.c
+++ b/sys/kern/vfs_mount.c
@@ -2625,6 +2626,8 @@ __vfs_statfs(struct mount *mp, struct statfs *sbp)
 	sbp->f_version = STATFS_VERSION;
 	sbp->f_namemax = NAME_MAX;
 	sbp->f_flags = mp->mnt_flag & MNT_VISFLAGMASK;
+	sbp->f_nvnodelistsize = mp->mnt_nvnodelistsize;
+	sbp->f_lazyvnodelistsize = mp->mnt_lazyvnodelistsize;
 
 	return (mp->mnt_op->vfs_statfs(mp, sbp));
 }
diff --git a/sys/sys/mount.h b/sys/sys/mount.h
index 3383bfe8f43..95dd3c76ae5 100644
--- a/sys/sys/mount.h
+++ b/sys/sys/mount.h
@@ -91,7 +91,9 @@ struct statfs {
 	uint64_t f_asyncwrites;		/* count of async writes since mount */
 	uint64_t f_syncreads;		/* count of sync reads since mount */
 	uint64_t f_asyncreads;		/* count of async reads since mount */
-	uint64_t f_spare[10];		/* unused spare */
+	uint32_t f_nvnodelistsize;	/* (i) # of vnodes */
+	uint32_t f_lazyvnodelistsize;	/* (l) # of lazy vnodes */
+	uint64_t f_spare[9];		/* unused spare */
 	uint32_t f_namemax;		/* maximum filename length */
 	uid_t	  f_owner;		/* user that mounted the filesystem */
 	fsid_t	  f_fsid;		/* filesystem id */

--HAyyfbF3IpFncfdR--
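
For readers who want a feel for what the patched mount -v line might look like, here is a minimal sketch that mimics the added xo_emit() call with plain snprintf(). It is not part of the patch: format_vnode_suffix is a hypothetical helper, and the exact text (", vnodes: count N lazy M") is an assumption based on reading the libxo label, colon and whitespace modifiers in the format string, not on output from a patched build.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the suffix that the patched prmount() is
 * expected to append in text mode. Like the if () guard in the patch,
 * it produces nothing when both counters are zero.
 * Returns the number of characters written (snprintf() semantics).
 */
static int
format_vnode_suffix(char *buf, size_t len, uint32_t nvnodes, uint32_t lazy)
{
	if (nvnodes == 0 && lazy == 0) {
		if (len > 0)
			buf[0] = '\0';
		return (0);
	}
	/* Assumed rendering of "{D:, }{Lwc:vnodes}{Lw:count}{w:count/%ju}{Lw:lazy}{:lazy/%ju}". */
	return (snprintf(buf, len, ", vnodes: count %ju lazy %ju",
	    (uintmax_t)nvnodes, (uintmax_t)lazy));
}
```

The two arguments correspond to the f_nvnodelistsize and f_lazyvnodelistsize fields that the patch carves out of one f_spare slot, so the helper takes uint32_t values and widens them to uintmax_t the same way the patch does.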