Date:      Tue, 26 Apr 2022 08:15:45 +0200
From:      Alexander Leidinger <Alexander@leidinger.net>
To:        Eirik Øverby <ltning@anduin.net>
Cc:        freebsd-current@freebsd.org
Subject:   Re: nullfs and ZFS issues
Message-ID:  <20220426081545.Horde.IzWT3chMkImG5Hr28ZuCwFT@webmail.leidinger.net>
In-Reply-To: <db9ad3bfcbc9b235f4845caba0ca6d7af0f0b091.camel@anduin.net>
References:  <Yl31Frx6HyLVl4tE@ambrisko.com> <20220420113944.Horde.5qBL80-ikDLIWDIFVJ4VgzX@webmail.leidinger.net> <YmAy0ZNZv9Cqs7X%2B@ambrisko.com> <20220421083310.Horde.r7YT8777_AvGU_6GO1cC90G@webmail.leidinger.net> <CAGudoHEyCK4kWuJybD4jzCHbGAw46CQkPx_yrPpmRJg3m10sdQ@mail.gmail.com> <20220421154402.Horde.I6m2Om_fxqMtDMUqpiZAxtP@webmail.leidinger.net> <YmGIiwQen0Fq6lRN@ambrisko.com> <20220422090439.Horde.TabULDW9aIeaNLxngZxdvvN@webmail.leidinger.net> <20220424195817.Horde.W5ApGT13KmR06W2pKA0COxB@webmail.leidinger.net> <20220425152727.Horde.YqhquyTW0ZM3HAbI1kyskic@webmail.leidinger.net> <db9ad3bfcbc9b235f4845caba0ca6d7af0f0b091.camel@anduin.net>

Quoting Eirik Øverby <ltning@anduin.net> (from Mon, 25 Apr 2022  
18:44:19 +0200):

> On Mon, 2022-04-25 at 15:27 +0200, Alexander Leidinger wrote:
>> Quoting Alexander Leidinger <Alexander@leidinger.net> (from Sun, 24
>> Apr 2022 19:58:17 +0200):
>>
>> > Quoting Alexander Leidinger <Alexander@leidinger.net> (from Fri, 22
>> > Apr 2022 09:04:39 +0200):
>> >
>> > > Quoting Doug Ambrisko <ambrisko@ambrisko.com> (from Thu, 21 Apr
>> > > 2022 09:38:35 -0700):
>> >
>> > > > I've attached mount.patch that when doing mount -v should
>> > > > show the vnode usage per filesystem.  Note that the problem I was
>> > > > running into was after some operations arc_prune and arc_evict would
>> > > > consume 100% of 2 cores and make ZFS really slow.  If you are not
>> > > > running into that issue then nocache etc. shouldn't be needed.
>> > >
>> > > I don't run into this issue, but I have a huge perf difference when
>> > > using nocache in the nightly periodic runs. 4h instead of 12-24h
>> > > (22 jails on this system).
>> > >
>> > > > On my laptop I set ARC to 1G since I don't use swap and in the past
>> > > > ARC would consume to much memory and things would die.  When the
>> > > > nullfs holds a bunch of vnodes then ZFS couldn't release them.
>> > > >
>> > > > FYI, on my laptop with nocache and limited vnodes I haven't run
>> > > > into this problem.  I haven't tried the patch to let ZFS free
>> > > > it's and nullfs vnodes on my laptop.  I have only tried it via
>> > >
>> > > I have this patch and your mount patch installed now, without
>> > > nocache and reduced arc reclaim settings (100, 1). I will check the
>> > > runtime for the next 2 days.
>> >
>> > 9-10h runtime with the above settings (compared to 4h with nocache
>> > and 12-24h without any patch and without nocache).
>> > I changed the sysctls back to the defaults and will see in the next
>> > run (in 7h) what the result is with just the patches.
>>
>> And again 9-10h runtime (I've seen a lot of the find processes in the
>> periodic daily run of those 22 jails in the state "*vnode"). Seems
>> nocache gives the best perf for me in this case.
>
> Sorry for jumping in here - I've got a couple of questions:
> - Will this also apply to nullfs read-only mounts? Or is it only in
> case of writing "through" a nullfs mount that these problems are seen?
> - Is it a problem also in 13, or is this "new" in -CURRENT?
>
> We're having weird and unexplained CPU spikes on several systems, even
> after tuning geli to not use gazillions of threads. So far our
> suspicion has been ZFS snapshot cleanups but this is an interesting
> contender - unless the whole "read only" part makes it moot.

For me this started after I created one more jail on this system, and I
don't see CPU spikes (as the system is permanently running at 100% and
the distribution of the CPU usage looks as I would expect). Doug's
experience is a little different: he sees a high amount of CPU usage
"for nothing", or even a deadlock-like situation. So I would say we
see different things based on similar triggers.

The nocache option for nullfs affects the number of vnodes in use on
the system, no matter whether the mount is ro or rw. As such you can
give it a try. Note that, depending on the usage pattern, nocache may
increase lock contention, so its performance impact can be positive or
negative.
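
If you want to see whether nocache (passed as a mount option, i.e.
-o nocache, see mount_nullfs(8)) changes the vnode pressure on your
system, one rough way is to compare vfs.numvnodes against
kern.maxvnodes before and after the periodic run. The following is
only a minimal sketch: the helper names are made up for illustration,
and the sample numbers are invented, not measurements from my system.

```python
import subprocess


def parse_sysctl(text):
    """Parse 'name: value' lines as printed by sysctl(8) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            values[name.strip()] = int(value)
    return values


def vnode_usage(values):
    """Return the fraction of kern.maxvnodes that is currently in use."""
    return values["vfs.numvnodes"] / values["kern.maxvnodes"]


def read_vnode_sysctls():
    """Query the live system (assumes a FreeBSD host providing both OIDs)."""
    result = subprocess.run(
        ["sysctl", "vfs.numvnodes", "kern.maxvnodes"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sysctl(result.stdout)


if __name__ == "__main__":
    # Invented sample text, so the sketch also runs on a non-FreeBSD machine.
    sample = "vfs.numvnodes: 250000\nkern.maxvnodes: 1000000\n"
    print(f"vnode usage: {vnode_usage(parse_sysctl(sample)):.0%}")
```

On a real system you would call read_vnode_sysctls() instead of using
the sample text, once with and once without nocache on the nullfs
mounts, and compare the results.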

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netchild@FreeBSD.org  : PGP 0x8F31830F9F2772BF
