Date:      Sat, 30 Aug 2014 10:37:43 -0700
From:      Peter Wemm <peter@wemm.org>
To:        Steven Hartland <smh@freebsd.org>
Cc:        src-committers@freebsd.org, Alan Cox <alc@rice.edu>, svn-src-all@freebsd.org, Dmitry Morozovsky <marck@rinet.ru>, "Matthew D. Fuller" <fullermd@over-yonder.net>, svn-src-head@freebsd.org
Subject:   Re: svn commit: r270759 - in head/sys: cddl/compat/opensolaris/kern cddl/compat/opensolaris/sys cddl/contrib/opensolaris/uts/common/fs/zfs vm
Message-ID:  <39211177.i8nn9sHiCx@overcee.wemm.org>
In-Reply-To: <E0F163ECBF5E407F99AFDB18FAB05C58@multiplay.co.uk>
References:  <201408281950.s7SJo90I047213@svn.freebsd.org> <2714752.cWQfguSlQD@overcee.wemm.org> <E0F163ECBF5E407F99AFDB18FAB05C58@multiplay.co.uk>


On Saturday 30 August 2014 02:03:42 Steven Hartland wrote:
> ----- Original Message -----
> From: "Peter Wemm" <peter@wemm.org>
>
> > On Friday 29 August 2014 21:42:15 Steven Hartland wrote:
>
> > If this function returns non-zero, ARC is given back:
> >
> > static int
> > arc_reclaim_needed(void)
> > {
> >
> >         if (kmem_free_count() < zfs_arc_free_target) {
> >                 return (1);
> >         }
> >
> >         /*
> >          * Cooperate with pagedaemon when it's time for it to scan
> >          * and reclaim some pages.
> >          */
> >         if (vm_paging_needed()) {
> >                 return (1);
> >         }
> >
> > i.e. if v_free (ignoring v_cache free pages) gets below the threshold,
> > stop everything and discard ARC pages.
> >
> > The vm_paging_needed() code is a NO-OP at this point. It can never
> > return true.  Consider:
> >
> >         vm_cnt.v_free_target = 4 * vm_cnt.v_free_min +
> >             vm_cnt.v_free_reserved;
> > vs
> >         vm_pageout_wakeup_thresh = (vm_cnt.v_free_min / 10) * 11;
> >
> > zfs_arc_free_target defaults to vm_cnt.v_free_target, which is 400% of
> > v_free_min, and is compared against the smaller v_free pool.
> >
> > vm_paging_needed() compares the total free pool (v_free + v_cache)
> > against the smaller wakeup threshold - 110% of v_free_min.
> >
> > Comparing a larger value against a smaller target than the previous
> > test will never succeed unless you manually change the arc_free_target
> > sysctl.
>
> I'm aware of the values involved, and as I said what you're proposing
> was more akin to where I started, but I was informed that it had already
> been tested and didn't work well.

And Karl also said that his tests are on machines that have no v_cache, so
he's not testing the scenario.

The code, as written, is wrong.  It's as simple as that.

The logic is wrong.

You've introduced dead code.

Your code changes introduce a scenario that CAUSES one of the very problems
you're using as a justification for the changes.

Your own testers have admitted that they don't test the scenario that the
problem exists with.
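
To make the dead-code argument concrete, here is a minimal userland sketch of
the two tests, using only the formulas quoted above.  It is a model and not
the kernel code: struct vm_model and every model_* name are hypothetical
stand-ins, and it assumes zfs_arc_free_target is left at its default of
v_free_target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few vm_cnt fields involved. */
struct vm_model {
	uint64_t v_free_min;
	uint64_t v_free_reserved;
	uint64_t v_free_count;	/* pages on the free queue */
	uint64_t v_cache_count;	/* pages on the cache queue */
};

/* Assumed default zfs_arc_free_target, i.e. v_free_target. */
static uint64_t
model_free_target(const struct vm_model *vm)
{
	return (4 * vm->v_free_min + vm->v_free_reserved);
}

/* Wakeup threshold: roughly 110% of v_free_min. */
static uint64_t
model_wakeup_thresh(const struct vm_model *vm)
{
	return ((vm->v_free_min / 10) * 11);
}

/* First test in arc_reclaim_needed(): looks at v_free only. */
static bool
model_first_test(const struct vm_model *vm)
{
	return (vm->v_free_count < model_free_target(vm));
}

/* Second test, modelling vm_paging_needed(): looks at v_free + v_cache. */
static bool
model_paging_needed(const struct vm_model *vm)
{
	return (vm->v_free_count + vm->v_cache_count <
	    model_wakeup_thresh(vm));
}

/*
 * Sweep v_free and v_cache over [0, limit] and count the states in which
 * the second test fires after the first one did not.
 */
static uint64_t
model_count_reachable(uint64_t free_min, uint64_t reserved, uint64_t limit)
{
	struct vm_model vm = { .v_free_min = free_min,
	    .v_free_reserved = reserved };
	uint64_t hits = 0;

	assert(limit <= 10000);	/* keep the quadratic sweep small */
	for (vm.v_free_count = 0; vm.v_free_count <= limit; vm.v_free_count++)
		for (vm.v_cache_count = 0; vm.v_cache_count <= limit;
		    vm.v_cache_count++)
			if (!model_first_test(&vm) && model_paging_needed(&vm))
				hits++;
	return (hits);
}
```

With the default target the sweep finds no state in which the second test is
reached and returns true.  The same model also shows the other half of the
complaint: with v_free below the target and a large v_cache,
model_first_test() is true even though model_paging_needed() is not.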

> > Also, what about the magic numbers here:
> > u_int zfs_arc_free_target = (1 << 19); /* default before pagedaemon init only */
>
> That is just a total fall back case and should never be triggered unless
> as the comment states the pagedaemon isn't initialised.
>
> > That's half a million pages, or 2GB of physical ram on a 4K page size
> > system
> > How is this going to work on early boot in the machines in the cluster
> > with less than 2GB of ram?
>
> It's there to ensure that ARC doesn't run wild for the few milliseconds
> / seconds before pagedaemon is initialised.
>
> We can change the value no problem, what would you suggest 1<<16 aka
> 256MB?

Please stop picking magic numbers out of thin air.  You are working with file
system and VM - critical parts of the system.  This is NOT the place to be
screwing around with things you don't understand.  alc@ was trying to be
polite.
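
For reference, the page arithmetic behind the two values discussed above can
be checked with a short sketch.  It assumes 4K pages; MODEL_PAGE_SIZE,
pages_to_bytes() and target_exceeds_ram() are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	MODEL_PAGE_SIZE	4096ULL	/* assumption: 4K page size */

/* Convert a free-page target into bytes of physical memory. */
static uint64_t
pages_to_bytes(uint64_t pages)
{
	return (pages * MODEL_PAGE_SIZE);
}

/*
 * True if the target is larger than all of physical memory, in which
 * case "free pages < target" holds no matter how much is actually free.
 */
static bool
target_exceeds_ram(uint64_t target_pages, uint64_t ram_bytes)
{
	return (pages_to_bytes(target_pages) > ram_bytes);
}

/* 1 << 19 pages is 2GB and 1 << 16 pages is 256MB with 4K pages. */
static_assert((1ULL << 19) * MODEL_PAGE_SIZE == 2ULL << 30, "2GB");
static_assert((1ULL << 16) * MODEL_PAGE_SIZE == 256ULL << 20, "256MB");
```

On a machine with 1GB of ram, target_exceeds_ram(1ULL << 19, 1ULL << 30) is
true, so the fallback target can never be met until the real value is set.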

> Thanks for all the feedback, it's great to have my understanding of
> how things work in this area confirmed by those who know.
>
> Hopefully we'll be able to get to the bottom of this with everyone's
> help and get a solid fix for these issues that have plagued 10 into
> 10.1 :)

I'm very disappointed in the attention to detail and errors in the commit.
I'm almost at the point where I want to ask for the whole thing to be backed
out.

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com; KI6FJV
UTF-8: for when a ' or ... just won’t do…



