Date:      Thu, 16 Feb 2012 22:28:47 +0100
From:      Florian Smeets <flo@FreeBSD.org>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        freebsd-hackers@FreeBSD.org, Jeff Roberson <jroberson@jroberson.net>, Andriy Gapon <avg@FreeBSD.org>
Subject:   Re: [RFT][patch] Scheduling for HTT and not only
Message-ID:  <4F3D750F.8000100@FreeBSD.org>
In-Reply-To: <4F3C0BB9.6050101@FreeBSD.org>
References:  <4F2F7B7F.40508@FreeBSD.org> <4F366E8F.9060207@FreeBSD.org> <4F367965.6000602@FreeBSD.org> <4F396B24.5090602@FreeBSD.org> <alpine.BSF.2.00.1202131012270.2020@desktop> <4F3978BC.6090608@FreeBSD.org> <alpine.BSF.2.00.1202131108460.2020@desktop> <4F3990EA.1080002@FreeBSD.org> <4F3C0BB9.6050101@FreeBSD.org>

On 15.02.12 20:47, Alexander Motin wrote:
> On 02/14/12 00:38, Alexander Motin wrote:
>> I don't see much point in committing them sequentially, as they are quite
>> orthogonal. I need to make one decision. I am going on a small vacation
>> next week. It will give time for thoughts to settle. Maybe I will indeed
>> just clean up the previous patch a bit and commit it when I get back. I've
>> spent too much time trying to make these things formal, and so far the
>> results are not bad, but also not as brilliant as I would like. Maybe
>> it is indeed time to step back and try a simpler solution.
>
> I've decided to stop those cache black magic practices and focus on
> things that really exist in this world -- SMT and CPU load. I've dropped
> most of the cache-related things from the patch and made the rest
> stricter and more predictable:
> http://people.freebsd.org/~mav/sched.htt34.patch
>
> This patch adds a check to skip the fast previous-CPU selection if its SMT
> neighbor is in use, not just if no SMT is present as in previous patches.
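
The check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from the patch: the flat cpu_load and smt_sibling arrays and the function name can_reuse_prev_cpu() are hypothetical stand-ins for the scheduler's real per-CPU state and topology tree.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 8

/*
 * Hypothetical model: cpu_load[i] is the number of runnable threads on
 * logical CPU i, and smt_sibling[i] is the logical CPU sharing a core
 * with CPU i, or -1 if that core has no SMT sibling.
 */
static bool
can_reuse_prev_cpu(const int cpu_load[NCPU], const int smt_sibling[NCPU],
    int prev)
{
	assert(prev >= 0 && prev < NCPU);

	/* The previous CPU itself must be idle. */
	if (cpu_load[prev] != 0)
		return (false);

	/* Without an SMT sibling there is nothing more to check. */
	if (smt_sibling[prev] < 0)
		return (true);

	/* With SMT, take the fast path only if the sibling is idle too. */
	return (cpu_load[smt_sibling[prev]] == 0);
}
```

In this model the fast path is refused only when the sibling thread is actually busy, so an SMT machine with an idle neighbor can still reuse the previous CPU.
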
>
> I've taken the affinity/preference algorithm from the first patch and
> improved it. That makes pickcpu() prefer the previous core or its
> neighbors in case of equal load. It is very simple to keep, but
> should still give cache hits.
>
> I've changed the general algorithm of topology tree processing. First I
> look for an idle core on the same last-level cache as before, with
> affinity to the previous core or its neighbors on higher-level caches.
> The original code could put an additional thread on an already busy core
> while the next socket was completely idle. Now, if there is no idle core
> on this cache, all other CPUs are checked.
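
The two-stage search described above might look roughly like this. It is only a sketch under assumptions: the flat core_load and core_llc arrays and the name pick_core() are hypothetical, and the real code walks the topology tree and also weighs affinity to neighbors rather than scanning arrays.

```c
#include <assert.h>

#define NCORE 8

/*
 * Hypothetical model: core_load[i] counts runnable threads on core i and
 * core_llc[i] identifies the last-level cache that core i belongs to.
 */
static int
pick_core(const int core_load[NCORE], const int core_llc[NCORE], int prev)
{
	int best, i;

	assert(prev >= 0 && prev < NCORE);

	/* Step 1: an idle core sharing the LLC with prev; prev itself first. */
	if (core_load[prev] == 0)
		return (prev);
	for (i = 0; i < NCORE; i++)
		if (core_llc[i] == core_llc[prev] && core_load[i] == 0)
			return (i);

	/* Step 2: no idle core on this cache, so check all CPUs. */
	best = 0;
	for (i = 1; i < NCORE; i++)
		if (core_load[i] < core_load[best])
			best = i;
	return (best);
}
```

With every core on the first cache busy and the second socket idle, step 2 lands on an idle core of the other socket instead of stacking a second thread on a busy one.
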
>
> CPU group comparison is now done in two steps: first, same as before,
> the summary load of all cores is compared; but now, if it is equal, I
> compare the load of the least/most loaded cores. That should make it
> possible to tell whether load 2 really means 1+1 or 2+0. In that case
> the group with 2+0 will be taken as more loaded than the one with 1+1,
> making the group choice more grounded and predictable.
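
A minimal sketch of that two-step comparison, assuming each group is just an array of per-core loads and that the tie-break uses the most loaded core. The name group_load_cmp() and the array representation are hypothetical; the patch itself may differ.

```c
#include <assert.h>

/*
 * Compare two CPU groups of ncores cores each (hypothetical model).
 * Returns <0 if group a counts as less loaded than group b, >0 if it
 * counts as more loaded, and 0 if the two cannot be told apart.
 */
static int
group_load_cmp(const int *a, const int *b, int ncores)
{
	int i, sum_a = 0, sum_b = 0, max_a = 0, max_b = 0;

	assert(ncores > 0);

	for (i = 0; i < ncores; i++) {
		sum_a += a[i];
		sum_b += b[i];
		if (a[i] > max_a)
			max_a = a[i];
		if (b[i] > max_b)
			max_b = b[i];
	}

	/* Step 1: compare the summary load of all cores. */
	if (sum_a != sum_b)
		return (sum_a < sum_b ? -1 : 1);

	/* Step 2: on a tie, the group with the busier core is more loaded. */
	return ((max_a > max_b) - (max_a < max_b));
}
```

For the example from the text, a group with loads {1, 1} compares as less loaded than one with {2, 0}, even though both sum to 2.
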
>
> I've added randomization in case all of the above factors are equal.
>
> As before, I've tested this on a Core i7-870 with 4 physical and 8 logical
> cores and an Atom D525 with 2 physical and 4 logical cores. On the Core i7
> I got a speedup of up to 10-15% in super-smack MySQL and PostgreSQL indexed
> select for 2-8 threads and no penalty in other cases. pbzip2 shows up to a
> 13% performance increase for 2-5 threads and no penalty in other cases.
>
> Tests on the Atom show mostly about the same performance as before in
> database benchmarks: faster for 1 thread, slower for 2-3 and about the
> same for other cases. Single-stream network performance improved the same
> as for the first patch. That CPU is quite difficult to handle: with its mix
> of effective SMT and lack of L3 cache, different scheduling approaches
> give different results in different situations.
>
> Specific performance numbers can be found here:
> http://people.freebsd.org/~mav/bench.ods
> Every point there includes at least 5 samples, and except for the pbzip2
> test, which is quite unstable with the previous sources, all are
> statistically valid.
>
> Florian is now running an alternative set of benchmarks on dual-socket
> hardware without SMT.
>

I have updated my PostgreSQL [1] and pbzip2 [2] benchmarks. You should
be looking for "ULE+mav-htt33". On a system without HTT this patch is at
least as good as stock ULE and in some cases it's a nice improvement.

Florian

[1] https://docs.google.com/spreadsheet/ccc?key=0Ai0N1xDe3uNAdDRxcVFiYjNMSnJWOTZhUWVWWlBlemc&hl=en_US&pli=1#gid=4
[2] https://docs.google.com/spreadsheet/ccc?key=0Ai0N1xDe3uNAdDRxcVFiYjNMSnJWOTZhUWVWWlBlemc&hl=en_US&pli=1#gid=2

