Date:      Tue, 23 Oct 2007 15:09:00 -0500
From:      Josh Paetzel <josh@tcbug.org>
To:        freebsd-performance@freebsd.org, josh.carroll@gmail.com
Cc:        Kip Macy <kip.macy@gmail.com>
Subject:   Re: ULE vs. 4BSD in RELENG_7
Message-ID:  <200710231509.03771.josh@tcbug.org>
In-Reply-To: <8cb6106e0710231257k154e9c6ev4b4ba8c3692206fb@mail.gmail.com>
References:  <8cb6106e0710230902x4edf2c8eu2d912d5de1f5d4a2@mail.gmail.com> <b1fa29170710231047i50859fa7gde2904985a7a8c20@mail.gmail.com> <8cb6106e0710231257k154e9c6ev4b4ba8c3692206fb@mail.gmail.com>


On Tuesday 23 October 2007, Josh Carroll wrote:
> > ULE is tuned towards providing CPU affinity; compilation and,
> > evidently, encoding are workloads that do not benefit from
> > affinity. Before we conclude that it is slower, try building with
> > -j5, -j6, -j7.
>
> Here are the results of running ffmpeg with 4 through 8 threads on
> both schedulers:
>
> 4 threads 4bsd:      117.21
> 5 threads 4bsd:       95.75
> 6 threads 4bsd:       93.10
> 7 threads 4bsd:       92.19
> 8 threads 4bsd:       92.38
>
> 4 threads ule:      122.19
> 5 threads ule:      107.26
> 6 threads ule:      101.40
> 7 threads ule:       98.72
> 8 threads ule:       96.38
>
> 4 threads difference: 4.25 %
> 5 threads difference: 12.02 %
> 6 threads difference: 8.92 %
> 7 threads difference: 7.08 %
> 8 threads difference: 4.33 %
>
> I'm not sure why the performance differential is not consistent
> (probably something very technical a scheduler developer could
> explain) :)
>
> Do these results help at all? When running with 9 or more threads,
> ffmpeg spits out a lot of errors, so 8 was as high as I could go:
>
> Error while decoding stream #0.0
> [h264 @ 0x264ae180]too many threads
> [h264 @ 0x264ae180]decode_slice_header error
> [h264 @ 0x264ae180]no frame!
>
> My next step is to run some transcodes with mencoder to see if it
> has similar performance between the two schedulers. When I have
> those results, I'll post them to this thread.
>
> Thanks for the attention,
> Josh
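
For anyone who wants to check the quoted percentages, here is a minimal
sketch that recomputes them from the timings above. It assumes the
"difference" is how much longer ULE took relative to 4BSD; the thread
does not state the formula or the units, and the variable and function
names are only illustrative.

```python
# Timings copied from the quoted message, keyed by ffmpeg thread count.
# Units are not stated in the thread; lower is assumed to be better.
times_4bsd = {4: 117.21, 5: 95.75, 6: 93.10, 7: 92.19, 8: 92.38}
times_ule = {4: 122.19, 5: 107.26, 6: 101.40, 7: 98.72, 8: 96.38}


def percent_slower(baseline, other):
    """Return how much larger `other` is than `baseline`, in percent."""
    return (other - baseline) / baseline * 100.0


# Assumption: 4BSD is the baseline, so positive values mean ULE took longer.
differences = {
    threads: round(percent_slower(times_4bsd[threads], times_ule[threads]), 2)
    for threads in sorted(times_4bsd)
}

for threads, diff in differences.items():
    print(f"{threads} threads difference: {diff:.2f} %")
```

Under that assumption the output matches the figures in the quoted
message, so the gap is widest at 5 threads and narrows on either side.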

Just curious, but were these results obtained while overclocking
your 2.4 GHz CPU to 3.4 GHz?  That might be a useful
data point.

It also might be useful to know what sort of disks you are using.
SATA is notoriously bad at parallel access, and compiling is of
course horribly disk bound to begin with.

make buildworld also was never designed for massive parallelism,
and it slows down considerably as you try to scale it up with more
CPUs and increase -j past a certain point.  I don't know where the
break is, but it has definitely been hit at 16 cores.

-- 
Thanks,

Josh Paetzel



