Date: Sat, 3 Jun 2017 12:25:04 +0000
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Colin Percival <cperciva@tarsnap.com>, "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Cc: Andriy Gapon <avg@FreeBSD.org>, "cem@freebsd.org" <cem@freebsd.org>, "jeff@freebsd.org" <jeff@freebsd.org>, Ryan Stone <rstone@FreeBSD.org>
Subject: Re: NFS client perf. degradation when SCHED_ULE is used (was when SMP enabled)
Message-ID: <YTXPR01MB01896F824DBDDE12A6FFD0F5DDF40@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <0100015c6c549e3d-228427b4-2734-4ab5-9eef-88fc9ae71f9a-000000@email.amazonses.com>
References: <YTXPR01MB01894DA2879C95E634C792D9DDFC0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM> <YTXPR01MB0189FAF118B27C0E6F9B169EDDFD0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM> <YTXPR01MB0189CF0DCA909BA23F4A4F7BDDF20@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>, <0100015c6c549e3d-228427b4-2734-4ab5-9eef-88fc9ae71f9a-000000@email.amazonses.com>
Colin Percival wrote:
>On 05/28/17 13:16, Rick Macklem wrote:
>> cperciva@ is running a highly parallelized buildworld and he sees
>> slightly better elapsed times and much lower system CPU for SCHED_ULE.
>>
>> As such, I suspect it is the single threaded, processes mostly sleeping waiting
>> for I/O case that is broken.
>> I suspect this is how many people use NFS, since a highly parallelized make would
>> not be a typical NFS client task, I think?
>
>Running `make buildworld -j36` on an EC2 "c4.8xlarge" instance (36 vCPUs, 60
>GB RAM, 10 GbE) with GENERIC-NODEBUG, ULE has a slight edge over 4BSD:
>
>GENERIC-NODEBUG, SCHED_4BSD:
> 1h14m12.48s real   6h25m44.59s user   1h4m53.42s sys
> 1h15m25.48s real   6h25m12.20s user   1h4m34.23s sys
> 1h13m34.02s real   6h25m14.44s user   1h4m09.55s sys
> 1h13m44.04s real   6h25m08.60s user   1h4m40.21s sys
> 1h14m59.69s real   6h25m53.13s user   1h4m55.20s sys
> 1h14m24.00s real   6h24m59.29s user   1h5m37.31s sys
>
>GENERIC-NODEBUG, SCHED_ULE:
> 1h13m00.61s real   6h02m47.59s user   26m45.89s sys
> 1h12m30.18s real   6h01m39.97s user   26m16.45s sys
> 1h13m08.43s real   6h01m46.94s user   26m39.20s sys
> 1h12m18.94s real   6h02m26.80s user   27m39.71s sys
> 1h13m21.38s real   6h00m46.13s user   27m14.96s sys
> 1h12m01.80s real   6h02m24.48s user   27m18.37s sys
>
>Running `make buildworld -j2` on an EC2 "m4.large" instance (2 vCPUs, 8 GB RAM,
>~ 500 Mbps network), 4BSD has a slight edge over ULE on real and sys
>time but is slightly worse on user time:
>
>GENERIC-NODEBUG, SCHED_4BSD:
> 6h29m25.17s real   7h2m56.02s user   14m52.63s sys
> 6h29m36.82s real   7h2m58.19s user   15m14.21s sys
> 6h28m27.61s real   7h1m38.24s user   14m56.91s sys
> 6h27m05.42s real   7h1m38.57s user   15m04.31s sys
>
>GENERIC-NODEBUG, SCHED_ULE:
> 6h34m19.41s real   6h59m43.99s user   18m8.62s sys
> 6h33m55.08s real   6h58m44.91s user   18m4.31s sys
> 6h34m49.68s real   6h56m03.58s user   17m49.83s sys
> 6h35m22.14s real   6h58m12.62s user   17m52.05s sys

Doing these test runs, but on the 36 vCPU system, would be closer to what
I was testing. My tests do not use "-j" and run on an 8-core chunk of
real hardware.

>Note that in both cases there is lots of idle time (although far more in the
>-j36 case); this is partly due to a lack of parallelism in buildworld, but
>largely due to having /usr/obj mounted on Amazon EFS.
>
>These differences all seem within the range which could result from cache
>effects due to threads staying on one CPU rather than bouncing around; so
>whatever Rick is tripping over, it doesn't seem to be affecting these tests.

Yep. Thanks for doing the testing, rick
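
For anyone who wants to compare the quoted runs numerically, the following is a minimal sketch (not part of the original measurements) that averages the real, user and sys times per scheduler. The timing strings are copied from the tables above; the helper names `to_seconds` and `summarize` and the `RUNS` labels are made up for this illustration.

```python
import re
from statistics import mean


def to_seconds(text):
    """Convert a duration such as '1h14m12.48s' or '26m45.89s' to seconds."""
    m = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?([\d.]+)s", text)
    if m is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = m.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds)


# (real, user, sys) per run, copied from the quoted tables.
# The dictionary keys are labels invented for this sketch.
RUNS = {
    "-j36 SCHED_4BSD": [
        ("1h14m12.48s", "6h25m44.59s", "1h4m53.42s"),
        ("1h15m25.48s", "6h25m12.20s", "1h4m34.23s"),
        ("1h13m34.02s", "6h25m14.44s", "1h4m09.55s"),
        ("1h13m44.04s", "6h25m08.60s", "1h4m40.21s"),
        ("1h14m59.69s", "6h25m53.13s", "1h4m55.20s"),
        ("1h14m24.00s", "6h24m59.29s", "1h5m37.31s"),
    ],
    "-j36 SCHED_ULE": [
        ("1h13m00.61s", "6h02m47.59s", "26m45.89s"),
        ("1h12m30.18s", "6h01m39.97s", "26m16.45s"),
        ("1h13m08.43s", "6h01m46.94s", "26m39.20s"),
        ("1h12m18.94s", "6h02m26.80s", "27m39.71s"),
        ("1h13m21.38s", "6h00m46.13s", "27m14.96s"),
        ("1h12m01.80s", "6h02m24.48s", "27m18.37s"),
    ],
    "-j2 SCHED_4BSD": [
        ("6h29m25.17s", "7h2m56.02s", "14m52.63s"),
        ("6h29m36.82s", "7h2m58.19s", "15m14.21s"),
        ("6h28m27.61s", "7h1m38.24s", "14m56.91s"),
        ("6h27m05.42s", "7h1m38.57s", "15m04.31s"),
    ],
    "-j2 SCHED_ULE": [
        ("6h34m19.41s", "6h59m43.99s", "18m8.62s"),
        ("6h33m55.08s", "6h58m44.91s", "18m4.31s"),
        ("6h34m49.68s", "6h56m03.58s", "17m49.83s"),
        ("6h35m22.14s", "6h58m12.62s", "17m52.05s"),
    ],
}


def summarize(runs):
    """Return the mean (real, user, sys) in seconds over a list of runs."""
    columns = zip(*[[to_seconds(t) for t in run] for run in runs])
    return tuple(mean(col) for col in columns)


if __name__ == "__main__":
    for label, runs in RUNS.items():
        real, user, sys_ = summarize(runs)
        print(f"{label:16s} real {real:9.1f}s  user {user:9.1f}s  sys {sys_:8.1f}s")
```

With only four to six runs per configuration these means are a rough summary, not a significance test.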