Date: Fri, 13 Mar 2015 11:23:06 -0400
From: Ryan Stone <rysto32@gmail.com>
To: Mateusz Guzik <mjguzik@gmail.com>, Ryan Stone <rysto32@gmail.com>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: [PATCH] Convert the VFS cache lock to an rmlock
Message-ID: <CAFMmRNz6LB81Sf%2BXK%2BN57RvoUOuRHPVHjL%2B0Op_QrQ3J95B8Vw@mail.gmail.com>
In-Reply-To: <20150312173635.GB9153@dft-labs.eu>
References: <CAFMmRNysnUezX9ozGrCpivPCTMYRJtoxm9ijR0yQO03LpXnwBQ@mail.gmail.com> <20150312173635.GB9153@dft-labs.eu>
On Thu, Mar 12, 2015 at 1:36 PM, Mateusz Guzik <mjguzik@gmail.com> wrote:
> Workloads like buildworld and the like (i.e. a lot of forks + execs) run
> into very severe contention in vm, which is orders of magnitude bigger
> than anything else.
>
> As such your result seems quite suspicious.
>
You're right, I did mess up the testing somewhere (I have no idea how). As
you suggested, I switched to using a separate partition for the objdir and
ran each build on a freshly newfsed filesystem. I scripted it to be sure
that I was following the same procedure on each run:
# Build known-working commit from head
git checkout 09be0092bd3285dd33e99bcab593981060e99058 || exit 1

for i in `jot 5`
do
    # Create a fresh fs for objdir
    sudo umount -f /usr/obj 2> /dev/null
    sudo newfs -U -j -L OBJ $objdev || exit 1
    sudo mount $objdev /usr/obj || exit 1
    sudo chmod a+rwx /usr/obj || exit 1

    # Ensure disk cache contains all source files
    git status > /dev/null

    /usr/bin/time -a -o $logfile make -s -j$(sysctl -n hw.ncpu) \
        buildworld buildkernel
done
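(For completeness: ministat wants one number per line, so the wall-clock
times have to be pulled out of the time(1) log first. This is a hypothetical
post-processing step I didn't include above; the filenames and the
"N real N user N sys" line format are assumptions about my setup.)

```shell
# Sample of what /usr/bin/time -a -o appends after each build run
printf '2483.45 real  5641.50 user  310.20 sys\n' > sample.log
printf '2487.74 real  5715.70 user  315.80 sys\n' >> sample.log

# Extract the first column of each "real" line, one per run,
# in the one-number-per-line form ministat expects
awk '/real/ { print $1 }' sample.log
```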
I tested on the original 12-core machine, as well as on a 2-package x 8-core
x 2-HTT (32 logical cores) machine that a co-worker was able to lend me.
Unfortunately, the results now show a performance decrease. It's almost 5%
on the 32-core machine:
$ ministat -w 74 -C 1 12core/*
x 12core/orig.log
+ 12core/rmlock.log
+--------------------------------------------------------------------------+
|x xx x x + + + + +|
| |_________A__________| |_______________A___M__________||
+--------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 2478.81 2487.74 2483.45 2483.652 3.2495646
+ 5 2489.64 2501.67 2498.26 2496.832 4.7394694
Difference at 95.0% confidence
13.18 +/- 5.92622
0.53067% +/- 0.238609%
(Student's t, pooled s = 4.06339)
$ ministat -w 74 -C 1 32core/*
x 32core/orig.log
+ 32core/rmlock.log
+--------------------------------------------------------------------------+
|x x + |
|x x x + ++ +|
||__AM| |_______AM_____| |
+--------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 1067.97 1072.86 1071.29 1070.314 2.2238997
+ 5 1111.22 1129.05 1122.3 1121.324 6.4046569
Difference at 95.0% confidence
51.01 +/- 6.99181
4.76589% +/- 0.653249%
(Student's t, pooled s = 4.79403)
The difference is due to a significant increase in system time. Write
locks on an rmlock are extremely expensive (each one involves an
smp_rendezvous across all CPUs), so the cost likely scales with the number
of cores:
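For context, the rmlock(9) API makes this asymmetry explicit: readers pass
a per-CPU priotracker and stay cheap, while the write path has to rendezvous
with every CPU to flush out active readers. A minimal kernel-side sketch of
the usage pattern (function names follow rmlock(9); the cache_* wrappers are
hypothetical, and this obviously won't compile outside the kernel):

```c
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/rmlock.h>

static struct rmlock cache_rm;

static void
cache_lock_init(void)
{
	rm_init(&cache_rm, "vfs cache");
}

/* Read path: cheap and per-CPU; the tracker lives on the caller's stack. */
static void
cache_lookup_locked(void)
{
	struct rm_priotracker tracker;

	rm_rlock(&cache_rm, &tracker);
	/* ... perform lookup ... */
	rm_runlock(&cache_rm, &tracker);
}

/* Write path: involves an smp_rendezvous, so cost grows with CPU count. */
static void
cache_modify_locked(void)
{
	rm_wlock(&cache_rm);
	/* ... insert/remove entries ... */
	rm_wunlock(&cache_rm);
}
```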
x 32core/orig.log
+ 32core/rmlock.log
+--------------------------------------------------------------------------+
|xxx x + +++ +|
||_MA__| |____MA______| |
+--------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 5 5616.63 5715.7 5641.5 5661.72 48.511545
+ 5 6502.51 6781.84 6596.5 6612.39 103.06568
Difference at 95.0% confidence
950.67 +/- 117.474
16.7912% +/- 2.07489%
(Student's t, pooled s = 80.5478)
At this point I'm pretty much at an impasse. The real-time behaviour is
critical to me, but a 5% buildworld degradation isn't likely to be
acceptable to many people. I'll see what I can do about this.
