Date: Fri, 13 Mar 2015 11:23:06 -0400
From: Ryan Stone <rysto32@gmail.com>
To: Mateusz Guzik <mjguzik@gmail.com>, Ryan Stone <rysto32@gmail.com>, FreeBSD Current <freebsd-current@freebsd.org>
Subject: Re: [PATCH] Convert the VFS cache lock to an rmlock
Message-ID: <CAFMmRNz6LB81Sf%2BXK%2BN57RvoUOuRHPVHjL%2B0Op_QrQ3J95B8Vw@mail.gmail.com>
In-Reply-To: <20150312173635.GB9153@dft-labs.eu>
References: <CAFMmRNysnUezX9ozGrCpivPCTMYRJtoxm9ijR0yQO03LpXnwBQ@mail.gmail.com> <20150312173635.GB9153@dft-labs.eu>

On Thu, Mar 12, 2015 at 1:36 PM, Mateusz Guzik <mjguzik@gmail.com> wrote:
> Workloads like buildworld and the like (i.e. a lot of forks + execs) run
> into very severe contention in vm, which is orders of magnitude bigger
> than anything else.
>
> As such your result seems quite suspicious.
>

You're right, I did mess up the testing somewhere (I have no idea how).
As you suggested, I switched to using a separate partition for the
objdir, and ran each build with a freshly newfsed filesystem. I scripted
it to be sure that I was following the same procedure with each run:

# Build known-working commit from head
git checkout 09be0092bd3285dd33e99bcab593981060e99058 || exit 1

for i in `jot 5`
do
    # Create a fresh fs for objdir
    sudo umount -f /usr/obj 2> /dev/null
    sudo newfs -U -j -L OBJ $objdev || exit 1
    sudo mount $objdev /usr/obj || exit 1
    sudo chmod a+rwx /usr/obj || exit 1

    # Ensure disk cache contains all source files
    git status > /dev/null

    /usr/bin/time -a -o $logfile make -s -j$(sysctl -n hw.ncpu) buildworld buildkernel
done

I tested on the original 12-core machine, as well as a 2 package x 8
core x 2 HTT (32 logical cores) machine that a co-worker was able to
lend me. Unfortunately, the results show a performance decrease now.
It's almost 5% on the 32 core machine:

$ ministat -w 74 -C 1 12core/*
x 12core/orig.log
+ 12core/rmlock.log
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x   5       2478.81       2487.74       2483.45      2483.652     3.2495646
+   5       2489.64       2501.67       2498.26      2496.832     4.7394694
Difference at 95.0% confidence
        13.18 +/- 5.92622
        0.53067% +/- 0.238609%
        (Student's t, pooled s = 4.06339)

$ ministat -w 74 -C 1 32core/*
x 32core/orig.log
+ 32core/rmlock.log
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x   5       1067.97       1072.86       1071.29      1070.314     2.2238997
+   5       1111.22       1129.05        1122.3      1121.324     6.4046569
Difference at 95.0% confidence
        51.01 +/- 6.99181
        4.76589% +/- 0.653249%
        (Student's t, pooled s = 4.79403)

The difference is due to a significant increase in system time. Write
locks on an rmlock are extremely expensive (they involve an
smp_rendezvous), and the cost likely scales with the number of cores:

x 32core/orig.log
+ 32core/rmlock.log
[ministat ASCII distribution plot]
    N           Min           Max        Median           Avg        Stddev
x   5       5616.63        5715.7        5641.5       5661.72     48.511545
+   5       6502.51       6781.84        6596.5       6612.39     103.06568
Difference at 95.0% confidence
        950.67 +/- 117.474
        16.7912% +/- 2.07489%
        (Student's t, pooled s = 80.5478)

At this point I'm pretty much at an impasse. The real-time behaviour is
critical to me, but a 5% performance degradation isn't likely to be
acceptable to many people. I'll see what I can do with this.
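
The "Difference at 95.0% confidence" figures in the ministat output above can be checked from the summary rows alone. What follows is a minimal sketch, not ministat's own code: it assumes a pooled-variance Student's t interval and a two-sided critical value of 2.306 for 8 degrees of freedom (5 + 5 - 2). The helper name pooled_ci is hypothetical.

```shell
#!/bin/sh
# pooled_ci is a hypothetical helper for illustration; it is not part of ministat.
# Arguments: n1 mean1 sd1 n2 mean2 sd2 tcrit
# tcrit is the two-sided Student's t critical value for n1 + n2 - 2 degrees
# of freedom (assumed here: 2.306 for 8 degrees of freedom at 95%).
pooled_ci() {
    awk -v n1="$1" -v m1="$2" -v s1="$3" \
        -v n2="$4" -v m2="$5" -v s2="$6" -v t="$7" 'BEGIN {
        # Pooled standard deviation of the two samples.
        pooled = sqrt(((n1 - 1) * s1 * s1 + (n2 - 1) * s2 * s2) / (n1 + n2 - 2))
        diff = m2 - m1
        # Half-width of the confidence interval for the difference of means.
        err = t * pooled * sqrt(1 / n1 + 1 / n2)
        # Percentages are taken relative to the first (baseline) mean.
        printf "%.2f +/- %.2f (%.2f%% +/- %.2f%%), pooled s = %.3f\n",
            diff, err, 100 * diff / m1, 100 * err / m1, pooled
    }'
}

# Real-time summary rows (N, Avg, Stddev) from the 32-core run quoted above.
pooled_ci 5 1070.314 2.2238997 5 1121.324 6.4046569 2.306
```

Fed the 32-core real-time rows, this gives a difference of about 51.01 +/- 6.99 seconds with a pooled s of about 4.794, which agrees with the ministat figures to the precision shown. The interval excludes zero, so the slowdown is unlikely to be run-to-run noise.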