FreeBSD Mail Archives

Date:      Thu, 12 Mar 2015 18:36:35 +0100
From:      Mateusz Guzik <mjguzik@gmail.com>
To:        Ryan Stone <rysto32@gmail.com>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>
Subject:   Re: [PATCH] Convert the VFS cache lock to an rmlock
Message-ID:  <20150312173635.GB9153@dft-labs.eu>
In-Reply-To: <CAFMmRNysnUezX9ozGrCpivPCTMYRJtoxm9ijR0yQO03LpXnwBQ@mail.gmail.com>
References:  <CAFMmRNysnUezX9ozGrCpivPCTMYRJtoxm9ijR0yQO03LpXnwBQ@mail.gmail.com>


On Thu, Mar 12, 2015 at 11:14:42AM -0400, Ryan Stone wrote:
> I've just submitted a patch to Differential[1] for review that converts the
> VFS cache to use an rmlock in place of the current rwlock.  My main
> motivation for the change is to fix a priority inversion problem that I saw
> recently.  A real-time priority thread attempted to acquire a write lock on
> the VFS cache lock, but there was already a reader holding it.  The reader
> was preempted by a normal priority thread, and my real-time thread was
> starved.
> 
> [1] https://reviews.freebsd.org/D2051
> 
> 
> I was worried about the performance implications of the change, as I wasn't
> sure how common write operations on the VFS cache would be.  I did a -j12
> buildworld/buildkernel test on a 12-core Haswell Xeon system, as I figured
> that would be a reasonable stress test that simultaneously creates lots of
> small files and reads a lot of files as well.  This actually wound up being
> about a 10% performance *increase* (the units below are seconds of elapsed
> time as measured by /usr/bin/time, so smaller is better):
> 
> $ ministat -C 1 orig.log rmlock.log
> x orig.log
> + rmlock.log
> +------------------------------------------------------------------------------+
> |  +                                                                     x     |
> |++++                                            x                    x xxx    |
> | |A|                                                      |_________A___M____||
> +------------------------------------------------------------------------------+
>     N           Min           Max        Median           Avg        Stddev
> x   6       2710.31       2821.35       2816.75     2798.0617     43.324817
> +   5       2488.25       2500.25       2498.04      2495.756     5.0494782
> Difference at 95.0% confidence
>         -302.306 +/- 44.4709
>         -10.8041% +/- 1.58935%
>         (Student's t, pooled s = 32.4674)
> 
> The one outlier in the rwlock case does confuse me a bit.  I booted a
> freshly-built image with the rmlock patch applied, did a git checkout of
> head, and then did 5 builds in a row.  The git checkout should
> have had the effect of priming the disk cache with the source files.  Then
> I installed the stock head kernel, rebooted, and ran 5 more builds (and
> then 1 more when I noticed the outlier).  The fast outlier was the *first*
> run, which should have been running with a cold disk cache, so I really
> don't know why it would be 90 seconds faster.  I do see that this run also
> had about 500-600 fewer seconds spent in system time:
> 
> x orig.log
> +------------------------------------------------------------------------------+
> |                                                                x             |
> |x                                                        x   x xx             |
> |                          |_________________________A__________M_____________||
> +------------------------------------------------------------------------------+
>     N           Min           Max        Median           Avg        Stddev
> x   6       3515.23       4121.84       4105.57       4001.71     239.61362
> 
> I'm not sure how much I care, given that the rmlock is universally
> faster (but maybe I should try the "cold boot" case anyway).
> 
> If anybody has any comments or further testing that they would like to see,
> please let me know.
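
For readers who want to check the quoted ministat summary by hand, here is a
minimal Python sketch of a pooled-variance Student's t comparison, fed with
the N, Avg and Stddev values from the first table above.  The function names
are illustrative, and the critical value 2.262 (two-sided 95%, 9 degrees of
freedom) is taken from a standard t table.  This is an approximation of the
calculation, not ministat's own code.

```python
import math


def pooled_stddev(n1, s1, n2, s2):
    """Pooled standard deviation of two samples, given sizes and stddevs."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def difference_with_ci(n1, avg1, s1, n2, avg2, s2, t_crit):
    """Return (difference of means, CI half-width, pooled stddev)."""
    sp = pooled_stddev(n1, s1, n2, s2)
    diff = avg2 - avg1
    half_width = t_crit * sp * math.sqrt(1.0 / n1 + 1.0 / n2)
    return diff, half_width, sp


# Assumption: two-sided 95% critical value for 6 + 5 - 2 = 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262

# Summary values copied from the quoted table (x = orig.log, + = rmlock.log).
ORIG = (6, 2798.0617, 43.324817)
RMLOCK = (5, 2495.756, 5.0494782)

diff, half_width, sp = difference_with_ci(*ORIG, *RMLOCK, T_CRIT_95_DF9)

# Percentages are expressed relative to the reference (orig.log) average.
pct_diff = 100.0 * diff / ORIG[1]
pct_half_width = 100.0 * half_width / ORIG[1]

print("difference: %.3f +/- %.4f" % (diff, half_width))
print("relative:   %.4f%% +/- %.5f%%" % (pct_diff, pct_half_width))
print("pooled s:   %.4f" % sp)
```

The interval stays well clear of zero, which is why the difference is
reported at 95% confidence.  It says nothing about whether the two sets of
runs were taken under comparable conditions.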

Workloads like buildworld (i.e. a lot of forks + execs) run into very
severe contention in the VM subsystem, which is orders of magnitude bigger
than contention anywhere else.

As such, your result seems quite suspicious.

Can you describe in more detail how you were testing?

Did you have a separate fs for the obj tree which was mounted+unmounted
before each run?

I suggest you grab a machine from zoo[1] and run some tests on "bigger"
hardware.

A perf improvement, even slight, is definitely welcome.

[1] https://wiki.freebsd.org/TestClusterOneReservations

-- 
Mateusz Guzik <mjguzik gmail.com>

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20150312173635.GB9153>