From: Ryan Stone
To: FreeBSD Current <freebsd-current@freebsd.org>
Date: Thu, 12 Mar 2015 11:14:42 -0400
Subject: [PATCH] Convert the VFS cache lock to an rmlock

I've just submitted a patch to Differential[1] for review that converts the
VFS cache to use an rmlock in place of the current rwlock.  My main
motivation for the change is to fix a priority inversion problem that I saw
recently.  A real-time priority thread attempted to acquire a write lock on
the VFS cache lock, but there was already a reader holding it.  The reader
was preempted by a normal-priority thread, and my real-time thread was
starved.

[1] https://reviews.freebsd.org/D2051

I was worried about the performance implications of the change, as I wasn't
sure how common write operations on the VFS cache would be.  I did a -j12
buildworld/buildkernel test on a 12-core Haswell Xeon system, as I figured
that would be a reasonable stress test that simultaneously creates lots of
small files and reads a lot of files as well.
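For anyone who hasn't looked at rmlock(9) before, the conversion is mostly
mechanical.  Here is a rough sketch of the shape of it (illustrative only,
with made-up names rather than the actual namecache code; see the review
for the real patch): write locking looks the same as with an rwlock, while
each read-lock acquisition gains a caller-supplied rm_priotracker.

/*
 * Illustrative sketch only, not the actual namecache patch: the lock and
 * function names are made up.  It shows the shape of an rwlock to
 * rmlock(9) conversion: the write side is unchanged, while each read
 * acquisition passes a caller-supplied rm_priotracker.
 */
#include <sys/param.h>
#include <sys/lock.h>
#include <sys/rmlock.h>

static struct rmlock example_cache_lock;

static void
example_cache_init(void)
{
	/* Was: rw_init(&example_cache_lock, "Name Cache"); */
	rm_init(&example_cache_lock, "Name Cache");
}

static void
example_cache_lookup(void)
{
	struct rm_priotracker tracker;

	/* Was: rw_rlock()/rw_runlock(). */
	rm_rlock(&example_cache_lock, &tracker);
	/* ... read-only lookup ... */
	rm_runlock(&example_cache_lock, &tracker);
}

static void
example_cache_insert(void)
{
	/* Was: rw_wlock()/rw_wunlock(); the write side is unchanged. */
	rm_wlock(&example_cache_lock);
	/* ... modify the cache ... */
	rm_wunlock(&example_cache_lock);
}

The expected tradeoff is that write acquisitions become more expensive with
an rmlock, which is exactly why I was worried about how frequent write
operations on the cache would be.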
The buildworld/buildkernel test actually wound up showing about a 10%
performance *increase* (the units below are seconds of elapsed time as
measured by /usr/bin/time, so smaller is better):

$ ministat -C 1 orig.log rmlock.log
x orig.log
+ rmlock.log
[histogram omitted]
    N           Min           Max        Median           Avg        Stddev
x   6       2710.31       2821.35       2816.75     2798.0617     43.324817
+   5       2488.25       2500.25       2498.04      2495.756     5.0494782
Difference at 95.0% confidence
        -302.306 +/- 44.4709
        -10.8041% +/- 1.58935%
        (Student's t, pooled s = 32.4674)

The one outlier in the rwlock case does confuse me a bit.  What I did was
boot a freshly-built image with the rmlock patch applied, do a git checkout
of head, and then do 5 builds in a row.  The git checkout should have had
the effect of priming the disk cache with the source files.  Then I
installed the stock head kernel, rebooted, and ran 5 more builds (and then
1 more when I noticed the outlier).  The fast outlier was the *first* run,
which should have been running with a cold disk cache, so I really don't
know why it would be 90 seconds faster.  I do see that this run also had
about 500-600 fewer seconds spent in system time:

x orig.log
[histogram omitted]
    N           Min           Max        Median           Avg        Stddev
x   6       3515.23       4121.84       4105.57       4001.71     239.61362

I'm not sure how much I care, given that the rmlock is universally faster
(but maybe I should try the "cold boot" case anyway).  If anybody has any
comments or further testing that they would like to see, please let me know.