From: Konstantin Belousov
To: arch@freebsd.org
Date: Fri, 3 Feb 2012 21:37:19 +0200
Subject: Prefaulting for i/o buffers
Message-ID: <20120203193719.GB3283@deviant.kiev.zoral.com.ua>

The FreeBSD I/O infrastructure has a well-known deadlock caused by a
vnode lock order reversal when the buffers supplied to the read(2) or
write(2) syscalls are backed by an mmapped file.

I previously published patches to convert the i/o path to use VMIO,
based on Jeff Roberson's proposal, see http://wiki.freebsd.org/VM6.
As a side effect, VM6 fixed the deadlock. Since that work is very
intrusive and did not get any follow-up, it stalled.

Below is a very lightweight patch whose only goal is to fix the
deadlock in the least intrusive way. This became possible after FreeBSD
got the vm_fault_quick_hold_pages(9) and vm_fault_disable_pagefaults(9)
KPIs.

http://people.freebsd.org/~kib/misc/vm1.3.patch

The theory of operation is described in the patched
sys/kern/vfs_vnops.c, see the preamble comment for vn_io_fault().
The patch borrows the rangelocks implementation from VM6, which was
discussed and improved together with Attilio Rao.

I was not able to reproduce the deadlock with the targeted test running
for several hours, while stock HEAD deadlocks in the first iteration.

Below is the benchmark for the worst-case situation for the patched
system, reading 1 byte from a file in a loop. The value is the time in
seconds to execute read(2) for a single byte and lseek back to the start
of the file. The loop is executed 100,000,000 times. The machine has a
3.4GHz Core i7 2600K and ran HEAD@230866 with debugging options turned
off. As you see, the rangelock overhead for the worst (but uncontended)
case is less than 10%.

x stock-1-byte.txt
+ vm1-1-byte.txt
+--------------------------------------------------------------------------+
|xx                                                                      ++|
|xxx                                                                    +++|
||A                                                                     |A||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5  1.063206e-06  1.065569e-06  1.064172e-06  1.064109e-06 9.8031959e-10
+   5  1.167145e-06  1.170244e-06  1.168939e-06 1.1690444e-06 1.2477022e-09
Difference at 95.0% confidence
	1.04935e-07 +/- 1.63638e-09
	9.86134% +/- 0.153779%
	(Student's t, pooled s = 1.122e-09)