Date:      Fri, 3 Feb 2012 21:37:19 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        arch@freebsd.org
Subject:   Prefaulting for i/o buffers
Message-ID:  <20120203193719.GB3283@deviant.kiev.zoral.com.ua>


--sZvnRN25x3w09J/6
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

The FreeBSD I/O infrastructure has a well-known deadlock caused by a
vnode lock order reversal when the buffers supplied to the read(2) or
write(2) syscalls are backed by an mmapped file.
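
For illustration only, here is a minimal userland sketch of the kind of
call involved: a read(2) whose destination buffer is a shared mapping of
another file. It is not the targeted deadlock test. It is
single-threaded and shows only one side of the reversal, which, as I
understand it, needs a second thread doing the same thing with the two
files swapped. The function name and its parameters are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes from src_path, using a
 * shared mapping of dst_path as the read(2) buffer.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
read_into_mapped_file(const char *src_path, const char *dst_path, size_t len)
{
	ssize_t n = -1;
	int src, dst;
	void *buf;

	assert(len > 0);	/* mmap(2) rejects zero-length mappings */

	src = open(src_path, O_RDONLY);
	if (src == -1)
		return (-1);
	dst = open(dst_path, O_RDWR | O_CREAT, 0600);
	if (dst == -1) {
		close(src);
		return (-1);
	}
	/* Size the destination so the whole mapping is backed by the file. */
	if (ftruncate(dst, (off_t)len) == 0) {
		buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
		    dst, 0);
		if (buf != MAP_FAILED) {
			/*
			 * The kernel copies file data from src into buf.
			 * Touching buf may fault on pages that belong to
			 * dst while the read on src is still in progress.
			 */
			n = read(src, buf, len);
			munmap(buf, len);
		}
	}
	close(dst);
	close(src);
	return (n);
}
```

Calling read_into_mapped_file("a", "b", 4096) from one thread while
another thread calls read_into_mapped_file("b", "a", 4096) gives the
general shape of the problematic pattern.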

I previously published patches to convert the i/o path to use VMIO,
based on Jeff Roberson's proposal; see
http://wiki.freebsd.org/VM6. As a side effect, VM6 fixed the
deadlock. Since that work is very intrusive and did not get any
follow-up, it stalled.

Below is a very lightweight patch whose only goal is to fix the deadlock
in the least intrusive way. This became possible after FreeBSD got the
vm_fault_quick_hold_pages(9) and vm_fault_disable_pagefaults(9) KPIs.
http://people.freebsd.org/~kib/misc/vm1.3.patch

The theory of operation is described in the patched
sys/kern/vfs_vnops.c; see the preamble comment for vn_io_fault(). The
patch borrows the rangelock implementation from VM6, which was
discussed and improved together with Attilio Rao.

I was not able to reproduce the deadlock with the patch in a targeted
test that ran for several hours, while stock HEAD deadlocks on the
first iteration.

Below is the benchmark for the worst-case situation for the patched
system, reading 1 byte from a file in a loop. The value is the time in
seconds to execute read(2) for a single byte and lseek(2) back to the
start of the file. The loop is executed 100,000,000 times. The machine
has a 3.4GHz Core i7 2600K and ran HEAD@230866 with debugging options
turned off.

As you can see, the rangelock overhead for the worst (but uncontended)
case is less than 10%.

x stock-1-byte.txt
+ vm1-1-byte.txt
+--------------------------------------------------------------------------+
|xx                                                                      ++|
|xxx                                                                    +++|
||A                                                                     |A||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5  1.063206e-06  1.065569e-06  1.064172e-06  1.064109e-06 9.8031959e-10
+   5  1.167145e-06  1.170244e-06  1.168939e-06 1.1690444e-06 1.2477022e-09
Difference at 95.0% confidence
	1.04935e-07 +/- 1.63638e-09
	9.86134% +/- 0.153779%
	(Student's t, pooled s = 1.122e-09)


--sZvnRN25x3w09J/6--


