Date:      Fri, 3 Feb 2012 19:40:37 +0000
From:      Attilio Rao <attilio@freebsd.org>
To:        Konstantin Belousov <kostikbel@gmail.com>
Cc:        arch@freebsd.org
Subject:   Re: Prefaulting for i/o buffers
Message-ID:  <CAJ-FndDyFBQvmg1sBXfdZij6jC=WvWoYDBBurAOg=q36mdcPYw@mail.gmail.com>
In-Reply-To: <20120203193719.GB3283@deviant.kiev.zoral.com.ua>
References:  <20120203193719.GB3283@deviant.kiev.zoral.com.ua>

2012/2/3 Konstantin Belousov <kostikbel@gmail.com>:
> FreeBSD I/O infrastructure has a well-known deadlock caused by vnode
> lock order reversal when buffers supplied to the read(2) or write(2)
> syscalls are backed by an mmapped file.
>
> I previously published patches to convert the i/o path to use VMIO,
> based on Jeff Roberson's proposal, see
> http://wiki.freebsd.org/VM6. As a side effect, VM6 fixed the
> deadlock. Since that work is very intrusive and did not get any
> follow-up, it stalled.
>
> Below is a very lightweight patch whose only goal is to fix the
> deadlock in the least intrusive way. This became possible after
> FreeBSD got the vm_fault_quick_hold_pages(9) and
> vm_fault_disable_pagefaults(9) KPIs.
> http://people.freebsd.org/~kib/misc/vm1.3.patch
>
> The theory of operation is described in the patched
> sys/kern/vfs_vnops.c, see the preamble comment for vn_io_fault(). The
> patch borrows the rangelocks implementation from VM6, which was
> discussed and improved together with Attilio Rao.
>
> I was not able to reproduce the deadlock in the targeted test running
> for several hours, while stock HEAD deadlocks in the first iteration.
>
> Below is the benchmark for the worst-case situation for the patched
> system, reading 1 byte from a file in a loop. The value is the time in
> seconds to execute read(2) for a single byte and lseek back to the
> start of the file. The loop is executed 100,000,000 times. The machine
> has a 3.4GHz Core i7 2600K and ran HEAD@230866 with debugging options
> turned off.
>
> As you see, the rangelock overhead for the worst (but uncontended)
> case is less than 10%.
>
> x stock-1-byte.txt
> + vm1-1-byte.txt
> +--------------------------------------------------------------------------+
> |xx                                                                      ++|
> |xxx                                                                    +++|
> ||A                                                                     |A||
> +--------------------------------------------------------------------------+
>     N           Min           Max        Median           Avg        Stddev
> x   5  1.063206e-06  1.065569e-06  1.064172e-06  1.064109e-06 9.8031959e-10
> +   5  1.167145e-06  1.170244e-06  1.168939e-06 1.1690444e-06 1.2477022e-09
> Difference at 95.0% confidence
>         1.04935e-07 +/- 1.63638e-09
>         9.86134% +/- 0.153779%
>         (Student's t, pooled s = 1.122e-09)

Do you have an ETA for reviews? When do you plan to commit this?
It would be valuable to get a better grasp of the benchmark and to
narrow the performance difference as much as possible.

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein