Date:      Sat, 25 Feb 2012 16:13:34 +0100
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        Konstantin Belousov <kostikbel@gmail.com>, arch@freebsd.org
Subject:   Re: Prefaulting for i/o buffers
Message-ID:  <20120225151334.GH1344@garage.freebsd.pl>
In-Reply-To: <CAJ-FndABi21GfcCRTZizCPc_Mnxm1EY271BiXcYt9SD_zXFpXw@mail.gmail.com>
References:  <20120203193719.GB3283@deviant.kiev.zoral.com.ua> <CAJ-FndABi21GfcCRTZizCPc_Mnxm1EY271BiXcYt9SD_zXFpXw@mail.gmail.com>

On Sat, Feb 25, 2012 at 01:01:32PM +0000, Attilio Rao wrote:
> On 3 February 2012 at 19:37, Konstantin Belousov <kostikbel@gmail.com> wrote:
> > The FreeBSD I/O infrastructure has a well-known issue with a deadlock
> > caused by vnode lock order reversal when the buffers supplied to the
> > read(2) or write(2) syscalls are backed by an mmapped file.
> >
> > I previously published patches to convert the i/o path to use VMIO,
> > based on the Jeff Roberson proposal, see
> > http://wiki.freebsd.org/VM6. As a side effect, VM6 fixed the
> > deadlock. Since that work is very intrusive and did not get any
> > follow-up, it stalled.
> >
> > Below is a very lightweight patch whose only goal is to fix the
> > deadlock in the least intrusive way. This became possible after FreeBSD
> > got the vm_fault_quick_hold_pages(9) and
> > vm_fault_disable_pagefaults(9) KPIs.
> > http://people.freebsd.org/~kib/misc/vm1.3.patch
>
> Hi,
> I was reviewing:
> http://people.freebsd.org/~kib/misc/vm1.11.patch
>
> and I think it is great. It is simple enough and I don't have further
> comments on it.
>
> However, as a side note, I was wondering whether we could one day get
> to the point of integrating rangelocks into the vnode lockmgr directly.
> It would be a huge patch, likely rewriting the locking of several
> members of vnodes, but I think it would be worth it in terms of
> cleanness of the interface and less overhead. Also, it would be
> interesting to consider merging the rangelock implementation with ZFS'
> one, at some point.

My personal opinion about rangelocks and many other VFS features we
currently have is that they are a good idea in theory, but in practice
they tend to overcomplicate VFS.

I'm of the opinion that we should move as much as we can into the
individual file systems. We try to implement everything in VFS itself in
the hope that this will simplify the file systems we have. It then turns
out that only one file system really uses the feature (most of the time
it is UFS), and it is a PITA for all the other file systems as well as
for maintaining VFS. VFS has become so complicated over the years that
maybe only a few people can understand it, and every single change to
VFS carries a huge risk of breaking some unrelated part.

File systems most of the time know much better how they work and what
should be done to make them optimal. For example, ZFS has had range
locking from day one, but we can't take advantage of it, because our VFS
"knows better" how ZFS locking should be done. There are plenty of
examples:
- range vnode locking,
- shared vnode locking,
- quota (which I believe is still part of UFS, but I remember ideas of
  moving it to VFS),
- suspend/resume fs,
- buffer cache,
- vnodes reclamation.
I'm sure there are other examples.

In my opinion we should do whatever we can to simplify VFS. Having a
complex VFS makes it harder, _not_ easier, to develop file systems for
FreeBSD and to port file systems to it. Interaction with VFS was
definitely the hardest part of my work porting ZFS to FreeBSD.

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!                     http://tupytaj.pl
