Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 10 May 2012 14:58:57 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Grzegorz Bernacki <gber@freebsd.org>
Cc:        svn-src-projects@freebsd.org, src-committers@freebsd.org, mjg@semihalf.com
Subject:   Re: svn commit: r233072 - projects/nand/sys/kern
Message-ID:  <20120510115857.GH2358@deviant.kiev.zoral.com.ua>
In-Reply-To: <4FABC64F.3060502@freebsd.org>
References:  <201203170318.q2H3ITdI047893@svn.freebsd.org> <20120317085116.GC1340@garage.freebsd.pl> <20120317161050.GI75778@deviant.kiev.zoral.com.ua> <4FA8FFB9.7090002@freebsd.org> <20120508095631.GV2358@deviant.kiev.zoral.com.ua> <4FA94609.3060306@freebsd.org> <20120510103105.GG2358@deviant.kiev.zoral.com.ua> <4FABC64F.3060502@freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help

--yn4/E51H+KfvW8qm
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, May 10, 2012 at 03:44:47PM +0200, Grzegorz Bernacki wrote:
> On 05/10/12 12:31, Konstantin Belousov wrote:
> >On Tue, May 08, 2012 at 06:12:57PM +0200, Grzegorz Bernacki wrote:
> >>On 05/08/12 11:56, Konstantin Belousov wrote:
> >>>On Tue, May 08, 2012 at 01:12:57PM +0200, Grzegorz Bernacki wrote:
> >>>>On 03/17/12 17:10, Konstantin Belousov wrote:
> >>>>>On Sat, Mar 17, 2012 at 09:51:16AM +0100, Pawel Jakub Dawidek wrote:
> >>>>>>On Sat, Mar 17, 2012 at 03:18:29AM +0000, Grzegorz Bernacki wrote:
> >>>>>>>Author: gber
> >>>>>>>Date: Sat Mar 17 03:18:28 2012
> >>>>>>>New Revision: 233072
> >>>>>>>URL: http://svn.freebsd.org/changeset/base/233072
> >>>>>>>
> >>>>>>>Log:
> >>>>>>>   Add VFS changes necessary for NANDFS to work.
> >>>>>>>
> >>>>>>>   Ignore B_MANAGED buffer by syncer and ignore signal when msleep=
 as
> >>>>>>>   it
> >>>>>>>   can cause file system inconsistency.
> >>>>>>
> >>>>>>I'd suggest running these changes through kib@. Especially
> >>>>>>vn_start_write()
> >>>>>>change below looks ugly, but maybe it is only temporary?
> >>>>>It is not only ugly (and  object against it).
> >>>>>
> >>>>>If the change makes any difference for the filesystem, then I just=
=20
> >>>>>argue
> >>>>>that the filesystem is broken. The vn_start_write() is done on the
> >>>>>VFS entry peripheral, long before filesystem code is hit.
> >>>>>
> >>>>>I did not looked at the managed changes, you would need to describe
> >>>>>what is wrong with current code and what is the purpose of the chang=
es.
> >>>>>B_MANAGED came from xfs, it seems, or at least xfs is the only curre=
nt
> >>>>>consumer of B_MANAGED buffers.
> >>>>
> >>>>Hi Kostik,
> >>>>
> >>>>Without our change in getblk() whenewer we allocate new block we get
> >>>>panic:
> >>>>
> >>>>panic: bremfree: buffer 0xffffff807bf86080 not on a queue.
> >>>>
> >>>>It is because blocks with B_MANAGED flag are not queued on any queue =
in
> >>>>brelse() function. Could you look at it and give us approval to merge
> >>>>this change into HEAD?
> >>>
> >>>Right, but this is in fact the only function of the B_MANAGED flag.
> >>>So the question is, what are you trying to accomplish.
> >>
> >>Hi Kostik,
> >>
> >>There are two separate issues.
> >>First is that if we have B_MANAGED flag it should not cause the panic
> >>when used, so in my opinion fix should go into HEAD regardless of NANDF=
S.
> >>Second thing is the reason of having B_MANAGED flag. We use it because
> >>we want to decide when and where to save buffers. Our FS is log
> >>filesystem and we write buffers every given amount of time or if number
> >>of dirty buffer exceed given threshold. We write buffers in large groups
> >>along with metadata related to this group, so we cannot afford writing
> >>single buffer.  As a result we cannot allow buf deamon to write buffers
> >>in an ad hoc manner.
> >>I hope I answered your question. Please let me know if you have more
> >>concerns. If you have some ideas how we can avoid using B_MANAGED flags
> >>please let us know.
> >
> >I looked at the whole B_MANAGED business over the two days. Finally I
> >think that I agree to some extent with the idea of the patch, but I
> >still do not like details. I put below the change that I think we are
> >discussing.
> >
> >Index: vfs_bio.c
> >=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> >--- vfs_bio.c	(.../head/sys/kern/vfs_bio.c)	(revision 235215)
> >+++ vfs_bio.c	(.../projects/nand/sys/kern/vfs_bio.c)	(revision 235215)
> >@@ -2672,7 +2672,8 @@ loop:
> >  		else if ((bp->b_flags&  (B_VMIO | B_INVAL)) =3D=3D 0)
> >  			bp->b_flags |=3D B_CACHE;
> >  		BO_LOCK(bo);
> >-		bremfree(bp);
> >+		if (!(bp->b_flags&  B_MANAGED))
> >+			bremfree(bp);
> >  		BO_UNLOCK(bo);
> >
> >  		/*
> >
> >First note, you should not take BO_LOCK if you are not going to call
> >bremfree().Second, I think you need to assert that b_qindex is already
> >QUEUE_NONE if B_MANAGED is set.
> >
> >Note the comment right after the if (bp !=3D NULL) line in getblk(). I t=
hink
> >it needs to be adjusted to say 'if not managed' or like.
>=20
> Yes, you are right. How about this patch:
>=20
> http://people.freebsd.org/~gber/patches/vfs_bio.diff
Please revert the test condition and exchange the then/else blocks.
It is easier to read this way.

Still the question how xfs will react to the change stands. Xfs is the only
current in-tree consumer of B_MANAGED, might be it is too rotten.

>=20
> ?
> >
> >As a general note, explaining my limited disagreement with the whole idea
> >of using managed buffers, is the question how do you react to low free
> >buffer condition at all ? Do you get the notification on the condition at
> >all ?
>=20
> I agree that B_MANAGED should be used carefully, but this feature might=
=20
> be very useful and we should not get rid of it.
>=20
> Indeed we do not get notifications from bufdaemon, however we have our=20
> own syncer thread which start writes when number of dirty buffer exceeds=
=20
> given threshold. I believe that our threshold of dirty buffers which=20
> triggers writes is much lower than the one used by buf deamon. I think=20
> that this will prevent the system from having too many dirty buffers.=20
> But see below.
>=20
> >
> >Current scheme involves activation of the bufdaemon when getnewbuf() can=
not
> >find a suitable buffer or KVA. With a non-trivial count of buffers not
> >presented on any queue, system has high risk of deadlocking.
>=20
> As noted earlier I don't think that this will be a problem with nandfs.=
=20
> However in case such notifications are really needed, I think we can add=
=20
> an EVENTHANDLER that would fire when buf daemon decides to flush=20
> buffers. Does this sound reasonable?

Note that bufdaemon cannot flush some buffers on its own, mostly the
buffers belonging to the locked vnode. In this case, getnewbuf() caller
threads are used as bufdaemon helpers.

Might be eventhandler calls from bufdaemon are not bad, but I want to see
a code before making the judgement.

--yn4/E51H+KfvW8qm
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (FreeBSD)

iEYEARECAAYFAk+rrYEACgkQC3+MBN1Mb4h3UgCgxkBXUef5OHssNlvaJTKVxG3V
vwEAn2lIUjeEHqgGYTUK37J/KI3oicAx
=6lww
-----END PGP SIGNATURE-----

--yn4/E51H+KfvW8qm--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120510115857.GH2358>