Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 20 Nov 2012 16:11:37 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Bruce Evans <brde@optusnet.com.au>
Subject:   Re: svn commit: r243311 - in head/sys: fs/ext2fs fs/msdosfs fs/nfsclient fs/nullfs fs/unionfs gnu/fs/reiserfs nfsclient ufs/ffs
Message-ID:  <20121120141137.GX73505@kib.kiev.ua>
In-Reply-To: <CAJ-FndABr3oRv9DtPnBmbwn-UCXFrf7ZfxauWF0XGOwqiMgwxA@mail.gmail.com>
References:  <201211192243.qAJMhjFF055708@svn.freebsd.org> <20121120162708.I924@besplex.bde.org> <CAJ-FndBrxipxRDKZuPww7j1xdz_uh7Qym6GBycP2kMtBGptoFg@mail.gmail.com> <20121120230708.G6016@besplex.bde.org> <CAJ-FndDhUEHNFv5RTBUACPqOeLBAnVZ4EKZhzdjx-FK84nVLhw@mail.gmail.com> <CAJ-FndAOc4ctzSMHH7dUCX90eGwYrAZnJGXpghmv-Mm=CzkS3Q@mail.gmail.com> <20121120234822.J6178@besplex.bde.org> <CAJ-FndABr3oRv9DtPnBmbwn-UCXFrf7ZfxauWF0XGOwqiMgwxA@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help

--7GRKew+LPhG+z9nO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Nov 20, 2012 at 01:27:07PM +0000, Attilio Rao wrote:
> On 11/20/12, Bruce Evans <brde@optusnet.com.au> wrote:
> > On Tue, 20 Nov 2012, Attilio Rao wrote:
> >
> >> On 11/20/12, Attilio Rao <attilio@freebsd.org> wrote:
> >>> On 11/20/12, Bruce Evans <brde@optusnet.com.au> wrote:
> >>>> On Tue, 20 Nov 2012, Attilio Rao wrote:
> >>>>
> >>>>> On Tue, Nov 20, 2012 at 6:20 AM, Bruce Evans <brde@optusnet.com.au>
> >>>>> wrote:
> >>>>>> On Mon, 19 Nov 2012, Attilio Rao wrote:
> >>>>>>
> >>>>>>> Log:
> >>>>>>>  r16312 is not any longer real since many years (likely since when
> >>>>>>> VFS
> >>>>>>>  received granular locking) but the comment present in UFS has be=
en
> >>>>>>>  copied all over other filesystems code incorrectly for several
> >>>>>>> times.
> >>>>>>>
> >>>>>>>  Removes comments that makes no sense now.
> >>>>>>
> >>>>>>
> >>>>>> It still made sense (except for bitrot in the function name), but
> >>>>>> might
> >>>>>> not
> >>>>>> be true).  The code made sense with it.  Now the code makes no sen=
se.
> >>>>>>
> >>>>>>
> >>>>>>> Modified: head/sys/ufs/ffs/ffs_vfsops.c
> >>>>>>>
> >>>>>>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D
> >>>>>>> --- head/sys/ufs/ffs/ffs_vfsops.c       Mon Nov 19 21:58:14 2012
> >>>>>>> (r243310)
> >>>>>>> +++ head/sys/ufs/ffs/ffs_vfsops.c       Mon Nov 19 22:43:45 2012
> >>>>>>> (r243311)
> >>>>>>> @@ -1676,14 +1676,6 @@ ffs_vgetf(mp, ino, flags, vpp, ffs_flags
> >>>>>>>         ump =3D VFSTOUFS(mp);
> >>>>>>>         dev =3D ump->um_dev;
> >>>>>>>         fs =3D ump->um_fs;
> >>>>>>> -
> >>>>>>> -       /*
> >>>>>>> -        * If this malloc() is performed after the getnewvnode()
> >>>>>>
> >>>>>>
> >>>>>> This malloc() didn't match the code, which uses uma_zalloc().  Old
> >>>>>> versions used MALLOC() in both the comment and the code.  ffs's
> >>>>>> comment
> >>>>>> was updated to say malloc() when the code was changed to use
> >>>>>> malloc(),
> >>>>>> then rotted when the code was changed to use uma_zalloc().  In some
> >>>>>> other file systems, the comment still said MALLOC().
> >>>>>>
> >>>>>>
> >>>>>>> -        * it might block, leaving a vnode with a NULL v_data to =
be
> >>>>>>> -        * found by ffs_sync() if a sync happens to fire right th=
en,
> >>>>>>> -        * which will cause a panic because ffs_sync() blindly
> >>>>>>> -        * dereferences vp->v_data (as well it should).
> >>>>>>> -        */
> >>>>>>>         ip =3D uma_zalloc(uma_inode, M_WAITOK | M_ZERO);
> >>>>>>>
> >>>>>>>         /* Allocate a new vnode/inode. */
> >>>>>>>
> >>>>>>
> >>>>>> The code makes no sense now.  The comment explains why ip is
> >>>>>> allocated
> >>>>>> before vp, instead of in the natural, opposite order like it used =
to
> >>>>>> be.  Allocating things in an unnatural  order requires extra code =
to
> >>>>>> free ip when the allocation of vp fails.
> >>>>>
> >>>>> "Used to be" is very arguably. The code has been like its current f=
orm
> >>>>> many more years than the opposite (16 against 3 I think).
> >>>>> And the code makes perfectly sense if you know the history. So I do=
n't
> >>>>> agree with you.
> >>>>
> >>>> But it shouldn't be necessary to know the history of the code to
> >>>> understand it.  The code only makes sense if its comment is not
> >>>> removed,
> >>>> or if you know the history of the code so that you can restore the
> >>>> removed comment.  However, if the comment makes no sense as you clai=
m,
> >>>> then the code that it it describes makes no sense.
> >>>
> >>> The "code that makes no sense" is basically the justification to have
> >>> the allocation before the getnewvnode(). It makes no sense because the
> >>> order makes no sense (you can allocate before or after getnewvnode(),
> >>> you won't have v_data corruption as the comment claims).
> >>>
> >>> Hence the code makes no sense.
> >>
> >> Herm, s/code/comment.
> >
> > I think you are right that the comment makes no sense.  A preemptible
> > kernel may be preempted without it calling malloc() where it may block.
> > Thus ffs_fsync() and anything else that looks at the inode must be
> > doing something to avoid dereferencing v_data if the vnode is not fully
> > constructed.  This seems to be done by iterating over vnodes using
> > MNT_VNODE_FOREACH_ACTIVE*() and not making incomplete vnodes active.
> > ffs_fsync() still just blindly dereferences the inode.
> >
> > I think I am right that the code makes no sense.  It is ordered like
> > it is because placing the allocation of ip before the allocation of
> > vp used to be enough to prevent v_data being dereferenced.  This makes
> > no sense when it isn't enough.
>=20
> In the past, before VFS got locking and kernel was single-threaded,
> the comment and code arranged in this way were sensitive and
> effective.
> As now this is not true anymore, there is no strict relationship
> between the getnewvnode() and sleeping points.
> It is important to remove stale comments because they confuse people,
> the porters and as you can see the code/comment has been cut&paste
> quite a bit around.
>=20

Putting the issue of the comment making no sense at all aside, which is
true but tangential to what I want to note.

Although malloc(9) is not ordered with the vnode locks in the lock
order, it is still good practice to not perform allocations under the
vnode lock. The reason is that pagedaemon might need to clean and write
pages, in particular, belonging to the vnodes. Generally, if you own
vnode lock and do something which might kick pagedaemon, you are creating
troubles for pagedaemon, which would attempt to lock the vnode with
timeout, spending the precious time waiting for lock it cannot obtain.

This is less an issue for the new vnode instantiation locations, because
vnode cannot have resident pages yet. But it is good to follow the same
pattern anywhere.

--7GRKew+LPhG+z9nO
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBAgAGBQJQq4+YAAoJEJDCuSvBvK1BRrYP/1O385SsRvkGyQN8TfTv2y0N
BwfxUwaipWO9u9viEuDyrjPaIim3Bsh0fQdFaAQIio9C82jBxOCBIdedPoJsem5b
GFzQ+CPPoHPpFuxxjYvozdzxSYw5X06TNh51eiBQn6ZcJLzfhnGHsfnZ3jhzxO6V
CAtWXqndfP/NJOwT2chIeTo+cqNARbkCcLNvTjOEFVPYDHo0ijsh5a9KvMPxXioJ
5u14iRBZd0VzW0PriRe+HZUIKVbxAXMcKj9bn/3LlzgVtV5TFW4fkP9Y9LUVeP/J
c1NK3Hxn57+9aon4+vxRjzXAVvod8WeOCwcp+PWKDVFPemTMk9eCdeQwP1VVcYhY
HdbZvFGkjilMEGIVb4b8B+vJSAfZ1JISitNsyPOpVr6AplRKy6F7v5tXIWjGr4UH
CaTElE8ANFBqzy5n3/x6+oUaLKFywZGaAEAafZ7NSqBD9WKzTS3IBMQRFwcBNUFn
D+0PpvtlkbtlqtZvvw6NsGdM2AapBFtafLwxPULlJPlaaFYN3AbzDinFUUM23rPd
SHnI0OEGjhHXbhUK/gfT/lw2qvFKkxcQTzbZbG5yicW+9IKocuknK1hVUIqNSaJb
OyxTK3tk0HapGDMuTHiXrDW2QM5lkxuTyk0lMTA9vjOkpcCPd8wn5ggwvcg9WOau
in9ebKya+iZfA1KgkX23
=vt9T
-----END PGP SIGNATURE-----

--7GRKew+LPhG+z9nO--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20121120141137.GX73505>