Date: Tue, 22 May 2012 20:55:47 +0100 From: Attilio Rao <attilio@freebsd.org> To: Pawel Jakub Dawidek <pjd@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, Kirk McKusick <mckusick@freebsd.org> Subject: Re: svn commit: r234482 - in head/sys: fs/msdosfs fs/nfsserver kern sys Message-ID: <CAJ-FndBFhNMboM-iOO=GN0OLXFGBTf4QU-enqcGoGobrBi-soA@mail.gmail.com> In-Reply-To: <20120422170842.GA18403@garage.freebsd.pl> References: <201204200650.q3K6oiqO026673@svn.freebsd.org> <20120422170842.GA18403@garage.freebsd.pl>
next in thread | previous in thread | raw e-mail | index | archive | help
2012/4/22 Pawel Jakub Dawidek <pjd@freebsd.org>: > On Fri, Apr 20, 2012 at 06:50:44AM +0000, Kirk McKusick wrote: >> Author: mckusick >> Date: Fri Apr 20 06:50:44 2012 >> New Revision: 234482 >> URL: http://svn.freebsd.org/changeset/base/234482 >> >> Log: >> =C2=A0 This change creates a new list of active vnodes associated with >> =C2=A0 a mount point. Active vnodes are those with a non-zero use or hol= d >> =C2=A0 count, e.g., those vnodes that are not on the free list. Note tha= t >> =C2=A0 this list is in addition to the list of all the vnodes associated >> =C2=A0 with a mount point. >> >> =C2=A0 To avoid adding another set of linkage pointers to the vnode >> =C2=A0 structure, the active list uses the existing linkage pointers >> =C2=A0 used by the free list (previously named v_freelist, now renamed >> =C2=A0 v_actfreelist). >> >> =C2=A0 This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loop= s >> =C2=A0 over just the active vnodes associated with a mount point (typica= lly >> =C2=A0 less than 1% of the vnodes associated with the mount point). > [...] >> @@ -1099,6 +1128,14 @@ insmntque1(struct vnode *vp, struct moun >> =C2=A0 =C2=A0 =C2=A0 VNASSERT(mp->mnt_nvnodelistsize >=3D 0, vp, >> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 ("neg mount point vnode= list size")); >> =C2=A0 =C2=A0 =C2=A0 mp->mnt_nvnodelistsize++; >> + =C2=A0 =C2=A0 KASSERT((vp->v_iflag & VI_ACTIVE) =3D=3D 0, >> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 ("Activating already active vnode")); >> + =C2=A0 =C2=A0 vp->v_iflag |=3D VI_ACTIVE; >> + =C2=A0 =C2=A0 mtx_lock(&vnode_free_list_mtx); >> + =C2=A0 =C2=A0 TAILQ_INSERT_HEAD(&mp->mnt_activevnodelist, vp, v_actfre= elist); >> + =C2=A0 =C2=A0 mp->mnt_activevnodelistsize++; >> + =C2=A0 =C2=A0 mtx_unlock(&vnode_free_list_mtx); >> + =C2=A0 =C2=A0 VI_UNLOCK(vp); >> =C2=A0 =C2=A0 =C2=A0 MNT_IUNLOCK(mp); >> =C2=A0 =C2=A0 =C2=A0 return (0); >> =C2=A0} > > > Now, for every vnode that is activated, it has to go through global > mutex, which seems like scalability issue to me. With ZFS it is typical > to have a lot of file systems and this global mutex was not needed > before (well, it was needed, but only to get vnode from the free list). > > If we require vnode interlock to be held during v_actfreelist > manipulation then why can't we use interlock+vnode_free_list_mtx when > operating on the free list and interlock+per-mountpoint-lock when > operating on mnt_activevnodelist? I think this is the better idea for this case and it should really be fixed= . However, note that a per-mount lock here is far from being ideal as you would contest a lot on the mountpoint if you have a lot of activated vnodes. Anyway the approach you propose doesn't introduce any scalability difference than what we already had in place for pre-234482. Also, it would be good to implement things like: mtx_assert(MNT_MTX(mp), MA_OWNED); by providing appropriate wrappers as we do in vnode interface. Thanks, Attilio --=20 Peace can only be achieved by understanding - A. Einstein
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAJ-FndBFhNMboM-iOO=GN0OLXFGBTf4QU-enqcGoGobrBi-soA>