Date:      Tue, 18 Sep 2012 11:59:41 +0300
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        FS List <freebsd-fs@freebsd.org>
Subject:   Re: testing/review of atomic export update patch
Message-ID:  <20120918085941.GZ37286@deviant.kiev.zoral.com.ua>
In-Reply-To: <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>
References:  <20120917122325.GR37286@deviant.kiev.zoral.com.ua> <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>



On Mon, Sep 17, 2012 at 05:32:44PM -0400, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > On Sun, Sep 16, 2012 at 05:41:25PM -0400, Rick Macklem wrote:
> > > Hi,
> > >
> > > There is a simple patch at:
> > >   http://people.freebsd.org/~rmacklem/atomic-export.patch
> > > that can be applied to a kernel + mountd, so that the new
> > > nfsd can be suspended by mountd while the exports are being
> > > reloaded. It adds a new "-S" flag to mountd to enable this.
> > > (This avoids the long standing bug where clients receive ESTALE
> > >  replies to RPCs while mountd is reloading exports.)
> >=20
> > This looks simple, but also somewhat worrisome. What would happen
> > if the mountd crashes after nfsd suspension is requested, but before
> > resume was performed ?
> >=20
> > Might be, mountd should check for suspended nfsd on start and
> > unsuspend
> > it, if some flag is specified ?
> Well, I think that happens with the patch as it stands.
>
> suspend is done if the "-S" option is specified, but that is a no op
> if it is already suspended. The resume is done no matter what flags
> are provided, so mountd will always try and do a "resume".
> --> get_exportlist() is always called when mountd is started up and
>     it does the resume unconditionally when it completes.
>     If mountd repeatedly crashes before completing get_exportlist()
>     when it is started up, the exports will be all messed up, so
>     having the nfsd threads suspended doesn't seem so bad for this
>     case (which hopefully never happens;-).
>
> Both suspend and resume are just no ops for unpatched kernels.
>
> Maybe the comment in front of "resume" should explicitly explain
> this, instead of saying resume is harmless to do under all conditions?
>
> Thanks for looking at it, rick
I see.

Another note: nothing protects against parallel instances of
suspend/resume. For instance, one thread could set suspend_nfsd = 1
and be descheduled, while another thread executes the resume code
sequence in the meantime. The second thread would see
suspend_nfsd != 0 while nfsv4rootfs_lock is not held, and would try to
unlock it. It seems that nfsv4_unlock would silently exit. The
suspending thread then resumes and obtains the lock. You end up with
suspend_nfsd == 0 but the lock held.
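
To make the interleaving concrete, here is a minimal single-threaded
replay of it in C. This is only a toy model, not the patch's code: the
struct and the toy_* functions are hypothetical stand-ins, the suspend
path is assumed to set the flag before taking the lock, and the unlock
is assumed to be a silent no-op when the lock is not held.

```c
#include <stdbool.h>

/* Toy stand-in for the state named in the message; not the kernel code. */
struct toy_state {
    int  suspend_nfsd;  /* nonzero means "nfsd is marked suspended" */
    bool lock_held;     /* models nfsv4rootfs_lock being held */
};

/* Suspend is split in two so the caller can "deschedule" in between. */
void toy_suspend_set_flag(struct toy_state *s)
{
    s->suspend_nfsd = 1;
}

void toy_suspend_take_lock(struct toy_state *s)
{
    s->lock_held = true;
}

/*
 * Resume: when the flag is set, drop the lock and clear the flag.
 * Assumption: unlocking a lock that is not held silently does nothing.
 */
void toy_resume(struct toy_state *s)
{
    if (s->suspend_nfsd != 0) {
        if (s->lock_held)
            s->lock_held = false;
        s->suspend_nfsd = 0;
    }
}

/*
 * Replay the interleaving from the message on a fresh state.
 * Returns true when the final state is inconsistent:
 * flag cleared, yet the lock is held.
 */
bool toy_replay_race(struct toy_state *s)
{
    s->suspend_nfsd = 0;
    s->lock_held = false;

    toy_suspend_set_flag(s);   /* thread A sets the flag, is descheduled */
    toy_resume(s);             /* thread B runs the whole resume path */
    toy_suspend_take_lock(s);  /* thread A is rescheduled, takes the lock */

    return (s->suspend_nfsd == 0 && s->lock_held);
}
```

Calling toy_replay_race() on any struct toy_state returns true in this
model, while running the two suspend steps back to back and then
toy_resume() leaves both the flag and the lock clear.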



