Date: Tue, 18 Sep 2012 11:59:41 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: FS List <freebsd-fs@freebsd.org>
Subject: Re: testing/review of atomic export update patch
Message-ID: <20120918085941.GZ37286@deviant.kiev.zoral.com.ua>
In-Reply-To: <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>
References: <20120917122325.GR37286@deviant.kiev.zoral.com.ua> <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>

On Mon, Sep 17, 2012 at 05:32:44PM -0400, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > On Sun, Sep 16, 2012 at 05:41:25PM -0400, Rick Macklem wrote:
> > > Hi,
> > >
> > > There is a simple patch at:
> > > http://people.freebsd.org/~rmacklem/atomic-export.patch
> > > that can be applied to a kernel + mountd, so that the new
> > > nfsd can be suspended by mountd while the exports are being
> > > reloaded. It adds a new "-S" flag to mountd to enable this.
> > > (This avoids the long standing bug where clients receive ESTALE
> > > replies to RPCs while mountd is reloading exports.)
> >
> > This looks simple, but also somewhat worrisome. What would happen
> > if the mountd crashes after nfsd suspension is requested, but before
> > resume was performed ?
> >
> > Might be, mountd should check for suspended nfsd on start and
> > unsuspend it, if some flag is specified ?
> Well, I think that happens with the patch as it stands.
>
> suspend is done if the "-S" option is specified, but that is a no op
> if it is already suspended. The resume is done no matter what flags
> are provided, so mountd will always try and do a "resume".
> --> get_exportlist() is always called when mountd is started up and
>     it does the resume unconditionally when it completes.
> If mountd repeatedly crashes before completing get_exportlist()
> when it is started up, the exports will be all messed up, so
> having the nfsd threads suspended doesn't seem so bad for this
> case (which hopefully never happens;-).
>
> Both suspend and resume are just no ops for unpatched kernels.
>
> Maybe the comment in front of "resume" should explicitly explain
> this, instead of saying resume is harmless to do under all conditions?
>
> Thanks for looking at it, rick

I see.

Another note: nothing protects against suspend and resume running in
parallel. For instance, one thread could set suspend_nfsd = 1 and be
descheduled, while another thread executes the resume code sequence in
the meantime. That second thread would see suspend_nfsd != 0 while
nfsv4rootfs_lock is not held, and would try to unlock it. It seems that
nfsv4_unlock would silently exit. The suspending thread then resumes
and obtains the lock. You end up with suspend_nfsd == 0 but the lock
held.
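
The interleaving described above can be replayed deterministically in a
small userland model. This is only a sketch under stated assumptions:
the struct, the helper names, and the exact order of operations inside
resume are hypothetical and chosen for illustration; they are not taken
from the patch or from the kernel sources. Only the names suspend_nfsd
and nfsv4rootfs_lock come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model of the state discussed in this thread. */
struct nfsd_state {
	int  suspend_nfsd;	/* flag set by suspend, cleared by resume */
	bool lock_held;		/* stands in for nfsv4rootfs_lock being held */
};

/* Suspend, step 1 (assumed order): mark nfsd as suspended. */
static void
suspend_set_flag(struct nfsd_state *s)
{
	s->suspend_nfsd = 1;
}

/* Suspend, step 2 (assumed order): acquire the lock. */
static void
suspend_take_lock(struct nfsd_state *s)
{
	s->lock_held = true;
}

/*
 * Resume (assumed shape): if the flag is set, unlock and clear the flag.
 * Unlocking a lock that is not held is modelled as a silent no-op.
 */
static void
resume(struct nfsd_state *s)
{
	if (s->suspend_nfsd != 0) {
		if (s->lock_held)
			s->lock_held = false;
		s->suspend_nfsd = 0;
	}
}

/* Replay the problematic schedule step by step, without real threads. */
struct nfsd_state
replay_race(void)
{
	struct nfsd_state s = { 0, false };

	suspend_set_flag(&s);	/* thread A sets the flag, is descheduled */
	assert(s.suspend_nfsd == 1 && !s.lock_held);

	resume(&s);		/* thread B runs the whole resume sequence */
	assert(s.suspend_nfsd == 0 && !s.lock_held);

	suspend_take_lock(&s);	/* thread A continues and takes the lock */
	return (s);
}
```

Calling replay_race() from a test program returns a state with
suspend_nfsd == 0 and lock_held == true, which is the inconsistent
outcome described above. In this model, making the two suspend steps
and resume mutually exclusive would rule out that schedule; whether
that is the right fix for the real code is left open here.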