Date: Tue, 18 Sep 2012 11:59:41 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: FS List <freebsd-fs@freebsd.org>
Subject: Re: testing/review of atomic export update patch
Message-ID: <20120918085941.GZ37286@deviant.kiev.zoral.com.ua>
In-Reply-To: <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>
References: <20120917122325.GR37286@deviant.kiev.zoral.com.ua> <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>

On Mon, Sep 17, 2012 at 05:32:44PM -0400, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > On Sun, Sep 16, 2012 at 05:41:25PM -0400, Rick Macklem wrote:
> > > Hi,
> > >
> > > There is a simple patch at:
> > > http://people.freebsd.org/~rmacklem/atomic-export.patch
> > > that can be applied to a kernel + mountd, so that the new
> > > nfsd can be suspended by mountd while the exports are being
> > > reloaded. It adds a new "-S" flag to mountd to enable this.
> > > (This avoids the long standing bug where clients receive ESTALE
> > > replies to RPCs while mountd is reloading exports.)
> >
> > This looks simple, but also somewhat worrisome. What would happen
> > if the mountd crashes after nfsd suspension is requested, but before
> > resume was performed ?
> >
> > Might be, mountd should check for suspended nfsd on start and
> > unsuspend it, if some flag is specified ?
> Well, I think that happens with the patch as it stands.
>
> suspend is done if the "-S" option is specified, but that is a no op
> if it is already suspended. The resume is done no matter what flags
> are provided, so mountd will always try and do a "resume".
> --> get_exportlist() is always called when mountd is started up and
>     it does the resume unconditionally when it completes.
>     If mountd repeatedly crashes before completing get_exportlist()
>     when it is started up, the exports will be all messed up, so
>     having the nfsd threads suspended doesn't seem so bad for this
>     case (which hopefully never happens;-).
>
> Both suspend and resume are just no ops for unpatched kernels.
>
> Maybe the comment in front of "resume" should explicitly explain
> this, instead of saying resume is harmless to do under all conditions?
>
> Thanks for looking at it, rick

I see. My other note is that there is no protection against parallel
instances of suspend/resume running at the same time.
For instance, one thread could set suspend_nfsd = 1 and be descheduled,
while another thread executes the resume code sequence in the meantime.
The resuming thread would see suspend_nfsd != 0 while nfsv4rootfs_lock
is not held, and would try to unlock it. It seems that nfsv4_unlock would
silently return. The suspending thread is then rescheduled and obtains
the lock. You end up with suspend_nfsd == 0 but the lock held.
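
The interleaving described above can be replayed deterministically in a small user-space model. This is only a sketch under stated assumptions, not the code from the patch: struct nfsd_model, the rootfs_lock_held flag (standing in for nfsv4rootfs_lock), and every function name below are hypothetical, and the mutex-based variant is just one possible way to serialize the two operations.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space model; not the kernel data structures. */
struct nfsd_model {
	pthread_mutex_t	mtx;		/* assumed serializing mutex */
	int		suspend_nfsd;	/* flag named in the message */
	bool		rootfs_lock_held; /* stands in for nfsv4rootfs_lock */
};

void
model_init(struct nfsd_model *m)
{
	pthread_mutex_init(&m->mtx, NULL);
	m->suspend_nfsd = 0;
	m->rootfs_lock_held = false;
}

/* Unprotected suspend, split at the point where descheduling can occur. */
void
suspend_set_flag(struct nfsd_model *m)
{
	if (m->suspend_nfsd == 0)
		m->suspend_nfsd = 1;
}

void
suspend_take_lock(struct nfsd_model *m)
{
	m->rootfs_lock_held = true;
}

/* Unprotected resume: unlocking a lock that is not held does nothing. */
void
resume_unprotected(struct nfsd_model *m)
{
	if (m->suspend_nfsd != 0) {
		m->rootfs_lock_held = false;
		m->suspend_nfsd = 0;
	}
}

/* Replays the problematic interleaving in a fixed order. */
void
replay_race(struct nfsd_model *m)
{
	suspend_set_flag(m);	/* thread A sets the flag, is descheduled */
	resume_unprotected(m);	/* thread B runs the whole resume path */
	suspend_take_lock(m);	/* thread A continues and takes the lock */
}

/* Serialized variants: flag and lock change inside one critical section. */
void
suspend_serialized(struct nfsd_model *m)
{
	pthread_mutex_lock(&m->mtx);
	if (m->suspend_nfsd == 0) {
		m->suspend_nfsd = 1;
		m->rootfs_lock_held = true;
	}
	pthread_mutex_unlock(&m->mtx);
}

void
resume_serialized(struct nfsd_model *m)
{
	pthread_mutex_lock(&m->mtx);
	if (m->suspend_nfsd != 0) {
		m->rootfs_lock_held = false;
		m->suspend_nfsd = 0;
	}
	pthread_mutex_unlock(&m->mtx);
}
```

After replay_race() the model is left with suspend_nfsd == 0 while rootfs_lock_held is true, which is the inconsistent state in question. With the serialized variants, a resume can only observe the state before or after a complete suspend, so the flag and the lock stay in step.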
