From: Konstantin Belousov
To: Rick Macklem
Cc: FS List
Date: Tue, 18 Sep 2012 11:59:41 +0300
Subject: Re: testing/review of atomic export update patch
Message-ID: <20120918085941.GZ37286@deviant.kiev.zoral.com.ua>
In-Reply-To: <1777840817.743780.1347917564789.JavaMail.root@erie.cs.uoguelph.ca>

On Mon, Sep 17, 2012 at 05:32:44PM -0400, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > On Sun, Sep 16, 2012 at 05:41:25PM -0400, Rick Macklem wrote:
> > > Hi,
> > >
> > > There is a simple patch at:
> > > http://people.freebsd.org/~rmacklem/atomic-export.patch
> > > that can be applied to a kernel + mountd, so that the new
> > > nfsd can be suspended by mountd while the exports are being
> > > reloaded. It adds a new "-S" flag to mountd to enable this.
> > > (This avoids the long standing bug where clients receive ESTALE
> > > replies to RPCs while mountd is reloading exports.)
> >
> > This looks simple, but also somewhat worrisome. What would happen
> > if the mountd crashes after nfsd suspension is requested, but before
> > resume was performed ?
> >
> > Might be, mountd should check for suspended nfsd on start and
> > unsuspend it, if some flag is specified ?
> Well, I think that happens with the patch as it stands.
>
> suspend is done if the "-S" option is specified, but that is a no op
> if it is already suspended. The resume is done no matter what flags
> are provided, so mountd will always try and do a "resume".
> --> get_exportlist() is always called when mountd is started up and
>     it does the resume unconditionally when it completes.
>     If mountd repeatedly crashes before completing get_exportlist()
>     when it is started up, the exports will be all messed up, so
>     having the nfsd threads suspended doesn't seem so bad for this
>     case (which hopefully never happens;-).
>
> Both suspend and resume are just no ops for unpatched kernels.
>
> Maybe the comment in front of "resume" should explicitly explain
> this, instead of saying resume is harmless to do under all conditions?
>
> Thanks for looking at it, rick

I see. My other note is that nothing protects against parallel
instances of suspend/resume running at the same time. For instance, one
thread could set suspend_nfsd = 1 and be descheduled, while another
thread executes the resume code sequence in the meantime. The second
thread would see suspend_nfsd != 0 while nfsv4rootfs_lock is not held,
and would try to unlock it. It seems that nfsv4_unlock would silently
exit. The suspending thread then runs again and obtains the lock. You
end up with suspend_nfsd == 0 but the lock held.
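
The interleaving described above can be replayed step by step. What follows is only a sketch under stated assumptions, not the code from the patch: struct nfsd_state and every function name in it are hypothetical, lock_held merely stands in for nfsv4rootfs_lock being held, and resume() assumes (as the message guesses of nfsv4_unlock) that unlocking a lock which is not held silently does nothing. The two threads are modelled as ordinary function calls made in the problematic order, so the result is deterministic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state touched by suspend/resume. */
struct nfsd_state {
	int  suspend_nfsd;	/* flag named in the message */
	bool lock_held;		/* stands in for nfsv4rootfs_lock being held */
};

/* Suspend, step 1: set the flag. The thread may be descheduled here. */
static void
suspend_set_flag(struct nfsd_state *s)
{
	s->suspend_nfsd = 1;
}

/* Suspend, step 2: obtain the lock. */
static void
suspend_take_lock(struct nfsd_state *s)
{
	s->lock_held = true;
}

/*
 * Resume: if the flag is set, unlock and clear the flag.
 * Assumption: unlocking a lock that is not held is a silent no-op.
 */
static void
resume(struct nfsd_state *s)
{
	if (s->suspend_nfsd != 0) {
		s->lock_held = false;	/* no effect when not held */
		s->suspend_nfsd = 0;
	}
}

/* True when the flag and the lock agree with each other. */
bool
state_consistent(const struct nfsd_state *s)
{
	return ((s->suspend_nfsd != 0) == s->lock_held);
}

/* Replay the interleaving: A sets the flag, B resumes, A takes the lock. */
struct nfsd_state
simulate_race(void)
{
	struct nfsd_state s = { 0, false };

	assert(state_consistent(&s));	/* starts out not suspended */
	suspend_set_flag(&s);		/* thread A, then descheduled */
	resume(&s);			/* thread B runs the whole resume */
	suspend_take_lock(&s);		/* thread A runs again */
	return (s);
}
```

Calling simulate_race() returns a state with suspend_nfsd == 0 and lock_held == true, so state_consistent() reports false. In this model, running the whole of suspend and the whole of resume under one common lock would make step 1 and step 2 of suspend indivisible with respect to resume, which closes the window.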