Date: Tue, 30 Jun 2009 22:32:48 +0300
From: Kostik Belousov
To: Rick Macklem
Cc: Attilio Rao, freebsd-fs@freebsd.org, freebsd-current@freebsd.org
Subject: Re: umount -f implementation
Message-ID: <20090630193248.GY2884@deviant.kiev.zoral.com.ua>
References: <3bbf2fe10906290256x4bfbe263jccef017a557f9410@mail.gmail.com>

On Tue, Jun 30, 2009 at 12:01:21PM -0400, Rick Macklem wrote:
>
> On Mon, 29 Jun 2009, Attilio Rao wrote:
>
> > While that should be real in principle (immediate shutdown of the fs
> > operation and unmounting of the partition) it is totally impossible to
> > have it completely unsleeping, so it can happen that also umount -f
> > sleeps / delays for some times (example: vflush).
> > Currently, umount -f is one of the most complicated thing to handle in
> > our VFS because it puts as requirement that vnodes can be reclaimed in
> > any moment, adding complexity and possibility for races.
> >
> > What's the fix for your problem?
> >
> From other responses, it does look like pursuing this is appropriate
> and that current behaviour is considered a bug.
>
> I should have noted in the previous email that I suspected that my simple
> patch didn't handle all cases, which I have just determined via testing.
>
> Unfortunately, the thread doing "umount" can also get stuck in an msleep()
> while waiting for the mnt_lockref to go to 0, which happens before the
> VFS_UNMOUNT() call. (mnt_lockref gets incremented by various system
> calls that call vfs_busy().)
>
> I think I can fix this in the experimental nfsv4 client, since it has
> a kernel thread that can check for MNTK_UNMOUNTF being set and then
> kill off the RPCs in progress, but that won't help the regular client.

This solution sounds good, but see below.

>
> It's starting to look like too much work for FreeBSD8, but sounds like
> it is worth pursuing. (Appologies to anyone that thought I would have it
> all fixed in a day or two.)

Some people, me included, would argue that umount -f shall not override
any ownership of kernel resources. In particular, you must not ignore
the lockref. Instead, the threads that own filesystem resources, such as
the mount reference counter or locked vnodes, shall be weeded out of the
syscalls. E.g., finishing stalled RPCs with an error code that is
propagated to the return value of the VOPs is a good solution.

Quite similar problems happen with SIGSTOP and intr NFS mounts. You saw
the proposed solution, which is quite similar: it forces the threads
owning the resources to progress to the syscall boundary.

Another problem with forced unmounts is that VFS does not block new
threads from entering VOPs. When finishing the in-flight RPCs, you may
either leave some new RPCs behind or loop forever chasing RPCs that
arrive while you are finishing the old ones. A half-measure is
filesystem suspension, which keeps operations that modify the filesystem
from entering VOPs. UFS uses suspension for unmounts and rw->ro remounts.

umount -f is needed in two different situations. The first is a normally
working filesystem that shall be unmounted by administrative request,
detaching any resources opened by applications. The second is a
last-resort action when the backing storage (the server for NFS, the
disk for UFS) is misbehaving. I think we must not break the first case
for the sake of the second.
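
To make the ordering argument concrete, here is a minimal userspace
sketch of "close the entry first, then let the owners drain", written
with POSIX threads. It is only a model, not FreeBSD kernel code: struct
op_gate and the gate_*() functions are hypothetical names, inflight
merely stands in for a counter such as mnt_lockref, and ENXIO is an
arbitrary choice of error code. A real forced unmount would also have to
wake the stalled operations (for instance by failing their RPCs) so that
they reach gate_exit(); that step is not modelled.

```c
/* Userspace model only; every name here is hypothetical. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

struct op_gate {
	pthread_mutex_t	lock;
	pthread_cond_t	drained;	/* signalled when inflight reaches 0 */
	int		inflight;	/* operations currently inside */
	bool		closing;	/* set once a forced unmount starts */
};

static void
gate_init(struct op_gate *g)
{
	pthread_mutex_init(&g->lock, NULL);
	pthread_cond_init(&g->drained, NULL);
	g->inflight = 0;
	g->closing = false;
}

/* Entry to an operation: refuse new work once the gate is closing. */
static int
gate_enter(struct op_gate *g)
{
	int error = 0;

	pthread_mutex_lock(&g->lock);
	if (g->closing)
		error = ENXIO;		/* assumed error code */
	else
		g->inflight++;
	pthread_mutex_unlock(&g->lock);
	return (error);
}

/* Exit from an operation, whether it succeeded or was aborted. */
static void
gate_exit(struct op_gate *g)
{
	pthread_mutex_lock(&g->lock);
	assert(g->inflight > 0);
	if (--g->inflight == 0)
		pthread_cond_broadcast(&g->drained);
	pthread_mutex_unlock(&g->lock);
}

/*
 * Close the gate, then wait for the current owners to leave on their
 * own. The counter is never overridden, and because no new entrant can
 * raise it after closing is set, the wait cannot be prolonged by newly
 * arriving operations.
 */
static void
gate_close_and_drain(struct op_gate *g)
{
	pthread_mutex_lock(&g->lock);
	g->closing = true;
	while (g->inflight > 0)
		pthread_cond_wait(&g->drained, &g->lock);
	pthread_mutex_unlock(&g->lock);
}
```

Setting closing under the same lock that protects inflight is what rules
out both failure modes described above: an operation either got in
before the flag was set and is waited for, or it is turned away.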