Date: Mon, 8 Jul 2013 08:43:01 +0300 From: Konstantin Belousov <kostikbel@gmail.com> To: Andreas Longwitz <longwitz@incore.de> Cc: freebsd-stable@freebsd.org Subject: Re: Shutdown hangs on unmount of a gjournaled file system in 8-Stable Message-ID: <20130708054301.GI91021@kib.kiev.ua> In-Reply-To: <51D9EB23.4070505@incore.de> References: <51D9EB23.4070505@incore.de>
next in thread | previous in thread | raw e-mail | index | archive | help
--HmXnia/qGvWh3jry Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Jul 08, 2013 at 12:26:43AM +0200, Andreas Longwitz wrote: > The deadlock can be explained now: pid 1 (init) sleeps on "mount drain" > because mp->mnt_lockref was 1. This setting was done by pid 18 (gjournal > switcher) by calling vfs_busy(). pid 18 now sleeps on "suspwt" because > mp->mnt_writeopcount was 1. This setting was done by pid 1 before going > to sleep by calling vn_start_write() in dounmount(). >=20 > I think the reason for this deadlock is the commit r249055 which seems > not to be compatible with gjournal. Thank you for the analysis. I think 'not compatible' is some understatement. The situation clearly causes a deadlock, you are right. The vfs_busy(); vfs_write_suspend(); call sequence is somewhat dubious, in fact, exactly because unmount could start in between. I think that vfs_write_suspend() must avoid setting MNT_SUSPEND if unmount was started. Patch below, for HEAD, should fix the problem, by marking the callers of vfs_write_suspend(), which are not protected by the covered vnode lock, with the VS_SKIP_UNMOUNT flag. I believe that the conflicts on stable/8 should be trivial, if any. diff --git a/sys/geom/journal/g_journal.c b/sys/geom/journal/g_journal.c index a3c996c..3ce2785 100644 --- a/sys/geom/journal/g_journal.c +++ b/sys/geom/journal/g_journal.c @@ -2960,7 +2960,7 @@ g_journal_do_switch(struct g_class *classp) GJ_TIMER_STOP(1, &bt, "BIO_FLUSH time of %s", sc->sc_name); =20 GJ_TIMER_START(1, &bt); - error =3D vfs_write_suspend(mp); + error =3D vfs_write_suspend(mp, VS_SKIP_UNMOUNT); GJ_TIMER_STOP(1, &bt, "Suspend time of %s", mountpoint); if (error !=3D 0) { GJ_DEBUG(0, "Cannot suspend file system %s (error=3D%d).", diff --git a/sys/kern/vfs_vnops.c b/sys/kern/vfs_vnops.c index 7eac0ef..06e59f9 100644 --- a/sys/kern/vfs_vnops.c +++ b/sys/kern/vfs_vnops.c @@ -1668,8 +1668,7 @@ vn_finished_secondary_write(mp) * Request a filesystem to suspend write operations. */ int -vfs_write_suspend(mp) - struct mount *mp; +vfs_write_suspend(struct mount *mp, int flags) { int error; =20 @@ -1680,6 +1679,21 @@ vfs_write_suspend(mp) } while (mp->mnt_kern_flag & MNTK_SUSPEND) msleep(&mp->mnt_flag, MNT_MTX(mp), PUSER - 1, "wsuspfs", 0); + + /* + * Unmount holds a write reference on the mount point. If we + * own busy reference and drain for writers, we deadlock with + * the reference draining in the unmount path. Callers of + * vfs_write_suspend() must specify VS_SKIP_UNMOUNT if + * vfs_busy() reference is owned and caller is not in the + * unmount context. + */ + if ((flags & VS_SKIP_UNMOUNT) !=3D 0 && + (mp->mnt_kern_flag & MNTK_UNMOUNT) !=3D 0) { + MNT_IUNLOCK(mp); + return (EBUSY); + } + mp->mnt_kern_flag |=3D MNTK_SUSPEND; mp->mnt_susp_owner =3D curthread; if (mp->mnt_writeopcount > 0) diff --git a/sys/sys/vnode.h b/sys/sys/vnode.h index 42bfb65..b0cbcc0 100644 --- a/sys/sys/vnode.h +++ b/sys/sys/vnode.h @@ -398,6 +398,9 @@ extern int vttoif_tab[]; #define VR_START_WRITE 0x0001 /* vfs_write_resume: start write atomically = */ #define VR_NO_SUSPCLR 0x0002 /* vfs_write_resume: do not clear suspension = */ =20 +#define VS_SKIP_UNMOUNT 0x0001 /* vfs_write_suspend: fail if the + filesystem is being unmounted */ + #define VREF(vp) vref(vp) =20 #ifdef DIAGNOSTIC @@ -711,7 +714,7 @@ int vn_io_fault_pgmove(vm_page_t ma[], vm_offset_t offs= et, int xfersize, int vfs_cache_lookup(struct vop_lookup_args *ap); void vfs_timestamp(struct timespec *); void vfs_write_resume(struct mount *mp, int flags); -int vfs_write_suspend(struct mount *mp); +int vfs_write_suspend(struct mount *mp, int flags); int vop_stdbmap(struct vop_bmap_args *); int vop_stdfsync(struct vop_fsync_args *); int vop_stdgetwritemount(struct vop_getwritemount_args *); diff --git a/sys/ufs/ffs/ffs_snapshot.c b/sys/ufs/ffs/ffs_snapshot.c index 9a9c88a..ad157aa 100644 --- a/sys/ufs/ffs/ffs_snapshot.c +++ b/sys/ufs/ffs/ffs_snapshot.c @@ -423,7 +423,7 @@ restart: */ for (;;) { vn_finished_write(wrtmp); - if ((error =3D vfs_write_suspend(vp->v_mount)) !=3D 0) { + if ((error =3D vfs_write_suspend(vp->v_mount, 0)) !=3D 0) { vn_start_write(NULL, &wrtmp, V_WAIT); vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); goto out; diff --git a/sys/ufs/ffs/ffs_suspend.c b/sys/ufs/ffs/ffs_suspend.c index 3198c1a..a8c4578 100644 --- a/sys/ufs/ffs/ffs_suspend.c +++ b/sys/ufs/ffs/ffs_suspend.c @@ -206,7 +206,7 @@ ffs_susp_suspend(struct mount *mp) return (EPERM); #endif =20 - if ((error =3D vfs_write_suspend(mp)) !=3D 0) + if ((error =3D vfs_write_suspend(mp, VS_SKIP_UNMOUNT)) !=3D 0) return (error); =20 ump->um_writesuspended =3D 1; diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c index 57f092c..a87fdfa 100644 --- a/sys/ufs/ffs/ffs_vfsops.c +++ b/sys/ufs/ffs/ffs_vfsops.c @@ -257,7 +257,7 @@ ffs_mount(struct mount *mp) return (error); for (;;) { vn_finished_write(mp); - if ((error =3D vfs_write_suspend(mp)) !=3D 0) + if ((error =3D vfs_write_suspend(mp, 0)) !=3D 0) return (error); MNT_ILOCK(mp); if (mp->mnt_kern_flag & MNTK_SUSPENDED) { @@ -1255,7 +1255,7 @@ ffs_unmount(mp, mntflags) */ for (;;) { vn_finished_write(mp); - if ((error =3D vfs_write_suspend(mp)) !=3D 0) + if ((error =3D vfs_write_suspend(mp, 0)) !=3D 0) return (error); MNT_ILOCK(mp); if (mp->mnt_kern_flag & MNTK_SUSPENDED) { --HmXnia/qGvWh3jry Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.20 (FreeBSD) iQIcBAEBAgAGBQJR2lFlAAoJEJDCuSvBvK1Bs/wQAIsAa78Lxaj34I1LNgp01SAS E2xWUTHvCSEVQIUdMbopqtHagjEgMaIj2mJctYkBMOF6z5QkUh8wdeqq8KeI+EJh osBQEXZ7zUiXGN2o6PoSQP7+FRXakVX3r40jCzV0H5lvcpJ74I/lYuLz818swksf ufXMUxqsd6mQ6NOaOvwgt5ze6LW6y4aQ2h9GWXyeHey8a7Tg0NKcbqHoId+tDkIu hamLTkz/72OFnbvHlER5L5L0KSglmAk5aeuzySoBUk5g/SGFSDgHvdvAzZamdwsE zfZ0Ec2D+thooInS3yyq+gI0B2e9sxpWGekXS7s+S63GRA/0IITHBo0vpU8mPqUS 3JmecCvgsNGi43o/ddHAdyBH9nSw6Ew9RdCmuNk2P85LvAgBobCFkFlYdzI5V5cj FugG4FtKgv79wkYYcWB1FfeDFVOYs0+YeQME0uZNSyCJtpZaGNX0PT+o46hszQ2b IO+Oxnko5vSrfI6IzDOLEk0g6BsZe84YnILo1auLsQQGO6G9jlzp6vhWu+IZJKib pos4quRlvfkvkEbcIx4+9oKuHIbPv6uCHG3nQwpS+mIAAaSn+82LSiO0bFJ+wEzc U+QrwOAwMWdTLcusrziCHmFvU+mHDy4yvnzlWNTAt8zE5Nyso11HOnmHhIH+9rsZ oMojlqI9V98Ew8nEp4LS =fUQf -----END PGP SIGNATURE----- --HmXnia/qGvWh3jry--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130708054301.GI91021>