From owner-freebsd-fs@FreeBSD.ORG Wed Aug 23 11:08:17 2006 Return-Path: X-Original-To: freebsd-fs@freebsd.org Delivered-To: freebsd-fs@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 74EA316A4DA for ; Wed, 23 Aug 2006 11:08:17 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from fw.zoral.com.ua (fw.zoral.com.ua [213.186.206.134]) by mx1.FreeBSD.org (Postfix) with ESMTP id A6E5F43D49 for ; Wed, 23 Aug 2006 11:08:16 +0000 (GMT) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by fw.zoral.com.ua (8.13.4/8.13.4) with ESMTP id k7NB88Wh028309 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 23 Aug 2006 14:08:08 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.13.8/8.13.8) with ESMTP id k7NB8AtC082815; Wed, 23 Aug 2006 14:08:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.13.8/8.13.8/Submit) id k7NB88sh082814; Wed, 23 Aug 2006 14:08:08 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 23 Aug 2006 14:08:08 +0300 From: Kostik Belousov To: Tor Egge Message-ID: <20060823110808.GD64800@deviant.kiev.zoral.com.ua> References: <20060821.132151.41668008.Tor.Egge@cvsup.no.freebsd.org> <20060822175540.V58720@delplex.bde.org> <20060822130743.GL56637@deviant.kiev.zoral.com.ua> <20060822.214638.74697110.Tor.Egge@cvsup.no.freebsd.org> <20060823044043.GA64800@deviant.kiev.zoral.com.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="at6+YcpfzWZg/htY" Content-Disposition: inline In-Reply-To: <20060823044043.GA64800@deviant.kiev.zoral.com.ua> User-Agent: Mutt/1.4.2.2i X-Virus-Scanned: ClamAV version 0.88.4, clamav-milter version 0.88.4 on fw.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=1.9 required=5.0 tests=DNS_FROM_RFC_ABUSE, SPF_NEUTRAL,UNPARSEABLE_RELAY autolearn=no version=3.1.4 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.1.4 (2006-07-25) on fw.zoral.com.ua Cc: freebsd-fs@freebsd.org Subject: Re: Deadlock between nfsd and snapshots. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Aug 2006 11:08:17 -0000 --at6+YcpfzWZg/htY Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Attached is the prototype change. System booted from the modified kernel survived cyclic snapshotting of the exported partition. The partition was also mounted by loopback nfs, and the loop of extracting perl-5.8.8 from archive, grepping tree for foobar and removing it run over nfs. I have at least one questions: > > Protecting the existing i_flag and the timestamps with the vnode > > interlock when the current thread only has a shared vnode lock should > > be sufficient to protect against the races, removing the need for #3, > > #4 and #4 below. Could you, please, explain this point ? I did not wrap all accesses to i_flag and timestamps with vnode interlock, only ufs_itimes, ufs_lazyaccess and ufs_getattr for now. Index: ffs/ffs_snapshot.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /usr/local/arch/ncvs/src/sys/ufs/ffs/ffs_snapshot.c,v retrieving revision 1.128 diff -u -r1.128 ffs_snapshot.c --- ffs/ffs_snapshot.c 21 Aug 2006 17:20:19 -0000 1.128 +++ ffs/ffs_snapshot.c 23 Aug 2006 10:59:24 -0000 @@ -2322,6 +2322,8 @@ loop: MNT_VNODE_FOREACH(vp, mp, mvp) { VI_LOCK(vp); + if (vp->v_type =3D=3D VREG) + ufs_lazyaccess(vp); if ((vp->v_iflag & (VI_DOOMED | VI_OWEINACT)) !=3D VI_OWEINACT || vp->v_usecount > 0 || vp->v_type =3D=3D VNON) { Index: ufs/inode.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/inode.h,v retrieving revision 1.49 diff -u -r1.49 inode.h --- ufs/inode.h 14 Mar 2005 10:21:16 -0000 1.49 +++ ufs/inode.h 23 Aug 2006 10:59:24 -0000 @@ -119,6 +119,7 @@ #define IN_RENAME 0x0010 /* Inode is being renamed. */ #define IN_LAZYMOD 0x0040 /* Modified, but don't write yet. */ #define IN_SPACECOUNTED 0x0080 /* Blocks to be freed in free count. */ +#define IN_LAZYACCESS 0x0100 /* Process IN_ACCESS after the suspension fi= nished */ =20 #define i_devvp i_ump->um_devvp #define i_umbufobj i_ump->um_bo Index: ufs/ufs_extern.h =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_extern.h,v retrieving revision 1.55 diff -u -r1.55 ufs_extern.h --- ufs/ufs_extern.h 14 Mar 2005 10:21:16 -0000 1.55 +++ ufs/ufs_extern.h 23 Aug 2006 10:59:24 -0000 @@ -74,6 +74,7 @@ int ufs_inactive(struct vop_inactive_args *); int ufs_init(struct vfsconf *); void ufs_itimes(struct vnode *vp); +void ufs_lazyaccess(struct vnode *vp); int ufs_lookup(struct vop_cachedlookup_args *); int ufs_readdir(struct vop_readdir_args *); int ufs_reclaim(struct vop_reclaim_args *); Index: ufs/ufs_vnops.c =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v retrieving revision 1.277 diff -u -r1.277 ufs_vnops.c --- ufs/ufs_vnops.c 31 May 2006 15:55:52 -0000 1.277 +++ ufs/ufs_vnops.c 23 Aug 2006 10:59:24 -0000 @@ -128,31 +128,70 @@ { struct inode *ip; struct timespec ts; + int mnt_locked; =20 ip =3D VTOI(vp); + mnt_locked =3D 0; + + if ((vp->v_mount->mnt_flag & MNT_RDONLY) !=3D 0) { + VI_LOCK(vp); + goto out; + } + + vfs_timestamp(&ts); + + MNT_ILOCK(vp->v_mount); /* For reading of mnt_kern_flags */ + mnt_locked =3D 1; + + VI_LOCK(vp); if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) =3D=3D 0) - return; + goto out_unl; + if ((vp->v_type =3D=3D VBLK || vp->v_type =3D=3D VCHR) && !DOINGSOFTDEP(v= p)) ip->i_flag |=3D IN_LAZYMOD; - else + else if (((vp->v_mount->mnt_kern_flag & MNTK_SUSPENDED) =3D=3D 0) || + ((ip->i_flag & (IN_CHANGE | IN_UPDATE)) !=3D 0)) ip->i_flag |=3D IN_MODIFIED; - if ((vp->v_mount->mnt_flag & MNT_RDONLY) =3D=3D 0) { - vfs_timestamp(&ts); - if (ip->i_flag & IN_ACCESS) { - DIP_SET(ip, i_atime, ts.tv_sec); - DIP_SET(ip, i_atimensec, ts.tv_nsec); - } - if (ip->i_flag & IN_UPDATE) { - DIP_SET(ip, i_mtime, ts.tv_sec); - DIP_SET(ip, i_mtimensec, ts.tv_nsec); - ip->i_modrev++; - } - if (ip->i_flag & IN_CHANGE) { - DIP_SET(ip, i_ctime, ts.tv_sec); - DIP_SET(ip, i_ctimensec, ts.tv_nsec); - } + else if ((ip->i_flag & IN_ACCESS) !=3D 0) + ip->i_flag |=3D IN_LAZYACCESS; +=09 + if (ip->i_flag & IN_ACCESS) { + DIP_SET(ip, i_atime, ts.tv_sec); + DIP_SET(ip, i_atimensec, ts.tv_nsec); + } + if (ip->i_flag & IN_UPDATE) { + DIP_SET(ip, i_mtime, ts.tv_sec); + DIP_SET(ip, i_mtimensec, ts.tv_nsec); + ip->i_modrev++; + } + if (ip->i_flag & IN_CHANGE) { + DIP_SET(ip, i_ctime, ts.tv_sec); + DIP_SET(ip, i_ctimensec, ts.tv_nsec); } + + out: ip->i_flag &=3D ~(IN_ACCESS | IN_CHANGE | IN_UPDATE); + out_unl: + VI_UNLOCK(vp); + if (mnt_locked) + MNT_IUNLOCK(vp->v_mount); +} + +/* + * Clear the IN_LAZYACCESS i_flag. vnode shall be interlocked. + */ + +void +ufs_lazyaccess(vp) + struct vnode *vp; +{ + struct inode *ip; + + ip =3D VTOI(vp); + if ((ip->i_flag & IN_LAZYACCESS) !=3D 0) { + ip->i_flag &=3D ~IN_LAZYACCESS; + ip->i_flag |=3D IN_MODIFIED; + } } =20 /* @@ -266,11 +305,15 @@ } */ *ap; { struct vnode *vp =3D ap->a_vp; + int usecount; =20 VI_LOCK(vp); - if (vp->v_usecount > 1) - ufs_itimes(vp); + usecount =3D vp->v_usecount; VI_UNLOCK(vp); + + if (usecount > 1) + ufs_itimes(vp); + return (0); } =20 @@ -378,8 +421,10 @@ if (ip->i_ump->um_fstype =3D=3D UFS1) { vap->va_rdev =3D ip->i_din1->di_rdev; vap->va_size =3D ip->i_din1->di_size; + VI_LOCK(vp); vap->va_atime.tv_sec =3D ip->i_din1->di_atime; vap->va_atime.tv_nsec =3D ip->i_din1->di_atimensec; + VI_UNLOCK(vp); vap->va_mtime.tv_sec =3D ip->i_din1->di_mtime; vap->va_mtime.tv_nsec =3D ip->i_din1->di_mtimensec; vap->va_ctime.tv_sec =3D ip->i_din1->di_ctime; @@ -390,8 +435,10 @@ } else { vap->va_rdev =3D ip->i_din2->di_rdev; vap->va_size =3D ip->i_din2->di_size; + VI_LOCK(vp); vap->va_atime.tv_sec =3D ip->i_din2->di_atime; vap->va_atime.tv_nsec =3D ip->i_din2->di_atimensec; + VI_UNLOCK(vp); vap->va_mtime.tv_sec =3D ip->i_din2->di_mtime; vap->va_mtime.tv_nsec =3D ip->i_din2->di_mtimensec; vap->va_ctime.tv_sec =3D ip->i_din2->di_ctime; @@ -400,7 +447,9 @@ vap->va_birthtime.tv_nsec =3D ip->i_din2->di_birthnsec; vap->va_bytes =3D dbtob((u_quad_t)ip->i_din2->di_blocks); } + VI_LOCK(vp); vap->va_flags =3D ip->i_flags; + VI_UNLOCK(vp); vap->va_gen =3D ip->i_gen; vap->va_blocksize =3D vp->v_mount->mnt_stat.f_iosize; vap->va_type =3D IFTOVT(ip->i_mode); @@ -1992,11 +2041,15 @@ } */ *ap; { struct vnode *vp =3D ap->a_vp; + int usecount; =20 VI_LOCK(vp); - if (vp->v_usecount > 1) - ufs_itimes(vp); + usecount =3D vp->v_usecount; VI_UNLOCK(vp); + + if (usecount > 1) + ufs_itimes(vp); + return (fifo_specops.vop_close(ap)); } =20 --at6+YcpfzWZg/htY Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (FreeBSD) iD8DBQFE7DcYC3+MBN1Mb4gRAp1bAJ9lQu5xqeISUdM/+exevjAt6osv2QCfXYoB uJDDlH8BXDHMOY1HnDZuT7U= =Vih+ -----END PGP SIGNATURE----- --at6+YcpfzWZg/htY--