From owner-freebsd-current@FreeBSD.ORG Fri Dec 12 13:41:40 2008 Return-Path: Delivered-To: current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7B1181065678; Fri, 12 Dec 2008 13:41:40 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.terabit.net.ua (mail.terabit.net.ua [195.137.202.147]) by mx1.freebsd.org (Postfix) with ESMTP id D70E38FC1B; Fri, 12 Dec 2008 13:41:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from skuns.zoral.com.ua ([91.193.166.194] helo=mail.zoral.com.ua) by mail.terabit.net.ua with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63 (FreeBSD)) (envelope-from ) id 1LB8Gv-0008X3-TJ; Fri, 12 Dec 2008 15:41:38 +0200 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id mBCDfUKM061875 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 12 Dec 2008 15:41:30 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3) with ESMTP id mBCDfTLm097664; Fri, 12 Dec 2008 15:41:30 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.3/8.14.3/Submit) id mBCDfTOC097663; Fri, 12 Dec 2008 15:41:29 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Fri, 12 Dec 2008 15:41:29 +0200 From: Kostik Belousov To: David Wolfskill , hackers@freebsd.org, current@freebsd.org Message-ID: <20081212134129.GD2038@deviant.kiev.zoral.com.ua> References: <20081203001538.GC96383@bunrab.catwhisker.org> <20081209190110.GW60731@albert.catwhisker.org> <20081210165022.GJ60731@albert.catwhisker.org> <20081210170620.GS2038@deviant.kiev.zoral.com.ua> <20081211225349.GB5597@albert.catwhisker.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ZPyQn7mfIBXOql8K" Content-Disposition: inline In-Reply-To: <20081211225349.GB5597@albert.catwhisker.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: ClamAV version 0.93.3, clamav-milter version 0.93.3 on skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua X-Virus-Scanned: mail.terabit.net.ua 1LB8Gv-0008X3-TJ b4c2b8acd62b5bbc628e5797911f2a14 X-Terabit: YES Cc: Subject: Re: NFS (& amd?) dysfunction descending a hierarchy X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 12 Dec 2008 13:41:40 -0000 --ZPyQn7mfIBXOql8K Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Dec 11, 2008 at 02:53:49PM -0800, David Wolfskill wrote: > On Wed, Dec 10, 2008 at 07:06:20PM +0200, Kostik Belousov wrote: > > ... > > > What concerns me is that even if the attempted unmount gets EBUSY, the > > > user-level process descending the directory hierarchy is getting ENOE= NT > > > trying to issue fstatfs() against an open file descriptor. > > >=20 > > > I'm having trouble figuring out any way that makes any sense. > >=20 > > Basically, the problem is that NFS uses shared lookup, and this allows > > for the bug where several negative namecache entries are created for > > non-existent node. Then this node gets created, removing only the first > > negative namecache entry. For some reasons, vnode is reclaimed; amd' > > tasting of unmount is a good reason for vnode to be reclaimed. > >=20 > > Now, you have existing path and a negative cache entry. This was > > reported by Peter Holm first, I listed relevant revisions that > > should fix this in previous mail. >=20 > Well, I messed up the machine I had been using for testing, and needed > to wait for IT to do something to it since I don't have physical or > console access to it. >=20 > So after I happened to demonstrate the effect using my desktop -- which > had been running RELENG_7_1, sources updated as of around 0400 hrs. > US/Pacific -- I decided to go ahead and update the desktop to RELENG_7_1 > as of this morning (which had the commit to sys/kern/vfs_cache.c), then > test. >=20 > It still failed, apparently in the same way; details below. >=20 > First, here's a list of the files that were changed: >=20 > U lib/libarchive/archive_read_support_format_iso9660.c > U lib/libarchive/archive_string.c > U lib/libarchive/archive_string.h > U lib/libc/gen/times.3 > U lib/libc/i386/sys/pipe.S > U lib/libc/i386/sys/reboot.S > U lib/libc/i386/sys/setlogin.S > U lib/libutil/Makefile > U lib/libutil/kinfo_getfile.c > U lib/libutil/kinfo_getvmmap.c > U lib/libutil/libutil.h > U share/man/man4/bce.4 > U share/man/man5/Makefile > U share/man/man5/fstab.5 > U share/man/man5/nullfs.5 > U sys/amd64/Makefile > U sys/boot/forth/loader.conf.5 > U sys/dev/ale/if_ale.c > U sys/dev/bce/if_bce.c > U sys/dev/cxgb/cxgb_main.c > U sys/dev/cxgb/common/cxgb_ael1002.c > U sys/dev/cxgb/common/cxgb_t3_hw.c > U sys/dev/cxgb/common/cxgb_xgmac.c > U sys/dev/re/if_re.c > U sys/fs/nullfs/null_vnops.c > U sys/kern/Make.tags.inc > U sys/kern/kern_descrip.c > U sys/kern/kern_proc.c > U sys/kern/vfs_cache.c > U sys/netinet/in_pcb.h > U sys/pci/if_rlreg.h > U sys/sys/sysctl.h > U sys/sys/user.h > U sys/ufs/ufs/ufs_quota.c > U usr.bin/procstat/Makefile > U usr.bin/procstat/procstat_files.c > U usr.bin/procstat/procstat_vm.c > U usr.bin/tar/util.c > U usr.bin/tar/test/Makefile > U usr.bin/tar/test/test_strip_components.c > U usr.bin/tar/test/test_symlink_dir.c > U usr.bin/xargs/xargs.1 > U usr.sbin/mtree/mtree.c >=20 > We see that sys/kern/vfs_cache.c is, indeed, among them. And: >=20 > dwolf-bsd(7.1-P)[5] grep '\$FreeBSD' /sys/kern/vfs_cache.c > __FBSDID("$FreeBSD: src/sys/kern/vfs_cache.c,v 1.114.2.3 2008/12/09 16:20= :58 kib Exp $"); > dwolf-bsd(7.1-P)[6]=20 >=20 > That should correspond to the desired version of the file. >=20 > Here we see an excerpt from the ktrace output for the amd(8) process and > its children; this is a point when amd(8) is trying an unmount() to see > if it can get away with it: >=20 > 977 amd 1229033597.269612 CALL gettimeofday(0x807ad48,0) > 977 amd 1229033597.269620 RET gettimeofday 0 > 977 amd 1229033597.269630 CALL sigprocmask(SIG_BLOCK,0xbfbfeaec,= 0xbfbfeadc) > 977 amd 1229033597.269637 RET sigprocmask 0 > 977 amd 1229033597.269645 CALL fork > 977 amd 1229033597.273810 RET fork 1712/0x6b0 > 1712 amd 1229033597.273811 RET fork 0 > 977 amd 1229033597.273836 CALL sigprocmask(SIG_SETMASK,0xbfbfead= c,0) > 1712 amd 1229033597.273845 CALL getpid > 977 amd 1229033597.273850 RET sigprocmask 0 > 1712 amd 1229033597.273855 RET getpid 1712/0x6b0 > 977 amd 1229033597.273864 CALL gettimeofday(0x807ad48,0) > 977 amd 1229033597.273874 RET gettimeofday 0 > 1712 amd 1229033597.273878 CALL unmount(0x2832c610,0) > ... > 1712 amd 1229033597.352643 RET unmount -1 errno 16 Device busy > 1712 amd 1229033597.352695 CALL sigprocmask(SIG_BLOCK,0x28097c00,= 0xbfbfea0c) > 1712 amd 1229033597.352728 RET sigprocmask 0 > 1712 amd 1229033597.352751 CALL sigprocmask(SIG_SETMASK,0x28097c1= 0,0) > 1712 amd 1229033597.352769 RET sigprocmask 0 > 1712 amd 1229033597.352781 CALL sigprocmask(SIG_BLOCK,0x28097c00,= 0xbfbfe9dc) > 1712 amd 1229033597.352790 RET sigprocmask 0 > 1712 amd 1229033597.352801 CALL sigprocmask(SIG_SETMASK,0x28097c1= 0,0) > 1712 amd 1229033597.352805 RET sigprocmask 0 > 1712 amd 1229033597.352815 CALL exit(0x10) > 977 amd 1229033597.353085 RET select -1 errno 4 Interrupted sys= tem call > 977 amd 1229033597.353093 PSIG SIGCHLD caught handler=3D0x805de5= 0 mask=3D0x0 code=3D0x0 > 977 amd 1229033597.353103 CALL wait4(0xffffffff,0xbfbfe83c,WNOHA= NG,0) > 977 amd 1229033597.353116 RET wait4 1712/0x6b0 > 977 amd 1229033597.353122 CALL wait4(0xffffffff,0xbfbfe83c,WNOHA= NG,0) > 977 amd 1229033597.353127 RET wait4 -1 errno 10 No child proces= ses >=20 >=20 > So amd(8) master process (pid 977) jorks off a child (pid 1712) to > try an umount(), which it initiates at 1229033597.273878. At > 1229033597.352643 the child gets control back, as well as an EBUSY, > which I would expect to mean that the attempt failed. The child > exits at 1229033597.352815 with a status code of 16. >=20 > Armed with that, we look at a ktrace excerpt from "rm -fr": >=20 > 1660 rm 1229033597.283277 CALL rmdir(0x2822b388) > 1660 rm 1229033597.283283 NAMI "stvef-paks" > 1660 rm 1229033597.285599 RET rmdir 0 > 1660 rm 1229033597.285620 CALL stat(0x2822b3e8,0xbfbfe8dc) > 1660 rm 1229033597.285626 NAMI "stvef-server" > 1660 rm 1229033597.286071 STRU struct stat {dev=3D83951372, ino= =3D20124614, mode=3Ddrwxr-xr-x , nlink=3D3, uid=3D9874, gid=3D929, rdev=3D0= , atime=3D1228844788, stime=3D1227555769, ctime=3D1228845828.326650000, bir= thtime=3D0, size=3D4096, blksize=3D4096, blocks=3D8, flags=3D0x0 } > 1660 rm 1229033597.286078 RET stat 0 > 1660 rm 1229033597.286084 CALL open(0x2822b3e8,O_NONBLOCK,0x1) > 1660 rm 1229033597.286091 NAMI "stvef-server" > 1660 rm 1229033597.287145 RET open 4 > 1660 rm 1229033597.287154 CALL fstat(0x4,0xbfbfe8dc) > 1660 rm 1229033597.287161 STRU struct stat {dev=3D83951372, ino= =3D20124614, mode=3Ddrwxr-xr-x , nlink=3D3, uid=3D9874, gid=3D929, rdev=3D0= , atime=3D1228844788, stime=3D1227555769, ctime=3D1228845828.326650000, bir= thtime=3D0, size=3D4096, blksize=3D4096, blocks=3D8, flags=3D0x0 } > 1660 rm 1229033597.287166 RET fstat 0 > 1660 rm 1229033597.287171 CALL fcntl(0x4,F_SETFD,FD_CLOEXEC) > 1660 rm 1229033597.287177 RET fcntl 0 > 1660 rm 1229033597.287187 CALL fstatfs(0x4,0xbfbfe704) > 1660 rm 1229033597.287195 RET fstatfs -1 errno 2 No such file o= r directory > 1660 rm 1229033597.287202 CALL close(0x4) > 1660 rm 1229033597.287211 RET close 0 >=20 > [Sorry for the long lines....] >=20 > Here we see that the "rm" process (pid 1660) removed a directory > named stvef-paks sucessfully in the interval between 1229033597.283277 > (when the request was made) and 1229033597.285599 (when the 0 return > occurred). The "rm" process proceeds to process a directory named > stvef-server: >=20 > * At 1229033597.285620 it issues a stat(); the successful return > is at 1229033597.286078. >=20 > * At 1229033597.286084 it issues an open(); the successful return > is at 1229033597.287145. The FD is 4. >=20 > * At 1229033597.287154 it issues an fstat() against FD 4; the successful > return is at 1229033597.287166. >=20 > * At 1229033597.287171 it issues an fcntl() against FD 4; the > successful return is at 1229033597.287177. >=20 > * At 1229033597.287187 it issues an fstatfs() against FD 4; the > unsuccessful return is at 1229033597.287195, claiming ENOENT. >=20 > Say WHAT??!? >=20 > I expect to be able to test a bit more promptly now. But is this error transient or permanent ? I.e., would restart of rm successful or failing ? Anyway, this error looks different too. --ZPyQn7mfIBXOql8K Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (FreeBSD) iEYEARECAAYFAklCagkACgkQC3+MBN1Mb4jOBgCeJtuwzLJH5kX0gUBlu2VqG/yD lzAAoK6FmV53wAKwPlQDbf0Ryd4LXsbZ =pDvh -----END PGP SIGNATURE----- --ZPyQn7mfIBXOql8K--