Date: Tue, 18 Sep 2012 12:30:40 -0400
From: "b. f." <bf1783@googlemail.com>
To: freebsd-fs@FreeBSD.org
Subject: Re: Problems after recent nullfs,vfs changes in 10.0-CURRENT
Message-ID: <CAGFTUwNP9qB1D%2B1E6Pw_ud1ESomcDs2bAOY9c_VRYLZ5AiF%2B5g@mail.gmail.com>
In-Reply-To: <20120918084924.GY37286@deviant.kiev.zoral.com.ua>
References: <CAGFTUwMVAmoN49u1bT_8LuxKo6JKrB6shbw773MpmJ5i7q=Qeg@mail.gmail.com> <20120917121925.GQ37286@deviant.kiev.zoral.com.ua> <20120917183654.GA13273@x2.osted.lan> <20120918084924.GY37286@deviant.kiev.zoral.com.ua>

The following deals with some problems exposed by r240283-5,
particularly (but not only) when used with changes to tmpfs that were
first proposed by kib@ on 21 June 2010 on this list, in a thread
entitled "Tmpfs elimination of double copy":

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=20463+0+archive/2010/freebsd-fs/20100627.freebsd-fs

On 9/18/12, Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Mon, Sep 17, 2012 at 08:36:54PM +0200, Peter Holm wrote:
>> On Mon, Sep 17, 2012 at 03:19:25PM +0300, Konstantin Belousov wrote:
>> > Please mail fs@, possibly Cc:-ing me.
>> >
>> > On Mon, Sep 17, 2012 at 03:04:46AM -0400, b. f. wrote:
>> > > The recent nullfs or vfs changes (r240283-5) have exposed some
>> > > problems with my tinderbox. In this tinderbox, I've been using recent
>> > > versions of -CURRENT with Gleb's tmpfs rbtree patch:
>> > >
>> > > http://people.freebsd.org/~gleb/tmpfs-nrbtree.1.patch
>> > >
>> > > and a merged version of your tmpfs single-buffer patch:
>> > >
>> > > http://people.freebsd.org/~kib/misc/tmpfs.12.patch
>> > >
>> > > The tinderbox performs builds in a tmpfs filesystem that is nullfs
>> > > grafted to a ufs filesystem. After r240283-5, builds of
>> > > ports/lang/ocaml failed when a cp(1) of an executable failed with
>> > > ETXTBSY. After reverting r240285, the builds of ocaml succeeded.
>> > >
>> > > I've attached logs of the failed and successful builds. Can you guess
>> > > whether the problem is solely due to the recent nullfs and vfs
>> > > changes, or to some defect in Gleb's proposed changes, or to a problem
>> > > with your proposed tmpfs change, or my merging of it? What further
>> > > changes or tests would you suggest to help find the source of the
>> > > problem?
>> > >
>> > > I've attached a diff of the relevant changes to the system sources
>> > > used in the tinderbox, and logs of the successful (*.log) and
>> > > unsuccessful (*.log.error) ocaml builds.
>> >
>> > Please show me the mount -v output, and specify which filesystems
>> > are used where.

The following is a typical layout for one run of the tinderbox (which
is in /home/shared/freebsd/tinderbox):

/dev/ufs/d1root on / (ufs, local, noatime, writes: sync 13 async 25, reads: sync 553 async 42, fsid 8aabfa4d68614a9f)
devfs on /dev (devfs, local, fsid 00ff007171000000)
tmpfs on /tmp (tmpfs, local, nosuid, fsid 01ff008787000000)
/dev/ufs/d1var on /var (ufs, local, noatime, journaled soft-updates, writes: sync 15 async 269, reads: sync 664 async 12, fsid a5abfa4d331091c9)
/dev/ufs/d1usr on /usr (ufs, local, noatime, journaled soft-updates, writes: sync 2 async 0, reads: sync 765 async 12, fsid b4abfa4d94c0f782)
/dev/ufs/d1usrlocal on /usr/local (ufs, local, noatime, journaled soft-updates, writes: sync 32 async 298, reads: sync 2867 async 106, fsid c4abfa4d96ab4351)
/dev/ufs/d1home on /home (ufs, local, noatime, journaled soft-updates, writes: sync 16 async 123, reads: sync 2065 async 268, fsid ceabfa4d9bb85870)

the filesystem used for the port builds:

/tmp/tinderbox/7.4-amd64-u1 on /home/shared/freebsd/tinderbox/7.4-amd64-u1 (nullfs, local, fsid 03ff002929000000)
/home/shared/freebsd/ports/head on /home/shared/freebsd/tinderbox/7.4-amd64-u1/a/ports (nullfs, local, read-only, fsid 04ff002929000000)
/home/shared/freebsd/tinderbox/jails/7.4-amd64/src on /home/shared/freebsd/tinderbox/7.4-amd64-u1/usr/src (nullfs, local, read-only, fsid 05ff002929000000)
devfs on /home/shared/freebsd/tinderbox/7.4-amd64-u1/dev (devfs, local, fsid 06ff007171000000)
/home/shared/freebsd/distfiles on /home/shared/freebsd/tinderbox/7.4-amd64-u1/distcache (nullfs, local, fsid 07ff002929000000)
linprocfs on /home/shared/freebsd/tinderbox/7.4-amd64-u1/compat/linux/proc (linprocfs, local, fsid 08ff00b5b5000000)
procfs on /home/shared/freebsd/tinderbox/7.4-amd64-u1/proc (procfs, local, fsid 09ff000202000000)

>> >
>> > The issue almost definitely is the held reference on the vm object.
>> > Lets remove Gleb' patches from the picture at all.
>> >
>> > After rethinking VV_TEXT handling both for nullfs and tmpfs (patched),
>> > I see two issues ATM:
>> >
>> > 1. VV_TEXT may be set either on the lower vnode, or on the nullfs vnode.
>> > So if you executed a file from nullfs alias, lower vnode does not get
>> > VV_TEXT set, and executable can still be opened for write.
>> >
>> > 2. For tmpfs, the hack I added to clear VV_TEXT if swap vm object reference
>> > count == 1, is not called often enough. This allows to VV_TEXT to leak, esp.
>> > because nullfs after r240283 is not eager to reclaim its vnodes.
>> >
>> > I updated my branch with tmpfs patches with the following changes:
>> >
>> > 1. nullfs now bypasses the VV_TEXT set and clear operations to the lower
>> > vnode.
>> >
>> > 2. the tmpfs_clear_text() hack is removed, instead vm_object_deallocate()
>> > clears VV_TEXT on the tmpfs vnode if reference count goes to 1.
>> >
>> > Updated patch is at
>> > http://people.freebsd.org/~kib/misc/tmpfs.13.patch
>> > I tested it very lightly, so to say.
>>
>> I see the problem on a pristine r240611. Test scenario included.
>>
>> + mdconfig -a -t swap -s 1g -u 5
>> + bsdlabel -w md5 auto
>> + newfs -U md5a
>> + mount /dev/md5a /mnt2
>> + chmod 777 /mnt2
>> + mount
>> + grep /mnt
>> + grep -q tmpfs
>> + mount -t tmpfs tmpfs /mnt
>> + chmod 777 /mnt
>> + mkdir /mnt2/mp
>> + mount -t nullfs /mnt /mnt2/mp
>> + cp /usr/bin/true /mnt2/mp/true
>> + /mnt/true
>> +
>> + rm -f /mnt/true
>> + cp /usr/bin/true /mnt2/mp/true
>> + /mnt2/mp/true
>> +
>> ./nullfs12.sh: cannot create /mnt2/mp/true: Text file busy
>> + echo FAIL 2
>> FAIL 2
>> + mount
>> + egrep 'tmpfs|nullfs|/mnt |/mnt2 '
>> /dev/md5a on /mnt2 (ufs, local, soft-updates)
>> tmpfs on /mnt (tmpfs, NFS exported, local)
>> /mnt on /mnt2/mp (nullfs, local)
>> + rm -f /mnt2/mp/true
>
> Yes, this is very close if not identical to the only test which I performed
> with the tmpfs.13.patch.
>

I can no longer reproduce the port build failures on r240651 amd64
after applying your tmpfs.13.patch, and I haven't encountered any
other obvious problems in the short time that I've been using it. I
did not rerun Peter Holm's nullfs12.sh test, since you had already
subjected your patch to a similar test.

Regards,
b.
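
For readers following the VV_TEXT discussion quoted above, here is a
rough userland sketch of the first issue and its fix. It is only a toy
model, not kernel code and not taken from tmpfs.13.patch: the names
toy_vnode, toy_set_text, toy_unset_text, toy_open_write and the two
scenario functions are hypothetical, and the flag value is
illustrative. The only thing it tries to show is that, without the
bypass, executing through a nullfs alias leaves the lower vnode
writable, while with the bypass the flag lands on the lower vnode and
is cleared there again.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define VV_TEXT 0x0020u /* illustrative value for this toy model only */

/* Toy vnode: 'lower' is non-NULL only for a nullfs alias. */
struct toy_vnode {
	unsigned flags;
	struct toy_vnode *lower;
};

/* Pick the vnode that carries VV_TEXT: the lower one when bypassing. */
static struct toy_vnode *
text_target(struct toy_vnode *vp, int bypass)
{
	return ((bypass && vp->lower != NULL) ? vp->lower : vp);
}

/* Mark a vnode as being executed. */
static void
toy_set_text(struct toy_vnode *vp, int bypass)
{
	text_target(vp, bypass)->flags |= VV_TEXT;
}

/* Drop the "being executed" mark again. */
static void
toy_unset_text(struct toy_vnode *vp, int bypass)
{
	text_target(vp, bypass)->flags &= ~VV_TEXT;
}

/* Opening for write is refused while VV_TEXT is set on that vnode. */
static int
toy_open_write(const struct toy_vnode *vp)
{
	return ((vp->flags & VV_TEXT) ? ETXTBSY : 0);
}

/* Execute via the nullfs alias, then try to write via the lower vnode. */
int
toy_exec_then_write_lower(int bypass)
{
	struct toy_vnode lower = { 0, NULL };
	struct toy_vnode alias = { 0, &lower };

	assert(alias.lower == &lower);
	toy_set_text(&alias, bypass);
	return (toy_open_write(&lower));
}

/* Execute via the alias, finish executing, then write via the lower vnode. */
int
toy_exit_then_write_lower(int bypass)
{
	struct toy_vnode lower = { 0, NULL };
	struct toy_vnode alias = { 0, &lower };

	toy_set_text(&alias, bypass);
	toy_unset_text(&alias, bypass);
	return (toy_open_write(&lower));
}
```

In this model, toy_exec_then_write_lower(0) returns 0 (the executable
can still be opened for write, which is issue 1), while
toy_exec_then_write_lower(1) returns ETXTBSY. The second scenario
returns 0 with the bypass enabled, since the clear operation reaches
the same vnode as the set operation. The tmpfs reference-count part of
the fix is deliberately left out of the sketch.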