Date:      Tue, 18 Sep 2012 12:30:40 -0400
From:      "b. f." <bf1783@googlemail.com>
To:        freebsd-fs@FreeBSD.org
Subject:   Re: Problems after recent nullfs,vfs changes in 10.0-CURRENT
Message-ID:  <CAGFTUwNP9qB1D+1E6Pw_ud1ESomcDs2bAOY9c_VRYLZ5AiF+5g@mail.gmail.com>
In-Reply-To: <20120918084924.GY37286@deviant.kiev.zoral.com.ua>
References:  <CAGFTUwMVAmoN49u1bT_8LuxKo6JKrB6shbw773MpmJ5i7q=Qeg@mail.gmail.com> <20120917121925.GQ37286@deviant.kiev.zoral.com.ua> <20120917183654.GA13273@x2.osted.lan> <20120918084924.GY37286@deviant.kiev.zoral.com.ua>

The following deals with some problems exposed by r240283-5,
particularly (but not only) when used with changes to tmpfs that were
first proposed by kib@ on 21 June 2010 on this list, in a thread
entitled "Tmpfs elimination of double copy":

http://docs.freebsd.org/cgi/getmsg.cgi?fetch=20463+0+archive/2010/freebsd-fs/20100627.freebsd-fs

On 9/18/12, Konstantin Belousov <kostikbel@gmail.com> wrote:
> On Mon, Sep 17, 2012 at 08:36:54PM +0200, Peter Holm wrote:
>> On Mon, Sep 17, 2012 at 03:19:25PM +0300, Konstantin Belousov wrote:
>> > Please mail fs@, possibly Cc:-ing me.
>> >
>> > On Mon, Sep 17, 2012 at 03:04:46AM -0400, b. f. wrote:
>> > > The recent nullfs or vfs changes (r240283-5) have exposed some
>> > > problems with my tinderbox.  In this tinderbox, I've been using
>> > > recent versions of -CURRENT with Gleb's tmpfs rbtree patch:
>> > >
>> > > http://people.freebsd.org/~gleb/tmpfs-nrbtree.1.patch
>> > >
>> > > and a merged version of your tmpfs single-buffer patch:
>> > >
>> > > http://people.freebsd.org/~kib/misc/tmpfs.12.patch
>> > >
>> > > The tinderbox performs builds in a tmpfs filesystem that is nullfs
>> > > grafted to a ufs filesystem.  After r240283-5, builds of
>> > > ports/lang/ocaml failed when a cp(1) of an executable failed with
>> > > ETXTBSY. After reverting r240285, the builds of ocaml succeeded.
>> > >
>> > > I've attached logs of the failed and successful builds.  Can you
>> > > guess whether the problem is solely due to the recent nullfs and
>> > > vfs changes, or to some defect in Gleb's proposed changes, or to
>> > > a problem with your proposed tmpfs change, or my merging of it?
>> > > What further changes or tests would you suggest to help find the
>> > > source of the problem?
>> > >
>> > > I've attached a diff of the relevant changes to the system sources
>> > > used in the tinderbox, and logs of the successful (*.log) and
>> > > unsuccessful (*.log.error) ocaml builds.
>> >
>> > Please show me the mount -v output, and specify which filesystems
>> > are used where.

The following is a typical layout for one run of the tinderbox (which
is in /home/shared/freebsd/tinderbox):

/dev/ufs/d1root on / (ufs, local, noatime, writes: sync 13 async 25, reads: sync 553 async 42, fsid 8aabfa4d68614a9f)
devfs on /dev (devfs, local, fsid 00ff007171000000)
tmpfs on /tmp (tmpfs, local, nosuid, fsid 01ff008787000000)
/dev/ufs/d1var on /var (ufs, local, noatime, journaled soft-updates, writes: sync 15 async 269, reads: sync 664 async 12, fsid a5abfa4d331091c9)
/dev/ufs/d1usr on /usr (ufs, local, noatime, journaled soft-updates, writes: sync 2 async 0, reads: sync 765 async 12, fsid b4abfa4d94c0f782)
/dev/ufs/d1usrlocal on /usr/local (ufs, local, noatime, journaled soft-updates, writes: sync 32 async 298, reads: sync 2867 async 106, fsid c4abfa4d96ab4351)
/dev/ufs/d1home on /home (ufs, local, noatime, journaled soft-updates, writes: sync 16 async 123, reads: sync 2065 async 268, fsid ceabfa4d9bb85870)

The filesystems used for the port builds are:

/tmp/tinderbox/7.4-amd64-u1 on /home/shared/freebsd/tinderbox/7.4-amd64-u1 (nullfs, local, fsid 03ff002929000000)
/home/shared/freebsd/ports/head on /home/shared/freebsd/tinderbox/7.4-amd64-u1/a/ports (nullfs, local, read-only, fsid 04ff002929000000)
/home/shared/freebsd/tinderbox/jails/7.4-amd64/src on /home/shared/freebsd/tinderbox/7.4-amd64-u1/usr/src (nullfs, local, read-only, fsid 05ff002929000000)
devfs on /home/shared/freebsd/tinderbox/7.4-amd64-u1/dev (devfs, local, fsid 06ff007171000000)
/home/shared/freebsd/distfiles on /home/shared/freebsd/tinderbox/7.4-amd64-u1/distcache (nullfs, local, fsid 07ff002929000000)
linprocfs on /home/shared/freebsd/tinderbox/7.4-amd64-u1/compat/linux/proc (linprocfs, local, fsid 08ff00b5b5000000)
procfs on /home/shared/freebsd/tinderbox/7.4-amd64-u1/proc (procfs, local, fsid 09ff000202000000)
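
For anyone comparing their own layout against this one, the nullfs
grafts can be pulled out of live mount(8) output with a small filter.
This is only a sketch: the function name list_nullfs is made up for
illustration, it assumes the "<lower> on <upper> (<fstype>, ...)"
layout with each mount on a single line, and it will mis-split paths
that contain whitespace.

```sh
#!/bin/sh
# list_nullfs (hypothetical helper): read mount(8)-style lines on stdin
# and print "lower -> upper" for every nullfs mount.
# Assumes one mount per line and no whitespace in either path.
list_nullfs() {
    # Field 2 is the literal "on"; field 4 starts the parenthesised
    # option list, whose first entry is the filesystem type.
    awk '$2 == "on" && $4 ~ /^\(nullfs[,)]/ { print $1 " -> " $3 }'
}
```

It would be used as "mount -v | list_nullfs".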

>> >
>> > The issue almost definitely is the held reference on the vm object.
>> > Let's remove Gleb's patches from the picture altogether.
>> >
>> > After rethinking VV_TEXT handling both for nullfs and tmpfs (patched),
>> > I see two issues ATM:
>> >
>> > 1. VV_TEXT may be set either on the lower vnode, or on the nullfs
>> > vnode.  So if you execute a file through the nullfs alias, the
>> > lower vnode does not get VV_TEXT set, and the executable can still
>> > be opened for write.
>> >
>> > 2. For tmpfs, the hack I added to clear VV_TEXT if the swap vm
>> > object reference count == 1 is not called often enough. This allows
>> > VV_TEXT to leak, especially because nullfs after r240283 is not
>> > eager to reclaim its vnodes.
>> >
>> > I updated my branch with tmpfs patches with the following changes:
>> >
>> > 1. nullfs now bypasses the VV_TEXT set and clear operations to the
>> > lower vnode.
>> >
>> > 2. The tmpfs_clear_text() hack is removed; instead,
>> > vm_object_deallocate() clears VV_TEXT on the tmpfs vnode if the
>> > reference count goes to 1.
>> >
>> > Updated patch is at
>> > http://people.freebsd.org/~kib/misc/tmpfs.13.patch
>> > I tested it very lightly, so to say.
>>
>> I see the problem on a pristine r240611. Test scenario included.
>>
>> + mdconfig -a -t swap -s 1g -u 5
>> + bsdlabel -w md5 auto
>> + newfs -U md5a
>> + mount /dev/md5a /mnt2
>> + chmod 777 /mnt2
>> + mount
>> + grep /mnt
>> + grep -q tmpfs
>> + mount -t tmpfs tmpfs /mnt
>> + chmod 777 /mnt
>> + mkdir /mnt2/mp
>> + mount -t nullfs /mnt /mnt2/mp
>> + cp /usr/bin/true /mnt2/mp/true
>> + /mnt/true
>> +
>> + rm -f /mnt/true
>> + cp /usr/bin/true /mnt2/mp/true
>> + /mnt2/mp/true
>> +
>> ./nullfs12.sh: cannot create /mnt2/mp/true: Text file busy
>> + echo FAIL 2
>> FAIL 2
>> + mount
>> + egrep 'tmpfs|nullfs|/mnt |/mnt2 '
>> /dev/md5a on /mnt2 (ufs, local, soft-updates)
>> tmpfs on /mnt (tmpfs, NFS exported, local)
>> /mnt on /mnt2/mp (nullfs, local)
>> + rm -f /mnt2/mp/true
>
> Yes, this is very close if not identical to the only test which I performed
> with the tmpfs.13.patch.
>

I can no longer reproduce the port build failures on r240651 amd64
after applying your tmpfs.13.patch, and I haven't encountered any
other obvious problems in the short time that I've been using it.  I
did not rerun Peter Holm's nullfs12.sh test, since you had already
subjected your patch to a similar test.
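
For a quick check that needs no md(4) device or extra mounts, the
symptom can be approximated as follows: copy an executable into a
directory, run the copy to completion, and then try to open it for
writing.  Once the process has exited, the open should succeed; a
failure at that point (ETXTBSY in the cases discussed here) would
suggest a leaked VV_TEXT.  This is a sketch only, and it is not
Peter Holm's nullfs12.sh: the function name check_stale_text, the
file name true.copy, and the PASS/FAIL/SKIP strings are all made up
for illustration, and I have not verified that it triggers the
original problem.

```sh
#!/bin/sh
# check_stale_text DIR (hypothetical helper, not part of nullfs12.sh):
# copy true(1) into DIR, run the copy to completion, then try to open
# the copy for writing.  Prints PASS if the open succeeds, FAIL if it
# does not, and SKIP if the copy could not be made or executed
# (for example, on a noexec mount).
check_stale_text() {
    dir=$1
    src=
    # true(1) lives in /usr/bin on FreeBSD; /bin is tried as a fallback.
    for cand in /usr/bin/true /bin/true; do
        if [ -x "$cand" ]; then src=$cand; break; fi
    done
    if [ -z "$src" ]; then echo SKIP; return 0; fi

    copy=$dir/true.copy
    if ! cp "$src" "$copy" 2>/dev/null; then echo SKIP; return 0; fi
    if ! "$copy" 2>/dev/null; then rm -f "$copy"; echo SKIP; return 0; fi

    # The process has exited, so nothing should keep the text busy.
    # ":" with an append redirection opens the file for writing without
    # changing its contents; the subshell keeps a failed redirection
    # from terminating the calling shell.
    if ( : >> "$copy" ) 2>/dev/null; then
        result=PASS
    else
        result=FAIL
    fi
    rm -f "$copy"
    echo "$result"
}
```

Pointing it at a nullfs mount layered over tmpfs, e.g.
"check_stale_text /mnt2/mp" in the layout quoted above, would be the
interesting case; on an ordinary directory it should print PASS.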

Regards,
                b.


