Date:      Mon, 21 Jun 2010 16:15:22 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Kostik Belousov <kostikbel@gmail.com>
Cc:        freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org
Subject:   Re: Tmpfs elimination of double-copy
Message-ID:  <201006211615.22758.jhb@freebsd.org>
In-Reply-To: <20100621184928.GI13238@deviant.kiev.zoral.com.ua>
References:  <20100621125825.GG13238@deviant.kiev.zoral.com.ua> <201006211030.55327.jhb@freebsd.org> <20100621184928.GI13238@deviant.kiev.zoral.com.ua>

On Monday 21 June 2010 2:49:28 pm Kostik Belousov wrote:
> On Mon, Jun 21, 2010 at 10:30:55AM -0400, John Baldwin wrote:
> > On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote:
> > > Hi,
> > > Below is the patch that eliminates the second copy of the data kept by
> > > tmpfs when a file is mapped. It also removes potential deadlocks caused
> > > by tmpfs doing copyin/copyout while a page is busy. It is possible that
> > > the patch also fixes the known issue with sendfile(2) on a tmpfs file,
> > > but I have not verified this.
> > > 
> > > The patch essentially consists of three parts:
> > > - moving vm_object's vnp_size from the type-discriminated union to the
> > >   vm_object proper;
> > > - making the VM not choke when the vm object held in struct vnode's
> > >   v_object is a default or swap object instead of a vnode object;
> > > - using the swap object that keeps the data for a tmpfs VREG file as
> > >   v_object as well.
> > > 
> > > Peter Holm helped me with the patch, apparently we survive fsx and
> > > stress2.
> > 
> > Why did you have to move vnp_size out of the union?  Is tmpfs using a
> > non-OBJT_VNODE object to hold file data?
> Tmpfs uses an OBJT_SWAP object to keep the data pages for its files.
> The current code allocates another object of type OBJT_VNODE, assigned
> to vp->v_object, to satisfy the VM interface for mapping the file, using
> vnode_create_vobject. The objects do not share pages (I do not think
> this can be easily achieved without serious changes to the VM). Thus
> most, if not all, of the data is present in two sets of pages.
> 
> When such a file is written to, tmpfs copies the user buffer both to the
> swap object and to the v_object.
> 
> The patch I posted assigns the swap object to vp->v_object. I had to
> make a small change to vm_mmap_vnode() so that it does not allocate the
> vnode pager and does not increment the vnode use counter when v_object
> is the swap object.
> 
> vnp_size has to be provided at the object layer because our swap
> object is used, e.g., to mmap executables from tmpfs, and the image
> activation code relies on vnp_size instead of the slower VOP_GETATTR().
> I think this route is easier than converting all vnp_size users to
> VOP_GETATTR() for the benefit of tmpfs alone.

Ok, thanks for the expanded explanation. :)  It seems a shame to have to move 
vnp_size out of the pager-specific data.  Maybe add a comment in vm_object.h 
to say that vnp_size is used by multiple object types which is why it can't be 
vnode-specific anymore?
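
To make the suggestion concrete, here is a rough userspace sketch of what the
layout and comment might look like. It is only an illustration: the sketch_*
names, the enum values, the int64_t size type, and the placeholder union
members are assumptions made up for this example, not the real vm_object.h
definitions or the contents of the posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel object types (illustration only). */
enum sketch_objtype {
	SKETCH_OBJT_DEFAULT,
	SKETCH_OBJT_SWAP,
	SKETCH_OBJT_VNODE
};

struct sketch_object {
	enum sketch_objtype type;

	/*
	 * File size in bytes.  This lives in the object proper rather than
	 * in the pager-specific union because it is used by more than one
	 * object type: a swap object backing a tmpfs file needs it too, so
	 * it cannot be vnode-specific anymore.
	 */
	int64_t vnp_size;

	/* Pager-specific data; members here are assumed placeholders. */
	union {
		struct { int swp_bcount; } swp;
		struct { void *vnp_private; } vnp;
	} un_pager;
};

/* Record the file size on the object, whatever its type. */
static void
sketch_object_set_size(struct sketch_object *obj, int64_t size)
{
	assert(size >= 0);
	obj->vnp_size = size;
}

/* Read the size without consulting any pager-specific state. */
static int64_t
sketch_object_size(const struct sketch_object *obj)
{
	return (obj->vnp_size);
}
```

With the size outside the union, code that only needs the file size does not
have to know which pager backs the object, and writing swap-private fields
cannot clobber it.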

-- 
John Baldwin


