From owner-svn-src-all@FreeBSD.ORG Mon Apr 18 19:28:16 2011 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED6FF106564A; Mon, 18 Apr 2011 19:28:16 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 3DAE78FC13; Mon, 18 Apr 2011 19:28:15 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p3IJSBtf051806 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 18 Apr 2011 22:28:11 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p3IJSA5R027163; Mon, 18 Apr 2011 22:28:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p3IJSAc2027162; Mon, 18 Apr 2011 22:28:10 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 18 Apr 2011 22:28:10 +0300 From: Kostik Belousov To: Matthew D Fleming Message-ID: <20110418192810.GX48734@deviant.kiev.zoral.com.ua> References: <201104181632.p3IGWM5v077720@svn.freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="FT/vNgDLxVq4t/AU" Content-Disposition: inline In-Reply-To: <201104181632.p3IGWM5v077720@svn.freebsd.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r220791 - in head: lib/libc/sys sys/compat/freebsd32 sys/kern sys/sys X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Apr 2011 19:28:17 -0000 --FT/vNgDLxVq4t/AU Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Apr 18, 2011 at 04:32:22PM +0000, Matthew D Fleming wrote: > Author: mdf > Date: Mon Apr 18 16:32:22 2011 > New Revision: 220791 > URL: http://svn.freebsd.org/changeset/base/220791 >=20 > Log: > Add the posix_fallocate(2) syscall. The default implementation in > vop_stdallocate() is filesystem agnostic and will run as slow as a > read/write loop in userspace; however, it serves to correctly > implement the functionality for filesystems that do not implement a > VOP_ALLOCATE. > =20 > Note that __FreeBSD_version was already bumped today to 900036 for any > ports which would like to use this function. > =20 > Also reserve space in the syscall table for posix_fadvise(2). > =20 > Reviewed by: -arch (previous version) Thank you for taking the remarks into consideration. > =20 > +int > +vop_stdallocate(struct vop_allocate_args *ap) > +{ > +#ifdef __notyet__ > + struct statfs sfs; > +#endif > + struct iovec aiov; > + struct vattr vattr, *vap; > + struct uio auio; > + off_t len, cur, offset; > + uint8_t *buf; > + struct thread *td; > + struct vnode *vp; > + size_t iosize; > + int error, locked; > + > + buf =3D NULL; > + error =3D 0; > + locked =3D 1; > + td =3D curthread; > + vap =3D &vattr; > + vp =3D ap->a_vp; > + len =3D ap->a_len; > + offset =3D ap->a_offset; > + > + error =3D VOP_GETATTR(vp, vap, td->td_ucred); > + if (error !=3D 0) > + goto out; > + iosize =3D vap->va_blocksize; > + if (iosize =3D=3D 0) > + iosize =3D BLKDEV_IOSIZE; > + if (iosize > MAXPHYS) > + iosize =3D MAXPHYS; > + buf =3D malloc(iosize, M_TEMP, M_WAITOK); > + > +#ifdef __notyet__ > + /* > + * Check if the filesystem sets f_maxfilesize; if not use > + * VOP_SETATTR to perform the check. > + */ > + error =3D VFS_STATFS(vp->v_mount, &sfs, td); > + if (error !=3D 0) > + goto out; > + if (sfs.f_maxfilesize) { > + if (offset > sfs.f_maxfilesize || len > sfs.f_maxfilesize || > + offset + len > sfs.f_maxfilesize) { > + error =3D EFBIG; > + goto out; > + } > + } else > +#endif > + if (offset + len > vap->va_size) { > + VATTR_NULL(vap); > + vap->va_size =3D offset + len; > + error =3D VOP_SETATTR(vp, vap, td->td_ucred); > + if (error !=3D 0) > + goto out; > + } I still do not see a reason to do VOP_SETATTR() there. VOP_WRITE() will do auto-extend as needed. Also, see below. > + > + while (len > 0) { > + if (should_yield()) { > + VOP_UNLOCK(vp, 0); > + locked =3D 0; > + kern_yield(-1); Please note that, despite dropping the vnode lock, the snapshot creation is still blocked at this point, due to previous vn_start_write(). If doing vn_finished_write() there, then bwillwrite() before next iteration is desired. > + error =3D vn_lock(vp, LK_EXCLUSIVE); > + if (error !=3D 0) > + break; > + locked =3D 1; > + error =3D VOP_GETATTR(vp, vap, td->td_ucred); > + if (error !=3D 0) > + break; > + } > + > + /* > + * Read and write back anything below the nominal file > + * size. There's currently no way outside the filesystem > + * to know whether this area is sparse or not. > + */ > + cur =3D iosize; > + if ((offset % iosize) !=3D 0) > + cur -=3D (offset % iosize); > + if (cur > len) > + cur =3D len; > + if (offset < vap->va_size) { > + aiov.iov_base =3D buf; > + aiov.iov_len =3D cur; > + auio.uio_iov =3D &aiov; > + auio.uio_iovcnt =3D 1; > + auio.uio_offset =3D offset; > + auio.uio_resid =3D cur; > + auio.uio_segflg =3D UIO_SYSSPACE; > + auio.uio_rw =3D UIO_READ; > + auio.uio_td =3D td; > + error =3D VOP_READ(vp, &auio, 0, td->td_ucred); > + if (error !=3D 0) > + break; > + if (auio.uio_resid > 0) { > + bzero(buf + cur - auio.uio_resid, > + auio.uio_resid); > + } > + } else { > + bzero(buf, cur); > + } Wouldn't VOP_SETATTR() at the start of the function mostly prevent this bzero from executing ? > + > + aiov.iov_base =3D buf; > + aiov.iov_len =3D cur; > + auio.uio_iov =3D &aiov; > + auio.uio_iovcnt =3D 1; > + auio.uio_offset =3D offset; > + auio.uio_resid =3D cur; > + auio.uio_segflg =3D UIO_SYSSPACE; > + auio.uio_rw =3D UIO_WRITE; > + auio.uio_td =3D td; > + > + error =3D VOP_WRITE(vp, &auio, 0, td->td_ucred); > + if (error !=3D 0) > + break; > + > + len -=3D cur; > + offset +=3D cur; > + } > + > + out: > + KASSERT(locked || error !=3D 0, ("How'd I get unlocked with no error?")= ); > + if (locked && error !=3D 0) > + VOP_UNLOCK(vp, 0); > + free(buf, M_TEMP); > + return (error); > +} I estimated what it would take to do the optimized implementation for UFS, and I think that the following change would allow to lessen the code duplication much. What if the vnode lock drop and looping be handled by the syscall, instead of the vop implementation ? In other words, allow the VOP_ALLOCATE() to allocate less then requested, and return the allocated amount to the caller. The loop would be centralized then, freeing fs from doing the dance. Also, if fs considers that suitable, it would do a whole allocation in one run. --FT/vNgDLxVq4t/AU Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk2skMoACgkQC3+MBN1Mb4jHIwCgpCqlFUXXeEEMX1WNBJN+ulTm Rq0AnRKY8W0QQDL+ehilruivUg3J/Ot1 =+QkD -----END PGP SIGNATURE----- --FT/vNgDLxVq4t/AU--