From owner-freebsd-hackers@FreeBSD.ORG Sat Nov 16 18:31:42 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B23F594F; Sat, 16 Nov 2013 18:31:42 +0000 (UTC) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 247792E42; Sat, 16 Nov 2013 18:31:41 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.7/8.14.7) with ESMTP id rAGIVVZ4055440; Sat, 16 Nov 2013 20:31:31 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.8.3 kib.kiev.ua rAGIVVZ4055440 Received: (from kostik@localhost) by tom.home (8.14.7/8.14.7/Submit) id rAGIVTKO055439; Sat, 16 Nov 2013 20:31:29 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Sat, 16 Nov 2013 20:31:29 +0200 From: Konstantin Belousov To: Doug Ambrisko Subject: Re: Re: Fix MNAMELEN or reimplement struct statfs Message-ID: <20131116183129.GD59496@kib.kiev.ua> References: <51B3B59B.8050903@erdgeist.org> <201306101152.17966.jhb@freebsd.org> <52854161.6080104@FreeBSD.org> <20131115010854.GA76106@ambrisko.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="0vFzRikYu/73UBJt" Content-Disposition: inline In-Reply-To: <20131115010854.GA76106@ambrisko.com> User-Agent: Mutt/1.5.22 (2013-10-16) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: freebsd-hackers@freebsd.org, Dirk Engling , Jase Thew , mdf@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 16 Nov 2013 18:31:42 -0000 --0vFzRikYu/73UBJt Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Nov 14, 2013 at 05:08:54PM -0800, Doug Ambrisko wrote: > On Thu, Nov 14, 2013 at 09:32:17PM +0000, Jase Thew wrote: > | On 10/06/2013 16:52, John Baldwin wrote: > | > On Saturday, June 08, 2013 9:36:27 pm mdf@freebsd.org wrote: > | >> On Sat, Jun 8, 2013 at 3:52 PM, Dirk Engling = wrote: > | >> > | >>> The arbitrary value > | >>> > | >>> #define MNAMELEN 88 /* size of on/from name buf= s */ > | >>> > | >>> struct statfs { > | >>> [...] > | >>> char f_mntfromname[MNAMELEN];/* mounted filesystem */ > | >>> char f_mntonname[MNAMELEN]; /* directory on which mount= ed */ > | >>> }; > | >>> > | >>> currently bites us when trying to use poudriere with errors like > | >>> > | >>> 'mount: tmpfs: File name too long' > | >>> > | >>> > | >>> /poudriere/data/build/91_RELEASE_amd64-REALLY-REALLY-LONG- > | > JAILNAME/ref/wrkdirs > | >>> > | >>> The topic has been discussed several times since 2004 and has been > | >>> postponed each time, the last time when it hit zfs users: > | >>> > | >>> http://lists.freebsd.org/pipermail/freebsd-fs/2010-March/007974.html > | >>> > | >>> So I'd like to point to the calendar, it's 2013 already and there's > | >>> still a static arbitrary (and way too low) limit in one of the core > | >>> areas of the vfs code. > | >>> > | >>> So I'd like to bump the issue and propose either making f_mntfromna= me a > | >>> dynamic allocation or just increase MNAMELEN, using 10.0 as water s= hed. > | >>> > | >> > | >> Gleb Kurtsou did this along with the ino64 GSoC project. Unfortunat= ely, > | >> both he and I hit ENOTIME due to the job that pays the bills and it= 's > | >> never made it back to the main repository. > | >> > | >> IIRC, though, the only reason for doing it with 64-bit ino_t is that= he'd > | >> already finished changing the stat/dirent ABI so what was one more. = I > | >> think he went with 1024 bytes, which also necessitated not allocating > | >> statfs on the stack for the kernel. > | >=20 > | > He also fixed a few other things since changing this ABI is so invasi= ve > | > IIRC. This really is the right fix for this. Is it in an svn branch= =20 > | > that can be updated and a new patch generated? > | >=20 > |=20 > | Hi folks, > |=20 > | Has there been any progress on addressing this issue? With the advent of > | pkgng / poudriere, this limitation is really becoming a frustrating pro= blem. >=20 > I looked at NetBSD and what they did with statvfs. The mount paths > lengths are bigger in NetBSD defines so that helps. However, when > testing it out via a script that keep on doing a nullfs mount in=20 > every increasing directory depth I found that NetBSD would allow the > mount to exceed the value in statvfs. When NetBSD populates the path > in statvfs they truncate it to what fits in statvfs. So I looked at > what that might be like in FreeBSD. So I came up with this simple patch: >=20 > --- /sys/kern/vfs_mount.c 2013-10-01 14:27:35.000000000 -0700 > +++ vfs_mount.c 2013-10-21 14:20:19.000000000 -0700 > @@ -656,7 +656,7 @@ vfs_donmount(struct thread *td, uint64_t > * variables will fit in our mp buffers, including the > * terminating NUL. > */ > - if (fstypelen >=3D MFSNAMELEN - 1 || fspathlen >=3D MNAMELEN - 1) { > + if (fstypelen >=3D MFSNAMELEN - 1 || fspathlen >=3D MAXPATHLEN - 1) { > error =3D ENAMETOOLONG; > goto bail; > } > @@ -748,8 +748,8 @@ sys_mount(td, uap) > return (EOPNOTSUPP); > } > =20 > - ma =3D mount_argsu(ma, "fstype", uap->type, MNAMELEN); > - ma =3D mount_argsu(ma, "fspath", uap->path, MNAMELEN); > + ma =3D mount_argsu(ma, "fstype", uap->type, MFSNAMELEN); > + ma =3D mount_argsu(ma, "fspath", uap->path, MAXPATHLEN); > ma =3D mount_argb(ma, flags & MNT_RDONLY, "noro"); > ma =3D mount_argb(ma, !(flags & MNT_NOSUID), "nosuid"); > ma =3D mount_argb(ma, !(flags & MNT_NOEXEC), "noexec"); > @@ -1039,7 +1039,7 @@ vfs_domount( > * variables will fit in our mp buffers, including the > * terminating NUL. > */ > - if (strlen(fstype) >=3D MFSNAMELEN || strlen(fspath) >=3D MNAMELEN) > + if (strlen(fstype) >=3D MFSNAMELEN || strlen(fspath) >=3D MAXPATHLEN) > return (ENAMETOOLONG); > =20 > if (jailed(td->td_ucred) || usermount =3D=3D 0) { > @@ -1095,9 +1095,9 @@ vfs_domount( > NDFREE(&nd, NDF_ONLY_PNBUF); > vp =3D nd.ni_vp; > if ((fsflags & MNT_UPDATE) =3D=3D 0) { > - pathbuf =3D malloc(MNAMELEN, M_TEMP, M_WAITOK); > + pathbuf =3D malloc(MAXPATHLEN, M_TEMP, M_WAITOK); > strcpy(pathbuf, fspath); > - error =3D vn_path_to_global_path(td, vp, pathbuf, MNAMELEN); > + error =3D vn_path_to_global_path(td, vp, pathbuf, MAXPATHLEN); > /* debug.disablefullpath =3D=3D 1 results in ENODEV */ > if (error =3D=3D 0 || error =3D=3D ENODEV) { > error =3D vfs_domount_first(td, vfsp, pathbuf, vp, > @@ -1147,8 +1147,8 @@ sys_unmount(td, uap) > return (error); > } > =20 > - pathbuf =3D malloc(MNAMELEN, M_TEMP, M_WAITOK); > - error =3D copyinstr(uap->path, pathbuf, MNAMELEN, NULL); > + pathbuf =3D malloc(MAXPATHLEN, M_TEMP, M_WAITOK); > + error =3D copyinstr(uap->path, pathbuf, MAXPATHLEN, NULL); > if (error) { > free(pathbuf, M_TEMP); > return (error); > @@ -1181,7 +1181,7 @@ sys_unmount(td, uap) > vfslocked =3D NDHASGIANT(&nd); > NDFREE(&nd, NDF_ONLY_PNBUF); > error =3D vn_path_to_global_path(td, nd.ni_vp, pathbuf, > - MNAMELEN); > + MAXPATHLEN); > if (error =3D=3D 0 || error =3D=3D ENODEV) > vput(nd.ni_vp); > VFS_UNLOCK_GIANT(vfslocked); >=20 > I seemed to have found a typo bug in an instance in which MFSNAMELEN > wasn't used in the fstype when I did this change. >=20 > With this patch things in general seem to work. You can do a > mount and umount of a long path. The umount of the long path works > by failing on the exact match but then passing when via the FSID. > df/mount looks a little strange since it shows a truncated path=20 > but has valid contents (FS type, space etc.). umount via the truncated > path works if there is only one truncated path that matches. If there > are multiple then it fails. >=20 > This doesn't change and kernel ABI's so then it is safe to apply to the > kernel without rebuilding user-land. >=20 > Future work could be to implement statvfs to return a longer path but > only do it for df/umount etc. The rest of the system could continue > with the existing statfs. mount works because it passed a string into > the kernel so it can be long. >=20 > I'd propose this as a current solution to this problem. It appears to > work well and shouldn't drastically break things. Doing df via the > full path, stat etc. work since the associated path access the vnode. > So things that do a mount, df of the mount point etc. should continue > to work. Scripts that try to figure out the mount points vi df and mount > displaying all mount points would fail. That is probably good enough for > now. >=20 > Comments welcomed. Generally, I agree with the approach, but what is done seems to be too simple to be usable. One obvious and important thing which is broken with the patch is the unmounts from jails. In other words, now it is possible to mount something from jail with appropriate privileges set up, and after that corresponding mount cannot be unmounted, since vfs_mount_alloc() copies trimmed path into f_mntonname, and sys_unmount() matches full path with pathbuf. Hmm, this should be broken in the same way for non-jailed mounts with pathes which do not fit into f_mntonname. I think that struct mount should have a const char * field where the non-trimmed path is stored and used for match at unmount. f_mntonname truncation would be only unfortunate user interface glitch. --0vFzRikYu/73UBJt Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (FreeBSD) iQIcBAEBAgAGBQJSh7oAAAoJEJDCuSvBvK1B/8QP/izzxCeGUxj94MJEfRZhG08X xNy2X5ZxLNO9m2iCa72648bzJxpnJvKiLijj4ZsVTC/vX5O3vOCtP1B6Kve86Xpe 6QqXLLQfSRQFVeAVBA2caH+eGveBbr/71ED1AnVgqY2a69FNhSVPj5YgHOjVcplr o17tv2/f4pUaKVTCH5Qlkypbcrz6CmlTjwj7Z/qRFddf9tObl75JGGfptaHg+Uhy ur/R+6vFVF+iKpjMTZmtMQT4pA8FqwfbCicgz+baIO19ie0h0eWZOLqe9rL4nK4/ 3B9Ux2PBd0y+CxbHg+TbEJfuWKish6FWJrT47Nzk4Wufs8y7R58Rm5EvGGRKokzZ P+HUUTWox1HayOpE4glQYhy1B2J/ABVmVD2iscFmyZHxFFzKNem1Q2PpZkpj4EaH yuj0B4RNqm/p9N0AKnyIzzd92Ouhf8Kx+k9HIbIpjBgQ1fGwk+ANnuBtIA65e/Af l1R4DtbglqNipcXxT49ahaI71uAQZCwgkoDSyyOXYWJHpqhRxblQsRdD4rYVreIb VQxATOJDAWvxZ8+ariz8vc9Tf+AFhsjGexvtC6bgVoKZqrh9WgbEjIo3zf1UuM6Q ynQGThMhDyTTXdxpXq1C6rvDSRzvflUMHQJn3ByjpWZlndki8nFX95R9CzzP21Vt /jfbVyPcUIWrkrNZ/aeA =FE3x -----END PGP SIGNATURE----- --0vFzRikYu/73UBJt--