From owner-freebsd-fs@FreeBSD.ORG Tue Apr 5 19:03:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9187106566B for ; Tue, 5 Apr 2011 19:03:33 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail09.syd.optusnet.com.au (mail09.syd.optusnet.com.au [211.29.132.190]) by mx1.freebsd.org (Postfix) with ESMTP id 634D18FC0A for ; Tue, 5 Apr 2011 19:03:32 +0000 (UTC) Received: from c122-106-155-58.carlnfd1.nsw.optusnet.com.au (c122-106-155-58.carlnfd1.nsw.optusnet.com.au [122.106.155.58]) by mail09.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p35J3TeU017000 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 6 Apr 2011 05:03:30 +1000 Date: Wed, 6 Apr 2011 05:03:29 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Kostik Belousov In-Reply-To: <20110405141631.GA78089@deviant.kiev.zoral.com.ua> Message-ID: <20110406034131.L2689@besplex.bde.org> References: <20110405141631.GA78089@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: Knob to turn off _POSIX_NO_TRUNC X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Apr 2011 19:03:33 -0000 On Tue, 5 Apr 2011, Kostik Belousov wrote: > From very old and gloomy SysV times I remembered filesystem behaviour > that silently truncated the file name components to the NAME_MAX limit, > that was, AFAIR, 14. To much of my dismay, I met some usermode software > recently that blindly tried to create the file from externally provided > name, and sometimes failed with ENAMETOOLONG in similar situation. > The authors are not cooperative. Apply rmrf. > I ended up with the following hack, which almost turns off the > _POSIX_NO_TRUNC behaviour, globally on the system. Patch allowed me > to proceed. The cost in the default case is a single check, which is > performed only on ENAMETOOLONG path. Please don't commit this. This is already handled correctly for gloomy file systems by returning 0 from pathconf(path, _PC_NO_TRUNC) for ones that have short names, although the global definitions of _POSIX_NO_TRUNC and NAME_MAX have always been broken in FreeBSD. msdosfs does this even when it has long names. The main other file systems that do it are hpfs and ntfs, which I think shouldn't do it since they always have long names. Perhaps they do this for compatibility with WinDOS. FreeBSD defines _POSIX_NO_TRUNC as 1, which means that file names are _never_ truncated and is inconsistent what actual file systems like msdsofs do. FreeBSD defines NAME_MAX as a certain value, which means that the name length limit is _always_ this value and is inconsistent with the limit that actual file systems like msdosfs have. > diff --git a/sys/kern/vfs_lookup.c b/sys/kern/vfs_lookup.c > index 50a2570..e9e7697 100644 > --- a/sys/kern/vfs_lookup.c > +++ b/sys/kern/vfs_lookup.c > @@ -99,6 +99,11 @@ SYSCTL_INT(_vfs, OID_AUTO, lookup_shared, CTLFLAG_RW, &lookup_shared, 0, > "Enables/Disables shared locks for path name translation"); > TUNABLE_INT("vfs.lookup_shared", &lookup_shared); > > +static int lookup_trim; > +SYSCTL_INT(_vfs, OID_AUTO, lookup_trim, CTLFLAG_RW, &lookup_trim, 0, > + "Enables/Disables trim of the long path component instead of ENAMETOOLONG"); Enabling this is another way of breaking the guarantee that the compile-time constant _POSIX_NO_TRUNC gives the run-time behaviour. > +TUNABLE_INT("vfs.lookup_trim", &lookup_trim); It is a style bug (bloat) to have a tunable for something that doesn't need to be set for booting. > + > /* > * Convert a pathname into a pointer to a locked vnode. > * > @@ -514,8 +519,14 @@ dirloop: > continue; > cnp->cn_namelen = cp - cnp->cn_nameptr; > if (cnp->cn_namelen > NAME_MAX) { This hard coding of NAME_MAX gives various bugs which are now larger than before. Before, the only bug was that file systems that could support a {NAME_MAX} (= pathconf(path, _PC_NAME_MAX) longer than the compile-time constant NAME_MAX can't do it, since the vfs level rejects such names here. Some check is needed here to avoid overrunning the buffer, but the limit on the buffer size shouldn't the one for a broken implementation of the user API. With a non-broken implementation, NAME_MAX cannot be defined (except possibly visibly only in the kernel) and the limit here should be the maximum needed by all file systems. This can then be increased if a new file system needs it without breaking the user API, or it can be made larger than necessary to allow for expansion with incomplete coordination with new file systems. > - error = ENAMETOOLONG; > - goto bad; Even if the name length is not > NAME_MAX, the ENAMETOOLONG error or truncation may still occur in inidividualy file systems. I doubt that all file systems handle this correctly. For example, msdosfs never returns ENAMETOOLONG, and this is correct since msdosfs advertises {_PC_NO_TRUNC} = 0. But {_PC_NO_TRUNC} = 0 is wrong since the above implements _PC_NO_TRUNC = 1 unconditionally if the the name length exceeds NAME_MAX. > + if (!lookup_trim) { > + error = ENAMETOOLONG; > + goto bad; > + } > + ndp->ni_pathlen -= cnp->cn_namelen - NAME_MAX; > + cnp->cn_namelen = NAME_MAX; > + strcpy(cnp->cn_nameptr + cnp->cn_namelen, cp); > + cp = cnp->cn_nameptr + cnp->cn_namelen; This makes {_PC_NO_TRUNC} as perfectly broken as _POSIX_NO_TRUNC for file systems that implement _POSIX_NO_TRUNC, since the patch doesn't touch individual file systems so all the ones that return 1 for pathconf(path, _PC_NO_TRUNC) become inconsistent with their actual behavior. If you fix this, then the patch becomes correct but uglier. > } > #ifdef NAMEI_DIAGNOSTIC > { char c = *cp; > The code for the pathconf() vop in individual file systems has considerable ugliness related to this. Most file systems want the defaults for everything related to pathnames, but the implementation provides no way to get the defaults from another layer, so most vops duplicate a not-so-big case statement. OTOH vop_stdpathconf() provides a "default" for the whole vop which is very nonstandard and unsuitable as a default for almost everything in it that is not related to pathnames. It also has large style bugs (indentation errors) which were not present in the place it was copied from. It is essentially devfs's patconf vop, and it now seems to be used only by coda, devfs, fdescfs, portalfs and zfs. The use of it in coda, devfs and zfs shows how it can be be used as a default after all: start with an fs-specific vop and only use vop_stdpathconf() for cases that you don't want to override. But vop_stdpathconf() is unsuitable for this, since it provides many cases that are correct for at most devfs or only for a few file systems. These are: _PC_LINK_MAX = LINK_MAX: wrong because most file systems don't support hard links _PC_MAX_CANON = MAX_CANON _PC_MAX_INPUT = MAX_INPUT _PC_VDISABLE = POSIX_VDISABLE: wrong because only devfs supports device files wrong for most devfs files too, since most device files are not ttys MAX_CANON and MAX_INPUT are also wrong _PC_PIPE_BUF = PIPE_BUF: wrong because most file systems don't support named pipes or sockets PIPE_BUF may also be wrong for sockets (I think it is a variable depending on watermarks which can be set by ioctls) After removing these, only the pathname-related cases _PC_NAME_MAX and _PC_NO_TRUNC, and _PC_CHOWN_RESTRICTED are left. Perhaps it's not worth having a default for just these. ufs_pathconf() is now quite large and ugly, with many ufs-specific cases that can't be defaulted, and almost total disorder for the cases. It wouldn't benefit from defaulting just 3 cases. It has complications related to fifos. This seems to result in the pathconf selectors for pathnames failing for pathnames that refer to a fifo. Whether they work is implementation-defined (they are only required to work for pathnames that refer to directories), but FreeBSD normally makes them work for all types of files. Bruce