Date: Wed, 14 Jan 2004 13:01:55 -0800 (PST)
From: Don Lewis <truckman@FreeBSD.org>
To: rwatson@FreeBSD.org
Cc: current@FreeBSD.org
Subject: Re: simplifying linux_emul_convpath()
Message-ID: <200401142101.i0EL1t7E040382@gw.catspoiler.org>
In-Reply-To: <Pine.NEB.3.96L.1040114102304.46433D-100000@fledge.watson.org>

On 14 Jan, Robert Watson wrote:
>
> On Wed, 14 Jan 2004, Don Lewis wrote:
>
>> I just stumbled across a vnode locking violation in
>> linux_emul_convpath().  Rather than locking and unlocking each vnode for
>> the VOP_GETATTR() calls, is there any reason that this code should not
>> be simplified to just compare the vnode pointers rather than fetching
>> the vnode attributes and comparing the attributes for equality.
>
> For some time, I've been thinking of adding samefile() and fsamefile()
> system calls to FreeBSD, which would allow userspace applications to
> determine if two names or file handles refer to the same object without
> playing games with inode numbers, device ids, etc.  The reason to do this
> would be that 32-bit inode numbers are subject to collision on large file
> systems.  My initial implementation simply compared vnode pointers, but
> that raises an interesting question about how stacked file systems should
> be treated, and depends a lot on the semantics of the stacked file system,
> really.  My leaning is that in general they should probably be treated as
> different objects if they have different vnodes, because with the
> exception of nullfs (and occasionally unionfs), that probably is the
> desired semantic.  You could imagine introducing a VOP to ask "Are you the
> same as this other vnode", and pointing it at both vnodes, but I think
> that adds unnecessary complexity without a whole lot of benefit.

The typical user of something like this would be tar when it is deciding
what to hardlink together.  One could make a case for making a
nullfs-mounted copy match the original (or two separately mounted nullfs
copies match each other).  That would do the "right" thing when archiving
a file tree containing nullfs mount points and untarring into a single
file system, except that it would confuse the heck out of tar because the
link counts would be wrong.  The VOP would be cheap, too.  But what about
a crypto or compression layer?

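
For reference, the userland check that the quoted text calls "playing
games with inode numbers, device ids" can be sketched roughly as follows.
This is only an illustration in Python, not FreeBSD code; the function
name same_file_by_ids is made up for the example, and it inherits the
limitation mentioned above (it is only as good as the inode numbers the
file system reports).

```python
import os

def same_file_by_ids(path_a, path_b):
    """Hypothetical helper: guess whether two names refer to the same object.

    Compares the (device id, inode number) pairs returned by stat(2).
    This is the userland approximation, not a vnode comparison, so a
    stacked file system that reports different ids yields False.
    """
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

A samefile() syscall as proposed would replace this comparison with a
kernel-side answer; how it should treat stacked layers is the open
question.
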
The problem for something like tar is that this mechanism doesn't scale
well.  When creating an archive, tar keeps a database of pathnames of
files that have more than one link, with the inode number as the key.
Each time it encounters a file with multiple links, it does a lookup in
the database.  If it finds a match, it outputs a record with the pathname
it found in the database, and if it doesn't find a match it adds a new
record to the database.  This can be done with reasonable efficiency in
userland.

If the only way of comparing whether two files were the same were to use
syscalls, it would be terribly slow.  Tar would only be able to keep a
list of the pathnames and would have to iterate through the list, doing
the syscall for each entry in search of a match to the current file it
was processing.  This is an O(n^2) problem with a syscall in the loop.
Tar might be able to narrow the search by matching file attributes, but
it would still be possible to have degenerate cases unless the inode
number were used as an attribute (which would not work if you wanted
nullfs copies to match).

There are programs that could make use of samefile(), such as cp.  It
would probably want a nullfs copy to match the original.
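
The inode-keyed database described above can be sketched as below.  This
is a minimal illustration in Python, not tar's actual code: the function
name hardlink_records is invented for the example, and keying on the
(device id, inode number) pair rather than the inode number alone is an
assumption made so that files on different file systems do not collide.

```python
import os
import stat

def hardlink_records(root):
    """Walk root and return (path, link_target) records.

    link_target is None for a file emitted normally, or the pathname
    first seen for the same (st_dev, st_ino) key when the file is a
    further link to something already recorded.
    """
    seen = {}       # (st_dev, st_ino) -> first pathname encountered
    records = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                 # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with more than one link need a lookup.
            if not stat.S_ISREG(st.st_mode) or st.st_nlink < 2:
                records.append((path, None))
                continue
            key = (st.st_dev, st.st_ino)
            first = seen.get(key)
            if first is None:
                seen[key] = path        # no match: add a new record
                records.append((path, None))
            else:
                records.append((path, first))   # match: emit a link record
    return records
```

Each file costs one stat call and one dictionary lookup, so the whole
walk is roughly linear in the number of files.  With only a pairwise
samefile()-style test available, the seen.get(key) lookup would have to
become a loop over every previously recorded pathname, which is where
the quadratic behaviour comes from.
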