Date: Fri, 31 Oct 2003 11:13:36 -0800
From: andi payn <andi_payn@speedymail.org>
To: Terry Lambert <tlambert2@mindspring.com>
Cc: freebsd-hackers@freebsd.org
Subject: Re: O_NOACCESS?
Message-ID: <1067627608.825.56.camel@verdammt.falcotronic.net>
In-Reply-To: <3FA22930.C6EC97A9@mindspring.com>
References: <1067528798.36829.2128.camel@verdammt.falcotronic.net> <3FA22930.C6EC97A9@mindspring.com>
On Fri, 2003-10-31 at 01:19, Terry Lambert wrote:
> andi payn wrote:
> > As far as I can tell, FreeBSD doesn't have anything equivalent to
> > linux's O_NOACCESS (which is not in any of the standard headers, but
> > it's equal to O_WRONLY | O_RDWR, or O_ACCMODE). In linux, this can be
> > used to say, "give me an fd for this file, but don't try to open it for
> > reading or writing or anything else."
>
> The standard does not permit this.
>
> First off, O_ACCMODE is a bitmask, and is guaranteed to be
> inclusive of the bits for O_RDONLY, O_WRONLY, and O_RDWR, but
> *not* guaranteed to not be inclusive of additional bits,
> reserved or locally defined but outside the _POSIX_SOURCE
> namespace. By using this value as a parameter, you could very
> well be setting many more bits, and you could be setting bits
> for local implementation options that you really, really do
> not want to set.

Now hold on. The standard (by which I assume you mean POSIX, or one of
the UNIX standards?) doesn't say that you can't have an additional flag
called O_NOACCESS with whatever value and meaning you want. Obviously,
code that relies on such a flag will be non-portable, since no standard
defines such a flag, but that's fine, since the intended uses (writing a
FreeBSD-specific backend for fam, for example) aren't expected to be
portable anyway.

If O_NOACCESS happens to be == O_ACCMODE on FreeBSD--just as it is on
linux--and if that happens to also be == O_WRONLY | O_RDWR (with no
other flags set), I don't see how that changes anything.

> Second, the standard is ambiguous as to how O_RDWR is defined;
> it is perfectly permissible to define these values as:
>
> #define O_RDONLY 1                      /* the read bit */
> #define O_WRONLY 2                      /* the write bit */
> #define O_RDWR   (O_RDONLY|O_WRONLY)    /* read + write */
>
> In which case, your example is (O_RDWR|O_WRONLY) == O_RDWR.
> The standard does not indicate whether the implementation is to use
> bits, or sequential manifest constants, only that the bits that
> make up the constants be in the range covered by O_ACCMODE.

First, again, this is intended to be used for non-portable code, and
therefore, the fact that this happens not to be true on FreeBSD means
it's irrelevant that it could be true elsewhere. Especially since, if
O_NOACCESS were added to FreeBSD, it would still fail to exist entirely
on other platforms, which means it matters little what value it might
have if it did exist--code written to use O_NOACCESS won't compile on
platforms without O_NOACCESS.

Second, any platform that defines O_NOACCESS could do so differently. On
FreeBSD, as on linux, the most sensible definition is O_NOACCESS ==
O_WRONLY | O_RDWR == 3. On a platform that defined O_RDONLY as 1 and
O_WRONLY as 2, the most sensible definition would be O_NOACCESS == 0.

> In fact, externally, they are bits, but internally, in the kernel,
> they are manifest constants.

Yes, FFLAGS and OFLAGS convert between the two. If you look at how this
works in the linux kernel, you'll see that O_RDONLY (0) converts to
FREAD (1); O_WRONLY (1) to FWRITE (2); O_RDWR (2) to FREAD | FWRITE (3);
and O_NOACCESS (3) to 0. This could be done the same way in FreeBSD.*

* Actually, this is a tiny lie; linux has a 2-bit internal access flags
value which it derives in this way, and uses the original passed-in
flags for everything except access. FreeBSD instead just adds 1, relying
on the fact that the lower 2 bits will never be 3, and therefore all of
the other bits will stay the same. This means that enabling this value
would make the FFLAGS and OFLAGS macros slightly more complicated on
FreeBSD.

> > This allows you to get an fd to pass to fcntl (e.g., for dnotify), or
> > call ioctl's on, etc.--even if you don't have either read or write
> > access to the file. The obvious question is, "Why should this ever be
> > allowed?"
> > Well, if you can stat the file, why can't you, e.g., ask
> > kevent to monitor it?
>
> The most useful thing you could do with this, IMO, is open a directory
> for fchdir().

Except that you can already do exactly this with chdir(). But I can see
that you might at some point want to check the directory before
chdir'ing to it, or pass an fd down into some function instead of a
string, and this would be useful in such a case.

> Of course, allowing this on directories for which you
> are normally denied read/write permissions would be a neat way to
> escape from chroot'ed environments and compromise a host system...

How would it allow that? If you can open files outside your chroot
environment--even files you would otherwise have read access to--it's
not much of a chroot!

To put this all together:

  - chdir already allows you to change to directories for which you
    have execute permissions but not read or write;
  - fchdir already prevents you from changing to directories for which
    you don't have execute permissions; and
  - any path you can pass to open you could pass to chdir instead.

Therefore:

    fd = open(path, O_NOACCESS);
    fchdir(fd);
    close(fd);

allows exactly the same exploits as:

    chdir(path);

So, if this would allow you to escape chroot, then the kernel is heavily
flawed and chdir is an exploit waiting to happen....

What if you inherit the fd from an ancestor? Well, so what? You could
just as easily have inherited it with O_RDONLY as O_NOACCESS, and the
"exploit" is exactly the same in either case.

> > In FreeBSD, this doesn't work; you just get EINVAL.
> >
> > Having O_NOACCESS would be useful for the fam port, for porting pieces
> > of lilo, and probably for other things I haven't thought of yet. (I
> > believe that either this was added to linux to support lilo, or the open
> > syscall just happened to work this way, and once the lilo developers
> > discovered this and took advantage of it, it's been retained that way
> > ever since to keep lilo working.)
>
> The latter is most likely.

Actually, you'd be surprised at how much has been explicitly added to
the kernel (and, more, to the filesystem code, especially reiser) for
lilo's benefit.

> In any case, this would not be allowed
> by GEOM for the purpose to which LILO is trying to put it, unless
> you were to modify GEOM to add a control path for parents of
> already opened devices.

I think that in the case you're thinking of, lilo is actually passing
-1, which has a special meaning to the linux kernel ("pass all ioctl's
through"), rather than 3 (but both end up as !FREAD, !FWRITE if you try
to use the fd for anything normal).

Anyway, I wasn't proposing to make O_NOACCESS work for this purpose. As
a quick glance will show, successfully opening the fd wouldn't do lilo
any good if all it wants to do is call a dozen ioctl's that FreeBSD's
devices have never heard of....

> If you did this, you might as well just
> add a proper set of abstract fcntl's to GEOM, and get rid of all
> the raw disk crap in user space, and unbreak disklabel and the other
> stuff that GEOM broke when it went in.

Sure, I'll agree that it would probably be better to expand GEOM than to
hack it for lilo's benefit.

> > On the other hand, BSD has done without it for many years, and there's
> > probably a good reason it's never been added. So, what is that good
> > reason?
>
> fcntl.h:
> #define FFLAGS(oflags) ((oflags) + 1)

Yes, I mentioned that before. Unlike linux, FreeBSD depends on the fact
that, e.g., O_NONBLOCK is the same in FFLAGS(flags) as in flags. This is
only true because (flags & 3) != 3 is guaranteed, and therefore
((flags + 1) & 4) == (flags & 4) is also guaranteed.

So, this would require either using something more complicated for
FFLAGS (and OFLAGS), or changing a few lines of code where this
assumption is made.

> > I don't think there's a backwards-compatibility issue.
>
> Unfortunately, yes, there is. The values are not bits, internally
> to the kernel.
> The conversion to internal form merely adds 1, it
> doesn't shift the values.

Well, OK, let me restate this: I don't think there's a
backwards-compatibility issue if this is done properly. Just removing
the one-line check against O_WRONLY | O_RDWR would not work; therefore,
I suppose there would be a backwards-compatibility issue, in the sense
that a broken kernel is not backwards-compatible with a working one....
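
To make the FFLAGS point concrete, here is a minimal sketch in C. Every
sk_/SK_ name is hypothetical, and the constants simply mirror the values
discussed above (O_RDONLY 0, O_WRONLY 1, O_RDWR 2, O_NONBLOCK 4). This
is not FreeBSD's code, only one way the conversion could confine the +1
to the two access bits so that an access mode of 3 maps to "neither
FREAD nor FWRITE" instead of carrying into the O_NONBLOCK bit.

```c
#include <assert.h>

/* Hypothetical constants for this sketch; values as discussed in the thread. */
enum {
    SK_O_RDONLY   = 0,
    SK_O_WRONLY   = 1,
    SK_O_RDWR     = 2,
    SK_O_NOACCESS = 3,  /* proposed value: O_WRONLY | O_RDWR */
    SK_O_ACCMODE  = 3,  /* mask covering the two access bits */
    SK_O_NONBLOCK = 4,
    SK_FREAD      = 1,
    SK_FWRITE     = 2
};

/* The conversion quoted from fcntl.h: simply add 1. */
int sk_fflags_plus1(int oflags)
{
    return oflags + 1;
}

/* Sketch: add 1 modulo 4 inside the access bits, leave other bits alone. */
int sk_fflags(int oflags)
{
    int acc = ((oflags & SK_O_ACCMODE) + 1) & SK_O_ACCMODE;
    return (oflags & ~SK_O_ACCMODE) | acc;
}

/* Sketch of the inverse: subtract 1 modulo 4 inside the access bits. */
int sk_oflags(int fflags)
{
    int acc = ((fflags & SK_O_ACCMODE) + 3) & SK_O_ACCMODE;
    return (fflags & ~SK_O_ACCMODE) | acc;
}

/* Checks the mappings described in the thread. */
void sk_selfcheck(void)
{
    assert(sk_fflags(SK_O_RDONLY) == SK_FREAD);
    assert(sk_fflags(SK_O_WRONLY) == SK_FWRITE);
    assert(sk_fflags(SK_O_RDWR) == (SK_FREAD | SK_FWRITE));
    assert(sk_fflags(SK_O_NOACCESS) == 0);
    /* With plain +1, access mode 3 carries into the O_NONBLOCK bit. */
    assert(sk_fflags_plus1(SK_O_NOACCESS) == SK_O_NONBLOCK);
}
```

With the plain +1 form, a caller passing 3 would look to the kernel like
a non-blocking open with no access bits set, which is why the existing
check rejects it. The masked form keeps every other flag bit unchanged
in both directions.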