Date: Sat, 6 Jun 2009 20:31:51 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: Brooks Davis <brooks@FreeBSD.org> Cc: arch@FreeBSD.org, current@FreeBSD.org Subject: Re: RFT: Allow large values of NGROUPS_MAX Message-ID: <20090606184344.N16744@delplex.bde.org> In-Reply-To: <20090605223636.GA24364@lor.one-eyed-alien.net> References: <20090605223636.GA24364@lor.one-eyed-alien.net>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 5 Jun 2009, Brooks Davis wrote: > I've been working on fixing the limit of 15 groups per user. This has > primarily consisted of merging a patch from Isilon Systems which breaks > the group array out of struct ucred and stores in in malloc'd storage. > > I've attached a patch which includes that change and increases the value > of NGROUPS_MAX to 32767. It also changes references to cr_groups[0] to > use the cr_gid macro ... I don't like this macro. At least the implementation should keep using the unobfuscated reference. kern_prot.c consistently didn't use the obfuscation except once in a comment. > Before any merge a couple decisions need to be made: > > - How large should NGROUPS_MAX be? Linux uses 65536 and we could > extend things to support that, but it's probably unnecessary. 32767 seems excessive too. Hopefully NGROUPS[_MAX] is used only in broken unconverted applications so a large value rarely wastes space. > - Should we make any attempt to support old binaries when there > are more than 16 groups? There is no way to support broken old binaries. They will have a limit of 16. > The POSIX getgroups/setgroups APIs did not > anticipate this change and thus either will fail outright. No, as you explained in previous mail, POSIX anticipated this perfectly unusably by making NGROUPS_MAX a Runtime Increasable Value despite it being constant. This puts the burden of not failing on applications. Only broken apps that don't do any of dynamic allocation using sysconf(_SC_NGROUPS_MAX), or dynamic allocation using getgroups(0, ...), or dynamic allocation after failing using getgroups(old_NGROUPS_MAX, ...) will fail outright. > We can't > fix setgroups, but we might want to make an optional accommodation for > getgroups to allow for truncated returns to old code. If done unconditionally, this would break the unbroken apps that do dynamic allocation after failing using getgroups(old_NGROUPS_MAX, ...). These apps depend on getgroups() failing with the Standard and documented errno EINVAL when the `gidsetlen' parameter is too small. Maybe some per-app or per-environment branding could be used to configure not generating an error. You would apply it only where security is not very important. getgrouplist(3) guarantees filling of the array when the array is too small. getgroups(2) doesn't guarantee anything, and in fact doesn't fill the array. It wouldn't hurt to fill it. % ... % diff -ru --exclude='.glimpse*' --exclude=.svn --exclude=compile --ignore-matching='$FreeBSD' /usr/src/sys/kern/kern_prot.c ngroups/sys/kern/kern_prot.c % --- /usr/src/sys/kern/kern_prot.c 2009-06-05 15:33:50.000000000 -0500 % +++ ngroups/sys/kern/kern_prot.c 2009-06-05 16:02:28.000000000 -0500 % @@ -243,16 +246,11 @@ % % td->td_retval[0] = td->td_ucred->cr_rgid; % #if defined(COMPAT_43) % - td->td_retval[1] = td->td_ucred->cr_groups[0]; % + td->td_retval[1] = td->td_ucred->cr_gid; % #endif % return (0); % } % % -/* % - * Get effective group ID. The "egid" is groups[0], and could be obtained % - * via getgroups. This syscall exists because it is somewhat painful to do % - * correctly in a library function. % - */ % #ifndef _SYS_SYSPROTO_H_ % struct getegid_args { % int dummy; This comment still seems to apply. I wonder if you noticed and/or fixed the related off-by-1 errors in getgroups() and {NGROUPS_MAX}: POSIX.1-2001-draft7 says: % 16834 IDs of the calling process. It is implementation-defined whether getgroups( ) also returns the % 16835 effective group ID in the grouplist array. FreeBSD does return the egid in the array, always as the first element. Neither of these details is documented in getgroups.2 or setgroups.2. Even the implementation doesn't seem to understand this where it is most important -- the implementation of setgroups() seems to be missing some of the things done by setegid(). getgrouplist(3) has explicit support for the egid being in both the initial and final lists (or missing in the initial list?). % ... % 16841 If the effective group ID of the process is returned with the supplementary group IDs, the value % 16842 returned shall always be greater than or equal to one and less than or equal to the value of % 16843 {NGROUPS_MAX}+1. This and probably other things indicate that _supplementary_ actually means supplementary -- it doesn't count the egid. But in FreeBSD, the egid is counted. This gives 2 off-by-1 errors that partially compensate for each other: - NGROUPS_MAX is 15, not 16, since it is impossible to have 16 _supplementary_ groups in cr_groups[0..15] after using cr_groups[0] for the egid - getgroups() cannot return {NGROUPS_MAX}+1 as is needed for it to return the egid plus {NGROUPS_MAX} supplementary groups - getgroups() does return {NGROUPS_MAX}+1 once {NGROUPS_MAX} is corrected. Some userland code depends on these bugs -- the static array size should be NGROUPS_MAX + 1, but it is typically plain NGROUPS. OTOH, libc/gen/initgroups.c uses NGROUPS + 1 and getgrouplist() followed by setgroups(). I think getgrouplist() can produce the extra 1 and then setgroups() and thus initgroups() will fail. I thought again about making cr_gid separate from the array, so that it can be named cr_gid without obfuscation. This wouldn't be good since it would complicate more than the copyin/out in get/setgroups() -- there are a few temporary gid arrays, and some iterations over the arrays that want to see the egid. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20090606184344.N16744>