From owner-freebsd-bugs@FreeBSD.ORG Tue Mar 1 20:40:25 2005 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8C56C16A50E for ; Tue, 1 Mar 2005 20:40:25 +0000 (GMT) Received: from freefall.freebsd.org (freefall.freebsd.org [216.136.204.21]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4176F43D53 for ; Tue, 1 Mar 2005 20:40:25 +0000 (GMT) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.13.1/8.13.1) with ESMTP id j21KeOaD001774 for ; Tue, 1 Mar 2005 20:40:24 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.13.1/8.13.1/Submit) id j21KeOom001773; Tue, 1 Mar 2005 20:40:24 GMT (envelope-from gnats) Date: Tue, 1 Mar 2005 20:40:24 GMT Message-Id: <200503012040.j21KeOom001773@freefall.freebsd.org> To: freebsd-bugs@FreeBSD.org From: Bruce Evans Subject: Re: bin/78087: groups program inconsistency X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: Bruce Evans List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 Mar 2005 20:40:25 -0000 The following reply was made to PR bin/78087; it has been noted by GNATS. From: Bruce Evans To: Russ Francis Cc: freebsd-gnats-submit@FreeBSD.org, dds@FreeBSD.org Subject: Re: bin/78087: groups program inconsistency Date: Wed, 2 Mar 2005 07:35:55 +1100 (EST) On Mon, 28 Feb 2005, Russ Francis wrote: > I saw this report and 78087 which looks similar and thought they looked curious. > > I have tried to duplicate this issue without luck and am running > > FreeBSD 5.3-RELEASE-p2 > > bash-2.05b$ sysctl kern.ngroups > kern.ngroups: 16 > > relevant portion of /etc/group > --- snip --- > foouser:*:1004: > foo1:*:20001:foouser > foo2:*:20002:foouser > foo3:*:20003:foouser > foo4:*:20004:foouser > foo5:*:20005:foouser > foo6:*:20006:foouser > foo7:*:20007:foouser > foo8:*:20008:foouser > foo9:*:20009:foouser > foo10:*:20010:foouser > foo11:*:20011:foouser > foo12:*:20012:foouser > foo13:*:20013:foouser > foo14:*:20014:foouser > #foo15:*:20015:foouser There is a longstanding off-by-1 bug or two concerning {NGROUPS_MAX}. POSIX specifies that {NGROUPS_MAX} gives the number of supplementary group ids *in addition* to the effective group id. {NGROUPS_MAX} is 16 in FreeBSD so you should be able to have up to foo16 in the above. This never worked in FreeBSD, so {NOGROUPS_MAX} is off-by-1. ISTR that there is another off-by-1 bug in setgroups(2) or in library callers of it. I forget the details. > su-2.05b$ whoami > foouser > su-2.05b$ id > uid=1004(foouser) gid=1004(foouser) groups=1004(foouser), 20001(foo1), > 20002(foo2), 20003(foo3), 20004(foo4), 20005(foo5), 20006(foo6), > 20007(foo7), 20008(foo8), 20009(foo9), 20010(foo10), 20011(foo11), > 20012(foo12), 20013(foo13), 20014(foo14) > su-2.05b$ id foouser > uid=1004(foouser) gid=1004(foouser) groups=1004(foouser), 20001(foo1), > 20002(foo2), 20003(foo3), 20004(foo4), 20005(foo5), 20006(foo6), > 20007(foo7), 20008(foo8), 20009(foo9), 20010(foo10), 20011(foo11), > 20012(foo12), 20013(foo13), 20014(foo14) > > When I uncomment the line '#foo15:*:20015:foouser' from /etc/group > I get the following error > > bash-2.05b$ su foouser > Password: > su: setusercontext: Invalid argument There now seems to be an off-by-2 bug. I have something like the above foo[1-15] in my /etc/group for just one user, for the purpose of running old POSIX tests (NIST PCTS) of {NGROUPS_MAX}. This used to work, but I now get the above error from su. login fails too, but doesn't print a message. The POSIX tests only worked because I misconfigured them to hide the longstanding off-by-one error: % SUPPLEMENTARY_GROUPS "0,1000,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,1013,1014" The account is supposed to have primary group 0 (wheel) and 16 supplementary groups 1000-1015, but #1015 doesn't work so I left it out of /etc/groups and fudged the tests to succeed by putting the primary group in the supplementary groups. The new bug is exposed by: % RCS file: /home/ncvs/src/lib/libc/gen/initgroups.c,v % Working file: initgroups.c % head: 1.8 % ... % ---------------------------- % revision 1.8 % date: 2003/11/19 15:51:26; author: dds; state: Exp; lines: +7 -2 % Fix problem where initgroups would silently truncate groups with % more than NGROUP elements without providing the opportunity to % setgroups to fail and correctly return error and set errno. % % MFC after: 2 weeks % ---------------------------- % % Index: initgroups.c % =================================================================== % RCS file: /home/ncvs/src/lib/libc/gen/initgroups.c,v % retrieving revision 1.7 % retrieving revision 1.8 % diff -u -2 -r1.7 -r1.8 % --- initgroups.c 1 Feb 2002 00:57:29 -0000 1.7 % +++ initgroups.c 19 Nov 2003 15:51:26 -0000 1.8 % ... % @@ -51,7 +51,12 @@ % gid_t agroup; % { % - int groups[NGROUPS], ngroups; % + int ngroups; % + /* % + * Provide space for one group more than NGROUPS to allow % + * setgroups to fail and set errno. % + */ % + gid_t groups[NGROUPS + 1]; This has some style bugs. % % - ngroups = NGROUPS; % + ngroups = NGROUPS + 1; % getgrouplist(uname, agroup, groups, &ngroups); % return (setgroups(ngroups, groups)); getgrouplist() puts `agroup' in groups[0] and groups[1], so there are only 14 slots left for the supplementary groups. Thus when there are 15 supplementary groups all different from `agroup', the above change makes ngroups = 17 and setgroups() fails. Immediately before the above change, the last group was just discarded so it couldn't have worked, but think PCTS would have found the bug if the last group had not worked long ago. getgrouplist() has to put `agroup' in groups[0] because the group list has to include the primary group in FreeBSD due to a dubious efficiency hack in the implementation -- in FreeBSD, the effective group id is just groups[0]. This efficiency hacks is what gives the first off-by-1 error -- {NGROUPS_MAX} iis missing the adjustment for the hack. getgrouplist() normally puts `agroup' in groups[1] according to the following comment: % /* % * When installing primary group, duplicate it; % * the first element of groups is the effective gid % * and will be overwritten when a setgid file is executed. % */ I don't understand why this is needed. It obviously increments the off-by-1 error count, since the setgroups(2) up immediately in the kernel if the groups count exceeds {NGROUPS_MAX}. getgrouplist() has changed signficantly near this bug since rev.1.1, but the change only reduces the bugs: % Index: getgrouplist.c % =================================================================== % RCS file: /home/ncvs/src/lib/libc/gen/getgrouplist.c,v % retrieving revision 1.1 % retrieving revision 1.13 % diff -u -2 -r1.1 -r1.13 % --- getgrouplist.c 27 May 1994 04:56:40 -0000 1.1 % +++ getgrouplist.c 16 Feb 2003 17:29:09 -0000 1.13 % ... % @@ -70,17 +70,21 @@ % */ % setgrent(); % - while (grp = getgrent()) { % - if (grp->gr_gid == agroup) % - continue; % - if (ngroups >= maxgroups) { % - ret = -1; % - break; % + while ((grp = getgrent()) != NULL) { % + for (i = 0; i < ngroups; i++) { % + if (grp->gr_gid == groups[i]) % + goto skip; Now getgrouplist() avoids wasting slots for duplicates in all cases after the first 1 or 2 slots, where it use to only avoid duplicates of `agroup'. Avoiding duplicates allows {NGROUPS_MAX} groups to actually work in some cases, but cases where there are even {NGROUPS_MAX} - 1 different groups all different from `agroup' apparently never worked. The kernel doesn't do any duplicate removal but probably should. POSIX doesn't specify setgroups(), but it specifies getgroups(2). For getgroups(), it is implementation-defined whether the effective group is returned, and applications must be prepared for {NGROUPS_MAX} + 1 gids being reurned. FreeBSD does return the egid in the array. FreeBSD's setgroups needs the egid in slot 0 in the array. Bruce