Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 2 Dec 2019 02:45:20 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Dmitry Marakasov <amdmi3@amdmi3.ru>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: How can kill(-1, 0) return EPERM?
Message-ID:  <20191202004520.GI10580@kib.kiev.ua>
In-Reply-To: <20191201231114.GG4071@hades.panopticon>
References:  <20191129151606.GD4071@hades.panopticon> <20191129164509.GE4071@hades.panopticon> <20191129225834.GY10580@kib.kiev.ua> <20191201002411.GF4071@hades.panopticon> <20191201144813.GD10580@kib.kiev.ua> <20191201231114.GG4071@hades.panopticon>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Dec 02, 2019 at 02:11:14AM +0300, Dmitry Marakasov wrote:
> * Konstantin Belousov (kostikbel@gmail.com) wrote:
> 
> > > > > > I'm helping to investigate some userspace issue [1], where kill(-1, SIGKILL)
> > > > > > fails with EPERM. I've managed to isolate this case in a small program:
> > > > > > 
> > > > > > 
> > > > > > ```
> > > > > > #include <err.h>
> > > > > > #include <errno.h>
> > > > > > #include <signal.h>
> > > > > > #include <stdio.h>
> > > > > > #include <string.h>
> > > > > > #include <unistd.h>
> > > > > > 
> > > > > > int main() {
> > > > > >     if (setuid(66) == -1)  // uucp, just for the test
> > > > > >         err(1, "setuid");
> > > > > > 
> > > > > >     int res = kill(-1, 0);  // <- fails with EPERM
> > > > > >     fprintf(stderr, "kill(-1, 0) result=%d, errno=%s\n", res, strerror(errno));
> > > > > > 
> > > > > >     return 0;
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > when run from root on 12.1 kill call fails with EPERM. However I cannot
> > > > > > comprehend what it is caused by and how it's even possible: kill(2) manpage
> > > > > > says that with pid=-1 kill should only send (and in this case of sig=0,
> > > > > > /not/ send) signals to the processes belonging to the current uid, so there
> > > > > > should be no permission problems. I've also looked into the kernel code
> > > > > > (sys_kill, killpg1), and it matches to what manpage says, I see no way
> > > > > > for it to return EPERM: sys_kill() should fall through to the switch, call
> > > > > > killpg1() with all=1 and killpg1() if(all) branch may only set `ret` to
> > > > > > either 0 or ESRCH. Am I missing something, or is there a problem somewhere?
> > > > > 
> > > > > It looks like I have misread the `else if' path of this core.
> > > > > 
> > > > >     if (all) {
> > > > >         /*
> > > > >          * broadcast
> > > > >          */
> > > > >         sx_slock(&allproc_lock);
> > > > >         FOREACH_PROC_IN_SYSTEM(p) {
> > > > >             if (p->p_pid <= 1 || p->p_flag & P_SYSTEM ||
> > > > >                 p == td->td_proc || p->p_state == PRS_NEW) {
> > > > >                 continue;
> > > > >             }
> > > > >             PROC_LOCK(p);
> > > > >             err = p_cansignal(td, p, sig);
> > > > >             if (err == 0) {
> > > > >                 if (sig)
> > > > >                     pksignal(p, sig, ksi);
> > > > >                 ret = err;
> > > > >             }
> > > > >             else if (ret == ESRCH)
> > > > >                 ret = err;
> > > > >             PROC_UNLOCK(p);
> > > > >         }
> > > > >         sx_sunlock(&allproc_lock);
> > > > >     } ...
> > > > > 
> > > > > so it's clear now where EPERM comes from. However it looks like the
> > > > > behavior contradicts the manpage - there are no signs of check that
> > > > > the signalled process has the same uid as the caller.
> > > > 
> > > > I am not sure what you mean by 'signs of check'.  Look at p_cansignal()
> > > > and cr_cansignal() implementation.
> > > 
> > > I've meant that according to the manpage
> > > 
> > >      If pid is -1:
> > >              If the user has super-user privileges, the signal is sent to all
> > >              processes excluding system processes (with P_SYSTEM flag set),
> > >              process with ID 1 (usually init(8)), and the process sending the
> > >              signal.  If the user is not the super user, the signal is sent to
> > >              all processes with the same uid as the user excluding the process
> > >              sending the signal.  No error is returned if any process could be
> > >              signaled.
> > > 
> > > IMO there should be an additional check in this condition:
> > > 
> > >              if (p->p_pid <= 1 || p->p_flag & P_SYSTEM ||
> > >                  p == td->td_proc || p->p_state == PRS_NEW) {
> > >                  continue;
> > >              }
> > > 
> > > E.g. something like
> > > 
> > >              if (p->p_pid <= 1 || p->p_flag & P_SYSTEM ||
> > >                  p == td->td_proc || p->p_state == PRS_NEW ||
> > >                  (td->td_ucred->cr_ruid != 0 &&
> > > 			p->td_ucred->cr_ruid != td->td_ucred->cr_ruid) {
> > >                  continue;
> > >              }
> > > 
> > > e.g. it should not even attempt to signal processes with other uids.
> > Why ?  You are trying to outguess p_cansignal(), which could deny
> > action for much more reasons, so you would get EPERM still, e.g. if the
> > target is suid.  Or, p_cansignal() also might allow to send the signal
> > even for mismatched uids, again look at it code.
> 
> Exactly because of that - p_cansignal behaves in it's own way, which
> doesn't match the indended/documented kill(2) behavior. You're right
> in a sence that plain uid check is not sufficient though.
> 
> > I might guess that your complain is really about a different aspect
> > of it.  If you look at the posix description of the EPERM error from
> > kill(2) (really kill(3)), it says
> > [EPERM]  The process does not have permission to send the signal to
> >          any receiving process.
> > In other words, we should not return EPERM if we signalled at least one
> > of the process.
> > 
> > Is this the problem ?
> 
> It's not limited with this aspect. Here are some possible cases:
> 
> - no processes with the same UID
> - one process with the same UID which can be signalled
> - one process with the same UID which cannot (say, because of MAC) be signalled
> - case of one process with the different UID which can be signalled
> 
> The kill(-1, 0) with the given UID results are/should be, correspondingly:
> 
> As per documentation: ESRCH, 0, EPERM, ESRCH
> As per current implementation: EPERM, 0, EPERM, 0
> As per implementation which relies solely on *_cansignal:
>     ESRCH, 0, ESRCH, 0 (unrelated process signalled)
> 
> As you can see, relying soleley on p_cansignel would fix one case, but
> break the others. We may need a trimmed down variant of p_cansignal for
> this. Or make the latter take the mask of which checks it should
> perform or skip.

No, I do not see.  Matching uid is red herring, according to POSIX ESRCH
should be returned when there is no visible matching processes at all,
EPERM should be returned if there are processes, but no processes can be
signalled due to permission issues.  Otherwise, at least one process got
a signal, and the return value should be zero.

I tried to adjust this in https://reviews.freebsd.org/D22621, where
I already added you as subscriber.



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20191202004520.GI10580>