From: Konstantin Belousov
Date: Mon, 5 Jan 2015 03:27:10 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org
Subject: svn commit: r276686 - in stable/10: lib/libc/sys sys/compat/freebsd32 sys/conf sys/kern sys/sys
Message-Id: <201501050327.t053RAhp037781@svn.freebsd.org>

Author: kib
Date: Mon Jan 5 03:27:09 2015
New Revision: 276686
URL: https://svnweb.freebsd.org/changeset/base/276686

Log:
  Merge reaper facility.
MFC r270443 (by mjg): Properly reparent traced processes when the tracer dies. MFC r273452 (by mjg): Plug unnecessary PRS_NEW check in kern_procctl. MFC 275800: Add a facility for non-init process to declare itself the reaper of the orphaned descendants. MFC r275821: Add missed break. MFC r275846 (by mckusick): Add some additional clarification and fix a few gammer nits. MFC r275847 (by bdrewery): Bump Dd for r275846. Added: stable/10/sys/kern/kern_procctl.c - copied, changed from r275800, head/sys/kern/kern_procctl.c Modified: stable/10/lib/libc/sys/procctl.2 stable/10/sys/compat/freebsd32/freebsd32.h stable/10/sys/compat/freebsd32/freebsd32_misc.c stable/10/sys/conf/files stable/10/sys/kern/init_main.c stable/10/sys/kern/kern_exit.c stable/10/sys/kern/kern_fork.c stable/10/sys/kern/sys_process.c stable/10/sys/sys/proc.h stable/10/sys/sys/procctl.h Directory Properties: stable/10/ (props changed) Modified: stable/10/lib/libc/sys/procctl.2 ============================================================================== --- stable/10/lib/libc/sys/procctl.2 Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/lib/libc/sys/procctl.2 Mon Jan 5 03:27:09 2015 (r276686) @@ -2,6 +2,10 @@ .\" Written by: John H. Baldwin .\" All rights reserved. .\" +.\" Copyright (c) 2014 The FreeBSD Foundation +.\" Portions of this documentation were written by Konstantin Belousov +.\" under sponsorship from the FreeBSD Foundation. +.\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: @@ -25,7 +29,7 @@ .\" .\" $FreeBSD$ .\" -.Dd September 19, 2013 +.Dd December 16, 2014 .Dt PROCCTL 2 .Os .Sh NAME @@ -67,7 +71,7 @@ The control request to perform is specif .Fa cmd argument. The following commands are supported: -.Bl -tag -width "Dv PROC_SPROTECT" +.Bl -tag -width "Dv PROC_REAP_GETPIDS" .It Dv PROC_SPROTECT Set process protection state. 
This is used to mark a process as protected from being killed if the system @@ -95,6 +99,182 @@ When used with mark all future child processes of each selected process as protected. Future child processes will also mark all of their future child processes. .El +.It Dv PROC_REAP_ACQUIRE +Acquires the reaper status for the current process. +The status means that children orphaned by the reaper's descendants +that were forked after the acquisition of the status are reparented to the +reaper. +After the system initialization, +.Xr init 8 +is the default reaper. +.Pp +.It Dv PROC_REAP_RELEASE +Releases the reaper state for the current process. +The reaper of the current process becomes the new reaper of the +current process's descendants. +.It Dv PROC_REAP_STATUS +Provides the information about the reaper of the specified process, +or the process itself when it is a reaper. +The +.Fa data +argument must point to a +.Vt procctl_reaper_status +structure which is filled in by the syscall on successful return. +.Bd -literal +struct procctl_reaper_status { + u_int rs_flags; + u_int rs_children; + u_int rs_descendants; + pid_t rs_reaper; + pid_t rs_pid; +}; +.Ed +The +.Fa rs_flags +may have the following flags returned: +.Bl -tag -width "Dv REAPER_STATUS_REALINIT" +.It Dv REAPER_STATUS_OWNED +The specified process has acquired the reaper status and has not +released it. +When the flag is returned, the specified process +.Fa id , +pid, identifies the reaper, otherwise the +.Fa rs_reaper +field of the structure is set to the pid of the reaper +for the specified process id. +.It Dv REAPER_STATUS_REALINIT +The specified process is the root of the reaper tree, i.e. +.Xr init 8 . +.El +The +.Fa rs_children +field returns the number of children of the reaper. +The +.Fa rs_descendants +field returns the total number of descendants of the reaper(s), +not counting descendants of the reaper in the subtree. +The +.Fa rs_reaper +field returns the reaper pid. 
+The +.Fa rs_pid +returns the pid of one reaper child if there are any descendants. +.It Dv PROC_REAP_GETPIDS +Queries the list of descendants of the reaper of the specified process. +The request takes a pointer to a +.Vt procctl_reaper_pids +structure in the +.Fa data +parameter. +.Bd -literal +struct procctl_reaper_pids { + u_int rp_count; + struct procctl_reaper_pidinfo *rp_pids; +}; +.Ed +When called, the +.Fa rp_pids +field must point to an array of +.Vt procctl_reaper_pidinfo +structures, to be filled in on return, +and the +.Fa rp_count +field must specify the size of the array, +into which no more than +.Fa rp_count +elements will be filled in by the kernel. +.Pp +The +.Vt "struct procctl_reaper_pidinfo" +structure provides some information about one of the reaper's descendants. +Note that for a descendant that is not a child, it may be incorrectly +identified because of a race in which the original child process exited +and the exited process's pid was reused for an unrelated process. +.Bd -literal +struct procctl_reaper_pidinfo { + pid_t pi_pid; + pid_t pi_subtree; + u_int pi_flags; +}; +.Ed +The +.Fa pi_pid +field is the process id of the descendant. +The +.Fa pi_subtree +field provides the pid of the child of the reaper, which is the (grand-)parent +of the process. +The +.Fa pi_flags +field returns the following flags, further describing the descendant: +.Bl -tag -width "Dv REAPER_PIDINFO_VALID" +.It Dv REAPER_PIDINFO_VALID +Set to indicate that the +.Vt procctl_reaper_pidinfo +structure was filled in by the kernel. +Zero-filling the +.Fa rp_pids +array and testing the +.Dv REAPER_PIDINFO_VALID +flag allows the caller to detect the end +of the returned array. +.It Dv REAPER_PIDINFO_CHILD +The +.Fa pi_pid +field identifies the direct child of the reaper. +.El +.It Dv PROC_REAP_KILL +Request to deliver a signal to some subset of the descendants of the reaper. 
+The +.Fa data +parameter must point to a +.Vt procctl_reaper_kill +structure, which is used both for parameters and status return. +.Bd -literal +struct procctl_reaper_kill { + int rk_sig; + u_int rk_flags; + pid_t rk_subtree; + u_int rk_killed; + pid_t rk_fpid; +}; +.Ed +The +.Fa rk_sig +field specifies the signal to be delivered. +Zero is not a valid signal number, unlike +.Xr kill 2 . +The +.Fa rk_flags +field further directs the operation. +It is or-ed from the following flags: +.Bl -tag -width "Dv REAPER_KILL_CHILDREN" +.It Dv REAPER_KILL_CHILDREN +Deliver the specified signal only to direct children of the reaper. +.It Dv REAPER_KILL_SUBTREE +Deliver the specified signal only to descendants that were forked by +the direct child with pid specified in the +.Fa rk_subtree +field. +.El +If neither the +.Dv REAPER_KILL_CHILDREN +nor the +.Dv REAPER_KILL_SUBTREE +flags are specified, all current descendants of the reaper are signalled. +.Pp +If a signal was delivered to any process, the return value from the request +is zero. +In this case, the +.Fa rk_killed +field identifies the number of processes signalled. +The +.Fa rk_fpid +field is set to the pid of the first process for which signal +delivery failed, e.g. due to the permission problems. +If no such process exist, the +.Fa rk_fpid +field is set to -1. .El .Sh RETURN VALUES If an error occurs, a value of -1 is returned and @@ -109,7 +289,7 @@ will fail if: .It Bq Er EFAULT The .Fa arg -points outside the process's allocated address space. +parameter points outside the process's allocated address space. .It Bq Er EINVAL The .Fa cmd @@ -132,11 +312,48 @@ An invalid operation or flag was passed for a .Dv PROC_SPROTECT command. +.It Bq Er EPERM +The +.Fa idtype +argument is not equal to +.Dv P_PID , +or +.Fa id +is not equal to the pid of the calling process, for +.Dv PROC_REAP_ACQUIRE +or +.Dv PROC_REAP_RELEASE +requests. 
+.It Bq Er EINVAL +Invalid or undefined flags were passed to a +.Dv PROC_REAP_KILL +request. +.It Bq Er EINVAL +An invalid or zero signal number was requested for a +.Dv PROC_REAP_KILL +request. +.It Bq Er EINVAL +The +.Dv PROC_REAP_RELEASE +request was issued by the +.Xr init 8 +process. +.It Bq Er EBUSY +The +.Dv PROC_REAP_ACQUIRE +request was issued by a process that had already acquired reaper status +and has not yet released it. .El .Sh SEE ALSO -.Xr ptrace 2 +.Xr kill 2 , +.Xr ptrace 2 , +.Xr wait 2 , +.Xr init 8 .Sh HISTORY The .Fn procctl function appeared in -.Fx 10 . +.Fx 10.0 . +The reaper facility is based on a similar feature of Linux and +DragonflyBSD, and first appeared in +.Fx 10.2 . Modified: stable/10/sys/compat/freebsd32/freebsd32.h ============================================================================== --- stable/10/sys/compat/freebsd32/freebsd32.h Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/compat/freebsd32/freebsd32.h Mon Jan 5 03:27:09 2015 (r276686) @@ -387,4 +387,10 @@ struct kld32_file_stat { char pathname[MAXPATHLEN]; }; +struct procctl_reaper_pids32 { + u_int rp_count; + u_int rp_pad0[15]; + uint32_t rp_pids; +}; + #endif /* !_COMPAT_FREEBSD32_FREEBSD32_H_ */ Modified: stable/10/sys/compat/freebsd32/freebsd32_misc.c ============================================================================== --- stable/10/sys/compat/freebsd32/freebsd32_misc.c Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/compat/freebsd32/freebsd32_misc.c Mon Jan 5 03:27:09 2015 (r276686) @@ -3062,20 +3062,63 @@ int freebsd32_procctl(struct thread *td, struct freebsd32_procctl_args *uap) { void *data; - int error, flags; + union { + struct procctl_reaper_status rs; + struct procctl_reaper_pids rp; + struct procctl_reaper_kill rk; + } x; + union { + struct procctl_reaper_pids32 rp; + } x32; + int error, error1, flags; switch (uap->com) { case PROC_SPROTECT: error = copyin(PTRIN(uap->data), &flags, sizeof(flags)); - if (error) + if (error != 0) return 
(error); data = &flags; break; + case PROC_REAP_ACQUIRE: + case PROC_REAP_RELEASE: + if (uap->data != NULL) + return (EINVAL); + data = NULL; + break; + case PROC_REAP_STATUS: + data = &x.rs; + break; + case PROC_REAP_GETPIDS: + error = copyin(uap->data, &x32.rp, sizeof(x32.rp)); + if (error != 0) + return (error); + CP(x32.rp, x.rp, rp_count); + PTRIN_CP(x32.rp, x.rp, rp_pids); + data = &x.rp; + break; + case PROC_REAP_KILL: + error = copyin(uap->data, &x.rk, sizeof(x.rk)); + if (error != 0) + return (error); + data = &x.rk; + break; default: return (EINVAL); } - return (kern_procctl(td, uap->idtype, PAIR32TO64(id_t, uap->id), - uap->com, data)); + error = kern_procctl(td, uap->idtype, PAIR32TO64(id_t, uap->id), + uap->com, data); + switch (uap->com) { + case PROC_REAP_STATUS: + if (error == 0) + error = copyout(&x.rs, uap->data, sizeof(x.rs)); + break; + case PROC_REAP_KILL: + error1 = copyout(&x.rk, uap->data, sizeof(x.rk)); + if (error == 0) + error = error1; + break; + } + return (error); } int Modified: stable/10/sys/conf/files ============================================================================== --- stable/10/sys/conf/files Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/conf/files Mon Jan 5 03:27:09 2015 (r276686) @@ -2916,6 +2916,7 @@ kern/kern_pmc.c standard kern/kern_poll.c optional device_polling kern/kern_priv.c standard kern/kern_proc.c standard +kern/kern_procctl.c standard kern/kern_prot.c standard kern/kern_racct.c standard kern/kern_rangelock.c standard Modified: stable/10/sys/kern/init_main.c ============================================================================== --- stable/10/sys/kern/init_main.c Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/kern/init_main.c Mon Jan 5 03:27:09 2015 (r276686) @@ -496,7 +496,8 @@ proc0_init(void *dummy __unused) prison0.pr_cpuset = cpuset_ref(td->td_cpuset); p->p_peers = 0; p->p_leader = p; - + p->p_reaper = p; + LIST_INIT(&p->p_reaplist); strncpy(p->p_comm, "kernel", sizeof (p->p_comm)); 
strncpy(td->td_name, "swapper", sizeof (td->td_name)); @@ -821,8 +822,11 @@ create_init(const void *udata __unused) KASSERT(initproc->p_pid == 1, ("create_init: initproc->p_pid != 1")); /* divorce init's credentials from the kernel's */ newcred = crget(); + sx_xlock(&proctree_lock); PROC_LOCK(initproc); initproc->p_flag |= P_SYSTEM | P_INMEM; + initproc->p_treeflag |= P_TREE_REAPER; + LIST_INSERT_HEAD(&initproc->p_reaplist, &proc0, p_reapsibling); oldcred = initproc->p_ucred; crcopy(newcred, oldcred); #ifdef MAC @@ -833,6 +837,7 @@ create_init(const void *udata __unused) #endif initproc->p_ucred = newcred; PROC_UNLOCK(initproc); + sx_xunlock(&proctree_lock); crfree(oldcred); cred_update_thread(FIRST_THREAD_IN_PROC(initproc)); cpu_set_fork_handler(FIRST_THREAD_IN_PROC(initproc), start_init, NULL); Modified: stable/10/sys/kern/kern_exit.c ============================================================================== --- stable/10/sys/kern/kern_exit.c Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/kern/kern_exit.c Mon Jan 5 03:27:09 2015 (r276686) @@ -125,6 +125,31 @@ proc_realparent(struct proc *child) return (parent); } +void +reaper_abandon_children(struct proc *p, bool exiting) +{ + struct proc *p1, *p2, *ptmp; + + sx_assert(&proctree_lock, SX_LOCKED); + KASSERT(p != initproc, ("reaper_abandon_children for initproc")); + if ((p->p_treeflag & P_TREE_REAPER) == 0) + return; + p1 = p->p_reaper; + LIST_FOREACH_SAFE(p2, &p->p_reaplist, p_reapsibling, ptmp) { + LIST_REMOVE(p2, p_reapsibling); + p2->p_reaper = p1; + p2->p_reapsubtree = p->p_reapsubtree; + LIST_INSERT_HEAD(&p1->p_reaplist, p2, p_reapsibling); + if (exiting && p2->p_pptr == p) { + PROC_LOCK(p2); + proc_reparent(p2, p1); + PROC_UNLOCK(p2); + } + } + KASSERT(LIST_EMPTY(&p->p_reaplist), ("p_reaplist not empty")); + p->p_treeflag &= ~P_TREE_REAPER; +} + static void clear_orphan(struct proc *p) { @@ -162,7 +187,8 @@ sys_sys_exit(struct thread *td, struct s void exit1(struct thread *td, int rv) { - struct 
proc *p, *nq, *q; + struct proc *p, *nq, *q, *t; + struct thread *tdt; struct vnode *ttyvp = NULL; mtx_assert(&Giant, MA_NOTOWNED); @@ -450,24 +476,34 @@ exit1(struct thread *td, int rv) WITNESS_WARN(WARN_PANIC, NULL, "process (pid %d) exiting", p->p_pid); /* - * Reparent all of our children to init. + * Reparent all children processes: + * - traced ones to the original parent (or init if we are that parent) + * - the rest to init */ sx_xlock(&proctree_lock); q = LIST_FIRST(&p->p_children); if (q != NULL) /* only need this if any child is S_ZOMB */ - wakeup(initproc); + wakeup(q->p_reaper); for (; q != NULL; q = nq) { nq = LIST_NEXT(q, p_sibling); PROC_LOCK(q); - proc_reparent(q, initproc); q->p_sigparent = SIGCHLD; - /* - * Traced processes are killed - * since their existence means someone is screwing up. - */ - if (q->p_flag & P_TRACED) { - struct thread *temp; + if (!(q->p_flag & P_TRACED)) { + proc_reparent(q, q->p_reaper); + } else { + /* + * Traced processes are killed since their existence + * means someone is screwing up. 
+ */ + t = proc_realparent(q); + if (t == p) { + proc_reparent(q, q->p_reaper); + } else { + PROC_LOCK(t); + proc_reparent(q, t); + PROC_UNLOCK(t); + } /* * Since q was found on our children list, the * proc_reparent() call moved q to the orphan @@ -476,8 +512,8 @@ exit1(struct thread *td, int rv) */ clear_orphan(q); q->p_flag &= ~(P_TRACED | P_STOPPED_TRACE); - FOREACH_THREAD_IN_PROC(q, temp) - temp->td_dbgflags &= ~TDB_SUSPEND; + FOREACH_THREAD_IN_PROC(q, tdt) + tdt->td_dbgflags &= ~TDB_SUSPEND; kern_psignal(q, SIGKILL); } PROC_UNLOCK(q); @@ -553,7 +589,7 @@ exit1(struct thread *td, int rv) mtx_unlock(&p->p_pptr->p_sigacts->ps_mtx); pp = p->p_pptr; PROC_UNLOCK(pp); - proc_reparent(p, initproc); + proc_reparent(p, p->p_reaper); p->p_sigparent = SIGCHLD; PROC_LOCK(p->p_pptr); @@ -566,8 +602,8 @@ exit1(struct thread *td, int rv) } else mtx_unlock(&p->p_pptr->p_sigacts->ps_mtx); - if (p->p_pptr == initproc) - kern_psignal(p->p_pptr, SIGCHLD); + if (p->p_pptr == p->p_reaper || p->p_pptr == initproc) + childproc_exited(p); else if (p->p_sigparent != 0) { if (p->p_sigparent == SIGCHLD) childproc_exited(p); @@ -840,6 +876,8 @@ proc_reap(struct thread *td, struct proc LIST_REMOVE(p, p_list); /* off zombproc */ sx_xunlock(&allproc_lock); LIST_REMOVE(p, p_sibling); + reaper_abandon_children(p, true); + LIST_REMOVE(p, p_reapsibling); PROC_LOCK(p); clear_orphan(p); PROC_UNLOCK(p); Modified: stable/10/sys/kern/kern_fork.c ============================================================================== --- stable/10/sys/kern/kern_fork.c Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/kern/kern_fork.c Mon Jan 5 03:27:09 2015 (r276686) @@ -267,11 +267,21 @@ retry: * Scan the active and zombie procs to check whether this pid * is in use. Remember the lowest pid that's greater * than trypid, so we can avoid checking for a while. + * + * Avoid reuse of the process group id, session id or + * the reaper subtree id. 
Note that for process group + * and sessions, the amount of reserved pids is + * limited by process limit. For the subtree ids, the + * id is kept reserved only while there is a + * non-reaped process in the subtree, so amount of + * reserved pids is limited by process limit times + * two. */ p = LIST_FIRST(&allproc); again: for (; p != NULL; p = LIST_NEXT(p, p_list)) { while (p->p_pid == trypid || + p->p_reapsubtree == trypid || (p->p_pgrp != NULL && (p->p_pgrp->pg_id == trypid || (p->p_session != NULL && @@ -618,12 +628,22 @@ do_fork(struct thread *td, int flags, st * of init. This effectively disassociates the child from the * parent. */ - if (flags & RFNOWAIT) - pptr = initproc; - else + if ((flags & RFNOWAIT) != 0) { + pptr = p1->p_reaper; + p2->p_reaper = pptr; + } else { + p2->p_reaper = (p1->p_treeflag & P_TREE_REAPER) != 0 ? + p1 : p1->p_reaper; pptr = p1; + } p2->p_pptr = pptr; LIST_INSERT_HEAD(&pptr->p_children, p2, p_sibling); + LIST_INIT(&p2->p_reaplist); + LIST_INSERT_HEAD(&p2->p_reaper->p_reaplist, p2, p_reapsibling); + if (p2->p_reaper == p1) + p2->p_reapsubtree = p2->p_pid; + else + p2->p_reapsubtree = p1->p_reapsubtree; sx_xunlock(&proctree_lock); /* Inform accounting that we have forked. 
*/ Copied and modified: stable/10/sys/kern/kern_procctl.c (from r275800, head/sys/kern/kern_procctl.c) ============================================================================== --- head/sys/kern/kern_procctl.c Mon Dec 15 12:01:42 2014 (r275800, copy source) +++ stable/10/sys/kern/kern_procctl.c Mon Jan 5 03:27:09 2015 (r276686) @@ -32,7 +32,7 @@ __FBSDID("$FreeBSD$"); #include #include -#include +#include #include #include #include @@ -336,6 +336,7 @@ sys_procctl(struct thread *td, struct pr case PROC_REAP_STATUS: if (error == 0) error = copyout(&x.rs, uap->data, sizeof(x.rs)); + break; case PROC_REAP_KILL: error1 = copyout(&x.rk, uap->data, sizeof(x.rk)); if (error == 0) Modified: stable/10/sys/kern/sys_process.c ============================================================================== --- stable/10/sys/kern/sys_process.c Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/kern/sys_process.c Mon Jan 5 03:27:09 2015 (r276686) @@ -43,7 +43,6 @@ __FBSDID("$FreeBSD$"); #include #include #include -#include #include #include #include @@ -1234,196 +1233,3 @@ stopevent(struct proc *p, unsigned int e msleep(&p->p_step, &p->p_mtx, PWAIT, "stopevent", 0); } while (p->p_step); } - -static int -protect_setchild(struct thread *td, struct proc *p, int flags) -{ - - PROC_LOCK_ASSERT(p, MA_OWNED); - if (p->p_flag & P_SYSTEM || p_cansched(td, p) != 0) - return (0); - if (flags & PPROT_SET) { - p->p_flag |= P_PROTECTED; - if (flags & PPROT_INHERIT) - p->p_flag2 |= P2_INHERIT_PROTECTED; - } else { - p->p_flag &= ~P_PROTECTED; - p->p_flag2 &= ~P2_INHERIT_PROTECTED; - } - return (1); -} - -static int -protect_setchildren(struct thread *td, struct proc *top, int flags) -{ - struct proc *p; - int ret; - - p = top; - ret = 0; - sx_assert(&proctree_lock, SX_LOCKED); - for (;;) { - ret |= protect_setchild(td, p, flags); - PROC_UNLOCK(p); - /* - * If this process has children, descend to them next, - * otherwise do any siblings, and if done with this level, - * follow back up the 
tree (but not past top). - */ - if (!LIST_EMPTY(&p->p_children)) - p = LIST_FIRST(&p->p_children); - else for (;;) { - if (p == top) { - PROC_LOCK(p); - return (ret); - } - if (LIST_NEXT(p, p_sibling)) { - p = LIST_NEXT(p, p_sibling); - break; - } - p = p->p_pptr; - } - PROC_LOCK(p); - } -} - -static int -protect_set(struct thread *td, struct proc *p, int flags) -{ - int error, ret; - - switch (PPROT_OP(flags)) { - case PPROT_SET: - case PPROT_CLEAR: - break; - default: - return (EINVAL); - } - - if ((PPROT_FLAGS(flags) & ~(PPROT_DESCEND | PPROT_INHERIT)) != 0) - return (EINVAL); - - error = priv_check(td, PRIV_VM_MADV_PROTECT); - if (error) - return (error); - - if (flags & PPROT_DESCEND) - ret = protect_setchildren(td, p, flags); - else - ret = protect_setchild(td, p, flags); - if (ret == 0) - return (EPERM); - return (0); -} - -#ifndef _SYS_SYSPROTO_H_ -struct procctl_args { - idtype_t idtype; - id_t id; - int com; - void *data; -}; -#endif -/* ARGSUSED */ -int -sys_procctl(struct thread *td, struct procctl_args *uap) -{ - int error, flags; - void *data; - - switch (uap->com) { - case PROC_SPROTECT: - error = copyin(uap->data, &flags, sizeof(flags)); - if (error) - return (error); - data = &flags; - break; - default: - return (EINVAL); - } - - return (kern_procctl(td, uap->idtype, uap->id, uap->com, data)); -} - -static int -kern_procctl_single(struct thread *td, struct proc *p, int com, void *data) -{ - - PROC_LOCK_ASSERT(p, MA_OWNED); - switch (com) { - case PROC_SPROTECT: - return (protect_set(td, p, *(int *)data)); - default: - return (EINVAL); - } -} - -int -kern_procctl(struct thread *td, idtype_t idtype, id_t id, int com, void *data) -{ - struct pgrp *pg; - struct proc *p; - int error, first_error, ok; - - sx_slock(&proctree_lock); - switch (idtype) { - case P_PID: - p = pfind(id); - if (p == NULL) { - error = ESRCH; - break; - } - if (p->p_state == PRS_NEW) - error = ESRCH; - else - error = p_cansee(td, p); - if (error == 0) - error = 
kern_procctl_single(td, p, com, data); - PROC_UNLOCK(p); - break; - case P_PGID: - /* - * Attempt to apply the operation to all members of the - * group. Ignore processes in the group that can't be - * seen. Ignore errors so long as at least one process is - * able to complete the request successfully. - */ - pg = pgfind(id); - if (pg == NULL) { - error = ESRCH; - break; - } - PGRP_UNLOCK(pg); - ok = 0; - first_error = 0; - LIST_FOREACH(p, &pg->pg_members, p_pglist) { - PROC_LOCK(p); - if (p->p_state == PRS_NEW || p_cansee(td, p) != 0) { - PROC_UNLOCK(p); - continue; - } - error = kern_procctl_single(td, p, com, data); - PROC_UNLOCK(p); - if (error == 0) - ok = 1; - else if (first_error == 0) - first_error = error; - } - if (ok) - error = 0; - else if (first_error != 0) - error = first_error; - else - /* - * Was not able to see any processes in the - * process group. - */ - error = ESRCH; - break; - default: - error = EINVAL; - break; - } - sx_sunlock(&proctree_lock); - return (error); -} Modified: stable/10/sys/sys/proc.h ============================================================================== --- stable/10/sys/sys/proc.h Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/sys/proc.h Mon Jan 5 03:27:09 2015 (r276686) @@ -592,6 +592,14 @@ struct proc { LIST_ENTRY(proc) p_orphan; /* (e) List of orphan processes. */ LIST_HEAD(, proc) p_orphans; /* (e) Pointer to list of orphans. */ u_int p_treeflag; /* (e) P_TREE flags */ + struct proc *p_reaper; /* (e) My reaper. */ + LIST_HEAD(, proc) p_reaplist; /* (e) List of my descendants + (if I am reaper). */ + LIST_ENTRY(proc) p_reapsibling; /* (e) List of siblings - descendants of + the same reaper. */ + pid_t p_reapsubtree; /* (e) Pid of the direct child of the + reaper which spawned + our subtree. 
*/ }; #define p_session p_pgrp->pg_session @@ -648,6 +656,7 @@ struct proc { #define P_TREE_ORPHANED 0x00000001 /* Reparented, on orphan list */ #define P_TREE_FIRST_ORPHAN 0x00000002 /* First element of orphan list */ +#define P_TREE_REAPER 0x00000004 /* Reaper of subtree */ /* * These were process status values (p_stat), now they are only used in @@ -897,6 +906,7 @@ void proc_reparent(struct proc *child, s struct pstats *pstats_alloc(void); void pstats_fork(struct pstats *src, struct pstats *dst); void pstats_free(struct pstats *ps); +void reaper_abandon_children(struct proc *p, bool exiting); int securelevel_ge(struct ucred *cr, int level); int securelevel_gt(struct ucred *cr, int level); void sess_hold(struct session *); Modified: stable/10/sys/sys/procctl.h ============================================================================== --- stable/10/sys/sys/procctl.h Mon Jan 5 02:06:26 2015 (r276685) +++ stable/10/sys/sys/procctl.h Mon Jan 5 03:27:09 2015 (r276686) @@ -30,7 +30,17 @@ #ifndef _SYS_PROCCTL_H_ #define _SYS_PROCCTL_H_ +#ifndef _KERNEL +#include +#include +#endif + #define PROC_SPROTECT 1 /* set protected state */ +#define PROC_REAP_ACQUIRE 2 /* reaping enable */ +#define PROC_REAP_RELEASE 3 /* reaping disable */ +#define PROC_REAP_STATUS 4 /* reaping status */ +#define PROC_REAP_GETPIDS 5 /* get descendants */ +#define PROC_REAP_KILL 6 /* kill descendants */ /* Operations for PROC_SPROTECT (passed in integer arg). */ #define PPROT_OP(x) ((x) & 0xf) @@ -42,10 +52,51 @@ #define PPROT_DESCEND 0x10 #define PPROT_INHERIT 0x20 -#ifndef _KERNEL -#include -#include +/* Result of PREAP_STATUS (returned by value). 
*/ +struct procctl_reaper_status { + u_int rs_flags; + u_int rs_children; + u_int rs_descendants; + pid_t rs_reaper; + pid_t rs_pid; + u_int rs_pad0[15]; +}; + +/* struct procctl_reaper_status rs_flags */ +#define REAPER_STATUS_OWNED 0x00000001 +#define REAPER_STATUS_REALINIT 0x00000002 + +struct procctl_reaper_pidinfo { + pid_t pi_pid; + pid_t pi_subtree; + u_int pi_flags; + u_int pi_pad0[15]; +}; + +#define REAPER_PIDINFO_VALID 0x00000001 +#define REAPER_PIDINFO_CHILD 0x00000002 + +struct procctl_reaper_pids { + u_int rp_count; + u_int rp_pad0[15]; + struct procctl_reaper_pidinfo *rp_pids; +}; + +struct procctl_reaper_kill { + int rk_sig; /* in - signal to send */ + u_int rk_flags; /* in - REAPER_KILL flags */ + pid_t rk_subtree; /* in - subtree, if REAPER_KILL_SUBTREE */ + u_int rk_killed; /* out - count of processes sucessfully + killed */ + pid_t rk_fpid; /* out - first failed pid for which error + is returned */ + u_int rk_pad0[15]; +}; +#define REAPER_KILL_CHILDREN 0x00000001 +#define REAPER_KILL_SUBTREE 0x00000002 + +#ifndef _KERNEL __BEGIN_DECLS int procctl(idtype_t, id_t, int, void *); __END_DECLS
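
For readers following the kern_fork.c and kern_exit.c hunks, the reaper bookkeeping can be sketched outside the kernel. The block below is a minimal userland model, not the kernel code: `struct toy_proc`, `toy_make_root()`, `toy_fork()` and `toy_orphan()` are hypothetical names invented for illustration, and the self-pointing root reaper is a simplifying assumption. The model only mirrors how `p_reaper` and `p_reapsubtree` appear to be chosen in `do_fork()` and how an orphan is handed to its reaper in `exit1()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the struct proc fields added by the diff (hypothetical type). */
struct toy_proc {
	int pid;
	bool is_reaper;			/* models P_TREE_REAPER in p_treeflag */
	struct toy_proc *parent;	/* models p_pptr */
	struct toy_proc *reaper;	/* models p_reaper */
	int reapsubtree;		/* models p_reapsubtree */
};

/* Assumption of this sketch: the root reaper (init-like) is its own reaper. */
static void
toy_make_root(struct toy_proc *p, int pid)
{
	assert(p != NULL);
	p->pid = pid;
	p->is_reaper = true;
	p->parent = NULL;
	p->reaper = p;
	p->reapsubtree = pid;
}

/*
 * Mirrors the do_fork() hunk: with RFNOWAIT the child is parented to the
 * forking process's reaper; otherwise the child's reaper is the parent if
 * the parent holds reaper status, else the parent's own reaper.
 */
static void
toy_fork(struct toy_proc *p1, struct toy_proc *p2, int newpid, bool rfnowait)
{
	assert(p1 != NULL && p2 != NULL);
	p2->pid = newpid;
	p2->is_reaper = false;
	if (rfnowait) {
		p2->parent = p1->reaper;
		p2->reaper = p1->reaper;
	} else {
		p2->reaper = p1->is_reaper ? p1 : p1->reaper;
		p2->parent = p1;
	}
	/* A direct child of the reaper starts a new subtree named by its pid. */
	if (p2->reaper == p1)
		p2->reapsubtree = p2->pid;
	else
		p2->reapsubtree = p1->reapsubtree;
}

/* Mirrors the exit1() hunk for untraced children: reparent to the reaper. */
static void
toy_orphan(struct toy_proc *q)
{
	assert(q != NULL);
	q->parent = q->reaper;
}
```

In this model, setting `is_reaper` on a process stands in for a successful PROC_REAP_ACQUIRE request. Children forked afterwards record that process as their reaper, and every descendant in a subtree carries the pid of the reaper's direct child that started it, which is the value the kern_fork.c pid-allocation hunk keeps reserved.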