From owner-dev-commits-src-main@freebsd.org Fri Apr 9 20:44:36 2021 Return-Path: Delivered-To: dev-commits-src-main@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 9B3FF5BC14F; Fri, 9 Apr 2021 20:44:36 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4FH99J3h9xz4bHQ; Fri, 9 Apr 2021 20:44:36 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 6CAE0248FF; Fri, 9 Apr 2021 20:44:36 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 139Kia6e071047; Fri, 9 Apr 2021 20:44:36 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 139Kia6o071046; Fri, 9 Apr 2021 20:44:36 GMT (envelope-from git) Date: Fri, 9 Apr 2021 20:44:36 GMT Message-Id: <202104092044.139Kia6o071046@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Konstantin Belousov Subject: git: 2fd1ffefaa4d - main - Stop arming kqueue timers on knote owner suspend or terminate MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: kib X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: 2fd1ffefaa4d2cd99a19f866a949cb2cd58ef998 Auto-Submitted: auto-generated X-BeenThere: dev-commits-src-main@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Commit messages for the main branch of the src repository List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 09 Apr 2021 20:44:36 -0000 The branch main has been updated by kib: URL: https://cgit.FreeBSD.org/src/commit/?id=2fd1ffefaa4d2cd99a19f866a949cb2cd58ef998 commit 2fd1ffefaa4d2cd99a19f866a949cb2cd58ef998 Author: Konstantin Belousov AuthorDate: 2021-03-05 23:29:08 +0000 Commit: Konstantin Belousov CommitDate: 2021-04-09 20:43:51 +0000 Stop arming kqueue timers on knote owner suspend or terminate This way, even if the process specified very tight reschedule intervals, it should be stoppable/killable. Reported and reviewed by: markj Tested by: markj, pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D29106 --- sys/kern/init_main.c | 1 + sys/kern/kern_event.c | 60 ++++++++++++++++++++++++++++++++++++++++++++------ sys/kern/kern_fork.c | 1 + sys/kern/kern_sig.c | 1 + sys/kern/sys_process.c | 1 + sys/sys/proc.h | 4 ++++ 6 files changed, 61 insertions(+), 7 deletions(-) diff --git a/sys/kern/init_main.c b/sys/kern/init_main.c index 2d6a4f636240..86cc2494272f 100644 --- a/sys/kern/init_main.c +++ b/sys/kern/init_main.c @@ -524,6 +524,7 @@ proc0_init(void *dummy __unused) callout_init_mtx(&p->p_itcallout, &p->p_mtx, 0); callout_init_mtx(&p->p_limco, &p->p_mtx, 0); callout_init(&td->td_slpcallout, 1); + TAILQ_INIT(&p->p_kqtim_stop); /* Create credentials. */ newcred = crget(); diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c index 5e9f1fc35dfe..31b091e20984 100644 --- a/sys/kern/kern_event.c +++ b/sys/kern/kern_event.c @@ -676,8 +676,10 @@ timer2sbintime(int64_t data, int flags) struct kq_timer_cb_data { struct callout c; + struct proc *p; struct knote *kn; int cpuid; + TAILQ_ENTRY(kq_timer_cb_data) link; sbintime_t next; /* next timer event fires at */ sbintime_t to; /* precalculated timer period, 0 for abs */ }; @@ -689,22 +691,65 @@ kqtimer_sched_callout(struct kq_timer_cb_data *kc) kc->cpuid, C_ABSOLUTE); } +void +kqtimer_proc_continue(struct proc *p) +{ + struct kq_timer_cb_data *kc, *kc1; + struct bintime bt; + sbintime_t now; + + PROC_LOCK_ASSERT(p, MA_OWNED); + + getboottimebin(&bt); + now = bttosbt(bt); + + TAILQ_FOREACH_SAFE(kc, &p->p_kqtim_stop, link, kc1) { + TAILQ_REMOVE(&p->p_kqtim_stop, kc, link); + if (kc->next <= now) + filt_timerexpire(kc->kn); + else + kqtimer_sched_callout(kc); + } +} + static void filt_timerexpire(void *knx) { struct knote *kn; struct kq_timer_cb_data *kc; + struct proc *p; + sbintime_t now; kn = knx; - kn->kn_data++; - KNOTE_ACTIVATE(kn, 0); /* XXX - handle locking */ - - if ((kn->kn_flags & EV_ONESHOT) != 0) - return; kc = kn->kn_ptr.p_v; - if (kc->to == 0) + + if ((kn->kn_flags & EV_ONESHOT) != 0 || kc->to == 0) { + kn->kn_data++; + KNOTE_ACTIVATE(kn, 0); return; - kc->next += kc->to; + } + + for (now = sbinuptime(); kc->next <= now; kc->next += kc->to) + kn->kn_data++; + KNOTE_ACTIVATE(kn, 0); /* XXX - handle locking */ + + /* + * Initial check for stopped kc->p is racy. It is fine to + * miss the set of the stop flags, at worst we would schedule + * one more callout. On the other hand, it is not fine to not + * schedule when we we missed clearing of the flags, we + * recheck them under the lock and observe consistent state. + */ + p = kc->p; + if (P_SHOULDSTOP(p) || P_KILLED(p)) { + PROC_LOCK(p); + if (P_SHOULDSTOP(p) || P_KILLED(p)) { + TAILQ_INSERT_TAIL(&p->p_kqtim_stop, kc, link); + PROC_UNLOCK(p); + return; + } + PROC_UNLOCK(p); + } kqtimer_sched_callout(kc); } @@ -762,6 +807,7 @@ filt_timerattach(struct knote *kn) kn->kn_status &= ~KN_DETACHED; /* knlist_add clears it */ kn->kn_ptr.p_v = kc = malloc(sizeof(*kc), M_KQUEUE, M_WAITOK); kc->kn = kn; + kc->p = curproc; kc->cpuid = PCPU_GET(cpuid); callout_init(&kc->c, 1); filt_timerstart(kn, to); diff --git a/sys/kern/kern_fork.c b/sys/kern/kern_fork.c index 3da8205d8ab0..2a092b192878 100644 --- a/sys/kern/kern_fork.c +++ b/sys/kern/kern_fork.c @@ -606,6 +606,7 @@ do_fork(struct thread *td, struct fork_req *fr, struct proc *p2, struct thread * LIST_INIT(&p2->p_orphans); callout_init_mtx(&p2->p_itcallout, &p2->p_mtx, 0); + TAILQ_INIT(&p2->p_kqtim_stop); /* * This begins the section where we must prevent the parent diff --git a/sys/kern/kern_sig.c b/sys/kern/kern_sig.c index 212b4997dd5e..46b520030dcd 100644 --- a/sys/kern/kern_sig.c +++ b/sys/kern/kern_sig.c @@ -2424,6 +2424,7 @@ runfast: PROC_SUNLOCK(p); out_cont: itimer_proc_continue(p); + kqtimer_proc_continue(p); out: /* If we jump here, proc slock should not be owned. */ PROC_SLOCK_ASSERT(p, MA_NOTOWNED); diff --git a/sys/kern/sys_process.c b/sys/kern/sys_process.c index 8c0f743aef8a..51c7bd662600 100644 --- a/sys/kern/sys_process.c +++ b/sys/kern/sys_process.c @@ -1095,6 +1095,7 @@ kern_ptrace(struct thread *td, int req, pid_t pid, void *addr, int data) thread_unsuspend(p); PROC_SUNLOCK(p); itimer_proc_continue(p); + kqtimer_proc_continue(p); break; case PT_WRITE_I: diff --git a/sys/sys/proc.h b/sys/sys/proc.h index 8d41b9b20f10..b82de183aa44 100644 --- a/sys/sys/proc.h +++ b/sys/sys/proc.h @@ -183,6 +183,7 @@ struct kaudit_record; struct kcov_info; struct kdtrace_proc; struct kdtrace_thread; +struct kq_timer_cb_data; struct mqueue_notifier; struct p_sched; struct proc; @@ -727,6 +728,8 @@ struct proc { */ LIST_ENTRY(proc) p_orphan; /* (e) List of orphan processes. */ LIST_HEAD(, proc) p_orphans; /* (e) Pointer to list of orphans. */ + + TAILQ_HEAD(, kq_timer_cb_data) p_kqtim_stop; /* (c) */ }; #define p_session p_pgrp->pg_session @@ -1093,6 +1096,7 @@ void fork_exit(void (*)(void *, struct trapframe *), void *, void fork_return(struct thread *, struct trapframe *); int inferior(struct proc *p); void itimer_proc_continue(struct proc *p); +void kqtimer_proc_continue(struct proc *p); void kern_proc_vmmap_resident(struct vm_map *map, struct vm_map_entry *entry, int *resident_count, bool *super); void kern_yield(int);