Date: Thu, 26 Oct 2023 19:07:49 GMT
Message-Id: <202310261907.39QJ7nux041090@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Konstantin Belousov
Subject: git: 46a2f8227470 - stable/13 - Add membarrier(2)
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: kib
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/13
X-Git-Reftype: branch
X-Git-Commit: 46a2f82274701912ef16ba2ef48ab30f4eb6a227
Auto-Submitted: auto-generated

The branch stable/13 has been updated by kib:

URL: https://cgit.FreeBSD.org/src/commit/?id=46a2f82274701912ef16ba2ef48ab30f4eb6a227

commit 46a2f82274701912ef16ba2ef48ab30f4eb6a227
Author:     Konstantin Belousov
AuthorDate: 2021-10-07 21:10:07 +0000
Commit:     Konstantin Belousov
CommitDate: 2023-10-26 04:07:29 +0000

    Add membarrier(2)

    (cherry picked from commit 4a69fc16a583face922319c476f3e739d9ce9140)
---
 lib/libc/sys/Symbol.map    |   1 +
 sys/conf/files             |   1 +
 sys/kern/kern_exec.c       |   2 +
 sys/kern/kern_membarrier.c | 239 +++++++++++++++++++++++++++++++++++++++++++++
 sys/kern/syscalls.master   |   8 ++
 sys/sys/membarrier.h       |  70 +++++++++++++
 sys/sys/proc.h             |   6 ++
 sys/sys/syscallsubr.h      |   2 +
 8 files changed, 329 insertions(+)

diff --git a/lib/libc/sys/Symbol.map b/lib/libc/sys/Symbol.map
index a5cafd1977b3..f9af2922ed3c 100644
--- a/lib/libc/sys/Symbol.map
+++ b/lib/libc/sys/Symbol.map
@@ -418,6 +418,7 @@ FBSD_1.6 {
 FBSD_1.7 {
 	 _Fork;
 	kqueuex;
+	membarrier;
 	swapoff;
 };
 
diff --git a/sys/conf/files b/sys/conf/files
index 5f2e09eb8c2f..4b624a71c772 100644
--- a/sys/conf/files
+++ b/sys/conf/files
@@ -3874,6 +3874,7 @@ kern/kern_lockstat.c		optional kdtrace_hooks
 kern/kern_loginclass.c		standard
 kern/kern_malloc.c		standard
 kern/kern_mbuf.c		standard
+kern/kern_membarrier.c		standard
 kern/kern_mib.c			standard
 kern/kern_module.c		standard
 kern/kern_mtxpool.c		standard
diff --git a/sys/kern/kern_exec.c b/sys/kern/kern_exec.c
index c02e644aae91..e8e3d8d8801d 100644
--- a/sys/kern/kern_exec.c
+++ b/sys/kern/kern_exec.c
@@ -827,6 +827,8 @@ interpret:
 	p->p_flag2 &= ~P2_NOTRACE;
 	if ((p->p_flag2 & P2_STKGAP_DISABLE_EXEC) == 0)
 		p->p_flag2 &= ~P2_STKGAP_DISABLE;
+	p->p_flag2 &= ~(P2_MEMBAR_PRIVE | P2_MEMBAR_PRIVE_SYNCORE |
+	    P2_MEMBAR_GLOBE);
 	if (p->p_flag & P_PPWAIT) {
 		p->p_flag &= ~(P_PPWAIT | P_PPTRACE);
 		cv_broadcast(&p->p_pwait);
diff --git a/sys/kern/kern_membarrier.c b/sys/kern/kern_membarrier.c
new file mode 100644
index 000000000000..eabd00e8ddf4
--- /dev/null
+++ b/sys/kern/kern_membarrier.c
@@ -0,0 +1,239 @@
+/*-
+ * Copyright (c) 2021 The FreeBSD Foundation
+ *
+ * This software were developed by Konstantin Belousov
+ * under sponsorship from the FreeBSD Foundation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+
+#define MEMBARRIER_SUPPORTED_CMDS	(			\
+    MEMBARRIER_CMD_GLOBAL |					\
+    MEMBARRIER_CMD_GLOBAL_EXPEDITED |				\
+    MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED |			\
+    MEMBARRIER_CMD_PRIVATE_EXPEDITED |				\
+    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED |			\
+    MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE |		\
+    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE)
+
+static void
+membarrier_action_seqcst(void *arg __unused)
+{
+	atomic_thread_fence_seq_cst();
+}
+
+static void
+membarrier_action_seqcst_sync_core(void *arg __unused)
+{
+	atomic_thread_fence_seq_cst();
+	cpu_sync_core();
+}
+
+static void
+do_membarrier_ipi(cpuset_t *csp, void (*func)(void *))
+{
+	atomic_thread_fence_seq_cst();
+	smp_rendezvous_cpus(*csp, smp_no_rendezvous_barrier, func,
+	    smp_no_rendezvous_barrier, NULL);
+	atomic_thread_fence_seq_cst();
+}
+
+static void
+check_cpu_switched(int c, cpuset_t *csp, uint64_t *swt, bool init)
+{
+	struct pcpu *pc;
+	uint64_t sw;
+
+	if (CPU_ISSET(c, csp))
+		return;
+
+	pc = cpuid_to_pcpu[c];
+	if (pc->pc_curthread == pc->pc_idlethread) {
+		CPU_SET(c, csp);
+		return;
+	}
+
+	/*
+	 * Sync with context switch to ensure that override of
+	 * pc_curthread with non-idle thread pointer is visible before
+	 * reading of pc_switchtime.
+	 */
+	atomic_thread_fence_acq();
+
+	sw = pc->pc_switchtime;
+	if (init)
+		swt[c] = sw;
+	else if (sw != swt[c])
+		CPU_SET(c, csp);
+}
+
+/*
+ *
+ * XXXKIB: We execute the requested action (seq_cst and possibly
+ * sync_core) on current CPU as well.  There is no guarantee that
+ * current thread executes anything with the full fence semantics
+ * during syscall execution.  Similarly, cpu_sync_core() semantics
+ * might be not provided by the syscall return.  E.g. on amd64 we
+ * typically return without IRET.
+ */
+int
+kern_membarrier(struct thread *td, int cmd, unsigned flags, int cpu_id)
+{
+	struct proc *p, *p1;
+	struct thread *td1;
+	cpuset_t cs;
+	uint64_t *swt;
+	int c, error;
+	bool first;
+
+	if (flags != 0 || (cmd & ~MEMBARRIER_SUPPORTED_CMDS) != 0)
+		return (EINVAL);
+
+	if (cmd == MEMBARRIER_CMD_QUERY) {
+		td->td_retval[0] = MEMBARRIER_SUPPORTED_CMDS;
+		return (0);
+	}
+
+	p = td->td_proc;
+	error = 0;
+
+	switch (cmd) {
+	case MEMBARRIER_CMD_GLOBAL:
+		swt = malloc((mp_maxid + 1) * sizeof(*swt), M_TEMP, M_WAITOK);
+		CPU_ZERO(&cs);
+		sched_pin();
+		CPU_SET(PCPU_GET(cpuid), &cs);
+		for (first = true; error == 0; first = false) {
+			CPU_FOREACH(c)
+				check_cpu_switched(c, &cs, swt, first);
+			if (CPU_CMP(&cs, &all_cpus) == 0)
+				break;
+			error = pause_sig("mmbr", 1);
+			if (error == EWOULDBLOCK)
+				error = 0;
+		}
+		sched_unpin();
+		free(swt, M_TEMP);
+		atomic_thread_fence_seq_cst();
+		break;
+
+	case MEMBARRIER_CMD_GLOBAL_EXPEDITED:
+		if ((td->td_proc->p_flag2 & P2_MEMBAR_GLOBE) == 0) {
+			error = EPERM;
+		} else {
+			CPU_ZERO(&cs);
+			CPU_FOREACH(c) {
+				td1 = cpuid_to_pcpu[c]->pc_curthread;
+				p1 = td1->td_proc;
+				if (p1 != NULL &&
+				    (p1->p_flag2 & P2_MEMBAR_GLOBE) != 0)
+					CPU_SET(c, &cs);
+			}
+			do_membarrier_ipi(&cs, membarrier_action_seqcst);
+		}
+		break;
+
+	case MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED:
+		if ((p->p_flag2 & P2_MEMBAR_GLOBE) == 0) {
+			PROC_LOCK(p);
+			p->p_flag2 |= P2_MEMBAR_GLOBE;
+			PROC_UNLOCK(p);
+		}
+		break;
+
+	case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
+		if ((td->td_proc->p_flag2 & P2_MEMBAR_PRIVE) == 0) {
+			error = EPERM;
+		} else {
+			pmap_active_cpus(vmspace_pmap(p->p_vmspace), &cs);
+			do_membarrier_ipi(&cs, membarrier_action_seqcst);
+		}
+		break;
+
+	case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED:
+		if ((p->p_flag2 & P2_MEMBAR_PRIVE) == 0) {
+			PROC_LOCK(p);
+			p->p_flag2 |= P2_MEMBAR_PRIVE;
+			PROC_UNLOCK(p);
+		}
+		break;
+
+	case MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE:
+		if ((td->td_proc->p_flag2 & P2_MEMBAR_PRIVE_SYNCORE) == 0) {
+			error = EPERM;
+		} else {
+			/*
+			 * Calculating the IPI multicast mask from
+			 * pmap active mask means that we do not call
+			 * cpu_sync_core() on CPUs that were missed
+			 * from pmap active mask but could be switched
+			 * from or to meantime.  This is fine at least
+			 * on amd64 because threads always use slow
+			 * (IRETQ) path to return from syscall after
+			 * context switch.
+			 */
+			pmap_active_cpus(vmspace_pmap(p->p_vmspace), &cs);
+
+			do_membarrier_ipi(&cs,
+			    membarrier_action_seqcst_sync_core);
+		}
+		break;
+
+	case MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE:
+		if ((p->p_flag2 & P2_MEMBAR_PRIVE_SYNCORE) == 0) {
+			PROC_LOCK(p);
+			p->p_flag2 |= P2_MEMBAR_PRIVE_SYNCORE;
+			PROC_UNLOCK(p);
+		}
+		break;
+
+	default:
+		error = EINVAL;
+		break;
+	}
+
+	return (error);
+}
+
+int
+sys_membarrier(struct thread *td, struct membarrier_args *uap)
+{
+	return (kern_membarrier(td, uap->cmd, uap->flags, uap->cpu_id));
+}
diff --git a/sys/kern/syscalls.master b/sys/kern/syscalls.master
index d383a50ce3d1..0a977ca0eff3 100644
--- a/sys/kern/syscalls.master
+++ b/sys/kern/syscalls.master
@@ -3287,6 +3287,14 @@
 		    u_int flags
 		);
 	}
+584	AUE_NULL	STD|CAPENABLED {
+		int membarrier(
+		    int cmd,
+		    unsigned flags,
+		    int cpu_id
+		);
+	}
+
 
 ; Please copy any additions and changes to the following compatability tables:
 ; sys/compat/freebsd32/syscalls.master
diff --git a/sys/sys/membarrier.h b/sys/sys/membarrier.h
new file mode 100644
index 000000000000..958b769da23e
--- /dev/null
+++ b/sys/sys/membarrier.h
@@ -0,0 +1,70 @@
+/*-
+ * Copyright (c) 2021 The FreeBSD Foundation
+ *
+ * This software were developed by Konstantin Belousov
+ * under sponsorship from the FreeBSD Foundation.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#ifndef __SYS_MEMBARRIER_H__
+#define __SYS_MEMBARRIER_H__
+
+#include
+
+/*
+ * The enum membarrier_cmd values are bits.  The MEMBARRIER_CMD_QUERY
+ * command returns a bitset indicating which commands are supported.
+ * Also the value of MEMBARRIER_CMD_QUERY is zero, so it is
+ * effectively not returned by the query.
+ */
+enum membarrier_cmd {
+	MEMBARRIER_CMD_QUERY = 0x00000000,
+	MEMBARRIER_CMD_GLOBAL = 0x00000001,
+	MEMBARRIER_CMD_SHARED = MEMBARRIER_CMD_GLOBAL,
+	MEMBARRIER_CMD_GLOBAL_EXPEDITED = 0x00000002,
+	MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED = 0x00000004,
+	MEMBARRIER_CMD_PRIVATE_EXPEDITED = 0x00000008,
+	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED = 0x00000010,
+	MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE = 0x00000020,
+	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE = 0x00000040,
+
+	/*
+	 * RSEQ constants are defined for source compatibility but are
+	 * not yet supported, MEMBARRIER_CMD_QUERY does not return
+	 * them in the mask.
+	 */
+	MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ = 0x00000080,
+	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ = 0x00000100,
+};
+
+enum membarrier_cmd_flag {
+	MEMBARRIER_CMD_FLAG_CPU = 0x00000001,
+};
+
+#ifndef _KERNEL
+__BEGIN_DECLS
+int membarrier(int, unsigned, int);
+__END_DECLS
+#endif /* _KERNEL */
+
+#endif /* __SYS_MEMBARRIER_H__ */
diff --git a/sys/sys/proc.h b/sys/sys/proc.h
index 369bc607e9c7..b279839dbf8d 100644
--- a/sys/sys/proc.h
+++ b/sys/sys/proc.h
@@ -838,6 +838,12 @@ struct proc {
 						   external thread_single() is
 						   permitted */
 #define	P2_REAPKILLED		0x00080000
+#define	P2_MEMBAR_PRIVE		0x00100000 /* membar private expedited
+					      registered */
+#define	P2_MEMBAR_PRIVE_SYNCORE	0x00200000 /* membar private expedited
+					      sync core registered */
+#define	P2_MEMBAR_GLOBE		0x00400000 /* membar global expedited
+					      registered */
 
 /* Flags protected by proctree_lock, kept in p_treeflags. */
 #define	P_TREE_ORPHANED		0x00000001	/* Reparented, on orphan list */
diff --git a/sys/sys/syscallsubr.h b/sys/sys/syscallsubr.h
index de3c7780fc2c..aabcb19448cc 100644
--- a/sys/sys/syscallsubr.h
+++ b/sys/sys/syscallsubr.h
@@ -201,6 +201,8 @@ int	kern_minherit(struct thread *td, uintptr_t addr, size_t len,
 	    int inherit);
 int	kern_mkdirat(struct thread *td, int fd, const char *path,
 	    enum uio_seg segflg, int mode);
+int	kern_membarrier(struct thread *td, int cmd, unsigned flags,
+	    int cpu_id);
 int	kern_mkfifoat(struct thread *td, int fd, const char *path,
 	    enum uio_seg pathseg, int mode);
 int	kern_mknodat(struct thread *td, int fd, const char *path,
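
The header comment in sys/sys/membarrier.h states that the command values are bits and that MEMBARRIER_CMD_QUERY returns a bitset of the supported commands. A minimal userland sketch of testing such a mask follows. The enum values are copied from the diff above; membar_cmd_supported() and EXAMPLE_SUPPORTED_CMDS are hypothetical names used only for illustration, and the mask is hard-coded to mirror MEMBARRIER_SUPPORTED_CMDS rather than obtained from the syscall, so the snippet also builds on systems without membarrier(2).

```c
#include <assert.h>
#include <stdbool.h>

/* Command values copied from the sys/sys/membarrier.h added by this commit. */
enum membarrier_cmd {
	MEMBARRIER_CMD_QUERY = 0x00000000,
	MEMBARRIER_CMD_GLOBAL = 0x00000001,
	MEMBARRIER_CMD_GLOBAL_EXPEDITED = 0x00000002,
	MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED = 0x00000004,
	MEMBARRIER_CMD_PRIVATE_EXPEDITED = 0x00000008,
	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED = 0x00000010,
	MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE = 0x00000020,
	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE = 0x00000040,
	MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ = 0x00000080,
	MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ = 0x00000100,
};

/*
 * Illustrative stand-in (an assumption, not a kernel symbol) for the
 * value a MEMBARRIER_CMD_QUERY call would return; it mirrors
 * MEMBARRIER_SUPPORTED_CMDS in kern_membarrier.c.
 */
#define	EXAMPLE_SUPPORTED_CMDS	(				\
    MEMBARRIER_CMD_GLOBAL |					\
    MEMBARRIER_CMD_GLOBAL_EXPEDITED |				\
    MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED |			\
    MEMBARRIER_CMD_PRIVATE_EXPEDITED |				\
    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED |			\
    MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE |		\
    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE)

/*
 * Hypothetical helper: report whether 'cmd' is advertised in a mask
 * returned by MEMBARRIER_CMD_QUERY.  QUERY itself is zero, so it is
 * never present in the mask and is reported as false here.
 */
static bool
membar_cmd_supported(int query_mask, int cmd)
{
	assert(cmd >= 0);
	return (cmd != MEMBARRIER_CMD_QUERY && (query_mask & cmd) == cmd);
}
```

On a system carrying this commit, the mask would instead come from membarrier(MEMBARRIER_CMD_QUERY, 0, 0), following the prototype declared in the header. A caller would check the relevant bit and issue the matching REGISTER command before an expedited command, since kern_membarrier() returns EPERM for unregistered processes.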