From nobody Thu Sep 21 04:13:04 2023
Date: Thu, 21 Sep 2023 04:13:04 GMT
Message-Id: <202309210413.38L4D42w076513@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Zhenlei Huang
Subject: git: cf7974fd9e55 - main - sysctl: Update 'master' copy of vnet SYSCTLs on kernel environment variables change
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@freebsd.org
X-BeenThere: dev-commits-src-all@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: zlei
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: cf7974fd9e554552989237c3d6bc736d672ac7c6
Auto-Submitted: auto-generated

The branch main has been updated by zlei:

URL: https://cgit.FreeBSD.org/src/commit/?id=cf7974fd9e554552989237c3d6bc736d672ac7c6

commit cf7974fd9e554552989237c3d6bc736d672ac7c6
Author:     Zhenlei Huang
AuthorDate: 2023-09-21 04:11:28 +0000
Commit:     Zhenlei Huang
CommitDate: 2023-09-21 04:11:28 +0000

    sysctl: Update 'master' copy of vnet SYSCTLs on kernel environment variables change

    Complete phase three of 3da1cf1e88f8.

    With commit 110113bc086f, vnet sysctl variables can be loader tunable,
    but the feature is limited. Once the kernel modules have been
    initialized, any changes (e.g. via kenv) to kernel environment
    variables will not affect subsequently created VNETs.

    This change relaxes the limitation by listening for kernel environment
    variable set / unset events, and then updating the 'master' copy of
    the vnet SYSCTL or restoring it to its initial value.

    With this change, most uses of TUNABLE_XXX_FETCH can be eliminated
    for vnet loader tunables.

    Reviewed by:    glebius
    Fixes:          110113bc086f sysctl(9): Enable vnet sysctl variables to be loader tunable
    MFC after:      2 weeks
    Differential Revision:  https://reviews.freebsd.org/D41825
---
 sys/kern/kern_environment.c |   3 ++
 sys/kern/kern_sysctl.c      | 107 +++++++++++++++++++++++++++++++++++++++++++-
 sys/kern/link_elf.c         |   2 +
 sys/kern/link_elf_obj.c     |   8 ++++
 sys/net/vnet.c              |  33 ++++++++++++++
 sys/net/vnet.h              |   6 +++
 sys/sys/eventhandler.h      |   5 +++
 7 files changed, 162 insertions(+), 2 deletions(-)

diff --git a/sys/kern/kern_environment.c b/sys/kern/kern_environment.c
index 761734674bdf..a0967d044a96 100644
--- a/sys/kern/kern_environment.c
+++ b/sys/kern/kern_environment.c
@@ -38,6 +38,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -666,6 +667,7 @@ kern_setenv(const char *name, const char *value)
 		kenvp[i + 1] = NULL;
 		mtx_unlock(&kenv_lock);
 	}
+	EVENTHANDLER_INVOKE(setenv, name);
 	return (0);
 }
 
@@ -689,6 +691,7 @@ kern_unsetenv(const char *name)
 		kenvp[i] = NULL;
 		mtx_unlock(&kenv_lock);
 		zfree(oldenv, M_KENV);
+		EVENTHANDLER_INVOKE(unsetenv, name);
 		return (0);
 	}
 	mtx_unlock(&kenv_lock);
diff --git a/sys/kern/kern_sysctl.c b/sys/kern/kern_sysctl.c
index a1d502d58bff..780eb6099b07 100644
--- a/sys/kern/kern_sysctl.c
+++ b/sys/kern/kern_sysctl.c
@@ -127,6 +127,7 @@ static int sysctl_remove_oid_locked(struct sysctl_oid *oidp, int del,
 		    int recurse);
 static int sysctl_old_kernel(struct sysctl_req *, const void *, size_t);
 static int sysctl_new_kernel(struct sysctl_req *, void *, size_t);
+static int name2oid(char *, int *, int *, struct sysctl_oid **);
 
 static struct sysctl_oid *
 sysctl_find_oidname(const char *name, struct sysctl_oid_list *list)
@@ -512,8 +513,14 @@ sysctl_register_oid(struct sysctl_oid *oidp)
 	if ((oidp->oid_kind & CTLTYPE) != CTLTYPE_NODE &&
 	    (oidp->oid_kind & CTLFLAG_TUN) != 0 &&
 	    (oidp->oid_kind & CTLFLAG_NOFETCH) == 0) {
-		/* only fetch value once */
-		oidp->oid_kind |= CTLFLAG_NOFETCH;
+#ifdef VIMAGE
+		/*
+		 * Can fetch value multiple times for VNET loader tunables.
+		 * Only fetch once for non-VNET loader tunables.
+		 */
+		if ((oidp->oid_kind & CTLFLAG_VNET) == 0)
+#endif
+			oidp->oid_kind |= CTLFLAG_NOFETCH;
 		/* try to fetch value from kernel environment */
 		sysctl_load_tunable_by_oid_locked(oidp);
 	}
@@ -969,6 +976,102 @@ sysctl_register_all(void *arg)
 }
 SYSINIT(sysctl, SI_SUB_KMEM, SI_ORDER_FIRST, sysctl_register_all, NULL);
 
+#ifdef VIMAGE
+static void
+sysctl_setenv_vnet(void *arg __unused, char *name)
+{
+	struct sysctl_oid *oidp;
+	int oid[CTL_MAXNAME];
+	int error, nlen;
+
+	SYSCTL_WLOCK();
+	error = name2oid(name, oid, &nlen, &oidp);
+	if (error)
+		goto out;
+
+	if ((oidp->oid_kind & CTLTYPE) != CTLTYPE_NODE &&
+	    (oidp->oid_kind & CTLFLAG_VNET) != 0 &&
+	    (oidp->oid_kind & CTLFLAG_TUN) != 0 &&
+	    (oidp->oid_kind & CTLFLAG_NOFETCH) == 0) {
+		/* Update value from kernel environment */
+		sysctl_load_tunable_by_oid_locked(oidp);
+	}
+out:
+	SYSCTL_WUNLOCK();
+}
+
+static void
+sysctl_unsetenv_vnet(void *arg __unused, char *name)
+{
+	struct sysctl_oid *oidp;
+	int oid[CTL_MAXNAME];
+	int error, nlen;
+
+	SYSCTL_WLOCK();
+	/*
+	 * The setenv / unsetenv event handlers are invoked by kern_setenv() /
+	 * kern_unsetenv() without exclusive locks. It is rare but still possible
+	 * that the invoke order of event handlers is different from that of
+	 * kern_setenv() and kern_unsetenv().
+	 * Re-check environment variable string to make sure it is unset.
+	 */
+	if (testenv(name))
+		goto out;
+	error = name2oid(name, oid, &nlen, &oidp);
+	if (error)
+		goto out;
+
+	if ((oidp->oid_kind & CTLTYPE) != CTLTYPE_NODE &&
+	    (oidp->oid_kind & CTLFLAG_VNET) != 0 &&
+	    (oidp->oid_kind & CTLFLAG_TUN) != 0 &&
+	    (oidp->oid_kind & CTLFLAG_NOFETCH) == 0) {
+		size_t size;
+
+		switch (oidp->oid_kind & CTLTYPE) {
+		case CTLTYPE_INT:
+		case CTLTYPE_UINT:
+			size = sizeof(int);
+			break;
+		case CTLTYPE_LONG:
+		case CTLTYPE_ULONG:
+			size = sizeof(long);
+			break;
+		case CTLTYPE_S8:
+		case CTLTYPE_U8:
+			size = sizeof(int8_t);
+			break;
+		case CTLTYPE_S16:
+		case CTLTYPE_U16:
+			size = sizeof(int16_t);
+			break;
+		case CTLTYPE_S32:
+		case CTLTYPE_U32:
+			size = sizeof(int32_t);
+			break;
+		case CTLTYPE_S64:
+		case CTLTYPE_U64:
+			size = sizeof(int64_t);
+			break;
+		case CTLTYPE_STRING:
+			MPASS(oidp->oid_arg2 > 0);
+			size = oidp->oid_arg2;
+			break;
+		default:
+			goto out;
+		}
+		vnet_restore_init(oidp->oid_arg1, size);
+	}
+out:
+	SYSCTL_WUNLOCK();
+}
+
+/*
+ * Register the kernel's setenv / unsetenv events.
+ */
+EVENTHANDLER_DEFINE(setenv, sysctl_setenv_vnet, NULL, EVENTHANDLER_PRI_ANY);
+EVENTHANDLER_DEFINE(unsetenv, sysctl_unsetenv_vnet, NULL, EVENTHANDLER_PRI_ANY);
+#endif
+
 /*
  * "Staff-functions"
  *
diff --git a/sys/kern/link_elf.c b/sys/kern/link_elf.c
index 568f1e1dbd95..eb7ce3828deb 100644
--- a/sys/kern/link_elf.c
+++ b/sys/kern/link_elf.c
@@ -506,6 +506,7 @@ link_elf_init(void* arg)
 	TAILQ_INIT(&set_pcpu_list);
 #ifdef VIMAGE
 	TAILQ_INIT(&set_vnet_list);
+	vnet_save_init((void *)VNET_START, VNET_STOP - VNET_START);
 #endif
 }
 
@@ -767,6 +768,7 @@ parse_vnet(elf_file_t ef)
 		return (ENOSPC);
 	}
 	memcpy((void *)ef->vnet_base, (void *)ef->vnet_start, size);
+	vnet_save_init((void *)ef->vnet_base, size);
 	elf_set_add(&set_vnet_list, ef->vnet_start, ef->vnet_stop,
 	    ef->vnet_base);
 
diff --git a/sys/kern/link_elf_obj.c b/sys/kern/link_elf_obj.c
index d4ad963e8181..0b2befc02c1a 100644
--- a/sys/kern/link_elf_obj.c
+++ b/sys/kern/link_elf_obj.c
@@ -547,6 +547,8 @@ link_elf_link_preload(linker_class_t cls, const char *filename,
 				memcpy(vnet_data, ef->progtab[pb].addr,
 				    ef->progtab[pb].size);
 				ef->progtab[pb].addr = vnet_data;
+				vnet_save_init(ef->progtab[pb].addr,
+				    ef->progtab[pb].size);
 #endif
 			} else if ((ef->progtab[pb].name != NULL &&
 			    strcmp(ef->progtab[pb].name, ".ctors") == 0) ||
@@ -1120,6 +1122,12 @@ link_elf_load_file(linker_class_t cls, const char *filename,
 			} else
 				bzero(ef->progtab[pb].addr, shdr[i].sh_size);
 
+#ifdef VIMAGE
+			if (ef->progtab[pb].addr != (void *)mapbase &&
+			    strcmp(ef->progtab[pb].name, VNET_SETNAME) == 0)
+				vnet_save_init(ef->progtab[pb].addr,
+				    ef->progtab[pb].size);
+#endif
 			/* Update all symbol values with the offset. */
 			for (j = 0; j < ef->ddbsymcnt; j++) {
 				es = &ef->ddbsymtab[j];
diff --git a/sys/net/vnet.c b/sys/net/vnet.c
index c4a623698341..ac937125a19d 100644
--- a/sys/net/vnet.c
+++ b/sys/net/vnet.c
@@ -178,6 +178,11 @@ static MALLOC_DEFINE(M_VNET_DATA, "vnet_data", "VNET data");
  */
 VNET_DEFINE_STATIC(char, modspace[VNET_MODMIN] __aligned(__alignof(void *)));
 
+/*
+ * A copy of the initial values of all virtualized global variables.
+ */
+static uintptr_t vnet_init_var;
+
 /*
  * Global lists of subsystem constructor and destructors for vnets. They are
  * registered via VNET_SYSINIT() and VNET_SYSUNINIT(). Both lists are
@@ -356,6 +361,7 @@ vnet_data_startup(void *dummy __unused)
 	df->vnd_len = VNET_MODMIN;
 	TAILQ_INSERT_HEAD(&vnet_data_free_head, df, vnd_link);
 	sx_init(&vnet_data_free_lock, "vnet_data alloc lock");
+	vnet_init_var = (uintptr_t)malloc(VNET_BYTES, M_VNET_DATA, M_WAITOK);
 }
 SYSINIT(vnet_data, SI_SUB_KLD, SI_ORDER_FIRST, vnet_data_startup, NULL);
 
@@ -473,6 +479,33 @@ vnet_data_copy(void *start, int size)
 	VNET_LIST_RUNLOCK();
 }
 
+/*
+ * Save a copy of the initial values of virtualized global variables.
+ */
+void
+vnet_save_init(void *start, size_t size)
+{
+	MPASS(vnet_init_var != 0);
+	MPASS(VNET_START <= (uintptr_t)start &&
+	    (uintptr_t)start + size <= VNET_STOP);
+	memcpy((void *)(vnet_init_var + ((uintptr_t)start - VNET_START)),
+	    start, size);
+}
+
+/*
+ * Restore the 'master' copies of virtualized global variables to theirs
+ * initial values.
+ */
+void
+vnet_restore_init(void *start, size_t size)
+{
+	MPASS(vnet_init_var != 0);
+	MPASS(VNET_START <= (uintptr_t)start &&
+	    (uintptr_t)start + size <= VNET_STOP);
+	memcpy(start,
+	    (void *)(vnet_init_var + ((uintptr_t)start - VNET_START)), size);
+}
+
 /*
  * Support for special SYSINIT handlers registered via VNET_SYSINIT()
  * and VNET_SYSUNINIT().
diff --git a/sys/net/vnet.h b/sys/net/vnet.h
index 1d37fe85eec3..5485889ceaa7 100644
--- a/sys/net/vnet.h
+++ b/sys/net/vnet.h
@@ -311,6 +311,12 @@ void *vnet_data_alloc(int size);
 void vnet_data_copy(void *start, int size);
 void vnet_data_free(void *start_arg, int size);
 
+/*
+ * Interfaces to manipulate the initial values of virtualized global variables.
+ */
+void vnet_save_init(void *, size_t);
+void vnet_restore_init(void *, size_t);
+
 /*
  * Virtual sysinit mechanism, allowing network stack components to declare
  * startup and shutdown methods to be run when virtual network stack
diff --git a/sys/sys/eventhandler.h b/sys/sys/eventhandler.h
index 47024ecf87a9..c0d9811dd1b9 100644
--- a/sys/sys/eventhandler.h
+++ b/sys/sys/eventhandler.h
@@ -326,4 +326,9 @@ struct ifaddr;
 typedef void (*rt_addrmsg_fn)(void *, struct ifaddr *, int);
 EVENTHANDLER_DECLARE(rt_addrmsg, rt_addrmsg_fn);
 
+/* Kernel environment variable change event */
+typedef void (*env_change_fn)(void *, const char *);
+EVENTHANDLER_DECLARE(setenv, env_change_fn);
+EVENTHANDLER_DECLARE(unsetenv, env_change_fn);
+
 #endif /* _SYS_EVENTHANDLER_H_ */
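
Illustration (not part of the commit): the offset arithmetic used by
vnet_save_init() and vnet_restore_init() in the patch above can be tried
outside the kernel. The following is a minimal userland sketch under stated
assumptions, not the kernel code. struct region, master_copy, init_copy,
region_offset(), save_init() and restore_init() are hypothetical stand-ins
for the VNET_START..VNET_STOP area, the vnet_init_var buffer and the two new
functions. The assert() calls play the role of the MPASS() bounds checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the globals between VNET_START and VNET_STOP. */
struct region {
	int ttl;
	long hashsize;
	char name[16];
};

/* 'master' copy: the template that newly created VNETs would be cloned from. */
static struct region master_copy = { 64, 1024, "default" };
/* Snapshot of the initial values (the role vnet_init_var plays in the patch). */
static struct region init_copy;

/* Bounds-check [start, start + size) and return its offset in master_copy. */
static size_t
region_offset(const void *start, size_t size)
{
	uintptr_t base = (uintptr_t)&master_copy;
	uintptr_t addr = (uintptr_t)start;

	assert(base <= addr && addr + size <= base + sizeof(master_copy));
	return ((size_t)(addr - base));
}

/* Copy the current bytes of the range into the snapshot at the same offset. */
static void
save_init(const void *start, size_t size)
{
	size_t off = region_offset(start, size);

	memcpy((unsigned char *)&init_copy + off, start, size);
}

/* Copy the snapshot bytes back over the same range of the master copy. */
static void
restore_init(void *start, size_t size)
{
	size_t off = region_offset(start, size);

	memcpy(start, (const unsigned char *)&init_copy + off, size);
}
```

In this model, calling save_init(&master_copy, sizeof(master_copy)) once
corresponds to the snapshot taken at link time. Overwriting master_copy.ttl
models a tunable being set through the kernel environment, and
restore_init(&master_copy.ttl, sizeof(master_copy.ttl)) models what the
unsetenv handler does for a single variable: only that range is rewound, and
neighbouring variables are left untouched.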