From: Hans Petter Selasky
Date: Fri, 5 Jul 2019 12:26:30 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org
Subject: svn commit: r349763 - in stable/12/sys: kern sys
Message-Id: <201907051226.x65CQUev056366@repo.freebsd.org>

Author: hselasky
Date: Fri Jul  5 12:26:30 2019
New Revision: 349763
URL: https://svnweb.freebsd.org/changeset/base/349763

Log:
  MFC r340404, r340415, r340417, r340419 and r340420:
  Synchronize epoch(9) code with -head to make merging new patches
  and features easier. The FreeBSD version has been bumped to force
  recompilation of external kernel modules.

  Sponsored by:	Mellanox Technologies

  MFC r340404:
  Uninline epoch(9) entrance and exit. There is no proof that modern
  processors would benefit from avoiding a function call, but bloating
  code. In fact, clang created an uninlined real function for many
  object files in the network stack.

  - Move epoch_private.h into subr_epoch.c. Code copied exactly, avoiding
    any changes, including style(9).
  - Remove private copies of critical_enter/exit.

  Reviewed by:	kib, jtl
  Differential Revision:	https://reviews.freebsd.org/D17879

  MFC r340415:
  The dualism between epoch_tracker and epoch_thread is fragile and
  unnecessary. So, expose CK types to kernel and use a single normal
  structure for epoch_tracker.

  Reviewed by:	jtl, gallatin

  MFC r340417:
  With epoch not inlined, there is no point in using _lite KPI. While here,
  remove some unnecessary casts.

  MFC r340419:
  style(9), mostly adjusting overly long lines.

  MFC r340420:
  epoch(9) revert r340097 - no longer a need for multiple sections per cpu

  I spoke with Samy Bahra and recent changes to CK to make ck_epoch_call and
  ck_epoch_poll not modify the record have eliminated the need for this.

Deleted:
  stable/12/sys/sys/epoch_private.h
Modified:
  stable/12/sys/kern/genoffset.c
  stable/12/sys/kern/subr_epoch.c
  stable/12/sys/sys/epoch.h
  stable/12/sys/sys/param.h
Directory Properties:
  stable/12/   (props changed)

Modified: stable/12/sys/kern/genoffset.c
==============================================================================
--- stable/12/sys/kern/genoffset.c	Fri Jul  5 10:31:37 2019	(r349762)
+++ stable/12/sys/kern/genoffset.c	Fri Jul  5 12:26:30 2019	(r349763)
@@ -36,7 +36,6 @@ __FBSDID("$FreeBSD$");
 #include
 #include
 
-OFFSYM(td_pre_epoch_prio, thread, u_char);
 OFFSYM(td_priority, thread, u_char);
 OFFSYM(td_epochnest, thread, u_char);
 OFFSYM(td_critnest, thread, u_int);

Modified: stable/12/sys/kern/subr_epoch.c
==============================================================================
--- stable/12/sys/kern/subr_epoch.c	Fri Jul  5 10:31:37 2019	(r349762)
+++ stable/12/sys/kern/subr_epoch.c	Fri Jul  5 12:26:30 2019	(r349763)
@@ -55,6 +55,27 @@ __FBSDID("$FreeBSD$");
 
 static MALLOC_DEFINE(M_EPOCH, "epoch", "epoch based reclamation");
 
+#ifdef __amd64__
+#define EPOCH_ALIGN CACHE_LINE_SIZE*2
+#else
+#define EPOCH_ALIGN CACHE_LINE_SIZE
+#endif
+
+TAILQ_HEAD (epoch_tdlist, epoch_tracker);
+typedef struct epoch_record {
+	ck_epoch_record_t er_record;
+	volatile struct epoch_tdlist er_tdlist;
+	volatile uint32_t er_gen;
+	uint32_t er_cpuid;
+} __aligned(EPOCH_ALIGN) *epoch_record_t;
+
+struct epoch {
+	struct ck_epoch e_epoch __aligned(EPOCH_ALIGN);
+	epoch_record_t e_pcpu_record;
+	int e_idx;
+	int e_flags;
+};
+
 /* arbitrary --- needs benchmarking */
 #define MAX_ADAPTIVE_SPIN 100
 #define MAX_EPOCHS 64
@@ -119,11 +140,15 @@ epoch_init(void *arg __unused)
 	epoch_call_count = counter_u64_alloc(M_WAITOK);
 	epoch_call_task_count = counter_u64_alloc(M_WAITOK);
 
-	pcpu_zone_record = uma_zcreate("epoch_record pcpu", sizeof(struct epoch_record),
-	    NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_PCPU);
+	pcpu_zone_record = uma_zcreate("epoch_record pcpu",
+	    sizeof(struct epoch_record), NULL, NULL, NULL, NULL,
+	    UMA_ALIGN_PTR, UMA_ZONE_PCPU);
 	CPU_FOREACH(cpu) {
-		GROUPTASK_INIT(DPCPU_ID_PTR(cpu, epoch_cb_task), 0, epoch_call_task, NULL);
-		taskqgroup_attach_cpu(qgroup_softirq, DPCPU_ID_PTR(cpu, epoch_cb_task), NULL, cpu, -1, "epoch call task");
+		GROUPTASK_INIT(DPCPU_ID_PTR(cpu, epoch_cb_task), 0,
+		    epoch_call_task, NULL);
+		taskqgroup_attach_cpu(qgroup_softirq,
+		    DPCPU_ID_PTR(cpu, epoch_cb_task), NULL, cpu, -1,
+		    "epoch call task");
 	}
 	inited = 1;
 	global_epoch = epoch_alloc(0);
@@ -156,6 +181,15 @@ epoch_ctor(epoch_t epoch)
 	}
 }
 
+static void
+epoch_adjust_prio(struct thread *td, u_char prio)
+{
+
+	thread_lock(td);
+	sched_prio(td, prio);
+	thread_unlock(td);
+}
+
 epoch_t
 epoch_alloc(int flags)
 {
@@ -191,45 +225,120 @@ epoch_free(epoch_t epoch)
 	free(epoch, M_EPOCH);
 }
 
+static epoch_record_t
+epoch_currecord(epoch_t epoch)
+{
+
+	return (zpcpu_get_cpu(epoch->e_pcpu_record, curcpu));
+}
+
+#define INIT_CHECK(epoch)					\
+	do {							\
+		if (__predict_false((epoch) == NULL))		\
+			return;					\
+	} while (0)
+
 void
-epoch_enter_preempt_KBI(epoch_t epoch, epoch_tracker_t et)
+epoch_enter_preempt(epoch_t epoch, epoch_tracker_t et)
 {
+	struct epoch_record *er;
+	struct thread *td;
 
-	epoch_enter_preempt(epoch, et);
+	MPASS(cold || epoch != NULL);
+	INIT_CHECK(epoch);
+	MPASS(epoch->e_flags & EPOCH_PREEMPT);
+#ifdef EPOCH_TRACKER_DEBUG
+	et->et_magic_pre = EPOCH_MAGIC0;
+	et->et_magic_post = EPOCH_MAGIC1;
+#endif
+	td = curthread;
+	et->et_td = td;
+	td->td_epochnest++;
+	critical_enter();
+	sched_pin();
+
+	td->td_pre_epoch_prio = td->td_priority;
+	er = epoch_currecord(epoch);
+	TAILQ_INSERT_TAIL(&er->er_tdlist, et, et_link);
+	ck_epoch_begin(&er->er_record, &et->et_section);
+	critical_exit();
 }
 
 void
-epoch_exit_preempt_KBI(epoch_t epoch, epoch_tracker_t et)
+epoch_enter(epoch_t epoch)
 {
+	struct thread *td;
+	epoch_record_t er;
 
-	epoch_exit_preempt(epoch, et);
+	MPASS(cold || epoch != NULL);
+	INIT_CHECK(epoch);
+	td = curthread;
+
+	td->td_epochnest++;
+	critical_enter();
+	er = epoch_currecord(epoch);
+	ck_epoch_begin(&er->er_record, NULL);
 }
 
 void
-epoch_enter_KBI(epoch_t epoch)
+epoch_exit_preempt(epoch_t epoch, epoch_tracker_t et)
 {
+	struct epoch_record *er;
+	struct thread *td;
 
-	epoch_enter(epoch);
+	INIT_CHECK(epoch);
+	td = curthread;
+	critical_enter();
+	sched_unpin();
+	MPASS(td->td_epochnest);
+	td->td_epochnest--;
+	er = epoch_currecord(epoch);
+	MPASS(epoch->e_flags & EPOCH_PREEMPT);
+	MPASS(et != NULL);
+	MPASS(et->et_td == td);
+#ifdef EPOCH_TRACKER_DEBUG
+	MPASS(et->et_magic_pre == EPOCH_MAGIC0);
+	MPASS(et->et_magic_post == EPOCH_MAGIC1);
+	et->et_magic_pre = 0;
+	et->et_magic_post = 0;
+#endif
+#ifdef INVARIANTS
+	et->et_td = (void*)0xDEADBEEF;
+#endif
+	ck_epoch_end(&er->er_record, &et->et_section);
+	TAILQ_REMOVE(&er->er_tdlist, et, et_link);
+	er->er_gen++;
+	if (__predict_false(td->td_pre_epoch_prio != td->td_priority))
+		epoch_adjust_prio(td, td->td_pre_epoch_prio);
+	critical_exit();
 }
 
 void
-epoch_exit_KBI(epoch_t epoch)
+epoch_exit(epoch_t epoch)
 {
+	struct thread *td;
+	epoch_record_t er;
 
-	epoch_exit(epoch);
+	INIT_CHECK(epoch);
+	td = curthread;
+	MPASS(td->td_epochnest);
+	td->td_epochnest--;
+	er = epoch_currecord(epoch);
+	ck_epoch_end(&er->er_record, NULL);
+	critical_exit();
 }
 
 /*
- * epoch_block_handler_preempt is a callback from the ck code when another thread is
- * currently in an epoch section.
+ * epoch_block_handler_preempt() is a callback from the CK code when another
+ * thread is currently in an epoch section.
  */
 static void
-epoch_block_handler_preempt(struct ck_epoch *global __unused, ck_epoch_record_t *cr,
-    void *arg __unused)
+epoch_block_handler_preempt(struct ck_epoch *global __unused,
+    ck_epoch_record_t *cr, void *arg __unused)
 {
 	epoch_record_t record;
 	struct thread *td, *owner, *curwaittd;
-	struct epoch_thread *tdwait;
+	struct epoch_tracker *tdwait;
 	struct turnstile *ts;
 	struct lock_object *lock;
 	int spincount, gen;
@@ -317,25 +426,27 @@ epoch_block_handler_preempt(struct ck_epoch *global __
 		if (TD_IS_INHIBITED(curwaittd) && TD_ON_LOCK(curwaittd) &&
 		    ((ts = curwaittd->td_blocked) != NULL)) {
 			/*
-			 * We unlock td to allow turnstile_wait to reacquire the
-			 * the thread lock. Before unlocking it we enter a critical
-			 * section to prevent preemption after we reenable interrupts
-			 * by dropping the thread lock in order to prevent curwaittd
-			 * from getting to run.
+			 * We unlock td to allow turnstile_wait to reacquire
+			 * the thread lock. Before unlocking it we enter a
+			 * critical section to prevent preemption after we
+			 * reenable interrupts by dropping the thread lock in
+			 * order to prevent curwaittd from getting to run.
 			 */
 			critical_enter();
 			thread_unlock(td);
 			owner = turnstile_lock(ts, &lock);
 			/*
-			 * The owner pointer indicates that the lock succeeded. Only
-			 * in case we hold the lock and the turnstile we locked is still
-			 * the one that curwaittd is blocked on can we continue. Otherwise
-			 * The turnstile pointer has been changed out from underneath
-			 * us, as in the case where the lock holder has signalled curwaittd,
+			 * The owner pointer indicates that the lock succeeded.
+			 * Only in case we hold the lock and the turnstile we
+			 * locked is still the one that curwaittd is blocked on
+			 * can we continue. Otherwise the turnstile pointer has
+			 * been changed out from underneath us, as in the case
+			 * where the lock holder has signalled curwaittd,
 			 * and we need to continue.
 			 */
 			if (owner != NULL && ts == curwaittd->td_blocked) {
-				MPASS(TD_IS_INHIBITED(curwaittd) && TD_ON_LOCK(curwaittd));
+				MPASS(TD_IS_INHIBITED(curwaittd) &&
+				    TD_ON_LOCK(curwaittd));
 				critical_exit();
 				turnstile_wait(ts, owner, curwaittd->td_tsqueue);
 				counter_u64_add(turnstile_count, 1);
@@ -385,9 +496,8 @@ epoch_wait_preempt(epoch_t epoch)
 	if ((epoch->e_flags & EPOCH_LOCKED) == 0)
 		WITNESS_WARN(WARN_GIANTOK | WARN_SLEEPOK, NULL,
 		    "epoch_wait() can be long running");
-	KASSERT(!in_epoch(epoch),
-	    ("epoch_wait_preempt() called in the middle "
-	    "of an epoch section of the same epoch"));
+	KASSERT(!in_epoch(epoch), ("epoch_wait_preempt() called in the middle "
+	    "of an epoch section of the same epoch"));
 #endif
 	thread_lock(td);
 	DROP_GIANT();
@@ -400,7 +510,8 @@ epoch_wait_preempt(epoch_t epoch)
 	td->td_pinned = 0;
 	sched_bind(td, old_cpu);
 
-	ck_epoch_synchronize_wait(&epoch->e_epoch, epoch_block_handler_preempt, NULL);
+	ck_epoch_synchronize_wait(&epoch->e_epoch, epoch_block_handler_preempt,
+	    NULL);
 
 	/* restore CPU binding, if any */
 	if (was_bound != 0) {
@@ -501,7 +612,7 @@ epoch_call_task(void *arg __unused)
 	head = ck_stack_batch_pop_npsc(&cb_stack);
 	for (cursor = head; cursor != NULL; cursor = next) {
 		struct ck_epoch_entry *entry =
-		ck_epoch_entry_container(cursor);
+		    ck_epoch_entry_container(cursor);
 
 		next = CK_STACK_NEXT(cursor);
 		entry->function(entry);
@@ -511,7 +622,7 @@ epoch_call_task(void *arg __unused)
 int
 in_epoch_verbose(epoch_t epoch, int dump_onfail)
 {
-	struct epoch_thread *tdwait;
+	struct epoch_tracker *tdwait;
 	struct thread *td;
 	epoch_record_t er;
 
@@ -544,12 +655,4 @@ int
 in_epoch(epoch_t epoch)
 {
 	return (in_epoch_verbose(epoch, 0));
-}
-
-void
-epoch_adjust_prio(struct thread *td, u_char prio)
-{
-	thread_lock(td);
-	sched_prio(td, prio);
-	thread_unlock(td);
 }

Modified: stable/12/sys/sys/epoch.h
==============================================================================
--- stable/12/sys/sys/epoch.h	Fri Jul  5 10:31:37 2019	(r349762)
+++ stable/12/sys/sys/epoch.h	Fri Jul  5 12:26:30 2019	(r349763)
@@ -29,10 +29,17 @@
 
 #ifndef _SYS_EPOCH_H_
 #define _SYS_EPOCH_H_
+
+struct epoch_context {
+	void *data[2];
+} __aligned(sizeof(void *));
+
+typedef struct epoch_context *epoch_context_t;
+
 #ifdef _KERNEL
 #include
 #include
-#endif
+#include
 
 struct epoch;
 typedef struct epoch *epoch_t;
@@ -43,22 +50,19 @@ typedef struct epoch *epoch_t;
 extern epoch_t global_epoch;
 extern epoch_t global_epoch_preempt;
 
-struct epoch_context {
-	void *data[2];
-} __aligned(sizeof(void *));
-
-typedef struct epoch_context *epoch_context_t;
-
-
 struct epoch_tracker {
-	void *datap[3];
-#ifdef EPOCH_TRACKER_DEBUG
-	int datai[5];
-#else
-	int datai[1];
+#ifdef EPOCH_TRACKER_DEBUG
+#define EPOCH_MAGIC0 0xFADECAFEF00DD00D
+#define EPOCH_MAGIC1 0xBADDBABEDEEDFEED
+	uint64_t et_magic_pre;
 #endif
+	TAILQ_ENTRY(epoch_tracker) et_link;
+	struct thread *et_td;
+	ck_epoch_section_t et_section;
+#ifdef EPOCH_TRACKER_DEBUG
+	uint64_t et_magic_post;
+#endif
 } __aligned(sizeof(void *));
-
 typedef struct epoch_tracker *epoch_tracker_t;
 
 epoch_t epoch_alloc(int flags);
@@ -68,26 +72,15 @@ void epoch_wait_preempt(epoch_t epoch);
 void epoch_call(epoch_t epoch, epoch_context_t ctx, void (*callback) (epoch_context_t));
 int in_epoch(epoch_t epoch);
 int in_epoch_verbose(epoch_t epoch, int dump_onfail);
-#ifdef _KERNEL
 DPCPU_DECLARE(int, epoch_cb_count);
 DPCPU_DECLARE(struct grouptask, epoch_cb_task);
 #define EPOCH_MAGIC0 0xFADECAFEF00DD00D
 #define EPOCH_MAGIC1 0xBADDBABEDEEDFEED
 
-void epoch_enter_preempt_KBI(epoch_t epoch, epoch_tracker_t et);
-void epoch_exit_preempt_KBI(epoch_t epoch, epoch_tracker_t et);
-void epoch_enter_KBI(epoch_t epoch);
-void epoch_exit_KBI(epoch_t epoch);
+void epoch_enter_preempt(epoch_t epoch, epoch_tracker_t et);
+void epoch_exit_preempt(epoch_t epoch, epoch_tracker_t et);
+void epoch_enter(epoch_t epoch);
+void epoch_exit(epoch_t epoch);
 
-
-
-#if defined(KLD_MODULE) && !defined(KLD_TIED)
-#define epoch_enter_preempt(e, t) epoch_enter_preempt_KBI((e), (t))
-#define epoch_exit_preempt(e, t) epoch_exit_preempt_KBI((e), (t))
-#define epoch_enter(e) epoch_enter_KBI((e))
-#define epoch_exit(e) epoch_exit_KBI((e))
-#else
-#include
-#endif /* KLD_MODULE */
-
-#endif /* _KERNEL */
-#endif
+#endif /* _KERNEL */
+#endif /* _SYS_EPOCH_H_ */

Modified: stable/12/sys/sys/param.h
==============================================================================
--- stable/12/sys/sys/param.h	Fri Jul  5 10:31:37 2019	(r349762)
+++ stable/12/sys/sys/param.h	Fri Jul  5 12:26:30 2019	(r349763)
@@ -60,7 +60,7 @@
  *		in the range 5 to 9.
  */
 #undef __FreeBSD_version
-#define __FreeBSD_version 1200512	/* Master, propagated to newvers */
+#define __FreeBSD_version 1200513	/* Master, propagated to newvers */
 
 /*
  * __FreeBSD_kernel__ indicates that this system uses the kernel of FreeBSD,