From owner-freebsd-stable@freebsd.org Thu Sep 22 07:59:39 2016 Return-Path: Delivered-To: freebsd-stable@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id 17AE3BE5FA8 for ; Thu, 22 Sep 2016 07:59:39 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 994C0A0D; Thu, 22 Sep 2016 07:59:38 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from tom.home (kib@localhost [127.0.0.1]) by kib.kiev.ua (8.15.2/8.15.2) with ESMTPS id u8M7xXR1035597 (version=TLSv1 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Thu, 22 Sep 2016 10:59:33 +0300 (EEST) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.10.3 kib.kiev.ua u8M7xXR1035597 Received: (from kostik@localhost) by tom.home (8.15.2/8.15.2/Submit) id u8M7xX8J035596; Thu, 22 Sep 2016 10:59:33 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 22 Sep 2016 10:59:33 +0300 From: Konstantin Belousov To: Slawa Olhovchenkov Cc: John Baldwin , freebsd-stable@freebsd.org, alc@freebsd.org Subject: Re: nginx and FreeBSD11 Message-ID: <20160922075933.GL38409@kib.kiev.ua> References: <20160907191348.GD22212@zxy.spb.ru> <1823460.vTm8IvUQsF@ralph.baldwin.cx> <20160918162241.GE2960@zxy.spb.ru> <2122051.7RxZBKUSFc@ralph.baldwin.cx> <20160920065244.GO2840@zxy.spb.ru> <20160920192053.GP2840@zxy.spb.ru> <20160920201925.GI38409@kib.kiev.ua> <20160920203853.GR2840@zxy.spb.ru> <20160920211517.GJ38409@kib.kiev.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160920211517.GJ38409@kib.kiev.ua> User-Agent: Mutt/1.6.1 (2016-04-27) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no autolearn_force=no version=3.4.1 X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on tom.home X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 22 Sep 2016 07:59:39 -0000 On Wed, Sep 21, 2016 at 12:15:17AM +0300, Konstantin Belousov wrote: > > > diff --git a/sys/vm/vm_map.c b/sys/vm/vm_map.c > > > index a23468e..f754652 100644 > > > --- a/sys/vm/vm_map.c > > > +++ b/sys/vm/vm_map.c > > > @@ -481,6 +481,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > if (oldvm == newvm) > > > return; > > > > > > + spinlock_enter(); > > > /* > > > * Point to the new address space and refer to it. > > > */ > > > @@ -489,6 +490,7 @@ vmspace_switch_aio(struct vmspace *newvm) > > > > > > /* Activate the new mapping. */ > > > pmap_activate(curthread); > > > + spinlock_exit(); > > > > > > /* Remove the daemon's reference to the old address space. */ > > > KASSERT(oldvm->vm_refcnt > 1, Did you tested the patch ? Below is, I believe, the committable fix, of course supposing that the patch above worked. If you want to retest it on stable/11, ignore efirt.c chunks. diff --git a/sys/amd64/amd64/efirt.c b/sys/amd64/amd64/efirt.c index f1d67f7..c883af8 100644 --- a/sys/amd64/amd64/efirt.c +++ b/sys/amd64/amd64/efirt.c @@ -53,6 +53,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include #include #include @@ -301,6 +302,17 @@ efi_enter(void) PMAP_UNLOCK(curpmap); return (error); } + + /* + * IPI TLB shootdown handler invltlb_pcid_handler() reloads + * %cr3 from the curpmap->pm_cr3, which would disable runtime + * segments mappings. Block the handler's action by setting + * curpmap to impossible value. See also comment in + * pmap.c:pmap_activate_sw(). + */ + if (pmap_pcid_enabled && !invpcid_works) + PCPU_SET(curpmap, NULL); + load_cr3(VM_PAGE_TO_PHYS(efi_pml4_page) | (pmap_pcid_enabled ? curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); /* @@ -317,7 +329,9 @@ efi_leave(void) { pmap_t curpmap; - curpmap = PCPU_GET(curpmap); + curpmap = &curproc->p_vmspace->vm_pmap; + if (pmap_pcid_enabled && !invpcid_works) + PCPU_SET(curpmap, curpmap); load_cr3(curpmap->pm_cr3 | (pmap_pcid_enabled ? curpmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid : 0)); if (!pmap_pcid_enabled) diff --git a/sys/amd64/amd64/pmap.c b/sys/amd64/amd64/pmap.c index 63042e4..59e1b67 100644 --- a/sys/amd64/amd64/pmap.c +++ b/sys/amd64/amd64/pmap.c @@ -6842,6 +6842,7 @@ pmap_activate_sw(struct thread *td) { pmap_t oldpmap, pmap; uint64_t cached, cr3; + register_t rflags; u_int cpuid; oldpmap = PCPU_GET(curpmap); @@ -6865,16 +6866,43 @@ pmap_activate_sw(struct thread *td) pmap == kernel_pmap, ("non-kernel pmap thread %p pmap %p cpu %d pcid %#x", td, pmap, cpuid, pmap->pm_pcids[cpuid].pm_pcid)); + + /* + * If the INVPCID instruction is not available, + * invltlb_pcid_handler() is used for handle + * invalidate_all IPI, which checks for curpmap == + * smp_tlb_pmap. Below operations sequence has a + * window where %CR3 is loaded with the new pmap's + * PML4 address, but curpmap value is not yet updated. + * This causes invltlb IPI handler, called between the + * updates, to execute as NOP, which leaves stale TLB + * entries. + * + * Note that the most typical use of + * pmap_activate_sw(), from the context switch, is + * immune to this race, because interrupts are + * disabled (while the thread lock is owned), and IPI + * happends after curpmap is updated. Protect other + * callers in a similar way, by disabling interrupts + * around the %cr3 register reload and curpmap + * assignment. + */ + if (!invpcid_works) + rflags = intr_disable(); + if (!cached || (cr3 & ~CR3_PCID_MASK) != pmap->pm_cr3) { load_cr3(pmap->pm_cr3 | pmap->pm_pcids[cpuid].pm_pcid | cached); if (cached) PCPU_INC(pm_save_cnt); } + PCPU_SET(curpmap, pmap); + if (!invpcid_works) + intr_restore(rflags); } else if (cr3 != pmap->pm_cr3) { load_cr3(pmap->pm_cr3); + PCPU_SET(curpmap, pmap); } - PCPU_SET(curpmap, pmap); #ifdef SMP CPU_CLR_ATOMIC(cpuid, &oldpmap->pm_active); #else