Subject: Re: bhyve/amd: interrupt delivered when it shouldn't be?
To: Andriy Gapon
From: Peter Grehan
Cc: freebsd-virtualization@freebsd.org
Date: Fri, 29 Dec 2017 15:45:49 -0800
Message-ID: <450137ba-52dd-8b4c-63d2-3c3ce1909d69@freebsd.org>
In-Reply-To: <42c22179-ae42-e4bb-e77d-a1d49fe634ed@FreeBSD.org>

Hi Andriy,

> The hardware is AMD.

Ryzen?

> But what I see suggests that at this point a Local APIC timer interrupt gets
> delivered to the thread. And that causes all the mess as the thread holding the
> spinlock gets preempted.
>
> Does this ring a bell to anyone?

I have seen something similar to this after about 20 minutes when doing a current -j 16 buildworld in a guest. The symptom is a spinlock timeout, with one vCPU spinning in

smp_targeted_tlb_shootdown() at smp_targeted_tlb_shootdown+0x352/frame 0xfffffe02c80098d0
smp_masked_invlpg() at smp_masked_invlpg+0x4c/frame 0xfffffe02c8009900
pmap_invalidate_page() at pmap_invalidate_page+0x191/frame 0xfffffe02c8009950
pmap_ts_referenced() at pmap_ts_referenced+0x7b3/frame 0xfffffe02c8009a00
vm_pageout() at vm_pageout+0xe04/frame 0xfffffe02c8009a70
...

and all the others eventually spinning on that held lock.
However, NMIs are still able to get through (the post-panic ddb NMI IPI), so the VM isn't completely locked up - either an interrupt is missed, or a write isn't seen by the vCPU issuing the TLB shootdown.

> Is there any suspect code?

Not sure yet, but the interrupt-injection path could do with a close inspection.

> It seems that we set v_intr_masking bit, so the rFLAGS / eFLAGS should be
> completely virtualized. So, maybe a hardware issue?

Hard to say. Running with all vCPUs pinned makes the problem go away, but that could just mean the issue is isolated to when vCPUs migrate.

later,

Peter.