From: "Kristof Provost"
To: "Mark Johnston"
Cc: "Li-Wen Hsu", src-committers, svn-src-all, svn-src-head
Subject: Re: svn commit: r359436 - in head/sys: kern net sys
Date: Tue, 31 Mar 2020 17:57:40 +0200
Message-ID: <49973196-5F08-4DCE-BA5F-F9B359703A08@FreeBSD.org>
In-Reply-To: <9A4C20AA-8E13-47C8-B162-F2304F8C79B7@FreeBSD.org>
References: <202003301422.02UEMrxL059978@repo.freebsd.org> <20200331015905.GC65028@raichu> <20200331023127.GA97238@raichu> <20200331151700.GC97238@raichu> <9A4C20AA-8E13-47C8-B162-F2304F8C79B7@FreeBSD.org>
List-Id: SVN commit messages for the src tree for head/-current

On 31 Mar 2020, at 17:28, Kristof Provost wrote:

> On 31 Mar 2020, at 17:17, Mark Johnston wrote:
>> On Tue, Mar 31, 2020 at 03:51:27PM +0800, Li-Wen Hsu wrote:
>>> On Tue, Mar 31, 2020 at 3:00 PM Kristof Provost wrote:
>>>>
>>>> On 31 Mar 2020, at 7:56, Li-Wen Hsu wrote:
>>>>> On Tue, Mar 31, 2020 at 10:55 AM Mark Johnston wrote:
>>>>>>>> It seems could be triggered by sys.netinet6.frag6.*
>>>>>>>> sys.netpfil.common.* sbin.pfctl.pfctl_test.* tests, and there
>>>>>>>> are lots
>>>>>>>> of test cases timed out.
>>>>>>>>
>>>>>>>> Can you help check these?
>>>>>>>
>>>>>>> I see, it is actually caused by r359438. I'm looking at it now.
>>>>>>
>>>>>> I verified that the netpfil and netinet6 tests pass with r359477.
>>>>>
>>>>> Thanks for the fixing, the latest test panics at epair_qflush:
>>>>>
>>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/14747/consoleFull
>>>>>
>>>>> while executing sys.netpfil.pf.* tests. I'm not sure if this is
>>>>> related or because of previous commits (I suspect the later). I'll
>>>>> look into this.
>>>>>
>>>> That’s a know issue with epair (since EPOCH, I believe).
>>>> A number of the pf tests are disabled due to this. See 238870.
>>>
>>> I also think so, btw, currently every test run panics so I am afraid
>>> that the recent commits might make status worse (or say, make the
>>> issue easier to reproduce?)
>>
>> I haven't been able to reproduce any panics or test failures so far.
>
> Once you disable the ‘atf_skip’ lines in the pf tests a simple
> `sudo kldload pfsync && cd /usr/tests/sys/netpfil/pf && sudo kyua
> test` is likely sufficient.
>

The names:names test is a great candidate for this.
Remove the `atf_skip …` line in /usr/tests/sys/netpfil/pf/names and run that a few times.
It’s not 100% reliable, but the test is very fast and will likely panic every other run or more.

Example backtrace:

    panic: epair_qflush: ifp=0xfffff800079c9000, epair_softc gone? sc=0
    cpuid = 1
    time = 1585666518
    KDB: stack backtrace:
    db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001bd7e790
    vpanic() at vpanic+0x182/frame 0xfffffe001bd7e7e0
    panic() at panic+0x43/frame 0xfffffe001bd7e840
    epair_qflush() at epair_qflush+0x1a8/frame 0xfffffe001bd7e890
    if_down() at if_down+0x12d/frame 0xfffffe001bd7e8c0
    if_detach_internal() at if_detach_internal+0x2ee/frame 0xfffffe001bd7e920
    if_vmove() at if_vmove+0x3c/frame 0xfffffe001bd7e970
    vnet_if_return() at vnet_if_return+0x50/frame 0xfffffe001bd7e990
    vnet_destroy() at vnet_destroy+0x130/frame 0xfffffe001bd7e9c0
    prison_deref() at prison_deref+0x29d/frame 0xfffffe001bd7ea00
    taskqueue_run_locked() at taskqueue_run_locked+0xaa/frame 0xfffffe001bd7ea80
    taskqueue_thread_loop() at taskqueue_thread_loop+0x94/frame 0xfffffe001bd7eab0
    fork_exit() at fork_exit+0x80/frame 0xfffffe001bd7eaf0
    fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001bd7eaf0
    --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
    KDB: enter: panic
    [ thread pid 0 tid 100014 ]
    Stopped at kdb_enter+0x37: movq $0,0x10927a6(%rip)
    db>

You might see different panics too. The epair teardown flow is complex, and broken.

Best regards,
Kristof
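
For reference, the reproduction steps described in this message could be wrapped in a small loop along the following lines. This is only a sketch, not something from the thread: `RUNS` and `DRY_RUN` are hypothetical knobs invented here, and it assumes the `atf_skip …` line has already been removed from /usr/tests/sys/netpfil/pf/names by hand. It uses kyua's `program:testcase` filter to run only names:names. Since a successful reproduction panics the machine, the sketch defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch: run the pf names:names test repeatedly to try to hit the epair panic.
# Assumption: the atf_skip line in $TESTDIR/names was removed manually beforehand.
# RUNS and DRY_RUN are hypothetical variables introduced for this sketch.

TESTDIR=/usr/tests/sys/netpfil/pf
RUNS=${RUNS:-5}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands (default), 0 = really run them

load_pfsync() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ sudo kldload pfsync"
    else
        # Reports an error if the module is already loaded; that is harmless here.
        sudo kldload pfsync
    fi
}

run_once() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ cd $TESTDIR && sudo kyua test names:names"
    else
        # Subshell so the caller's working directory is left untouched.
        (cd "$TESTDIR" && sudo kyua test names:names)
    fi
}

repro() {
    load_pfsync
    i=1
    while [ "$i" -le "$RUNS" ]; do
        run_once
        i=$((i + 1))
    done
}

repro
```

Running it with `DRY_RUN=0 RUNS=10 sh repro.sh` (the file name is arbitrary) on a disposable test machine would perform the real runs; if the panic triggers, the loop simply ends with the machine dropping into the debugger.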