Date:      Tue, 18 Feb 2020 22:08:50 +0100
From:      "Kristof Provost" <kp@FreeBSD.org>
To:        "Gleb Smirnoff" <glebius@freebsd.org>
Cc:        freebsd-net <freebsd-net@freebsd.org>
Subject:   Re: vtnet IFF_NEEDSEPOCH?
Message-ID:  <5520BD42-7D17-4561-A2CD-C690B159D15E@FreeBSD.org>
In-Reply-To: <20200218193708.GH5741@FreeBSD.org>
References:  <B45B8E77-CDB2-450E-951C-1E6E6BCC2527@FreeBSD.org> <20200218193708.GH5741@FreeBSD.org>

On 18 Feb 2020, at 20:37, Gleb Smirnoff wrote:
> On Tue, Feb 18, 2020 at 10:52:09AM +0100, Kristof Provost wrote:
> K> Hi,
> K>
> K> I’ve been playing around with a risc-v qemu image, and run into this panic
> K> with vtnet:
> K>
> K> 	DHCPDISCOVER on vtnet0 to 255.255.255.255 port 67 interval 5
> K> 	panic: Assertion in_epoch(net_epoch_preempt) failed at
> K> /usr/src/sys/net/netisr.c:1093
> K> 	cpuid = 0
> K> 	time = 1581981733
> K> 	KDB: stack backtrace:
> K> 	db_trace_self() at db_trace_self
> K> 	db_fetch_ksymtab() at db_fetch_ksymtab+0x12a
> K> 	kdb_backtrace() at kdb_backtrace+0x2c
> K> 	vpanic() at vpanic+0x144
> K> 	panic() at panic+0x26
> K> 	netisr_dispatch_src() at netisr_dispatch_src+0x3c0
> K> 	netisr_dispatch() at netisr_dispatch+0x10
> K> 	ether_ifattach() at ether_ifattach+0x2fa
> K> 	vtmmio_attach() at vtmmio_attach+0x490c
> K> 	vtmmio_attach() at vtmmio_attach+0x4624
> K> 	vtmmio_attach() at vtmmio_attach+0x544a
> K> 	virtqueue_intr() at virtqueue_intr+0xc
> K> 	vtmmio_attach() at vtmmio_attach+0x2008
> K> 	db_dump_intr_event() at db_dump_intr_event+0x730
> K> 	fork_exit() at fork_exit+0x68
> K> 	fork_trampoline() at fork_trampoline+0xa
> K> 	KDB: enter: panic
> K> 	[ thread pid 12 tid 100023 ]
> K> 	Stopped at      kdb_enter+0x44: sd      zero,0(a0)
> K> 	db>
> K>
> K> It seems pretty clear that the vtmmio path doesn’t enter epoch before it
> K> runs the vtnet_attach() code.
> K> On the other hand, I run vtnet CURRENT guests in bhyve, and don’t see this
> K> panic. In that case it lives on top of PCI rather than mmio, but I don’t
> K> see why/where that’d enter epoch.
>
> The transition from ether_ifattach to netisr_dispatch looks strange.
> Is that something run through EVENTHANDLER_INVOKE?
>
> Can you please print in kgdb?
>
>> list *ether_ifattach+0x2fa
>
Not immediately, no. This is risc-v, and there’s no kgdb for it right 
now.

I can give you objdump output however:

	ffffffc00036fd6c <ether_ifattach>:
	/*
	 * Perform common duties while attaching to interface list
	 */
	void
	ether_ifattach(struct ifnet *ifp, const u_int8_t *lla)
	{
	ffffffc00036fd6c:       7139                    addi    sp,sp,-64
	ffffffc00036fd6e:       fc06                    sd      ra,56(sp)
	ffffffc00036fd70:       f822                    sd      s0,48(sp)
	ffffffc00036fd72:       f426                    sd      s1,40(sp)
	ffffffc00036fd74:       f04a                    sd      s2,32(sp)
	ffffffc00036fd76:       ec4e                    sd      s3,24(sp)
	ffffffc00036fd78:       e852                    sd      s4,16(sp)
	ffffffc00036fd7a:       e456                    sd      s5,8(sp)
	ffffffc00036fd7c:       e05a                    sd      s6,0(sp)
	ffffffc00036fd7e:       0080                    addi    s0,sp,64
	ffffffc00036fd80:       892e                    mv      s2,a1
	ffffffc00036fd82:       89aa                    mv      s3,a0
	ffffffc00036fd84:       4539                    li      a0,14

	…
	        if (__predict_false(ifp->if_flags & IFF_NEEDSEPOCH))
	ffffffc000370036:       0709c503                lbu     a0,112(s3)
	ffffffc00037003a:       02057513                andi    a0,a0,32
	ffffffc00037003e:       e92d                    bnez    a0,ffffffc0003700b0 <ether_input+0xba>
	        while (m) {
	ffffffc000370040:       020a8763                beqz    s5,ffffffc00037006e <ether_input+0x78>
	ffffffc000370044:       5a7d                    li      s4,-1
	                mn = m->m_nextpkt;
	ffffffc000370046:       008ab483                ld      s1,8(s5)
	                m->m_nextpkt = NULL;
	ffffffc00037004a:       000ab423                sd      zero,8(s5)
	                MPASS((m->m_pkthdr.csum_flags & CSUM_SND_TAG) == 0);
	ffffffc00037004e:       038aa503                lw      a0,56(s5)
	ffffffc000370052:       08aa5963                bge     s4,a0,ffffffc0003700e4 <ether_input+0xee>
	                KASSERT(m->m_pkthdr.rcvif == ifp, ("%s: ifnet mismatch m %p "
	ffffffc000370056:       020ab683                ld      a3,32(s5)
	ffffffc00037005a:       0b369763                bne     a3,s3,ffffffc000370108 <ether_input+0x112>
	                netisr_dispatch(NETISR_ETHER, m);
	ffffffc00037005e:       4515                    li      a0,5
	ffffffc000370060:       85d6                    mv      a1,s5
	ffffffc000370062:       0000c097                auipc   ra,0xc
	ffffffc000370066:       12c080e7                jalr    300(ra) # ffffffc00037c18e <netisr_dispatch>
	ffffffc00037006a:       8aa6                    mv      s5,s1
	        while (m) {
	ffffffc00037006c:       fce9                    bnez    s1,ffffffc000370046 <ether_input+0x50>
	        if (__predict_false(ifp->if_flags & IFF_NEEDSEPOCH))
	ffffffc00037006e:       0709c503                lbu     a0,112(s3)
	ffffffc000370072:       02057513                andi    a0,a0,32
	ffffffc000370076:       e939                    bnez    a0,ffffffc0003700cc <ether_input+0xd6>
	ffffffc000370078:       00023503                ld      a0,0(tp) # 0 <kernel_lma-0x80200000>
	        CURVNET_RESTORE();
	ffffffc00037007c:       4e053503                ld      a0,1248(a0)
	ffffffc000370080:       c971                    beqz    a0,ffffffc000370154 <ether_input+0x15e>
	ffffffc000370082:       00090a63                beqz    s2,ffffffc000370096 <ether_input+0xa0>
	ffffffc000370086:       5e4a7537                lui     a0,0x5e4a7
	ffffffc00037008a:       f285051b                addiw   a0,a0,-216
	ffffffc00037008e:       01092583                lw      a1,16(s2)
	ffffffc000370092:       0ca59163                bne     a1,a0,ffffffc000370154 <ether_input+0x15e>
	ffffffc000370096:       00023503                ld      a0,0(tp) # 0 <kernel_lma-0x80200000>
	ffffffc00037009a:       4f253023                sd      s2,1248(a0) # 5e4a74e0 <kernel_lma-0x21d58b20>
	}


In fact, that looks like it’s inside ether_input(), not 
ether_ifattach().

	ffffffc00036fff6 <ether_input>:
	{
	ffffffc00036fff6:       711d                    addi    sp,sp,-96
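
For anyone who wants to check the arithmetic: ether_ifattach+0x2fa works out to 0xffffffc000370066, which is the jalr to netisr_dispatch shown above and lies 0x70 bytes past the start of ether_input(). The sketch below does the same nearest-preceding-symbol lookup in plain C. The two start addresses are copied from the objdump output in this message; the struct sym type, the symtab array and the resolve() helper are made up for illustration and are not what ddb actually uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record; start addresses come from the objdump output. */
struct sym {
	const char *name;
	uint64_t    start;
};

static const struct sym symtab[] = {
	{ "ether_ifattach", 0xffffffc00036fd6cULL },
	{ "ether_input",    0xffffffc00036fff6ULL },
};

/*
 * Return the symbol with the highest start address that is <= addr,
 * or NULL if addr lies below every symbol in the table.
 */
static const struct sym *
resolve(uint64_t addr)
{
	const struct sym *best = NULL;

	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++) {
		if (symtab[i].start <= addr &&
		    (best == NULL || symtab[i].start > best->start))
			best = &symtab[i];
	}
	return (best);
}
```

Calling resolve(0xffffffc00036fd6cULL + 0x2fa) picks the ether_input entry. If the ether_input row is missing from the table, the same lookup falls back to ether_ifattach, which would be consistent with what the backtrace printed, though I have not confirmed that this is why ddb got it wrong.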

So I suspect our call stack is something like virtqueue_intr() -> 
vtnet_rx_vq_intr() -> vtnet_rxq_eof() -> vtnet_rxq_input() -> 
ether_input().
I don’t see anything entering epoch in that path, which presumably 
explains the panic, but I still don’t understand why my bhyve CURRENT
VM doesn’t panic in the same way.
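
To make the failure mode concrete, here is a small userspace model of the control flow visible in the ether_input() disassembly above. It is only a sketch: every model_* name, the counters and MODEL_IFF_NEEDSEPOCH are invented for illustration, and none of it is kernel code. The flag value 0x20 is taken from the "andi a0,a0,32" in the disassembly. The model samples the flag once, whereas the real code re-reads it after the loop.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_IFF_NEEDSEPOCH 0x20	/* bit tested by "andi a0,a0,32" */

struct model_ifnet {
	int if_flags;
};

struct model_mbuf {
	struct model_mbuf *m_nextpkt;
};

static int epoch_depth;		/* > 0 while "inside" the network epoch */
static int assert_failures;	/* each one would be a panic in the kernel */
static int dispatched;		/* packets handed on successfully */

static void model_epoch_enter(void) { epoch_depth++; }
static void model_epoch_exit(void)  { epoch_depth--; }

/* Stand-in for netisr_dispatch(): requires the caller to be in epoch. */
static void
model_netisr_dispatch(struct model_mbuf *m)
{
	(void)m;
	if (epoch_depth == 0) {
		assert_failures++;
		return;
	}
	dispatched++;
}

/* Enter epoch only if the interface flag asks for it, then walk the chain. */
static void
model_ether_input(struct model_ifnet *ifp, struct model_mbuf *m)
{
	struct model_mbuf *mn;
	bool needs_epoch = (ifp->if_flags & MODEL_IFF_NEEDSEPOCH) != 0;

	if (needs_epoch)
		model_epoch_enter();
	while (m != NULL) {
		mn = m->m_nextpkt;
		m->m_nextpkt = NULL;
		model_netisr_dispatch(m);
		m = mn;
	}
	if (needs_epoch)
		model_epoch_exit();
}
```

With if_flags left at 0 and no caller having entered epoch, every packet in the chain trips the check, which corresponds to the in_epoch() assertion in the panic. With the flag set, the same chain is dispatched cleanly. In the real kernel either the interrupt path or the flag would have to provide the epoch entry; which of the two is appropriate for vtnet is the open question here.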

Best regards,
Kristof


