Date:      Fri, 25 Oct 2019 14:11:55 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 241489] netmap + if_vlan panics related to 'Widen NET_EPOCH coverage' work (r353292)
Message-ID:  <bug-241489-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241489

            Bug ID: 241489
           Summary: netmap + if_vlan panics related to 'Widen NET_EPOCH
                    coverage' work (r353292)
           Product: Base System
           Version: CURRENT
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: aleksandr.fedorov@itglobal.com

We make heavy use of the following configuration: VM (bhyve) - VALE switch -
if_vlan. After r353292 we observe the following panic:

Unread portion of the kernel message buffer:
panic: Assertion in_epoch(net_epoch_preempt) failed at
/afedorov/vstack-develop-freebsd/sys/net/if_vlan.c:1142
cpuid = 11
time = 1572003077
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01d541d1e0
vpanic() at vpanic+0x17e/frame 0xfffffe01d541d240
panic() at panic+0x43/frame 0xfffffe01d541d2a0
vlan_transmit() at vlan_transmit+0x165/frame 0xfffffe01d541d2f0
nm_os_generic_xmit_frame() at nm_os_generic_xmit_frame+0x7c/frame 0xfffffe01d541d310
generic_netmap_txsync() at generic_netmap_txsync+0x2d6/frame 0xfffffe01d541d3e0
netmap_bwrap_notify() at netmap_bwrap_notify+0x96/frame 0xfffffe01d541d410
nm_vale_flush() at nm_vale_flush+0xa53/frame 0xfffffe01d541d510
netmap_vale_vp_txsync() at netmap_vale_vp_txsync+0x508/frame 0xfffffe01d541d5a0
netmap_ioctl() at netmap_ioctl+0x1b4/frame 0xfffffe01d541d670
freebsd_netmap_ioctl() at freebsd_netmap_ioctl+0x88/frame 0xfffffe01d541d6b0
devfs_ioctl() at devfs_ioctl+0xca/frame 0xfffffe01d541d700
VOP_IOCTL_APV() at VOP_IOCTL_APV+0x85/frame 0xfffffe01d541d720
vn_ioctl() at vn_ioctl+0x13d/frame 0xfffffe01d541d830
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe01d541d850
kern_ioctl() at kern_ioctl+0x295/frame 0xfffffe01d541d8b0
sys_ioctl() at sys_ioctl+0x15c/frame 0xfffffe01d541d980
amd64_syscall() at amd64_syscall+0x2b5/frame 0xfffffe01d541dab0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe01d541dab0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800819c2a, rsp = 0x7fffdfffce68, rbp = 0x7fffdfffcef0 ---
Uptime: 19m12s
Dumping 5033 out of 65374 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

The problem is that NET_EPOCH_ENTER() has been moved out of vlan_transmit(),
which now expects its caller to already be inside the network epoch section.
However, the netmap generic code calls vlan_transmit() directly from
nm_os_generic_xmit_frame() without entering the epoch section, so the kernel
panics on NET_EPOCH_ASSERT().
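
To illustrate the failure mode described in the paragraph above, here is a
minimal userland sketch. It is only a model: every name in it (model_epoch_*,
model_vlan_transmit, model_xmit_*) is hypothetical and not part of the kernel,
the "epoch" is just a nesting counter, and a return value of -1 stands in for
the point where the real kernel would panic on the assertion.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userland model only. Every name below is hypothetical and merely mimics
 * the shape of NET_EPOCH_ENTER()/NET_EPOCH_EXIT()/NET_EPOCH_ASSERT();
 * none of this is the real epoch(9) implementation.
 */
struct model_epoch_tracker {
	int saved_depth;	/* nesting depth before this entry */
};

static int epoch_depth;	/* single-threaded model: current nesting depth */

static void
model_epoch_enter(struct model_epoch_tracker *et)
{
	et->saved_depth = epoch_depth;
	epoch_depth++;
}

static void
model_epoch_exit(struct model_epoch_tracker *et)
{
	assert(epoch_depth > 0);
	epoch_depth = et->saved_depth;
}

static bool
model_in_epoch(void)
{
	return (epoch_depth > 0);
}

/*
 * Models a transmit routine that no longer enters the epoch itself and
 * instead requires the caller to hold it. Returns -1 where the kernel
 * would hit the failed assertion.
 */
static int
model_vlan_transmit(void)
{
	if (!model_in_epoch())
		return (-1);
	return (0);
}

/* Models a caller that invokes the transmit routine directly. */
static int
model_xmit_unprotected(void)
{
	return (model_vlan_transmit());
}

/* Models a caller that wraps the transmit call in enter/exit. */
static int
model_xmit_protected(void)
{
	struct model_epoch_tracker et;
	int ret;

	model_epoch_enter(&et);
	ret = model_vlan_transmit();
	model_epoch_exit(&et);
	return (ret);
}
```

In this model, model_xmit_unprotected() fails because nothing has entered the
epoch, while model_xmit_protected() succeeds and leaves the nesting depth at
zero afterwards.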

As a temporary workaround, I solved the problem with the following patch, but
I'm not sure whether this is the correct solution.

--- a/sys/dev/netmap/netmap_freebsd.c
+++ b/sys/dev/netmap/netmap_freebsd.c
@@ -420,6 +420,7 @@ nm_os_generic_xmit_frame(struct nm_os_gen_arg *a)
 {
        int ret;
        u_int len = a->len;
+       struct epoch_tracker et;
        struct ifnet *ifp = a->ifp;
        struct mbuf *m = a->m;

@@ -453,9 +454,11 @@ nm_os_generic_xmit_frame(struct nm_os_gen_arg *a)
        M_HASHTYPE_SET(m, M_HASHTYPE_OPAQUE);
        m->m_pkthdr.flowid = a->ring_nr;
        m->m_pkthdr.rcvif = ifp; /* used for tx notification */
+       NET_EPOCH_ENTER(et);
        CURVNET_SET(ifp->if_vnet);
        ret = NA(ifp)->if_transmit(ifp, m);
        CURVNET_RESTORE();
+       NET_EPOCH_EXIT(et);
        return ret ? -1 : 0;
 }

-- 
You are receiving this mail because:
You are the assignee for the bug.


