Date: Wed, 16 Mar 2016 13:49:28 +0100 From: Hans Petter Selasky <hps@selasky.org> To: John Baldwin <jhb@freebsd.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r296909 - head/sys/ofed/drivers/infiniband/ulp/ipoib Message-ID: <56E95658.2010703@selasky.org> In-Reply-To: <2420759.o4YiE5Za4X@ralph.baldwin.cx> References: <201603151547.u2FFlQKN078643@repo.freebsd.org> <2420759.o4YiE5Za4X@ralph.baldwin.cx>
next in thread | previous in thread | raw e-mail | index | archive | help
On 03/15/16 18:43, John Baldwin wrote: > On Tuesday, March 15, 2016 03:47:26 PM Hans Petter Selasky wrote: >> Author: hselasky >> Date: Tue Mar 15 15:47:26 2016 >> New Revision: 296909 >> URL: https://svnweb.freebsd.org/changeset/base/296909 >> >> Log: >> Fix witness panic in the ipoib_ioctl() function when unloading the >> ipoib module. >> >> The bpfdetach() function is trying to turn off promiscious mode on the >> network interface it is attached to while holding a mutex. The fix >> consists of ignoring any further calls to the ipoib_ioctl() function >> when the network interface is going to be detached. The ipoib_ioctl() >> function might sleep. >> >> Sponsored by: Mellanox Technologies >> MFC after: 1 week > > In all of the other NIC drivers I have worked on the fix has been to call > if_detach (or ether_ifdetach) as the first step in your detach method before > you try to stop the interface. The idea is to remove external connections > from your driver first, and the only after those have drained do you actually > shut down the hardware. Given you are calling if_detach() just before > if_free() below, ipoib just seems to be broken here. Hi John, It is a problem that if_ioctl() callbacks come from various places without proper refcounting and unknown context properties, like sleeping allowed or not. The WITNESS assert I'm trying to solve is this one: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe04661876c0 > ipoib_ioctl() at ipoib_ioctl+0x29/frame 0xfffffe0466187700 > if_setflag() at if_setflag+0xd5/frame 0xfffffe0466187760 > ifpromisc() at ifpromisc+0x2c/frame 0xfffffe0466187790 > bpf_detachd_locked() at bpf_detachd_locked+0x1e8/frame 0xfffffe04661877d0 > bpfdetach() at bpfdetach+0x152/frame 0xfffffe0466187810 > ipoib_detach() at ipoib_detach+0x42/frame 0xfffffe0466187830 > ipoib_remove_one() at ipoib_remove_one+0x19e/frame 0xfffffe04661878a0 > ib_unregister_client() at ib_unregister_client+0x5e/frame 0xfffffe04661878e0 > ipoib_cleanup_module() at ipoib_cleanup_module+0x52/frame 0xfffffe04661878f0 > _module_run() at _module_run+0x7b/frame 0xfffffe0466187920 > linker_file_unload() at linker_file_unload+0x45f/frame 0xfffffe0466187970 > kern_kldunload() at kern_kldunload+0xbc/frame 0xfffffe04661879a0 > amd64_syscall() at amd64_syscall+0x2db/frame 0xfffffe0466187ab0 > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe0466187ab0 As you can see bpfdetach() is calling ifpromisc() locked, which basically means we cannot do anything there with regard to ipoib. ipoib is not a NIC driver, but an IP-bearer only. Comparing with PCI devices is not valid here, because with a PCI device you usually only need to write a PCI register to switch promiscious mode on/off, but with IPOIB it involves commands towards the infiniband stack which might sleep. > For example, detach in a typical NIC driver does this: > > struct foo_softc *sc; > > sc = device_get_softc(dev); > ether_ifdetach(sc->sc_ifp); > FOO_LOCK(sc); > foo_stop(sc); > FOO_UNLOCK(sc); > callout_drain(&sc->timer); > bus_teardown_intr(...); > bus_release_resource(...); > if_free(sc->sc_ifp); > > Similarly, devices with a character device in /dev should be calling > destroy_dev() first before shutting down hardware, etc. Right. The problem is not really the detach sequence, but that bpfdetach() calls ifpromisc() locked. I'm not sure if we can change that. --HPS
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?56E95658.2010703>