Date: Wed, 16 Mar 2016 14:31:37 -0700 From: John Baldwin <jhb@freebsd.org> To: Gleb Smirnoff <glebius@freebsd.org> Cc: Hans Petter Selasky <hselasky@freebsd.org>, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r296909 - head/sys/ofed/drivers/infiniband/ulp/ipoib Message-ID: <5463538.GTDDlnsCR6@ralph.baldwin.cx> In-Reply-To: <20160316202738.GL1328@FreeBSD.org> References: <201603151547.u2FFlQKN078643@repo.freebsd.org> <2420759.o4YiE5Za4X@ralph.baldwin.cx> <20160316202738.GL1328@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Wednesday, March 16, 2016 01:27:38 PM Gleb Smirnoff wrote: > On Tue, Mar 15, 2016 at 10:43:53AM -0700, John Baldwin wrote: > J> > Log: > J> > Fix witness panic in the ipoib_ioctl() function when unloading the > J> > ipoib module. > J> > > J> > The bpfdetach() function is trying to turn off promiscious mode on the > J> > network interface it is attached to while holding a mutex. The fix > J> > consists of ignoring any further calls to the ipoib_ioctl() function > J> > when the network interface is going to be detached. The ipoib_ioctl() > J> > function might sleep. > J> > > J> > Sponsored by: Mellanox Technologies > J> > MFC after: 1 week > J> > J> In all of the other NIC drivers I have worked on the fix has been to call > J> if_detach (or ether_ifdetach) as the first step in your detach method before > J> you try to stop the interface. The idea is to remove external connections > J> from your driver first, and the only after those have drained do you actually > J> shut down the hardware. Given you are calling if_detach() just before > J> if_free() below, ipoib just seems to be broken here. > J> > J> For example, detach in a typical NIC driver does this: > J> > J> struct foo_softc *sc; > J> > J> sc = device_get_softc(dev); > J> ether_ifdetach(sc->sc_ifp); > J> FOO_LOCK(sc); > J> foo_stop(sc); > J> FOO_UNLOCK(sc); > J> callout_drain(&sc->timer); > J> bus_teardown_intr(...); > J> bus_release_resource(...); > J> if_free(sc->sc_ifp); > J> > J> Similarly, devices with a character device in /dev should be calling > J> destroy_dev() first before shutting down hardware, etc. > > In the new KPI in the projects/ifnet branch, I am doing the following: > > - there is only one if_detach(), that does both detach and free > - if_detach() is called at the very last stage, after stopping hardware (if hw is still present) > - driver is responsible to be ready to be called from upper half of the stack, > regardless of the state of hardware. Ususally it means that it should lock internal mutex, > and check internal RUNNING flag. > > Note, that hardware can go away asynchronously, so demand to call if_detach() > before stopping hardware can't be fulfiled. How do you prevent a userland application from doing the equivalent of 'ifconfig up' in between a driver calling foo_stop() to clear RUNNING and the later call to if_detach? This race is why I think as a general rule drivers should try to detach external references first before trying to quiesce the hardware. Certainly if the hardware is gone there's nothing to quiesce, but that's orthogonal to the race of how you ensure the user doesn't "undo" the explicit hardware quiesce performed during detach when your hardware is still around. (You do agree that a driver should explicitly stop hardware when being detached so it doesn't continue to DMA or fire interrupts, right? And that needs to happen at least once after ensuring you can't be "upped".) Previously some drivers tried to handle this by duplicating logic to set a "detaching" flag during detach to ignore all ioctls. It seems cleaner to me for drivers to instead call if_detach first (and let it drain once it is fixed to actually do some sort of drain ala bus_teardown_intr()) rather than having that code duplicated in all the drivers. -- John Baldwin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?5463538.GTDDlnsCR6>