Date: Tue, 18 Dec 2018 15:18:44 +0000 From: bugzilla-noreply@freebsd.org To: bugs@FreeBSD.org Subject: [Bug 234135] a race between ifioctl and clone_destroy in lagg and similar network drivers Message-ID: <bug-234135-227@https.bugs.freebsd.org/bugzilla/>
next in thread | raw e-mail | index | archive | help
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=3D234135 Bug ID: 234135 Summary: a race between ifioctl and clone_destroy in lagg and similar network drivers Product: Base System Version: CURRENT Hardware: Any OS: Any Status: New Severity: Affects Only Me Priority: --- Component: kern Assignee: bugs@FreeBSD.org Reporter: avg@FreeBSD.org We have got a page fault crash with the following stack trace: _sx_slock_hard lagg_media_status ifmedia_ioctl lagg_ioctl ifioctl kern_ioctl sys_ioctl It appears that the crash happened because of a destroyed sx lock. My theory is that the following race happened. lagg_clone_destroy() and lagg_ioctl() were called concurrently. lagg_clone_destroy() won a race to lock sc_sx while lagg_media_status() got blocked on it. I think that after some adaptive spinning the thread was pl= aced on a sleep queue. Then, lagg_clone_destroy() unlocked the lock and proceed= ed to destroy it. After the lagg_media_status() thread was waken up it found the lock in the destroyed state and crashed on it in a typical fashion (trying to dereference a NULL pointer as a struct thread pointer). Here is a more detailed, step by step description of the above. Thread T1 calls ifioctl with, for instance, SIOCGIFMEDIA parameter. T1 calls ifunit_ref(), finds the interface with !(if_flags & IFF_DYING). T1 increments the interface's reference count and proceeds to call ifhwioct= l() and then the interface's (driver's) if_ioctl method. Enter thread T2. T2 calls ifioctl(SIOCIFDESTROY) on the same interface. T2 invokes if_clone_destroy() that looks up the interface by name and increments its reference count. Then, T2 calls if_clone_destroyif() that calls ifc_simple_destroy(). The latter calls ifcs_destroy method on the interface. >From here on we consider driver-specific code that, obviously, varies from driver to driver. But after having reviewed a handful of drivers that use if_clone_simple I see that all of them have the same pattern. So, T2 calls ifcs_destroy. A driver's ifcs_destroy would handle its internal state. Then, it would typically call if_free() on the interface. Since the interface at this point has multiple outstanding references, including one taken by T2 itself, it is not actually freed. It's just marked as IFF_DYING. Also, its reference count is decremented by one, so that it can = be actually freed after T2 and T1 release their references. Afterwards, ifcs_destroy would typically free if_softc. At this point the driver's if_ioctl method is being executed by T1. The method can access if_softc that has been freed by now. So, that's the race. Any internal locking using a lock in the driver's softc instance does not h= elp, because the lock would be destroyed and freed together with the if_softc in ifcs_destroy. So, if_ioctl attempting to get that lock is the same kind of the problem. --=20 You are receiving this mail because: You are the assignee for the bug.=
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-234135-227>