Date: Wed, 15 Mar 2017 10:26:39 +0900 From: "Kristof Provost" <kp@FreeBSD.org> To: "Gleb Smirnoff" <glebius@FreeBSD.org> Cc: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: Re: svn commit: r315136 - head/sys/netpfil/pf Message-ID: <7B1C8879-E636-4315-99A2-A258AB9AE500@FreeBSD.org> In-Reply-To: <20170314215706.GB1072@FreeBSD.org> References: <201703120542.v2C5gvM4075391@repo.freebsd.org> <20170314215706.GB1072@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On 15 Mar 2017, at 6:57, Gleb Smirnoff wrote: > On Sun, Mar 12, 2017 at 05:42:57AM +0000, Kristof Provost wrote: > K> Log: > K> pf: Fix incorrect rw_sleep() in pf_unload() > K> > K> When we unload we don't hold the pf_rules_lock, so we cannot call > rw_sleep() > K> with it, because it would release a lock we do not hold. There's > no need for the > K> lock either, so we can just tsleep(). > K> > K> While here also make the same change in pf_purge_thread(), > because it explicitly > K> takes the lock before rw_sleep() and then immediately releases it > afterwards. > > The correct change would to be grab lock in pf_unload(), exactly as > pf_purge_thread() > does. With your change you introduces a possible infinite sleep due to > race, since > there is no timeout and no lock. > I must be missing something, because I don’t see the race, and don’t see how we could end up with an infinite sleep. Even if pf_purge_thread() somehow misses that pf_end_threads is non-zero (say it was not sleeping but executing the last vnet cleanup after the pf_end_threads check) while pf_unload() calls the wakeup_one(pf_purge_thread) it would still terminate the next time the tsleep() timed out. The only way around that would be to hold the PF_RULES_LOCK() during all of pf_purge_thread() that’s not actually sleeping. That’s non-trivial because pf_purge_unlinked_rules() also takes the PF_RULES_LOCK and there are lock ordering constraints with some of the other locks taken in the other cleanup code. Given that unloading pf is a non-supported development use case (see MOD_QUIESCE in pf_modevent()), I think I’d rather accept the occasional 10 second delay in unloading. There is another issue with unloading (vnet_pf_uninit() gets called after pf_unload(), which means we try to take a destroyed lock), for which I’ll have a patch soon. Regards, Kristof
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?7B1C8879-E636-4315-99A2-A258AB9AE500>