From: John Baldwin
To: Kristof Provost
Cc: Gleb Smirnoff, src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-head@freebsd.org
Subject: Re: svn commit: r315136 - head/sys/netpfil/pf
Date: Tue, 14 Mar 2017 23:45:57 -0700
Message-ID: <1803226.Igex2bR0P8@ralph.baldwin.cx>
In-Reply-To: <7B1C8879-E636-4315-99A2-A258AB9AE500@FreeBSD.org>

On Wednesday,
March 15, 2017 10:26:39 AM Kristof Provost wrote:
> On 15 Mar 2017, at 6:57, Gleb Smirnoff wrote:
> > On Sun, Mar 12, 2017 at 05:42:57AM +0000, Kristof Provost wrote:
> > K> Log:
> > K>   pf: Fix incorrect rw_sleep() in pf_unload()
> > K>
> > K>   When we unload we don't hold the pf_rules_lock, so we cannot call rw_sleep()
> > K>   with it, because it would release a lock we do not hold. There's no need for the
> > K>   lock either, so we can just tsleep().
> > K>
> > K>   While here also make the same change in pf_purge_thread(), because it explicitly
> > K>   takes the lock before rw_sleep() and then immediately releases it afterwards.
> >
> > The correct change would to be grab lock in pf_unload(), exactly as pf_purge_thread()
> > does. With your change you introduces a possible infinite sleep due to race, since
> > there is no timeout and no lock.
> >
> I must be missing something, because I don’t see the race, and don’t see how we
> could end up with an infinite sleep.

You are ignoring interrupts and preemption.  Suppose you get an interrupt
after 'wakeup_one(pf_purge_thread)' and before 'tsleep(..., 0)' in
pf_unload().  If the interrupt preempts and results in the purge thread
running and issuing its wakeup before the thread executing pf_unload()
resumes, then eventually when pf_unload() resumes it will do a tsleep() with
no timeout that will never be awoken.

You obviously didn't test this in a debug kernel since there is a KASSERT
explicitly to catch obvious tsleep races in _sleep():

    KASSERT(sbt != 0 || mtx_owned(&Giant) || lock != NULL,
        ("sleeping without a lock"));

You should fix this in the way that Gleb suggested.

Also, all kthreads/kprocs do a wakeup() inside of exit1() or kthread_exit()
to allow you to wait for a kthread to exit when unloading a module.
The general structure should be something like:

    struct thread *my_thread;

    void
    thread_main(void *arg)
    {
        LOCK(&mylock);
        while (!thread_quit) {
            UNLOCK(&mylock);
            /* do work */
            LOCK(&mylock);
            if (!thread_quit && no_work_to_do)
                lock_sleep(&some_wchan, &mylock, ...);
        }
        UNLOCK(&mylock);
        kthread_exit();
    }

    void
    unload_handler(...)
    {
        ...
        LOCK(&mylock);
        thread_quit = true;
        wakeup(&some_wchan);
        lock_sleep(my_thread, &mylock, ...);
        UNLOCK(&mylock);
    }

    void
    load_handler(...)
    {
        ...
        kthread_add(thread_main, arg, NULL, &my_thread, ...);
    }

If you want to create a proc then you can use 'struct proc *my_proc' and
sleep on 'my_proc' instead (along with using kproc_exit(), though
kthread_exit() from the last thread in a kproc should call kproc_exit() for
you).

-- 
John Baldwin
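
For anyone who wants to experiment with this shape outside the kernel, a
rough userspace analogue using POSIX threads might look like the following.
It is only a sketch under assumptions, not the pf code: a pthread mutex
stands in for mylock, pthread_cond_wait() stands in for the lock-coupled
sleep, and a second condition variable plus a thread_exited flag stand in
for the wakeup that kthread_exit() issues on the thread pointer.  The names
work_cv, exit_cv, work_pending, work_done, submit_work() and
load_unload_demo() are made up for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions, not kernel API). */
static pthread_mutex_t mylock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t work_cv = PTHREAD_COND_INITIALIZER; /* ~ &some_wchan */
static pthread_cond_t exit_cv = PTHREAD_COND_INITIALIZER; /* ~ my_thread */
static pthread_t my_thread;
static bool thread_quit, thread_exited;
static int work_pending, work_done;

static void *
thread_main(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&mylock);
	while (!thread_quit) {
		if (work_pending > 0) {
			work_pending--;
			pthread_mutex_unlock(&mylock);
			/* do one unit of work without the lock held */
			pthread_mutex_lock(&mylock);
			work_done++;
			continue;
		}
		/*
		 * The lock is held from the checks above until the wait
		 * atomically drops it, so a wakeup cannot be lost in between.
		 */
		pthread_cond_wait(&work_cv, &mylock);
	}
	/* Hand-rolled equivalent of the wakeup done by kthread_exit(). */
	thread_exited = true;
	pthread_cond_broadcast(&exit_cv);
	pthread_mutex_unlock(&mylock);
	return (NULL);
}

static int
load_handler(void)
{
	pthread_mutex_lock(&mylock);
	thread_quit = thread_exited = false;
	work_pending = work_done = 0;
	pthread_mutex_unlock(&mylock);
	return (pthread_create(&my_thread, NULL, thread_main, NULL));
}

static void
submit_work(int n)
{
	pthread_mutex_lock(&mylock);
	work_pending += n;
	pthread_cond_signal(&work_cv);
	pthread_mutex_unlock(&mylock);
}

static void
unload_handler(void)
{
	int rc;

	pthread_mutex_lock(&mylock);
	thread_quit = true;
	pthread_cond_signal(&work_cv);
	/* Sleep with the lock held until the worker reports its exit. */
	while (!thread_exited)
		pthread_cond_wait(&exit_cv, &mylock);
	pthread_mutex_unlock(&mylock);
	rc = pthread_join(my_thread, NULL);
	assert(rc == 0);
	(void)rc;
}

/* Returns 0 if the worker was started and then shut down cleanly. */
int
load_unload_demo(void)
{
	if (load_handler() != 0)
		return (-1);
	submit_work(3);
	unload_handler();
	return (thread_exited ? 0 : -1);
}
```

Calling load_unload_demo() from a small main() should return 0 once the
worker has exited.  The point of the sketch is that the predicate check and
the sleep both happen under the same lock the waker takes, so the wakeup
cannot land in the gap between them, which is the gap that a tsleep()
without a lock or timeout leaves open.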