Date: Fri, 22 Jan 2010 17:40:41 +0100 From: Attilio Rao <attilio@freebsd.org> To: Jilles Tjoelker <jilles@stack.nl> Cc: Giovanni Trematerra <giovanni.trematerra@gmail.com>, freebsd-arch@freebsd.org Subject: Re: [PATCH] kthread_{suspend, resume, suspend_check} locking bugs Message-ID: <3bbf2fe11001220840o42a59d9cu476fe24063cd55b8@mail.gmail.com> In-Reply-To: <20100116125451.GA99364@stack.nl> References: <3bbf2fe11001091709t228c48cft1c17af686e9e9c46@mail.gmail.com> <20100116125451.GA99364@stack.nl>
next in thread | previous in thread | raw e-mail | index | archive | help
2010/1/16 Jilles Tjoelker <jilles@stack.nl>: > On Sun, Jan 10, 2010 at 02:09:43AM +0100, Attilio Rao wrote: >> I think that routines kthread_suspend(), kthread_resume() and >> kthread_suspend_check() are not adeguately SMP protected. >> That is because, in particular, the critical path doesn't protect, >> together, TDF_KTH_SUSP and sleeping activity. The right pattern should >> be to use the thread lock spinlock as an interlock and use msleep. >> Such bugs have not been revealed probabilly because there has been a >> lack of testing of such primitives and there are not, currently, >> consumers within our stock kernel. > >> Additively, kthread_suspend_check() seems to require to always pass >> curthread, which is silly (as we don't have to conform to any >> particular KPI), thus I think it is appropriate for the prototype to >> change. >> The following patch should fix the issue: >> http://www.freebsd.org/~attilio/kthread.diff > > The analysis and patch look sensible. After have digged more with jhb I made a new patch: http://www.freebsd.org/~attilio/kthread_races.diff The previous one has the problem that thread_lock is going to change in ULE, and thus panic, if the awaken thread will be scheduled on a different runqueue wrt the one where it got sleeping. On this optic it would be good to panic if a td_lock lock is passed to msleep_spin() but that is not an easy condition to check (I thought about asserting the name of the locks to not be a threads container one... sigh...). On the original code, there is also another problem. The waitchannel for suspender and suspending threads are different while they should not be (and kproc_*() seems to agree with me). It might be fixed as well. Gianni, may you test this patch with the modules you made? Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?3bbf2fe11001220840o42a59d9cu476fe24063cd55b8>