Date: Thu, 10 Aug 2006 18:17:05 +0200
From: Divacky Roman <xdivac02@stud.fit.vutbr.cz>
To: Brooks Davis <brooks@one-eyed-alien.net>
Cc: hackers@freebsd.org
Subject: Re: SoC: help with LISTs and killing procs
Message-ID: <20060810161705.GB19047@stud.fit.vutbr.cz>
In-Reply-To: <20060810154305.GA21483@lor.one-eyed-alien.net>
References: <20060810151616.GA17109@stud.fit.vutbr.cz> <20060810152359.GA21318@lor.one-eyed-alien.net> <20060810153543.GA19047@stud.fit.vutbr.cz> <20060810154305.GA21483@lor.one-eyed-alien.net>
On Thu, Aug 10, 2006 at 10:43:05AM -0500, Brooks Davis wrote:
> On Thu, Aug 10, 2006 at 05:35:43PM +0200, Divacky Roman wrote:
> > On Thu, Aug 10, 2006 at 10:23:59AM -0500, Brooks Davis wrote:
> > > On Thu, Aug 10, 2006 at 05:16:17PM +0200, Divacky Roman wrote:
> > > > hi
> > > >
> > > > I am doing this:
> > > >
> > > > (pseudocode)
> > > > LIST_FOREACH_SAFE(em, &td_em->shared->threads, threads, tmp_em) {
> > > >
> > > > kill(em, SIGKILL);
> > > > }
> > > >
> > > > kill(SIGKILL) calls exit() which calls my exit_hook()
> > > >
> > > > my exit_hook() does LIST_REMOVE(em, threads).
> > > >
> > > > The problem is that this is not synchronous, so I am getting an INVARIANTS panic:
> > > > "Bad link elm prev->next != elm". This happens because I visit the 1st item in the list,
> > > > call kill on it, and move on to the 2nd item; then the scheduler preempts my code and runs
> > > > exit() on the first proc, which removes the first entry, and bad things happen.
> > > >
> > > > I see this possible solution:
> > > >
> > > > make this synchronous, it can be done by something like:
> > > >
> > > > ....
> > > > kill(em, SIGKILL);
> > > > wait_for_proc_to_vanish();
> > > >
> > > > Please tell me what you think about this solution and, if it's correct, what wait_for_proc_to_vanish() should be.
> > > >
> > > > Maybe there's some better solution; please tell me.
> > >
> > > It sounds like you need a lock protecting the list. If you held it over
> > > the whole loop you could signal all processes before the exit_hook could
> > > remove any.
> >
> > I don't understand. I am protecting the list with rw_rlock();
> >
> > the exit_hook() then acquires rw_wlock() when removing the entry.
> > What exactly do you suggest I do? I don't get it.
>
> This can't be the case. If you're holding a read lock around the
> loop (it must cover the entire loop), it should not be possible for the
> exit_hook() to obtain a write lock while you are in the loop. Just to
> verify, is the lock for the list and not per element?

Oh, I see what's going on. In the exit_hook I am doing this:

    em = em_find(p->p_pid, EMUL_UNLOCKED); // this performs EMUL_RLOCK(&emul_lock);
    ...
    EMUL_RUNLOCK(&emul_lock);
    EMUL_WLOCK(&emul_lock);
    LIST_REMOVE(em, threads);
    SLIST_REMOVE(&emuldata_head, em, linux_emuldata, emuldatas);
    EMUL_WUNLOCK(&emul_lock);

The EMUL_RUNLOCK() releases the lock, so it doesn't wait. This should be turned into rw_try_upgrade(),
but I don't understand how ;(

Anyway, I still don't understand how I should use the lock to achieve the synchronization.
My code looks like:

    EMUL_RLOCK(&emul_lock);
    LIST_FOREACH_SAFE(em, &td_em->shared->threads, threads, tmp_em) {
    }
    EMUL_RUNLOCK(&emul_lock);

What do you suggest? I need to process the list first and then let the exit_hook in the various processes run.
