From: Brooks Davis
To: Divacky Roman
Cc: hackers@freebsd.org
Date: Thu, 10 Aug 2006 11:33:37 -0500
Subject: Re: SoC: help with LISTs and killing procs
Message-ID: <20060810163337.GA22097@lor.one-eyed-alien.net>
In-Reply-To: <20060810161705.GB19047@stud.fit.vutbr.cz>
List-Id: Technical Discussions relating to FreeBSD

On Thu, Aug 10, 2006 at 06:17:05PM +0200, Divacky Roman wrote:
> On Thu, Aug 10, 2006 at 10:43:05AM -0500, Brooks Davis wrote:
> > On Thu, Aug 10, 2006 at 05:35:43PM +0200, Divacky Roman wrote:
> > > On Thu, Aug 10, 2006 at 10:23:59AM -0500, Brooks Davis wrote:
> > > > On Thu, Aug 10, 2006 at 05:16:17PM +0200, Divacky Roman wrote:
> > > > > hi
> > > > >
> > > > > I am doing this:
> > > > >
> > > > > (pseudocode)
> > > > > LIST_FOREACH_SAFE(em, &td_em->shared->threads, threads, tmp_em) {
> > > > >
> > > > >     kill(em, SIGKILL);
> > > > > }
> > > > >
> > > > > kill(SIGKILL) calls exit() which calls my exit_hook()
> > > > >
> > > > > my exit_hook() does LIST_REMOVE(em, threads).
> > > > >
> > > > > the problem is that this is not synchronous, so I am getting a panic from INVARIANTS
> > > > > that "Bad link elm prev->next != elm". This is because I visit the 1st item in the list,
> > > > > call kill on it, then process the 2nd item; then the scheduler preempts my code and calls
> > > > > exit() on the first proc, which removes the first entry, and bad things happen.
> > > > >
> > > > > I see these possible solutions:
> > > > >
> > > > > make this synchronous, it can be done by something like:
> > > > >
> > > > > ....
> > > > > kill(em, SIGKILL);
> > > > > wait_for_proc_to_vanish();
> > > > >
> > > > > pls tell me what you think about this solution and, if it's correct, what
> > > > > the wait_for_proc_to_vanish() is
> > > > >
> > > > > maybe there's some better solution, pls tell me.
> > > >
> > > > It sounds like you need a lock protecting the list.  If you held it over
> > > > the whole loop you could signal all processes before the exit_hook could
> > > > remove any.
> > >
> > > I don't understand.
> > > I am protecting the list with rw_rlock();
> > >
> > > the exit_hook() then acquires rw_wlock() when removing the entry.
> > > what exactly do you suggest I do? I don't get it.
> >
> > This can't be the case.  If you're holding a read lock around the
> > loop (it must cover the entire loop), it should not be possible for the
> > exit_hook() to obtain a write lock while you are in the loop.  Just to
> > verify, is the lock for the list and not per element?
>
> oh.. I see what's going on.. in the exit_hook I am doing this:
>
>
>     em = em_find(p->p_pid, EMUL_UNLOCKED); // this performs EMUL_RLOCK(&emul_lock);
>     ...
>     EMUL_RUNLOCK(&emul_lock);
>
>     EMUL_WLOCK(&emul_lock);
>     LIST_REMOVE(em, threads);
>     SLIST_REMOVE(&emuldata_head, em, linux_emuldata, emuldatas);
>     EMUL_WUNLOCK(&emul_lock);
>
> the EMUL_RUNLOCK() unlocks it so it doesn't wait. This should be turned into rw_try_upgrade()
> but I don't understand how ;(
>
> anyway, I still don't understand how I should use the lock to achieve the synchronization.
>
> my code looks like:
>
>     EMUL_RLOCK(&emul_lock);
>     LIST_FOREACH_SAFE(em, &td_em->shared->threads, threads, tmp_em) {
>     }
>     EMUL_RUNLOCK(&emul_lock);
>
> what do you suggest? I need to process the list first and then let the
> exit_hook in the various processes run.

I'll admit to not being super familiar with the rwlock code, but unless
exit_hook is being run in the same context as the loop (i.e. the signal
handling isn't asynchronous), the unlock shouldn't result in the release
of the loop's read lock, and thus the write lock request should not be
granted until the loop has finished.

-- Brooks