From: John Baldwin
To: freebsd-hackers@freebsd.org
Cc: Nikola Knežević
Date: Thu, 5 Feb 2009 08:19:22 -0500
Subject: Re: blockable sleep lock (sleep mutex) 16
Message-Id: <200902050819.22726.jhb@freebsd.org>
References: <02026848-7F83-405C-B4F3-EDD8B47DA294@gmail.com> <498736C2.3040207@elischer.org>
List-Id: Technical Discussions relating to FreeBSD

On Wednesday 04 February 2009 11:05:02 am Nikola Knežević wrote:
> On 2 Feb 2009, at 19:09, Julian Elischer wrote:
>
> >>> It says "non-sleepable locks", yet it classifies click_instance
> >>> as sleep mutex. I think witness code should emit messages which
> >>> are more clear.
> >> It is confusing, but you can't do an M_WAITOK malloc while holding
> >> a mutex. Basically, sleeping actually means calling "*sleep()
> >> (such as mtx_sleep()) or cv_*wait*()". Blocking on a mutex is not
> >> sleeping, it's "blocking". Some locks (such as sx(9)) do "sleep"
> >> when you contest them. In the scheduler, sleeping and blocking are
> >> actually quite different (blocking uses turnstiles that handle
> >> priority inversions via priority propagation, sleeping uses sleep
> >> queues which do not do any of that). The underlying idea is that
> >> mutexes should be held for "short" periods of time, and that any
> >> sleeps are potentially unbounded. Holding a mutex while sleeping
> >> could result in a mutex being held for a long time.
> >
> > the locking overview page
> >     man 9 locking
> > tries to explain this.
> > I've been pestering John to proofread it and make suggestions for a
> > while now.
>
> Thanks John and Julian. I agree, man pages should be more clear :)
>
> I've switched from using mtx to sx locks, since they can be held
> while sleeping.
>
> Unfortunately, I've run into something really weird now, when I unload
> the module:
> ---8<---
> #0  doadump () at pcpu.h:195
> #1  0xffffffff8049ef98 in boot (howto=260)
>     at /usr/src/sys/kern/kern_shutdown.c:418
> #2  0xffffffff8049f429 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #3  0xffffffff8075cd26 in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
> ) at /usr/src/sys/amd64/amd64/trap.c:764
> #4  0xffffffff8075da62 in trap (frame=0xffffffff87699940)
>     at /usr/src/sys/amd64/amd64/trap.c:290
> #5  0xffffffff80743bfe in calltrap ()
>     at /usr/src/sys/amd64/amd64/exception.S:209
> #6  0xffffffff8052a411 in strcmp (s1=0xffffffff80824a0c "sigacts",
>     s2=0xffffffff877cd3a9) at /usr/src/sys/libkern/strcmp.c:45
> #7  0xffffffff804d7c61 in enroll (description=0xffffffff80824a0c "sigacts",
>     lock_class=0xffffffff80a19fe0) at /usr/src/sys/kern/subr_witness.c:1439
> #8  0xffffffff804d7fb1 in witness_init (lock=0xffffff00016f4ca8)
>     at /usr/src/sys/kern/subr_witness.c:618
> #9  0xffffffff8049fd31 in sigacts_alloc ()
>     at /usr/src/sys/kern/kern_sig.c:3280
> #10 0xffffffff80481121 in fork1 (td=0xffffff0001384a50, flags=20, pages=Variable "pages" is not available.
> ) at /usr/src/sys/kern/kern_fork.c:453
> #11 0xffffffff80481450 in fork (td=0xffffff0001384a50, uap=Variable "uap" is not available.
> ) at /usr/src/sys/kern/kern_fork.c:106
> #12 0xffffffff8075d260 in syscall (frame=0xffffffff87699c80)
>     at /usr/src/sys/amd64/amd64/trap.c:907
> #13 0xffffffff80743e0b in Xfast_syscall ()
>     at /usr/src/sys/amd64/amd64/exception.S:330
> #14 0x0000000800ca0a6c in ?? ()
> --->8---
>
> and in frame 7:
> (kgdb) p *w
> $5 = {w_name = 0xffffffff877cd3a9 <Address 0xffffffff877cd3a9 out of bounds>,
>   w_class = 0xffffffff80a19fe0, w_list = {stqe_next = 0xffffffff80accce0},
>   w_typelist = {stqe_next = 0xffffffff80accce0}, w_children = 0x0,
>   w_file = 0xffffffff877d1fa0 <Address 0xffffffff877d1fa0 out of bounds>,
>   w_line = 307, w_level = 0, w_refcount = 2, w_Giant_squawked = 0 '\0',
>   w_other_squawked = 0 '\0', w_same_squawked = 0 '\0', w_displayed = 0 '\0'}
> (kgdb) p *w->w_class
> $6 = {lc_name = 0xffffffff808564e0 "sleep mutex", lc_flags = 9,
>   lc_ddb_show = 0xffffffff80492e6b, lc_lock = 0xffffffff804938be,
>   lc_unlock = 0xffffffff804933fc}
>
> This happens after modevent exits.
>
> What puzzles me here is w_refcount of 2, while w_name is out of
> bounds. Locks I've created I properly destroyed (at least I think I
> did :)).

You are probably missing some sx_destroy()'s. You need to destroy each lock
you create with sx_init().

-- 
John Baldwin
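
The pairing rule in the reply above (one destroy for every init) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, and lock_type, model_lock_init(), model_lock_destroy() and model_live_types() are hypothetical names that do not exist in FreeBSD. The model assumes, as the backtrace suggests, that a registry keeps a borrowed pointer to each lock's name, counts the live locks sharing that name, and compares names whenever a new lock is registered. Under that assumption, a module that unloads without destroying its locks would leave behind an entry whose name pointer refers to memory that is no longer mapped, and a later name comparison would fault.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of a lock-type registry.
 * None of these names are FreeBSD kernel APIs.
 */
#define MAX_TYPES 16

struct lock_type {
    const char *name;   /* borrowed pointer, not a copy (assumption) */
    int refcount;       /* number of live locks sharing this name */
};

static struct lock_type types[MAX_TYPES];

/*
 * Model of lock initialization: reuse the entry with a matching name,
 * or enroll a new one. Every live entry's name is dereferenced here,
 * which is why a stale name pointer would be fatal.
 */
static struct lock_type *model_lock_init(const char *name)
{
    struct lock_type *free_slot = NULL;

    for (size_t i = 0; i < MAX_TYPES; i++) {
        if (types[i].refcount > 0) {
            if (strcmp(types[i].name, name) == 0) {
                types[i].refcount++;
                return &types[i];
            }
        } else if (free_slot == NULL) {
            free_slot = &types[i];
        }
    }
    assert(free_slot != NULL);  /* registry full */
    free_slot->name = name;
    free_slot->refcount = 1;
    return free_slot;
}

/* Model of lock destruction: drop one reference, retire the entry at zero. */
static void model_lock_destroy(struct lock_type *t)
{
    assert(t->refcount > 0);
    if (--t->refcount == 0)
        t->name = NULL;
}

/* Number of entries still registered; should be 0 before "unload". */
static int model_live_types(void)
{
    int n = 0;

    for (size_t i = 0; i < MAX_TYPES; i++)
        if (types[i].refcount > 0)
            n++;
    return n;
}
```

In the model, two locks initialized with the same name share one entry with a reference count of 2, and the entry is retired only after both have been destroyed. In a real module the equivalent is to call sx_destroy() (or mtx_destroy() for mutexes) on every initialized lock in the unload path, before the module's memory is released. Since the w_class in the dump is "sleep mutex", it may also be worth checking for a leftover mtx lock in addition to the sx locks.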