Date: Sun, 17 Nov 2002 14:54:04 -0500 (EST)
From: Robert Watson <rwatson@freebsd.org>
To: "Joel M. Baldwin" <qumqats@outel.org>
Cc: current@freebsd.org
Subject: Re: more info from panic from running dnet on SMP kernel ( lock order reversal, recursed on non-recursive lock )
Message-ID: <Pine.NEB.3.96L.1021117145242.93303I-100000@fledge.watson.org>
In-Reply-To: <211086306.1037497825@[192.168.1.20]>

Hmm.  It looks like there is indeed a lock leak in the RFTHREAD code.
Maybe a change like the following might help:

                        PROC_LOCK(p2);
                        psignal(p2, SIGKILL);
                        PROC_UNLOCK(p2);
                }

Change the } to:

                } else
                        PROC_UNLOCK(p1->p_leader);

And see if that gets rid of the problem.  Any chance this is highly
reproducible, btw? :-)  And what app are you running that's using
RFTHREAD -- linux thread stuff?

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

On Sun, 17 Nov 2002, Joel M. Baldwin wrote:

>
> running dnet on a SMP kernel causes the kernel to panic.
>
>
> lock order reversal
>  1st 0xc2c803e8 process lock (process lock) @ ../../../kern/kern_fork.c:571
>  2nd 0xc03cfce0 proctree (proctree) @ ../../../kern/kern_fork.c:596
> recursed on non-recursive lock (sleep mutex) process lock @ ../../../kern/kern_fork.c:599
> first acquired @ ../../../kern/kern_fork.c:571
> panic: recurse
> cpuid = 1; lapic.id = 01000000
> Debugger("panic")
> Stopped at      Debugger+0x55:  xchgl   %ebx,in_Debugger.0
> db> t
> Debugger(c03926fa,1000000,c0395ada,d26f5c08,1) at Debugger+0x55
> panic(c0395ada,c038feab,23b,c038feab,257) at panic+0x11f
> witness_lock(c2c803e8,8,c038feab,257,0) at witness_lock+0x3e6
> _mtx_lock_flags(c2c803e8,0,c038feab,257,d26f5cb8) at _mtx_lock_flags+0xb2
> fork1(c2773d00,6050,0,d26f5cd4,c2c803e8) at fork1+0xbfc
> rfork(c2773d00,d26f5d10,c03b07a2,407,1) at rfork+0x65
> syscall(2f,2f,2f,0,80ddf10) at syscall+0x28e
> Xint0x80_syscall() at Xint0x80_syscall+0x1d
> --- syscall (251, FreeBSD ELF32, rfork), eip = 0x8087d14, esp = 0xbfbff4a8, ebp = 0xbfbff524 ---
> db> ps
>   pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
>  6217 c2b98e00 d28a7000    0  6215  6216 0000000 newpanic: unknown thread state
> cpuid = 1; lapic.id = 01000000
> boot() called on cpu#1
> Uptime: 1h43m39s
> pfs_vncache_unload(): 1 entries remaining
> Dumping 255 MB
>  16 32 48 64 80 96 112 128 144 160 176 192 208 224 240
> Dump complete
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> cpu_reset called on cpu#1
> cpu_reset: Restarting BSP
> cpu_reset_proxy: Stopped CPU 1
>
>
>
> And then, when the system came back up, I took a closer
> look at the core dump.
>
>
> (kgdb) where
> #0  doadump () at ../../../kern/kern_shutdown.c:232
> #1  0xc02114ad in boot (howto=260) at ../../../kern/kern_shutdown.c:364
> #2  0xc0211767 in panic () at ../../../kern/kern_shutdown.c:517
> #3  0xc014f2bc in db_ps (dummy1=-1070342907, dummy2=0, dummy3=-1, dummy4=0xd26f5a24 "")
>     at ../../../ddb/db_ps.c:169
> #4  0xc014d142 in db_command (last_cmdp=0xc03be920, cmd_table=0x0, aux_cmd_tablep=0xc03b5540,
>     aux_cmd_tablep_end=0xc03b5558) at ../../../ddb/db_command.c:346
> #5  0xc014d256 in db_command_loop () at ../../../ddb/db_command.c:472
> #6  0xc014feea in db_trap (type=3, code=0) at ../../../ddb/db_trap.c:72
> #7  0xc033da10 in kdb_trap (type=3, code=0, regs=0xd26f5b80)
>     at ../../../i386/i386/db_interface.c:166
> #8  0xc0356a3f in trap (frame=
>       {tf_fs = -1069481960, tf_es = 16, tf_ds = 16, tf_edi = -1032372992,
>        tf_esi = 256, tf_ebp = -764453940, tf_isp = -764453972, tf_ebx = 0,
>        tf_edx = 0, tf_ecx = 1, tf_eax = 18, tf_trapno = 3, tf_err = 0,
>        tf_eip = -1070342907, tf_cs = 8, tf_eflags = 662, tf_esp = -1069883258,
>        tf_ss = -1069996294}) at ../../../i386/i386/trap.c:603
> #9  0xc033f238 in calltrap () at {standard input}:99
> #10 0xc021174f in panic (fmt=0x0) at ../../../kern/kern_shutdown.c:503
> #11 0xc02333d6 in witness_lock (lock=0xc2c803e8, flags=8,
>     file=0xc038feab "../../../kern/kern_fork.c", line=599) at ../../../kern/subr_witness.c:609
> #12 0xc02079c2 in _mtx_lock_flags (m=0xc03cf4c0, opts=0, file=0xc042cfd4 "è\003ÈÂ«þ8À;\002",
>     line=-1027079192) at ../../../kern/kern_mutex.c:328
> #13 0xc01fd3ec in fork1 (td=0xc2773d00, flags=24656, pages=0, procp=0xd26f5cd4)
>     at ../../../kern/kern_fork.c:599
> #14 0xc01fc6c5 in rfork (td=0xc2773d00, uap=0xd26f5d10) at ../../../kern/kern_fork.c:168
> #15 0xc035739e in syscall (frame=
>       {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = 0, tf_esi = 135126800,
>        tf_ebp = -1077938908, tf_isp = -764453516, tf_ebx = 2, tf_edx = 135381248,
>        tf_ecx = 135381248, tf_eax = 251, tf_trapno = 0, tf_err = 2,
>        tf_eip = 134774036, tf_cs = 31, tf_eflags = 659, tf_esp = -1077939032,
>        tf_ss = 47})
>     at ../../../i386/i386/trap.c:1033
> #16 0xc033f28d in Xint0x80_syscall () at {standard input}:141
> ---Can't read userspace from dump, or kernel process---
>
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-current" in the body of the message
>
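
The lock leak suggested in the reply can be illustrated outside the kernel. What follows is a minimal userland sketch, not the FreeBSD code: toy_lock, toy_proc, check_leader_leaky and check_leader_fixed are hypothetical names invented for illustration, and the toy lock only imitates a non-recursive mutex by refusing a second acquire. It assumes the shape implied by the proposed change, namely that the leader's lock is taken before the test and was released only on the branch that kills the child.

```c
#include <assert.h>

/* Toy stand-in for a non-recursive mutex (hypothetical, not a kernel API). */
struct toy_lock {
    int held;               /* 1 while the lock is held, 0 otherwise */
};

/* Returns 0 on success, -1 if the lock is already held (a "recurse"). */
static int
toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return (-1);
    l->held = 1;
    return (0);
}

static void
toy_lock_release(struct toy_lock *l)
{
    assert(l->held);        /* releasing an unheld lock is a bug */
    l->held = 0;
}

/* Toy process: its own lock plus two flags (all hypothetical). */
struct toy_proc {
    struct toy_lock lock;
    int exiting;            /* leader is on its way out */
    int killed;             /* child was told to die */
};

/* Leaky variant: the leader lock is dropped only when the leader is exiting. */
static void
check_leader_leaky(struct toy_proc *leader, struct toy_proc *child)
{
    (void)toy_lock_acquire(&leader->lock);
    if (leader->exiting) {
        toy_lock_release(&leader->lock);
        child->killed = 1;
    }
    /* Not exiting: we return with leader->lock still held. */
}

/* Fixed variant: the added else branch releases the lock on the other path. */
static void
check_leader_fixed(struct toy_proc *leader, struct toy_proc *child)
{
    (void)toy_lock_acquire(&leader->lock);
    if (leader->exiting) {
        toy_lock_release(&leader->lock);
        child->killed = 1;
    } else
        toy_lock_release(&leader->lock);
}
```

After check_leader_leaky() runs with a leader that is not exiting, a later toy_lock_acquire() on the same lock fails, which is the toy analogue of "recursed on non-recursive lock". With check_leader_fixed() the later acquire succeeds on both paths.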