From owner-freebsd-current@FreeBSD.ORG  Sun Oct 17 19:17:43 2004
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1D76716A4CE
	for <current@FreeBSD.org>; Sun, 17 Oct 2004 19:17:43 +0000 (GMT)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AF75043D5D
	for <current@FreeBSD.org>; Sun, 17 Oct 2004 19:17:42 +0000 (GMT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (localhost [127.0.0.1])
	by fledge.watson.org (8.13.1/8.13.1) with ESMTP id i9HJHcNk032428
	for <current@FreeBSD.org>; Sun, 17 Oct 2004 15:17:38 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Received: from localhost (robert@localhost)i9HJHcgZ032425
	for <current@FreeBSD.org>; Sun, 17 Oct 2004 15:17:38 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Date: Sun, 17 Oct 2004 15:17:37 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: current@FreeBSD.org
Message-ID: <Pine.NEB.3.96L.1041017151317.29749A-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: Preemption-related bug in propagate_priority(): it's OK to hold
	Giant but not be running
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 17 Oct 2004 19:17:43 -0000


Ran into an interesting problem relating to WITNESS and a
multiply-destroyed mutex today, but I didn't get the real panic message
because of what looks like one or two other bugs.  Here's the high-level:

sodealloc() tries to destroy the mutex on a socket more than once, causing
WITNESS to panic.  The soft clock ithread preempts the serial output in
panic() when a critical section is released, resulting in a context switch
to that ithread.  That ithread then attempts to lock a mutex, and is
horrified to find out that Giant is held by the thread it preempted, but
that that thread isn't blocked:

tiger-2# ppanic: process 15955(tcpconnect):2 holds Giant but isn't blocked
on a lock

  % rwatson -- note that double p is from the initiating panic

cpuid = 0
KDB: enter: panic
[thread 100071]
Stopped at      kdb_enter+0x2b: nop
db> trace
kdb_enter(c07fc101) at kdb_enter+0x2b
panic(c07ff1ec,3e53,c5f90588,2,c080c957) at panic+0x127
propagate_priority(c547b180,c08c47b8,c5f91780,c08bdbc0,0) at
propagate_priority+
0x139
turnstile_wait(0,c08bdbc0,c5f91780,c08bdbc0,2,c07fb4a5,217) at
turnstile_wait+0x
2de
_mtx_lock_sleep(c08bdbc0,c547b180,0,c07fd124,f7) at _mtx_lock_sleep+0x15f
_mtx_lock_flags(c08bdbc0,0,c07fd124,f7,0) at _mtx_lock_flags+0x85
softclock(0) at softclock+0x16a
ithread_loop(c5436e80,e942fd48,c5436e80,c05f7c80,0) at ithread_loop+0x124
fork_exit(c05f7c80,c5436e80,e942fd48) at fork_exit+0xa4
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe942fd7c, ebp = 0 ---
db> show locks
db> show pcpu
cpuid        = 0
curthread    = 0xc547b180: pid 87 "swi4: clock sio"
curpcb       = 0xe942fda0
fpcurthread  = none
idlethread   = 0xc53e4a80: pid 14 "idle: cpu0"
APIC ID      = 0
currentldt   = 0x28
spin locks held:
db> ps
  pid   proc     uarea   uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
15955 c5f90400 ebbb1000    0   525 15955 0004002 [Can run] tcpconnect
...
   87 c548b400 e946e000    0     0     0 000020c [LOCK  Giant c5432d00]
swi4: clock sio
db> trace 15955
sched_switch(c5f91780,c5451d80,6) at sched_switch+0x16f
mi_switch(6,c5451d80,c5451ed0,c5451d80,ebba492c) at mi_switch+0x264
maybe_preempt(c5451d80) at maybe_preempt+0x156
sched_add(c5451d80,4,c53e2d80,c5451d80,c5450800) at sched_add+0x153
setrunqueue(c5451d80,4) at setrunqueue+0xab
ithread_schedule(c53e2d80,1f,c5f91780,c5f91780,2580) at
ithread_schedule+0xb3
intr_execute_handlers(c53dc0d8,ebba49a8,1f,ebba49f0,c0780ad3) at
intr_execute_ha
ndlers+0xf5
lapic_handle_intr(4f) at lapic_handle_intr+0x2e
Xapic_isr2() at Xapic_isr2+0x33
--- interrupt, eip = 0xc0619446, esp = 0xebba49ec, ebp = 0xebba49f0 ---
critical_exit(3f8,ebba4a30,c076f731,c08fa000,0) at critical_exit+0x92
_mtx_unlock_spin_flags(c08fa000,0,c08174fc,be4) at
_mtx_unlock_spin_flags+0x8d
siocnputc(c08987c0,70) at siocnputc+0xb9
cnputc(70) at cnputc+0x4d
putchar(70,ebba4b44) at putchar+0x52
kvprintf(c07fc2ed,c06219b8,ebba4b44,a,ebba4b64) at kvprintf+0x77
printf(c07fc2ed,c08bf060,c08bf060,100,c07ff91e,ebba4b98) at printf+0x43
panic(c07ff91e,c07b69b1,c0813aa8,c07ff6af,c5a23ebc) at panic+0xcb
witness_destroy(c5a23ebc,c5a23dec,ebba4bd0,c063f15f,c5a23ebc) at
witness_destroy
+0x40
mtx_destroy(c5a23ebc,c5e94e00,c5a23ebc,ebba4be8,c063f4fd) at
mtx_destroy+0x7a
sodealloc(c5a23dec,c5a23eb0,c5832bb0,c5a23dec,ebba4c00) at sodealloc+0xf3
sofree(c5a23dec) at sofree+0x281
soclose(c5a23dec,c5832bb0,0,ebba4c2c,c05efad0) at soclose+0x24d
soo_close(c5832bb0,c5f91780) at soo_close+0x4b
fdrop_locked(c5832bb0,c5f91780,c53ccf40,0,c07f90b1) at fdrop_locked+0x84
fdrop(c5832bb0,c5f91780,ebba4c78,c0628c5c,c07f90ba) at fdrop+0x24
closef(c5832bb0,c5f91780,c5ebc628,0,c07f90b1) at closef+0x1db
close(c5f91780,ebba4d14,1,14b,292) at close+0xe6
syscall(2f,bfbf002f,bfbf002f,3,bfbfedf7) at syscall+0x213
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (6, FreeBSD ELF32, close), eip = 0x280c154b, esp = 0xbfbfec0c,
ebp =
 0xbfbfec48 ---
db> show locks 15955
exclusive sleep mutex Giant r = 0 (0xc08bdbc0) locked @
kern/kern_descrip.c:967

So on face value, it looks like propagate_priority() is unhappy about the
tcpconnect thread (that was preempted) still holding Giant.  However,
given that preemption is supposed to preempt but leave a thread runnable,
this would seem not to be a panic-worthy condition.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Principal Research Scientist, McAfee Research