From: John Baldwin
To: Julian Elischer
Cc: freebsd-current@freebsd.org, Robert Watson
Date: Tue, 19 Oct 2004 17:45:59 -0400
Subject: Re: Preemption-related bug in propagate_priority(): it's OK to hold Giant but not be running
Message-Id: <200410191745.59493.jhb@FreeBSD.org>
In-Reply-To: <41758849.5080500@elischer.org>

On Tuesday 19 October 2004 05:34 pm, Julian Elischer wrote:
> John Baldwin wrote:
> >On Tuesday 19 October 2004 05:03 pm, John Baldwin wrote:
> >>On Sunday 17 October 2004 03:17 pm, Robert Watson wrote:
> >>>Ran into an interesting problem relating to WITNESS and a
> >>>multiply-destroyed mutex today, but I didn't get the real panic message
> >>>because of what looks like one or two other bugs. Here's the high-level:
> >>>
> >>>sodealloc() tries to destroy the mutex on a socket more than once,
> >>>causing WITNESS to panic. The soft clock ithread preempts the serial
> >>>output in panic() when a critical section is released, resulting in a
> >>>context switch to that ithread. That ithread then attempts to lock a
> >>>mutex, and is horrified to find out that Giant is held by the thread it
> >>>preempted, but that that thread isn't blocked:
> >>>
> >>>tiger-2# ppanic: process 15955(tcpconnect):2 holds Giant but isn't
> >>>blocked on a lock
> >>>
> >>> % rwatson -- note that double p is from the initiating panic
> >>>
> >>>cpuid = 0
> >>>KDB: enter: panic
> >>>[thread 100071]
> >>>Stopped at kdb_enter+0x2b: nop
> >>>db> trace
> >>>kdb_enter(c07fc101) at kdb_enter+0x2b
> >>>panic(c07ff1ec,3e53,c5f90588,2,c080c957) at panic+0x127
> >>>propagate_priority(c547b180,c08c47b8,c5f91780,c08bdbc0,0) at propagate_priority+0x139
> >>>turnstile_wait(0,c08bdbc0,c5f91780,c08bdbc0,2,c07fb4a5,217) at turnstile_wait+0x2de
> >>>_mtx_lock_sleep(c08bdbc0,c547b180,0,c07fd124,f7) at _mtx_lock_sleep+0x15f
> >>>_mtx_lock_flags(c08bdbc0,0,c07fd124,f7,0) at _mtx_lock_flags+0x85
> >>>softclock(0) at softclock+0x16a
> >>>ithread_loop(c5436e80,e942fd48,c5436e80,c05f7c80,0) at ithread_loop+0x124
> >>>fork_exit(c05f7c80,c5436e80,e942fd48) at fork_exit+0xa4
> >>>fork_trampoline() at fork_trampoline+0x8
> >>>--- trap 0x1, eip = 0, esp = 0xe942fd7c, ebp = 0 ---
> >>>db> show locks
> >>>db> show pcpu
> >>>cpuid = 0
> >>>curthread = 0xc547b180: pid 87 "swi4: clock sio"
> >>>curpcb = 0xe942fda0
> >>>fpcurthread = none
> >>>idlethread = 0xc53e4a80: pid 14 "idle: cpu0"
> >>>APIC ID = 0
> >>>currentldt = 0x28
> >>>spin locks held:
> >>>db> ps
> >>>  pid proc     uarea    uid ppid  pgrp  flag    stat        wmesg wchan cmd
> >>>15955 c5f90400 ebbb1000   0  525 15955 0004002 [Can run]    tcpconnect
> >>>...
> >>>   87 c548b400 e946e000   0    0     0 000020c [LOCK Giant c5432d00] swi4: clock sio
> >>>db> trace 15955
> >>>sched_switch(c5f91780,c5451d80,6) at sched_switch+0x16f
> >>>mi_switch(6,c5451d80,c5451ed0,c5451d80,ebba492c) at mi_switch+0x264
> >>>maybe_preempt(c5451d80) at maybe_preempt+0x156
> >>>sched_add(c5451d80,4,c53e2d80,c5451d80,c5450800) at sched_add+0x153
> >>>setrunqueue(c5451d80,4) at setrunqueue+0xab
> >>>ithread_schedule(c53e2d80,1f,c5f91780,c5f91780,2580) at ithread_schedule+0xb3
> >>>intr_execute_handlers(c53dc0d8,ebba49a8,1f,ebba49f0,c0780ad3) at intr_execute_handlers+0xf5
> >>>lapic_handle_intr(4f) at lapic_handle_intr+0x2e
> >>>Xapic_isr2() at Xapic_isr2+0x33
> >>>--- interrupt, eip = 0xc0619446, esp = 0xebba49ec, ebp = 0xebba49f0 ---
> >>>critical_exit(3f8,ebba4a30,c076f731,c08fa000,0) at critical_exit+0x92
> >>>_mtx_unlock_spin_flags(c08fa000,0,c08174fc,be4) at _mtx_unlock_spin_flags+0x8d
> >>>siocnputc(c08987c0,70) at siocnputc+0xb9
> >>>cnputc(70) at cnputc+0x4d
> >>>putchar(70,ebba4b44) at putchar+0x52
> >>>kvprintf(c07fc2ed,c06219b8,ebba4b44,a,ebba4b64) at kvprintf+0x77
> >>>printf(c07fc2ed,c08bf060,c08bf060,100,c07ff91e,ebba4b98) at printf+0x43
> >>>panic(c07ff91e,c07b69b1,c0813aa8,c07ff6af,c5a23ebc) at panic+0xcb
> >>>witness_destroy(c5a23ebc,c5a23dec,ebba4bd0,c063f15f,c5a23ebc) at witness_destroy+0x40
> >>>mtx_destroy(c5a23ebc,c5e94e00,c5a23ebc,ebba4be8,c063f4fd) at mtx_destroy+0x7a
> >>>sodealloc(c5a23dec,c5a23eb0,c5832bb0,c5a23dec,ebba4c00) at sodealloc+0xf3
> >>>sofree(c5a23dec) at sofree+0x281
> >>>soclose(c5a23dec,c5832bb0,0,ebba4c2c,c05efad0) at soclose+0x24d
> >>>soo_close(c5832bb0,c5f91780) at soo_close+0x4b
> >>>fdrop_locked(c5832bb0,c5f91780,c53ccf40,0,c07f90b1) at fdrop_locked+0x84
> >>>fdrop(c5832bb0,c5f91780,ebba4c78,c0628c5c,c07f90ba) at fdrop+0x24
> >>>closef(c5832bb0,c5f91780,c5ebc628,0,c07f90b1) at closef+0x1db
> >>>close(c5f91780,ebba4d14,1,14b,292) at close+0xe6
> >>>syscall(2f,bfbf002f,bfbf002f,3,bfbfedf7) at syscall+0x213
> >>>Xint0x80_syscall() at Xint0x80_syscall+0x1f
> >>>--- syscall (6, FreeBSD ELF32, close), eip = 0x280c154b, esp = 0xbfbfec0c, ebp = 0xbfbfec48 ---
> >>>db> show locks 15955
> >>>exclusive sleep mutex Giant r = 0 (0xc08bdbc0) locked @ kern/kern_descrip.c:967
> >>>
> >>>So on face value, it looks like propagate_priority() is unhappy about
> >>>the tcpconnect thread (that was preempted) still holding Giant.
> >>>However, given that preemption is supposed to preempt but leave a
> >>>thread runnable, this would seem not to be a panic-worthy condition.
> >>
> >>Yeah, pp doesn't like a thread in TD_CAN_RUN(). That state can only
> >>happen if you have already panic'd, though. I can fix it to allow
> >>TD_CAN_RUN if the kernel has already panic'd.
> >
> >Actually, this won't work. Here's the problem: the reason you got to pp()
> >is that you blocked on Giant, right? So your ithread is going to block
> >after pp() returns. Since the thread at the end of the chain is
> >TD_CAN_RUN and due to the weird TDF_INPANIC stuff we have in
> >choosethread(), the thread holding the lock that the ithread needs is
> >_never_ going to run, so your ithread will never resume. The end result
> >is that if you made pp() just return rather than panic'ing in this case,
> >you would get a single 'p' character on your output, and then the machine
> >would deadlock. At least this way you get into ddb.
:-P

>
> it would be relatively simple to put those 'discarded' threads onto a
> single list in the kernel
> and 'c' could put them back on the run queue :-)

That's a really nasty hack. I almost wonder if we shouldn't just bump the
priority of the thread that panics way up instead, or maybe use a dedicated
thread for handling panics or something.

-- 
John Baldwin <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/