From: John Baldwin
To: freebsd-current@FreeBSD.org
cc: current@FreeBSD.org
Date: Mon, 23 Aug 2004 14:10:06 -0400
Subject: Re: new twist on IPI deadlock
References: <20040821125950.L84878@carver.gumbysoft.com>
In-Reply-To: <20040821125950.L84878@carver.gumbysoft.com>
Message-Id: <200408231410.06587.jhb@FreeBSD.org>

On Saturday 21 August 2004 04:12 pm, Doug White wrote:
> Got this on my xeon today, with hyperthreading disabled, and dropping to
> ddb with NMI after a hang.  Looks like smp_rendezvous_action() colliding
> with smp_tlb_shootdown().
>
> smp_rendezvous and smp_tlb_shootdown use different IPI vectors and
> different locks, but I wonder if the operations aren't orthogonal, and
> doing multiple IPIs at once can cause unexpected behavior.

Cute!  This might actually explain the SMP deadlocks with KSE apps.  The fix
is probably to make the TLB code use the same mutex as the SMP rendezvous
code.

> kernel trap 19 with interrupts disabled
> NMI ... going to debugger
> [thread 100168]
> Stopped at smp_rendezvous_action+0x30: cmpl mp_ncpus,%eax
> db> tr
> smp_rendezvous_action(fd) at smp_rendezvous_action+0x30
> smp_rendezvous(0,c06a0724,0,c2a66420) at smp_rendezvous+0xd7
> i386_ldt_grow(c2a66420,12,8,dfe2a000,c2a63f60) at i386_ldt_grow+0x1b1
> i386_set_ldt(c2a66420,bfbfe968,c2a63de0,0,dfd61d40) at i386_set_ldt+0x2de
> sysarch(c2a66420,dfd61d14,2,0,206) at sysarch+0x67
> syscall(2f,2f,2f,2807f010,0) at syscall+0x287
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (165, FreeBSD ELF32, sysarch), eip = 0x2807196f, esp =
> 0xbfbfe954, ebp = 0x-
> [...]
> db> tr 28658
> sched_switch(f6,dc511000,dc512000) at sched_switch+0x9b
> smp_invlpg_range(dc511000,dc512000) at smp_invlpg_range+0x1c
> pmap_invalidate_range(c0775e20,dc511000,dc512000,c2294780,c2277170) at
> pmap_invalidate_5
> pmap_qenter(dc511000,c2294790,1) at pmap_qenter+0x50
> sf_buf_alloc(c1a329e0,0,0,0,0) at sf_buf_alloc+0x1a9
> uiomove_fromphys(c28a9600,3000,8d8,dfed5c88,0) at uiomove_fromphys+0x92
> pipe_read(c2a68bf4,dfed5c88,c2b7c400,0,c349a2c0) at pipe_read+0x238
> dofileread(c349a2c0,c2a68bf4,0,812a000,4000) at dofileread+0x95
> read(c349a2c0,dfed5d14,3,0,296) at read+0x3b
> syscall(2f,2f,2f,80da500,80f7034) at syscall+0x287
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (3, FreeBSD ELF32, read), eip = 0x80a5783, esp = 0xbfbfe69c,
> ebp = 0xbfbfe6-
> db> tr 28690
> smp_rendezvous_action(fd) at smp_rendezvous_action+0x30
> smp_rendezvous(0,c06a0724,0,c2a66420) at smp_rendezvous+0xd7
> i386_ldt_grow(c2a66420,12,8,dfe2a000,c2a63f60) at i386_ldt_grow+0x1b1
> i386_set_ldt(c2a66420,bfbfe968,c2a63de0,0,dfd61d40) at i386_set_ldt+0x2de
> sysarch(c2a66420,dfd61d14,2,0,206) at sysarch+0x67
> syscall(2f,2f,2f,2807f010,0) at syscall+0x287
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (165, FreeBSD ELF32, sysarch), eip = 0x2807196f, esp =
> 0xbfbfe954, ebp = 0x-

-- 
John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org