Date: Sat, 14 Aug 2004 22:04:53 -0700 (PDT)
From: Doug White
To: current@freebsd.org
Message-ID: <20040814214821.S6429@carver.gumbysoft.com>
Subject: traceback from hung system
List-Id: Discussions about the use of FreeBSD-current

Hey folks,

Hoping that Peter & other hardware people see this. This is on a dual Xeon 2.4GHz Dell PE1750, for reference.

It looks like the hangs that Robert and I have been experiencing recently with buildworld are related to Scott & sandvine's problems with IPI delivery.
I was able, with a hack from rwatson to avoid stop_cpus() when going into ddb, to get into ddb and get a traceback:

stopped at      smp_tlb_shootdown+0x45: jb      smp_tlb_shootdown+0x3c
db> tr
smp_tlb_shootdown(f6,db5f7000,db5f8000) at smp_tlb_shootdown+0x45
smp_invlpg_range(db5f7000,db5f8000) at smp_invlpg_range+0x1c
pmap_invalidate_range(c0775de0,db5f7000,db5f8000,c227a000,c22774a4) at pmap_invalidate_range+0xb5
pmap_qenter(db5f7000,c227a010,1) at pmap_qenter+0x50
sf_buf_alloc(c1526388,0,0,0,0) at sf_buf_alloc+0x1a9
uiomove_fromphys(c3f05d58,0,27a5,dfe8cc88,0) at uiomove_fromphys+0x92
pipe_read(c287fdd0,dfe8cc88,c2c11080,0,c2c679a0) at pipe_read+0x238
dofileread(c2c679a0,c287fdd0,0,812a000,4000) at dofileread+0x95
read(c2c679a0,dfe8cd14,3,0,296) at read+0x3b
syscall(2f,2f,2f,80da500,80f7034) at syscall+0x287
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (3, FreeBSD ELF32, read), eip = 0x80a5783, esp = 0xbfbfe72c, ebp = 0xbfbfe748 ---

Additional details and data collection at

http://www.gumbysoft.com/debug-20040814

(includes 'ps' and 'tr' output for all processes on CPUs.)

I can get the hang to come up easily enough, although I don't know how reliable the jump to DDB is. I tried to get into gdb via firewire but it wasn't working for me.

The variable smp_tlb_wait was set to 0 according to 'x smp_tlb_wait' in ddb.

ddb isn't one of my strong points so any hints on things to inspect would be appreciated. I doubt a crashdump will work in this context.

I'm also going to try the same trick to get into my dual 600MHz P3, which also hangs.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org