From: John Baldwin
To: Kamigishi Rei
Cc: Lawrence Stewart, freebsd-current@freebsd.org
Date: Tue, 21 Jul 2009 10:27:06 -0400
Subject: Re: [follow-up] Fatal trap 12 in r195146+ in netisr_queue_internal
Message-Id: <200907211027.06589.jhb@freebsd.org>
In-Reply-To: <4A65C9D1.6080902@haruhiism.net>

On Tuesday 21 July 2009 9:59:45 am Kamigishi Rei wrote:
> John Baldwin wrote:
> > On Tuesday 21 July 2009 6:59:36 am Kamigishi Rei wrote:
> >
> >> Everything goes fine until - under heavy load on an interface, usually -
> >> we reach a point where:
> >> 1. m->mtx_lock is 4 (== MTX_UNOWNED).
> >> 2. v is assigned mtx_lock's value (4 == MTX_UNOWNED).
> >> 3. condition (v == MTX_UNOWNED) fails.
> >>
> > This will not happen.  If you look at the disassembly you will see this can't
> > happen either.  Do you have a crashdump from a crash?
> >
> I've got about 40 crash dumps on unmodded (without debug code) kernel,
> and 3 or 4 with debug stuff (KASSERTs added by me).
> I can reproduce this on my test server (Core2 Duo 3.0, 4GB RAM), on my
> home PC (Core2 Quad 2.5), and in VMWare with 2 CPUs in VT-x mode on my
> laptop.
> It can't be reproduced on single-CPU single-core (including
> hyperthreaded) systems.
>
> Quoting,
>
> (kgdb) fr 6
> #6  0xffffffff80586255 in _mtx_lock_sleep (m=0xffffffff80e60823,
> tid=18446742977255365296, opts=Variable "opts" is not available.
> ) at /usr/src/sys/kern/kern_mutex.c:407
> 407             owner = (struct thread *)(v & ~MTX_FLAGMASK);
>
> (kgdb) print m->mtx_lock
> $14 = 4
> (kgdb) print v
> $15 = 21946368

% printf "%x\n" 21946368
14ee000

Can you print out 'owner' as well?  You won't get a panic until you actually
dereference 'owner' to get 'owner->td_state' even though gdb will show this
as the faulting line (gdb can sometimes get confused by compiler optimization).
You are seeing these values because mtx_lock was changed (due to either a
mtx_unlock() or a mtx_init()) while you were spinning.  That value of v is not
what I have typically seen in these panics.  Do you also have the original
fatal kernel trap messages?

-- 
John Baldwin
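
For reference, the masking on line 407 of kern_mutex.c quoted above can be tried out in userland. What follows is only a minimal sketch, not kernel code: MTX_UNOWNED == 4 is taken from the thread, while the MTX_RECURSED and MTX_CONTESTED values are assumed to match sys/mutex.h of that period, and struct lock_word, decode_lock_word() and format_lock_word() are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Flag bits stored in the low bits of the lock word.
 * MTX_UNOWNED == 4 is stated in the thread; the other two values
 * are an assumption about sys/mutex.h, not confirmed by the thread.
 */
#define MTX_RECURSED	((uintptr_t)0x1)
#define MTX_CONTESTED	((uintptr_t)0x2)
#define MTX_UNOWNED	((uintptr_t)0x4)
#define MTX_FLAGMASK	(MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED)

/* Hypothetical helper type: a lock word split into its two parts. */
struct lock_word {
	uintptr_t owner;	/* v & ~MTX_FLAGMASK, the would-be thread pointer */
	uintptr_t flags;	/* v & MTX_FLAGMASK */
};

/* Split a snapshot of the lock word the same way line 407 does. */
struct lock_word
decode_lock_word(uintptr_t v)
{
	struct lock_word lw;

	lw.owner = v & ~MTX_FLAGMASK;
	lw.flags = v & MTX_FLAGMASK;
	return (lw);
}

/* Render the decoded parts in hex, like the printf "%x" check above. */
int
format_lock_word(uintptr_t v, char *buf, size_t len)
{
	struct lock_word lw = decode_lock_word(v);

	assert(buf != NULL);
	return (snprintf(buf, len, "owner=0x%llx flags=0x%llx",
	    (unsigned long long)lw.owner, (unsigned long long)lw.flags));
}
```

With the two values from the kgdb session, decode_lock_word(4) yields a zero owner with only MTX_UNOWNED set, while decode_lock_word(21946368) yields owner 0x14ee000 with no flag bits set. The two reads therefore describe different states of the lock, which fits a lock word that changed between the read into v and the later print of m->mtx_lock.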