From: John Baldwin
To: freebsd-stable@freebsd.org
Cc: pluknet
Date: Mon, 15 Jun 2009 08:54:57 -0400
Subject: Re: lockup on 6.4 while bce in MGETHDR
Message-Id: <200906150854.58042.jhb@freebsd.org>

On Monday 15 June 2009 4:49:06 am pluknet wrote:
> Hi.
>
> This is on 6.4-S as of April with uptime about 2 months.
> Machine could not be accessed via network, didn't reply on ping.
> Looking at backtrace it's here:
>
> if_bce.c::bce_get_buf():
>     /* This is a new mbuf allocation. */
>     MGETHDR(m_new, M_DONTWAIT, MT_DATA);
>
> From brk-seq:
>
> db> bt
> Tracing pid 31 tid 100034 td 0xc833ad00
> kdb_enter(c097ef95) at kdb_enter+0x2b
> siointr1(c83a3c00) at siointr1+0xce
> siointr(c83a3c00) at siointr+0x5e
> intr_execute_handlers(c80ee4c8,e8963abc,4,e8963b0c,c08cba63,...) at intr_execute_handlers+0xe1
> lapic_handle_intr(38) at lapic_handle_intr+0x2e
> Xapic_isr1() at Xapic_isr1+0x33
> --- interrupt, eip = 0xc06a393e, esp = 0xe8963b00, ebp = 0xe8963b0c ---
> _mtx_lock_sleep(c0a655c0,c833ad00,0,0,0) at _mtx_lock_sleep+0xb6
> kmem_malloc(c14680c0,1000,101,e8963b8c,c082cadd,...) at kmem_malloc+0x328
> page_alloc(c1456000,1000,e8963b7f,101,c8b10016,...) at page_alloc+0x1a
> slab_zalloc(c1456000,101,0,d278ca90,c915aa3c,...) at slab_zalloc+0xdd
> uma_zone_slab(c1456000,1) at uma_zone_slab+0xf0
> uma_zalloc_bucket(c1456000,1) at uma_zalloc_bucket+0x15c
> uma_zalloc_arg(c1456000,d0729e00,1) at uma_zalloc_arg+0x292
> bce_get_buf(c8363000,0,e8963c7c,e8963c7e,e8963c80) at bce_get_buf+0xef
> bce_fill_rx_chain(c8363000,7047,d0729000,cbd86000,59375b37,...) at bce_fill_rx_chain+0x48
> bce_rx_intr(c8363000) at bce_rx_intr+0x301
> bce_intr(c8363000) at bce_intr+0xf4
> ithread_execute_handlers(c8337c90,c8230a80) at ithread_execute_handlers+0x125
> ithread_loop(c835f860,e8963d38) at ithread_loop+0x55
> fork_exit(c0694a38,c835f860,e8963d38) at fork_exit+0x71
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe8963d6c, ebp = 0 ---
>
> db> ps
>   pid  ppid  pgrp   uid  state  wmesg     wchan       cmd
> 17102 17100 15549     0  L      *vm page  0xcce93480  awk
> 17101 17100 15549     0  L      *Giant    0xc8737480  grep
> 17100 17098 15549     0  S      wait      0xc8726a78  sh
> 17098 15570 15549     0  S      wait      0xc98c2a78  sh
> 16935 30771 30771  8382  RL                           httpd
> 16656 16349 16349 10346  S      biord     0xdc389368  php
> 16564 16460 16460 18332  SL     vmpfw     0xc365da98  lynx
> 16563 16351 16351 18332  LL     *vm page  0xcce93480  lynx

I suspect these are your real problems.  Try doing 'show lockchain 16563' or
'show lockchain 17101'.

-- 
John Baldwin
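
P.S. With a longer process list, the rows worth feeding to 'show lockchain' can be picked out mechanically. The following is only a rough sketch, not part of ddb: the function name lock_waiters is hypothetical, and it assumes that a wmesg beginning with '*' marks a thread blocked on a lock and that wchan is always printed as a 0x address. The sample rows are copied from the ps output quoted above.

```python
# Rows copied from the quoted ddb "ps" output (header omitted).
PS_OUTPUT = """\
17102 17100 15549     0  L      *vm page  0xcce93480  awk
17101 17100 15549     0  L      *Giant    0xc8737480  grep
17100 17098 15549     0  S      wait      0xc8726a78  sh
16935 30771 30771  8382  RL                           httpd
16564 16460 16460 18332  SL     vmpfw     0xc365da98  lynx
16563 16351 16351 18332  LL     *vm page  0xcce93480  lynx
"""


def lock_waiters(ps_text):
    """Return (pid, lock_name, cmd) for rows whose wmesg starts with '*'.

    Assumption: columns are pid ppid pgrp uid state wmesg wchan cmd,
    and the wchan column is a hex address starting with "0x".
    """
    waiters = []
    for line in ps_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and rows with no wmesg column.
        if len(fields) < 6 or not fields[0].isdigit():
            continue
        rest = fields[5:]
        if not rest[0].startswith("*"):
            continue
        # Lock names can contain spaces ("vm page"), so collect tokens
        # until the wchan address is reached.
        name_tokens = []
        cmd = ""
        for i, tok in enumerate(rest):
            if tok.startswith("0x"):
                cmd = " ".join(rest[i + 1:])
                break
            name_tokens.append(tok)
        lock_name = " ".join(name_tokens).lstrip("*")
        waiters.append((int(fields[0]), lock_name, cmd))
    return waiters


if __name__ == "__main__":
    for pid, lock_name, cmd in lock_waiters(PS_OUTPUT):
        # Print the ddb command to try for each blocked process.
        print(f"show lockchain {pid}    # {cmd} waiting on '{lock_name}'")
```

For the sample rows this selects pids 17102, 17101 and 16563, that is, the awk, grep and lynx processes waiting on "vm page" and "Giant".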