From: Gleb Smirnoff
To: Ian FREISLICH
Cc: current@freebsd.org
Date: Wed, 5 Jun 2013 13:50:43 +0400
Subject: Re: Recurring panic
Message-ID: <20130605095043.GB67170@glebius.int.ru>
List-Id: Discussions about the use of FreeBSD-current

On Wed, Jun 05, 2013 at 10:18:21AM +0200, Ian FREISLICH wrote:
I> I have the following recurring panic on all my heavily network
I> loaded -CURRENT routers. The current process is always different.
I>
I> Gleb, can you please chime in with what you've managed to uncover.

The panics occur on a selfd mutex.
The mtx_lock value is that of a free mutex, but with one extra bit set:

(kgdb) p/x sfp->sf_mtx->mtx_lock
$3 = 0x1000004

Rarely (only one such panic observed), more than one bit is set:

$3 = 0x21000004

It is important that selfd mutexes are taken from mtxpool(9), which is
allocated at an early boot stage. Thus, across reboots, all possible
sfp->sf_mtx mutexes usually fall into the same virtual memory region.
I'm not sure, but I suppose they also fall into the same physical region.

This could lead one to the idea that the RAM in the box has problems. But
it is running ECC memory, and it doesn't experience other random panics.

The only special thing about the box is that it is running pf(4) with a
huge ruleset and a lot of traffic. So pf(4) is the number one suspect,
albeit it isn't closely related to selfds.

-- 
Totus tuus, Glebius.
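
For reference, the stray bits in the two mtx_lock values quoted above can be isolated by XORing each value with the free-mutex cookie. This is only a rough sketch in Python, assuming MTX_UNOWNED is 0x4 as defined in sys/mutex.h; stray_bits is a hypothetical helper name, not part of kgdb or any FreeBSD tool.

```python
# Assumption: the "free mutex" cookie MTX_UNOWNED is 0x4 (sys/mutex.h).
MTX_UNOWNED = 0x00000004


def stray_bits(mtx_lock, expected=MTX_UNOWNED):
    """Return the bit positions where mtx_lock differs from the expected value."""
    diff = mtx_lock ^ expected
    return [bit for bit in range(diff.bit_length()) if diff & (1 << bit)]


# The two values observed in kgdb.
for value in (0x1000004, 0x21000004):
    print(hex(value), "differs from MTX_UNOWNED at bits", stray_bits(value))
```

Under that assumption, the common case differs from a free mutex only at bit 24, and the single rarer panic differs at bits 24 and 29.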