From: Shawn Wallbridge
Date: Mon, 4 Nov 2013 14:52:39 -0800
Subject: Re: 9.2-RELEASE Kernel panic, mbuf underflow
To: John-Mark Gurney
Cc: freebsd-stable@freebsd.org
In-Reply-To: <20131102195425.GI73243@funkthat.com>
References: <20131102195425.GI73243@funkthat.com>

On Nov 2, 2013, at 12:54 PM, John-Mark Gurney wrote:

> Shawn Wallbridge wrote this message on Tue, Oct 29, 2013 at 21:37 -0700:
>> I have a file server that keeps panic'ing with a mbuf cluster in the 17 Quadrillion range (2^64 - 2). I am pretty sure it's a buffer underflow.
>
> Ok, after tracking some stuff down, I do not think it has anything to
> do w/ mbufs, as the stats appear to be correct...  The problem is that
> mbuf clusters takes into account the fact that some clusters might still
> be associated w/ packets (from usr.bin/netstat/mbuf.c):
>         printf("%ju/%ju/%ju/%ju mbuf clusters in use "
>             "(current/cache/total/max)\n",
>             cluster_count - packet_free, cluster_free + packet_free,
>             cluster_count + cluster_free, cluster_limit);
>
> Notice how current is cluster_count - packet_free instead of something
> like cluster_count - cluster_free...  And I just printed your values
> from vmcore.6, and apparently packet_count is 0, while packet_free is
> 5215...
>
> cluster_count is 2049, cluster_free is 1997..
>
> And because packet is a secondary zone of mbufs, things apparently get
> confused...  So I wouldn't go down this road anymore...  This looks
> like a simple race/accounting error in the status...
>
>> I have opened a PR, but I haven't had any movement on it. This happened while I was running 9.1-RELEASE as well.
>>
>> Here is the PR..
>>
>> http://www.freebsd.org/cgi/query-pr.cgi?pr=183424
>>
>> And I have uploaded the crash dumps here..
>>
>> http://www.wallbridge.net/crash/
>>
>> If anyone has any ideas, I would be grateful as this is a production box and it's really impacting us.
>
> Have you done a full fsck on the fs to make sure that there isn't any
> corruption on the disk that keeps popping up?  I do realize that it
> will take a LONG time to fsck...  Sadly, your last three cores
> (all on 9.2-R) are for different inodes...
>
> Could you tell me the path and filename of inodes: 3226539015,
> 3224134148 and 3343904256?  It could help us track down which app is
> causing this and let us reproduce it...
>
> To find the inode on the fs use find -inum, so:
> find -inum 3226539015 -or -inum 3224134148 -or -inum 3343904256
>
> will do it in one pass so it won't take so long...
>
> Thanks.
>
> --
> John-Mark Gurney                              Voice: +1 415 225 5579
>
>      "All that I will do, has been done, All that I have, has not."
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

Just wanted to update the list on this.

The machine has now crashed with the INVARIANTS kernel; the kernel dumps are here:

wallbridge.net/crash/20131104/core.txt.4.gz
wallbridge.net/crash/20131104/info.4.gz
wallbridge.net/crash/20131104/vmcore.4.gz

shawn