Message-ID: <488638DA.7010005@alaska.net>
Date: Tue, 22 Jul 2008 11:45:30 -0800
From: Royce Williams
To: freebsd-stable@FreeBSD.org
Subject: 6.3-RELEASE-p3 recurring panics on multiple SM PDSMi+

We have 10 SuperMicro PDSMi+ 5015M-MTs that are panicking every few days. This started shortly after the upgrade from 6.2-RELEASE to 6.3-RELEASE with freebsd-update. Other than switching to a debugging kernel, a little sysctl tuning, and patching with freebsd-update, they are stock.
The debugging kernel was built from source that is also being patched with freebsd-update.

These systems are running postfix and Courier imapd for an ISP with a userbase on the order of 10^4 users. They use gmirror, but the mailstore is over NFS. That NFS server is under pretty high load. All of the servers with this app and load pattern are panicking.

I have little experience with kernel debugging, but the box in question is out of our farm and available for testing, and I am motivated to cooperate. :-)

The full debugging kernel options I used are:

include SMP
options KDB
options KDB_TRACE
options DDB
options BREAK_TO_DEBUGGER
options WITNESS
options WITNESS_SKIPSPIN

db> trace
Tracing pid 71182 tid 100325 td 0xcc08b180
kdb_enter(c095f294) at kdb_enter+0x2b
panic(c09768ad,1000,14000000,c145bc88,1000,...) at panic+0x127
kmem_malloc(c14680c0,1000,102,eba6a8cc,c07e3fa5,...) at kmem_malloc+0x89
page_alloc(c1453780,1000,eba6a8bf,102,c06b8a84,...) at page_alloc+0x1a
slab_zalloc(c1453780,102,c14537e0,c1453780,c1460d5c,...) at slab_zalloc+0xa1
uma_zone_slab(c1453780,2) at uma_zone_slab+0xf0
uma_zalloc_bucket(c1453780,2) at uma_zalloc_bucket+0x11c
uma_zalloc_arg(c1453780,0,2) at uma_zalloc_arg+0x24c
cache_enter(cd02c220,c9e62880,eba6a9fc) at cache_enter+0xa6
nfs_readdirplusrpc(cd02c220,eba6aa60,cc0ab880) at nfs_readdirplusrpc+0x6a6
nfs_doio(cd02c220,dce59668,cc0ab880,cc08b180,dce59668,...) at nfs_doio+0x20f
nfs_bioread(cd02c220,eba6acb0,0,cc0ab880) at nfs_bioread+0xa64
nfs_readdir(eba6ac90) at nfs_readdir+0xe6
VOP_READDIR_APV(c09ebbc0,eba6ac90) at VOP_READDIR_APV+0x38
getdirentries(cc08b180,eba6ad04) at getdirentries+0x146
syscall(3b,3b,3b,9085f00,9085f00,...) at syscall+0x22f
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (196, FreeBSD ELF32, getdirentries), eip = 0xb825a79b, esp = 0xbfbfa1fc, ebp = 0xbfbfa228 ---

Royce

-- 
Royce D. Williams - http://royce.ws/
I don't like that man. I must get to know him better. - A. Lincoln