From owner-freebsd-stable@FreeBSD.ORG  Thu Oct 12 15:18:10 2006
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
X-Original-To: freebsd-stable@FreeBSD.org
Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6FB7616A407
	for <freebsd-stable@FreeBSD.org>; Thu, 12 Oct 2006 15:18:10 +0000 (UTC)
	(envelope-from enatiello@broadviewnet.net)
Received: from unix29.broadviewnet.net (smtp-01.broadviewnet.net [64.115.0.67])
	by mx1.FreeBSD.org (Postfix) with SMTP id 01DDB43D72
	for <freebsd-stable@FreeBSD.org>; Thu, 12 Oct 2006 15:18:05 +0000 (GMT)
	(envelope-from enatiello@broadviewnet.net)
Received: (qmail 22997 invoked by uid 32008); 12 Oct 2006 11:19:29 -0400
Received: from unknown (HELO enatiello-01.broadviewnet.net) (64.115.0.249)
	by unix29.broadviewnet.net with SMTP; 12 Oct 2006 11:19:29 -0400
From: Ernest Natiello <enatiello@broadviewnet.net>
To: Gleb Smirnoff <glebius@FreeBSD.org>
In-Reply-To: <20061012101525.GM59833@cell.sick.ru>
References: <20061012091309.GK59833@FreeBSD.org>
	<E1GXxPc-0009zm-T4@dilbert.firstcallgroup.co.uk>
	<20061012101525.GM59833@cell.sick.ru>
Content-Type: text/plain
Date: Thu, 12 Oct 2006 11:18:03 -0400
Message-Id: <1160666283.5159.22.camel@localhost>
Mime-Version: 1.0
X-Mailer: Evolution 2.6.1 
Content-Transfer-Encoding: 7bit
Cc: freebsd-stable@FreeBSD.org
Subject: Re: freebsd panic on HP Proliant DL360
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>, 
	<mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
	<mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 12 Oct 2006 15:18:10 -0000

Hello,
     Thank you very much for all of the help.  I am trying to understand
this issue, as it has been plaguing me for quite some time.
     So, extrapolating from the kgdb output below, am I to assume that
the process causing the error is tcpserver?  And should I further infer
that tcpserver would cause this issue on all instances of FreeBSD
RELENG_6, regardless of hardware?
     I have three other servers, HP Proliant DL380s (2U), which are
operating in a _similar_ capacity (incoming vs. outgoing mail servers)
and running the exact same software, and they have never had a problem.
     These three servers are running: FreeBSD unix29 6.1-PRERELEASE
FreeBSD 6.1-PRERELEASE #0: Mon Mar 27 10:42:56 EST 2006
root@unix34.broadviewnet.net:/usr/obj/usr/src/sys/UNIX34 i386

     The operating system on this machine was rsync'd from one of the
servers that is having the panic issue, yet it continues to operate
flawlessly.
     I guess I could try swapping the services between two of the
servers and see if the behavior follows the move.  Does that sound
viable?

Thank you very much,
Ernest


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x104
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0679cd1
stack pointer           = 0x28:0xe9226af0
frame pointer           = 0x28:0xe9226afc
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = resume, IOPL = 0
current process         = 71782 (tcpserver)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1d7h12m9s
Dumping 2047 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 2047MB (524026 pages) 2032 2016 2000 1984 1968 1952 1936 1920
1904 1888 1872 1856 1840 1824 1808 1792 1776 1760 1744 1728 1712 1696
1680 1664 1648 1632 1616 1600 1584 1568 1552 1536 1520 1504 1488 1472
1456 1440 1424 1408 1392 1376 1360 1344 1328 1312 1296 1280 1264 1248
1232 1216 1200 1184 1168 1152 1136 1120 1104 1088 1072 1056 1040 1024
1008 992 976 960 944 928 912 896 880 864 848 832 816 800 784 768 752 736
720 704 688 672 656 640 624 608 592 576 560 544 528 512 496 480 464 448
432 416 400 384 368 352 336 320 304 288 272 256 240 224 208 192 176 160
144 128 112 96 80 64 48 32 16

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));


On Thu, 2006-10-12 at 14:15 +0400, Gleb Smirnoff wrote:
> On Thu, Oct 12, 2006 at 11:03:36AM +0100, Pete French wrote:
> P> > This is a known problem. It is fixed in HEAD, but unfortunately it
> P> > isn't mergeable to RELENG_6. The problem isn't related to either pf,
> P> > ipf or NIC drivers.
> P> 
> P> This is a little alarming - because what you seem to be saying is that
> P> if you have DL360's then you need to either run current, or accept that
> P> they will panic every so often for as long as you are running RELENG_6.
> P> We are looking to change our hardware soon, and DL360's were top of the
> P> list for replacements!
> 
> Again, this has nothing to do with hardware. It is a general problem in RELENG_6.
> 
> P> Is there a PR reference for this describing the solution to the problem
> P> in HEAD somewhere that I could take a look at ?
> 
> The problem wasn't fixed with a single commit. Maybe Robert, who is carbon
> copied,  can provide more details.
>