From: Ernest Natiello
To: Gleb Smirnoff
Cc: freebsd-stable@FreeBSD.org
Subject: Re: freebsd panic on HP Proliant DL360
Date: Thu, 12 Oct 2006 11:56:21 -0400
Message-Id: <1160668581.5159.24.camel@localhost>
In-Reply-To: <20061012154826.GO59833@cell.sick.ru>
References: <20061012091309.GK59833@FreeBSD.org> <20061012101525.GM59833@cell.sick.ru> <1160666283.5159.22.camel@localhost> <20061012154826.GO59833@cell.sick.ru>
List-Id: Production branch of FreeBSD source code

here we go:

(kgdb) frame 7
#7  0xc072191d in ip_ctloutput (so=0x1, sopt=0xe9226c90)
    at /usr/src/sys/netinet/ip_output.c:1184
1184            INP_LOCK(inp);
(kgdb) p *sopt
$1 = {sopt_dir = SOPT_SET, sopt_level = 0, sopt_name = 1, sopt_val = 0x0,
  sopt_valsize = 0, sopt_td = 0xc73add80}
(kgdb) frame 11
#11 0xc06c3ce6 in setsockopt (td=0xc73add80, uap=0x1)
    at /usr/src/sys/kern/uipc_syscalls.c:1307
1307            return (kern_setsockopt(td,
    uap->s, uap->level, uap->name,
(kgdb) p td->td_proc->p_comm
$2 = "tcpserver\000\000\000\000\000\000\000\000\000\000"
(kgdb)

On Thu, 2006-10-12 at 19:48 +0400, Gleb Smirnoff wrote:
> On Thu, Oct 12, 2006 at 11:18:03AM -0400, Ernest Natiello wrote:
> E> Hello,
> E> Thank you very much for all of the help. I am trying to understand
> E> this issue, as it has been plaguing me for quite some time.
> E> So, extrapolating from the below kgdb output, am I to assume that
> E> the process causing the error is tcpserver?
>
> Probably it is. However, can you run the gdb commands I mentioned
> in the previous post, so that we can be sure?
>
> E> And should I further infer
> E> that tcpserver would cause this issue on all instances of FreeBSD
> E> RELENG_6, regardless of hardware?
>
> I think so. A tcpserver(8) in the given configuration.
>
> E> I have three other servers HP Proliant DL380s (2u) which are
> E> operating in a _similar_ capacity, (incoming vs. outgoing mailservers)
> E> running the exact same software, which have never had a problem.
> E> These three servers are running: FreeBSD unix29 6.1-PRERELEASE
> E> FreeBSD 6.1-PRERELEASE #0: Mon Mar 27 10:42:56 EST 2006
> E> root@unix34.broadviewnet.net:/usr/obj/usr/src/sys/UNIX34 i386
> E> The operating system on this machine was rsync'd from one of the
> E> servers that is having the panic issue, yet it continues to operate
> E> flawlessly.
>
> The discussed problem is a race between the remote client closing the
> TCP connection (maybe resetting it?), and local software performing
> a setsockopt() system call on the same socket.
>
> It may happen that this particular server has to deal with clients
> that drop the connection randomly, and other servers don't. That's
> why other servers are stable.
>
> E> I guess I could try swapping the services between two of the
> E> servers and see if the behavior follows the move. Does that sound
> E> viable?
>
> You can try it.
>
> And don't forget to run gdb commands, and see what is the actual
> socket option that causes the problem.
>
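
For anyone reading the archive who wants to turn the numbers in the
"p *sopt" output into option names, here is a rough sketch in Python. It
is only an illustration: the two lookup tables are hard-coded by me from
the FreeBSD headers (netinet/in.h and sys/socket.h) and cover just a few
values, and describe_sopt() is a made-up helper name, not anything from
the kernel or from gdb.

```python
# Hypothetical helper: decode the sopt_level / sopt_name pair printed by kgdb.
# The numeric values are assumptions copied from FreeBSD headers, not looked
# up at runtime (Linux, for example, numbers the IP options differently).
FREEBSD_LEVELS = {0: "IPPROTO_IP", 6: "IPPROTO_TCP", 0xFFFF: "SOL_SOCKET"}
FREEBSD_IP_OPTNAMES = {1: "IP_OPTIONS", 2: "IP_HDRINCL", 3: "IP_TOS", 4: "IP_TTL"}


def describe_sopt(level, name, val, valsize):
    """Render a struct sockopt as the setsockopt() call it most likely came from."""
    level_name = FREEBSD_LEVELS.get(level, f"level {level}")
    if level == 0:
        # Only the IP-level option names are tabulated in this sketch.
        opt_name = FREEBSD_IP_OPTNAMES.get(name, f"option {name}")
    else:
        opt_name = f"option {name}"
    arg = "NULL" if val == 0 else hex(val)
    return f"setsockopt(fd, {level_name}, {opt_name}, {arg}, {valsize})"


# Values taken from the "$1 = {...}" line in the kgdb output above.
print(describe_sopt(level=0, name=1, val=0x0, valsize=0))
```

With the values from frame 7 this prints
setsockopt(fd, IPPROTO_IP, IP_OPTIONS, NULL, 0), i.e. a request to clear
the IP options on the socket. That would fit tcpserver clearing IP
options on a freshly accepted connection, but that last part is my
reading and has not been confirmed in this thread.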