From: NAKAJI Hiroyuki
To: freebsd-stable@FreeBSD.ORG
Date: Sat, 06 Jun 2009 23:33:23 +0900
Subject: Big problem still remains with 7.2-STABLE locking up

Hi,

Some months ago I noticed frequent lockups on my RELENG_6 server
(ECS PM800-M2, Celeron 2.6GHz (UP), 2GB RAM, ATA HDDs and a 3Com NIC
(xl0)), and eventually I gave up on that old server.

Last month I replaced this 'unstable' server with a new one running
7.2-RELEASE, which worked very well until I set it up as 'a server'.
The problem began just after it started 'the services'.

My story is very similar to Pete's.
http://lists.freebsd.org/pipermail/freebsd-stable/2009-January/047487.html

I followed some of the instructions in that thread. But unfortunately,
the big problem still remains: the 7.2-STABLE server locks up
frequently. Help! :-(

The server is an NEC Express5800 S70/SD.

 o CPU: Intel(R) Celeron(R) CPU 440 @ 2.00GHz (2280.25-MHz K8-class CPU)
 o 6GB RAM
 o ACPI APIC Table:
 o 80GB and 250GB SATA HDDs
 o http://www.heimat.gr.jp/~nakaji/localhost/dmesg.boot

The kernel configuration is:

include         GENERIC
ident           HEIMAT
options         MSGBUF_SIZE=81920
makeoptions     DEBUG=-g
options         KDB
options         DDB
options         BREAK_TO_DEBUGGER
options         QUOTA
options         DEVICE_POLLING
options         HZ=1000
options         SW_WATCHDOG
options         DEBUG_VFS_LOCKS
options         INVARIANTS
options         INVARIANT_SUPPORT
options         WITNESS
options         WITNESS_SKIPSPIN
options         LOCK_PROFILING

This server runs as a web server, NFS server, DHCP server, NTP server,
mail server with spam checks, ML server, Usenet server and so on.

In /etc/rc.conf* there are "_enable" lines for the following:

 o ntpdate
 o ntpd
 o nfs_server
 o sshd
 o inetd
 o named
 o sendmail
 o rtadvd
 o watchdogd
 o dhcpd
 o snmpd
 o apache22
 o samba
 o zope29
 o zope210
 o amavisd
 o amavisd_milter
 o cvsupd
 o ntop
 o compat6x
 o munin_node
 o spamd
 o spamass_milter
 o smartd
 o mailman
 o sshblock
 o innd
 o skkserv

From munin's graphs, the 'resets' value in netstat keeps increasing on
this server, while on my other 'desktops' it remains zero. I have not
found whether there is a definite threshold for 'resets', but when the
value reaches 0.8 - 1.2 the server locks up.

When it locks up there is no ping response, no message on the console,
no keyboard response and, of course, Ctrl-Alt-Esc does not work.

I wonder why netstat's 'resets' value is increasing.

I learned a workaround from other Japanese users: if the box has an
ICH, enabling ichwd and running watchdogd can reboot the box when it
locks up.
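
For reference, the workaround amounts to roughly the following
settings. This is only a minimal sketch, assuming the ichwd(4) driver
is loaded as a module at boot and watchdogd is started from rc.conf
with its default options; the actual settings on this server may
differ.

```sh
# /boot/loader.conf
# Assumption: ichwd is loaded as a kernel module rather than compiled
# into the kernel.
ichwd_load="YES"

# /etc/rc.conf
# Start watchdogd at boot; it pats the watchdog periodically, so a
# hard lockup lets the timer expire and the box reboots.
watchdogd_enable="YES"
```
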
Indeed, last night the box rebooted after about 4 hours while I was in
bed. Watchdogd works very well.

Any advice?

Thanks.
-- 
NAKAJI Hiroyuki