From: Dragoncrest
Date: Wed, 10 Sep 2003 09:31:27 -0700
To: Raphaël Marmier
Cc: matt@pc1-nott2-3-cust18.nott.cable.ntl.com, freebsd-questions@FreeBSD.ORG
Subject: Re: Mail server reboot while upgrading ports

At 01:17 AM 9/10/03 +0200, Raphaël Marmier wrote:
>heat? Both could be overheating the same way if it is the same hardware in
>the same room under identical conditions. Try to move one to the fridge
>and see if it stop freezing ;)

Actually, it's rebooting randomly during installs or upgrades only. It did it once due to SpamAssassin, but that was a long time back. I did, however, encounter something of interest that might give us some insight into this.
Even though I'd periodically have failed builds or installs, or even random reboots during this, I always seemed to get a lot of this one Ruby error. Here are two examples:

    /usr/local/lib/ruby/site_ruby/1.6/pkginfo.rb:45: [BUG] rb_gc_mark(): unknown data type 0x7(0x8053e04) corrupted object
    ruby 1.6.8 (2003-03-26) [i386-freebsd4]

    /usr/local/lib/ruby/site_ruby/1.6/pkgdb.rb:336: [BUG] rb_gc_mark(): unknown data type 0x7(0x82024dc) corrupted object
    ruby 1.6.8 (2003-03-26) [i386-freebsd4]

I've since run pkgdb -F to see if that would help, and it did find one small error in the database. I'm not sure whether that fixed it. I have, however, finished all the upgrades after 5 painful hours, so that's at least good. That means I shouldn't have to touch this for a while, at which point the problem may rear its ugly head again. I'll keep an eye on it and report anything new I find. I don't expect this to be easy to solve, but we'll keep looking for clues.

Speaking of clues, is there a way to log everything that goes on in a TTY session? I've noticed that when this thing crashes it prints something to the screen about the crash and a reboot in 15 seconds, but I'm away at another desk when it happens and am never fast enough to get over there and catch it. I want to capture that in a file if possible so I have a better idea of what's wrong with this thing. I'm sure it has something to do with adding a switch to the logging system, but I can't find where to add the switch or which one would do it. Any input would be welcome.

Thanks.